Windows Vista Forums

Vista - String with UTF8 in it?

Old 03-25-2009   #1 (permalink)
Jeff


 
 

String with UTF8 in it?

Hello,

I have a string returned from a third party library.

It is of the form "Here is some text but there is \u0026#39; in the middle
of it".

Is there an easy way for me to convert this \u part into unicode? I've tried
using the encoding classes but \u0026#39; remains intact.

Many thanks in advance!

Jeff.

Old 03-25-2009   #2 (permalink)
Jeroen Mostert


 
 

Re: String with UTF8 in it?

Jeff wrote:

> I have a string returned from a third party library.
>
> It is of the form "Here is some text but there is \u0026#39; in the
> middle of it".
>
> Is there an easy way for me to convert this \u part into unicode? I've
> tried using the encoding classes but \u0026#39; remains intact.
>
\u0026#39; is intended to correspond to &#39; (the \u0026 is an & escaped
according to C# string literal conventions, apparently), and &#39; in turn
corresponds to ' (apostrophe) in HTML and XML escaping. It makes very
little sense to escape strings this way, so I'm guessing a few wires got
crossed. It has nothing to do with UTF-8, in any case.

As far as I know, there are no standard framework classes for unescaping
either of these mechanisms, except indirectly (the XML parser can unescape
XML encoding, obviously, and the C# compiler knows about Unicode escapes,
but using these would be overkill).

If you're certain your library consistently escapes strings this way, you
can undo it by first replacing "\u[0-9a-f]{4}" sequences and then replacing
"&#[0-9]+;" (and possibly "&#x[0-9a-f]+;") using regexes. However, I would
first look into the mechanisms that are causing these escapings if possible.
They don't appear to make sense in the first place.
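
The two-pass replacement described above can be sketched roughly as follows.
This is only an illustration in Python, not the library's own logic: the
function names (unescape_unicode, unescape_entities, unescape_both) are made
up for the example, and it assumes the input only ever contains four-digit
\uXXXX escapes and numeric character references. The same two patterns should
carry over to a .NET regex replace with little change.

```python
import re

# A backslash, the letter u, then exactly four hex digits, e.g. \u0026
UNICODE_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")
# Numeric character references: hex (&#x27;) or decimal (&#39;)
NUMERIC_ENTITY = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")


def unescape_unicode(text):
    """Replace each \\uXXXX escape with the character it names.

    Assumption: surrogate pairs are not recombined; each escape is
    converted on its own.
    """
    return UNICODE_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


def unescape_entities(text):
    """Replace decimal and hex numeric character references."""
    def replace(match):
        hex_digits, dec_digits = match.group(1), match.group(2)
        code = int(hex_digits, 16) if hex_digits else int(dec_digits)
        # Leave references outside the Unicode range untouched.
        if code > 0x10FFFF:
            return match.group(0)
        return chr(code)
    return NUMERIC_ENTITY.sub(replace, text)


def unescape_both(text):
    """First undo the \\u escapes, then the numeric references."""
    return unescape_entities(unescape_unicode(text))


sample = "Here is some text but there is \\u0026#39; in the middle of it"
print(unescape_both(sample))
# prints: Here is some text but there is ' in the middle of it
```

The order matters: the \u0026 has to become & before the second pattern can
see &#39; at all. Named entities such as &amp; are deliberately not handled
here.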

--
J.