| | #1 (permalink) |
| | Request.Form + XML + UTF-8 problem

Hi,

In ASP/VBScript I'm facing what seems to be double-encoded UTF-8 when I serialize an XML file with some content read from the Request.Form collection. I attach a simple script showing this behaviour. Here's what it does:

- There's a form with one field ("body"): type in the character é (UTF-8 bytes: C3 A9) and submit the form.
- This value is read with Request.Form("body") and inserted as a text node in a simple (UTF-8) XML document, serialized as A.xml.
- A similar XML document is created with the same content (é) hard-coded in the script, serialized as B.xml.

Both XML documents should be identical, but when you open them in an XML viewer (that interprets UTF-8), you get the following:

A.xml : <?xml version="1.0" encoding="utf-8" ?><doc>Ã©</doc>
B.xml : <?xml version="1.0" encoding="utf-8" ?><doc>é</doc>

The byte value of the é character in A.xml is C3 83 C2 A9 and in B.xml: C3 A9.

So it appears that the Request object is doing something with the encoding. I'd be very happy if someone could explain this issue and help me solve it.
Cheers,
Nicolas

<%@ LANGUAGE="VBScript" %>
<% Option Explicit %>
<% Response.Buffer = True %>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type" />
</head>
<body>
<%
Dim XMLDoc1
' Only run when the form has been submitted
If Not(Request.Form("Submit") = Empty) Then
    Set XMLDoc1 = Server.CreateObject("MSXML2.DOMDocument")
    XMLDoc1.Async = False
    ' A.xml: content taken from the posted form field
    XMLDoc1.LoadXML("<?xml version=""1.0"" encoding=""utf-8""?><doc>" & Request.Form("body") & "</doc>")
    XMLDoc1.Save(Server.MapPath("A.xml"))
    ' B.xml: the same content, hard-coded in the script
    XMLDoc1.LoadXML("<?xml version=""1.0"" encoding=""utf-8""?><doc>" & "é" & "</doc>")
    XMLDoc1.Save(Server.MapPath("B.xml"))
    Set XMLDoc1 = Nothing
End If
%>
<form action="test.asp" method="post">
<fieldset>
<textarea name="body"></textarea>
<input type="submit" name="Submit" value="OK" />
</fieldset>
</form>
</body>
</html> |
| | #2 (permalink) |
| | Re: Request.Form + XML + UTF-8 problem

OK, I'll answer myself. Add:

Response.CodePage = 65001

I had never heard of it before, but this line must be added to specify that the strings within the intrinsic objects are to be encoded as UTF-8 (code page 65001). Request.Form data arrives URL-encoded, but when an item of the collection is read it is decoded, by default, using the ANSI code page. A character that UTF-8 encodes with 2 bytes, such as é in my example, was thus treated as 2 distinct characters, each of which was subsequently encoded to UTF-8 when placed in the XML document. |
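The mechanism described in this reply can be reproduced at the byte level outside ASP. The following is only an illustrative Python sketch, not part of the original script. It assumes the server's default ANSI code page is Windows-1252 (`cp1252`), which the thread does not state, and the names `original`, `misread` and `hex_bytes` are made up for the example.

```python
# Illustrative simulation of the double encoding (not ASP code).
original = "\u00e9"  # the character é typed into the form

# The browser posts the form as UTF-8: é becomes two bytes.
utf8_bytes = original.encode("utf-8")

# Assumption: the server's default ANSI code page is Windows-1252.
# Decoding the two UTF-8 bytes with it yields two separate characters.
misread = utf8_bytes.decode("cp1252")

# Serializing that two-character string as UTF-8 gives what ends up in A.xml.
a_xml_bytes = misread.encode("utf-8")

# Decoding the posted bytes as UTF-8 (code page 65001) instead keeps one
# character, which serializes to the same two bytes found in B.xml.
b_xml_bytes = utf8_bytes.decode("utf-8").encode("utf-8")


def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


print(hex_bytes(utf8_bytes))   # C3 A9
print(ascii(misread))          # '\xc3\xa9' (two characters: Ã and ©)
print(hex_bytes(a_xml_bytes))  # C3 83 C2 A9
print(hex_bytes(b_xml_bytes))  # C3 A9
```

Under that assumption, the four bytes in the A.xml case are exactly the UTF-8 encodings of Ã and ©, which is why a UTF-8-aware viewer shows two characters where one was typed.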