| | #1 (permalink) |
Fetch special characters like "Ñ" and absolute URL from href attribute of anchors

Hi there,

I used the following code to fetch the source of a web page (assigned to url2 in the code below) but ran into two painful problems.

1. Odd or missing characters, e.g. the name "ALBARIÑO ORGANISTRUN" becomes "ALBARI? ORGANISTRUN" when displayed in Notepad++, or "ALBARI ORGANISTRUN" when displayed in Notepad. What I want is "ALBARIÑO ORGANISTRUN".

2. Relative URLs, e.g. the href value of the anchor "Blancos" is "prodtype.asp?PT_ID=107&numRecordPosition=1&strPageHistory=cat&strKeywords=&strSearchCriteria=", but what I really want is its absolute address, like "http://www.elcatavinos.com/tienda/prodtype.asp?PT_ID=107&numRecordPosition=1&strPageHistory=cat&strKeywords=&strSearchCriteria=".

Can anyone help me sort them out?

Thanks in advance!

Jason

__________________________________________________________
Code I used:

    dim url1
    dim url2
    dim xmlhttp
    dim datafile
    dim FS
    dim dataFileTs
    dim i
    dim cookie

    url1 = "http://www.elcatavinos.com/tienda/store/dynamicIndex.asp?sm=b1"
    url2 = "http://www.elcatavinos.com/tienda/product.asp?numRecordPosition=5&P_ID=25473&strPageHistory=cat&strKeywords=&SearchFor=&PT_ID=107"
    datafile = "c:\temp\test.dat"

    set FS = Wscript.CreateObject("Scripting.FileSystemObject")
    set datafileTs = FS.CreateTextFile(datafile, True, True)
    set xmlHTTP = Wscript.CreateObject("MSXML2.XMLHTTP.3.0")
    xmlHTTP.Open "HEAD",url1, false
    xmlHTTP.Send
    i = 0
    do until xmlHTTP.readyState = 4
        Wscript.Sleep 100
        i = i + 1
        if i > 1000 then exit do
    Loop
    cookie = xmlhttp.getResponseHeader("set-cookie")

    xmlHTTP.Open "GET",url2, false
    xmlHTTP.SetRequestHeader "set-cookie",cookie
    xmlHTTP.SetRequestHeader "Content-Type","text/html; charset=iso-8859-1"
    xmlHTTP.SetRequestHeader "Content-Location","absoluteURI"
    xmlHTTP.Send
    i = 0
    do until xmlHTTP.readyState = 4
        Wscript.Sleep 100
        i = i + 1
        if i > 1000 then exit do
    Loop
    datafileTs.Writeline xmlhttp.responseText
| | #2 (permalink) |
Re: Fetch special characters like "Ñ" and absolute URL from href attribute of anchors

"jason" <atechmark@xxxxxx> wrote in message news:%23G427DZCJHA.1228@xxxxxx

> xmlHTTP.Open "HEAD",url1, false
> xmlHTTP.Send

> i = 0
> do until xmlHTTP.readyState = 4
> Wscript.Sleep 100
> i = i + 1
> if i > 1000 then exit do
> Loop

[...] isn't going to change after .Send; if it isn't 4 after the call, it's never going to be.

> cookie = xmlhttp.getResponseHeader("set-cookie")
>
> xmlHTTP.Open "GET",url2, false
> xmlHTTP.SetRequestHeader "set-cookie",cookie

> xmlHTTP.SetRequestHeader "Content-Type","text/html; charset=iso-8859-1"

> xmlHTTP.SetRequestHeader "Content-Location","absoluteURI"

[...] value, it indicates that a response may supply an alternate absolute URI to the resource being sent by the server.

> xmlHTTP.Send
> i = 0
> do until xmlHTTP.readyState = 4
> Wscript.Sleep 100
> i = i + 1
> if i > 1000 then exit do
> Loop

> datafileTs.Writeline xmlhttp.responseText

[...] header it is sending, or the charset it specifies doesn't match the actual encoding sent.

Does the HTML returned contain a meta tag specifying the content-type/charset?

Do you administer the site you are accessing?

--
Anthony Jones - MVP ASP/ASP.NET
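On the second question (relative href values), one common approach is to resolve each href against the URL of the page it was fetched from. The following is only a minimal sketch of that idea, written in Python rather than the thread's VBScript so it can be run anywhere. The names AnchorCollector and absolute_links are made up for illustration, the page URL is a shortened form of url2, and the sample HTML is invented apart from the "Blancos" anchor, whose href is cut down to one query parameter.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AnchorCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # urljoin leaves absolute URLs alone and resolves relative ones
                # against the directory of the base URL.
                self.links.append(urljoin(self.base_url, value))


def absolute_links(html, base_url):
    """Return the absolute form of every anchor href found in `html`."""
    parser = AnchorCollector(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical inputs: a shortened url2 and a made-up fragment of the page.
page_url = "http://www.elcatavinos.com/tienda/product.asp?P_ID=25473"
sample_html = (
    '<a href="prodtype.asp?PT_ID=107">Blancos</a>'
    '<a href="http://www.example.com/other.asp">Elsewhere</a>'
)

links = absolute_links(sample_html, page_url)
for link in links:
    print(link)
```

The relative href picks up the scheme, host and "/tienda/" directory of the page it came from, while the already absolute one passes through unchanged. A similar resolution step could presumably be done in VBScript once the href values have been extracted, but that is not shown here.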
| | #3 (permalink) |
Re: Fetch special characters like "Ñ" and absolute URL from href attribute of anchors

"jason" <atechmark@xxxxxx> wrote in message news:%23G427DZCJHA.1228@xxxxxx

> datafileTs.Writeline xmlhttp.responseText

When I manually open a browser window and navigate to your url2, I get a web page that displays ALBARIÑO ORGANISTRUN. I can select and copy that two-word phrase and paste it into Notepad, where it displays properly.

If I save it as ANSI text, things become a little strange. If I open that ansi.txt file in Notepad, I get Chinese-like characters, but on another Windows XP SP2 computer I get exactly ten boxes. If I open it in WordPad or IE, it displays properly. If I save as Unicode instead of ANSI text, then Notepad displays it properly when I reopen the file.

I'm thinking that you have an encoding problem. The statement:

datafileTs.Writeline xmlhttp.responseText

gets responseText into a local unnamed variant, and then passes that to datafileTs.Writeline. Somewhere in this statement, there is a locale/encoding mismatch which is giving you a problem. Perhaps you could force datafileTs to be Unicode.

-Paul Randall
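The kind of encoding mismatch discussed in the replies can be illustrated with a small sketch. It is written in Python rather than VBScript purely so that it is runnable anywhere, and it is not a drop-in fix for the posted script. The helper names (charset_from_content_type, decode_body, write_unicode_text) and the output file name are made up for illustration, and the ISO-8859-1 default is an assumption based on the charset mentioned in the question. One thing worth noting: in the posted script the third argument to CreateTextFile is already True, which requests a Unicode file, so the character may well be lost earlier, when the response bytes are turned into text.

```python
import codecs
import os
import re
import tempfile


def charset_from_content_type(header_value, default="iso-8859-1"):
    """Return the charset named in a Content-Type value, or `default` if none is given."""
    match = re.search(r'charset\s*=\s*"?([\w.:-]+)', header_value or "", re.IGNORECASE)
    return match.group(1).lower() if match else default


def decode_body(raw_bytes, content_type):
    """Decode raw response bytes using the charset the server declared."""
    return raw_bytes.decode(charset_from_content_type(content_type), errors="replace")


def write_unicode_text(path, text):
    """Write text as UTF-16; the codec emits a byte-order mark at the start of the file."""
    with open(path, "w", encoding="utf-16", newline="") as handle:
        handle.write(text)


# In ISO-8859-1 the letter "Ñ" is the single byte 0xD1.
raw = "ALBARIÑO ORGANISTRUN".encode("iso-8859-1")

# Decoding with the declared charset keeps the character intact.
good = decode_body(raw, "text/html; charset=iso-8859-1")

# Decoding the same bytes as UTF-8 cannot interpret 0xD1 and substitutes U+FFFD.
bad = raw.decode("utf-8", errors="replace")

# Hypothetical output location, used only for this demonstration.
out_path = os.path.join(tempfile.gettempdir(), "encoding_demo.txt")
write_unicode_text(out_path, good)

with open(out_path, "rb") as handle:
    starts_with_bom = handle.read(2) in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)

print(ascii(good), ascii(bad), starts_with_bom)
```

The point of the sketch is that the bytes have to be decoded with the charset they were actually sent in before anything is written out; once a character has been replaced during decoding, saving the file as Unicode cannot bring it back.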