Windows Vista Forums

Vista - problem download HTML in VBScript

Old 08-31-2009   #1 (permalink)


Vista Home Premium
 
 

problem download HTML in VBScript

'//***********************
'// Test of downloading HTML DOM object
'// Gets title OK as shown
'// Does not get title if WScript.echo is commented out
'// running on VISTA Home Premium Service Pack 1
'//***********************

dim fso
Set fso = CreateObject("Scripting.FileSystemObject")
dim logFile
Set logFile = fso.CreateTextFile("logTestHTML.txt", true)
dim URL
dim domHtml
'Set URL = "http://google.com"
logFile.writeLine("getting HTML: http://google.com")
Set domHtml= WScript.getObject("http://google.com")
WScript.echo("got HTML")
logFile.writeLine("title: " & domHtml.title)

Old 08-31-2009   #2 (permalink)
LJB


 
 

Re: problem download HTML in VBScript

This worked for me

Dim fso
Set fso = CreateObject("Scripting.FileSystemObject")
Dim logFile
Set logFile = fso.CreateTextFile("logTestHTML.txt", True)
Dim URL
Dim domHtml
URL = "http://google.com"
logFile.WriteLine ("getting HTML: " & URL)
Set domHtml = WScript.GetObject(URL)
logFile.WriteLine ("title: " & domHtml.Title)


Old 08-31-2009   #3 (permalink)
LJB


 
 

Re: problem download HTML in VBScript

I spoke too soon. It's not what you wanted.


Old 08-31-2009   #4 (permalink)
Tom Lavedas


 
 

Re: problem download HTML in VBScript

It seems to me that you're mistaking a .NET feature for something
that WScript's functions can do. Though it appears an HTMLDocument
object is being created by GetObject, as far as I can tell, it is
an empty document.

A common way to access the content of a web page in WScript is through
the Microsoft.XMLHTTP class, something like this ...

With CreateObject("Microsoft.XMLHTTP")
  ' Third argument False = synchronous, so .send waits for the response
  .open "GET", sURL, False
  .send
  sHTMLText = .responseText
End With
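
A slightly fuller sketch of the XMLHTTP approach, in case it helps. It assumes a synchronous request is acceptable and that only the page title is wanted. FetchHtml and ExtractTitle are hypothetical helper names, not part of any library, and the regular expression is a rough stand-in for a real HTML parser.

```vbscript
Option Explicit

' Hypothetical helper: returns the page source as text, or "" on failure.
Function FetchHtml(sURL)
    Dim oHttp
    FetchHtml = ""
    Set oHttp = CreateObject("Microsoft.XMLHTTP")
    On Error Resume Next
    oHttp.open "GET", sURL, False   ' False = synchronous request
    oHttp.send
    If Err.Number = 0 Then
        If oHttp.status = 200 Then FetchHtml = oHttp.responseText
    End If
    On Error GoTo 0
End Function

' Hypothetical helper: pulls the text between <title> and </title>.
Function ExtractTitle(sHtml)
    Dim oRe, oMatches
    ExtractTitle = ""
    Set oRe = New RegExp
    oRe.Pattern = "<title[^>]*>([\s\S]*?)</title>"
    oRe.IgnoreCase = True
    Set oMatches = oRe.Execute(sHtml)
    If oMatches.Count > 0 Then ExtractTitle = Trim(oMatches(0).SubMatches(0))
End Function

Dim sHtml
sHtml = FetchHtml("http://www.google.com")
If sHtml <> "" Then
    WScript.Echo "title: " & ExtractTitle(sHtml)
Else
    WScript.Echo "Request failed"
End If
```

Because the request is synchronous, no Sleep loop is needed. The trade-off is that you get raw text rather than a DOM, so anything beyond simple pattern matching needs more work.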

Another way is to instantiate IE and navigate to the page. Then the
complete DOM for the page can be accessed via DHTML, something like
this (to find all INPUT elements and collect information about
each) ...

sURL = "www.somewhere.com/somefolder/some.html"
With CreateObject("InternetExplorer.Application")
  .Navigate "http://" & sURL
  ' Wait until the page has finished loading (4 = READYSTATE_COMPLETE)
  Do Until .ReadyState = 4 : WScript.Sleep 100 : Loop
  With .document
    Set cControls = .all.tags("input")
    nControls = cControls.length
    nIdx = 0
    For Each control In cControls
      s = s & nIdx & ", " & control.type & vbNewLine
      nIdx = nIdx + 1 ' advance the index shown for each control
    Next
    wsh.echo "Number of Controls:", nControls, vbNewLine, s
  End With ' document
  .Quit ' close the hidden IE instance when done
End With ' IE

But it really depends what you are trying to accomplish.
_____________________
Tom Lavedas
Old 08-31-2009   #5 (permalink)
Paul Randall


 
 

Re: problem download HTML in VBScript

There are a lot of possibilities in play here.
1) You say:
'// Test of downloading HTML DOM object
but VBScript doesn't download objects for its use in this way.

I think you mean that you intend to create an Internet Explorer object
that loads http://google.com and then use that object to obtain info using
the IE object model.

If you have ever viewed complex web pages over a dialup connection, you
would know that it takes time, sometimes a long time, to download and
display a web page. You don't say how you invoke your script, but if it is
done with WScript rather than CScript, then the pause that occurs when you
click OK on the echo box probably allows enough time for the IE Object Model
to parse the downloaded web page and have the title available for your use.

To prove it to yourself, make the following changes near the end of your
script:

....
Set domHtml= WScript.getObject("http://google.com")

Dim sMsg
While domHtml.title = ""
    sMsg = sMsg & "Title not ready" & vbCrLf
    WScript.Sleep 1
Wend
sMsg = sMsg & "Title is " & domHtml.title
WScript.echo(sMsg)
logFile.writeLine("title: " & domHtml.title)

Note that the DOM for the object obtained with getObject is similar to, but
is significantly less complete than the DOM for the object obtained by using
CreateObject("InternetExplorer.Application") and then navigating that object
to the URL of interest. The latter has many more methods, properties,
events, etc. available for use by your script.

-Paul Randall


Old 09-01-2009   #6 (permalink)
Tom Lavedas


 
 

Re: problem download HTML in VBScript

You are indeed correct about the wait. I retract my original
response. The document is in fact available through this interface.
I am gob-smacked, as the English say. Why hasn't this been revealed
up to now? I've been at this for about ten years now and have never
seen this approach proposed or used.

Concerning your comment about the DOM, I'm not certain that is the
most substantial difference, as I have found that though the
documentation suggests a more limited interface, the properties of
individual elements within the retrieved document are still available,
as in ...

dim fso
'Set fso = CreateObject("Scripting.FileSystemObject")
dim logFile
'Set logFile = fso.CreateTextFile("logTestHTML.txt", true)
dim URL
dim domHtml
URL = "http://www.google.com"
wsh.echo "getting HTML: " & URL
Set domHtml = WScript.getObject(URL)
' Poll for up to about one second until the title is populated
for i = 1 to 10
  wsh.sleep 100
  if domHtml.title <> "" then exit for
next
if domHtml.title = "" then
  wsh.echo "Nope"
  wsh.quit 1
else
  WScript.echo "got HTML"
end if
wsh.echo "title:", domHtml.title, "Domain:", domHtml.Domain
wsh.echo "Anchors:"
for each el in domHtml.All
  if el.tagName = "A" then
    wsh.echo el.href
  end if
next
wsh.echo "Images"
for each el in domHtml.All
  if el.tagName = "IMG" then
    wsh.echo el.src ' IMG elements expose src, not href
  end if
next

This example is best run at the command prompt with cscript.exe.

The BIG difference I see in doing it this way is that there is no
console to contain the document. Therefore, anything related to
displaying the content is unavailable, such as the displayed sizes of
elements (unless they are explicitly defined in the tag, I suppose).
Also, no events are instantiated.

However, if the point is to collect a copy of the page for parsing,
this appears to me to be a completely valid and useful approach.
Some incomplete testing seems to indicate the approach is not likely
to be too useful for collecting binary objects, such as the Image
files, but I still find it amazing that this has been there all this
time and no one told me.
_____________________
Tom Lavedas
Old 09-01-2009   #7 (permalink)
mayayana


 
 

Re: problem download HTML in VBScript



> You are indeed correct about the wait. I retract my original
> response. The document is in fact available through this interface.
> I am gob-smacked, as the English say. Why hasn't this been revealed
> up to now.

I never noticed it either, but looking it up
in Dino Esposito's book I see that he talked
about it. (p. 50) It probably wasn't noticed
because it really isn't anything new. One can
use WScript.GetObject on any file where the
file extension is associated with a program
that provides an automation object with access
to the file as an object. In other words,

WScript.GetObject("C:\somefile.html")
or
WScript.GetObject("http://www.somewhere.com/somefile.html")

are just shorthand versions of creating IE and
navigating to the page. I haven't tested it
to confirm Paul's note that it returns a limited
document object. According to Dino Esposito
it returns a normal HTMLDocument object. Either
way, it doesn't seem to be of much value. It saves
a few lines of code but it also has at least 3
disadvantages:

* It cuts off access to the running IE process that
opened the page.

* It results in confusing code, insofar as it's not
explicitly clear that IE is involved.

* It will only work in cases where the given file
extension is registered to IE as the default program.
I don't know how that works with something like
"google.com", but WScript must be interpreting
that as the default webpage file type, since it knows
enough to open it with IE. So if Firefox or Opera is
the default browser then presumably the code won't work.


Old 09-01-2009   #8 (permalink)
Paul Randall


 
 

Re: problem download HTML in VBScript

Hi, Tom
I think you have to be in the right place at the right time, or do a
geek-read of script56.chm to hear about getting access to the DOM of a URL
this way. I remember one thread in this newsgroup:
http://groups.google.com/g/78d7fef5/...3564e2759f7ede

I am gob-smacked that my groups.google.com search for:
getObject "paul randall" "michael harris" group:*.scripting.vbscript
yields an empty list of hits when sorted by date, but a list of five hits
when sorted by relevance.

-Paul Randall


Old 09-04-2009   #9 (permalink)


Vista Home Premium
 
 

Re: problem download HTML in VBScript

Thanks Paul and Tom - I have been internetless for a couple of days and just caught up with your comments. Using WScript.sleep to delay solved the problem. I had tried another way to delay, but for some reason it didn't work. However, waiting an arbitrary amount of time seems like a hack (the full script pulls several hundred pages from the internet and searches them for data to put into a spreadsheet). I tried testing readyState without success. I am running the script using cscript now that I don't need the WScript.echo to cause a delay. I am going to try XMLHTTP when I get time.
Appreciate the help
Captain Kris living aboard a sailing catamaran in Grenada
Old 09-04-2009   #10 (permalink)
Tom Lavedas


 
 

Re: problem download HTML in VBScript

You need to test that the domHTML.readyState = "complete" because the
object represents the document, not the browser.
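
A minimal sketch of that test, assuming the object returned by WScript.GetObject exposes readyState as a string the way an HTMLDocument does. WaitForDocument and the 30-second limit are hypothetical choices for illustration, not anything from the original script.

```vbscript
Option Explicit

' Hypothetical helper: polls the document until it reports "complete"
' or until nTimeoutMs milliseconds have elapsed. Returns True on success.
Function WaitForDocument(oDoc, nTimeoutMs)
    Dim nWaited
    nWaited = 0
    Do While oDoc.readyState <> "complete" And nWaited < nTimeoutMs
        WScript.Sleep 100
        nWaited = nWaited + 100
    Loop
    WaitForDocument = (oDoc.readyState = "complete")
End Function

Dim URL, domHtml
URL = "http://google.com"
Set domHtml = WScript.GetObject(URL)

If WaitForDocument(domHtml, 30000) Then
    WScript.Echo "title: " & domHtml.title
Else
    WScript.Echo "Timed out waiting for " & URL
End If
```

Polling the document's own state replaces the arbitrary fixed delay, and the timeout keeps a script that walks several hundred pages from hanging on one that never finishes loading.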
___________________
Tom Lavedas