Extract string from web page

Old 01-10-2007   #1 (permalink)
Brian Hoort
Guest


 

Extract string from web page

I'm trying to extract the MD5 sum from a web page into a var. I'm
close! So far I have:

# Piping the output directly into select-string didn't work for some
reason... so use temp file
(new-object System.Net.WebClient).DownloadString
"http://www.symantec.com/avcenter/download/pages/US-SAVCE.html" >
out.txt

# Note that the regex [0-9A-F]{32} didn't take, so PSH must not handle
extended regex?
get-content P:\bin\out.txt | select-string -quiet -List -pattern
[0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F][0-9A-F]

This gets me down to the line I want, but I only want the MD5. If I
put the result of the get-content line into a var $a, I can look at
$a.pattern, but that returns my regex and not the MD5 sum. I know this
is a stupid question, but I can't figure out how to break the MD5 sum
out of the line!

Thanks for your time,

bh


Old 01-10-2007   #2 (permalink)
Keith Hill [MVP]
Guest


 

Re: Extract string from web page

This will get you closer:

169> (new-object Net.WebClient).DownloadString("http://www.symantec.com/avcenter/download/pages/US-SAVCE.html") -match "(?<md5hash>[0-9a-fA-F]{32})"
True
170> $matches

Name                           Value
----                           -----
md5hash                        DAE22DC8610A895DCF81BEBAC3F2EEDF
0                              DAE22DC8610A895DCF81BEBAC3F2EEDF

Unfortunately that only matches the first hash. So what's the regex trick to get this to match all instances of hashes on this page?
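
For the original goal of getting that one hash into a variable, the named group can be read straight out of $matches. A minimal sketch, assuming a made-up sample string in place of the live page (the variable names $page and $md5 and the sample text are illustrative only):

```powershell
# Hypothetical sample text standing in for the downloaded page.
$page = 'File: example.exe  MD5: DAE22DC8610A895DCF81BEBAC3F2EEDF  Size: 1 MB'

# With a single string on the left, -match returns True/False and,
# on success, fills the automatic $matches hashtable.
if ($page -match '(?<md5hash>[0-9a-fA-F]{32})') {
    # Read the named capture group by its name.
    $md5 = $matches['md5hash']
} else {
    $md5 = $null
}

$md5
```

Checking the result of -match before reading $matches matters because $matches keeps its old contents when a later match fails.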

--
Keith
Old 01-11-2007   #3 (permalink)
Rob Campbell
Guest


 

Re: Extract string from web page

Try this:

$md5_regex = [regex] "(?<md5hash>[0-9a-fA-F]{32})"
$page = (new-object Net.WebClient).DownloadString("http://www.symantec.com/avcenter/download/pages/US-SAVCE.html")

# Start with the first match, then walk forward until no match is left
$md5 = $md5_regex.match($page)

while ($md5.success)
{
    $md5.value
    $md5 = $md5.nextmatch()
}
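
If the hashes are needed later rather than just printed, the same loop can collect them into an array. The sketch below is only an illustration under assumptions: the sample text is invented, and the \b word boundaries are an addition of mine (not part of the pattern above) so that a longer hex string, such as a 40-digit one, is skipped instead of being cut down to 32 digits.

```powershell
# Hypothetical sample text: two 32-digit hashes and one 40-digit hex string.
$page = @'
a.exe DAE22DC8610A895DCF81BEBAC3F2EEDF
b.exe 0123456789abcdef0123456789ABCDEF
c.exe 0123456789abcdef0123456789abcdef01234567
'@

# \b on both sides requires the 32 hex digits to stand alone as a word.
$md5_regex = [regex] '\b[0-9a-fA-F]{32}\b'

$hashes = @()
$md5 = $md5_regex.Match($page)
while ($md5.Success) {
    $hashes += $md5.Value      # keep each match instead of only printing it
    $md5 = $md5.NextMatch()
}

$hashes
```

With this sample, only the first two lines contribute a hash; the 40-digit string on the third line is not matched.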
Old 01-11-2007   #4 (permalink)
Nick Howell
Guest


 

Re: Extract string from web page

Or, as a one-liner:

([regex]"(?<md5hash>[0-9a-fA-F]{32})").Matches((new-object System.Net.WebClient).DownloadString("http://www.symantec.com/avcenter/download/pages/US-SAVCE.html")) |% { $_.Value }
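
For anyone who would rather stay with select-string, as in the original attempt: from PowerShell 2.0 on it has an -AllMatches switch, and each result carries a Matches collection. A rough sketch, assuming PowerShell 2.0 or later and an invented sample string in place of the download. Note the quoted pattern; the unquoted [0-9A-F]{32} in the first post probably failed because the {32} part was parsed as a script block rather than as part of the pattern.

```powershell
# Hypothetical sample text standing in for the downloaded page.
$page = 'x DAE22DC8610A895DCF81BEBAC3F2EEDF y 0123456789abcdef0123456789ABCDEF z'

# Quote the pattern so the braces reach the regex engine untouched.
# @() keeps the result an array even when only one hash is found.
$hashes = @(
    $page |
        Select-String -Pattern '[0-9a-fA-F]{32}' -AllMatches |
        ForEach-Object { $_.Matches } |
        ForEach-Object { $_.Value }
)

$hashes
```

This also shows that piping a string directly into select-string works, so the temp file is not needed.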

Nick