#1

Extracting Links from an HTML document using a Script

Folks:

I have an HTML document that is about 100 pages long. I assembled this document from the "Articles By This Author" section of the following web page:
http://www.tigersharktrading.com/authors/23/Harry-Boxer

Scattered throughout this document are many links to the web. The links of interest to me all start with the ">>" characters, as seen at TigerSharkTrading, then the name of the article is given as a link.

* How can I quickly extract these links and transfer same to a new file?
* Is there some type of script that can quickly accomplish this task?

Thanks,
JoJo.
#2

Re: Extracting Links from an HTML document using a Script

JoJo wrote:
> Folks:
>
> I have an HTML document that is about 100 pages long. I assembled this
> document from the "Articles By This Author" section of the following web page:
> http://www.tigersharktrading.com/authors/23/Harry-Boxer
>
> Scattered throughout this document are many links to the web. The links of
> interest to me all start with the ">>" characters, as seen
> at TigerSharkTrading, then the name of the article is given as a link.
>
> * How can I quickly extract these links and transfer same to a new file?
> * Is there some type of script that can quickly accomplish this task?

I suggest using the "all" collection of the document object. Let's say that your links appear in an anchor (A) tag. Then you could get your collection of anchor tags like this:

    document.all.tags("A")

To get the tags you want, you could walk the list with some sort of loop (your choice; try "For Each"). The individual items would be addressed as:

    document.all.tags("A")(i)   ' where i is your index

And the number of items would be:

    document.all.tags("A").Length

In your discussion, you mentioned the URLs, which probably appear as the "href" attribute of the "A" tag. My guess is that you can get the URL as:

    document.all.tags("A")(i).href

cheers, jw
____________________________________________________________
You got questions? WE GOT ANSWERS!!! ..(but, no guarantee the answers will be applicable to the questions)
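For readers who do not have Internet Explorer's DOM available, the same "walk every anchor and read its href" idea can be sketched in Python with the standard-library `html.parser` module. This swaps in a different technique from the `document.all` collection described above, and the names `AnchorCollector` and `extract_hrefs` are made up for illustration, so treat it as a rough sketch rather than a drop-in answer.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect the href of every <a> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag and attribute names in lowercase,
        # so <A HREF="..."> and <a href="..."> are treated alike.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:  # skip named anchors that have no href
                self.hrefs.append(href)


def extract_hrefs(html_text):
    """Return a list of href values found in the given HTML text."""
    parser = AnchorCollector()
    parser.feed(html_text)
    parser.close()
    return parser.hrefs


if __name__ == "__main__":
    # Hypothetical sample markup, not taken from the page in question.
    sample = '<p><A HREF="http://example.com/one">One</A> <a href="/two">Two</a></p>'
    for href in extract_hrefs(sample):
        print(href)
```

To run it against a saved page, read the file into a string first and pass that string to `extract_hrefs`.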
#3

Re: Extracting Links from an HTML document using a Script

"JoJo" <swiftTrades@xxxxxx> wrote
> I have an HTML document that is about 100 pages long. I assembled this
> document from the "Articles By This Author" section of the following web page:
> http://www.tigersharktrading.com/authors/23/Harry-Boxer
>
> Scattered throughout this document are many links to the web. The links of
> interest to me all start with the ">>" characters, as seen
> at TigerSharkTrading, then the name of the article is given as a link.
>
> * How can I quickly extract these links and transfer same to a new file?
> * Is there some type of script that can quickly accomplish this task?

As indicated by mr_unreliable, you will probably want to use the DOM objects to parse the document. I was just going to add that it appears all the links of interest are contained in SPAN objects that have a class name of 'title'.

So, instead of grabbing all anchors, you could grab all SPAN objects, check each for a className of "title", and then do another grab within that object for all anchors (of which there is only one, the one you want).

Something like this (warning: air code):

    For Each sp In document.all.tags("SPAN")
        If sp.className = "title" Then
            For Each ref In sp.all.tags("A")
                ' Save the href to a new file, for example:
                AppendToFile ref.href
            Next
        End If
    Next

Your own AppendToFile routine might as well make the file an HTML document, so you can load it in a browser and click on any interesting links.

Have fun!
LFS
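The SPAN-then-anchor filter described above can also be tried outside the browser. Below is a rough Python sketch using the standard-library `html.parser` instead of the IE DOM, so it is a swapped-in technique and not the air code itself. The names `TitleLinkParser`, `extract_title_links` and `links_to_html` are hypothetical, and the sketch assumes the wanted anchors sit inside a `<span>` whose class list includes `title` (a slightly looser test than the exact `className = "title"` comparison).

```python
from html import escape
from html.parser import HTMLParser


class TitleLinkParser(HTMLParser):
    """Collect (href, link text) for anchors nested inside <span class="title">."""

    def __init__(self):
        super().__init__()
        self.span_stack = []  # one bool per open <span>: is it a "title" span?
        self.current = None   # [href, text parts] while inside a wanted anchor
        self.links = []       # finished (href, text) pairs

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span":
            classes = (attrs.get("class") or "").split()
            self.span_stack.append("title" in classes)
        elif tag == "a" and any(self.span_stack) and attrs.get("href"):
            self.current = [attrs["href"], []]

    def handle_data(self, data):
        # Only text that appears inside a wanted anchor is kept.
        if self.current is not None:
            self.current[1].append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self.span_stack:
            self.span_stack.pop()
        elif tag == "a" and self.current is not None:
            href, parts = self.current
            self.links.append((href, "".join(parts).strip()))
            self.current = None


def extract_title_links(html_text):
    """Return [(href, text), ...] for anchors inside title spans."""
    parser = TitleLinkParser()
    parser.feed(html_text)
    parser.close()
    return parser.links


def links_to_html(links):
    """Build a small HTML page listing the links, ready to open in a browser."""
    items = "\n".join(
        '<li><a href="{}">{}</a></li>'.format(escape(href, quote=True), escape(text or href))
        for href, text in links
    )
    return "<html><body>\n<ul>\n{}\n</ul>\n</body></html>\n".format(items)


if __name__ == "__main__":
    # Hypothetical sample markup, not copied from the real page.
    sample = (
        '<span class="title">&gt;&gt; <a href="/articles/1">First article</a></span>'
        '<span class="other"><a href="/skip">Not wanted</a></span>'
    )
    print(links_to_html(extract_title_links(sample)))
```

Writing the string returned by `links_to_html` to a file gives the clickable HTML document suggested above; the output file name is up to you.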