On Thu, May 22, 2003 at 12:20:47PM -0400, Eric Schwartz wrote:
> Hello all,
> Thank you for your help in this matter. I have decided to move
> forward with using "html scraping". I am using this code from a book on
> perl, and I can't seem to get it to work. I tried to modify it to search
> specifically for estimated pages remaining, and I want it to look for the
> group of numbers that is right after, but I don't seem to be doing anything
> right. When I run this code it prints "here it is" and nothing
> else. Maybe because it just finds a blank space after the designated
> search, I'm not sure. Here is a small clip of the html I am looking at:
>
> <td width="90%">
> <p align="left"><font face="Arial,Helvetica" size="1" color="#000000">
> Estimated Pages Remaining:
> </font></p>
> </td>
> <td width="10%">
> <p align="right"><font face="Arial,Helvetica" size="1" color="#000000">
> 6052
> </font></p>
> </td>
> </tr>

Keep it simple:
Pipe your html screen into this perl script
(or modify the script to read in the lines itself).

Note - this relies on your html file ALWAYS having the above format.

==============CUT HERE =========================
#!/usr/bin/perl

while (<STDIN>) {
    if ( /Estimated Pages Remaining:/ ) {
        # Read in five more lines and print out the fifth one.
        $_=<STDIN> ;
        $_=<STDIN> ;
        $_=<STDIN> ;
        $_=<STDIN> ;
        $_=<STDIN> ;
        print "here it is: ";
        print $_ ;
    }
}
==============CUT HERE =========================

Uncompiled, untested, unwarranted, etc. Use at your own risk, FWIW, IANAL, IMHO, blah blah blah.

-- 
Jeff Kinz, Open-PC, Emergent Research, Hudson, MA. jkinz at kinz.org
copyright 2003. Use is restricted. Any use is an acceptance of the offer at
http://www.kinz.org/policy.html.
Don't forget to change your password often.

----- End forwarded message -----
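
If the number of lines between the label and the count might vary, a slightly more tolerant variant is sketched below. It is only a sketch, not part of the original reply: the helper name pages_remaining is made up for illustration, and it assumes the count still sits alone on its own line (possibly with surrounding whitespace), as in the quoted html clip. Instead of counting five lines, it scans forward from the label until it sees a line that holds nothing but digits.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): given a filehandle,
# return the first digits-only line that follows the label, or undef.
sub pages_remaining {
    my ($fh) = @_;
    while (my $line = <$fh>) {
        next unless $line =~ /Estimated Pages Remaining:/;
        # Label found: scan forward for a line holding only digits.
        while (defined($line = <$fh>)) {
            return $1 if $line =~ /^\s*(\d+)\s*$/;
        }
    }
    return;    # label or number not found
}

# Used as a filter, e.g.:  ./pages.pl < status.html
my $count = pages_remaining(\*STDIN);
print "here it is: ", (defined $count ? $count : "not found"), "\n";
```

This trades the fixed line count for a pattern match, so extra blank lines or tags between the label and the number should not matter, but a page that puts the number on the same line as a tag would still need a different pattern.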