
BLU Discuss list archive



Perl question



On Thu, May 22, 2003 at 12:20:47PM -0400, Eric Schwartz wrote:
> Hello all,
>          Thank you for your help in this matter.  I have decided to move 
> forward with using "HTML scraping".  I am using this code from a book on 
> Perl, and I can't seem to get it to work.  I tried to modify it to search 
> specifically for "Estimated Pages Remaining", and I want it to look for the 
> group of numbers that comes right after, but I don't seem to be doing anything 
> right.  When I run this code it prints "here it is" and nothing 
> else.  Maybe that is because it just finds a blank space after the designated 
> search string; I'm not sure.  Here is a small clip of the HTML I am looking at:
> 
> <td width="90%">
> <p align="left"><font face="Arial,Helvetica" size="1" color="#000000">
> Estimated Pages Remaining:
> </font></p>
> </td>
> <td width="10%">
> <p align="right"><font face="Arial,Helvetica" size="1" color="#000000">
> 6052
> </font></p>
> </td>
> </tr>

Keep it simple:

Pipe your HTML page into this Perl script
(or modify the script to read in the lines itself).
Note: this relies on your HTML file ALWAYS having the above format.


==============CUT HERE =========================
#!/usr/bin/perl
use strict;
use warnings;

while (<STDIN>) {
    if (/Estimated Pages Remaining:/) {

        # The number sits on the fifth line after the label,
        # so read in five more lines and print out the fifth one.
        my $line;
        $line = <STDIN> for 1 .. 5;
        last unless defined $line;    # input ended early
        print "here it is: ", $line;
    }
}
==============CUT HERE =========================

Uncompiled, untested, unwarrantied, etc. Use at your own risk.
FWIW, IANAL, IMHO, blah blah blah.
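
If the page layout ever shifts by a line, counting lines will grab the wrong
thing. Here is a rough sketch of a less brittle variant. It assumes the number
is always the first run of digits after the label, with nothing but tags and
whitespace in between. The sub name pages_remaining and the inline sample are
made up for illustration; they are not from the book or the original page.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the first run of digits that follows the label, skipping any
# tags and whitespace in between. Returns undef if nothing matches.
# (pages_remaining is a hypothetical helper name.)
sub pages_remaining {
    my ($html) = @_;
    if ($html =~ /Estimated Pages Remaining:(?:\s|<[^>]*>)*(\d+)/) {
        return $1;
    }
    return undef;
}

# Sample input copied from the HTML clip quoted above.
my $sample = <<'HTML';
<td width="90%">
<p align="left"><font face="Arial,Helvetica" size="1" color="#000000">
Estimated Pages Remaining:
</font></p>
</td>
<td width="10%">
<p align="right"><font face="Arial,Helvetica" size="1" color="#000000">
6052
</font></p>
</td>
</tr>
HTML

my $pages = pages_remaining($sample);
print "here it is: ", (defined $pages ? $pages : "not found"), "\n";
# prints "here it is: 6052"
```

To run it on the real page, slurp the input into one string first, e.g.
`my $html = do { local $/; <STDIN> };`, and pass that to the sub instead of
the sample.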


-- 
Jeff Kinz, Open-PC, Emergent Research,  Hudson, MA.  jkinz at kinz.org
copyright 2003.  Use is restricted. Any use is an 
acceptance of the offer at http://www.kinz.org/policy.html.
Don't forget to change your password often.




