
BLU Discuss list archive



Perl question



Hello all,
         Thank you for your help in this matter.  I have decided to move 
forward with "HTML scraping".  I am using this code from a book on 
Perl, and I can't seem to get it to work.  I tried to modify it to search 
specifically for "Estimated Pages Remaining:", and I want it to capture the 
group of numbers that comes right after, but I don't seem to be doing anything 
right.  When I run this code it prints "here it is:" and nothing 
else.  Maybe that is because it just finds a blank space after the designated 
search string; I'm not sure.  Here is a small clip of the HTML I am looking at:

<td width="90%">
<p align="left"><font face="Arial,Helvetica" size="1" color="#000000">
Estimated Pages Remaining:
</font></p>
</td>
<td width="10%">
<p align="right"><font face="Arial,Helvetica" size="1" color="#000000">
6052
</font></p>
</td>
</tr>


MY CODE:

my $html = get("ipaddress");
$html =~ m{Estimated Pages Remaining:<td width="90%"> ([\d,]+) </font><br>};
my $blkpgsrem=$1;
#$blkpgsrem =~ tr[,][]d;
print "here it is:$blkpgsrem\n";
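
A likely reason the code above prints nothing after "here it is:" is that the pattern never matches: it expects `<td width="90%">` and `</font><br>` to sit right next to the label, but in the HTML clip the label is followed by closing tags, a new `<td>`, and newlines before the number. Since the match is not checked, `$1` is simply left unset. The following is a minimal sketch of one way around that, assuming the page always looks like the clip; the HTML is pasted into a string here so the example runs on its own, where the real script would keep using `get()`.

```perl
use strict;
use warnings;

# Sample HTML copied from the clip above. In the real script this string
# would come from get() (LWP::Simple) instead of a here-document.
my $html = <<'END_HTML';
<td width="90%">
<p align="left"><font face="Arial,Helvetica" size="1" color="#000000">
Estimated Pages Remaining:
</font></p>
</td>
<td width="10%">
<p align="right"><font face="Arial,Helvetica" size="1" color="#000000">
6052
</font></p>
</td>
</tr>
END_HTML

# After the label, skip any run of whitespace and tags, then capture the
# first group of digits (commas allowed, as in the original pattern).
my $blkpgsrem;
if ($html =~ m{Estimated Pages Remaining:(?:\s|<[^>]*>)*([\d,]+)}) {
    $blkpgsrem = $1;
    $blkpgsrem =~ tr/,//d;    # drop thousands separators, if any
    print "here it is:$blkpgsrem\n";
} else {
    # Only trust $1 after a successful match.
    print "no match\n";
}
```

Testing the result of the match before using `$1` also makes a failed match visible instead of printing an empty value.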


PS:
I have also found this code in the book I am looking at, but I can't seem 
to understand it.

$text = qq(<a href="file.html"><b>Dog</b></a>Woof\nWoof</p>);
($file, $title, $summary) =
     $text =~ m{<a href="(.*?)"><b>(.*?)</b></a>\s*(.*?)</p>};
It looks like it is searching for multiple things and assigning each match 
to a different variable; however, I do not know how to make this 
apply to my HTML.
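
As a rough sketch of what the book's snippet does: in list context a successful match returns the text captured by each pair of parentheses, in order, so the three captures land in the three variables. Two things here are assumptions rather than part of the original: the `/s` flag (without it `.` does not match the newline inside "Woof\nWoof", so the match as printed would fail), and the simplified `$row` string, which is a hypothetical cut-down version of the printer page used only to show the same idea with a label and a value.

```perl
use strict;
use warnings;

# The book's example string: a link, a bold title, then text up to </p>.
my $text = qq(<a href="file.html"><b>Dog</b></a>Woof\nWoof</p>);

# Three pairs of parentheses, three variables on the left-hand side.
# /s lets "." match the newline in the summary text.
my ($file, $title, $summary) =
    $text =~ m{<a href="(.*?)"><b>(.*?)</b></a>\s*(.*?)</p>}s;

print "file=$file title=$title\n";

# Same technique on a simplified (hypothetical) two-cell table row:
# capture the label before the colon and the digits in the next cell.
my $row = qq(<td>Estimated Pages Remaining:</td>\n<td>6052</td>);
my ($label, $value) = $row =~ m{<td>(.*?):</td>\s*<td>(\d+)</td>};

print "$label = $value\n";
```

If the match fails, the list assignment leaves every variable undefined, so it is worth checking at least one of them before use.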

Again, I appreciate all the help you guys have been giving me.  Thank you.

Eric


At 03:15 PM 5/22/2003 +0000, dsr at tao.merseine.nu wrote:
>On Thu, May 22, 2003 at 12:00:58AM -0400, Bill Bogstad wrote:
> >
> >
> > Assuming Eric is continuing the previous line then #2 above is false.
> > In a scalar context, the (implicit) match operator returns the
> > number of strings captured by the regexp.  In a list context (this
> > case), it returns a list consisting of all of the matched strings.
> > Eric is doing the equivalent of assigning $1 to $etapagerem after the
> > regexp matching completes.
> >
> > As for the regexp issue, I agree that his regexp is unlikely to do
> > anything useful.  However, I believe that it is well formed.  I
> > interpret it as follows:
>
>Thank you, I stand corrected.
>
>-dsr-
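
For reference, a small sketch of the context behaviour discussed in the quoted message. One caveat, going by the perlop documentation rather than the quote: without `/g`, a match in scalar context returns true or false, not a count of captures; the count can be obtained by assigning the list result through an empty list. The sample string and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample string.
my $str = "pages: 6052 of 9000";

# Scalar context: the result is a true/false value (1 on success).
my $ok = $str =~ /(\d+) of (\d+)/;

# List context: the result is the list of captured strings ($1, $2, ...).
my ($left, $total) = $str =~ /(\d+) of (\d+)/;

# Assigning through an empty list yields the number of captures.
my $count = () = $str =~ /(\d+) of (\d+)/;

print "ok=$ok left=$left total=$total count=$count\n";
```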





BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.




Boston Linux & Unix / webmaster@blu.org