
BLU Discuss list archive



Perl question



fyi --

What you are trying to do is "web scraping"... if I am telling you something
you already know, my apologies. :-)
If you didn't know the term, well, now you'll have an easier time if you
need to Google for more code examples.
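
For what it's worth, here is a minimal sketch of one way to keep each
"Estimated Pages Remaining" number attached to its own cartridge: match the
cartridge name and the number in a single pattern, and loop with /g so the
next search resumes after the previous match. Several things here are
assumptions, not part of the original script: the sample page lives in a
here-document instead of being fetched, the CYAN block and its number are
invented for illustration (only the BLACK values appear in the excerpt
quoted below), and the names @order and %pages are made up.

```perl
use strict;
use warnings;

# Sample input. The BLACK values come from the quoted excerpt; the CYAN
# block and its number are invented purely for illustration.
my $buffer = <<'HTML';
<td valign="top"><font face="Arial,Helvetica" size="1">BLACK
CARTRIDGE<br>HP Part Number:     HP C9720A</font></td>
<td width="90%"><p align="left"><font size="1">
Estimated Pages Remaining:
</font></p></td>
<td width="10%">
<p align="right"><font face="Arial,Helvetica" size="1" color="#000000">
6798
</font></p>
</td>
<td valign="top"><font face="Arial,Helvetica" size="1">CYAN
CARTRIDGE<br>HP Part Number:     (made up)</font></td>
<td width="90%"><p align="left"><font size="1">
Estimated Pages Remaining:
</font></p></td>
<td width="10%">
<p align="right"><font size="1">
5120
</font></p>
</td>
HTML

my @order;   # cartridge names in the order they appear on the page
my %pages;   # cartridge name => estimated pages remaining

while ($buffer =~ /\b([A-Z]+)\s+CARTRIDGE\b   # colour name, e.g. BLACK
                   .*?                         # shortest run up to the label
                   Estimated\s+Pages\s+Remaining:
                   \s*(?:<[^>]*>\s*)*          # skip tags and whitespace
                   (\d+)                       # the number itself
                  /gsx) {
    push @order, $1;
    $pages{$1} = $2;
}

print "\nHP 4600 PRINTER STATUS\n\n";
print "$_ CARTRIDGE: Estimated Pages Remaining: $pages{$_}\n" for @order;
```

In the real script, $buffer would come from get() as in the quoted code
rather than from the here-document. A regex like this is fragile if the
printer's page layout changes; an HTML parsing module from CPAN (for
example HTML::TokeParser) is usually the sturdier route for anything
beyond a quick status check.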

-Scott




----- Original Message ----- 
From: "Eric Schwartz" <schwartz at ll.mit.edu>
To: <discuss at blu.org>
Sent: Wednesday, May 21, 2003 4:52 PM
Subject: Perl question


> Hello all,
>
> I have a Perl programming question; any help you could offer would be
> greatly appreciated.
> I am trying to search an HTML page for specific data.  I have bolded the
> text that I want to separate.  The problem is that the next piece of HTML
> after this is for the Cyan cartridge.  I want to save BLACK CARTRIDGE:
> Estimated Pages Remaining: 6798, then after that do the Cyan and so
> forth.  I don't know how to modify the code you wrote for me to save Black
> Cartridge, then save Estimated Pages Remaining and the number after it,
> without confusing it with the "cyan" Estimated Pages Remaining.  I hope this
> question is not too confusing; as I am new to programming, I appreciate all
> your help.  Thanks again.
>
> Here is an excerpt of the HTML:
>
> <td valign="top"><font face="Arial,Helvetica" size="1">BLACK
> CARTRIDGE<br>HP Part Number:     HP C9720A</font>
>                                                  </td>
>                                                  <td valign="top"><font
> face="Arial,Helvetica" size="1">86%<br></font>
>                                                  </td>
>                                          </tr>
>                                          <tr>
>                                                  <td width="10%"><table
> border="1" width="100%"><tr><td><img src="images/Empty_Supply_Gif.gif"
> alt="">
>                                                  </td></tr></table></td>
>                                                  <td width="80%"><table
> border="1" cellspacing="0" width="100%">
> <tr>
> <td>
> <table  border="0" cellspacing="0" width="86%" height="24"
> bgcolor="#000000" valign="top">
> <tr>
> <td>&nbsp;</td>
> </tr>
> </table>
> </td>
> </tr>
> </table>
>                                                  </td>
>                                                  <td width="10%"><table
> border="1" width="100%"><tr><td><img src="images/Full_Supply_Gif.gif"
> alt="">
>                                                  </td></tr></table></td>
>                                          </tr>
>                                  </table></td></tr>
>                                  <tr><td><table border="0"
> bgcolor="#FFFFFF" width="100%" cellpadding="0" cellspacing="0"
> summary="This table details consumable supply information">
>                                          <tr>
> <td width="90%">
> <p align="left"><font face="Arial,Helvetica" size="1" color="#000000">
> Estimated Pages Remaining:
> </font></p>
> </td>
> <td width="10%">
> <p align="right"><font face="Arial,Helvetica" size="1" color="#000000">
> 6798
> </font></p>
> </td>
> </tr>
>
> ------------------------------------------------------------------------
> Here is a piece of the Perl that I am using, which does not seem to be
> working.  Remember, I need to pull all of the bolded text in order.
>
> $buffer = get('http://ipaddress');
>
> print "\nHP 4600 PRINTER STATUS\n\n";
>
>
> ($etapagerem) = $buffer
>          =~ /BLACK CARTRIDGE\s*(?:<.*?>\s*)/s;
>
> print "Estimated Pages Remaining: $etapagerem\n";
>
>
> Thanks for all your help.
>
> Eric
> schwartz at ll.mit.edu
>
> _______________________________________________
> Discuss mailing list
> Discuss at blu.org
> http://www.blu.org/mailman/listinfo/discuss





BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.
