Hello all,

Thank you for your help in this matter. I have decided to move forward with using "html scraping". I am using this code from a book on Perl, and I can't seem to get it to work. I tried to modify it to search specifically for "Estimated Pages Remaining" and to capture the group of numbers that comes right after it, but I don't seem to be doing anything right. When I run this code it prints "here it is:" and nothing else. Maybe it just finds a blank space after the designated search; I'm not sure.

Here is a small clip of the HTML I am looking at:

    <td width="90%">
    <p align="left"><font face="Arial,Helvetica" size="1" color="#000000">
    Estimated Pages Remaining:
    </font></p>
    </td>
    <td width="10%">
    <p align="right"><font face="Arial,Helvetica" size="1" color="#000000">
    6052
    </font></p>
    </td>
    </tr>

MY CODE:

    my $html = get("ipaddress");
    $html =~ m{Estimated Pages Remaining:<td width="90%"> ([\d,]+) </font><br>};
    my $blkpgsrem=$1;
    #$blkpgsrem =~ tr[,][]d;
    print "here it is:$blkpgsrem\n";

PS: I have also found this code in the book I am looking into, but I can't seem to understand it.

    $text = qq(<a href="file.html"><b>Dog</b></a>Woof\nWoof</p>);
    ($file, $title, $summary) = $text =~ m{<a href="(.*?)"><b>(.*?)</b></a>\s*(.*?)</p>};

It looks like it is searching for multiple things and assigning a different variable to each search parameter; however, I do not know how to make this apply to my HTML.

Again, I appreciate all the help you guys have been giving me. Thank you.

Eric

At 03:15 PM 5/22/2003 +0000, dsr at tao.merseine.nu wrote:
>On Thu, May 22, 2003 at 12:00:58AM -0400, Bill Bogstad wrote:
> >
> > Assuming Eric is continuing the previous line then #2 above is false.
> > In a scalar context, the (implicit) match operator returns the
> > number of strings captured by the regexp. In a list context (this
> > case), it returns a list consisting of all of the matched strings.
> > Eric is doing the equivalent of assigning $1 to $etapagerem after the
> > regexp matching completes.
> >
> > As for the regexp issue, I agree that his regexp is unlikely to do
> > anything useful. However, I believe that it is well formed. I
> > interpret it as follows:
>
>Thank you, I stand corrected.
>
>-dsr-
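
Editor's note: one likely reason only "here it is:" gets printed is that the pattern never matches. It expects `<td width="90%">` directly after the label and `</font><br>` after the number, but in the posted HTML several closing and opening tags sit between the label and the number, and no `<br>` follows it, so `$1` is never set. The sketch below is one possible way around that, not a definitive fix: it skips lazily over whatever lies between the label and the first number that directly follows a tag. The helper name `pages_remaining`, the `$sample` string, and the `/s` flags are assumptions added for illustration. The second half shows the book's list-context idiom, where each pair of parentheses fills one variable on the left; note that `/s` is added there as well, because without it `.` will not cross the newline in "Woof\nWoof" and the match fails.

```perl
use strict;
use warnings;

# Hypothetical helper: return the number that follows the
# "Estimated Pages Remaining:" label, or undef if it is not found.
sub pages_remaining {
    my ($html) = @_;

    # After the label, skip lazily over tags and whitespace (/s lets "."
    # cross newlines) until a run of digits/commas that sits between a
    # ">" and the next "<". In list context the match returns its captures.
    my ($count) = $html =~ m{Estimated\s+Pages\s+Remaining:.*?>\s*([\d,]+)\s*<}s;

    return unless defined $count;   # no match: undef in scalar context
    $count =~ tr/,//d;              # drop thousands separators, if any
    return $count;
}

# Sample copied from the HTML clip in the message above.
my $sample = <<'HTML';
<td width="90%">
<p align="left"><font face="Arial,Helvetica" size="1" color="#000000">
Estimated Pages Remaining:
</font></p>
</td>
<td width="10%">
<p align="right"><font face="Arial,Helvetica" size="1" color="#000000">
6052
</font></p>
</td>
</tr>
HTML

my $pages = pages_remaining($sample);
print defined $pages ? "here it is: $pages\n" : "no match\n";

# The book's idiom: three capture groups, three variables.
my $text = qq(<a href="file.html"><b>Dog</b></a>Woof\nWoof</p>);
my ($file, $title, $summary) =
    $text =~ m{<a href="(.*?)"><b>(.*?)</b></a>\s*(.*?)</p>}s;
print "file=$file title=$title\n" if defined $file;
```

To try it against the live page, `$sample` would be replaced by whatever `get(...)` returns for the device's actual URL. If the page layout is more varied than this clip suggests, a real HTML parser would probably hold up better than a regex.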