
BLU Discuss list archive


[Discuss] What's the best site-crawler utility?

Daniel Barrett wrote:
> For instance, you can write a simple script to hit Special:AllPages
> (which links to every article on the wiki), and dump each page to HTML
> with curl or wget. (Special:AllPages displays only N links at a time,

Yes, but that's not human-readable. It's a dynamically generated 
jambalaya of HTML, JavaScript, PHP, CSS, and Ghu only knows what else.

Converting to PDF is even less useful.

Rich P.

BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.

