[Discuss] What's the best site-crawler utility?

Richard Pieri richard.pieri at gmail.com
Tue Jan 7 22:22:23 EST 2014


Daniel Barrett wrote:
> For instance, you can write a simple script to hit Special:AllPages
> (which links to every article on the wiki), and dump each page to HTML
> with curl or wget. (Special:AllPages displays only N links at a time,

Yes, but that's not human-readable. It's a dynamically generated 
jambalaya of HTML, JavaScript, PHP, CSS, and Ghu only knows what else.
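
The dump loop itself is the easy part -- roughly something like the
sketch below, where the wiki URL, the dump/ directory, and the crude
link-scraping are all placeholder guesses, and the "N links at a time"
paging Dan mentions is ignored:

  #!/bin/sh
  # Rough sketch only: wiki.example.com and dump/ are made up,
  # and only the first page of Special:AllPages is scraped.
  mkdir -p dump
  wget -q -O allpages.html "http://wiki.example.com/wiki/Special:AllPages"
  grep -o 'href="/wiki/[^"]*"' allpages.html \
    | sed -e 's|^href="/wiki/||' -e 's|"$||' \
    | sort -u \
    | while read -r title; do
        out=$(printf '%s\n' "$title" | tr '/' '_')   # flatten subpage names
        wget -q -O "dump/$out.html" "http://wiki.example.com/wiki/$title"
      done

What you end up with is a pile of files containing exactly that same
skin markup, not anything you'd hand to a person to read.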

Converting to PDF is even less useful.

-- 
Rich P.
