[Discuss] What's the best site-crawler utility?

Richard Pieri richard.pieri at gmail.com
Tue Jan 7 20:43:05 EST 2014


Matthew Gillen wrote:
>    wget -k -m -np http://mysite

I've tried this. It's messy at best. Wiki pages aren't static HTML.
They're dynamically generated, and they come with all sorts of style
sheets and embedded scripts. Yes, you can get the text, but it'll be the
text as rendered by the wiki, not the source markup. It takes a lot of
work to turn it into something usable.
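
To give a rough idea of the cleanup involved, here's a minimal sketch
that pulls the visible text out of one mirrored page and drops the
contents of script and style blocks. It assumes Python 3 and only the
standard library's html.parser; the names TextExtractor and html_to_text
are made up for illustration, and it is not a full solution.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collect text, skipping <script>/<style> bodies."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside a skipped element
        self._chunks = []      # non-empty text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a script or style block.
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._chunks.append(text)

    def text(self):
        return "\n".join(self._chunks)


def html_to_text(html):
    """Return the text fragments of an HTML string, one per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.text()
```

Even with the scripts and style sheets stripped, this still leaves the
wiki's navigation, sidebars and edit links mixed in with the article
text, which is where most of the work goes.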

-- 
Rich P.


