[Discuss] What's the best site-crawler utility?
Greg Rundlett (freephile)
greg at freephile.com
Tue Jan 7 23:02:15 EST 2014
Also, I just discovered a MediaWiki extension written by Tim Starling that
may suit your needs. As the name implies, it's for dumping a wiki to HTML:
http://www.mediawiki.org/wiki/Extension:DumpHTML
As for processing the XML produced by "export" or the MediaWiki dump tools,
here is info on that XML schema:
http://meta.wikimedia.org/wiki/Help:Export#Export_format
And here are some of the tools you can use to process MediaWiki XML:
http://wikipapers.referata.com/wiki/List_of_data_processing_tools
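If none of those tools fit, the export XML is simple enough to walk by hand. Here is a rough Python sketch using only the standard library. The sample document, the namespace version in it, and the helper names (local_name, iter_pages) are my own illustration, not taken from any of the tools above; check the schema page for the exact elements your MediaWiki version emits.

```python
import io
import xml.etree.ElementTree as ET

# Made-up sample in the general shape of a MediaWiki export.
# The namespace version (export-0.8) is an assumption; yours may differ.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.8/">
  <page>
    <title>Main Page</title>
    <revision>
      <text>Hello wiki</text>
    </revision>
  </page>
</mediawiki>"""


def local_name(tag):
    """Drop the '{namespace}' prefix so the code works across schema versions."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(source):
    """Yield (title, text) for each <page>, streaming so large dumps fit in memory.

    If a page carries several <revision> elements, the text of the last
    one listed in the file is returned.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title = None
        text = None
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "title" and title is None:
                title = child.text
            elif name == "text":
                text = child.text or ""
        yield title, text
        elem.clear()  # release the finished page's subtree


for title, text in iter_pages(io.BytesIO(SAMPLE)):
    print(title, len(text))
```

For a real dump, pass an open file (or a path) to iter_pages instead of the in-memory sample.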
Greg Rundlett