Hi Bill,

The GPL-licensed HTTrack Website Copier (http://www.httrack.com/) works well. I have not tried it on a MediaWiki site, but it is quite adept at copying websites, including dynamically generated ones. From their description:

"It allows you to download a World Wide Web site from the Internet to a local directory, building recursively all directories, getting HTML, images, and other files from the server to your computer. HTTrack arranges the original site's relative link-structure. Simply open a page of the 'mirrored' website in your browser, and you can browse the site from link to link, as if you were viewing it online. HTTrack can also update an existing mirrored site, and resume interrupted downloads. HTTrack is fully configurable, and has an integrated help system. WinHTTrack is the Windows 2000/XP/Vista/Seven release of HTTrack, and WebHTTrack the Linux/Unix/BSD release which works in your browser. There is also a command-line version 'httrack'."

HTTrack's result is similar to the "wget -k -m -np http://mysite" command that Matt mentioned, but it may be easier to use in general and it offers a GUI to drive the options you want.

Using the MediaWiki API to export pages is another option if you have specific needs that cannot be addressed by a "mirror" operation (e.g. your wiki has namespaced content that you want to treat differently). If you end up exporting via Special:Export or the API, you will then be faced with converting the XML to HTML. I have some notes about wiki format conversions at https://freephile.org/wiki/index.php/Format_conversion

There is also pandoc: "If you need to convert files from one markup format into another, pandoc is your swiss-army knife." http://johnmacfarlane.net/pandoc/

Example commands for both approaches follow below the quoted message.

~ Greg

Greg Rundlett

On Tue, Jan 7, 2014 at 6:49 PM, Bill Horne <bill at horne.net> wrote:
> I need to copy the contents of a wiki into static pages, so please
> recommend a good web-crawler that can download an existing site into static
> content pages. It needs to run on Debian 6.0.
>
> Bill
>
> --
> Bill Horne
> 339-364-8487
>
> _______________________________________________
> Discuss mailing list
> Discuss at blu.org
> http://lists.blu.org/mailman/listinfo/discuss
>
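A minimal sketch of the mirroring route, assuming a hypothetical wiki at http://wiki.example.org/ and an output directory of ./wiki-mirror (substitute your own site and path):

  # Mirror the site with HTTrack's command-line client into ./wiki-mirror
  httrack "http://wiki.example.org/" -O ./wiki-mirror

  # Roughly equivalent wget invocation: -m mirrors recursively, -k rewrites
  # links for local browsing, -np keeps it from climbing to the parent directory
  wget -k -m -np http://wiki.example.org/

Either command leaves behind a tree of static HTML pages you can browse from disk or drop onto any web server.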
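And a rough sketch of the export-and-convert route, again with a hypothetical wiki URL and the page Main_Page. Keep in mind that Special:Export returns XML that wraps the wikitext, so you still have to pull the page text out of the XML (not shown here) before handing it to pandoc:

  # Export a single page as XML via Special:Export
  curl -o Main_Page.xml \
    'http://wiki.example.org/index.php?title=Special:Export&pages=Main_Page'

  # After extracting the raw wikitext from the XML into Main_Page.wiki,
  # convert it to standalone HTML with pandoc's MediaWiki reader
  pandoc -f mediawiki -t html -s -o Main_Page.html Main_Page.wiki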