Plus one for HTTrack. I used it a couple of months ago to convert a badly
hacked Joomla site to static HTML. It was a bit of a pain at first (having
to use Firefox, for instance), but it worked as advertised. Hope that helps.

On Tue, Jan 7, 2014 at 10:34 PM, Greg Rundlett (freephile) <greg at freephile.com> wrote:
> Hi Bill,
>
> The GPL-licensed HTTrack Website Copier works well (http://www.httrack.com/).
> I have not tried it on a MediaWiki site, but it's pretty adept at copying
> websites, including dynamically generated ones.
>
> They say: "It allows you to download a World Wide Web site from the
> Internet to a local directory, building recursively all directories,
> getting HTML, images, and other files from the server to your computer.
> HTTrack arranges the original site's relative link-structure. Simply open
> a page of the 'mirrored' website in your browser, and you can browse the
> site from link to link, as if you were viewing it online. HTTrack can also
> update an existing mirrored site, and resume interrupted downloads.
> HTTrack is fully configurable, and has an integrated help system.
>
> WinHTTrack is the Windows 2000/XP/Vista/Seven release of HTTrack, and
> WebHTTrack the Linux/Unix/BSD release which works in your browser. There
> is also a command-line version, 'httrack'."
>
> HTTrack is actually similar in its result to the "wget -k -m -np
> http://mysite" that Matt mentions, but it may be easier to use in general
> and offers a GUI to drive the options that you want.
>
> Using the MediaWiki API to export pages is another option if you have
> specific needs that cannot be addressed by a "mirror" operation (e.g. your
> wiki has namespaced content that you want to treat differently). If you
> end up exporting via "Special:Export" or the API, you will then be faced
> with converting your XML to HTML. I have some notes about wiki format
> conversions at https://freephile.org/wiki/index.php/Format_conversion
>
> There's also pandoc: "If you need to convert files from one markup format
> into another, pandoc is your swiss-army knife."
> http://johnmacfarlane.net/pandoc/
>
> ~ Greg
>
> Greg Rundlett
>
> On Tue, Jan 7, 2014 at 6:49 PM, Bill Horne <bill at horne.net> wrote:
>
>> I need to copy the contents of a wiki into static pages, so please
>> recommend a good web-crawler that can download an existing site into
>> static content pages. It needs to run on Debian 6.0.
>>
>> Bill
>>
>> --
>> Bill Horne
>> 339-364-8487

--
Eric Chadbourne
617.249.3377
http://theMnemeProject.org/
http://WebnerSolutions.com/
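
A minimal command-line HTTrack invocation along the lines discussed above
might look like this; the wiki URL and output directory are placeholders,
not taken from the thread:

    # Mirror the wiki into a local directory of static pages.
    # The "+wiki.example.com/*" filter keeps the crawl on the wiki's own host.
    httrack "http://wiki.example.com/" -O /srv/static-wiki "+wiki.example.com/*" -v

WebHTTrack drives the same engine from the browser, so roughly the same
options apply there.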
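
The wget one-liner credited to Matt would expand to something like the
following; the host is a placeholder, and --page-requisites plus
--adjust-extension are optional extras beyond the flags quoted above:

    # -m (mirror) recurses with timestamping, -k rewrites links for local
    # browsing, -np stays below the starting directory.
    wget -m -k -np --page-requisites --adjust-extension http://wiki.example.com/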
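
For the Special:Export / API route Greg mentions, a sketch of one API call,
assuming the conventional /w/api.php path (which may differ on your wiki):

    # Export up to 50 pages per request as MediaWiki XML; page through the
    # rest of the wiki with the gapcontinue continuation parameter.
    curl 'http://wiki.example.com/w/api.php?action=query&generator=allpages&gaplimit=50&export&exportnowrap' \
        -o pages.xml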
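
Once you have pages saved as wikitext, pandoc can handle the markup
conversion; a sketch with made-up file names:

    # Convert MediaWiki markup to a standalone HTML page.
    pandoc -f mediawiki -t html -s Some_Page.wiki -o Some_Page.html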