We have a very old wiki which has basically not been updated in the past
decade, and which proved stubbornly resistant to updating several years
ago. Now the owner of the server has drifted away, but we do still
have control over the domain name itself. The best way that we can think
of to update everything is to scrape all of the pages/files, add them to a
brand new updated wiki on a new server, then point the domain to that new
server. Yes, user accounts will be broken, but we feel that this is the
most feasible solution unless someone else has another idea.
However, there are a lot of pages on meritbadge.org, which is the wiki I'm
talking about. Any suggestions for how to automate this scraping process?
I can scrape the HTML off every page, but what I really want is to get the
wikitext off of every page.
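
To make the goal concrete, here is a rough sketch of the kind of thing I
have in mind. It assumes the old wiki runs MediaWiki and still answers
api.php (list=allpages) and index.php?action=raw; I have not confirmed
either on this server. The function names and the User-Agent string are
made up for illustration, and both continuation styles are handled because
I don't know which MediaWiki version is installed.

```python
import json
import time
import urllib.parse
import urllib.request


def allpages_url(api_url, cont=None):
    """Build an api.php URL listing page titles (main namespace by default)."""
    params = {"action": "query", "list": "allpages",
              "aplimit": "500", "format": "json"}
    # Merge whatever continuation parameters the previous response returned.
    params.update(cont or {})
    return api_url + "?" + urllib.parse.urlencode(params)


def raw_url(index_url, title):
    """Build an index.php URL that returns a page's wikitext, not its HTML."""
    return index_url + "?" + urllib.parse.urlencode(
        {"title": title, "action": "raw"})


def parse_allpages(payload):
    """Return (titles, continuation params) from an allpages JSON response."""
    data = json.loads(payload)
    titles = [page["title"] for page in data["query"]["allpages"]]
    # Newer releases use "continue"; older ones nest it under "query-continue".
    cont = data.get("continue")
    if cont is None:
        cont = data.get("query-continue", {}).get("allpages", {})
    return titles, cont


def fetch(url):
    """Download a URL and decode it as UTF-8 (hypothetical User-Agent)."""
    req = urllib.request.Request(
        url, headers={"User-Agent": "wiki-migration-sketch/0.1"})
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


def dump_all(api_url, index_url, delay=1.0):
    """Yield (title, wikitext) for every listed page, pausing between requests."""
    cont = {}
    while True:
        titles, cont = parse_allpages(fetch(allpages_url(api_url, cont)))
        for title in titles:
            yield title, fetch(raw_url(index_url, title))
            time.sleep(delay)  # go easy on the old server
        if not cont:
            break
```

Calling dump_all() with the real api.php and index.php addresses (which I
would still have to find) and writing each result to a file would cover the
main namespace only; templates, categories and uploaded files would need
further passes.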
Bart Humphries
bart.humphries(a)gmail.com
(909)529-BART(2278)