So, hi again, everybody. I'm working on a project to create a free, complete, up-to-date and reliable _travel_guide_. It's similar in philosophy to Wikipedia or Wiktionary -- providing a Wiki-edited reference text with a free license -- but has the specific goal of making a travel guide rather than an encyclopedia.
For example, the Wikitravel entry for Montreal would list hotels, restaurants, and bars with their addresses and phone numbers, plus prices and times for important transportation or sightseeing venues, and so on: the kind of thing that probably wouldn't belong in the Wikipedia Montreal entry.
Anyway, we're just getting started, and my first task is to slurp up the 2002 CIA World Factbook for maps and basic info on countries and get that material into Wikitravel. I've downloaded the factbook, but it's plain HTML without much semantic markup, so it's hard to figure out what's what on each page.
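In case it helps frame the question, here's a minimal sketch of the kind of scraping I have in mind, in Python. It assumes (and I haven't checked this against the actual factbook pages) that each field label sits in a <b> tag and that the value is whatever text follows, up to the next <b>. The names BoldLabelParser and extract_fields are made up for illustration.

```python
from html.parser import HTMLParser


class BoldLabelParser(HTMLParser):
    """Collect {label: value} pairs from loosely structured HTML.

    ASSUMPTION: labels are wrapped in <b>...</b> and each value is the
    plain text that follows until the next <b>. The real factbook
    markup may well differ.
    """

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._in_bold = False
        self._label = None
        self._label_parts = []
        self._value_parts = []

    def _flush(self):
        # Store the pending label/value pair, collapsing whitespace.
        if self._label:
            value = " ".join(" ".join(self._value_parts).split())
            self.fields[self._label] = value
        self._label = None
        self._value_parts = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <B> is handled too.
        if tag == "b":
            self._flush()
            self._in_bold = True
            self._label_parts = []

    def handle_endtag(self, tag):
        if tag == "b" and self._in_bold:
            self._in_bold = False
            label = " ".join("".join(self._label_parts).split())
            self._label = label.rstrip(":").strip() or None

    def handle_data(self, data):
        if self._in_bold:
            self._label_parts.append(data)
        elif self._label:
            self._value_parts.append(data)

    def close(self):
        super().close()
        self._flush()  # keep the last pair in the document


def extract_fields(html):
    """Return a dict of label -> value for one HTML page."""
    parser = BoldLabelParser()
    parser.feed(html)
    parser.close()
    return parser.fields
```

Something like extract_fields(open("some_country.html").read()) would then give a dict per country page (the filename is a placeholder). If the labels turn out to live in table cells or some other tag, the same approach should work with a different tag test, but it's fragile enough that I'd much rather start from structured data.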
I know that one of the first tasks for Wikipedia was slurping the factbook for 2000. I seem to remember seeing a script around in CVS for doing that, but I might have been mistaking the yearbook/ directory for factbook info.
Any suggestions? Sample code? Pointers to more structured data for the factbook?
TIA for MIA at the CIA,
~ESP
wikitech-l@lists.wikimedia.org