So, hi again, everybody. I'm working on a project to create a free, complete, up-to-date and reliable _travel_guide_. It's similar in philosophy to Wikipedia or Wiktionary -- providing a Wiki-edited reference text with a free license -- but has the specific goal of making a travel guide rather than an encyclopedia.
For example, the Montreal entry for Wikitravel would have lists of hotels, restaurants, and bars with their addresses and phone numbers, prices and times for important transportation or sightseeing venues, and so on: the kind of thing that probably wouldn't belong in the Wikipedia Montreal entry.
Anyway, we're just getting started, and my first task is to slurp up the CIA factbook 2002 for maps and basic info on countries and get them into Wikitravel. I've downloaded the factbook, but it's in plain HTML and doesn't seem to have much semantic markup to make it easy to figure out what's what in the page.
I know that one of the first tasks for Wikipedia was slurping the factbook for 2000. I seem to remember seeing a script around in CVS for doing that, but I might have been mistaking the yearbook/ directory for factbook info.
Any suggestions? Sample code? Pointers to more structured data for the factbook?
TIA for MIA at the CIA,
~ESP
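On the "sample code" question, here is a rough sketch of one way to pull label/value pairs out of HTML that has no semantic markup, using only Python's standard `html.parser`. It is not the script from CVS and not based on the real factbook pages: the "bold label followed by plain text" layout, the `FieldExtractor` and `extract_fields` names, and the sample page with its made-up values are all assumptions for illustration, so the tag test would need adjusting to whatever the downloaded files actually look like.

```python
from html.parser import HTMLParser


class FieldExtractor(HTMLParser):
    """Collect {label: value} pairs where each label sits in <b>...</b>.

    ASSUMPTION: the bold-label layout is hypothetical; check it against
    the real factbook HTML before relying on it.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.fields = {}          # label -> accumulated value text
        self._in_bold = False     # True while inside a <b> element
        self._label_parts = []    # text pieces of the label being read
        self._label = None        # label that following text belongs to

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self._in_bold = True
            self._label_parts = []

    def handle_endtag(self, tag):
        if tag == "b" and self._in_bold:
            self._in_bold = False
            # Collapse whitespace and drop the trailing colon.
            label = " ".join("".join(self._label_parts).split()).rstrip(":")
            self._label = label or None
            if self._label:
                self.fields.setdefault(self._label, "")

    def handle_data(self, data):
        if self._in_bold:
            self._label_parts.append(data)
        elif self._label is not None:
            text = " ".join(data.split())
            if text:
                current = self.fields[self._label]
                self.fields[self._label] = (current + " " + text).strip()


def extract_fields(html):
    """Parse an HTML string and return the label -> value mapping."""
    parser = FieldExtractor()
    parser.feed(html)
    parser.close()
    return parser.fields


# Made-up sample page, not real factbook content.
sample = """
<html><body>
<b>Capital:</b> Exampleville<br>
<b>Population:</b> 1,234,567 (est.)<br>
</body></html>
"""

if __name__ == "__main__":
    for label, value in extract_fields(sample).items():
        print(f"{label}: {value}")
```

If the real pages mark field names some other way (table cells, a CSS class, anchors), only the `tag == "b"` checks should need to change; the accumulate-text-until-the-next-label logic stays the same.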
Evan Prodromou wrote:
So, hi again, everybody. I'm working on a project to create a free, complete, up-to-date and reliable _travel_guide_. It's similar in
Sounds like www.capitancook.com
I notice that you're currently using the Creative Commons by-sa (attribution share-alike) license. That's a good license, nearly identical in spirit to the GNU FDL but not, in my opinion, strictly compatible.
Since you're only a few days in, you could still switch to GNU FDL easily enough -- this might be best for you, if you envision a lot of content sharing between your site and Wikipedia (which seems natural).
I'm enthusiastic about your project, and I think the idea is excellent.
I am planning to campaign with RMS to change the GNU FDL to be cut-and-paste compatible somehow with CC ATT-SA, but it's hard to say if that would ever happen.
--Jimbo
wikitech-l@lists.wikimedia.org