In the interest of not duplicating similar efforts, might I ask what
your project is?
I am a medical doctor with a long-standing hobbyist's soft spot for information technology and its application to improving medical products and services. I don't know of any business other than health care where having valuable information at hand in a timely manner can be of such vital importance. Google Health and other approaches do not really satisfy me, so I decided to do something on my own.
I designed a multi-language terminology-enhanced search engine plus online information management system for (professionals in) Medicine and Health Care which I called MESHine. You can find an early alpha-version at: http://www.meshine.info.
MESHine lets you search Google, Yahoo and/or Pubmed/Medline (Google Scholar, Scirus) and harvest results directly into your personal infosphere, where you can store, collect and annotate personal information collections. Metatags (MeSH terms, keywords) can be reused instantaneously to search again using the above search engines. If you like, you can republish your Web Collections to the public or to selected user groups. Moreover, while you browse the web, the MESHine bookmarklet lets you add more links to existing or new Web Collections (just like del.icio.us or other bookmarking services).
MESHine is an alternative browser for the MeSH thesaurus. It helps you choose the right search words from the MeSH in your language (currently English, German, French and Italian). Basically, you can browse the MeSH thesaurus, harvest appropriate search terms (with boolean operators) into the search box by simply checking boxes, and send your query to the web APIs of Google, Yahoo and Pubmed/Medline, which will return links to valuable information in the medical field. In principle, you don't have to type a single word (of medical terminology) by hand; you build up your complex searches by simply clicking them together. (Sorry for my English.)
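To make the "clicking together" idea concrete, here is a minimal sketch in Python of how checked MeSH terms could be joined into one boolean query string. The function name `build_query` and the phrase-quoting rule are assumptions for illustration only, not the actual MESHine code.

```python
def build_query(checked_terms, operator="AND"):
    """Join the checked MeSH terms into a single boolean query string.

    Hypothetical helper: name and quoting rule are illustrative only.
    """
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    # Quote multi-word terms so that a search engine treats them as phrases.
    quoted = ['"%s"' % term if " " in term else term for term in checked_terms]
    return (" %s " % operator).join(quoted)


if __name__ == "__main__":
    print(build_query(["Asthma", "Respiratory Sounds"]))
```

The resulting string could then be passed unchanged to whichever search API is selected.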
Now, while you browse the MeSH, MESHine is designed to display links and definitions from Wikipedia that relate to the current MeSH term. Every time a MeSH term is clicked, content is retrieved from Wikipedia via a simple and rather slow scraping mechanism for Wikipedia pages. Thus, Web Collections of related Wikipedia articles are continuously created and updated. More related Wikipedia links (always 10 at a time) can be retrieved and added to the current Wikipedia Collection via a 1-click mechanism that displays the next Google results relating to the current MeSH term (Web Collection).
You will have noticed that scraping Wikipedia pages in real time results in prolonged waiting periods and user annoyance. For that reason I decided to get the Wikipedia dumps and do the whole MeSH-to-Wikipedia mapping offline. This approach has many advantages: a fully MeSH-interlinked Wikipedia will be available at once. Relevance ranking can be based on transparent algorithms. Medical Wikipedia becomes semantic. Professional information from bibliographic databases meets, and is explained or abridged by, Wikipedia content. And so on.
So, what I am going to do now is the following:
1. Get current Wikipedia dumps (done).
2. Create full Wikipedia content images by importing the dumps into Mediawiki (done).
3. Modify the existing Mediawiki tables so that they fit the needs, creating additional tables if necessary (that's what I am currently doing).
4. Map the MeSH thesaurus entirely onto Wikipedia, getting all articles that contain MeSH terms and/or their term families (broader/wider/neighbour terms) in title, text, category, wikilink or link.
5. Thus, get an entirely new view on Wikipedia content, structured by a hierarchical controlled medical vocabulary (MeSH) and ordered by relevance. If you will: a Medical Wikipedia is created.
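The mapping step can be sketched roughly as follows, in Python. This is only an illustration under stated assumptions: the article fields (`title`, `category`, `text`), the field weights and the function names are all hypothetical, and a real run over the full dumps would work against the database rather than in-memory lists.

```python
import re
from collections import defaultdict

# Assumed weights: a hit in the title counts more than one in the text.
FIELD_WEIGHTS = {"title": 3, "category": 2, "text": 1}


def normalize(text):
    """Lowercase and collapse everything except letters/digits to spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def map_terms(mesh_terms, articles):
    """Map each MeSH term to article titles, best-scoring first.

    mesh_terms: dict of term -> list of related terms (its "term family").
    articles:   list of dicts with 'title', 'category' and 'text' keys.
    """
    result = {}
    for term, family in mesh_terms.items():
        scores = defaultdict(int)
        for name in [term] + list(family):
            needle = normalize(name)
            if not needle:
                continue
            padded = " %s " % needle  # padding enforces whole-word matches
            for article in articles:
                for field, weight in FIELD_WEIGHTS.items():
                    haystack = " %s " % normalize(article.get(field, ""))
                    if padded in haystack:
                        scores[article["title"]] += weight
        # Highest score first; ties are broken alphabetically by title.
        result[term] = sorted(scores, key=lambda t: (-scores[t], t))
    return result
```

A small usage example with made-up articles:

```python
articles = [
    {"title": "Asthma", "category": "Respiratory diseases",
     "text": "Asthma is a disease of the lungs."},
    {"title": "Wheeze", "category": "Symptoms",
     "text": "A wheeze occurs in asthma."},
    {"title": "Guitar", "category": "Music", "text": "A string instrument."},
]
print(map_terms({"Asthma": ["Respiratory Sounds"]}, articles))
```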
(Help!!! Who is going to help me? ;-) ) Step 4 will take me some weeks of continuous computer processing, especially if step 3 does not succeed, i.e. if the indexes (!) are not set properly (and in time).
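Why the indexes matter so much can be shown with a small sketch in Python. It uses an in-memory SQLite database purely as a stand-in for whatever database backs the Mediawiki import; the table `article` and the index `idx_article_title` are hypothetical names, not the real Mediawiki schema.

```python
import sqlite3


def plan_for_title_lookup(with_index):
    """Return the query plan SQLite chooses for a lookup by title."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
    )
    if with_index:
        conn.execute("CREATE INDEX idx_article_title ON article (title)")
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT id FROM article WHERE title = ?",
        ("Asthma",),
    ).fetchall()
    conn.close()
    # The human-readable plan text is the last column of each row.
    return " ".join(str(row[-1]) for row in rows)


if __name__ == "__main__":
    print("without index:", plan_for_title_lookup(False))
    print("with index:   ", plan_for_title_lookup(True))
```

Without the index the plan is a full table scan for every term looked up; with it the plan names the index, which is the difference between weeks and a much shorter run when millions of lookups are involved.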
Any suggestions? Any offers of cooperation?
Alex Hoelzel, http://www.meshine.info
Alexander Hölzel CEO EUTROPA AG
========================= EUTROPA Aktiengesellschaft Oelmüllerstrasse 9, D-82166 Gräfelfing, Tel 089 87130900, Fax 089 87130902 =========================