In the interest of not duplicating similar efforts, might I ask what
your project is?
I am a medical doctor with a long-standing "hobbyist" soft spot for
information technology and its application to the improvement of medical
products and services. I don't know of any business other than health
care where it can be of such vital importance to have valuable
information at hand in a timely manner. Google Health and other
approaches do not really satisfy me, so I decided to do something on my
own.
I designed a multi-language terminology-enhanced search engine plus
online information management system for (professionals in) Medicine and
Health Care which I called MESHine. You can find an early alpha-version
MESHine lets you search Google, Yahoo and/or PubMed/Medline (Google
Scholar, Scirus) and harvest results directly into your personal
infosphere, where you can store, collect and annotate personal
information collections. Metatags (MeSH terms, keywords) can be reused
instantaneously to search again using the above search engines. If you
like, you can republish your Web Collections to the public or to
selected user groups. Moreover, while you browse the web, the MESHine
bookmarklet lets you add more links to existing or new Web Collections
(just like del.icio.us or other bookmarking services).
MESHine is an alternative browser for the MeSH thesaurus. It helps you
to choose the right search words from the MeSH in your language
(currently English, German, French and Italian). Basically, you can
browse the MeSH thesaurus, collect appropriate search terms (with
Boolean operators) into the search box simply by checking boxes, and
send your query to the web APIs of Google, Yahoo and PubMed/Medline,
which will return links to valuable information in the medical field. In
principle, you don't have to type a single word (of medical terminology)
by hand; you build up your complex searches simply by clicking them
together... (sorry for my English)
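To make the "clicking together" idea concrete, here is a minimal sketch in
Python. It is not MESHine's actual code. It assumes that the checked terms
arrive as groups, that terms within one group are joined with OR and that
the groups are joined with AND. The function name build_query and the
sample terms are made up for illustration.

```python
def build_query(groups):
    """Combine checked terms: OR within a group, AND between groups.

    `groups` is a list of lists of term strings (a hypothetical input
    format, assumed here for illustration only).
    """
    clauses = []
    for terms in groups:
        # Quote each term so multi-word terms are searched as phrases.
        quoted = ['"{}"'.format(term) for term in terms]
        if not quoted:
            continue  # skip groups in which nothing was checked
        clause = " OR ".join(quoted)
        if len(quoted) > 1:
            clause = "(" + clause + ")"
        clauses.append(clause)
    return " AND ".join(clauses)


# Two checked groups: a term with one synonym, plus a second term.
checked = [["Myocardial Infarction", "Heart Attack"], ["Aspirin"]]
print(build_query(checked))
# prints: ("Myocardial Infarction" OR "Heart Attack") AND "Aspirin"
```

The resulting string would then be passed on as the search expression.
Each search engine has its own query syntax, so a real implementation
would probably need one such builder per engine.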
While you browse the MeSH, MESHine displays links and definitions from
Wikipedia that relate to the current MeSH term. Every time a MeSH term
is clicked, content is retrieved from Wikipedia via a simple and rather
slow scraping mechanism for Wikipedia pages. In this way, Web
Collections of related Wikipedia articles are continuously created and
updated. More related Wikipedia links (always 10 at a time) can be
retrieved and added to the current Wikipedia Collection via a 1-click
mechanism that displays the next Google results relating to the current
MeSH term (Web Collection).
You will have noticed that scraping Wikipedia pages in real time results
in prolonged waiting periods and user annoyance... For that reason I
decided to get the Wikipedia dumps and do the whole MeSH-to-Wikipedia
mapping offline. This approach has many advantages: a fully
MeSH-interlinked Wikipedia will be available at once. Relevance ranking
can be done based on transparent algorithms. Medical Wikipedia becomes
semantic. Professional information from bibliographic databases meets,
and is explained or abridged by, Wikipedia content. And so on.
So, what I am going to do now is the following:
1. Get current Wikipedia dumps (done)
2. Create full Wikipedia content-images by importing the dumps into
3. Modify the existing MediaWiki tables so that they fit the needs,
creating additional tables if necessary. (That's what I am currently doing.)
4. Map the MeSH thesaurus entirely onto Wikipedia, getting all articles
that contain MeSH terms and/or their term families (broader/wider/neighbour
terms) in title, text, category, wikilink or link.
5. Thus get an entirely new view of Wikipedia content, structured by a
hierarchical controlled medical vocabulary (MeSH) and ordered by
relevance. If you will: a Medical Wikipedia is created.
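Steps 4 and 5 could be sketched roughly as follows. This is only an
illustration in Python on made-up sample data, not the real processing
against the MediaWiki tables. The field names, the weights in
FIELD_WEIGHTS and the function names are assumptions. The point is the
word index: it narrows the candidates first, so that each term is checked
against a few articles instead of all of them.

```python
import re
from collections import defaultdict

# Hypothetical weights: a hit in the title counts more than one in the text.
FIELD_WEIGHTS = {"title": 5, "category": 3, "text": 1}


def tokenize(s):
    """Lower-case a string and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", s.lower())


def build_index(articles):
    """Inverted index: word -> ids of articles containing it in any field."""
    index = defaultdict(set)
    for art_id, art in articles.items():
        for field in FIELD_WEIGHTS:
            for word in tokenize(art.get(field, "")):
                index[word].add(art_id)
    return index


def contains_phrase(haystack, phrase_tokens):
    """True if the words of the term occur consecutively in the field."""
    tokens = tokenize(haystack)
    n = len(phrase_tokens)
    return any(tokens[i:i + n] == phrase_tokens
               for i in range(len(tokens) - n + 1))


def map_term(term, articles, index):
    """Return (article id, score) pairs for one term, best match first."""
    words = tokenize(term)
    if not words:
        return []
    # Only articles that contain every word of the term are candidates.
    candidates = set.intersection(*(index.get(w, set()) for w in words))
    scored = []
    for art_id in candidates:
        art = articles[art_id]
        score = sum(weight for field, weight in FIELD_WEIGHTS.items()
                    if contains_phrase(art.get(field, ""), words))
        if score:
            scored.append((art_id, score))
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


# Made-up sample articles, for illustration only.
articles = {
    1: {"title": "Aspirin", "category": "Salicylates",
        "text": "Aspirin is used to reduce pain and fever."},
    2: {"title": "Fever", "category": "Symptoms",
        "text": "Fever may be treated with aspirin or paracetamol."},
    3: {"title": "Headache", "category": "Symptoms",
        "text": "Headache is pain in the head or neck."},
}
index = build_index(articles)
print(map_term("Aspirin", articles, index))  # prints [(1, 6), (2, 1)]
```

On the full dumps the same idea would live in the database, where the
index in step 3 plays the role of build_index here.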
(Help!!! Who is going to help me? ;-) ) Step 4 will take me some weeks
of continuous computer processing, especially if step 3 is not completed
successfully, i.e. if the indexes (!) are not set up properly (and in time).
Any suggestions? Any cooperation?
Alex Hoelzel, http://www.meshine.info
CEO EUTROPA AG
Oelmüllerstrasse 9, D-82166 Gräfelfing,
Tel 089 87130900, Fax 089 87130902