On Tue, Jul 20, 2010 at 5:10 AM, Daniel Kinzler <daniel@brightbyte.de> wrote:
Hi all
A central place for managing bibliographic data for use with citations is something that has been discussed by the German community for a long time. To me, it consists of two parts: a project for managing the structured data, and a mechanism for using that data on the wikis.
I have been working on the latter recently, and there's a working prototype: on http://prototype.wikimedia.org/wmde-sandbox-1/Wikipedia:DataTransclusion you can see how data records can be included from external sources. A demo for the actual on-wiki use can be found at http://prototype.wikimedia.org/wmde-sandbox-1/Ameisenigel#Literatur, where {{ISBN|0868400467}} is used to show the bibliographic info for that book. (side note: the prototype wikis are slow. sorry about that).
Fetching and showing the data is done using http://www.mediawiki.org/wiki/Extension:DataTransclusion. Care has been taken to make this secure and scalable.
For a first demo, I'm using the ISBN as the key, but any kind of key could be used to reference resources other than books.
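To illustrate what keying records by ISBN could look like, here is a minimal Python sketch that normalizes an ISBN-10 and verifies its check digit before turning it into a lookup key. The function names and the `ISBN:` key prefix are assumptions made for this example; this is not how the DataTransclusion extension itself handles keys.

```python
def normalize_isbn10(raw: str) -> str:
    """Drop hyphens and spaces and upper-case a trailing 'x' check digit."""
    return raw.replace("-", "").replace(" ", "").upper()


def is_valid_isbn10(raw: str) -> bool:
    """Check length, characters, and the standard ISBN-10 mod-11 checksum."""
    isbn = normalize_isbn10(raw)
    if len(isbn) != 10:
        return False
    if not all(c in "0123456789" for c in isbn[:9]):
        return False
    if isbn[9] not in "0123456789X":
        return False
    # The last position may be 'X', which stands for the value 10.
    values = [int(c) for c in isbn[:9]]
    values.append(10 if isbn[9] == "X" else int(isbn[9]))
    # Weights run from 10 down to 1; a valid ISBN-10 sums to a multiple of 11.
    total = sum(weight * value for weight, value in zip(range(10, 0, -1), values))
    return total % 11 == 0


def record_key(raw: str) -> str:
    """Build a hypothetical record key such as 'ISBN:0868400467'."""
    if not is_valid_isbn10(raw):
        raise ValueError(f"not a valid ISBN-10: {raw!r}")
    return "ISBN:" + normalize_isbn10(raw)
```

A key for a different kind of resource would only need its own prefix and its own validation rule; the lookup side would not have to change.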
To demo managing the data ourselves, I have set up an SMW instance. An example bib record is at http://prototype.wikimedia.org/wmde-bib/ISBN:0451526538; it's used across wikis at http://prototype.wikimedia.org/wmde-sandbox-1/Wikipedia:DataTransclusion. Note that changes will show up with a delay, as the data is cached for a while.
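The delay comes from caching, and the effect can be sketched with a small time-to-live cache in Python. This is only an illustration under assumed names (`TTLCache`, `fetch`, `ttl_seconds`); it does not describe how the extension actually caches records.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Keep fetched records for ttl_seconds before asking the source again."""

    def __init__(
        self,
        fetch: Callable[[str], dict],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch          # hypothetical function that loads one record by key
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the behaviour can be tested
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, key: str) -> dict:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            # Still fresh: return the cached copy, even if the source has changed.
            return entry[1]
        # Missing or expired: fetch again and remember when we did so.
        record = self._fetch(key)
        self._entries[key] = (now, record)
        return record
```

With this design an edit to the source record becomes visible on the wikis only after the cached copy expires, which is the trade-off between freshness and load on the data source.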
When discussing these things, please keep in mind that there are two components: fetching and displaying external data records, and managing structured data in a wiki style. The former is much simpler than the latter. I think we should really aim at getting both, but we can get started with transcluding external data much faster if we allow not-so-wiki data sources. For ISBN-based queries, we could simply fetch information from http://openlibrary.org - or the Open Knowledge Foundation's http://bibliographica.org, once it's working.
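As a rough sketch of the ISBN-based lookup against Open Library, the Python below builds a request for its Books API and pulls a few fields out of the JSON reply. The endpoint and its `bibkeys`, `format`, and `jscmd` parameters reflect Open Library's public Books API as I understand it; the choice of fields, the helper names, and the shape of the returned dictionary are assumptions for illustration and should be checked against the current API documentation.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint of the Open Library Books API.
BOOKS_API = "https://openlibrary.org/api/books"


def build_url(isbn: str) -> str:
    """Build the query URL for a single ISBN."""
    query = urllib.parse.urlencode(
        {"bibkeys": f"ISBN:{isbn}", "format": "json", "jscmd": "data"}
    )
    return f"{BOOKS_API}?{query}"


def parse_record(payload: dict, isbn: str) -> dict:
    """Reduce the reply to a flat record; missing fields become empty values."""
    entry = payload.get(f"ISBN:{isbn}")
    if entry is None:
        raise KeyError(f"no record returned for ISBN {isbn}")
    return {
        "title": entry.get("title", ""),
        "authors": [a.get("name", "") for a in entry.get("authors", [])],
        "publishers": [p.get("name", "") for p in entry.get("publishers", [])],
        "publish_date": entry.get("publish_date", ""),
    }


def fetch_record(isbn: str, timeout: float = 10.0) -> dict:
    """Fetch and parse one record over HTTP (requires network access)."""
    with urllib.request.urlopen(build_url(isbn), timeout=timeout) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return parse_record(payload, isbn)
```

Calling `fetch_record("0451526538")` would return a small dictionary that a template could render; keeping the parsing separate from the HTTP call makes it easy to swap in another source later.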
In the context of bibdex, I recommend also having a look at http://bibsonomy.org - it's a university research project, open source, and is quite similar to bibdex (and to what citeulike used to be).
As to managing structured data ourselves: I have talked a lot with Erik Möller and Markus Krötzsch about this, and I'm in touch with the people who make DBpedia and OntoWiki. Everyone wants this. But it's not simple at all to get it right (efficient versioning of multilingual data in a document-oriented database, anyone? want inference? reasoning, even? yay...). So the plan is currently to hatch a concrete plan for this. And I imagine that bibliographical and biographical info will be among the first use cases.
Hi Daniel,
Have you considered that Lucene is the perfect backend for this kind of project? What kinds of faults do you see with it? At least in my mind, we can mold it to our needs here. It has the core capabilities found in Semantic MediaWiki, and it is fast and scalable.
I say this as a serious user of Semantic MediaWiki. I have seen that it can't scale well without an alternate backend, and I wonder what kind of monumental effort will be required to make it scale to tens or hundreds of millions of documents, each containing 20-50 properties. Lucene can already do this; SMW, not so much ;-)
Brian
cheers, daniel
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l