Hi Reid,

My responses to your responses are inline.

-------- Original Message --------
Subject: Re: [Wiki-research-l] Proposal: build a wiki literature review wiki-style
From: Reid Priedhorsky <reid@reidster.net>
To: wiki-research-l@lists.wikimedia.org
Date : March-22-11 11:56:24 AM

(I don't know the details of MediaWiki search well, so some of the
following may not be quite right.) What MediaWiki would give us is
fulltext search. So while it would be easy to search for "John Smith",
and that query would find papers authored by John Smith plus perhaps
other stuff, one cannot search for "author = John Smith" and
get only results where the author field matches John Smith and no others.

However, it does seem like Semantic MediaWiki has this type of search
and otherwise behaves much like plain MediaWiki.

I actually wasn't familiar with the full functionality of Semantic MediaWiki (http://semantic-mediawiki.org/wiki/Semantic_MediaWiki) until I looked it up after your comments. From what I can see, it certainly seems capable of maintaining all the key metadata that I, and I assume most other researchers, would need (e.g. authors, dates, publication source, URLs to HTML or PDF versions).
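
To make the full-text vs. field search distinction concrete, here is a rough Python sketch. The record layout and the two function names are only my own illustration; nothing here is MediaWiki or Semantic MediaWiki code.

```python
# Hypothetical records; the field names are illustrative only.
papers = [
    {"title": "Wiki governance", "author": "John Smith",
     "abstract": "A study of policies."},
    {"title": "A reply to John Smith", "author": "Jane Doe",
     "abstract": "We revisit earlier claims."},
]

def fulltext_search(records, phrase):
    """Match the phrase anywhere in any field, like plain full-text search."""
    phrase = phrase.lower()
    return [r for r in records
            if any(phrase in str(value).lower() for value in r.values())]

def field_search(records, field, value):
    """Match only records whose named field equals the value exactly."""
    return [r for r in records if r.get(field) == value]

# Full-text search also returns the paper that merely mentions the name.
everything = fulltext_search(papers, "John Smith")
# Field search returns only the paper actually authored by John Smith.
authored = field_search(papers, "author", "John Smith")
```

With these two records, the full-text query returns both papers, while the field query returns only the one authored by John Smith, which is the behaviour we would want from the wiki.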

There also appear to be various options for Semantic MediaWiki hosting:
Wikia, Referata, etc. It would be nice to not have to deal with the
sysadmin aspects of the project.

I agree that a reliable host is the way to go. Given the nature of our project, I think a paid Referata plan would probably be better than Wikia. I, for one, could probably find grant funding to keep it going.

One final note on bibliographic software: many of these claim to do
automatic import of a reference simply by pointing the software at the
publisher's web page for the references. But I have never seen this work
correctly; always, the imported data needs significant cleanup, enough
that personally I'd rather type it in manually anyway. For example,
titles of ACM papers aren't even correctly cased on the official ACM
pages (e.g., http://dx.doi.org/10.1145/1753326.1753615)!

My only experience with "scraping" pages is with Zotero, and it does it beautifully. I assume (but don't know) that the current generation of other bibliography software would also do a good job. Anyway, Zotero has a huge support community, and scrapers for major sources (including Google Scholar for articles and Amazon for books) are kept very well up to date for the most part.

Bibliographic software then also typically does not include the proper 
metadata for automatically lower-casing titles in citations. For 
example, the title "Path Selection: Novel Interaction Technique for 
Wikipedia" should be lower-cased as "Path selection: Novel interaction 
technique for Wikipedia". But so often I see papers with "Path 
selection: novel interaction technique for wikipedia". It's embarrassing.

That's definitely a software design flaw; Zotero is certainly rather bad on this point.

But, if we were writing our own (e.g.) MediaWiki -> BibTeX export
script, we could automatically note that "Novel" should be capitalized
(because it begins the subtitle) as well as provide for people to
indicate explicitly title words that should remain capitalized. (In this
instance, the proper BibTeX export syntax would be "Path Selection:
{Novel} Interaction Technique for {Wikipedia}".)

I like the idea of including export facilities in our SMW version, giving users the option of what they would like to export to.
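
As a rough idea of what the title handling in such an export script might look like, here is a minimal Python sketch. It assumes the two rules you describe (protect the first word of the subtitle, plus any words an editor has explicitly flagged); the function name and the keep_capitalized parameter are hypothetical, not part of any existing tool.

```python
def protect_title(title, keep_capitalized=()):
    """Wrap words that must stay capitalized in BibTeX braces.

    A word is protected if it directly follows a colon (it begins the
    subtitle) or if it is listed in keep_capitalized.
    """
    keep = set(keep_capitalized)
    out = []
    after_colon = False
    for word in title.split():
        # Separate trailing punctuation so it stays outside the braces.
        core = word.rstrip(":,.;!?")
        trailing = word[len(core):]
        if core and (after_colon or core in keep):
            out.append("{" + core + "}" + trailing)
        else:
            out.append(word)
        after_colon = word.endswith(":")
    return " ".join(out)

exported = protect_title(
    "Path Selection: Novel Interaction Technique for Wikipedia",
    keep_capitalized=["Wikipedia"],
)
# exported == "Path Selection: {Novel} Interaction Technique for {Wikipedia}"
```

The list of words to keep capitalized would presumably come from a property stored on each paper's wiki page, so that editors mark proper nouns once and every export format can reuse them.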

Would it be feasible to have both, and use them concurrently so that
researchers could use one or the other, or both, as they prefer? I'm
thinking of something like this (for purpose of illustration, let's
call the chosen MediaWiki instance MW and the chosen dedicated online
shared bibliographic tool BT):

Bi-directional synchronization is hard to get right, particularly when 
the two sides have different data models. I think we are much
better off declaring one or the other to be the master and keeping the
rest read-only (i.e. export rather than synchronization).

I like this idea; with SMW as the primary, editable source, a read-only Zotero library imported from SMW would work well. The problem, though, is that duplicate detection would be needed to keep each import from re-adding articles that are already in the library. A complete overwrite would not work, since it would break the article IDs that word processor integration relies on. Zotero has been slow to implement duplicate detection, but they finally have a very impressive solution in alpha (http://www.zotero.org/blog/new-release-multilingual-zotero-with-duplicates-detection/).
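
To illustrate the kind of one-way import I have in mind, here is a small Python sketch. It is not Zotero's API or its duplicate-detection algorithm; the matching rule (DOI if present, otherwise normalized title plus year), the integer item IDs, and the function names are all assumptions of mine for the sake of illustration.

```python
import re

def match_key(record):
    """Normalize a record to a key used for duplicate detection."""
    if record.get("doi"):
        return "doi:" + record["doi"].strip().lower()
    title = re.sub(r"[^a-z0-9]+", " ", record["title"].lower()).strip()
    return f"title:{title}|{record.get('year', '')}"

def one_way_import(master_records, library):
    """Copy records from the master into library (item ID -> record).

    Matched items are updated in place so their IDs survive;
    only unmatched records are given a new ID.
    """
    by_key = {match_key(rec): item_id for item_id, rec in library.items()}
    next_id = max(library, default=0) + 1
    for rec in master_records:
        key = match_key(rec)
        if key in by_key:
            # Duplicate: refresh the data but keep the existing item ID.
            library[by_key[key]] = dict(rec)
        else:
            library[next_id] = dict(rec)
            by_key[key] = next_id
            next_id += 1
    return library
```

Because matched items keep their IDs, citations already inserted into a word processor document would keep pointing at the same library entries after each import.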

Thanks, Reid, for your great suggestions. I hope this can become a reality.

~ Chitu