Hi Reid,
My responses to your responses are inline.
-------- Original Message --------
Subject: Re: [Wiki-research-l] Proposal: build a wiki literature review wiki-style
From: Reid Priedhorsky <reid(a)reidster.net>
To: wiki-research-l(a)lists.wikimedia.org
Date : March-22-11 11:56:24 AM
(I don't know the details of MediaWiki search well, so some of the
following may not be quite right.) What MediaWiki would give us is
full-text search. So it would be easy to search for "John Smith",
and that query would find papers authored by John Smith plus perhaps
other stuff; however, one cannot search for "author = John Smith" and
get only results where the author field matches John Smith and no others.
However, it does seem like Semantic MediaWiki has this type of search
and otherwise behaves much like plain MediaWiki.
I actually wasn't familiar with the full functionality of Semantic MediaWiki
(http://semantic-mediawiki.org/wiki/Semantic_MediaWiki) until I looked it up after your
comments. From what I can see, it certainly seems capable of maintaining all the key
metadata that I, and I assume most other researchers, would need (e.g. authors, dates,
publication source, URLs to HTML or PDF versions).
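To make the "author = John Smith" case concrete, here is a rough sketch of how such a field-restricted query might be sent to a Semantic MediaWiki site through its "ask" API module. The wiki URL and the property names (Author, Title, Year) are assumptions of mine for illustration; the real names would depend on how we set up the wiki.

```python
from urllib.parse import urlencode


def build_ask_url(api_base, author, printouts):
    """Build a URL for SMW's "ask" API that matches only on the Author property.

    `api_base`, the "Author" property, and the printout names are hypothetical;
    they depend on how the wiki is actually configured.
    """
    # SMW query syntax: [[Property::value]] is a condition, |?Name is a printout.
    query = f"[[Author::{author}]]" + "".join(f"|?{p}" for p in printouts)
    params = {"action": "ask", "query": query, "format": "json"}
    return api_base + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical wiki location, for illustration only.
    print(build_ask_url("https://example.org/w/api.php", "John Smith", ["Title", "Year"]))
```

Unlike a full-text search for "John Smith", a query like this would only return pages whose Author property matches.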
There also appear to be various options for Semantic MediaWiki hosting:
Wikia, Referata, etc. It would be nice to not have to deal with the
sysadmin aspects of the project.
I agree that going with a reliable host would be the way to go. I think that, given the
nature of our project, a paid Referata plan would probably be better than Wikia. I for one
could probably easily find grant funding to keep it going.
One final note on bibliographic software: many of these claim to do
automatic import of a reference simply by pointing the software at the
publisher's web page for the references. But I have never seen this work
correctly; always, the imported data needs significant cleanup, enough
that personally I'd rather type it in manually anyway. For example,
titles of ACM papers aren't even correctly cased on the official ACM
pages (e.g., http://dx.doi.org/10.1145/1753326.1753615)!
My only experience with "scraping" pages is with Zotero, and it does it beautifully. I assume (but
don't know) that the current generation of other bibliography software would also do a
good job. Anyway, Zotero has a huge support community, and scrapers for major sources
(including Google Scholar for articles and Amazon for books) are kept very well up to date
for the most part.
Bibliographic software also typically does not include the proper
metadata for automatically lower-casing titles in citations. For
example, the title "Path Selection: Novel Interaction Technique for
Wikipedia" should be lower-cased as "Path selection: Novel interaction
technique for Wikipedia". But so often I see papers with "Path
selection: novel interaction technique for wikipedia". It's embarrassing.
That's definitely a software design flaw; Zotero is certainly rather bad on this point.
But, if we were writing our own (e.g.) MediaWiki -> BibTeX export
script, we could automatically note that "Novel" should be capitalized
(because it begins the subtitle) as well as provide for people to
indicate explicitly title words that should remain capitalized. (In this
instance, the proper BibTeX export syntax would be "Path Selection:
{Novel} Interaction Technique for {Wikipedia}".)
I like the idea of including export facilities in our SMW version, giving users a choice
of which formats to export to.
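As a rough illustration of the capitalization rule you describe, here is a minimal sketch of what the title step of such an export script might do. The function name and the `keep_capitalized` parameter are hypothetical, and it only covers the two cases mentioned above: the first word of a subtitle, and words that someone has explicitly marked.

```python
def protect_title(title, keep_capitalized=()):
    """Wrap words in braces so BibTeX styles do not lower-case them.

    Protects (1) a capitalized word that directly follows a colon, since it
    begins the subtitle, and (2) any word listed in `keep_capitalized`.
    This is a simplified sketch, not a complete BibTeX title formatter.
    """
    keep = set(keep_capitalized)
    out = []
    after_colon = False
    for word in title.split():
        core = word.rstrip(":,;.!?")   # the word without trailing punctuation
        trailing = word[len(core):]     # the punctuation that was stripped
        starts_subtitle = after_colon and core[:1].isupper()
        if core and (starts_subtitle or core in keep):
            out.append("{" + core + "}" + trailing)
        else:
            out.append(word)
        after_colon = word.endswith(":")
    return " ".join(out)


if __name__ == "__main__":
    print(protect_title(
        "Path Selection: Novel Interaction Technique for Wikipedia",
        keep_capitalized={"Wikipedia"},
    ))
```

For the example title, with "Wikipedia" marked as a word to keep, this yields the syntax you gave: "Path Selection: {Novel} Interaction Technique for {Wikipedia}". In the wiki, the list of words to keep could be stored as one more metadata field per paper.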
Would it be feasible to have both, and use them concurrently so that
researchers could use one or the other, or both, as they prefer? I'm
thinking of something like this (for purpose of illustration, let's
call the chosen MediaWiki instance MW and the chosen dedicated online
shared bibliographic tool BT):
Bi-directional synchronization is hard to get right, particularly when
the two sides have different data models. I think we are much
better off declaring one or the other to be the master and keeping the
rest read-only (i.e. export rather than synchronization).
I like this idea; with SMW as the primary, editable source, a read-only Zotero library
imported from SMW would work well. The problem, though, is that duplicate detection would
be needed to prevent imports from re-adding articles that already exist. A complete
overwrite would not work, since this would break the article IDs used for word processor
integration. Zotero has been slow in implementing duplicate detection, but they finally
have a very impressive solution in alpha
(http://www.zotero.org/blog/new-release-multilingual-zotero-with-duplicates-…).
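To show what I mean by an import that detects duplicates rather than overwriting, here is a minimal sketch. It is not Zotero's algorithm; the record fields (doi, title, year), the matching rule, and the function names are all assumptions of mine for illustration. Existing items keep their IDs and are updated in place, and only unmatched records get new IDs.

```python
import re


def match_key(record):
    """Return a key for spotting duplicates: the DOI if present, else title and year."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    # Ignore case and punctuation so differently cased titles still match.
    title = re.sub(r"[^a-z0-9]+", " ", record.get("title", "").lower()).strip()
    return ("title", title, record.get("year"))


def merge_import(library, incoming, next_id):
    """Merge `incoming` records into `library` (a dict of item ID -> record).

    Matching items are updated in place so their IDs stay stable; new items
    are added under fresh IDs. Returns the next unused ID.
    """
    index = {match_key(rec): item_id for item_id, rec in library.items()}
    for rec in incoming:
        key = match_key(rec)
        if key in index:
            library[index[key]].update(rec)   # same article: keep the old ID
        else:
            library[next_id] = dict(rec)      # new article: assign a new ID
            index[key] = next_id
            next_id += 1
    return next_id


if __name__ == "__main__":
    # Made-up records, for illustration only.
    library = {1: {"title": "A Wiki Study", "year": 2010}}
    incoming = [
        {"title": "A wiki study", "year": 2010, "url": "http://example.org/a"},
        {"title": "Another Paper", "year": 2011},
    ]
    merge_import(library, incoming, next_id=2)
    print(sorted(library))
```

One known gap in this sketch: a record that arrives with a DOI will not match an existing copy that lacks one, so a real import would need a fallback comparison for that case.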
Thanks, Reid, for your great suggestions. I hope this can become a reality.
~ Chitu