Somehow I lost this thread - this is great, Finn, I agree that a shared bibliographic resource need not be restricted to conferences, journals, etc, although specific meta-reviews might be.
The main obstacle in reviewing the WP literature seems to be agreeing on a common method for assembling our disparate efforts into something bigger. In another thread I echoed Reid's ideas about using a wiki to accomplish this; a MediaWiki instance would be ideal.
Andrea
On Fri, Mar 18, 2011 at 10:37 AM, Finn Aarup Nielsen fn@imm.dtu.dk wrote:
1. Create a public Mediawiki instance.
2. Decide on a relatively standardized format of reviewing each paper (metadata formats, an infobox, how to write reviews of each, etc.)
3. Upload your existing Zotero database into this new wiki (I would be happy to write a script to do this).
4. Proceed with paper readings, with the goal that every single paper is looked at by human eyes.
5. Use this content to produce one or more review articles.
There has been some talk of a wiki for papers - also on this list as far as I remember. There is Bibdex (http://www.bibdex.com/), AcaWiki (http://acawiki.org) and I have the "Brede Wiki" (http://neuro.imm.dtu.dk/wiki/). AcaWiki uses Semantic MediaWiki (AFAIK) and I use MediaWiki templates. You can see an example here:
http://neuro.imm.dtu.dk/wiki/Putting_Wikipedia_to_the_test:_a_case_study
There is an infobox with citation information and sections on "related studies" and "critique".
It is a question, though, whether such more generally targeted wikis are appropriate for composing a collaborative paper.
I have also begun a small Wikipedia review that I uploaded to our server yesterday:
http://www2.imm.dtu.dk/pubdb/views/edoc_download.php/6012/pdf/imm6012.pdf
I think I will never be able to do an exhaustive review of all papers, but my idea was to give an overview of as many aspects as possible. I think that some research published outside journals and conferences is interesting, e.g., surveys and some of the statistics produced by Erik Zachte. I don't think that Pew's survey has been peer-reviewed, so "just" including journal and conference papers is in my opinion not quite enough to give a complete picture.
/Finn
Finn Aarup Nielsen, DTU Informatics, Denmark
Lundbeck Foundation Center for Integrated Molecular Brain Imaging
http://www.imm.dtu.dk/~fn/ http://nru.dk/staff/fnielsen/
___________________________________________________________________
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
-- :: Andrea Forte :: Assistant Professor :: College of Information Science and Technology, Drexel University :: http://www.andreaforte.net
I was glad to see this thread on wikiresearch-l as I have been recently discussing a similar proposal with other members of the Wikimedia Research Committee.
To make a long story short: I see major problems with *maintaining* a shared reference pool (along with a lit review system) on wiki pages, no matter how standard a format we come up with. There are excellent free and standards-based services out there designed precisely to allow groups of researchers to collaboratively import, maintain and annotate scholarly references. Zotero is one of them; others are CiteULike, Bibsonomy, Mendeley and Connotea. My feeling is that the majority of people on this list are already using one of these services to maintain their individual reference library.
The reason why these services are superior to a wiki page is that they can both produce human-readable reference lists and export references in any format one may need for writing (JSON, XML, BibTeX, JabRef etc). They all have provisions for posting reviews, tags, notes etc., for aggregating these annotations from several users (unless they are private) and for exporting them.
If we were to keep our shared reference pool hosted on any of these services we could still:
- embed or republish a list of references elsewhere (e.g. in a wiki)
- make sure the list of references in the wiki is kept up-to-date via the external service
- allow people to access bibliographic metadata in a format suitable for writing and in an environment they are already familiar with
- allow people to write reviews and annotations both on the wiki and via the external service itself
If we think there is added value in hosting reviews on a wiki, what needs to be implemented is a connector between MediaWiki and any of these services (they all have open APIs).
Such a connector would presumably:
- pull bibliographic metadata from the external service
- define a unique ID for each publication (based on a DOI when available)
- create a wiki page per publication using the unique ID as a title and populating it with the imported metadata
- retrieve live user annotations and comments from the external service
- allow further comments and annotations to be hosted on the wiki via the article page
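As a rough illustration of the "unique ID" step, here is a minimal Python sketch. The record layout (`doi`, `title`, `year` keys), the `doi:`/`local:` prefixes and the hash-based fallback are all assumptions made for the example, not something any of the services above prescribes.

```python
import hashlib
import re

# Strips common ways a DOI is written: resolver URLs or a "doi:" prefix.
_DOI_PREFIX = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:)", re.IGNORECASE)


def publication_id(record: dict) -> str:
    """Return a stable ID: DOI-based when a DOI exists, else a title/year hash."""
    doi = (record.get("doi") or "").strip()
    if doi:
        # DOIs are case-insensitive, so lower-casing gives one canonical form.
        return "doi:" + _DOI_PREFIX.sub("", doi).lower()
    # Hypothetical fallback: hash the normalized title plus the year.
    title = re.sub(r"\W+", " ", record.get("title", "")).strip().lower()
    key = f"{title}|{record.get('year', '')}"
    return "local:" + hashlib.sha1(key.encode("utf-8")).hexdigest()[:12]
```

The fallback only guards against trivial differences in case and punctuation; records without a DOI would probably still need a human check for duplicates.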
It would be sad to see a lot of effort put into creating yet another static wiki-based bibliography just to see it become obsolete because no one is actively maintaining it or because the output it produces is in a format that does not allow it to be easily queried, reused or republished.
Dario
Hi all,
I've been following this discussion with interest. Please let me add some comments inline, complementing Dario's answer.
----- Original message ----
From: Dario Taraborelli dtaraborelli@wikimedia.org To: aforte@gatech.edu; Research into Wikimedia content and communities wiki-research-l@lists.wikimedia.org Sent: Fri, 18 March 2011 17:30 Subject: Re: [Wiki-research-l] Fwd: Proposal: build a wiki literature review wiki-style (was: Re: Wikipedia literature review - include or exclude conference articles)
I was glad to see this thread on wikiresearch-l as I have been recently discussing a similar proposal with other members of the Wikimedia Research Committee.
To make a long story short: I see major problems with *maintaining* a shared reference pool (along with a lit review system) on wiki pages, no matter how standard a format we come up with. There are excellent free and standards-based services out there designed precisely to allow groups of researchers to collaboratively import, maintain and annotate scholarly references. Zotero is one of them; others are CiteULike, Bibsonomy, Mendeley and Connotea. My feeling is that the majority of people on this list are already using one of these services to maintain their individual reference library.
===========
This has been a frequent request, as well as a very old discussion (at least, I can trace it back to WikiSym 2007, and continuing in the Workshop on Interdisciplinary Research on Wikipedia, a.k.a. WIRW, in WikiSym 2008).
In fact, after our discussion in Porto we proposed to create a Wikiresearch Portal (we initially tried to call it a planet, which was a very bad idea....). I think we finally deleted the domain and removed the virtual machine from our servers at Libresoft a few months ago, after years of inactivity.
I think Jakob Voss was the first (or one of the first) to provide a comprehensive compilation of research literature related to Wikipedia, including tags to categorize content and keyword search. I remember I got many references from Jakob's repo when I was starting to work on my dissertation, and it was quite useful. It is no longer available, AFAIK, and I'm not sure if all that info was migrated to newer repositories.
After that, I remember there was a (French?) university that also offered a searchable compilation of Wikipedia research papers. Unfortunately, I think I lost the link, and I cannot find it any more.
For the last session of Wikipedia research in Wikimania 2010, I worked with Benjamin Mako and Jodi Schneider to filter out available references. The initial pool exceeded 3,000 references, so you can imagine this is a really daunting task (that's why we stressed the disclaimer that it wasn't a comprehensive or complete review).
Mako and Jodi introduced me to AcaWiki. I think the idea is very good, and it also reminds me of similar initiatives in other areas (like PLoS ONE: http://www.plosone.org/home.action). However, I think the number of references reviewed there is still low.
A very positive point with AcaWiki is that it is freely licensed. Zotero would be a good alternative, but I had to uninstall it from my Firefox, since it was taking ages to start the browser. There are plans for a standalone version, and also to improve the UI. The rest of the web services are good for maintaining compilations (though each one has its own caveats) but usually bad for direct exchange of metadata (you always need to use intermediate formats like BibTeX to migrate your info). Mendeley has thrilling features, but I learned that it is proprietary (from the EULA of the standalone version), and honestly I'm not sure whether they will start to charge for the service at some point, or modify their API or service agreement (just see what's happening with Twitter).
===========
The reason why these services are superior to a wiki page is that they can both produce human-readable reference lists and export references in any format one may need for writing (JSON, XML, BibTeX, JabRef etc). They all have provisions for posting reviews, tags, notes etc., for aggregating these annotations from several users (unless they are private) and for exporting them.
If we were to keep our shared reference pool hosted on any of these services we could still:
- embed or republish a list of references elsewhere (e.g. in a wiki)
- make sure the list of references in the wiki is kept up-to-date via the external service
- allow people to access bibliographic metadata in a format suitable for writing and in an environment they are already familiar with
- allow people to write reviews and annotations both on the wiki and via the external service itself
If we think there is added value in hosting reviews on a wiki, what needs to be implemented is a connector between MediaWiki and any of these services (they all have open APIs).
Such a connector would presumably:
- pull bibliographic metadata from the external service
- define a unique ID for each publication (based on a DOI when available)
- create a wiki page per publication using the unique ID as a title and populating it with the imported metadata
- retrieve live user annotations and comments from the external service
- allow further comments and annotations to be hosted on the wiki via the article page
It would be sad to see a lot of effort put into creating yet another static wiki-based bibliography just to see it become obsolete because no one is actively maintaining it or because the output it produces is in a format that does not allow it to be easily queried, reused or republished.
Dario
======
I can confirm that the #1 complaint I get from colleagues when I point them to the static Wikiresearch bibliography pages on meta is: "it is not easily searchable" (by keywords, content, year, author, etc.). If we also plan to have annotations or reviews, I believe a standard wiki (such as MediaWiki) is simply not the way to go (disclaimer: I say this even though I'm a great fan and advocate of wikis, and I like many features in MediaWiki).
So, my suggestions are:
1. Use a platform allowing extensive search capabilities (perhaps semantic wikis, but I haven't tested many of them yet). In this case, I do think that new NoSQL alternatives might be in order for searching through the text of reviews.
2. Support multiple formats for adding new references (including importing from major existing compilations, and web content like CiteULike).
3. Include a feature to rate papers according to different criteria (number of positive reviews, number of citations, or combinations of several search conditions).
Of course, I'd be very glad to help with this initiative if it is finally launched (once again).
Best, Felipe.
======
Hi everyone,
I see two distinct lines forming from this thread. Most people are advocating a MediaWiki instance dedicated to semantically rich shared summaries of scholarly articles (primarily Wikipedia-focused, but such a platform could obviously be used for any kind of scholarly article). However, Dario and Felipe seem to be arguing (Dario more than Felipe) that a dedicated bibliographic software tool would be far more useful and easier to maintain long term.
Personally, with the energy I see on the list, I believe that a MediaWiki instance (e.g. AcaWiki or BredeWiki) would be viable if we tried again this time. I for one would be very happy to make such a resource my repository of research summaries--I am currently building a collection on my personal website, but a shared resource would obviously be 1,000 times better. If one MediaWiki instance could be agreed on by this community, I believe we could definitely build a very valuable, ongoing resource. It would be far easier to edit and contribute to such an instance than to any chosen dedicated bibliographic tool.
However, I do strongly share Dario's feelings that a dedicated bibliographic tool would be far more useful for a variety of uses. One of the most important functionalities for me would be automatic citations into papers that I'm working on. I haven't used a wide variety of citation managers, but the functionality in EndNote and Zotero is what I'm talking about; I just don't see how a MediaWiki instance could do that, unless some standardized bibliographic information is embedded into each article page to begin with. Moreover, as Dario and Felipe explained, while far from perfect, the search capabilities of dedicated bibliography managers are far superior to what I presently see in MediaWiki.
Given that there is the need for both resources (MediaWiki instance and a dedicated bibliographic tool), I think it is much easier to automatically and regularly export from a shared resource like Zotero (for example) to the MediaWiki tool than to go the other way around. For this reason, I tend to favour Dario's proposal.
Would it be feasible to have both, and use them concurrently so that researchers could use one or the other, or both, as they prefer? I'm thinking of something like this (for purpose of illustration, let's call the chosen MediaWiki instance MW and the chosen dedicated online shared bibliographic tool BT):
* All the articles would be represented on both MW and BT.
* Automatic exports would be enabled via custom scripts from BT to MW; there would be no automatic script for going the other direction.
* Each MW article would have two sections:
  -- The top section would be automatically generated from BT, and would not be user editable from within MW; it could only be edited by the equally open BT.
  -- The bottom section would be standard MW editable text. Users would be expected to not duplicate information already in the top BT section.
* Users who use both MW and BT could copy contributions from the MW bottom (editable) section into BT. Then after these are automatically exported (perhaps once a week), MW users could remove the duplicate information from MW.
* Thus, MW would always have a superset of BT information (at least, after the weekly export).
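To make the two-section idea concrete, here is a minimal Python sketch of what the periodic BT-to-MW export might do to a single page. The marker comments and the function name are invented for the example; how the page text is actually read from and written to the wiki is left out.

```python
# Hypothetical marker comments delimiting the generated (top) section.
BEGIN = "<!-- BT-EXPORT:BEGIN (generated; edit in BT) -->"
END = "<!-- BT-EXPORT:END -->"


def refresh_page(page_text: str, generated: str) -> str:
    """Replace the generated top section and keep the user-edited remainder."""
    block = f"{BEGIN}\n{generated.strip()}\n{END}"
    start = page_text.find(BEGIN)
    end = page_text.find(END)
    if start == -1 or end == -1 or end < start:
        # New page, or no markers yet: put the generated block on top
        # and keep whatever text was already there below it.
        rest = page_text.strip()
        return block + ("\n\n" + rest if rest else "") + "\n"
    # Swap only the text between the markers; everything else is untouched.
    return page_text[:start] + block + page_text[end + len(END):]
```

Nothing here stops a user from editing the top section within MW; such edits would simply be overwritten at the next export, which is one way to get the "edit it in BT" behaviour without any special page protection.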
The benefit is that those who need full BT functionality would have it, along with most of the summary information included; those who don't need the functionality and don't want to deal with the disadvantages of the BT technology would have all the BT information (including metadata) within MW, and would also be able to contribute to MW, which would always have at least as much information as BT.
Does this make sense? Would it be useful? Does it sound feasible?
Regards, Chitu
Hi Chitu,
Some reactions inline.
One of my most important functionalities would be automatic citations into papers that I'm working with. I haven't used a wide variety of citation managers, but the functionality in EndNote and Zotero is what I'm talking about; I just don't see how a MediaWiki instance could do that, unless some standardized bibliographic information be embedded into each article page to begin with.
Agreed. What I envision is that we write a script which would export the bibliographic data into whatever formats people prefer: BibTeX (my own preferred format), Zotero, EndNote, whatever. I'm happy to write this script for relatively sane formats. This would then let people create citations in the way they usually do.
Moreover, as Dario and Felipe explained, while far from perfect, the search capabilities of dedicated bibliography managers are far superior to what I presently see in MediaWiki.
(I don't know the details of MediaWiki search well, so some of the following may not be quite right.) What MediaWiki would give us is full-text search. So it would be easy to search for "John Smith", and that query would find papers authored by John Smith plus perhaps other stuff; however, one cannot search for "author = John Smith" and get only results where the author field matches John Smith.
However, it does seem like Semantic MediaWiki has this type of search and otherwise behaves much like plain MediaWiki.
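To illustrate the difference between the two kinds of search, here is a small Python sketch over in-memory records. The sample records and field names are made up, and this does not reflect how MediaWiki or Semantic MediaWiki implement search internally.

```python
# Made-up sample records; the field names are assumptions for illustration.
PAPERS = [
    {"title": "A study of wiki editing", "author": ["John Smith"],
     "review": "Solid methods."},
    {"title": "Replication notes", "author": ["Ann Lee"],
     "review": "Re-runs the analysis by John Smith."},
]


def _values(record):
    """Yield every field value of a record as a string."""
    for value in record.values():
        if isinstance(value, list):
            yield from value
        else:
            yield str(value)


def fulltext_search(records, phrase):
    """Match the phrase anywhere in the record, as a plain text index would."""
    needle = phrase.lower()
    return [r for r in records if any(needle in v.lower() for v in _values(r))]


def field_search(records, field, value):
    """Match only when the named field holds exactly this value."""
    wanted = value.lower()
    hits = []
    for r in records:
        held = r.get(field, [])
        if not isinstance(held, list):
            held = [str(held)]
        if any(wanted == h.lower() for h in held):
            hits.append(r)
    return hits
```

Here `fulltext_search(PAPERS, "John Smith")` returns both records, because the second one mentions the name in its review text, while `field_search(PAPERS, "author", "John Smith")` returns only the paper he wrote.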
Maybe you or others could say more about what types of search are important to you?
AcaWiki uses Semantic MediaWiki, I believe. However, I have some reservations about it:
* Is it sufficiently stable? (E.g., the FAQ says they "just launched" but the page has not been edited in two years.)
* The focus on "summaries" worries me, and is the target audience laypeople? We're talking about an annotated bibliography targeted at researchers, which is a different audience.
* I don't care for the user interface (this is a mix of personal opinion and professional opinion as an HCI researcher).
As-is, I'm not very interested in AcaWiki. But if there is an opportunity to make significant changes, then it seems plausible. I would want to know about hosting, backups, etc., to make sure that it is a reliable platform.
There also appear to be various options for Semantic MediaWiki hosting: Wikia, Referata, etc. It would be nice to not have to deal with the sysadmin aspects of the project.
One final note on bibliographic software: many of these claim to do automatic import of a reference simply by pointing the software at the publisher's web page for the references. But I have never seen this work correctly; always, the imported data needs significant cleanup, enough that personally I'd rather type it in manually anyway. For example, titles of ACM papers aren't even correctly cased on the official ACM pages (e.g., http://dx.doi.org/10.1145/1753326.1753615)!
Bibliographic software then also typically does not include the proper metadata for automatically lower-casing titles in citations. For example, the title "Path Selection: Novel Interaction Technique for Wikipedia" should be lower-cased as "Path selection: Novel interaction technique for Wikipedia". But so often I see papers with "Path selection: novel interaction technique for wikipedia". It's embarrassing.
But, if we were writing our own (e.g.) MediaWiki -> BibTeX export script, we could automatically note that "Novel" should be capitalized (because it begins the subtitle) as well as provide for people to indicate explicitly title words that should remain capitalized. (In this instance, the proper BibTeX export syntax would be "Path Selection: {Novel} Interaction Technique for {Wikipedia}".)
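A minimal sketch of that title-protection rule in Python, assuming titles are stored in title case and that editors supply the list of words to keep capitalized. The function name `protect_title` and its `keep_capitalized` parameter are hypothetical, not part of any existing tool.

```python
import re

def protect_title(title, keep_capitalized=()):
    """Brace-protect words that BibTeX styles must not lower-case.

    Protects the first word after a colon (the start of the subtitle) and
    any word listed in keep_capitalized (e.g. proper nouns flagged by editors).
    """
    keep = set(keep_capitalized)
    out = []
    after_colon = False
    for word in title.split(" "):
        # Split off trailing punctuation so "Wikipedia," still matches.
        match = re.match(r"^(.*?)([.,;:!?]*)$", word)
        core, trail = match.group(1), match.group(2)
        if core and (after_colon or core in keep):
            out.append("{" + core + "}" + trail)
        else:
            out.append(word)
        after_colon = word.endswith(":")
    return " ".join(out)

example = protect_title(
    "Path Selection: Novel Interaction Technique for Wikipedia",
    keep_capitalized=["Wikipedia"],
)
```

For the title above this yields the export syntax given in the message. The sketch handles only single-word protections; multi-word names would need a phrase list instead.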
Would it be feasible to have both, and use them concurrently so that researchers could use one or the other, or both, as they prefer? I'm thinking of something like this (for purpose of illustration, let's call the chosen MediaWiki instance MW and the chosen dedicated online shared bibliographic tool BT):
Bi-directional synchronization is hard to get right, particularly when the two sides have different data models. I think we are much better off declaring one or the other to be the master and the rest should remain read-only (i.e. export rather than synchronization).
To be clear, I'm offering to do the following things:
1. Help define a reasonable starting summary template for papers.
2. Build the proper MediaWiki infoboxes and whatnot to realize what we decide from #1 (perhaps concurrently to that discussion, to facilitate it).
3. Write a script to import some sane text-based format to this MediaWiki instance. (I assume Zotero can export such a format.) This would be run once or a few times for initial import, not regularly for synchronization.
4. Write a script to export the MediaWiki data to BibTeX and one or two other sane text-based formats, and arrange for it to be run frequently so people's paper citations stay up to date.
5. Read and annotate papers.
6. Help with follow-up work (synthesizing, writing survey, etc. etc.)
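As a rough idea of what the export script in item 4 might look like, here is a minimal Python sketch. The `Paper` template, its field names, the sample values, and the citation key are all assumptions for illustration; no infobox format has been agreed on, and real pages would need a proper wikitext parser to cope with nested templates and multi-line values.

```python
import re

def parse_infobox(wikitext):
    """Extract 'key = value' parameters from a one-parameter-per-line template."""
    fields = {}
    for line in wikitext.splitlines():
        match = re.match(r"^\s*\|\s*(\w+)\s*=\s*(.*?)\s*$", line)
        if match:
            fields[match.group(1).lower()] = match.group(2)
    return fields

def to_bibtex(key, fields, entry_type="inproceedings"):
    """Render parsed fields as one BibTeX entry, with fields in sorted order."""
    body = ",\n".join(
        f"  {name} = {{{value}}}" for name, value in sorted(fields.items())
    )
    return f"@{entry_type}{{{key},\n{body}\n}}"

# Hypothetical page content; author and year are made up.
page = """{{Paper
| author = John Smith
| title = Path Selection: {Novel} Interaction Technique for {Wikipedia}
| year = 2011
}}"""

entry = to_bibtex("Smith2011Path", parse_infobox(page))
print(entry)
```

Because the export is one-way, it can simply regenerate the whole BibTeX file on each run.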
Reid
Hi Reid,
My responses to your responses are inline.
-------- Original message --------
Subject: Re: [Wiki-research-l] Proposal: build a wiki literature review wiki-style
From: Reid Priedhorsky reid@reidster.net
To: wiki-research-l@lists.wikimedia.org
Date: March-22-11 11:56:24 AM
(I don't know the details of MediaWiki search well, so some of the following may not be quite right.) What MediaWiki would give us is full-text search. So it would be easy to search for "John Smith", and that query would find papers authored by John Smith plus perhaps other stuff; however, one cannot search for "author = John Smith" and get only results where the author field matches John Smith and no others.
However, it does seem like Semantic MediaWiki has this type of search and otherwise behaves much like plain MediaWiki.
I actually wasn't familiar with the full functionality of Semantic MediaWiki (http://semantic-mediawiki.org/wiki/Semantic_MediaWiki) until I looked it up after your comments. From what I can see, it certainly seems able to maintain all the key metadata that I, and I assume most other researchers, would need (e.g. authors, dates, publication source, URLs to HTML or PDF versions, etc.).
There also appear to be various options for Semantic MediaWiki hosting: Wikia, Referata, etc. It would be nice to not have to deal with the sysadmin aspects of the project.
I agree that going with a reliable host would be the way to go. I think that for the nature of our project, choosing a paid Referata plan would probably be better than going for Wikia. I for one could probably easily find grant funding to keep it going.
One final note on bibliographic software: many of these claim to do automatic import of a reference simply by pointing the software at the publisher's web page for the references. But I have never seen this work correctly; always, the imported data needs significant cleanup, enough that personally I'd rather type it in manually anyway. For example, titles of ACM papers aren't even correctly cased on the official ACM pages (e.g., http://dx.doi.org/10.1145/1753326.1753615)!
My only experience with "scraping" pages is with Zotero, and it does it beautifully. I assume (but don't know) that the current generation of other bibliography software would also do a good job. Anyway, Zotero has a huge support community, and scrapers for major sources (including Google Scholar for articles and Amazon for books) are kept very well up to date for the most part.
Bibliographic software then also typically does not include the proper metadata for automatically lower-casing titles in citations. For example, the title "Path Selection: Novel Interaction Technique for Wikipedia" should be lower-cased as "Path selection: Novel interaction technique for Wikipedia". But so often I see papers with "Path selection: novel interaction technique for wikipedia". It's embarrassing.
That's definitely a software design flaw; Zotero is certainly rather bad on this point.
But, if we were writing our own (e.g.) MediaWiki -> BibTeX export script, we could automatically note that "Novel" should be capitalized (because it begins the subtitle) as well as provide for people to indicate explicitly title words that should remain capitalized. (In this instance, the proper BibTeX export syntax would be "Path Selection: {Novel} Interaction Technique for {Wikipedia}".)
I like the idea of including export facilities in our SMW version, giving users the option of what they would like to export to.
Would it be feasible to have both, and use them concurrently so that researchers could use one or the other, or both, as they prefer? I'm thinking of something like this (for purpose of illustration, let's call the chosen MediaWiki instance MW and the chosen dedicated online shared bibliographic tool BT):
Bi-directional synchronization is hard to get right, particularly when the two sides have different data models. I think we are much better off declaring one or the other to be the master and the rest should remain read-only (i.e. export rather than synchronization).
I like this idea; with SMW as the primary, editable source, a read-only Zotero library imported from the SMW would work well. The problem, though, is that duplicate detection would need to prevent imports from adding existing articles. A complete overwrite would not work, since this would break article IDs for word processor integration. Zotero has been slow on implementing duplicate detection, but they finally have a very impressive solution in alpha (http://www.zotero.org/blog/new-release-multilingual-zotero-with-duplicates-d...).
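The constraint described here (re-import without duplicates and without breaking existing IDs) can be sketched as a keyed merge. This is a minimal Python illustration under assumed data structures; it is not Zotero's API or its duplicate-detection logic, and the function name, the sample entries, and the use of citation keys as the matching field are all hypothetical.

```python
def import_export(library, exported, next_id):
    """Merge a wiki export into a read-only local library.

    library:  dict mapping cite key -> {"id": local_id, "data": {...}}
    exported: dict mapping cite key -> data dict from the wiki
    Existing entries are updated in place so their local IDs never change;
    only cite keys not seen before receive a new ID.
    """
    for key, data in exported.items():
        if key in library:
            library[key]["data"] = data  # refresh metadata, keep the ID
        else:
            library[key] = {"id": next_id, "data": data}
            next_id += 1
    return next_id

# Made-up entries for illustration.
library = {"Panciera2010Lurking": {"id": 1, "data": {"year": "2010"}}}
exported = {
    "Panciera2010Lurking": {"year": "2010", "note": "updated on the wiki"},
    "Smith2011Path": {"year": "2011"},
}
next_id = import_export(library, exported, next_id=2)
```

Matching on a stable key rather than overwriting the library is what keeps word-processor references intact across repeated imports.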
Thanks, Reid, for your great suggestions. I hope this can become a reality.
~ Chitu
On 3/22/11 4:28 PM, Chitu Okoli wrote:
Reid wrote:
There also appear to be various options for Semantic MediaWiki hosting: Wikia, Referata, etc. It would be nice to not have to deal with the sysadmin aspects of the project.
I agree that going with a reliable host would be the way to go. I think that for the nature of our project, choosing a paid Referata plan would probably be better than going for Wikia. I for one could probably easily find grant funding to keep it going.
Sure. If nothing else I'd be happy to chip in personally. I could also ask around for funding here at IBM, but I'm quite pessimistic on that.
Paid plans run from $240 to $960/year, and we could certainly get started for free (http://www.referata.com/wiki/Referata:Features).
I'm not ready to write off AcaWiki, but I have a number of significant concerns. Some of these I've mentioned before. I'd really like someone from that project to comment on these.
* Is the project dead? The mailing list is pretty much empty and the amount of real editing activity in the past 30 days is pretty low.
* It appears that the project self-hosts - this means that the project has to do its own sysadmin work, which appears to have been a problem (e.g., the domain expired earlier this month and no one noticed until the site went down!).
* Is the target audience correct? I think we want to specifically target our annotated bibliography to researchers, but AcaWiki appears to be targeting laypeople as well as researchers (and IMO it would be very tricky to do both well).
* I don't think the focus on "summaries" is right. I think we need a structured infobox plus semi-structured text (e.g. sections for contributions, evidence, weaknesses, questions).
* It doesn't look like a MediaWiki. Since the MW software is so dominant, that means pretty much everyone who knows about editing wikis knows how to use MW - and not looking like MW means there's no immediate "aha! I can edit this". There's a lot of value in familiarity.
I will post an invitation on the AcaWiki mailing list to come here and participate.
One final note on bibliographic software: many of these claim to do automatic import of a reference simply by pointing the software at the publisher's web page for the references. But I have never seen this work correctly; always, the imported data needs significant cleanup, enough that personally I'd rather type it in manually anyway. For example, titles of ACM papers aren't even correctly cased on the official ACM pages (e.g., http://dx.doi.org/10.1145/1753326.1753615)!
My only experience with "scraping" pages is with Zotero, and it does it beautifully. I assume (but don't know) that the current generation of other bibliography software would also do a good job. Anyway, Zotero has a huge support community, and scrapers for major sources (including Google Scholar for articles and Amazon for books) are kept very well up to date for the most part.
Perhaps I'm just unlucky, then - I've only ever tried it on ACM papers (which it failed to do well, so I stopped).
Bi-directional synchronization is hard to get right, particularly when the two sides have different data models. I think we are much better off declaring one or the other to be the master and the rest should remain read-only (i.e. export rather than synchronization).
I like this idea; with SMW as the primary, editable source, a read-only Zotero library imported from the SMW would work well. The problem, though, is that duplicate detection would need to prevent imports from adding existing articles. A complete overwrite would not work, since this would break article IDs for word processor integration. Zotero has been slow on implementing duplicate detection, but they finally have a very impressive solution in alpha (http://www.zotero.org/blog/new-release-multilingual-zotero-with-duplicates-d...).
I don't know anything about how article IDs work in Zotero, but how to build a unique ID for each is an interesting, subtle, and important problem. Others have suggested using opaque IDs such as DOI. I think this is a mistake, because it means that they are utterly meaningless to people when creating citations. For example, consider the following two citations that I might put in my LaTeX code.
\cite{10.1145/1753326.1753615}
\cite{Panciera2010Lurking}
The first means nothing to me, but the second is a useful reminder as to the paper I'm citing. That's what CiteULike does, and it's built from first author, year, first meaningful word of title. In the tiny percentage of cases where this is not unique, a disambiguation digit could be added.
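A minimal Python sketch of such a key scheme follows. It is an illustration of the rule as described above, not CiteULike's actual algorithm; the stop-word list, the function name, and the sample title are assumptions (the title is invented so that the result matches the key in the example).

```python
import re

# Small illustrative stop-word list; a real one would be longer.
STOP_WORDS = {"a", "an", "the", "on", "of", "in", "for", "and", "to"}

def make_cite_key(surname, year, title, existing=()):
    """Build SurnameYearWord, appending a digit only if the key is taken."""
    words = re.findall(r"[A-Za-z]+", title)
    # First word of the title that is not a stop word, or "" if none.
    meaningful = next((w for w in words if w.lower() not in STOP_WORDS), "")
    base = f"{surname}{year}{meaningful[:1].upper()}{meaningful[1:]}"
    key, n = base, 2
    while key in existing:
        key = f"{base}{n}"
        n += 1
    return key

# Hypothetical title, chosen only to reproduce the key shown above.
key = make_cite_key("Panciera", 2010, "The Lurking Example")
clash = make_cite_key("Panciera", 2010, "The Lurking Example",
                      existing={"Panciera2010Lurking"})
```

The sketch assumes ASCII surnames without spaces; names with accents, particles, or hyphens would need a normalization step first.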
I don't know how citation works in Word et al., but I would hope you're not stuck with opaque numeric IDs and/or that Zotero doesn't force you to use integers or something like that.
Reid
On AcaWiki:
This sounds in line with AcaWiki's larger goal, and the small community there is generally open to new ideas about how to structure pages and data. I also think the project would be appropriate as a Wikimedia project, which would address many of the self-hosting issues and tie into similar work on a WikiScholar project. No need to have multiple tiny projects when a single one would do.
I think we want to specifically target our annotated bibliography to researchers, but AcaWiki appears to be targeting laypeople as well as researchers (and IMO it would be very tricky to do both well).
You could allow each biblio page to decide who its audience is. If there is ever a conflict between a lay and a specialist audience, you can have two sets of annotations. I'd like to see this happen in practice before optimizing against it.
- I don't think the focus on "summaries" is right. I think we need a structured infobox plus semi-structured text (e.g. sections for contributions, evidence, weaknesses, questions).
Again, I think both could be appropriate for a stub bibliography page; and that a great one would include both a summary and structured sections and infobox data. [acawiki does like infobox-style structure]
- It doesn't look like a MediaWiki. Since the MW software is so
This is easy to fix -- people who like the current acawiki look can use their own skin.
On Data-scraping and WikiScholar parallels:
My only experience with "scraping" pages is with Zotero, and it does it beautifully. I assume (but don't know) that the current generation of other bibliography software would also do a good job. Anyway, Zotero has a huge support community, and scrapers for major sources (including Google Scholar for articles and Amazon for books) are kept very well up to date for the most part.
Perhaps I'm just unlucky, then - I've only ever tried it on ACM papers (which it failed to do well, so I stopped).
Brian Mingus, who is working on WikiScholar (another related project which may be suitable), has a great deal of experience with scraping, both using APIs and otherwise, and that is the foundation of his effort.
I don't know anything about how article IDs works in Zotero, but how to build a unique ID for each is an interesting, subtle, and important problem.
This is important, and has also been discussed elsewhere. Some of this discussion would be appropriate here: http://meta.wikimedia.org/wiki/Talk:WikiScholar
On 3/23/11 1:16 PM, Samuel Klein wrote:
You could allow each biblio page to decide who its audience is. If there is ever a conflict between a lay and a specialist audience, you can have two sets of annotations. I'd like to see this happen in practice before optimizing against it.
I think that is workable if the two sides don't step on each other's toes too much. I am also coming around to the view that we should just try it and see what happens.
- It doesn't look like a MediaWiki. Since the MW software is so
This is easy to fix -- people who like the current acawiki look can use their own skin.
Well, my concern is for newcomers who by definition don't have a skin configured. What I want is this reaction:
<browse to http://acawiki.org/whatever>
"Hey! This is MediaWiki! I know how to use this!"
<edit stuff>
But now I think the following reaction is more likely:
<browse to http://acawiki.org/whatever>
"Hmmm, what's this?"
<browse browse>
<leave>
These small barriers to entry matter. My basic argument is that leveraging familiarity by making it look like something people have seen before is more important than branding.
Reid
On 3/18/11 12:30 PM, Dario Taraborelli wrote:
There are excellent free and standards-based services out there designed precisely to allow groups of researchers to collaboratively import, maintain and annotate scholarly references. Zotero is one of them, others are: CiteULike, Bibsonomy, Mendeley, Connotea. My feeling is that the majority of people on this list are already using one of these services to maintain their individual reference library.
My take on this software is different: all of the ones I've tried are really rather bad.
* Zotero - software to install, clunky UI.
* CiteULike - clunky, sharing is hard, weird duplication of publications.
* Bibdex - seems to be run by a private company which is one guy, no blog activity since April 2010, login required.
* Mendeley - non-free software to install, clunky?
* Bibsonomy - couldn't figure out how to use it, lots of bibliographic database noise in the interface that gets in the way.
* Connotea - run by a private company, login required (I didn't create a login so I don't know if the UI is any good), API seems limited???
I for one do not use any of these. It's either a cobbled-together BibTeX document or my own Yabman software, which has a lot of flaws but is at least fast for putting together a paper's ref list and getting a decently formatted BibTeX file.
The main benefit of doing it with Mediawiki is that it has a nice clean interface and it's super easy to get started - just go to the website and edit. No login required, nothing to install, no software to learn (other than a very basic knowledge of wiki markup). We know this is a big reason Wikipedia is successful, and that barriers to entry, even if small, really discourage people from getting started, and if they don't get started they don't develop into core contributors.
There is also a rich ecosystem of support software (e.g. It's All Text extension and Emacs wikipedia-mode). Bottom line, we're asking people to commit to spending whole days of work in the system. Would I do that in MediaWiki? Yes, definitely. Any of the other bibliographic software mentioned above? No.
I would be more than happy to use something other than Mediawiki, but thus far nothing that seems acceptable to me has been suggested.
Others in this thread have mentioned projects similar to what I suggest:
* AcaWiki - This is similar to what I suggest, though the template used for papers needs work IMO. Could be a plausible starting point. The fact that it doesn't look like regular Mediawiki is a drawback.
* BredeWiki - Very much along the lines of what I suggest.
I think a key goal here is to not let the perfect become the enemy of the good. We can start a Mediawiki-based bibliography *now* and easily mold it into something which meets our needs quite well. If we want to add on fancy connectors later, that's fine; but IMO simple exporters would be plenty for most uses.
Reid
I would recommend using AcaWiki. There are efforts afoot to help make that more mediawiki-like (and to add other template models to it), and people are exploring ways to expand its audience and functionality.
There are also some active wikimedia researchers already using it, including Mako Hill, who is on this list and occasionally gives a 'literature review' talk at wiki conferences.
SJ
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l