I'd like to hear from the people on this list about a proposal to create a dedicated namespace to host open (tabular) data and make these datasets persistently identifiable, version-controlled and easily embeddable into other wikis.
While this use case is currently not within the scope of Wikidata (and could potentially live on other Wikimedia wikis, like Meta or Commons), I'd appreciate input from the Wikidata community on this draft:
https://meta.wikimedia.org/wiki/DataNamespace
Some interesting discussion on the talk page:
http://meta.wikimedia.org/wiki/Talk:DataNamespace
Dario
Hi Dario,
Thanks for sharing this proposal. As I posted on the talk page, I believe it would be a great opportunity to partner with dedicated organizations like Datahub [1] (OKFN/CKAN).
Wikidata could also play an important role in describing semantically how to interpret the information, for instance by mapping the fields of the raw data to Wikidata properties or by describing the contents of the file.
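To make the field-mapping idea concrete, here is a minimal sketch in Python. It assumes the dataset is a CSV file with a one-row header; the mapping table, the function name and the sample values are all made up for illustration, and the property IDs would need to be checked against Wikidata before any real use.

```python
import csv
import io

# Hypothetical mapping from column headers to Wikidata property IDs.
# The IDs are illustrative and should be verified on Wikidata.
COLUMN_TO_PROPERTY = {
    "country": "P17",
    "population": "P1082",
}

def annotate_rows(csv_text, mapping):
    """Yield each data row as a dict keyed by property ID.

    Columns without an entry in `mapping` keep their original header.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        yield {mapping.get(header, header): value for header, value in row.items()}

# Dummy data, only there to exercise the function.
SAMPLE = "country,population,note\nExampleland,12345,dummy\n"
annotated = list(annotate_rows(SAMPLE, COLUMN_TO_PROPERTY))
```

Columns that have no mapping are passed through untouched, so a partially annotated file still remains usable.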
The DataNamespace could be used to render selected portions of a large file, since rendering the whole file would be too much to handle in real time.
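As a rough sketch of what "selected portions" could mean in practice, the Python snippet below reads a CSV source lazily and returns only a window of rows. The function name, the parameters and the generated sample data are assumptions for illustration, not part of the proposal.

```python
import csv
import io
from itertools import islice

def select_rows(lines, start, count):
    """Return the header and `count` data rows beginning at data-row index `start`.

    `lines` is any iterable of text lines (for example an open file), so rows
    past the requested window are never read.
    """
    reader = csv.reader(lines)
    header = next(reader, None)
    if header is None:
        # Empty source: nothing to render.
        return [], []
    rows = list(islice(reader, start, start + count))
    return header, rows

# Generated dummy data standing in for a large file.
SAMPLE = io.StringIO("id,value\n" + "".join(f"{i},{i * i}\n" for i in range(1000)))
header, rows = select_rows(SAMPLE, start=10, count=3)
```

The rows before the window are still parsed and skipped, so this only bounds what is read and rendered; a real implementation would probably want an index or paging on the server side.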
Cheers, Micru
On Tue, Aug 27, 2013 at 8:06 PM, Dario Taraborelli <dtaraborelli@wikimedia.org> wrote: [...]
The DataNamespace is interesting. It is already possible to simply add tabular data to a MediaWiki installation and render it using one of the tabular rendering extensions. I am using the SimpleTable extension on the Brede Wiki. I use LibreOffice Calc to edit the data, but on-wiki table editing and flexible formatting would be preferable. Computation and visualization of the data from the wiki are done with an off-wiki web service, see, e.g.,
http://neuro.imm.dtu.dk/cgi-bin/brede_bw_metaanalysis?title=Major+Depressive...
I have added the following to the state-of-the-art section on meta:
* In the [http://neuro.imm.dtu.dk/wiki/Main_Page Brede Wiki], Finn Årup Nielsen uses ordinary namespace pages to store scientific data as comma-separated values with a one-row header, see, e.g., [http://neuro.imm.dtu.dk/wiki/Bipolar_Disorder_Neuroimaging_Database_-_Amygda... example of a CSV page]. This data can then be transcluded on other pages of the wiki, see, e.g., [http://neuro.imm.dtu.dk/wiki/Bipolar_Disorder_Neuroimaging_Database_-_Amygda... example]. The transclusion goes through a template that uses the 'tab' tag from Johan the Ghost's 'SimpleTable' extension, which renders a static table (apart from the standard sortable style). The data on the CSV pages is read by an external script that performs meta-analysis on it, see, e.g., [http://neuro.imm.dtu.dk/cgi-bin/brede_bw_metaanalysis?title=Bipolar+Disorder... meta-analysis example]. This script also allows export of the CSV data in JSON format. The 'semantic' annotation of the column headers takes place in standard MediaWiki templates that are aware of the format of the external script's API, see, e.g., [http://neuro.imm.dtu.dk/wiki/Template:Metaanalysis_csv metaanalysis csv template] referenced from the [http://neuro.imm.dtu.dk/wiki/BiND#Meta-analysis BiND metaanalysis section]. This simple approach, which requires no modification of a standard MediaWiki installation beyond enabling the 'SimpleTable' extension, has been described in more detail in a few articles:
[http://www2.imm.dtu.dk/pubdb/views/edoc_download.php/6345/pdf/imm6345.pdf Online open neuroimaging mass meta-analysis with a wiki]
[http://www2.imm.dtu.dk/pubdb/views/edoc_download.php/6302/pdf/imm6302.pdf Online open neuroimaging mass meta-analysis] (shorter paper)
[http://www2.imm.dtu.dk/pubdb/views/edoc_download.php/6346/pdf/imm6346.pdf Brede tools and federating online neuroinformatics databases] (mentions the system briefly)
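The CSV-page-to-JSON step described above can be sketched in a few lines of Python. This is not the actual Brede Wiki script, only an illustration of the idea under the assumption that a page holds plain comma-separated text with a one-row header; the function name and the column names in the test data are invented.

```python
import csv
import io
import json

def csv_page_to_json(page_text):
    """Convert CSV text with a one-row header into a JSON array of objects.

    Each data row becomes one object keyed by the column headers; all values
    stay strings, since the CSV itself carries no type information.
    """
    reader = csv.DictReader(io.StringIO(page_text.strip()))
    return json.dumps([dict(row) for row in reader], indent=2)

# Dummy page content with hypothetical column names.
PAGE = "study,subjects\nA,10\nB,12\n"
exported = csv_page_to_json(PAGE)
```

Interpreting the columns (for instance which one holds a sample size) is left to whatever consumes the JSON, which matches the role the templates play for the column headers in the setup described above.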
On 08/28/2013 02:06 AM, Dario Taraborelli wrote: [...]