Archiving a page should be fairly safe as long as the archived copy is only
for internal use, that is, something like OTRS. If the archived copy is
republished, it _might_ be viewed as a copyright infringement.
Still, note that the archived copy can be used for automatic verification,
i.e. extract a quote and check it against a stored value, without
infringing any copyright. If a publication is withdrawn, it might be an
indication that something is seriously wrong with the page, and no matter
what the archived copy at WebCitation says, the page can't be trusted.
It's really a very difficult problem.
Jeblad
On 1. apr. 2012 14.08, "Helder" <helder.wiki(a)gmail.com> wrote:
Hello Wikidata Folks,
My name is Brent Hecht, and I've done a great deal of research on the
differences and similarities between the language editions. I was really
excited to hear about the Wikidata project moving forward, and I think
some of my research might be of assistance. I'd enjoy being able to help
the community make this important transition.
In particular, my experience navigating interlanguage link conflicts
might be able to help in Phase 1. Please let me know if there's anything
I can do over the short term or long term!
Some of my relevant papers:
[1] Bao, P., Hecht, B., Carton, S., Quaderi, M., Horn, M. and Gergle, D.
2012. Omnipedia: Bridging the Wikipedia Language Gap. CHI '12: 30th
International Conference on Human Factors in Computing Systems (2012).
[2] Hecht, B. and Gergle, D. 2010. The Tower of Babel Meets Web 2.0:
User-Generated Content and Its Applications in a Multilingual Context.
CHI '10: 28th International Conference on Human Factors in Computing
Systems (Atlanta, GA, 2010), 291--300.
[3] Hecht, B. and Gergle, D. 2009. Measuring Self-Focus Bias in
Community-Maintained Knowledge Repositories. Communities and
Technologies 2009: 4th International Conference on Communities and
Technologies (State College, PA, 2009), 11--19.
- Brent
Brent Hecht
Ph.D. Candidate in Computer Science
CollabLab: The Collaborative Technology Laboratory
Northwestern University
w: http://www.brenthecht.com
e: brent@u.northwestern.edu
The problem is that you usually don't know what happened. Often the article
lives on behind a paywall, and then you should (must!) use that one. To
republish an archived copy from our own repo would be a copyvio in many
jurisdictions.
The interesting question is (I guess) whether we can quote the important
fact in such a way that we don't commit a copyvio, yet make it possible to
validate the value.
Usually you can quote a fragment and also republish that, but you can't
copy the whole article and make that available for others.
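The fragment idea can be sketched in a few lines of Python (function names, the sample quote, and the hashing scheme are all my own invention, not anything Wikidata has specified): store only a fingerprint of the supporting quote, then later check that the live or archived page still contains that fragment, without ever republishing the page itself.

```python
import hashlib

def normalize(text):
    """Collapse whitespace and lower-case so trivial edits don't break the check."""
    return " ".join(text.lower().split())

def quote_fingerprint(quote):
    """Store only a hash of the supporting quote, not the copyrighted page."""
    return hashlib.sha256(normalize(quote).encode("utf-8")).hexdigest()

def still_supports(page_text, quote, stored_fingerprint):
    """True if the page still contains the quoted fragment and the quote matches our record."""
    if quote_fingerprint(quote) != stored_fingerprint:
        return False  # our own stored record was altered or corrupted
    return normalize(quote) in normalize(page_text)

# At citation time we store the fingerprint; later we re-check the page.
fp = quote_fingerprint("The melting point of gallium is 29.76 degrees C.")
page = ("... The melting point of gallium is 29.76 degrees C. "
        "That is just above room temperature ...")
print(still_supports(page, "The melting point of gallium is 29.76 degrees C.", fp))
```

If the page disappears or is rewritten, the check fails and the statement can be flagged for review, all without distributing the archived copy.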
Sorry for being such a pest and saying this! ;)
Jeblad
On 1. apr. 2012 21.52, "Platonides" <platonides(a)gmail.com> wrote:
Dear Lydia,
>Hmmm I have to confess I don't understand this completely and therefore
>can't give you an answer. Could you give me an example of what you are
>talking about and how you see Wikidata fitting in?
Hmmmmmmm :-) I am somewhat at a loss here. Let's start from some basics, then.
1) Is there a concise, consensual definition of what Wikidata is
expected to achieve?
2) Does the Wikidata project have a charter or terms of reference (ToR),
or a dedicated web site whose URL you could give? http://wikidata.org is
redirected to http://en.wikipedia.org/wiki/Main_Page
3) Is there a list of the needs of Wikimedia projects that Wikidata is
to address?
4) What is/are going to be the language(s) of Wikidata?
5) What is Wikidata's architectural framework? The WDE (whole digital
ecosystem)? Human communications? Internet services? User
applications? Wikimedia?
6) What are the normative choices, strategic objectives and
constraints imposed by the sponsors and the WMF?
7) How is the Wikidata project to liaise with its potential users'
representatives (depending on the response to (5))?
Thank you !
jfc
JFC Morfin wrote:
> 2. Since we have a W3C expert: what is the best document/book to get
> a comprehensive and clear (not too massive) documentation on the
> semantic web?
You surely don't want to know all about the semantic web - especially the
ontology stuff with OWL dialects and entailment regimes is far too
academic and won't be part of Wikidata anyway, because of its computational
complexity. In short, you should be *very sceptical* and
cautious every time you stumble upon anything that requires inference
rules. Even trivial inference rules such as those based on owl:sameAs
and rdf:type can be problematic in practice! The less inference you
assume, the better.
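The owl:sameAs hazard is easy to demonstrate without any RDF tooling. In this toy sketch (the triples and entity names are invented for illustration), a single wrong sameAs assertion merges two distinct entities, and a naive closure then spreads every statement of one onto the other:

```python
# Toy triples: (subject, predicate, object). One sameAs link is wrong:
# "Paris, Texas" has been conflated with "Paris, France".
triples = {
    ("Paris_France", "capitalOf", "France"),
    ("Paris_Texas", "locatedIn", "Texas"),
    ("Paris_Texas", "sameAs", "Paris_France"),  # the bad assertion
}

def sameas_closure(triples):
    """Naive union-find over sameAs links, then copy every statement across merged nodes."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            x = parent[x]
        return x

    for s, p, o in triples:
        if p == "sameAs":
            parent[find(s)] = find(o)

    entities = {s for s, _, _ in triples} | {o for _, _, o in triples}
    inferred = set()
    # Every statement about one member of a merged group now holds for all members.
    for s, p, o in triples:
        if p == "sameAs":
            continue
        for e in entities:
            if find(e) == find(s):
                inferred.add((e, p, o))
    return inferred

closed = sameas_closure(triples)
# The closure now claims the Texan town is the capital of France:
print(("Paris_Texas", "capitalOf", "France") in closed)
```

One mistaken link is enough to poison every merged entity, which is exactly why minimal inference is the safer default for crowd-sourced data.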
I can recommend the "Linked Data Patterns" book by Dodds and Davis:
http://patterns.dataincubator.org/book/
Jakob
--
Verbundzentrale des GBV (VZG)
Digitale Bibliothek - Jakob Voß
Platz der Goettinger Sieben 1
37073 Goettingen - Germany
+49 (0)551 39-10242
http://www.gbv.de
jakob.voss(a)gbv.de
Since some claim that Wikidata's model of representing statements with
their references and the diversity of knowledge in the world outside is
just a way to bring Wikipedia's values to the world of data, Conservapedia
has launched an alternative approach to the challenges we are tackling. In
good open source manner we hope that we can share code, and we wish
Conservadata all the best.
<http://blog.tommorris.org/post/20277406012/conservadata>
Cheers,
Denny
--
Project director Wikidata
Wikimedia Deutschland e.V. | Eisenacher Straße 2 | 10777 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 B. Recognized as charitable by the
Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.
Hi,
I was just about to suggest the same: there is no simple 1-to-1
relationship between articles in different languages. If Wikidata only
supports 1-to-1, this relation will end up fuzzier, because people
will also put "more-or-less" mappings into it. Gregor wrote:
> Would changing the model to:
> Wikipedialink = (Title, LanguageId, Relation, Badge?)
> where relation can be
> broader match
> close match
> exact match
> narrower match
>
> etc. help in expressing these situations? I believe this could prevent
> problems later on.
I'd rather limit relations to "exact", "broader" and "narrower", because
"close match" is very fuzzy. Even "exact match" won't be 100% in
practice, so where is the border between "exact" and "close" (or
additional "far match", "almost exact match", etc.)? The two directions
"broader" and "narrower" are easier to grasp, both in user interfaces
and for retrieval.
I took the liberty of modifying the current model, as it's a wiki ;-)
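Gregor's tuple plus the restricted relation set could be modeled along these lines (a sketch only; none of these type or field names come from actual Wikidata code, and the Bell pepper / Paprika example is my own):

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional

class Relation(Enum):
    EXACT = "exact match"
    BROADER = "broader match"    # target covers more than the source article
    NARROWER = "narrower match"  # target covers less than the source article

@dataclass(frozen=True)
class WikipediaLink:
    """Wikipedialink = (Title, LanguageId, Relation, Badge?)"""
    title: str
    language_id: str
    relation: Relation = Relation.EXACT
    badge: Optional[str] = None  # optional Badge field, e.g. "featured"

# A non-1-to-1 situation: the German "Paprika" covers both plant and fruit,
# so it is a broader match for the English "Bell pepper" article.
links: List[WikipediaLink] = [
    WikipediaLink("Bell pepper", "en"),
    WikipediaLink("Paprika", "de", Relation.BROADER),
]

# Retrieval can then filter by direction instead of guessing from fuzzy matches.
exact_only = [l for l in links if l.relation is Relation.EXACT]
print(len(exact_only))
```

The point of the enum is that the set of relations stays closed: user interfaces can offer exactly three choices instead of a free-text field that invites "almost exact" variants.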
Cheers
Jakob
Ivan,
Thanks for the reply. Yes, as many people as can be found should be brought
in on subject indexing. The UDC is a faceted classification; LCC (Library of
Congress Classification) and DDC (Dewey Decimal Classification) are not. UDC
is a multilingual taxonomy (facet=language) and these other two are not. To
me there's little to no competition with UDC.
But that's about which taxonomies are deployed -- I support any and all.
It's the implementing technology that's key here, and it happily doesn't
require expert panels and study groups. SKOS can handle multiple facets --
via its Collection objects. I view SKOS as a stepping stone to ISO Topic
Maps; I suspect SKOS was invented by the W3C as a response to Topic Map
functionality not in OWL, but perhaps you can enlighten me further.
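As a rough illustration of the facets-via-Collections idea (plain Python rather than real RDF, and the concept identifiers are merely illustrative UDC-style notation): each skos:Collection analogue groups concepts into a facet without implying any broader/narrower relation between its members, while broader/narrower links remain ordinary concept-to-concept edges.

```python
# Concepts: id -> (preferred label, broader concept id or None)
concepts = {
    "udc:53":     ("Physics", None),
    "udc:539":    ("Nuclear physics", "udc:53"),
    "udc:=111":   ("English", None),
    "udc:=112.2": ("German", None),
}

# skos:Collection analogue: named groupings that carry the facet.
# Membership says nothing about hierarchy among the members.
collections = {
    "facet:discipline": ["udc:53", "udc:539"],
    "facet:language":   ["udc:=111", "udc:=112.2"],
}

def facets_of(concept_id):
    """Return the facet collection(s) a concept belongs to."""
    return [name for name, members in collections.items() if concept_id in members]

def broader_chain(concept_id):
    """Walk skos:broader links from a concept up to a top concept."""
    chain = []
    while concept_id is not None:
        label, parent = concepts[concept_id]
        chain.append(label)
        concept_id = parent
    return chain

print(facets_of("udc:=111"))      # the language facet
print(broader_chain("udc:539"))   # Nuclear physics -> Physics
```

A multilingual, faceted scheme like UDC maps onto this quite directly: one collection per facet, one broader chain per discipline hierarchy.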
Basically I have these questions.
1. Is WP going to be empowered with subject indexes as a core feature? Or is
this better as an extension?
2. What's the relative priority of subject indexes over other Wikidata
requirements (i.e. after semantic infoboxes)?
3. Should SKOS or a very close equivalent be subsumed into the WOM?
The answer to 3 probably should echo the relationship between the WOM and
OWL, whatever that is intended to be.
Thanks,
John
----------------------------------------------------------------------------
Date: Sat, 31 Mar 2012 11:22:43 +0200
From: Ivan Herman <ivan(a)w3.org>
To: "Discussion list for the Wikidata project."
<wikidata-l(a)lists.wikimedia.org>
Subject: Re: [Wikidata-l] Wiki Subject Indexes
On Mar 31, 2012, at 02:00, John McClure wrote:
> Hello,
> Can/should wiki subject indexes be a functional requirement of the
> wikidata project, or of some other? I think navigating a wiki today
> without a subject index is difficult, to say the least. A subject index
> seems such a critical component for information libraries! WP's portals
> are a nice step but still, users seem at the mercy of topical links
> inserted by authors of the portal.
> How much better it would be to have a taxonomy of subjects that can be
> associated with a page by ITS author so that the page can be found
> independently of portals.
I see a major issue with the user interface of this (though I agree with
your assessment). It has to be very easy to set the right subject, otherwise
people will not do it.
The dbpedia people extract some rough classification of the articles. It is
not perfect, but it may be worth looking at, too.
> The semantics of SKOS, I suggest, should be baked in to wikis. I also
> suggest faceted UDC [1] or similar inter/national classification scheme be
> one of many that can be referenced by users when browsing any wiki. I
> envision that Subjects would be defined in a namespace as fundamental to a
> wiki as the Category namespace is. Basically, I can see requesting some
> software to correlate my own subject taxonomy with interwikis' (WP's)
> taxonomies, so that I as a user don't have to manually search each
> interwiki for content relative to my personal list of subjects.
I think the project should reach out to libraries, i.e., real experts. The
library world is currently looking at the issues of how to redefine
cataloging standards, how to make them Linked Data friendly, etc; because
that work is still in flux, it may be the ideal time to talk to them. In
view of the importance of WP, libraries cannot allow themselves to ignore
this (I believe...)
Ivan
> Is this an idea before its time, something already considered, or
something
> to consider now?
> Thanks for your thoughts. -- john
>
> [1] http://en.wikipedia.org/wiki/Universal_Decimal_Classification
Hey :)
On Sat, Mar 31, 2012 at 2:18 PM, JFC Morfin <jefsey(a)jefsey.com> wrote:
>
> Lydia,
>
> I have a question. I am interested in "multilinguistics". I define
> multilinguistics as the cybernetics of the mecalanguages, i.e. the
> operational, computer-application, and strategic-pragmatic coexistence of
> natural (and artificial) languages that can be used between
> man/man/machine/machine. I am, therefore, interested in stabilizing a table
> of all the (meca)languages names and specifics, and I have been working on
> the preparation of a WikiLinguae project for years in trying to
> converge/cooperate the teams of ISO, private, and Open Source tables, within
> the loose framework of the MAAYA network (Members:
> http://maaya.org/spip.php?article40, including UNESCO, ITU, ACALAN,
> LinguaMon, Union Latine, Francophonie, AUF, etc.) as well as to work on
> script homography, which is a DNS problem, etc.
>
> This problem is extended by the support of IDNA2008 consistency (IDNs),
> e-mail addresses, variants (i.e. the same term with different printings, or
> different Unicode points), polynymy issues (i.e. strict cross-language
> synonyms), and variances (i.e. the variation of the semantics of a term, or
> the appearance of a new term), both in data definitions and in data values.
> I would like to know if this area is part of the Wikidata project, or how
> it is planned to address it.
Hmmm I have to confess I don't understand this completely and therefore
can't give you an answer. Could you give me an example of what you are
talking about and how you see Wikidata fitting in?
> I would also like to know, from the very beginning, if this WikiLinguae
> project was to be kept separate, should/could ally with Wikidata, or if
> Wikidata had its own project regarding multilinguistics (*). I note here
> that it should be both multilinguistic (to document every language) and
> polylingual (to document them in every language). In addition, should there
> be a Wikimedia extension/version of the "locale" files that would be
> documented cooperatively (in cooperation or not with other "locale"
> directories)?
>
> Thank you.
> jfc
>
> (*) Multilinguistics by nature is a semiotic discipline with semantics,
> syntax, and pragmatics. It treats every language as equal to the others. It
> should not be confused with "globalization" (e.g. Unicode), which is:
>
> * the "internationalization" of the medium (support of ISO 10646) within an
> English framework
> + the localization of the ends (ISO 15897)
> + the filtering of linguistic quoted exchanges as per their language (ISO
> 639), scripts (15924), and administrative authority as a cultural referent
> (ISO 3166:1).
> * and results in langtags (RFC 5646).
>
> Globalization and an open, langtag-compatible format should be sufficient at
> this stage (due to the limited number of Wikipedia languages) if there is no
> specific need for Wikimedia formats or locale file extensions. Also, the
> WikiLinguae project seeks to be a wiki-based ISO 11179 conformant reference
> "spine" in the matter of languages and cultures: this is not a small task
> and may still call for time.
>
> _______________________________________________
> Wikidata-l mailing list
> Wikidata-l(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata-l
>
--
Lydia Pintscher - http://about.me/lydia.pintscher
Community Communications for Wikidata
Wikimedia Deutschland e.V.
Eisenacher Straße 2
10777 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as charitable by
the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.
Hi Everyone,
I recently graduated from university with a Bachelor's degree in Computer
Science. Unfortunately, I don't write any code at work. :( So, I am
interested in helping out with the project! I know how to code and can help
with testing/documentation as well. I am also interested in learning about
what kind of database you use and how you plan on storing the semantic
information.
Are these definitions correct?
Wikipedia - an encyclopedia that uses MediaWiki as its software
MediaWiki - the current software which runs Wikipedia
Wikidata - a new module for MediaWiki that will allow the community to add
semantic information to Wikipedia articles
Cheers!
--
Aniket Karmarkar