Hi all;
I'm thinking about notability in Wikidata and how it may conflict with
current Wikipedia policies and community conceptions. Will Wikidata allow
the creation of entities for small villages, asteroids, galaxies, stars,
species, etc., that are not allowed today at Wikipedia? Including those
that don't have an article in any Wikipedia?
I will be happy if so.
Regards,
emijrp
Hi all;
I have read that every fact for every entity must include a reference. How
is Wikidata going to deal with dead links? I hope we can work on this by
developing an archivist bot to archive links into WebCitation or the
Internet Archive. This is an old problem in all Wikipedias, and it is
rarely addressed correctly (the only example I know of is the French
Wikipedia, which uses Wikiwix.com to archive references and external links).
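As a sketch of what one step of such an archivist bot might look like (the Save Page Now endpoint is the Internet Archive's documented way to request a snapshot of a URL; the function names here are invented for illustration):

```python
# Sketch of one step of a hypothetical archivist bot: ask the Internet
# Archive's "Save Page Now" endpoint to snapshot a reference URL.
# Function names are invented; error handling is minimal on purpose.
import urllib.request

SAVE_ENDPOINT = "https://web.archive.org/save/"

def archive_request_url(reference_url: str) -> str:
    """Build the Save Page Now URL for a given reference."""
    return SAVE_ENDPOINT + reference_url

def archive(reference_url: str) -> int:
    """Request a snapshot; returns the HTTP status code."""
    with urllib.request.urlopen(archive_request_url(reference_url)) as resp:
        return resp.status

print(archive_request_url("http://example.org/source"))
# -> https://web.archive.org/save/http://example.org/source
```

A real bot would of course also record the resulting snapshot URL next to the reference, so the dead link can later be swapped for the archived copy.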
Regards,
emijrp
On Sun, Apr 1, 2012 at 1:58 PM, JFC Morfin <jefsey(a)jefsey.com> wrote:
> Dear Lydia,
>
>> Hmmm I have to confess I don't understand this completely and
>> therefore can't give you an answer. Could you give me an example of
>> what you are talking about and how you see Wikidata fitting in?
>
>
> Hmmmmmmm :-) I am somewhat at a loss here. Let's start from some
> basics, then.
>
> 1) is there a concise consensual definition of what Wikidata is
> expected to achieve?
http://meta.wikimedia.org/wiki/Wikidata right at the top
> 2) Has this Wikidata project some Charter or terms of reference (TOR),
> or dedicated web site, you could give the URL? http://wikidata.org is
> redirected to http://en.wikipedia.org/wiki/Main_Page
No.
> 3) Is there a list of the needs of wikimedia projects that wikidata is
> to address ?
Some are mentioned on http://meta.wikimedia.org/wiki/Wikidata/FAQ I believe
> 4) What is/are going to be the language(s) of Wikidata?
the languages of the wikipedias
> 5) What is the Wikidata's architectural framework? WDE (whole digital
> ecosystem)? Human communications? The Internet services? Users
> applications? Wikimedia?
>
> 6) What are the normative choices, strategic objectives and
> constraints imposed by the sponsors and the WMF?
This is a really broad question ;-) If you have more concrete
questions I can try to find answers.
> 7) How is the Wikidata project to liaise with its potential users'
> representatives (depending on the response to (5))?
You mean the initial development team? Through me.
Cheers
Lydia
--
Lydia Pintscher - http://about.me/lydia.pintscher
Community Communications for Wikidata
Wikimedia Deutschland e.V.
Eisenacher Straße 2
10777 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
Oren Bochman wrote:
> Each project has its own criteria of notability, since it requires
> its own noise filter.
As far as I understand, WikiData is about any item, not only about things
that have an article in en.Wikipedia or any other Wikipedia. The question
"What items may be included in WikiData?" should be added to the WikiData
FAQ. I can think of a very broad definition: every thinkable thing or
object can be included as an item with its own page in WikiData if the
thing is the topic of an article in Wikipedia or in any other Wikimedia
project (e.g. Wiktionary or Wikisource), or if the thing is referenced in
any of these articles. This includes, for example:
- normal Wikipedia articles in any language
- things that are included as items in a list-of-X article
- words (Wiktionary)
- digital objects (Commons)
- single pages from digitized books (Wikisource)
- publications used as references
Managing the vast number of items will only be possible with backlinks;
for instance, you can get the list of Wikidata items linked to or used in
some selected Wikipedia article, and you can get a list of items not used
in any article for cleanup.
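The cleanup query described above could be sketched like this, assuming item usage were tracked as a simple backlink index (all names here are invented for illustration):

```python
# Sketch: find items that no article references, given a hypothetical
# backlink index mapping each item ID to the set of articles using it.

def items_without_backlinks(backlinks):
    """Return item IDs whose backlink set is empty, sorted for review."""
    return sorted(item for item, articles in backlinks.items() if not articles)

backlinks = {
    "Q1": {"en:Universe", "de:Universum"},
    "Q2": set(),          # no article uses this item -> cleanup candidate
    "Q3": {"en:Earth"},
}
print(items_without_backlinks(backlinks))  # -> ['Q2']
```

The real system would need the index maintained incrementally as articles are edited, but the cleanup list itself reduces to exactly this kind of emptiness check.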
Jakob
--
Verbundzentrale des GBV (VZG)
Digitale Bibliothek - Jakob Voß
Platz der Goettinger Sieben 1
37073 Goettingen - Germany
+49 (0)551 39-10242
http://www.gbv.de
jakob.voss(a)gbv.de
Some initial ideas on the statement. I realize that this is not a
priority in the first phase, but perhaps a place could be created on
the wiki to collect some thoughts like those below?
http://meta.wikimedia.org/wiki/Wikidata/Notes/Data_model#The_Metamodel explains:
Statement = (StatementID, Property, Value, Qualifier*, Reference*)
The StatementID is a unique identifier for the given statement,
used only internally and for export.
A Property is defined on a property page. This definition includes a type.
The structure of the Value is given by the type of the property.
Simple types could demand an EntityID or a number. More complex types
could demand dates, date ranges, numbers with units, a geocoordinate,
a geo shape, etc.
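The tuple above could be read as a data structure roughly like the following. This is only an illustrative reading of the documented metamodel, not Wikidata's actual implementation:

```python
# Illustrative sketch of the metamodel's Statement tuple:
#   Statement = (StatementID, Property, Value, Qualifier*, Reference*)
# Field names and types are assumptions made for this sketch.
from dataclasses import dataclass, field
from typing import Any, List, Tuple

@dataclass
class Statement:
    statement_id: str                  # unique; internal and export use only
    property: str                      # defined on a property page, carries a type
    value: Any                         # structure dictated by the property's type
    qualifiers: List[Tuple[str, Any]] = field(default_factory=list)
    references: List[str] = field(default_factory=list)

s = Statement("S1", "population", 3500000,
              references=["http://example.org/source"])
```

The `value: Any` field is where the type system described above would hook in: a simple type constrains it to an EntityID or a number, while complex types would demand dates, ranges, units, or geo shapes.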
----
1. Will Property include information on
observation/recording/measurement methodology?
2. What happens if the main entity has variants or parts for which
values are recorded?
An example is car models, where typically the revisions sold under the
same name are subsumed in one Wikipedia article. Example:
http://en.wikipedia.org/wiki/Renault_Kangoo
with different length / weight in the "subclass" info boxes for
first/second generation.
Cars also easily serve as an example for variable parts, see in
http://en.wikipedia.org/wiki/%C5%A0koda_Roomster the list of engine
specifications. The engines are not Wikipedia entities in their own
right.
3. Values may be well-defined RDF resources, but not available in Wikipedia.
In my work, many statements I would like to express in a future
Wikidata are not allowed as Wikipedia articles at all. You can express
"Wowereit" is_mayor_of "Berlin, Germany"
but not
"Plantago lanceolata" has_leaf_shape "lanceolate"
because
http://en.wikipedia.org/wiki/Lanceolate
is a redirect to
http://en.wikipedia.org/wiki/Leaf_shape
I personally would love illustrated definitions of things people want
to learn about to be allowed on Wikipedia, but the argument is
generally that Wikipedia is not Wiktionary.
I believe Wikidata should, right from the start, be defined to allow
references to Wiktionary as well as Wikipedia. And while we are at
it, references to Commons as well (semantic image annotation...).
This would change
Wikipedialink = (Title, LanguageId, Badge?)
to
Link = (Project, LanguageId, Title, Badge?)
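The proposed signature change could be sketched as follows (a minimal illustration of the two record shapes from the text, not an implementation):

```python
# Sketch of the proposed generalization from a Wikipedia-only link record
#   Wikipedialink = (Title, LanguageId, Badge?)
# to a project-aware one
#   Link = (Project, LanguageId, Title, Badge?)
# Field names follow the text above; this is illustrative only.
from collections import namedtuple

Wikipedialink = namedtuple("Wikipedialink", ["title", "language_id", "badge"])
Link = namedtuple("Link", ["project", "language_id", "title", "badge"])

# With the extra Project field, a Wiktionary entry or a Commons file
# becomes linkable just like a Wikipedia article:
leaf_shape = Link("wiktionary", "en", "lanceolate", None)
article = Link("wikipedia", "en", "Leaf shape", None)
```

One extra field is enough to make the "Plantago lanceolata has_leaf_shape lanceolate" statement expressible, since the value can now point at the Wiktionary definition instead of requiring a Wikipedia article.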
---
Gregor
Binaris, Oren,
At 13:43 02/04/2012, Oren Bochman wrote:
>>>WikiData's mandate appears to be to improve en.Wikipedia (and then
>>>the other WMF projects). If WikiData cannot serve the needs of
>>>en.wikipedia, it will be of little consequence to the other
>>>projects, since it will need a community to maintain it.
>
>>This is new information for me. Does enwiki really have priority
>>over the other Wikipedias? :-O I would be unhappy with that.
>
>My point is a pragmatic one, not a political one. Note that
>Wikidata is planned to add language prototypes in the third part of
>the timeline - that is my reference.
>
>If you read my note you will notice I propose to take a realistic
>view of the communities. It would be a mistake to forget
>that en.wikipedia has the bulk of the information, the largest
>community, and the source of the most difficult policies, i.e.
>the greatest challenges.
>
>However, getting data into Wikidata is only half of the problem.
>Getting it out to the ar...zu Wikipedias without causing chaos --
>that is what we are here to figure out.
We have good experience with that type of problem from the "layer
below": languages (data are to be expressed in a language). This is
why I raised the point with Lydia and helped her by asking for a more
organized road map when she said she did not fully understand.
This experience has been with governance (countries and languages),
with character sets (ISO 10646 and Unicode), with langtags (RFC 5646)
and filtering (RFC 4647), and with non-ASCII domain names and mail
addresses. In these cases we met the interlinguistic (languages),
inter-readability (scripts), interconnectivity (infrastructure) and
interoperability (protocols) issues which made the Internet so far.
The Internet+ layer set is new, in use, and clear, but by far not
stabilized yet (I quoted the IETF drafts on this). It concerns
fringe-to-fringe interintelligibility. (The next upper network
stratum will be the Intersem, the semiotic Internet, the Internet of
thoughts, which is my main interest with auxiliary intelligence --
not for today!)
Wikimedia follows the same strata advancement, but on top of the web
application. This means that three fundamental questions are raised
by the Wikidata project:
1. is it technically to be a web or a network+ application (this has
no impact on the content, but it does impact the architectural and
referential issues and the future of the Intersem)?
2. how is it to be interintelligible: by being internationalized (the
leading culture hosts the others) or multilingualized (every language
is treated equally)?
3. how is it going to support (actually, not to block further
innovation in) intercomprehensibility, i.e. to be able to address the
Intersem stratum's "capital of Israel" issue?
Point 1: deciding who (humanists or technicalists) are to be the
leaders in the Wikidata specifications.
Point 2: deciding if we address the logical needs of en.wiki and
then propagate the solutions to the other languages, or if we are
actually pragmatic and acknowledge that the need is for a
multilingual system. In that case, we first try to address this fully
new problem among a limited set of wikis (let's say the three leading
ones [amenable to ASCII, but with diacritics]), and then with a
non-roman one.
Point 3: understanding and accepting the complexity that Wikidata
has to face, and trying not to over-complexify its future with
today's "great" solutions and habits.
Examples:
1. Network-related issues may become a first nightmare, as many URIs
will become IRIs (different scripts) and the digital convergence will
need to use DNS CLASSes, at least between different technologies.
This means that en.wiki will have to support URIs and IRIs
indifferently; English contributors have no Chinese keyboards.
CLASSes are not even supported yet in the way a domain name is
published. The more communication layers, the more complexity, the
less security, and the less room and power in mobile devices.
2. The ISO and IETF experience shows that the
multicultural/multilinguistic issue is already very complex to
address architecturally and technically (it took ten years, and it is
by far not digested yet), due to people's intuitive agreement with
stereotypes. The network-side solution to this complexity is
"subsidiarity" (i.e. the current Wikipedia solution). Wikidata
therefore has to be conceived as a service to subsidiarity (the
Wikimedia and Internet+ ones).
3. Then there is the third problem no one has addressed yet, except
ISO 3166: variance. Two identical particulars (effects, names, data,
etc.) may be different; e.g. there are many ways to compute and
present the same date. Are the results to be stored in Wikidata in
all these ways every day, with bridges built between them? Or are
they to be stored as a single datum with the formulas to compute
them? Then how can we be sure some parameters have not changed
(e.g. the death of the Emperor) and the computation was not tampered
with? Variance is everywhere (actually, variance is most probably
life). ISO 3166 has no variance, because it is the sovereign
reference: the list of states and the languages of their laws
(however, Palestine is in it already, and Taiwan is there). ISO
documents are in French, English and possibly Russian. ISO 3166-1
states which are the normative languages in every country by
reference to ISO 639 (the list of language names). ISO 3166 defines
the ccTLDs and is used in langtags to document languages and
cultures. ISO 10646 (supported by Unicode) contains the coded
character tables for scripts. At the binary layer it is full of
variants (the same glyphs being supported by different code points).
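As a small, standard-library-only illustration of the date-variance point, the same datum already admits several equally valid renderings (the formats shown are just examples):

```python
# The same date can be presented in many valid ways; a data store must
# either keep all renderings, or keep one datum plus conversion rules.
from datetime import date

d = date(2012, 4, 1)
renderings = [
    d.isoformat(),            # '2012-04-01' (ISO 8601)
    d.strftime("%d/%m/%Y"),   # '01/04/2012' (day-first convention)
    d.strftime("%m/%d/%Y"),   # '04/01/2012' (month-first convention)
    str(d.toordinal()),       # proleptic Gregorian day count
]
print(renderings)
```

Calendar systems that depend on external facts (regnal years, for instance) add exactly the tampering and staleness questions raised above, since the conversion rule itself can change.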
Best
jfc
Markus Krötzsch wrote:
> This is a valid point. It is intended to address this as follows:
>
> * Wikidata items (our "content pages") will be in *exact*
> correspondence to (zero or more) Wikipedia articles in different
> languages.
>
> * Differences in scope will lead to different Wikidata items.
I suppose that this will lead either to fuzzy scopes, because there
are many subtle differences in scope between articles in different
Wikipedia languages, or to an inflation of poorly connected, rather
similar Wikidata entities, because every Wikipedia language requires
its own Wikidata entities.
> * Relationships such as "broader" or "narrower" can be expressed
> as relations between these items, if desired.
In any case, you must admit that there is no simple 1-to-1
relationship between articles in different Wikipedias. This fact
should be acknowledged in the design of Wikidata. To me, the current
solution looks like you just ignore the problem at the basic layer
and hope that it gets magically solved by the community ;-)
> The advantage of this is that the possible relationships are not
> system-defined but can be selected and modified by the community.
This is not an advantage per se but a design decision. You always
need to define some constraints in the system and leave some other
possibilities open. If you say "relationships _such as_ 'broader' or
'narrower'", you do not acknowledge any basic kinds of relationships
between Wikipedia articles in different languages. There may be
arguments for doing so, but I don't see them at the moment.
Jakob
P.S.: Could you please all try to use the terms from the Wikidata
glossary? For instance, there is no such thing as a "Wikidata item",
but an "Entity" or a "Page".
--
Verbundzentrale des GBV (VZG)
Digitale Bibliothek - Jakob Voß
Platz der Goettinger Sieben 1
37073 Goettingen - Germany
+49 (0)551 39-10242
http://www.gbv.de
jakob.voss(a)gbv.de
Original Message:
-----------------
From: JFC Morfin jefsey(a)jefsey.com
Date: Mon, 02 Apr 2012 02:37:05 +0200
To: wikidata-l(a)lists.wikimedia.org
Subject: Re: [Wikidata-l] Archiving references for facts?
At 00:46 02/04/2012, John Erling Blad wrote:
>US users can use archive.org (http://archive.org), but foreigners
>might get into trouble. Remember that Wikipedia/-data isn't US
>only. Both (?) Archive.org and WebCitation remove content on
>request; some tricks will be needed to automate it for catalogues
>and sites. Check it out.
Another issue to be considered along that line of thinking is
copyright protection by new/coming laws that permit lawyers to
preventively block a site (e.g. the Cyberdefense proposition in the
USA). Wikidata should permit a permanent legal operation to attend to
a lawyer's demand in one country without blocking users from other
countries. This would mean modularity, i.e. a conflict with the
concept of a central source.
Another issue is that "truth is not always good to say". In language
versions, not telling the full truth is acceptable; not for a central
reference system like Wikidata, with probable retaliation against
its credibility.
I will pick a well-known example. Many interests (including
governments) wish to hide from people the way the DNS is really
designed. http://en.wikipedia.org/wiki/Domain_Name_System does not
hide that the DNS uses CLASSes, and it correctly states that "Each
class is an independent name space with potentially different
delegations of DNS zones". But it only states: "The CLASS of a record
is set to IN (for Internet) for common DNS records involving Internet
hostnames, servers, or IP addresses. In addition, the classes Chaos
(CH) and Hesiod (HS) exist."
This actually hides from Wikipedia's readers that the DNS counts
65,536 CLASSes, and that a CLASS actually means a fully independent
root file, meaning that the Internet could run perfectly well with
65,536 ICANN/NTIA-like set-ups, and incidentally without root file
systems. This is not something that Wikidata, acting as a unique
world source on the DNS, could hide. The only response of those who
want a status quo in people's beliefs about the DNS would be to fight
the credibility of Wikidata, in spite of decades-old URLs to the DNS
RFCs.
Such campaigns would multiply with all the truths that are not good
to say and that Wikidata will have to collect. We should technically
be prepared to oppose such campaigns by having a very easy system to
use in order to confirm our sources: the actual words of the
concerned texts, not only the URL to a document containing them.
jfc
Thanks for the link, Helder! You can keep up on the current status of
Kevin's project at
https://www.mediawiki.org/wiki/User:Kevin_Brown/ArchiveLinks/status
--
Sumana Harihareswara
Volunteer Development Coordinator
Wikimedia Foundation
> Maybe this GSoC project (from 2011) will be relevant:
> http://www.mediawiki.org/wiki/User:Kevin_Brown/ArchiveLinks/Design
>
> Best regards,
> Helder
>
> On Sun, Apr 1, 2012 at 06:31, emijrp <emijrp at gmail.com> wrote:
>> Hi all;
>>
>> I have read that every fact for every entity must include a reference. How
>> is Wikidata going to deal with dead links? I hope we can work on this by
>> developing an archivist bot to archive links into WebCitation or the
>> Internet Archive. This is an old problem in all Wikipedias, and it is
>> rarely addressed correctly (the only example I know of is the French
>> Wikipedia, which uses Wikiwix.com to archive references and external links).
>>
>> Regards,
>> emijrp