[Wikidata-l] Notability in Wikidata

JFC Morfin jefsey at jefsey.com
Mon Apr 2 13:56:37 UTC 2012


Binaris, Oren,

At 13:43 02/04/2012, Oren Bochman wrote:
>>>WikiData's mandate appears to be to improve en.Wikipedia (and then 
>>>the other WMF projects). If WikiData cannot serve the needs of 
>>>en.wikipedia it will be of little consequence to the other 
>>>projects since it will need a community to maintain.
>
>>This is a new piece of information for me. Does enwiki really have 
>>any priority to other Wikipedias? :-O I would be unhappy with that.
>
>My point is a pragmatic one – not a political one. Note that 
>Wikidata is planned to add language prototypes in the third part of 
>the time line - that is my reference.
>
>If you read my note you will notice I propose to take a realistic 
>view of the communities. It would be a mistake to forget 
>that en.wikipedia is where the bulk of the information exists, the 
>largest community and the source of the most difficult policies, i.e. 
>the greatest challenges.
>
>However getting data into Wikidata is only half of the problem. To 
>get it out to the ar...zu.Wikipedias without causing chaos -- is what 
>we are here to figure out.

We have good experience with that type of problem in the "layer 
below": languages (data have to be expressed in a language). This is 
why I raised the point with Lydia and, when she said she did not fully 
understand, helped her by asking for a more organized road map.

This experience has been with governance (countries and 
languages), with character sets (ISO 10646 and Unicode), with 
langtags (RFC 5646) and their filtering (RFC 4647), and with non-ASCII 
domain names and mail addresses. In these cases we met the 
interlinguistic (languages), inter-readability (scripts), 
interconnectability (infrastructure) and interoperability 
(protocols) issues which have shaped the Internet so far. The 
Internet+ layer set is new, in use, and clear, but far from 
stabilized yet (I quoted the IETF Drafts on this). It concerns 
fringe-to-fringe interintelligibility. The next stratum up will be the 
Intersem (the semiotic Internet, the Internet of thoughts), which is 
my main interest together with auxiliary intelligence (not for today!).

Wikimedia is advancing through the same strata, but on top of the web 
application. The Wikidata project therefore raises three fundamental 
questions:

1. Is it technically to be a web application or a network+ 
application? (This has no impact on the content, but it does affect 
the architectural and referential issues and the future of the 
Intersem.)
2. How is it to be interintelligible: by being internationalized (the 
leading culture hosts the others) or multilingualized (every language 
is treated equally)?
3. How is it going to support (or at least not block further 
innovation in) intercomprehensibility, i.e. to be able to address the 
Intersem-stratum "capital of Israel" issue?

Point 1 is to decide who (humanists or technicalists) is to lead the 
Wikidata specifications.

Point 2 is to decide whether we address the logical needs of en.wiki 
and then propagate the solutions to the other languages, or whether 
we are actually pragmatic and acknowledge that the need is for a 
multilingual system. In the latter case, we first try to address this 
entirely new problem among a limited set of wikis (let us say the 
three leading ones [amenable to ASCII, but with diacritics]) and then 
with a non-Roman one.

Point 3 is to understand and accept the complexity that Wikidata 
has to face, and to try not to overcomplicate its future with 
today's "great" solutions and habits.

Examples:

1. Network-related issues may become a first nightmare, as many URIs 
will become IRIs (different scripts) and digital convergence will 
need to use DNS CLASSes, at least between different technologies. 
This means that en.wiki will have to support URIs and IRIs 
interchangeably, yet English contributors have no Chinese keyboards. 
CLASSes are not even supported yet in the way domain names are 
published. The more communication layers, the more complexity, the 
less security, and the less room and power in mobile devices.

2. The ISO and IETF experience shows that the 
multicultural/multilinguistic issue is already very complex to 
address architecturally and technically (it took ten years, and it is 
far from digested) because of people's intuitive agreement with 
stereotypes. The network-side solution to this complexity is 
"subsidiarity" (i.e. the current Wikipedia solution). Wikidata 
therefore has to be conceived as a service to subsidiarity (both the 
Wikimedia one and the Internet+ one).

3. The third problem, which no one has addressed yet except ISO 3166, 
is variance: two identical particulars (effects, names, data, etc.) 
may be different. E.g. there are many ways to compute and present the 
same date. Are the results to be stored in Wikidata in all these ways 
every day, with bridges built between them? Or are they to be stored 
as a single datum with the formulas to compute them, and then how can 
we be sure that some parameters have not changed (e.g. death of the 
Emperor) and that the computation was not tampered with? Variance is 
everywhere (actually variance is most probably Life). ISO 3166 has no 
variance, because it is the sovereign reference: the list of States 
and their languages of law (however, Palestine is in it already, and 
Taiwan is there). ISO documents are in French, English and possibly 
Russian. ISO 3166-1 states which are the normative languages in every 
country by reference to ISO 639 (list of language names). ISO 3166 
defines the ccTLDs and is used in langtags to document languages and 
cultures. ISO 10646 (supported by Unicode) provides the coded 
character tables for scripts. At the binary layer it is full of 
variants (the same graphs being supported by different code points).
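
To make example 1 a little more concrete, here is a rough sketch of 
what "supporting URI/IRI indifferently" could involve on the software 
side: mapping an IRI to an ASCII-only URI. It is only an illustration 
under assumptions of mine: the function name iri_to_uri and the sample 
address are hypothetical, Python's built-in "idna" codec follows the 
older IDNA 2003 rules, and user-info in the authority part is ignored.

```python
from urllib.parse import quote, urlsplit, urlunsplit

def iri_to_uri(iri: str) -> str:
    """Map an IRI to an ASCII-only URI (simplified, hypothetical helper)."""
    parts = urlsplit(iri)
    # Host name: convert each label to its ASCII (Punycode) form.
    host = parts.hostname.encode("idna").decode("ascii") if parts.hostname else ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    # Path, query and fragment: percent-encode the UTF-8 bytes of any
    # non-ASCII character; "%" is kept safe to avoid double encoding.
    path = quote(parts.path, safe="/%:@!$&'()*+,;=")
    query = quote(parts.query, safe="/?%:@!$&'()*+,;=")
    fragment = quote(parts.fragment, safe="/?%:@!$&'()*+,;=")
    return urlunsplit((parts.scheme, host, path, query, fragment))

# Hypothetical address mixing a non-ASCII host and a non-ASCII path.
print(iri_to_uri("http://例え.jp/wiki/東京"))
```

A contributor without the right keyboard could then type or paste the 
ASCII form, while the software shows the native-script form to 
readers who can use it.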
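
For the date question in example 3, the "single datum plus formulas" 
option might look like the sketch below: one stored date, several 
computed presentations. The era table is exactly the kind of 
parameter that can change (a new era starts when an Emperor dies), so 
it is passed in explicitly. The names ERAS, era_year and 
presentations are mine, and the table lists only two eras for 
illustration.

```python
from datetime import date

# Illustrative parameter table: (first day of era, era name), newest first.
ERAS = [
    (date(1989, 1, 8), "Heisei"),
    (date(1926, 12, 25), "Showa"),
]

def era_year(d: date, eras=ERAS) -> str:
    """Express a date as an era name and a year counted within that era."""
    for start, name in eras:
        if d >= start:
            return f"{name} {d.year - start.year + 1}"
    raise ValueError("date precedes the era table")

def presentations(d: date, eras=ERAS) -> dict:
    """Compute several presentations from the single stored date."""
    return {
        "iso": d.isoformat(),
        "us": d.strftime("%m/%d/%Y"),
        "eu": d.strftime("%d.%m.%Y"),
        "era": era_year(d, eras),
    }

print(presentations(date(2012, 4, 2)))
```

The stored value never changes; only the table and the formulas do, 
which is where the question of checking for changed parameters or 
tampering arises.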
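
The last remark in example 3 (the same graph supported by different 
code points) can be seen directly with Python's standard unicodedata 
module. This is a minimal sketch; the helper name same_graph is mine, 
and it only covers canonical equivalence (NFC), not every kind of 
visual look-alike.

```python
import unicodedata

composed = "\u00e9"      # "é" as a single code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT
angstrom = "\u212b"      # ANGSTROM SIGN
a_ring = "\u00c5"        # LATIN CAPITAL LETTER A WITH RING ABOVE

def same_graph(a: str, b: str) -> bool:
    """True if two strings are canonically equivalent under NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

print(composed == decomposed)            # raw code points differ
print(same_graph(composed, decomposed))  # equivalent after normalization
print(same_graph(angstrom, a_ring))      # also equivalent
```

Any data store that compares names or labels has to decide whether it 
compares the raw code points or a normalized form.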

Best
jfc