Binaris, Oren,
At 13:43 02/04/2012, Oren Bochman wrote:
>>>WikiData's mandate appears to be to improve en.Wikipedia (and then
>>>the other WMF projects). If WikiData cannot serve the needs of
>>>en.wikipedia it will be of little consequence to the other
>>>projects since it will need a community to maintain it.
>
>>This is a new piece of information for me. Does enwiki really have
>>any priority over other Wikipedias? :-O I would be unhappy with that.
>
>My point is a pragmatic one not a political one. Note that
>Wikidata is planned to add language prototypes in the third part of
>the time line - that is my reference.
>
>If you read my note you will notice I propose to take a realistic
>view of the communities. It would be a mistake to forget
>that en.wikipedia is where the bulk of the information exists, the
>largest community and the source of the most difficult policies i.e.
>the greatest challenges.
>
>However getting data into Wikidata is only half of the problem. To
>get it out to ar...zu.Wikipedias without causing chaos -- is what
>we are here to figure out.
We have good experience of that type of problem with the "layer
below": languages (data have to be expressed in a language). This is
why I raised the point with Lydia and helped her in asking for a more
organized road map when she said she did not fully understand.
This experience covers governance (countries and languages),
character sets (ISO 10646 and Unicode), langtags (RFC 5646) and their
filtering (RFC 4647), and non-ASCII domain names and mail addresses.
In these cases we met the interlinguistic (languages),
inter-readability (scripts), interconnectibility (infrastructure) and
interoperability (protocols) issues which have made the Internet what
it is so far. The Internet+ layer set is new, in use, and clear, but
far from stabilized yet (I quoted the IETF drafts on this). It
concerns fringe-to-fringe interintelligibility (the next upper
network stratum will be the Intersem, the semiotic Internet, the
Internet of thoughts, which is my main interest with auxiliary
intelligence - not for today!).
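
As a concrete illustration of the langtag and filtering experience
above (my own sketch in Python, drawn from the RFCs and not from any
Wikidata plan), RFC 4647 "basic filtering" is the kind of rule a
multilingual store has to apply when deciding whether a stored
label's language tag satisfies a reader's language range:

# RFC 4647 section 3.3.1 "basic filtering": a language range matches a
# language tag when it equals the tag (case-insensitively), is a prefix
# of the tag ending at a "-" boundary, or is the wildcard "*".
def basic_filter(language_range: str, tag: str) -> bool:
    r, t = language_range.lower(), tag.lower()
    return r == "*" or t == r or t.startswith(r + "-")

print(basic_filter("zh", "zh-Hant-TW"))  # True: "zh" covers Traditional Chinese (Taiwan)
print(basic_filter("zh-Hant", "zh-TW"))  # False: "zh-TW" carries no Hant subtag
print(basic_filter("*", "ar"))           # True: the wildcard matches any tag
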
Wikimedia follows the same progression of strata, but on top of the
web application layer. This means that the Wikidata project raises
three fundamental questions:
1. is it technically to be a web or a network+ application? (this has
no impact on the content, but it does impact the architectural and
referential issues and the future of the Intersem)
2. how is it to be interintelligible: by being internationalized (the
leading culture hosts the others) or multilingualised (every language
is treated equally)?
3. how is it going to support (actually, not block further innovation
in) intercomprehensibility, i.e. be able to address the Intersem
stratum "capital of Israel" issue?
Point 1: is to decide who (humanists or technicalists) are to lead
the Wikidata specifications.
Point 2: is to decide whether we address the logical needs of en.wiki
and then propagate the solutions to the other languages, or whether
we are pragmatic and acknowledge that the need is for a multilingual
system. In that case we first try to address this entirely new
problem among a limited set of wikis (say the three leading ones,
amenable to ASCII but with diacritics), and then with a non-Roman one.
Point 3: is to understand and accept the complexity that Wikidata has
to face, and to try not to over-complexify its future with today's
"great" solutions and habits.
Examples:
1. Network-related issues may become a first nightmare, as many URIs
will become IRIs (different scripts) and digital convergence will
need to use DNS CLASSes, at least between different technologies.
This means that en.wiki will have to support URIs and IRIs
indifferently, while English contributors have no Chinese keyboards.
CLASSes are not even supported yet in the way domain names are
published. The more communication layers, the more complexity, the
less security, and the less room and power on mobile devices.
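
To illustrate the URI/IRI point (again my own sketch, assuming only
Python's standard urllib and its built-in idna codec, nothing
specified for Wikidata), an IRI with a non-ASCII path or host has to
be mapped to an all-ASCII URI before legacy tooling can handle it:

from urllib.parse import urlsplit, urlunsplit, quote

def iri_to_uri(iri: str) -> str:
    # A rough mapping, not a full RFC 3987 implementation: the host goes
    # through IDNA (punycode), path and query are percent-encoded as UTF-8.
    # Assumes the path is not already percent-encoded and ignores userinfo.
    parts = urlsplit(iri)
    host = parts.hostname.encode("idna").decode("ascii") if parts.hostname else ""
    netloc = host + (f":{parts.port}" if parts.port else "")
    path = quote(parts.path, safe="/")
    query = quote(parts.query, safe="=&")
    return urlunsplit((parts.scheme, netloc, path, query, parts.fragment))

print(iri_to_uri("https://zh.wikipedia.org/wiki/耶路撒冷"))
# -> https://zh.wikipedia.org/wiki/%E8%80%B6%E8%B7%AF%E6%92%92%E5%86%B7
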
2. The ISO and IETF experience shows that the
multicultural/multilinguistic issue is already very complex to
address architecturally and technically (it took ten years, and it is
far from digested) due to people's intuitive agreement with
stereotypes. The network-side solution to this complexity is
"subsidiarity" (i.e. the current Wikipedia solution). Wikidata
therefore has to be conceived as a service to subsidiarity (both the
Wikimedia and the Internet+ kinds).
3. Then, the third problem, which no one has addressed yet except
ISO 3166, is variance: two identical particulars (effects, names,
data, etc.) may be different. E.g. there are many ways to compute and
present the same date. Are the results to be stored in Wikidata in
all these ways, every day, with bridges built between them? Or are
they to be stored as a single datum with the formulas to compute
them? Then how can one be sure that some parameter has not changed
(e.g. the death of the Emperor) and that the computation was not
tampered with? Variance is everywhere (actually variance is most
probably Life). ISO 3166 has no variance, because it is the sovereign
reference: the list of States and of the languages of their laws
(however, Palestine is in it already, and Taiwan is there). ISO
documents are in French, English and possibly Russian. ISO 3166-1
states which are the normative languages in every country by
reference to ISO 639 (the list of language names). ISO 3166 defines
the ccTLDs and is used in langtags to document languages and
cultures. ISO 10646 (supported by Unicode) is the coded character
tables for the scripts. At the binary layer it is full of variants
(the same graphs being supported by different code points).
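
To illustrate both kinds of variance above (my own sketch, assuming
only Python's standard library, with a hypothetical date chosen for
the "death of the Emperor" example), the same stored date can be
derived into several presentations instead of being stored in every
form, and the same visible letter can be carried by different code
point sequences until it is normalized:

import datetime
import unicodedata

# One stored datum, several derived presentations (rather than storing
# and synchronizing each presentation separately).
death = datetime.date(1989, 1, 7)
print(death.isoformat())               # 1989-01-07
print(death.strftime("%d %B %Y"))      # 07 January 1989
print(death.toordinal() + 1721424.5)   # astronomical Julian Day number

# The same visible "Å" carried by three different code point sequences;
# Unicode NFC normalization folds all of them to the single U+00C5.
forms = ["\u00C5", "\u212B", "A\u030A"]  # A WITH RING, ANGSTROM SIGN, A + COMBINING RING
print({unicodedata.normalize("NFC", s) for s in forms})  # {'Å'}
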
Best
jfc