Could you give an example of what we could do better than CLDR or the relevant ISO standards?
On 18 May 2014 10:06, h hanteng@gmail.com wrote:
Dear Nemo,
As I am waiting for a more complete response, I am not sure that I
understand what your last "No", as in "No, we definitely can't", means. To clarify, take the CLDR supplemental Language-Territory information, for example:
http://www.unicode.org/cldr/charts/latest/supplemental/language_territory_in...
One can suggest adding a data point by submitting sourced
numbers for a geo-linguistic population like this: http://unicode.org/cldr/trac/newticket?&description=%3Cterritory%2c%20sp...)
In Wikipedia articles and Wikidata pages, there are many attempts to
provide more up-to-date and better-sourced data points. I see potential in exchanging such data and curating it better in Wikidata, as a more detailed and dynamic source than the CLDR.
These data points will have extra benefits in curating traffic data.
For one, these geo-linguistic population data points would be useful to normalize traffic data for further analysis, such as geographic normalization. For another, they provide important reference data for the development strategies and policies of the Wikipedia projects.
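To make the normalization idea concrete, here is a minimal sketch in Python. It assumes two hypothetical lookup tables keyed by (language, territory): one with raw page views and one with speaker populations of the kind CLDR or Wikidata could supply. The function name, the keys, and all numbers are placeholders for illustration, not real traffic or population figures.

```python
def views_per_thousand_speakers(views, speakers):
    """Normalize raw view counts by geo-linguistic population.

    Both arguments are dicts keyed by (language, territory) tuples.
    Keys with a missing or zero population are skipped rather than guessed.
    """
    rates = {}
    for key, view_count in views.items():
        population = speakers.get(key)
        if not population:
            continue  # no sourced population figure for this pair
        rates[key] = 1000 * view_count / population
    return rates


# Placeholder values only (hypothetical language and territory codes).
views = {("xx", "AA"): 5000, ("xx", "BB"): 5000, ("yy", "AA"): 1200}
speakers = {("xx", "AA"): 1_000_000, ("xx", "BB"): 50_000}

rates = views_per_thousand_speakers(views, speakers)
for key, rate in sorted(rates.items()):
    print(key, rate)
```

In this made-up example the two territories have identical raw view counts, but the smaller speaker population yields a rate twenty times higher, which is the kind of difference that raw traffic numbers alone would hide.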
Best, han-teng liao
2014-05-18 16:23 GMT+08:00 Federico Leva (Nemo) nemowiki@gmail.com:
Thanks for your suggestions. Just some quick pointers below.
h, 18/05/2014 08:26:
(I-A). Tabulate the data points in absolute numbers first, not percentage numbers [...]
(I-B). Include all language versions for the *editing traffic* report as well. [...]
(I-C). Provide static data objects in more accessible format (i.e. csv and/or json). [...]
(II-A). Putting viewing traffic and editing traffic reports on the same page. [...]
(II-B). Organizing and archiving the traffic reports for historical comparison. [...]
(II-C). Provide dynamic data objects in more accessible format (i.e. csv and/or json).
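As a rough illustration of the "absolute numbers first" and "csv and/or json" suggestions, the following Python sketch keeps absolute counts as the primary data and derives percentages from them. The column names (language, country, views) and the values are assumptions made up for this example; they do not reflect the actual WikiStats report layout.

```python
import csv
import io
import json

FIELDS = ["language", "country", "views"]  # hypothetical column names


def to_csv(rows):
    """Serialize rows of absolute counts as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def with_shares(rows):
    """Return copies of rows with a percentage derived from the absolute counts."""
    total = sum(row["views"] for row in rows)
    if total == 0:
        return [dict(row, share_percent=0.0) for row in rows]
    return [dict(row, share_percent=round(100 * row["views"] / total, 2)) for row in rows]


# Placeholder data, not real traffic figures.
rows = [
    {"language": "xx", "country": "AA", "views": 750},
    {"language": "xx", "country": "BB", "views": 250},
]

csv_text = to_csv(rows)
json_text = json.dumps(with_shares(rows), indent=2)
print(csv_text)
print(json_text)
```

Publishing the absolute counts and computing shares on demand means anyone can recompute percentages against a different total, which is not possible when only percentages are published.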
At least the first four are "just" changes in the WikiStats report formatting; personally, I encourage you to submit patches: https://git.wikimedia.org/summary/analytics%2Fwikistats.git (should be the "squids" directory, but there is some ongoing refactoring of the repos).
On archives and "history rewriting"/reports regeneration, see also https://bugzilla.wikimedia.org/show_bug.cgi?id=46198
[...] (III-B). Smaller (i.e. more specific) geographic aggregate units.
The country (geographic) information is often based on geo-IP databases, and sometimes provincial and city-level data would be available.
http://lists.wikimedia.org/pipermail/wikitech-l/2014-April/075964.html
[...]
(I know that the Unicode Common Locale Data Repository (CLDR Version 25 http://cldr.unicode.org/index/downloads/cldr-25) provides “language-territory” http://www.unicode.org/cldr/charts/latest/supplemental/language_territory_information.html or “territory-language” http://www.unicode.org/cldr/charts/latest/supplemental/territory_language_information.html unit-based
charts, but I believe that the Wikimedia projects can use and build one better.) [...]
No, we definitely can't, not alone. I've asked for help, please contribute: https://www.mediawiki.org/wiki/Universal_Language_Selector/FAQ#How_does_Universal_Language_Selector_determine_which_languages_I_may_understand.
Nemo
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l