Hello,
I've drafted my proposal about language fallback and conversion issues for Wikidata at [1].
Wikidata currently stores multilingual content. Labels (names, descriptions, etc.) are expected to be written in every language, so that every user can read them in their own language. But there are some problems at present:
* If some content doesn't exist in a specific language, users with exactly that language set in their preferences see something meaningless (the item's ID instead). This makes languages with fewer users (and thus fewer labels filled in) practically unusable.
* Some similar languages often share the same value. Populating strings for every language one by one wastes resources and may let them get out of sync later.
* Even for languages that are not "that similar", MediaWiki already has a facility to transliterate (a.k.a. convert) content from a sister language (a.k.a. variant), which can be used to provide better results for users.
This proposal aims to resolve these issues by displaying content from another language to users, based on user preferences (some users may know more than one language), language similarity (the language fallback chain), or the possibility of transliteration, and by allowing proper editing of that content.
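The lookup order described above (the user's own language, then other languages the user knows, then the fallback chain) can be sketched roughly as follows. This is only an illustration, not the proposal's design: the FALLBACKS table, the function names, and the sample labels are all made up for the example, and real fallback chains are defined by MediaWiki.

```python
# Hypothetical fallback data, for illustration only.
FALLBACKS = {"en-ca": ["en"], "de-at": ["de", "en"]}

def candidate_languages(user_lang, extra_langs=()):
    """Return the languages to try, in order, without duplicates."""
    ordered = [user_lang, *extra_langs, *FALLBACKS.get(user_lang, [])]
    seen = set()
    result = []
    for code in ordered:
        if code not in seen:
            seen.add(code)
            result.append(code)
    return result

def pick_label(labels, user_lang, extra_langs=()):
    """Return (language, text) for the first candidate with a label, else None."""
    for code in candidate_languages(user_lang, extra_langs):
        if code in labels:
            return code, labels[code]
    return None

# Sample labels keyed by language code (made-up data).
labels = {"en": "water", "fr": "eau"}
print(pick_label(labels, "en-ca"))          # falls back along the chain
print(pick_label(labels, "de-at", ["fr"]))  # a user-preferred language wins
```

Returning the language together with the text lets the caller tell an exact match from a fallback.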
Although Wikidata is still in a stage of rapid development, a lot of data has already been added to it. The later we resolve these issues, the more duplicates may be created, which will require more cleanup work in the future, like what we had to face before and when the language converter (that transliteration system) was introduced for the Chinese Wikipedia. So I'm planning to do this project this summer.
There's also a backup proposal about category redirects at [2]. I wrote it because I really want to see it implemented too, either by me or by someone else. Some of its content may also be useful for other participants willing to do this project.
Comments are welcome and appreciated.
[1] https://www.mediawiki.org/wiki/User:Liangent/wb-lang
[2] https://www.mediawiki.org/wiki/User:Liangent/cat-redir
-Liangent
What if I don't want such fallback to work for me? What if I'd like to see what is labeled and what is not? Have you considered making this a user option with a flexible fallback scheme, or a site-wide preference with a fixed one?
In general, this is a very good feature that Wikidata does not implement yet. And thanks for raising the category redirects once again! :)
On Sat, Apr 27, 2013 at 11:07 PM, Liangent liangent@gmail.com wrote:
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
We already have a way to handle that kind of thing with normal language fallbacks. &uselang=qqx. Wikidata should be able to do something similar trivially.
On Sun, Apr 28, 2013 at 5:05 AM, Paul Selitskas p.selitskas@gmail.com wrote:
> What if I don't want such fallback to work for me? What if I'd like to see what is labeled and what is not? Have you considered making this a user option with a flexible fallback scheme, or a site-wide preference with a fixed one?
> In general, this is a very good feature that Wikidata does not implement yet. And thanks for raising the category redirects once again! :)
I guess being able to see what's translated and what's not can be handled by appending language names to labels when they fall back to another language.
On Sun, Apr 28, 2013 at 5:55 AM, Lukas Benedix benedix@zedat.fu-berlin.de wrote:
> Hi,
> If I understand your proposal on User:Liangent/wb-lang correctly, you want to write an extension or implement the language fallback directly in Wikibase.
> I thought about doing something about the missing-lang issue myself, and I think writing a gadget would have a higher chance of getting deployed on wikidata.org. And users who don't like it could easily disable the gadget in their preferences.
I guess I prefer to patch Wikibase directly.
> You should keep in mind that the user interface of such a feature is not easy to design.
So there are two weeks set aside in my proposal for collecting feedback on designs.
On Sun, Apr 28, 2013 at 10:03 AM, Daniel Friesen daniel@nadir-seen-fire.com wrote:
> We already have a way to handle that kind of thing with normal language fallbacks. &uselang=qqx. Wikidata should be able to do something similar trivially.
I don't understand how &uselang=qqx would work for this. Any explanation?
-Liangent
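The idea in this message of appending language names to labels that come from a fallback could look roughly like the sketch below. It is an assumption-laden illustration: the function name is invented, the first entry of the chain is taken to be the user's own language, and the language code is appended rather than a full language name to keep the example short.

```python
def display_label(item_id, labels, chain):
    """Return the text to show for an item.

    chain[0] is assumed to be the user's own language; later entries are
    fallbacks. A label taken from a fallback is marked with its language code.
    """
    for position, code in enumerate(chain):
        if code in labels:
            text = labels[code]
            return text if position == 0 else f"{text} ({code})"
    # Nothing available in any language of the chain: show the bare ID.
    return item_id

# Made-up item ID and labels.
print(display_label("Q123", {"en": "example"}, ["en-ca", "en"]))
print(display_label("Q123", {"en-ca": "example"}, ["en-ca", "en"]))
print(display_label("Q123", {}, ["en-ca", "en"]))
```

With a marker like this, a reader can still see which labels are missing in their own language even though fallback is active.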
What exactly is uselang=qqx doing?
http://www.wikidata.org/wiki/Q567?uselang=qqx
looks like this:
http://lbenedix.monoceres.uberspace.de/screenshots/oojp8u26nl_%282013-04-28_17.13.54%29.png
Lukas
On Sun 28.04.2013 04:03, Daniel Friesen wrote:
> We already have a way to handle that kind of thing with normal language fallbacks. &uselang=qqx. Wikidata should be able to do something similar trivially.
On 04/28/2013 11:15 AM, Lukas Benedix wrote:
> What exactly is uselang=qqx doing?
> http://www.wikidata.org/wiki/Q567?uselang=qqx
> looks like this:
> http://lbenedix.monoceres.uberspace.de/screenshots/oojp8u26nl_%282013-04-28_17.13.54%29.png
uselang in general lets you temporarily display an interface with a different language. You can use it with regular languages (uselang=es, uselang=zh, etc.).
However, uselang=qqx is special. It shows the keys (identifiers for a message) of the MW messages (https://www.mediawiki.org/wiki/Manual:System_messages) used in the page.
Matt Flaschen
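For trying this out on several pages, the uselang parameter can be added to a page URL programmatically. The helper below is a small sketch using only Python's standard library; the function name is made up for the example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def with_uselang(url, lang):
    """Return url with its uselang query parameter set (or replaced) to lang."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    query["uselang"] = lang
    return urlunsplit(parts._replace(query=urlencode(query)))

# qqx shows message keys; a regular code such as es switches the interface language.
print(with_uselang("http://www.wikidata.org/wiki/Q567", "qqx"))
print(with_uselang("http://www.wikidata.org/wiki/Q567?uselang=qqx", "es"))
```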
Okay…
So uselang is not usable for Liangent's proposal.
An item could have its label only in en and its description only in en-ca. uselang=en would miss the one and uselang=en-ca the other, whereas his solution would display en as well as en-ca for both (with the missing language highlighted in a way yet to be defined).
I still think that a gadget would be the best for his proposal… users could easily turn it off and modify it for their own needs.
Lukas
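The en / en-ca scenario above can be made concrete with a few lines of Python. This is only a sketch with placeholder texts and invented function names; it treats the set of languages to try as a plain ordered list and resolves label and description independently of each other.

```python
# The scenario from this message: label only in en, description only in en-ca.
# The term texts are placeholders.
item = {
    "labels": {"en": "some label"},
    "descriptions": {"en-ca": "some description"},
}

def first_available(terms, languages):
    """Return (language, text) for the first language that has a term, else None."""
    for code in languages:
        if code in terms:
            return code, terms[code]
    return None

def render(item, languages):
    """Resolve label and description separately against the same language list."""
    return {field: first_available(item[field], languages)
            for field in ("labels", "descriptions")}

only_en = render(item, ["en"])        # like uselang=en: description is missing
only_en_ca = render(item, ["en-ca"])  # like uselang=en-ca: label is missing
both = render(item, ["en-ca", "en"])  # with fallback: both fields are filled
print(only_en, only_en_ca, both, sep="\n")
```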
On Sun 28.04.2013 18:11, Matthew Flaschen wrote:
On Mon, Apr 29, 2013 at 12:23 AM, Lukas Benedix benedix@zedat.fu-berlin.de wrote:
> Okay…
> So uselang is not usable for Liangent's proposal.
> An item could have its label only in en and its description only in en-ca. uselang=en would miss the one and uselang=en-ca the other, whereas his solution would display en as well as en-ca for both (with the missing language highlighted in a way yet to be defined).
> I still think that a gadget would be the best for his proposal… users could easily turn it off and modify it for their own needs.
Maybe this is true for a fallback-only requirement, but it's not true once conversion/transliteration comes in, as conversion must take place server-side.
-Liangent
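To give a feel for what conversion on read means, here is a toy sketch that derives a missing variant label from a sister variant with a character table. It is purely illustrative: the three-character table and the function name are invented for the example, and MediaWiki's actual language converter is considerably more elaborate than a per-character mapping.

```python
# Toy table with three Traditional -> Simplified Chinese character pairs.
# Illustration only; a real conversion table is far larger.
TO_SIMPLIFIED = str.maketrans({"維": "维", "數": "数", "據": "据"})

def label_for_variant(labels, variant, source, table):
    """Return the stored label for variant, or one converted from source.

    Returns None when neither the variant nor the source has a label.
    """
    if variant in labels:
        return labels[variant]
    if source in labels:
        return labels[source].translate(table)
    return None

# Only the zh-hant label is stored; the zh-hans one is derived on the fly.
stored = {"zh-hant": "維基數據"}
print(label_for_variant(stored, "zh-hans", "zh-hant", TO_SIMPLIFIED))
```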
Hi,
If I understand your proposal on User:Liangent/wb-lang correctly, you want to write an extension or implement the language fallback directly in Wikibase.
I thought about doing something about the missing-lang issue myself, and I think writing a gadget would have a higher chance of getting deployed on wikidata.org. And users who don't like it could easily disable the gadget in their preferences.
You should keep in mind that the user interface of such a feature is not easy to design.
BTW: you can get labels and descriptions in all available languages via the API call action=wbgetentities&format=json&ids=Q567
Lukas
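The API call mentioned above can be assembled and its result inspected along these lines. This is a sketch under a few assumptions: the props parameter is an addition not mentioned in the message, the helper names are invented, and the sample payload is hand-written placeholder data in the general shape of a wbgetentities response rather than real output. No network request is made.

```python
import json
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"

def entities_url(ids, props=("labels", "descriptions")):
    """Build a wbgetentities URL for the given item IDs."""
    params = {
        "action": "wbgetentities",
        "format": "json",
        "ids": "|".join(ids),
        "props": "|".join(props),
    }
    return f"{API}?{urlencode(params)}"

def languages_with_labels(payload, item_id):
    """Return the sorted language codes that have a label for item_id."""
    return sorted(payload["entities"][item_id].get("labels", {}))

# Hand-written placeholder data, not a real API response.
sample = json.loads("""
{"entities": {"Q567": {"labels": {
    "en": {"language": "en", "value": "example"},
    "de": {"language": "de", "value": "Beispiel"}}}}}
""")

print(entities_url(["Q567"]))
print(languages_with_labels(sample, "Q567"))
```

A gadget or script could compare the returned language codes against a user's preferred languages to decide which label to display.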