On 22 September 2013 15:54, Amir Ladsgroup ladsgroup@gmail.com wrote:
Hello, Persian Wikipedia is one of the largest wikis by number of categories, but people rarely think to add interwiki links to categories (they assume interwikis are just for articles). As a result we had a huge number of categories without any interwikis (30K out of 170K before I wrote my engine), which is really bad. I wrote some code to improve this, but it wasn't enough, so I wrote an engine that takes two databases: (1) a list of categories without interwikis, and (2) a list of categories that have an interwiki to a certain language (e.g. English), together with the target interwiki. My bot then analyzes them, "guesses" the correct interwiki for each category based on the naming patterns in the second database, and reports the results. Running this code on fa.wp produced a very large report [1], and we started sorting things out (merging duplicates [2], deleting extra categories, adding the correct interwikis). Now fewer than 25K categories lack interwikis (and the number keeps dropping). We did the same in the template namespace [3] and interwikified more than 10K templates.
Because this engine doesn't use any language-specific analysis, it can be run on any wiki and take interwikis from any language (we have planned to run it on Persian Wikipedia again, this time using Dutch and German as the source of interwikis).
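To make the idea of guessing from naming patterns concrete, here is a minimal Python sketch. It is not the actual engine: the function names (`learn_patterns`, `guess_interwiki`), the prefix/suffix template approach, and the German/English sample titles are all assumptions chosen for illustration. It only shows how a language-agnostic guess could be derived purely from already-linked titles.

```python
from collections import Counter


def learn_patterns(linked):
    """Derive naming templates from already-linked titles (hypothetical helper).

    `linked` maps a local title to its known interwiki target. Whenever a
    shorter linked title is embedded in a longer one on both sides, the text
    around it is stored as (src_prefix, src_suffix, dst_prefix, dst_suffix).
    """
    patterns = Counter()
    for src, dst in linked.items():
        for part_src, part_dst in linked.items():
            if not part_src or part_src == src:
                continue
            if part_src in src and part_dst in dst:
                src_pre, _, src_suf = src.partition(part_src)
                dst_pre, _, dst_suf = dst.partition(part_dst)
                patterns[(src_pre, src_suf, dst_pre, dst_suf)] += 1
    return patterns


def guess_interwiki(title, linked, patterns):
    """Return candidate targets for an unlinked title, best supported first."""
    guesses = Counter()
    for (src_pre, src_suf, dst_pre, dst_suf), support in patterns.items():
        if len(title) < len(src_pre) + len(src_suf):
            continue
        if not (title.startswith(src_pre) and title.endswith(src_suf)):
            continue
        # The variable part of the title must itself have a known interwiki.
        core = title[len(src_pre):len(title) - len(src_suf)]
        if core in linked:
            guesses[dst_pre + linked[core] + dst_suf] += support
    return guesses.most_common()


# Toy data (assumed, not taken from any real report).
linked = {
    "Frankreich": "France",
    "Italien": "Italy",
    "Fluss in Frankreich": "Rivers of France",
}
patterns = learn_patterns(linked)
result = guess_interwiki("Fluss in Italien", linked, patterns)
```

With this toy data, `result` contains the single guess "Rivers of Italy". Every guess would still need human review before the interwiki is actually added, which matches the report-then-sort-out workflow described above.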
So here are my questions: Is there a similar situation on your wiki? Do you want to run this code on your wiki too? Do you have any suggestions?
[1]: https://fa.wikipedia.org/w/index.php?title=%DA%A9%D8%A7%D8%B1%D8%A8%D8%B1:La... ردهها&oldid=10959457
[2]: One of the benefits of running this engine is that we can find duplicates.
[3]: https://fa.wikipedia.org/wiki/%DA%A9%D8%A7%D8%B1%D8%A8%D8%B1:Ladsgroup/%D8%A...
Best
Hi Amir -
I have a different question. Why is it in the interests of Fawiki to use the same categorization system as any other project? I ask this because I know that almost every Wikipedia has variations in the way that it categorizes articles and other pages, and there is not really a cross-wiki standard - nor would I expect one. Categorization is more or less in the same realm as defining notability, determining neutral point of view, and Manuals of Style: while philosophically we are very similar across all the Wikipedias, each project has a slightly different way of addressing these situations.
I'd suggest that the issue isn't really a technical problem; it's more a cultural one. That is, Wikipedia community cultures have developed their categorization systems slightly differently, so it is unlikely that any one will be a perfect match for another.
Risker/Anne