On 6/6/06, Roger Luethi collector@hellgate.ch wrote:
On Tue, 06 Jun 2006 07:54:27 -0400, Anthony DiPierro wrote:
That brings up another, longer-term to-do for categories: they should be language independent. For instance, [[Marie Curie]] is in de: and en: (the two articles happen to have the same title, but even if they didn't, they are linked via interwiki links). [[Kategorie:Pole]] is linked to [[Category:Polish people]]. So there should be no need to categorize Marie Curie twice (multiply by the actual number of languages which have a Polish people category and an article on Marie Curie).
http://meta.wikimedia.org/wiki/Wikidata
From what I know of that project, it's a lot more extravagant than what I'm talking about. Infoboxes are a lot more complicated than categories. Categories are just sets of articles, and the interwiki links are already there. Infoboxes have multiple types of fields, with various constraints on each of them, and various translation issues, most of which nobody has even begun to address.
This is pretty simple theoretically. The only real problem is getting the multiple category schemes in sync. Considering your point about how the German categorization scheme differs from the English one, this might be a lot harder in practice than it is in theory.
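To make the "simple theoretically" part concrete, here is a minimal sketch, assuming interwiki links are used to group pages into shared ids and membership is stored once on those ids. The ids, the dictionaries, and the function name categories_for are all made up for illustration; none of this is how MediaWiki actually stores anything.

```python
# Hypothetical data: every id and structure here is illustrative only.
# Interwiki links group pages in different languages under one shared id.
ARTICLE_IDS = {
    ("en", "Marie Curie"): "a:curie",
    ("de", "Marie Curie"): "a:curie",
}
CATEGORY_IDS = {
    ("en", "Category:Polish people"): "c:polish",
    ("de", "Kategorie:Pole"): "c:polish",
}
# Membership is recorded once, on the shared ids, not once per language.
MEMBERSHIP = {"a:curie": {"c:polish"}}


def categories_for(lang, title):
    """Return the local category names for an article in one language."""
    article_id = ARTICLE_IDS.get((lang, title))
    shared = MEMBERSHIP.get(article_id, set())
    return sorted(
        name
        for (cat_lang, name), cat_id in CATEGORY_IDS.items()
        if cat_lang == lang and cat_id in shared
    )


print(categories_for("de", "Marie Curie"))
print(categories_for("en", "Marie Curie"))
```

The lookup itself is trivial. The sketch quietly assumes that linked categories really define the same set, which is exactly the part that may not hold in practice.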
Right. You _could_ make it work on some categories, say "Women" or "Nobel Prize winners".
It cannot possibly work for all categories, though, because different languages have different categories. For instance, security and safety are the same word in German: Sicherheit. Thus, [[de:Lifebelt]] is in the German category that is linked to the English category Security.
I'd say then that either the German [[Kategorie:Sicherheit]] should be disambiguated into two different categories, or that [[Kategorie:Sicherheit]] shouldn't be linked to [[Category:Security]], because they don't define the same set. Maybe an English [[Category:Security and Safety]] could be made, with "Security" and "Safety" as subcats - then [[Kategorie:Sicherheit]] could link to [[Category:Security and Safety]].
But maybe this is a common enough thing that that's not going to be reasonable. At some point someone should look at how the en categories differ from the de ones. I figure there will be 5 major points of difference:
1) Things being categorized at different levels ([[Category:Polish women]] vs. [[Kategorie:Frau]]).
2) Interwiki links between articles on different things.
3) Interwiki links between categories defining different sets.
4) Articles categorized where they shouldn't be.
5) Articles missing from categories where they should be.
1) is the reason why I call this a "longer-term solution". 2 and/or 3 are what you describe above. 4 and 5 are the reason why it would be useful to coordinate things.
It'd be interesting to get a decent-sized sample and sort the differences into those 5 categories. If 2 and/or 3 were significant, then I suppose this idea fails, at least initially. If 1 is significant, and I think it might be, then the idea rests upon reaching a consensus among the different Wikipedias as to what level to categorize things at. And that's probably going to depend on having on-the-fly intersection categories.
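A rough sketch of how such a sample could be collected, assuming you already have the member lists of two linked categories and the article interwiki links as plain Python data. The function name membership_diff and the sample titles are hypothetical. It only finds the mismatches; sorting each one into the 5 kinds above would still have to be done by hand.

```python
def membership_diff(en_members, de_members, en_to_de):
    """Compare two linked categories through article interwiki links.

    Returns (only_en, only_de, unlinked):
      only_en  - en articles whose de counterpart is not in the de category
      only_de  - de articles with no en counterpart in the en category
      unlinked - en articles that have no de interwiki link at all
    """
    unlinked = {t for t in en_members if t not in en_to_de}
    # Map each linked en member to its de title: {de_title: en_title}.
    mapped = {en_to_de[t]: t for t in en_members if t in en_to_de}
    only_en = {en for de, en in mapped.items() if de not in de_members}
    only_de = set(de_members) - set(mapped)
    return only_en, only_de, unlinked


# Made-up sample data, loosely modelled on the Security/Sicherheit case.
en_security = {"Firewall"}
de_sicherheit = {"Firewall", "Rettungsring"}
interwiki = {"Firewall": "Firewall", "Lifebelt": "Rettungsring"}

only_en, only_de, unlinked = membership_diff(en_security, de_sicherheit, interwiki)
print(only_en, only_de, unlinked)
```

Here the only mismatch is the de article that sits in Sicherheit while its en counterpart is not in Security, which is the kind of case a human would then file under point 3.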
Numbers are much easier to share. Population of a country. Weight of a molecule.
Personally I don't see the difference between taxonomies and attributes, as described. But I suppose one (taxonomies?) could be described as partitioning (an article can only be in one taxonomy category) whereas attributes can be mixed. Under that definition
I suspect this is impossible. But it's hard to tell if you're not even sure what counts as a taxonomy.
Roger
Yeah, I don't know. I'm not the one who made the initial distinction between taxonomies and attributes. I was just trying to make sense of it, rather poorly I guess. :)
Anthony