Hi!
Some time ago I was categorizing some pages at pt.wikibooks and found a curious situation. I used the code [[Category:Test|*]] in these pages:
http://pt.wikibooks.org/w/index.php?title=User:Heldergeovane/Biologia_celular/%C3%8Dndice&action=edit
http://pt.wikibooks.org/w/index.php?title=User:Heldergeovane/Bonsai_no_Brasil:_%C3%8Dndice&action=edit
So, we expect:
* Both pages should appear at "Category:Test" under "*", and
* "User:Heldergeovane/B*i*ologia_celular/Índice" should come before "User:Heldergeovane/B*o*nsai_no_Brasil:_Índice" (since "i" comes before "o").
However, what we found at http://pt.wikibooks.org/w/index.php?title=Category:Test is the reverse order.
1) What are the criteria for ordering pages when two pages have the same sort key?
2) Is there anything wrong with the ordering of the two pages above?
Helder
2009/8/10 Helder Geovane Gomes de Lima heldergeovane@gmail.com:
- What is the criteria for ordering the pages when the sort key of two
pages are the same?
I think they're ordered by page ID, but I'm not sure. For all practical purposes, the ordering of pages with the same sortkey is undefined.
Roan Kattouw (Catrope)
On 8/10/09 12:18 PM, Roan Kattouw wrote:
2009/8/10 Helder Geovane Gomes de Lima heldergeovane@gmail.com:
- What is the criteria for ordering the pages when the sort key of two
pages are the same?
I think they're ordered by page ID, but I'm not sure. For all practical purposes, the ordering of pages with the same sortkey is undefined.
To clarify, here's the information that's available when sorting a category membership list:
* category name (fixed, since we're looking at a particular category)
* sort key (normally the page title, unless you overrode it)
* page ID (roughly corresponds to page creation time)
The page title can only be applied to the sorting if it's actually *in* the sort key. If you've overridden it, then *only* the sort key you provided will have any relevance in ordering; page ID will serve as a 'tiebreaker' but isn't really predictable.
-- brion
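The ordering described above can be sketched in a few lines of Python. This is only an illustration, not MediaWiki's code: the Member type, the category_order function, and the page IDs 101 and 205 are all made up for the example. It shows that when two pages share the sort key "*", only the page ID decides which comes first, and the title plays no part.

```python
from typing import NamedTuple


class Member(NamedTuple):
    sortkey: str  # custom sort key, or the page title when none was given
    page_id: int  # assigned when the page is created (values below are invented)
    title: str    # carried along for display only; never used for ordering


def category_order(members):
    # Sort key first; page ID is consulted only to break ties.
    return sorted(members, key=lambda m: (m.sortkey, m.page_id))


members = [
    Member("*", 205, "User:Heldergeovane/Biologia_celular/Índice"),
    Member("*", 101, "User:Heldergeovane/Bonsai_no_Brasil:_Índice"),
]

for m in category_order(members):
    print(m.page_id, m.title)
```

With these assumed IDs the "Bonsai" page is listed before the "Biologia" page, even though its title sorts later alphabetically.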
On Mon, Aug 10, 2009 at 4:07 PM, Brion Vibber brion@wikimedia.org wrote:
To clarify, here's the information that's available when sorting a category membership list:
- category name (fixed, since we're looking at a particular category)
- sort key (normally the page title, unless you overrode it)
- page ID (roughly corresponds to page creation time)
The page title can only be applied to the sorting if it's actually *in* the sort key. If you've overridden it, then *only* the sort key you provided will have any relevance in ordering; page ID will serve as a 'tiebreaker' but isn't really predictable.
We could break ties by appending the page title to custom sort keys, if this is a problem.
2009/8/10 Aryeh Gregor Simetrical+wikilist@gmail.com:
We could break ties by appending the page title to custom sort keys, if this is a problem.
I think it would be good! =)
(We actually have manually used "*{{PAGENAME}}" for a while... =S)
Helder
2009/8/11 Helder Geovane Gomes de Lima heldergeovane@gmail.com:
2009/8/10 Aryeh Gregor
We could break ties by appending the page title to custom sort keys, if this is a problem.
I think it would be good! =)
I don’t (at least not in the way it is expressed). If you want to use the page title as a tiebreaker, then add it as a new column to the index (before the page_id), not (as I read the original sentence) by appending the title to the sort key.
Otherwise, you’ll have to separate the sort key from the title with some control character under U+0020 (to ensure correct ordering of different-length sort keys – you need a separator which sorts before any valid character), which would be messy.
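A small Python sketch may make the separator point concrete. The names naive_key, separated_key, and SEP are hypothetical, and U+0001 is just one arbitrary choice of a character below U+0020. The example compares plain concatenation with concatenation through a separator, for two pages whose sort keys "A" and "AB" differ in length.

```python
SEP = "\x01"  # assumed separator; any character below U+0020 would do


def naive_key(sortkey, title):
    # Plain concatenation: the title can bleed into the sort key comparison.
    return sortkey + title


def separated_key(sortkey, title):
    # The separator sorts before any valid character, so a shorter sort key
    # always wins over a longer one that starts with the same characters.
    return sortkey + SEP + title


# (sort key, title) pairs; by sort key alone, "A" must come before "AB".
pages = [("A", "Zebra"), ("AB", "Apple")]

naive = sorted(pages, key=lambda p: naive_key(*p))
separated = sorted(pages, key=lambda p: separated_key(*p))

print(naive)      # ordering by "AZebra" vs "ABApple"
print(separated)  # ordering by "A\x01Zebra" vs "AB\x01Apple"
```

Without the separator, "ABApple" sorts before "AZebra", so the page with sort key "AB" jumps ahead of the page with sort key "A". With the separator, the sort key order is preserved and the title only matters among equal sort keys.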
But still, I don’t see the point in doing that. You don’t want a page called “Aaa” to come after a page called “Abc” when you set their sortkeys both to the same value? Don’t do that then. Set the sortkey according to what you want.
(OBTW, a different thing is that category paging is probably buggy in this tiebreaking aspect – even though the index is correctly defined to be unique, the page_id column is not included in the &from= paging parameter. But this bug will probably appear only in extreme cases, like 300 articles with an identical sortkey.)
-- [[cs:User:Mormegil | Petr Kadlec]]
2009/8/11 Petr Kadlec petr.kadlec@gmail.com:
I don’t (at least not in the way it is expressed). If you want to use the page title as a tiebreaker, then add it as a new column to the index (before the page_id), not (as I read the original sentence) by appending the title to the sort key.
The page title is not in the categorylinks table, so we can't add it to the index.
Otherwise, you’ll have to separate the sort key from the title with some control character under U+0020 (to ensure correct ordering of different-length sort keys – you need a separator which sorts before any valid character), which would be messy.
But still, I don’t see the point in doing that. You don’t want a page called “Aaa” to come after a page called “Abc” when you set their sortkeys both to the same value? Don’t do that then. Set the sortkey according to what you want.
Exactly. When using identical sortkeys, you shouldn't complain that MediaWiki doesn't magically know in which order you want to sort them. You can make it predictable by using a (more) unique sortkey.
Roan Kattouw (Catrope)
On Tue, Aug 11, 2009 at 4:35 AM, Petr Kadlec petr.kadlec@gmail.com wrote:
(OBTW, a different thing is that category paging is probably buggy in this tiebreaking aspect – even though the index is correctly defined to be unique, the page_id column is not included in the &from= paging parameter. But this bug will probably appear only in extreme cases, like 300 articles with an identical sortkey.)
It will return slightly wrong results whenever two articles with the same sort key happen to hit a page boundary. It's not a huge deal, since sortkeys are usually fairly unique, but it shouldn't be hard to fix if cl_from is already part of the sortkey index -- which it is, on trunk, although I can't say for sure whether that matches the deployed version.
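To illustrate the page-boundary problem, here is a toy Python model. It is not the actual MediaWiki query: ROWS, PAGE_SIZE, page_by_sortkey, and page_by_pair are invented names, and the "sort key >= from" comparison is an assumption about how a sortkey-only cursor would behave. It contrasts a cursor that carries only the sort key with one that carries the (sort key, page ID) pair.

```python
PAGE_SIZE = 2

# (sortkey, page_id) rows, already in index order; three pages share "*".
ROWS = [("*", 1), ("*", 2), ("*", 3), ("a", 4)]


def page_by_sortkey(from_key):
    # Cursor holds only the sort key, so it cannot point inside a run of ties.
    return [r for r in ROWS if r[0] >= from_key][:PAGE_SIZE]


def page_by_pair(from_pair):
    # Cursor holds (sortkey, page_id); tuple comparison resumes at the exact row.
    return [r for r in ROWS if r >= from_pair][:PAGE_SIZE]


first_page = ROWS[:PAGE_SIZE]   # ("*", 1) and ("*", 2)
next_row = ROWS[PAGE_SIZE]      # ("*", 3): where the second page should start

print(page_by_sortkey(next_row[0]))  # restarts at the first "*" row
print(page_by_pair(next_row))        # continues from ("*", 3)
```

In this model the sortkey-only cursor returns the first page again, because the tie on "*" straddles the page boundary, while the pair cursor picks up at the right row.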
wikitech-l@lists.wikimedia.org