Hidden categories should appear in the category namespace, but not elsewhere.
As for being "Hidden" or "Admin", I can envision uses for both. Admin categories could have a separate collapsible listing, while hidden categories might have some other uses. Since we've also been discussing the problems of implementing "Category Intersection", an interim solution could be repopulating parent categories and "hiding" intersection categories. Fully populated parent categories are the norm in some projects like German Wikipedia and they also appear sometimes in English Wikipedia (eg. Category:Operas). I have a proposal posted currently about fully populating "Index" categories at en:Wikipedia talk:Categorization, and it would be much improved if the intersection categories could be hidden. The primary reason we have been deleting intersection categories is because they clutter articles. If they didn't clutter articles, they wouldn't be a problem.
Perhaps the non-hidden categories could be expanded with a [+] the same way subcategories are expanded. For example, if someone is listed under "Methodist", clicking on the plus might add the hidden categories "American Methodist" or "Methodist presidents". This would require searching to see whether any of the hidden categories are descendants of the clicked-on category. This pseudo category-intersection system would also be an incentive to get ready for a real implementation. For Category Intersection to work, hundreds of categories will need to be repopulated.
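A minimal sketch of the descendant search described above, in Python. The `children` mapping, the `hidden` set, and the function name are illustrative assumptions, not part of MediaWiki; a real implementation would read the category graph from the database.

```python
from collections import deque

def hidden_descendants(clicked, article_categories, hidden, children):
    """Return the article's hidden categories that descend from `clicked`.

    `children` maps a category name to its direct subcategories
    (a hypothetical in-memory stand-in for the category graph).
    """
    seen = {clicked}
    queue = deque([clicked])
    # Breadth-first walk over everything below the clicked-on category.
    while queue:
        current = queue.popleft()
        for child in children.get(current, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    seen.discard(clicked)
    return sorted(c for c in article_categories if c in hidden and c in seen)

# Illustrative data only.
children = {
    "Methodist": ["American Methodist", "Methodist presidents"],
    "American": ["American Methodist"],
}
hidden = {"American Methodist", "Methodist presidents"}
article = ["Methodist", "American", "American Methodist", "Methodist presidents"]

expanded = hidden_descendants("Methodist", article, hidden, children)
```

Clicking [+] next to "Methodist" would then list the entries in `expanded`. The `seen` set keeps the walk from looping if the category graph contains a cycle.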
Along those lines, I'm wondering about yet another interim step toward full category intersection. A while back, several of us editors on English Wikipedia worked on a design for an interface for implementing Category Intersection (it is at en:WP:CI). We envision check boxes next to each category listing in an article, and then a button that queries the intersection. If making the query were to create a hidden category and automatically categorize all the articles that result from the query, the next time the request is made it could just display the results, just like any other category. There might be a timer that resets (every week?) that would force another query to update the category. This way each intersection query would happen fairly infrequently -- as infrequently as need be to keep from overloading the servers.
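The query-then-reuse idea with a weekly reset can be sketched as follows. This is only an in-memory illustration under stated assumptions: the class name, the `members` mapping, and the injected `clock` are hypothetical, and the proposal itself would store the result as a hidden category rather than in a dictionary.

```python
import time

WEEK_SECONDS = 7 * 24 * 60 * 60

class IntersectionCache:
    """Run an intersection once, then reuse the result until it expires."""

    def __init__(self, members, max_age=WEEK_SECONDS, clock=time.time):
        self.members = members    # category name -> set of article titles
        self.max_age = max_age    # the "timer that resets (every week?)"
        self.clock = clock        # injectable so the expiry can be tested
        self.cache = {}           # frozenset of names -> (timestamp, result)
        self.queries_run = 0      # how many real intersections were computed

    def intersect(self, *categories):
        key = frozenset(categories)   # order of the check boxes is irrelevant
        now = self.clock()
        entry = self.cache.get(key)
        if entry is not None and now - entry[0] < self.max_age:
            return entry[1]           # still fresh: no query needed
        sets = [self.members.get(c, set()) for c in categories]
        result = set.intersection(*sets) if sets else set()
        self.queries_run += 1
        self.cache[key] = (now, result)
        return result
```

With this shape, the expensive step runs at most once per `max_age` for each distinct combination of categories, no matter how often readers request it.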
There would need to be a naming convention for the automatically generated categories, perhaps using a double colon -- so the intersection of Category:Mozart and Category:Operas would generate Category:Mozart::Operas. I don't think we'd want these auto-generated categories to be orphan categories. The category could be automatically put in a maintenance category, or better yet, a child category of each parent could be created to hold all automatically created categories. If the category is called "Operas" this holding category could be called "Intersections with Operas" or "Operas and..." If the query is worth keeping it could be recategorized by an editor (eg. Category:Operas by composer). It would probably be useful to be able to see how often the query was requested. If intersection categories get renamed, a category redirect should be able to get the user (and future queries) to the correct place.
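A short sketch of the naming convention, again in Python. The function names are hypothetical, and sorting the parent names (so the same pair always yields the same title regardless of the order they were checked) is an added assumption rather than something stated above.

```python
PREFIX = "Category:"

def strip_prefix(title):
    """Drop a leading 'Category:' if present."""
    return title[len(PREFIX):] if title.startswith(PREFIX) else title

def intersection_title(*parents):
    """Build the auto-generated title, e.g. Category:Mozart::Operas."""
    names = sorted(strip_prefix(p) for p in parents)  # assumption: canonical order
    return PREFIX + "::".join(names)

def holding_categories(*parents):
    """One 'Intersections with X' holding category per parent."""
    return [f"{PREFIX}Intersections with {strip_prefix(p)}" for p in parents]

title = intersection_title("Category:Operas", "Category:Mozart")
holders = holding_categories("Category:Mozart", "Category:Operas")
```

Here `title` is "Category:Mozart::Operas", and `holders` names the two holding categories that would keep the generated category from being an orphan.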
If any of these intersection queries cause problems, an administrator could protect the category page. The next time the query is requested, the blocked page would keep the query from being run. The user would see the reason for the blocked query posted on the category page. This would prevent two or more huge categories from being intersected (eg. Category:Living People intersected with Category:Films). If the CPU time were analyzed automatically for each query, the blocks might be able to happen automatically too.
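The blocking behaviour could look roughly like this. Everything here is a hypothetical sketch: the class, the exception, and the one-second limit are invented for illustration, and `time.process_time` measures only the Python process itself, whereas a real deployment would need the database's own cost figures.

```python
import time

class QueryBlocked(Exception):
    """Raised when the intersection's category page is protected."""

class IntersectionGate:
    def __init__(self, cpu_limit_seconds=1.0, timer=time.process_time):
        self.cpu_limit = cpu_limit_seconds
        self.timer = timer      # injectable so the limit can be tested
        self.blocked = {}       # category title -> reason shown to the user

    def protect(self, title, reason):
        """Manual block, as an administrator protecting the page would do."""
        self.blocked[title] = reason

    def run(self, title, query):
        # A protected page keeps the query from being run at all.
        if title in self.blocked:
            raise QueryBlocked(self.blocked[title])
        start = self.timer()
        result = query()
        elapsed = self.timer() - start
        # Automatic block: an over-budget query is refused next time.
        if elapsed > self.cpu_limit:
            self.protect(title, f"automatically blocked: used {elapsed:.2f}s of CPU time")
        return result
```

The first expensive request still runs to completion under this sketch; only later requests are refused, with the stored reason available to display on the category page.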
-- Samuel Wantman en:User:Sam
On Sat, Feb 23, 2008 at 5:21 AM, Samuel Wantman <wantman@earthlink.net> wrote:
> Since we've also been discussing the problems of implementing "Category Intersection", an interim solution could be repopulating parent categories and "hiding" intersection categories. Fully populated parent categories are the norm in some projects like German Wikipedia and they also appear sometimes in English Wikipedia (eg. Category:Operas). I have a proposal posted currently about fully populating "Index" categories at en:Wikipedia talk:Categorization, and it would be much improved if the intersection categories could be hidden. The primary reason we have been deleting intersection categories is because they clutter articles. If they didn't clutter articles, they wouldn't be a problem.
A very reasonable idea that can be implemented Right Now with little effort on the user level. Important or interesting intersections can be manually populated, or populated in batches with bots. It's not as good as a real solution, but it should work well enough.
> Perhaps the non-hidden categories could be expanded with a [+] the same way subcategories are expanded. For example, if someone is listed under "Methodist", clicking on the plus might add the hidden categories "American Methodist" or "Methodist presidents".
A very specific feature. I might even call it kind of weird in its narrowness: it assumes a *very* particular use of hidden categories. I think it's okay to have people just click on the general category, and then navigate their way down to intersection categories if they're so inclined.
> If making the query were to create a hidden category and automatically categorize all the articles that result from the query, the next time the request is made it could just display the results, just like any other category. There might be a timer that resets (every week?) that would force another query to update the category. This way each intersection query would happen fairly infrequently -- as infrequently as need be to keep from overloading the servers.
Well, this is just caching. We do that anyway (query cache, etc.). It's still not really acceptable enough. Especially because your proposed caching method would require not just a scan of a large chunk of an index, but insertion of potentially thousands or tens of thousands of rows.
> There would need to be a naming convention for the automatically generated categories, perhaps using a double colon -- so the intersection of Category:Mozart and Category:Operas would generate Category:Mozart::Operas. . . .
By this point, the feature you describe would be more difficult to implement than just implementing real and properly efficient category intersection with Lucene or something. When you realize that to get your hack working properly, you need to implement so many workarounds that the real feature would be easier, it's time to discard the idea of a hack.
Simetrical wrote:
>> Since we've also been discussing the problems of implementing "Category Intersection", an interim solution could be repopulating parent categories and "hiding" intersection categories. Fully populated parent categories are the norm in some projects like German Wikipedia and they also appear sometimes in English Wikipedia (eg. Category:Operas). I have a proposal posted currently about fully populating "Index" categories at en:Wikipedia talk:Categorization, and it would be much improved if the intersection categories could be hidden. The primary reason we have been deleting intersection categories is because they clutter articles. If they didn't clutter articles, they wouldn't be a problem.
> A very reasonable idea that can be implemented Right Now with little effort on the user level. Important or interesting intersections can be manually populated, or populated in batches with bots. It's not as good as a real solution, but it should work well enough.
Fabulous! Can we do it? The ability to hide categories from pages will have a huge effect on categories. One that will be quite positive, I think. The current discussion started because a bot was set to repopulate categories, so the bots are ready. If this is going to happen, we need to start discussions about guidelines beforehand and get people on board, so please give us some warning in advance. If it literally can happen "right now", I'd actually prefer it be left turned off until we category wonks can generate a plan.
...
>> There would need to be a naming convention for the automatically generated categories, perhaps using a double colon -- so the intersection of Category:Mozart and Category:Operas would generate Category:Mozart::Operas. . . .
> By this point, the feature you describe would be more difficult to implement than just implementing real and properly efficient category intersection with Lucene or something. When you realize that to get your hack working properly, you need to implement so many workarounds that the real feature would be easier, it's time to discard the idea of a hack.
Forget I ever mentioned it. I'll wait for the real thing. With the first step of making hidden categories we can set to work on repopulating categories. This will make the transition to Category Intersection much smoother.
--Samuel Wantman
[en:User:Sam]
wikitech-l@lists.wikimedia.org