Hi Nick,
Nick Jenkins wrote:
Hi Julien,
I was wondering if there are any projects to help *detect* categories and
then help editors by *suggesting* categories?
I think it's a good idea.
The closest thing I can see is at:
http://en.wikipedia.org/wiki/Wikipedia:Auto-categorization (although I'm not 100%
clear on what that project was doing with categories, so maybe it's not as related
as it sounds, but I thought it best to mention it).
Thank you for the link; it is very interesting. I will try to contact
the author of this page.
Once you have the clusters, you need to try labelling them with a category:
- let the user identify the category name
- use the word space to find the words that best describe this set
of articles
- ...
Then you can run this algorithm on a category to try to split it into
sub-categories.
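To make the second labelling option a bit more concrete, here is a minimal
sketch in Python. It assumes the clusters already exist and are given as plain
article texts, and it scores words with a simple tf-idf where each cluster is
treated as one document. The names label_clusters and tokenize are only
illustrative, and the scoring is one possible choice, not necessarily the one
a real implementation would use.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Very rough tokenizer: lowercase ASCII words only (an assumption).
    return re.findall(r"[a-z]+", text.lower())


def label_clusters(clusters, top_n=3):
    """Suggest label words for each cluster.

    clusters: dict mapping a cluster id to a list of article texts.
    Returns a dict mapping each cluster id to its top_n words, ranked by
    (count in the cluster) * log(number of clusters / clusters using the word).
    """
    term_counts = {
        cid: Counter(w for text in texts for w in tokenize(text))
        for cid, texts in clusters.items()
    }
    n_clusters = len(term_counts)

    # In how many clusters does each word appear?
    cluster_freq = Counter()
    for counts in term_counts.values():
        cluster_freq.update(counts.keys())

    labels = {}
    for cid, counts in term_counts.items():
        scores = {
            w: c * math.log(n_clusters / cluster_freq[w])
            for w, c in counts.items()
        }
        # Highest score first; ties broken alphabetically for stable output.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        labels[cid] = ranked[:top_n]
    return labels
```

Words that appear in every cluster get a score of zero, so they only show up
as labels when nothing more specific is available.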
How are you going to apply the categories? E.g. leave a message on the talk page /
automatically apply them with a bot / an external web page where people can see
possible categories / or something integrated into MediaWiki?
I think a proof of concept on an external page is the better way to see
whether the results are good enough to be integrated into MediaWiki.
If it's something interactive, you could maybe produce a checkbox list of the top 10
possible categories and have the user tick all the categories that apply. Then, for the
categories that apply, you could expand them and show the subcategories, have the user
tick the ones that apply, and keep expanding the subcategories that apply until the
user has gotten as specific as they can.
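The tick-and-expand loop described above could be sketched roughly as follows
(Python). This is only an illustration: the subcategory tree is assumed to be a
plain dictionary, and choose is a hypothetical callback standing in for the
checkbox form, returning the options the user ticked.

```python
def drill_down(candidates, subcategories, choose):
    """Let the user refine a set of suggested categories level by level.

    candidates: list of category names offered first (e.g. the top 10).
    subcategories: dict mapping a category to a list of its subcategories.
    choose: callback taking a list of options and returning the ticked ones.
    Returns every ticked category, in the order it was ticked.
    """
    selected = []
    seen = set()
    frontier = list(candidates)
    while frontier:
        # Offer each category at most once, keeping the original order.
        options = list(dict.fromkeys(c for c in frontier if c not in seen))
        seen.update(options)
        ticked = [c for c in choose(options) if c in options] if options else []
        selected.extend(ticked)
        # Next round: the subcategories of whatever was just ticked.
        frontier = [
            child for cat in ticked for child in subcategories.get(cat, [])
        ]
    return selected
```

The loop stops as soon as the user ticks nothing or the ticked categories have
no subcategories left to show.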
Clustering is really an interactive process, but it needs quite a lot of
CPU time. I don't think I will produce an interactive result, at least
at the beginning.
The second idea is much simpler and easier to implement: when you
edit an article, you can suggest the categories of linked articles (this
can be replaced by another graph-exploration algorithm).
Can it maybe also suggest which stubs to use? I can never remember what the right
stub to use is (so I just use the standard "{{stub}}"), but there has got to be a
better way than having to remember the full list of stub types.
Yes, you can extend it to stubs with the same kind of processing.
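As a rough illustration of the link-based idea (categories of linked articles),
here is a small Python sketch. It assumes the link graph and the category
assignments are already available as in-memory dictionaries; suggest_categories
and its parameters are illustrative names, not part of MediaWiki.

```python
from collections import Counter


def suggest_categories(article, links, categories, top_n=10):
    """Suggest categories for an article from the articles it links to.

    links: dict mapping an article title to the titles it links to.
    categories: dict mapping an article title to its set of categories.
    Returns up to top_n categories the article does not have yet, ranked
    by how many linked articles carry them.
    """
    already = set(categories.get(article, ()))
    votes = Counter()
    for neighbour in links.get(article, ()):
        for cat in set(categories.get(neighbour, ())):
            if cat not in already:
                votes[cat] += 1
    # Most votes first; ties broken alphabetically for stable output.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [cat for cat, _ in ranked[:top_n]]
```

For stubs, the same function could in principle be fed a dictionary of stub
templates per article instead of categories.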
Best Regards.
Julien Lemoine