Doug from Wikisource started a page over at meta:
http://meta.wikimedia.org/wiki/Beyond_categories
I'll be trying to fill in some of my understanding of the problem and
the scope of a possible solution. I recognize there's been a lot of
prior art on this issue, and a lot of existing overlapping tools and
infrastructure, and I'm pretty new around here, and apt to be
inaccurate and naive. So I do hope others with more experience will
come and help sort it out.
Chris
On Sun, May 5, 2013 at 11:06 AM, Michael Hale <hale.michael.jr@live.com> wrote:
> As far as checking the import progress of Wikidata, the category American
> women writers has 1479 articles. 651 of them currently have a main type
> (GND), 328 have a sex, 162 have an occupation, 111 have a country of
> citizenship, 49 have a sexual orientation, 39 have a place of birth, etc.
>
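To put those counts in proportion, here is a rough Python sketch. The
counts are the ones quoted above; the function names are my own, and the
upper bound assumes that rebuilding the category from statements would
need sex, occupation and country of citizenship all present on an item.
The real overlap between those properties isn't given here, so the true
number could be lower.

```python
# Counts quoted above for the category "American women writers".
TOTAL_ARTICLES = 1479
PROPERTY_COUNTS = {
    "main type (GND)": 651,
    "sex": 328,
    "occupation": 162,
    "country of citizenship": 111,
    "sexual orientation": 49,
    "place of birth": 39,
}


def coverage_percent(count, total=TOTAL_ARTICLES):
    """Return the share of articles carrying a property, in percent."""
    return 100.0 * count / total


def upper_bound_queryable(counts, needed):
    """Upper bound on articles that could have every property in `needed`.

    An item can only match if it has all of them, so the rarest property
    caps the result. This is a bound, not the actual overlap.
    """
    return min(counts[name] for name in needed)


if __name__ == "__main__":
    for name, count in PROPERTY_COUNTS.items():
        print(f"{name:<24} {count:>4}  {coverage_percent(count):5.1f}%")

    # Assumption: these three properties stand in for the category.
    needed = ["sex", "occupation", "country of citizenship"]
    bound = upper_bound_queryable(PROPERTY_COUNTS, needed)
    print(f"at most {bound} of {TOTAL_ARTICLES} articles "
          f"({coverage_percent(bound):.1f}%)")
```

By that estimate, at most 111 of the 1479 articles (about 7.5%) could
currently be found by such a query.
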
>> From: jc@sahnwaldt.de
>> Date: Sun, 5 May 2013 16:28:14 +0200
>> To: wikidata-l@lists.wikimedia.org
>> Subject: Re: [Wikidata-l] Question about wikipedia categories.
>>
>> Hi Pat,
>>
>> I've been involved with DBpedia for several years, so these are
>> interesting thoughts.
>>
>> On 5 May 2013 01:25, Patrick Cassidy <pat@micra.com> wrote:
>> > If one is interested in a functional “category” system, it would be very
>> > helpful to have a good logic-based ontology as the backbone.
>> >
>> > I haven’t looked recently, but when I inquired about the ontology used
>> > by DBpedia a year ago, I was referred to “dbpedia-ontology.owl”, an
>> > ontology in the “semantic web” ontology format OWL. The OWL format is
>> > excellent for simple purposes, but the dbpedia-ontology.owl (at that
>> > time) was not well-structured (being very polite).
>>
>> Do you mean just the file dbpedia-ontology.owl or the DBpedia ontology
>> in general? We still use OWL as our main format for publishing the
>> ontology. The file is generated automatically. Maybe the generation
>> process could be improved.
>>
>> > I did inquire as to who was
>> > maintaining the ontology, and had a hard time figuring out how to help
>> > bring it up to professional standards. But it was like punching jello,
>> > nothing to grasp onto. I gave up, having other useful things to do with
>> > my time.
>>
>> The ontology is maintained by a community that everyone can join at
>> http://mappings.dbpedia.org/ . An overview of the current class
>> hierarchy is here:
>> http://mappings.dbpedia.org/server/ontology/classes/ . You're more
>> than welcome to help! I think talk pages are not used enough on the
>> mappings wiki, so if you have ideas, misgivings or questions about the
>> DBpedia ontology, the place to go is probably the mailing list:
>> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>>
>> Thanks!
>>
>> Christopher
>>
>> >
>> >
>> >
>> > Perhaps it is time now, with more experience in hand, to rethink the
>> > category system starting with basics. This is not as hard as it sounds.
>> > It may require some changes where there is ambiguity or logical
>> > inconsistency, but mostly it is only necessary to link the Wikipedia
>> > categories to an ontology based on a well-structured and logically sound
>> > foundation ontology (also referred to as an “upper ontology”) that
>> > supplies the basic categories and relations. Such an ontology can provide
>> > the basic concepts, whose labels can be translated into any terminology
>> > that any local user wants to use. There are several well-structured
>> > foundation ontologies, based on over twenty years of research, but the
>> > one I suggest is the one I am most familiar with (which I created over
>> > the past seven years), called COSMO. The files at http://micra.com/COSMO
>> > will provide the ontology itself (“COSMO.owl”, in OWL) and papers
>> > describing the basic principles. COSMO is structured to be a
>> > “primitives-based foundation ontology”, containing all of the “semantic
>> > primitives” needed to describe anything one wants to talk about. All
>> > other categories are structured as logical combinations of the basic
>> > elements. Its inventory of primitives is probably incomplete, but it is
>> > able to describe everything I have been concerned with for years (7000
>> > categories and 800 relations thus far) and can always be supplemented as
>> > required for new fields. With an OWL ontology, queries can be executed by
>> > any of several logic-based utilities. Making the query system easy for
>> > those who prefer not to build SPARQL queries (including myself) would
>> > require some programming, but that is a minuscule effort compared to what
>> > has already been put into the DBpedia database. Tools such as “Protégé”
>> > make it easy to work with an OWL ontology, and there is a web site where
>> > an OWL ontology can be developed collaboratively.
>> >
>> >
>> >
>> > I will be willing to put some effort into this and assist anyone who
>> > wants to use the COSMO ontology for this project. If those who are in
>> > charge of maintaining the ontology (is anyone?) would like to discuss
>> > this at greater length, send me an email or telephone me. All those who
>> > are interested in this topic may also feel free to contact me, or to
>> > discuss this thread on the list. I suggest the thread title “Foundation
>> > Ontology”.
>> >
>> >
>> >
>> > Pat
>> >
>> >
>> >
>> > Patrick Cassidy
>> >
>> > MICRA Inc.
>> >
>> > cassidy@micra.com
>> >
>> > 908-561-3416
>> >
>> >
>> >
>> > From: wikidata-l-bounces@lists.wikimedia.org
>> > [mailto:wikidata-l-bounces@lists.wikimedia.org] On Behalf Of Michael Hale
>> > Sent: Saturday, May 04, 2013 2:57 AM
>> > To: Discussion list for the Wikidata project.
>> >
>> >
>> > Subject: Re: [Wikidata-l] Question about wikipedia categories.
>> >
>> >
>> >
>> > I think it's important to consider the distinction between a category
>> > system and semantic queries. I think it's very likely that DBpedia and
>> > Wikidata will converge over time and develop a simple enough query
>> > interface that causes fewer people to use the category system because we
>> > will be able to automatically generate relevant queries related to a
>> > given article. DBpedia currently has a lot more data, but Wikidata is
>> > important for many editing scenarios. Also, in the future I think there
>> > will be a lot of content scenarios where it is natural to start by
>> > putting data into Wikidata and then including it in articles instead of
>> > just extracting information from articles. If you are familiar with query
>> > languages you can get comfortable with the DBpedia SPARQL examples in a
>> > few minutes, but for a typical reader that just wants to go from an
>> > article about a person to a list of similar people it is hard to beat
>> > scrolling down and just clicking on a category. I did a test query on
>> > DBpedia to plot all sports cars by their engine sizes, and I think for
>> > the types of things it enables you to do it is totally worth the learning
>> > curve. That being said, I think the category system has a lot of
>> > potential for better browsing scenarios as opposed to queries. I've been
>> > making a tool that mixes the article view data with the category system.
>> > You can see a video of the basic idea here and a screenshot of football
>> > league popularity split by language.
>> > http://en.wikipedia.org/wiki/User:Wakebrdkid/Popular_category_browsing
>> > I'm currently multiplying the Chinese traffic by 30 to try and account
>> > for Baidu Baike.
>> >
>> >> Date: Sat, 4 May 2013 08:14:54 +0200
>> >> From: jane023@gmail.com
>> >> To: wikidata-l@lists.wikimedia.org
>> >> Subject: Re: [Wikidata-l] Question about wikipedia categories.
>> >>
>> >> Wondering exactly the same thing - my frustrations with categories
>> >> began about three years ago and it seems I am surprised monthly by
>> >> severe limitations to this outdated apparatus. I am a heavy category
>> >> user, but I would love to be able to kick it out the door in favour of
>> >> a more structured method. As far as I can tell, there is very little
>> >> synchronisation among language Wikipedias of category trees, and being
>> >> able to apply a central structure to all Wikipedias through Wikidata
>> >> sounds like a great idea, and one which would not disturb the current
>> >> category trees we already have, but supplement them. As I see it, some
>> >> category structures are OK, but when categories get big, people split
>> >> them in non-standard ways, causing problems like this recent
>> >> media hype regarding female novelists. I think that it's great this
>> >> is in the news in this way, because I am sure that most Wikipedia
>> >> readers never knew we had categories, and this is a great introduction
>> >> to them, as well as an invitation to edit Wikipedia.
>> >>
>> >> 2013/5/4, Chris Maloney <voldrani@gmail.com>:
>> >> > I am just curious if there has ever been discussion about the
>> >> > potential for reimplementing / replacing the category system in
>> >> > Wikipedia with semantic tagging in Wikidata. It seems to me that the
>> >> > recent kerfuffle with regard to "American women writers" would not
>> >> > have happened if the pages were tagged with simple RDF assertions
>> >> > instead of these convoluted categories. I know, of course, that it
>> >> > would be a huge undertaking, but I just don't see how the category
>> >> > system can continue to scale (I'm amazed it has scaled as well as it
>> >> > has already, of course).
>> >> >
>> >> > I am trying to learn more about Wikidata, and have perused the
>> >> > various infos and FAQs for the last two hours, and can't find any
>> >> > discussion of this particular issue.
>> >> >
>> >> > -- Chris
>> >> >
>> >> > _______________________________________________
>> >> > Wikidata-l mailing list
>> >> > Wikidata-l@lists.wikimedia.org
>> >> > https://lists.wikimedia.org/mailman/listinfo/wikidata-l