I just wanted to throw in a few more thoughts after looking at the Commons and Meta pages about improving categories. Wikidata does have items that correspond to category pages, for example http://www.wikidata.org/wiki/Q7217075. These store translations, and we can use them to add any extra relations between categories that we need.

I agree that Semantic MediaWiki seems most useful for smaller, specialized wikis that have fewer anonymous users and don't need synchronized updates across many pages of the kind that templates alone can't handle.

If I were going to play around with adding filters to categories, I would want something like a category filter template that can be added to category pages to show a dropdown box (with typing support) for a given property. I would automatically generate the initial set of filter properties for a category by taking all of the properties for which the items in the category neither all share the same value nor all have different values. (Obviously we wouldn't want to filter a category of people by their main type, because they are all people, and we wouldn't want to filter by their parents, because almost all of them have different parents, and they all have different LCCN and VIAF IDs.) Then I would take the 5 most popular of those properties among the items to use as filters. We could add or remove filters manually from there, and we could start merging categories that are handled by the filters.
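The property selection described above could be sketched roughly as follows. This is only an illustration under some assumptions: each item is a plain dict with one hashable value per property, and the function name, the sample item IDs, and the sample values are all made up for the example. Real Wikidata items can have several values per property, which this sketch ignores.

```python
from collections import Counter


def suggest_filter_properties(items, max_filters=5):
    """Pick candidate filter properties for a category.

    items: dict mapping an item id to a dict of {property: value}.
    A property qualifies only if its values are neither all the same
    nor all different across the items that have it.
    """
    usage = Counter()   # how many items carry each property
    values = {}         # property -> list of values seen
    for claims in items.values():
        for prop, value in claims.items():
            usage[prop] += 1
            values.setdefault(prop, []).append(value)

    candidates = []
    for prop, vals in values.items():
        distinct = len(set(vals))
        # distinct == 1: all the same; distinct == len(vals): all different
        if 1 < distinct < len(vals):
            candidates.append(prop)

    # Most popular first; ties broken alphabetically for a stable result.
    candidates.sort(key=lambda p: (-usage[p], p))
    return candidates[:max_filters]


# Made-up sample category of four people.
sample_items = {
    "Q1": {"main type": "person", "sex": "female", "occupation": "novelist", "VIAF": "1"},
    "Q2": {"main type": "person", "sex": "female", "occupation": "poet", "VIAF": "2"},
    "Q3": {"main type": "person", "sex": "male", "occupation": "poet", "VIAF": "3"},
    "Q4": {"main type": "person", "occupation": "poet"},
}

print(suggest_filter_properties(sample_items))
```

In the sample, "main type" is dropped because every item has the same value and "VIAF" is dropped because every value is unique, which leaves "occupation" and "sex" as suggested filters.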
Date: Sun, 5 May 2013 14:45:05 -0400
From: emw.wiki@gmail.com
To: wikidata-l@lists.wikimedia.org
Subject: Re: [Wikidata-l] Question about wikipedia categories.
There's a related essay on Wikimedia Commons: http://commons.wikimedia.org/wiki/User:Multichill/Next_generation_categories.
The Wikidata properties instance of (formerly "is a") and subclass of are likely relevant to folks interested in ontology building on Wikidata. They're based on rdf:type and rdfs:subClassOf from W3C recommendations, and allow for building a rooted DAG (directed acyclic graph) that places concepts into a hierarchy of knowledge. They also allow for a degree of type-token distinction when classifying subjects, though how that applies to certain knowledge domains hasn't been fully sussed out.
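To make the hierarchy idea concrete, here is a small sketch of how instance of and subclass of statements might be combined to answer "is this item, directly or indirectly, an instance of that class?". The dictionaries, the labels in them, and the function name are toy assumptions for illustration, not actual Wikidata data or an existing API.

```python
def is_instance_of(item, target_class, instance_of, subclass_of):
    """Return True if item is an instance of target_class, either
    directly or through a chain of subclass-of links."""
    stack = list(instance_of.get(item, ()))
    seen = set()
    while stack:
        cls = stack.pop()
        if cls == target_class:
            return True
        if cls in seen:
            continue  # a class can be reached by several paths in a DAG
        seen.add(cls)
        stack.extend(subclass_of.get(cls, ()))
    return False


# Toy statements (hypothetical labels, not real item IDs).
subclass_of = {
    "mountain": ["landform"],
    "landform": ["geographical feature"],
    "river": ["geographical feature"],
}
instance_of = {"Mount Everest": ["mountain"]}

print(is_instance_of("Mount Everest", "geographical feature", instance_of, subclass_of))
print(is_instance_of("Mount Everest", "river", instance_of, subclass_of))
```

The token ("Mount Everest") only appears in the instance-of mapping and the types only in the subclass-of mapping, which is one simple way to keep the type-token distinction visible in the data.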
On Sun, May 5, 2013 at 2:17 PM, Chris Maloney voldrani@gmail.com wrote:
Doug from WikiSource started a page over at meta:
http://meta.wikimedia.org/wiki/Beyond_categories
I'll be trying to fill in some of my understanding of the problem and
the scope of a possible solution. I recognize there's been a lot of
prior art on this issue, and a lot of existing overlapping tools and
infrastructure, and I'm pretty new around here, and apt to be
inaccurate and naive. So I do hope others with more experience will
come and help sort it out.
Chris
On Sun, May 5, 2013 at 11:06 AM, Michael Hale hale.michael.jr@live.com wrote:
As far as checking the import progress of Wikidata, the category American
women writers has 1479 articles. 651 of them currently have a main type
(GND), 328 have a sex, 162 have an occupation, 111 have a country of
citizenship, 49 have a sexual orientation, 39 have a place of birth, etc.
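As a rough way to read those numbers, the counts can be turned into coverage percentages of the category. The figures below are the ones quoted in this message; the variable names are only for illustration.

```python
# Counts quoted above for the category "American women writers".
category_size = 1479
items_with_property = {
    "main type (GND)": 651,
    "sex": 328,
    "occupation": 162,
    "country of citizenship": 111,
    "sexual orientation": 49,
    "place of birth": 39,
}

# Share of the category's articles that carry each property, in percent.
coverage = {
    prop: round(100 * count / category_size, 1)
    for prop, count in items_with_property.items()
}

for prop, percent in coverage.items():
    print(f"{prop}: {percent}%")
```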
From: jc@sahnwaldt.de
Date: Sun, 5 May 2013 16:28:14 +0200
To: wikidata-l@lists.wikimedia.org
Subject: Re: [Wikidata-l] Question about wikipedia categories.
Hi Pat,
I've been involved with DBpedia for several years, so these are
interesting thoughts.
On 5 May 2013 01:25, Patrick Cassidy pat@micra.com wrote:
If one is interested in a functional “category” system, it would be very
helpful to have a good logic-based ontology as the backbone.
I haven’t looked recently, but when I inquired about the ontology used by
DBpedia a year ago, I was referred to “dbpedia-ontology.owl”, an ontology in
the “semantic web” ontology format OWL. The OWL format is excellent for
simple purposes, but the dbpedia-ontology.owl (at that time) was not
well-structured (being very polite).
Do you mean just the file dbpedia-ontology.owl or the DBpedia ontology
in general? We still use OWL as our main format for publishing the
ontology. The file is generated automatically. Maybe the generation
process could be improved.
I did inquire as to who was maintaining the ontology, and had a hard time
figuring out how to help bring it up to professional standards. But it was
like punching jello, nothing to grasp onto. I gave up, having other useful
things to do with my time.
The ontology is maintained by a community that everyone can join at
http://mappings.dbpedia.org/ . An overview of the current class
hierarchy is here:
http://mappings.dbpedia.org/server/ontology/classes/ . You're more
than welcome to help! I think talk pages are not used enough on the
mappings wiki, so if you have ideas, misgivings or questions about the
DBpedia ontology, the place to go is probably the mailing list:
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
Thanks!
Christopher
Perhaps it is time now, with more experience in hand, to rethink the
category system starting with basics. This is not as hard as it sounds.
It may require some changes where there is ambiguity or logical
inconsistency, but mostly it is only necessary to link the Wikipedia
categories to an ontology based on a well-structured and logically sound
foundation ontology (also referred to as an “upper ontology”) that supplies
the basic categories and relations. Such an ontology can provide the basic
concepts, whose labels can be translated into any terminology that any local
user wants to use. There are several well-structured foundation ontologies,
based on over twenty years of research, but the one I suggest is the one I
am most familiar with (which I created over the past seven years), called
COSMO. The files at http://micra.com/COSMO will provide the ontology itself
(“COSMO.owl”, in OWL) and papers describing the basic principles. COSMO
is structured to be a “primitives-based foundation ontology”, containing all
of the “semantic primitives” needed to describe anything one wants to talk
about. All other categories are structured as logical combinations of the
basic elements. Its inventory of primitives is probably incomplete, but it is
able to describe everything I have been concerned with for years (7000
categories and 800 relations thus far) and can always be supplemented as
required for new fields. With an OWL ontology, queries can be executed by
any of several logic-based utilities. Making the query system easy for
those who prefer not to build SPARQL queries (including myself) would
require some programming, but that is a minuscule effort compared to what
has already been put into the DBpedia database. Tools such as “Protégé”
make it easy to work with an OWL ontology, and there is a web site where an
OWL ontology can be developed collaboratively.
I will be willing to put some effort into this and assist anyone who wants
to use the COSMO ontology for this project. If those who are in charge of
maintaining the ontology (is anyone?) would like to discuss this at greater
length, send me an email or telephone me. All those who are interested in
this topic may also feel free to contact me, or to discuss this thread on
the list. I suggest the thread title “Foundation Ontology”.
Pat
Patrick Cassidy
MICRA Inc.
cassidy@micra.com
908-561-3416
From: wikidata-l-bounces@lists.wikimedia.org
[mailto:wikidata-l-bounces@lists.wikimedia.org] On Behalf Of Michael Hale
Sent: Saturday, May 04, 2013 2:57 AM
To: Discussion list for the Wikidata project.
Subject: Re: [Wikidata-l] Question about wikipedia categories.
I think it's important to consider the distinction between a category system
and semantic queries. I think it's very likely that DBpedia and Wikidata
will converge over time and develop a query interface simple enough that
fewer people use the category system, because we will be able to
automatically generate relevant queries related to a given article. DBpedia
currently has a lot more data, but Wikidata is important for many editing
scenarios. Also, in the future I think there will be a lot of content
scenarios where it is natural to start by putting data into Wikidata and
then including it in articles instead of just extracting information from
articles. If you are familiar with query languages you can get comfortable
with the DBpedia SPARQL examples in a few minutes, but for a typical reader
who just wants to go from an article about a person to a list of similar
people it is hard to beat scrolling down and just clicking on a category. I
did a test query on DBpedia to plot all sports cars by their engine sizes,
and I think for the types of things it enables you to do it is totally worth
the learning curve. That being said, I think the category system has a lot
of potential for better browsing scenarios as opposed to queries. I've been
making a tool that mixes the article view data with the category system. You
can see a video of the basic idea and a screenshot of football league
popularity split by language here:
http://en.wikipedia.org/wiki/User:Wakebrdkid/Popular_category_browsing
I'm currently multiplying the Chinese traffic by 30 to try and account for
Baidu Baike.
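The traffic adjustment mentioned above amounts to a per-language multiplier applied before comparing popularity. A minimal sketch, assuming view counts are keyed by language code; the function name and the sample counts are invented for illustration, and only the factor of 30 for Chinese comes from the message.

```python
def weighted_views(views_by_language, multipliers=None):
    """Scale raw page views per language edition.

    Languages without an explicit multiplier keep their raw count.
    """
    if multipliers is None:
        multipliers = {"zh": 30}  # factor mentioned above for Chinese traffic
    return {
        lang: views * multipliers.get(lang, 1)
        for lang, views in views_by_language.items()
    }


# Invented sample counts, not real traffic data.
raw = {"en": 1000, "de": 400, "zh": 50}
print(weighted_views(raw))
```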
Date: Sat, 4 May 2013 08:14:54 +0200
From: jane023@gmail.com
To: wikidata-l@lists.wikimedia.org
Subject: Re: [Wikidata-l] Question about wikipedia categories.
Wondering exactly the same thing - my frustrations with categories
began about three years ago and it seems I am surprised monthly by
severe limitations to this outdated apparatus. I am a heavy category
user, but I would love to be able to kick it out the door in favour of
a more structured method. As far as I can tell, there is very little
synchronisation among language Wikipedias of category trees, and being
able to apply a central structure to all Wikipedias through Wikidata
sounds like a great idea, and one which would not disturb the current
category trees we already have, but supplement them. As I see it, some
category structures are OK, but when categories get big, people split
them in non-standard ways, causing problems like this recent
media-hype regarding female novelists. I think that it's great this
is in the news in this way, because I am sure that most Wikipedia
readers never knew we had categories, and this is a great introduction
to them, as well as an invitation to edit Wikipedia.
2013/5/4, Chris Maloney voldrani@gmail.com:
I am just curious if there has ever been discussion about the
potential for reimplementing / replacing the category system in
Wikipedia with semantic tagging in Wikidata. It seems to me that the
recent kerfuffle with regards to "American women writers" would not
have happened if the pages were tagged with simple RDF assertions
instead of these convoluted categories. I know, of course, that it
would be a huge undertaking, but I just don't see how the category
system can continue to scale (I'm amazed it has scaled as well as it
has already, of course).
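One way to picture the difference: with simple per-page assertions, a compound category such as "American women writers" becomes a query over independent properties rather than a separately maintained list. The sketch below is only an illustration; the items, property names, and function name are hypothetical, and plain dicts stand in for RDF statements.

```python
def matching_items(statements, required):
    """Return the items whose statements include every required
    property/value pair, sorted by name."""
    return sorted(
        item
        for item, props in statements.items()
        if all(props.get(prop) == value for prop, value in required.items())
    )


# Hypothetical items with one assertion per property.
statements = {
    "Author A": {"country of citizenship": "United States", "sex": "female", "occupation": "writer"},
    "Author B": {"country of citizenship": "United States", "sex": "male", "occupation": "writer"},
    "Author C": {"country of citizenship": "Canada", "sex": "female", "occupation": "writer"},
}

american_writers = matching_items(
    statements, {"country of citizenship": "United States", "occupation": "writer"}
)
american_women_writers = matching_items(
    statements,
    {"country of citizenship": "United States", "sex": "female", "occupation": "writer"},
)
print(american_writers)
print(american_women_writers)
```

Because the narrower query is computed from the same assertions, nobody is moved out of the broader set when a more specific one is requested.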
I am trying to learn more about Wikidata, and have perused the various
infos and FAQs for the last two hours, and can't find any discussion
of this particular issue.
-- Chris
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
_______________________________________________
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l