Thanks Andrew, but I know and see the pitfalls.
The use case is only 1 level deep, and the result is to be treated not as a
Category but instead as simply something "Related" or "Relative" to the topic.
Keeping within those 2 constraints seems, from my tests, to work reasonably
well. Beyond those constraints... I agree, it's the wild west and not
something that we are interested in.
On Mon, Oct 10, 2016 at 5:20 AM Andrew Gray <andrew.gray(a)dunelm.org.uk>
wrote:
Hi Thad,
One quick red flag - I'm not sure how familiar you are with the
category system, but automatically parsing it without sanity-checking
can very quickly lead you into a minefield. There are a substantial
number of category trees which seem reasonable at first, but rapidly
go in unexpected directions.
For example, if you start at "Category:Road transport" on enwiki and
go two categories deep, you get "Category:Parking facilities" - so
far, so good - but you also get "Category:The Hitchhiker's Guide to
the Galaxy", "Category:Songs about buses", and "Category:Cycling
journalists".
There are a few pairs of categories which contain each other directly,
and a much larger number which contain each other once you go a few
levels of subcategories deep & so go in endless loops. And, of course,
a clean category tree in French might be a mess in German, or vice
versa.
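A minimal sketch of a guard against both problems above (unbounded depth and categories that contain each other), assuming the caller supplies a `get_subcats` function that returns the direct subcategories of a category. The names `related_categories`, `get_subcats`, and `max_depth` are hypothetical and not part of any Wikimedia API:

```python
from collections import deque
from typing import Callable, Dict, Iterable


def related_categories(
    root: str,
    get_subcats: Callable[[str], Iterable[str]],
    max_depth: int = 1,
) -> Dict[str, int]:
    """Breadth-first walk of subcategories, limited to max_depth levels.

    Returns a mapping of category title -> depth at which it was first seen.
    Categories already seen are never expanded again, so pairs of categories
    that contain each other cannot cause an endless loop.
    """
    seen: Dict[str, int] = {root: 0}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        depth = seen[current]
        if depth >= max_depth:
            # Depth limit reached: do not expand this category further.
            continue
        for child in get_subcats(current):
            if child not in seen:
                seen[child] = depth + 1
                queue.append(child)
    del seen[root]  # the starting category is not "related" to itself
    return seen
```

With `max_depth=1` this only ever returns direct subcategories; raising the limit still terminates, because each category is expanded at most once.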
None of this is to say "don't do it", but rather "don't expect it to
be clean and tidy" :-)
Andrew.
On 7 October 2016 at 02:08, Thad Guidry <thadguidry(a)gmail.com> wrote:
Thanks Stas,
So, hmm... we'd have to build our own parser or something, is what you're
saying?
Because Wikidata doesn't have those kinds of connections in its graph and
also doesn't have a SPARQL service yet against the Wikipedia
API:Categorymembers (https://www.mediawiki.org/wiki/API:Categorymembers) to
deduce those subcategories, right?
How difficult is it for someone to create a service like the LabelService,
but instead using the WP Categorymembers API?
Or do you have some other ideas?
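For reference, a rough sketch of what a client-side lookup against API:Categorymembers could look like, assuming the English Wikipedia endpoint and the standard `list=categorymembers` parameters (`cmtitle`, `cmtype`, `cmlimit`, `cmcontinue`). The helper names `subcat_query_url` and `parse_subcats` are hypothetical, and the actual HTTP request is left to the caller:

```python
from urllib.parse import urlencode

# Assumption: English Wikipedia; other wikis use their own hostname.
API_URL = "https://en.wikipedia.org/w/api.php"


def subcat_query_url(category, cmcontinue=None):
    """Build a Categorymembers request URL for direct subcategories only."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,   # e.g. "Category:Road transport"
        "cmtype": "subcat",    # subcategories only, no pages or files
        "cmlimit": "500",
        "format": "json",
    }
    if cmcontinue:
        # Continuation token taken from the previous response, if any.
        params["cmcontinue"] = cmcontinue
    return API_URL + "?" + urlencode(params)


def parse_subcats(payload):
    """Extract subcategory titles and the continuation token from a decoded
    JSON response. The token is None when there are no further results."""
    members = payload.get("query", {}).get("categorymembers", [])
    titles = [member["title"] for member in members]
    next_token = payload.get("continue", {}).get("cmcontinue")
    return titles, next_token
```

This only covers one level of one category on one wiki; anything like a SPARQL-facing service would still need to be built around it.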
On Thu, Oct 6, 2016 at 6:10 PM Stas Malyshev <smalyshev(a)wikimedia.org>
wrote:
Hi!
Can it all be done in SPARQL against some services that already expose
WP subcategories given a specific category? Or is there an API that
does this already? Other tools that might expose WP categories?
I don't think the subcategory relationship is recorded in Wikidata. E.g.
https://www.wikidata.org/wiki/Q7361750 contains
https://www.wikidata.org/wiki/Q14436424 but neither has any indication
of that.
The problem, I guess, is that the category hierarchy is different on all
wikis, so it's hard to have one property that expresses it on Wikidata.
You could do it through "subclass of" and "category's main topic", but I'm
not sure that'd capture all. E.g. http://tinyurl.com/h7qpcdn, but that only
captures one subcategory, since other items don't have the same
hierarchy in Wikidata.
--
Stas Malyshev
smalyshev(a)wikimedia.org
_______________________________________________
Wikidata mailing list
Wikidata(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
--
- Andrew Gray
andrew.gray(a)dunelm.org.uk