Great tool! The error detection is invaluable!

2015-10-22 17:31 GMT+02:00 Markus Kroetzsch <markus.kroetzsch@tu-dresden.de>:
Hi all,

I am happy to announce a new tool [1], written by Serge Stratan, which allows you to browse the taxonomy ("subclass of" and "instance of" relations) among Wikidata's most important class items. For example, here is the Wikidata taxonomy for Pizza (discussed recently on this list):

http://sergestratan.bitbucket.org?draw=true&optid=s0&item=177,2095,7802,28877,35120,223557,386724,488383,666242,736427,746549,2424752,15222213,16686448


== What you see there ==

Solid green lines mean "subclass of" relations (subclasses are lower), while dashed purple lines are "instance of" relations (instances are lower). Drag and zoom the view as usual. Hover over items for more information. Click on arrows with numbers to display upper or lower neighbours. Right-click on classes to get more options.

The sidebar on the left shows statistics and presumed problems in the data (redundancies and likely errors). You can select a report type to see the reports, and click on any line to show the error. If you search for a class in the search field, the errors will be narrowed down to issues related to the taxonomy of this class.

The toolbar at the top has options to show and hide items based on the current selection (left click on any box).

Edges in red are the wrong way around (top to bottom). This occurs only when there are cycles in the "taxonomy".
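
To illustrate what a cycle in the "taxonomy" means, here is a minimal sketch of how one could look for cycles in "subclass of" data. It is not the tool's implementation (that is not described here); the function name find_cycle, the dictionary layout, and the toy class names are assumptions made up for this example.

```python
def find_cycle(superclasses):
    """Return one cycle as a list of classes, or None if there is none.

    `superclasses` maps each class to the classes it is a direct
    "subclass of" (a hypothetical in-memory layout, not the tool's format).
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    state = {}
    path = []

    def visit(node):
        state[node] = GREY
        path.append(node)
        for parent in superclasses.get(node, ()):
            colour = state.get(parent, WHITE)
            if colour == GREY:
                # parent is already on the current path: we closed a loop
                return path[path.index(parent):] + [parent]
            if colour == WHITE:
                found = visit(parent)
                if found:
                    return found
        path.pop()
        state[node] = BLACK
        return None

    for node in list(superclasses):
        if state.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Toy data, invented for illustration only.
toy = {"A": ["B"], "B": ["C"], "C": ["A"]}
print(find_cycle(toy))  # ['A', 'B', 'C', 'A']
```

The recursion keeps the sketch short; for a graph the size of the full Wikidata taxonomy an iterative traversal would be the safer choice.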


== Micro tutorial ==

(1) Enter "Unicorn" in the search box, press return.
(2) Zoom out a bit by scrolling your mouse/touchpad.
(3) Click on the "Unicorn" item box. It becomes blue (selected).
(4) Click "Expand up" in the toolbar at the top.
(5) Zoom out to see the taxonomy of "Unicorn".
(6) Find the class "Fictional Horse" (directly above "Unicorn") and click its downward arrow labelled "3" to see all three child items of "Fictional Horse".
(7) Click the share button on the top right to get a link to this view.

You can also create your own share link manually by just changing the Qids in the URL as you like.
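
As a rough sketch, such a link could be assembled like this. The parameters draw=true and optid=s0 are simply copied from the Pizza example URL above; their meaning is not documented here, so treat them as assumptions. The helper name build_share_url is made up for this example.

```python
BASE_URL = "http://sergestratan.bitbucket.org"


def build_share_url(qids):
    """Build a share link from Wikidata item ids such as "Q177" or 177.

    The query parameters mirror the example URL in this mail; whether
    other values are valid is an assumption left untested here.
    """
    numbers = []
    for qid in qids:
        text = str(qid).strip().upper()
        if text.startswith("Q"):
            text = text[1:]          # the URL uses bare numbers, no "Q"
        if not (text.isascii() and text.isdigit()):
            raise ValueError(f"not a valid Q-id: {qid!r}")
        numbers.append(str(int(text)))
    return f"{BASE_URL}?draw=true&optid=s0&item={','.join(numbers)}"


# First two ids taken from the Pizza example URL.
print(build_share_url(["Q177", 2095]))
# http://sergestratan.bitbucket.org?draw=true&optid=s0&item=177,2095
```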


== Status and limitations ==

This is a prototype and it still has some limits:

* It only shows "proper" classes that have at least one instance or subclass. This is to reduce the overall data size and load time.
* The data is based on dumps (the date is shown on the right). It is not a live view.
* The layout is sometimes too dense. You can find a "hidden" option to make it more spacious behind the sidebar (click "Sidebar" to see it). This helps to disentangle larger graphs.
* There are some minor bugs in the UI. You sometimes need to click more than once until the right thing happens.
* The help page at http://sergestratan.bitbucket.org/howtouse.html does not explain everything in detail yet.

It is planned to work on some of these limitations in the future.

The hope is that this tool will reveal many errors in Wikidata's taxonomy that are otherwise hard to detect. For example, you can easily see that every "Ship" is an "Event" in Wikidata, that every "Hobbit" is a "Fantasy Race", and that every "Monday" is both a "Mathematical object" and a "Unit of measurement".
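
Statements like these follow from chaining "subclass of" relations upwards. Here is a minimal sketch of that reasoning, assuming the relations are available as a plain dictionary. The function name all_superclasses and the toy data (including the intermediate class) are invented for illustration and do not reproduce the actual chain in Wikidata.

```python
from collections import deque


def all_superclasses(cls, subclass_of):
    """Collect every class reachable from `cls` via "subclass of" edges.

    `subclass_of` maps a class to its direct superclasses (a hypothetical
    layout). The `seen` set also keeps the search finite if the data
    contains cycles.
    """
    seen = set()
    queue = deque([cls])
    while queue:
        current = queue.popleft()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


# Toy data: the intermediate class is a placeholder, not real Wikidata content.
toy = {
    "ship": ["some intermediate class"],
    "some intermediate class": ["event"],
}
print("event" in all_superclasses("ship", toy))  # True
```

Any instance of a class is then implicitly an instance of every class in the returned set, which is how a single questionable "subclass of" edge can affect a whole subtree.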

Feedback on the tool is welcome (for feedback on the Wikidata taxonomy itself, better start a new thread ;-),

Markus


[1] http://sergestratan.bitbucket.org

--
Markus Kroetzsch
Faculty of Computer Science
Technische Universität Dresden
+49 351 463 38486
http://korrekt.org/

_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata