Follow-up: I have now found all of the extra information on the WikiProject Ontology page, https://www.wikidata.org/wiki/Wikidata:WikiProject_Ontology, and will look into contributing there.

On Sat, Jun 15, 2019 at 5:10 PM Gabriel Altay <gabriel.altay@gmail.com> wrote:
Hello everyone, 

I was playing around with a recent Wikidata dump and extracted the items that "looked" like classes based on the definition here:

https://www.wikidata.org/wiki/Wikidata:WikiProject_Ontology/Classes

Specifically, an item is a class-item if any of the following is true:
  * the item is the value of a P31 ("instance of") statement
  * the item has a P279 ("subclass of") statement (i.e. it is a subclass)
  * the item is the value of a P279 ("subclass of") statement (i.e. it is a superclass)
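
The three criteria above can be sketched in Python roughly as follows. This is only an illustration, not the script I actually ran: the function names are made up, and it assumes each entity has already been parsed into a dict in the standard Wikidata JSON dump layout ("claims" keyed by property ID, with item values carrying an "id" field).

```python
from typing import Iterable, Iterator, Set


def item_values(entity: dict, prop: str) -> Iterator[str]:
    """Yield the item IDs used as values of `prop` statements on `entity`."""
    for statement in entity.get("claims", {}).get(prop, []):
        snak = statement.get("mainsnak", {})
        if snak.get("snaktype") != "value":
            continue  # "somevalue" / "novalue" snaks carry no item ID
        value = snak.get("datavalue", {}).get("value")
        # Assumption: item values are dicts with an "id" key such as "Q5".
        if isinstance(value, dict) and "id" in value:
            yield value["id"]


def find_class_items(entities: Iterable[dict]) -> Set[str]:
    """Return the IDs of all items that meet any of the three criteria."""
    classes: Set[str] = set()
    for entity in entities:
        # Criterion 1: the item is the value of a P31 statement.
        classes.update(item_values(entity, "P31"))
        # Criterion 2: the item itself has a P279 statement.
        if entity.get("claims", {}).get("P279"):
            classes.add(entity["id"])
        # Criterion 3: the item is the value of a P279 statement.
        classes.update(item_values(entity, "P279"))
    return classes
```

Since criteria 1 and 3 depend on statements made on other items, the result is only complete after a full pass over the dump.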

Once I had extracted all items that met these criteria (2,399,621 items from wikidata-20190603-all.json.bz2), I started examining the results. One thing I found slightly surprising is that about 23k badminton events count as classes because they have "subclass of https://www.wikidata.org/wiki/Q13357858" statements. The SPARQL query is below.

https://query.wikidata.org/#SELECT%20%3Fitem%20%3FitemLabel%20%0AWHERE%20%0A%7B%0A%20%20%3Fitem%20wdt%3AP31%20wd%3AQ57733494.%0A%20%20%3Fitem%20wdt%3AP279%20wd%3AQ13357858.%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A%7D
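
For anyone who would rather read the query than the encoded link, here is a small Python sketch that decodes it and builds a request URL for the public SPARQL endpoint. The encoded string is copied from the link above; the helper name build_request_url is just something I made up for illustration, and the endpoint URL and the format=json parameter are my assumption about how one would call the query service from a script.

```python
from urllib.parse import unquote, urlencode

# The part of the query.wikidata.org link after the "#", copied verbatim.
ENCODED_QUERY = (
    "SELECT%20%3Fitem%20%3FitemLabel%20%0AWHERE%20%0A%7B%0A"
    "%20%20%3Fitem%20wdt%3AP31%20wd%3AQ57733494.%0A"
    "%20%20%3Fitem%20wdt%3AP279%20wd%3AQ13357858.%0A"
    "%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20"
    "wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A%7D"
)

# Percent-decoding gives back the human-readable SPARQL text.
query = unquote(ENCODED_QUERY)


def build_request_url(sparql: str) -> str:
    """Build a GET URL for the query service (endpoint assumed, see above)."""
    params = urlencode({"query": sparql, "format": "json"})
    return "https://query.wikidata.org/sparql?" + params


if __name__ == "__main__":
    print(query)
    print(build_request_url(query))
```

The decoded query selects items that have both a P31 statement with value Q57733494 and a P279 statement with value Q13357858.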

It also looks like there is a badminton project page, 
https://www.wikidata.org/wiki/Category:WikiProject_Badminton
https://www.wikidata.org/wiki/Wikidata:WikiProject_Badminton/Subclass


I'd like to remove these statements, since a particular instance of a badminton tournament, such as
https://www.wikidata.org/wiki/Q121940
is not a class.

It seems that the same pattern is in place for about 1,000,000 items that are instances of gene (e.g. https://www.wikidata.org/wiki/Q40108).

I have a couple of questions for the mailing list:

 1) Do folks know if there is an active group working on the Wikidata ontology?
 2) I've read a few messages about shape expressions. Would it be worthwhile to set up a shape expression that prevents most items from having both "instance of" and "subclass of" statements?
 3) If these entries are generated by bots, what is the best way to get in touch with the owner? Their user talk page?

I am probably missing a lot of information about what has been done so far in the community, but I'm happy to read anything someone points me towards. 

best, 
-Gabriel