Hello Wikidata folks,

I would like to bring to your attention an open-source dataset I've been developing called the Kensho Derived Wikimedia Dataset (KDWD). It's a cleaned English subset of Wikipedia/Wikidata with 2.3B tokens, 5.3M pages, 51M nodes, and 120M edges. More details are available here: https://blog.kensho.com/announcing-the-kensho-derived-wikimedia-dataset-5d1197d72bcf

Best,
-Gabriel