Forwarding to the Discovery list, since this project seems like it might be of interest even outside the Wikidata context. Blame me if you've already seen this elsewhere. :)
TL;DR: StrepHit is an intelligent reading agent that understands text and translates it into *referenced* Wikidata statements.
It is an IEG project funded by the Wikimedia Foundation.
Key features:
-Web spiders to harvest a collection of documents (corpus) from reliable sources
-Automatic corpus analysis to identify the most meaningful verbs
-Extraction of sentences and semi-structured data
-Training of a machine learning classifier via crowdsourcing
-*Supervised and rule-based fact extraction from text*
-Natural Language Processing utilities
-Parallel processing