Hi Ben,
On 6/15/16 18:24, Benjamin Good wrote:
Hi Marco,
Where might we find some statistics on the current accuracy of the
automated claim and reference extractors? I assume that information
must be in there somewhere, but I had trouble finding it.
The StrepHit pipeline (codebase) is ready, but the project is still ongoing.
We are not there yet, and will publish performance values in the final
report.
This is a very ambitious project covering a very large technical
territory (which I applaud). It would be great if your results could be
synthesized a bit more clearly so we can understand where the
weak/strong points are and where we might be able to help improve or
make use of what you have done in other domains.
Sure, this will be done in the final report.
In the meantime, you can have a look at the midpoint report summary:
https://meta.wikimedia.org/wiki/Grants:IEG/StrepHit:_Wikidata_Statements_Va…
Best,
Marco
-Ben
On Wed, Jun 15, 2016 at 9:06 AM, Marco Fossati <fossati(a)spaziodati.eu> wrote:
[Feel free to blame me if you read this more than once]
To whom it may concern,
Full of delight, I would like to announce the first beta release of
*StrepHit*:
https://github.com/Wikidata/StrepHit
TL;DR: StrepHit is an intelligent reading agent that understands
text and translates it into *referenced* Wikidata statements.
It is an IEG project funded by the Wikimedia Foundation.
Key features:
- Web spiders to harvest a collection of documents (corpus) from
  reliable sources
- automatic corpus analysis to identify the most meaningful verbs
- sentence and semi-structured data extraction
- training of a machine learning classifier via crowdsourcing
- *supervised and rule-based fact extraction from text*
- Natural Language Processing utilities
- parallel processing
You can find all the details here:
https://meta.wikimedia.org/wiki/Grants:IEG/StrepHit:_Wikidata_Statements_Va…
https://meta.wikimedia.org/wiki/Grants:IEG/StrepHit:_Wikidata_Statements_Va…
If you like it, star it on GitHub!
Best,
Marco
_______________________________________________
Wikidata mailing list
Wikidata(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata