Hoi,
There are two parts to your question:
- When is the available data good enough?
- When is the script that does the text generation good enough?
The second is something that you can help with. When the script is good enough for one really well-developed item like J.S. Bach, it is good enough for most similar items.
The first is something that really proves that Wikidata is a wiki. It is easy to find fault; I do it regularly and I blog about it. The point is not that I should add every missing "prime minister of the United Kingdom" statement myself; it is that I show how we know that these things are problematic. You can help, competent bot operators can help, and cooperation with DBpedia may help.
What we need is to invest in comparing the data of a Wikipedia with the data of Wikidata, any Wikipedia. At this time such reports will overwhelmingly show where Wikidata does NOT have data, so much so that it is a bad idea to report it. It is better to just add the missing data and to report only the discrepancies, where the data fails to match.
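To make the distinction concrete, here is a minimal sketch of such a comparison. It assumes the values for one item have already been extracted from a Wikipedia article and from Wikidata into plain dictionaries; the function name, the property names and the sample values are all made up for illustration and do not come from any existing tool.

```python
def compare_item(wikipedia_data, wikidata_data):
    """Split one item's properties into two groups.

    missing:       present in the Wikipedia data, absent from Wikidata
                   (candidates to simply add, not worth a report)
    discrepancies: present in both, but with different values
                   (the cases worth reporting for human review)
    """
    missing = {}
    discrepancies = {}
    for prop, wp_value in wikipedia_data.items():
        if prop not in wikidata_data:
            missing[prop] = wp_value
        elif wikidata_data[prop] != wp_value:
            # Keep both values so a reviewer can see the conflict.
            discrepancies[prop] = (wp_value, wikidata_data[prop])
    return missing, discrepancies


# Hypothetical sample values for one village; not real data.
wikipedia_data = {"country": "Romania", "population": 1200, "mayor": "A. Example"}
wikidata_data = {"country": "Romania", "population": 1150}

missing, discrepancies = compare_item(wikipedia_data, wikidata_data)
print("add to Wikidata:", missing)
print("report for review:", discrepancies)
```

With these sample values the mayor would only need to be added, while the population would end up in the report because the two sources disagree.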
When you check the data for Romanian villages, you do something I cannot do. Again, this shows that Wikidata is a wiki. We need a community that cares about the quality of the information provided.
We need you.
We need you to check our data, we need you to compare our data, and we need you to use our data, because in the end it is the usage of the information embedded in Wikidata that will bring enough eyeballs to ensure the quality we/you are seeking to provide. Reasonator does the best possible job as far as I am concerned. Its audience grows really rapidly month by month.
Thanks, GerardM
http://tools.wmflabs.org/reasonator/stats.php
On 7 February 2014 19:41, Strainu strainu10@gmail.com wrote:
Hi Gerard,
The problem that I have with Wikidata and automatically generated content like what Reasonator produces is: how do you know when the script and the data have reached an acceptable level?
I've tested Reasonator a little with Romanian villages and found errors both in the Wikidata fields and in the extrapolations made by the script (like pictures identified as linked to that item). In that case it is pretty obvious that the data is still unusable, but what if the 3-5-10 articles I check are OK? Are there any guidelines on that?
Thanks, Strainu
2014-02-07 17:42 GMT+02:00 Gerard Meijssen gerard.meijssen@gmail.com:
Hoi,
We have often had discussions about generating texts to be used in Wikipedia articles. The drawback of articles that are generated in this way is that they do not get updated as more information becomes available. The data in Wikidata is comparable with the kind of data often used in such processes.
The first kind of texts generated by the Reasonator are based on the biographic information of a person. As Johann Sebastian Bach is a great example of the power of both Wikidata and Reasonator, I invite you to have a look [1].
The text is completely generated based on the information in Wikidata and Magnus is very much in the process of iterating this functionality. What I hope you will see as a challenge is writing similar functionality for other languages.
What I hope for is that the Wikidata development team will appreciate this for its potential and support it when it is found that additional technology is needed.
Thanks, GerardM
[1] http://tools.wmflabs.org/reasonator/?q=Q1339
_______________________________________________
Wikimedia-l mailing list
Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe