Tpt took a few datasets of high enough quality from the Freebase data and uploaded them directly. These numbers do not appear in the Primary Sources tool because they were uploaded directly, each set going through the normal community process.

The Primary Sources tool is left with the datasets for which we were not able to establish a high enough level of quality. For any dataset where this quality can be demonstrated to the community, I assume the community will agree to a direct upload.

I am not sure what else to do here.

I am very thankful to Nemo for rephrasing the discussion and pulling it to a constructive and actionable level.




Gerard, regarding your arguments:
Can you explain what you mean by "add data directly"? I am really not sure what you mean by this argument. Are you suggesting to upload the whole dataset without further review?
But it is not over 90% good! We have a rejection rate of almost 20%. Also, 10% errors means more than 1 million errors. I have yet to see consensus to upload this.
The tool has been used to add thousands of statements and references to Wikidata, and that by a rather small set of people (because you need to install it intentionally). I would think that if we switched it on by default, the throughput would grow considerably. Nemo identified a few issues that stand in the way of that, and it would be good if we worked on these. Everyone is invited to help out with that.
Kian is free to learn from the datasets. The Freebase data has been available for years, and Kian would be far from the first ML tool to use it for training purposes. If there is anything hindering Kian from using the Freebase data, let me know and I will try to fix it.
Because we don't know which ones are which. If you could tell me which of the 12 million statements are good and which are not, and if there is consensus about that assessment, I'd be happy to upload them.

I hope that this answers your arguments.

Again, I do not understand what your proposal is. I am going through the process of releasing the data in an easy-to-use way. If the community agrees with that, it can then be directly imported to Wikidata - I certainly won't stop anyone from doing so and never have.

My feeling is that you are frustrated by what you perceive as slow progress. You keep yelling at people that their ideas and work are not good. I remember how much you attacked me about Wikidata and all the things I have been doing wrong about it. Gerard, if you think you are motivating me with your constant attacks, I have to tell you, you are not. I am not speaking for anyone else, but I am getting tired of this. I appreciate a critical voice, but not in the tone in which you often deliver it.

So, instead of telling everyone how we are supposed to spend our volunteer time in order to get things done better, and how we are doing things wrong, why don't you lead by example and do it right? All the data and all the tools for anything you want to get done are available to you for free. It is a pretty amazing world - all you need is a click away. So go ahead and do what you want to get done.







On Tue, Sep 29, 2015 at 1:07 AM Federico Leva (Nemo) <nemowiki@gmail.com> wrote:
Denny Vrandečić, 28/09/2015 23:27:
> Actually, my suggestion would be to switch on Primary Sources as a
> default tool for everyone.

Yes, it's a desirable aim to have one-click suggested actions (à la
Wikidata game) embedded into items for everyone. As for this tool,
unrelatedly from the data used, at least slowness and misleading
messaging need to be fixed first:
https://www.wikidata.org/wiki/Wikidata_talk:Primary_sources_tool

(Compare: we already have very easy "remove" buttons on all statements
on all items. So the interface for large-scale easy correction of
mistakes is already there, while for *insertion* it's still missing.
Which is also the gist of Gerard's argument, I believe. I agree with
Lydia we can eventually do both, of course.)

Nemo

_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata