In another thread, we are discussing the preponderance of problematic merges of gene/protein items. One of the hypotheses raised to explain the volume and nature of these merges (which are often made by fairly inexperienced editors and/or people who seem to do nothing but merges) was that they were coming from the Wikidata Game. It seems to me that anything like the Wikidata Game that has the potential to generate a very large volume of edits, especially from new editors, ought to tag its contributions so that they can easily be tracked by the system. It should be easy to answer the question of whether an edit came from that game (or from any of what I hope will be many descendants). This would make it possible to debug what could potentially be large swathes of problems, and it would make it straightforward to 'reward' game and other tool developers with information about the volume of edits they have enabled, taken directly from the system (as opposed to their own tracking data).
Please don't misunderstand me. I am a big fan of the Wikidata Game and am actually pushing for our group to make a bio-specific version of it that will build on that code. I see great potential here, but because of the scale of edits this could quickly generate, we (the whole Wikidata community) need ways to keep an eye on what is going on.
-Ben
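A minimal sketch of the kind of tracking asked for above, assuming each edit record carries a list of tags. The record layout, the tag name `example-game`, and the user and item names are hypothetical illustration data, not taken from Wikidata.

```python
from collections import Counter

def count_edits_by_tag(changes):
    """Count how many edits carry each tag.

    `changes` is a list of dicts, each with an optional "tags" list
    (an assumed record shape, used here for illustration only).
    """
    counts = Counter()
    for change in changes:
        for tag in change.get("tags", []):
            counts[tag] += 1
    return counts

def edits_with_tag(changes, tag):
    """Return only the edits that carry the given tag."""
    return [c for c in changes if tag in c.get("tags", [])]

# Hypothetical sample data: two edits made through a game, one made by hand.
sample_changes = [
    {"title": "Q_item_1", "user": "NewEditor", "tags": ["example-game"]},
    {"title": "Q_item_2", "user": "NewEditor", "tags": ["example-game"]},
    {"title": "Q_item_3", "user": "OldHand", "tags": []},
]

tag_counts = count_edits_by_tag(sample_changes)
game_edits = edits_with_tag(sample_changes, "example-game")
```

With per-edit tags available, both questions raised above (did this edit come from a game, and how many edits did a game enable) reduce to a filter and a count.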
If I understand correctly:
1) Magnus' game already tags the edits with 'Widar'.
2) Magnus' game cannot merge proteins and genes if they link to each other. With 'ortholog' and 'expressed by' statements in place, Magnus' merging game does not contribute to the problematic merges (Magnus' email from earlier today: "FWIW, checked again. Neither game can merge two items that link to each other. So, if the protein is "expressed by" the gene, that pair will not even be suggested.").
There is nothing more that Magnus can do, except make an unmerging game. :-)
/Finn
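The rule Magnus describes (a pair is never suggested if either item links to the other) can be sketched as follows. This is not the game's actual code; the item layout and the identifiers `Q_gene`, `Q_protein`, and `Q_other` are assumptions made for illustration.

```python
def linked_ids(item):
    """Return the set of item IDs that appear as statement values on `item`."""
    return {value for values in item["claims"].values() for value in values}

def may_suggest_merge(a, b):
    """True only if neither item links to the other in any statement."""
    return b["id"] not in linked_ids(a) and a["id"] not in linked_ids(b)

# Hypothetical items: the protein links to its gene via "expressed by".
gene = {"id": "Q_gene", "claims": {}}
protein = {"id": "Q_protein", "claims": {"expressed by": ["Q_gene"]}}
unrelated = {"id": "Q_other", "claims": {}}

gene_protein_ok = may_suggest_merge(gene, protein)    # link exists in one direction
gene_unrelated_ok = may_suggest_merge(gene, unrelated)  # no link either way
```

A link in either direction is enough to block the suggestion, so the check only protects gene/protein pairs that already have such a statement.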
On 11/10/2015 05:54 PM, Benjamin Good wrote:
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata
You misunderstand me if you thought I was blaming Magnus for this. It was a hypothesis that right now seems false, and we do not yet have another answer. I do think it is entirely possible that a high-volume, low-user-expertise game interface could generate problems very much like what we are observing. I think we should be able to track them more transparently than we can now. The Widar tag seems like a starting point: https://www.wikidata.org/w/index.php?title=Special:RecentChanges&tagfilt... but this could be improved.
-Ben
p.s. Side note on the game: other, very similar systems usually incorporate some level of redundancy, e.g. you show the same thing to multiple people and only keep statements where 2 or more people agree. Lower recall but higher precision; it depends on the goal.
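The redundancy idea in the p.s. can be sketched as a simple agreement filter. The tuple layout, the function name, and the sample users and statements are hypothetical; the threshold of 2 comes from the note above.

```python
from collections import defaultdict

def accepted_statements(judgments, min_agree=2):
    """Keep statements confirmed by at least `min_agree` distinct users.

    `judgments` is an iterable of (user, statement, verdict) tuples,
    where verdict is True when the user confirmed the statement.
    """
    confirming_users = defaultdict(set)
    for user, statement, verdict in judgments:
        if verdict:
            # A set ignores repeated confirmations from the same user.
            confirming_users[statement].add(user)
    return {s for s, users in confirming_users.items() if len(users) >= min_agree}

# Hypothetical judgments from three players on three candidate statements.
sample_judgments = [
    ("alice", "merge A+B", True),
    ("bob", "merge A+B", True),
    ("alice", "merge C+D", True),
    ("carol", "merge C+D", False),
    ("bob", "merge E+F", True),
    ("bob", "merge E+F", True),  # same user twice counts once
]

kept = accepted_statements(sample_judgments)
```

Raising `min_agree` trades recall for precision, as the note says: fewer statements pass, but each one has been confirmed independently.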
On Tue, Nov 10, 2015 at 9:44 AM, Finn Årup Nielsen fn@imm.dtu.dk wrote:
-- Finn Årup Nielsen http://people.compute.dtu.dk/faan/
Fine. I have added a ticket https://phabricator.wikimedia.org/T118322 "Merging wizard shouldn't allow dissimilar items to be merged". Perhaps a developer can help solve the issue.
On 11/10/2015 08:47 PM, Benjamin Good wrote:
On Tue, Nov 10, 2015 at 10:04 PM, Finn Årup Nielsen fn@imm.dtu.dk wrote:
Fine. I have added a ticket https://phabricator.wikimedia.org/T118322 "Merging wizard shouldn't allow dissimilar items to be merged". Perhaps a developer can help solve the issue.
This is scheduled to go live tonight. From then on two items that link to each other in statements should no longer be mergeable by default.
Cheers
Lydia