I'm confused by this from today's Wikidata weekly summary:
- New request for comments: Semi-automatic Addition of References to Wikidata Statements - feedback on the Primary Sources Tool https://www.wikidata.org/wiki/Wikidata:Requests_for_comment/Semi-automatic_Addition_of_References_to_Wikidata_Statements
First of all, the title makes no sense because "semi-automatic addition of references to Wikidata statements" is one of the main things that the tool can't currently do. If a statement already exists, you'll almost always end up with a duplicate statement, rather than the desired behavior of just adding the reference to it.
Second, I'm not sure who "Hjfocs" is (why does everyone have to make up fake wikinames?), but why are they asking for more feedback when there's been *ample* feedback already? There hasn't been an issue with getting people to test the tool or provide feedback based on the testing. The issue has been with getting anyone to *act* on the feedback. Everything is a) "too hard," or b) "beyond our resources," or depends on something in category a or b, or is incompatible with the arbitrary implementation scheme chosen, or some other excuse.
We're 12-18+ months into the project, depending on how you measure, and not only is the tool not usable yet, but it's no longer improving, so I think it's time to take a step back and ask some fundamental questions.
- Is the current data pipeline and front end gadget the right approach and the right technology for this task? Can they be fixed to be suitable for users?
- If so, should Google continue to have sole responsibility for it or should it be transferred to the Wikidata team or someone else who'll actually work on it?
- If not, what should the data pipeline and tooling look like to make maximum use of the Freebase data?
The whole project needs a reboot.
Tom
On Tue, Jun 14, 2016 at 1:02 AM, Tom Morris tfmorris@gmail.com wrote:
Second, I'm not sure who "Hjfocs" is (why does everyone have to make up fake wikinames?), but why are they asking for more feedback when there's been
The user page links to https://www.linkedin.com/in/marco-fossati-7647ab42 (Marco Gabriele Enrico Fossati, http://wed.fbk.eu/people/profile/fossati).
With interest I notice that the following now goes under "awards" on a CV: Wikimedia Foundation Individual Engagement Grant: Winner of the largest grant, 2015 round 2 call.
rupert
Notifying Marti of this discussion.
Pine
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata
I realize you are upset but you are really barking up the wrong tree. Marco is trying to give the whole thing more structure and sort through all the requests to find a way forward. He is actually doing something constructive about the issues you are raising.
Cheers Lydia
Hi Tom and thanks Lydia for the clarification,
that request for comments (RFC) [1] aims at gathering feedback both on the primary sources tool and the available datasets (especially StrepHit [2]), which are closely intertwined: the dataset is in the tool, so people can play with both in one single interaction and leave their thoughts in the RFC.
Sorry if the title is misleading: the pipeline is indeed semi-automatic, as the StrepHit dataset is generated automatically, while its validation requires human attention.
Since I'm trying to centralize the discussion, it would be great if you could expand, in the RFC, on the 3 fundamental questions you raised.
Best,
Marco
[1] https://www.wikidata.org/wiki/Wikidata:Requests_for_comment/Semi-automatic_A...
[2] https://meta.wikimedia.org/wiki/Grants:IEG/StrepHit:_Wikidata_Statements_Val...
On 6/14/16 08:27, Lydia Pintscher wrote:
I realize you are upset but you are really barking up the wrong tree. Marco is trying to give the whole thing more structure and sort through all the requests to find a way forward. He is actually doing something constructive about the issues you are raising.
Cheers Lydia -- Lydia Pintscher - http://about.me/lydia.pintscher Product Manager for Wikidata
Wikimedia Deutschland e.V. Tempelhofer Ufer 23-24 10963 Berlin www.wikimedia.de http://www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/029/42207.
OK, Marco Fossati (aka marfox on GitHub) is a name I recognize.
Lydia - I'm not upset, just confused by the RFC. Have you considered asking Google to transfer ownership of the project since they're no longer doing anything with it? That would do away with the requirement that contributors sign a Google CLA and open the door for active project leadership.
Marco - Centralizing the discussion is good, but why not pick one of the three existing channels (issue tracker, project page, this mailing list) rather than creating a fourth channel? As much as I love playing and watching soccer, I'm much more interested in the vast trove of identifiers and other curated information in Freebase than I am in improving Wikidata's soccer coverage, but the Primary Sources tool could be useful for some portions of the Freebase data, if it were made usable. I'm sure you've seen my issues and pull requests on GitHub.
Tom
On Tue, Jun 14, 2016 at 7:26 PM, Tom Morris tfmorris@gmail.com wrote:
Lydia - I'm not upset, just confused by the RFC. Have you considered asking Google to transfer ownership of the project since they're no longer doing anything with it? That would do away with the requirement that contributors sign a Google CLA and open the door for active project leadership.
I am happy to handle that if I have someone who says they'd contribute to the tool without the CLA but not with it, yes.
Cheers Lydia
Lydia,
I would contribute. Most of us from Freebase around would gladly help out I'm sure. We'd like to see the data eventually move into Wikidata rather than rot away.
Thad +ThadGuidry https://www.google.com/+ThadGuidry
Hoi,
That would be really welcome. I notice how much duplication is going on... SAD... My position on this has been clear.
Thanks, GerardM
Hi Tom, all,
Have you considered asking Google to transfer ownership of the project since they're no longer doing anything with it?
Denny wrote [1] that we are open to transferring the project to a new owner: "If anyone wants to take over the project, we would invite you to contribute a bit for a while, and then let’s discuss about it. I would be thrilled to see this tool develop." This still stands :-)
Cheers, Tom
-- [1] https://lists.wikimedia.org/pipermail/wikidata/2016-February/008316.html
Thanks for the reminder. So that solves the "asking" part.
Does anyone *not* think that the Wikidata engineering team is the correct place for this?
Lydia - can you assign someone to come up to speed at whatever level Denny requires to feel comfortable making the transfer?
Tom
On Tue, Jun 14, 2016 at 4:26 PM, Thomas Steiner tomac@google.com wrote:
Hi Tom, all,
Have you considered asking Google to transfer ownership of the project since they're no longer doing anything with it?
Denny wrote [1] that we are open to transferring the project to a new owner: "If anyone wants to take over the project, we would invite you to contribute a bit for a while, and then let’s discuss about it. I would be thrilled to see this tool develop.". This still stands :-)
Cheers, Tom
-- [1] https://lists.wikimedia.org/pipermail/wikidata/2016-February/008316.html
-- Dr. Thomas Steiner, Employee (http://blog.tomayac.com, https://twitter.com/tomayac)
Google Germany GmbH, ABC-Str. 19, 20354 Hamburg, Germany Managing Directors: Matthew Scott Sucherman, Paul Terence Manicle Registration office and registration number: Hamburg, HRB 86891
-----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.29 (GNU/Linux)
iFy0uwAntT0bE3xtRa5AfeCheCkthAtTh3reSabiGbl0ck0fjumBl3DCharaCTersAttH3b0ttom hTtPs://xKcd.cOm/1181/ -----END PGP SIGNATURE-----
On Jun 14, 2016 23:56, "Tom Morris" tfmorris@gmail.com wrote:
Lydia - can you assign someone to come up to speed at whatever level Denny requires to feel comfortable making the transfer?
I will take care of it with Denny in the next days.
Cheers Lydia
Hi Lydia,
On 6/15/16 07:42, Lydia Pintscher wrote:
Lydia - can you assign someone to come up to speed at whatever level Denny requires to feel comfortable making the transfer?
I will take care of it with Denny in the next days.
Repasting part of a previous message with the list of requirements:
A. a developer to understand the back-end code [1], written in C++;
B. a developer to understand the front-end code [2], written in JavaScript;
C. access to the WMF Labs machine to deploy the back-end [3];
D. a Wikidata administrator to deploy the front-end [4];
E. centralized and exhaustive documentation.
As part of the StrepHit project goals [5], my team is striving to help with A. (not exactly trivial) and C., but we really need B. and D. to be effective.
Best,
Marco
[1] https://github.com/google/primarysources/tree/master/backend
[2] https://github.com/google/primarysources/tree/master/frontend
[3] https://tools.wmflabs.org/wikidata-primary-sources
[4] https://github.com/google/primarysources/tree/master/frontend#deployment-on-...
[5] https://meta.wikimedia.org/wiki/Grants:IEG/StrepHit:_Wikidata_Statements_Val...
On Wed, Jun 15, 2016 at 7:42 AM Lydia Pintscher < lydia.pintscher@wikimedia.de> wrote:
I will take care of it with Denny in the next days.
FYI: In-progress now. I'll report back as soon as it is done but it'll take a few days to clarify some stuff.
Cheers Lydia
On Wed, Jun 15, 2016 at 12:49 PM, Lydia Pintscher < Lydia.Pintscher@wikimedia.de> wrote:
FYI: In-progress now. I'll report back as soon as it is done but it'll take a few days to clarify some stuff.
Cool. Thanks for the quick action.
Tom
Hi Tom,
On 6/14/16 19:26, Tom Morris wrote:
Marco - Centralizing the discussion is good, but why not pick one of the three existing channels (issue tracker, project page, this mailing list) rather than creating a fourth channel?
The RFC is meant to bring together low-level technical problems (issue tracker), usability discussions (project page), and less structured discussions (mailing list), *and* comments on the uploaded datasets.
As much as I love playing and watching soccer, I'm much more interested in the vast trove of identifiers and other curated information in Freebase than I am in improving Wikidata's soccer coverage, but the Primary Sources tool could be useful for some portions of the Freebase data, if it could be usable.
I guess you are referring to the StrepHit prototype dataset 'strephit-soccer'. Why don't you try the 'strephit-testing' one? It deals with biographies and has much broader coverage.
Best,
Marco