Hi all,
We are researchers from the dlab at EPFL working with Bob West.
We plan to build a graph-based ML algorithm that will support the development of a tool to assist Wikipedia editors with recommendations for two novel use cases. The first consists of suggesting hyperlinks (Wikipedia articles) to be inserted within a section of an article. Note that this is different from "classical link prediction".
We feel the tool could be of great value, as it can work with newly created sections that do not have any content yet. What's more, the editor can type *any* section name (even one that does not exist in that article, or anywhere in the whole Wiki project) and the tool would suggest hyperlinks that are likely to be of interest for that section of the article. We think that stub articles in particular can benefit from this tool.
However, we have one assumption. In addition to the section name, the editor must provide the "entity type" (Place, People, Date, Organization...) of the Wikipedia articles she would like to insert in the section. The reason is that within a section you can find links to articles of diverse types.
We are reaching out to you for two reasons: (1) to check whether such a tool would be of interest to editors and likely to be used by them, and (2) to learn how limiting it is to require that the editor specify the entity type of the Wikipedia articles for which she wants recommendations.
On the one hand, some of us think this is not a problem, as the number of entity types is relatively small (between 10 and 20) and they can easily be presented to the editor in a dropdown list. On the other hand, others think this requirement is limiting.
We would like to know your opinion to decide whether we should move forward with this project.
Thanks!
dlab
Hi,
A good place to get feedback from the English Wikipedia community would be the Village Pump Idea Lab: https://en.wikipedia.org/wiki/Wikipedia:Village_pump_(idea_lab) .
It’s not clear to me whether the tool would be suggesting inline links for the text that’s already in the article, or “See also” links. A description of what problem the tool would solve would be really helpful. It would also be helpful to see “before and after” mockups showing a specific stub article as it exists today and what the article would look like after the tool’s suggestions have been applied.
Cheers,
Su-Laine
Wikipedia contributor
On Sep 18, 2020, at 3:30 AM, Garcia Duran Alberto alberto.duran@epfl.ch wrote:
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
Hi dlab,
Are you looking at articles written from scratch, or at articles that already exist? If they already exist, you could perhaps derive the 'type' from the category the article is placed in, or predict it from the lead paragraph. If you're aiming at articles created in languages other than English (I don't know if you're limiting yourself by language), you could also consider using the Wikidata item associated with the article (which will almost always have some 'instance of' defined).
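To make the Wikidata suggestion concrete, here is a minimal sketch of how a coarse entity type might be read from an item's 'instance of' (P31) statements. It is only an illustration under stated assumptions, not something proposed in the thread: it assumes the entity is already available as a dict in the JSON shape that Wikidata's entity API returns, and the table `TYPE_BY_CLASS` and the helper names are hypothetical.

```python
# Hypothetical, hand-picked mapping from Wikidata class items to the
# coarse entity types mentioned in the thread. A real tool would need a
# much larger table or a walk up the 'subclass of' hierarchy.
TYPE_BY_CLASS = {
    "Q5": "People",            # human
    "Q515": "Place",           # city
    "Q43229": "Organization",  # organization
}


def instance_of_ids(entity):
    """Return the item IDs listed under 'instance of' (P31) for one entity."""
    ids = []
    for claim in entity.get("claims", {}).get("P31", []):
        snak = claim.get("mainsnak", {})
        # 'novalue' and 'somevalue' snaks carry no datavalue, so skip them.
        if snak.get("snaktype") != "value":
            continue
        ids.append(snak["datavalue"]["value"]["id"])
    return ids


def coarse_type(entity, default=None):
    """Map an entity to the first coarse type found among its P31 values."""
    for qid in instance_of_ids(entity):
        if qid in TYPE_BY_CLASS:
            return TYPE_BY_CLASS[qid]
    return default


# Trimmed example entity: Q42 (Douglas Adams) is an instance of Q5 (human).
sample = {
    "id": "Q42",
    "claims": {
        "P31": [
            {
                "mainsnak": {
                    "snaktype": "value",
                    "property": "P31",
                    "datavalue": {
                        "value": {"entity-type": "item", "id": "Q5"},
                        "type": "wikibase-entityid",
                    },
                }
            }
        ]
    },
}

print(coarse_type(sample))
```

When no P31 value matches the table, the function returns the `default`, so the tool could fall back to asking the editor for the type.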
As a use case, I could see this tool being helpful in the context of translating articles. In that case you already have an article in another language available to you, as well as a category tree in that language and an introduction paragraph, so you probably don't have to ask editors what type of topic it is.
That would also seem the most likely place to implement it (but maybe also the one with the least added value?). I think this kind of tool would be most likely to be used if you can somehow fit it into a toolbox or existing workflow.
I would also suggest considering the opposite use case: detecting links that are very 'out of place', that is, links that are likely referring to a homonym. An example would be a biology article linking to a mathematician, where you would have expected a link to a biologist. Or, in a historical article about someone or something in the 14th century, a link to a person from the 17th century. This could still be a valid link, but it may be helpful to detect these rare events - it might trigger a disambiguation. Just thinking out loud.
Best,
Lodewijk
On Sat, Sep 19, 2020 at 1:44 PM Su-Laine Brodsky sulainey@gmail.com wrote: