Hello everyone,
The next Research Showcase, “The_Tower_of_Babel.jpg” and “A Warm Welcome, Not a Cold Start,” will be live-streamed next Wednesday, February 20, 2019, at 11:30 AM PST/19:30 UTC. The first presentation is about how images are used across language editions, and the second is about new editors.
YouTube stream: https://www.youtube.com/watch?v=_jpJIFXwlEg
As usual, you can join the conversation on IRC at #wikimedia-research. You can also watch our past research showcases here: https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase
This month's presentations:
The_Tower_of_Babel.jpg: Diversity of Visual Encyclopedic Knowledge Across Wikipedia Language Editions
By Shiqing He (presenting, University of Michigan), Brent Hecht (presenting, Northwestern University), Allen Yilun Lin (Northwestern University), Eytan Adar (University of Michigan), ICWSM'18.
Across all Wikipedia language editions, millions of images augment text in critical ways. This visual encyclopedic knowledge is an important form of wikiwork for editors, a critical part of reader experience, an emerging resource for machine learning, and a lens into cultural differences. However, Wikipedia research--and cross-language edition Wikipedia research in particular--has thus far been limited to text. In this paper, we assess the diversity of visual encyclopedic knowledge across 25 language editions and compare our findings to those reported for textual content. Unlike text, translation in images is largely unnecessary. Additionally, the Wikimedia Foundation, through the Wikipedia Commons, has taken steps to simplify cross-language image sharing. While we may expect that these factors would reduce image diversity, we find that cross-language image diversity rivals, and often exceeds, that found in text. We find that diversity varies between language pairs and content types, but that many images are unique to different language editions. Our findings have implications for readers (in what imagery they see), for editors (in deciding what images to use), for researchers (who study cultural variations), and for machine learning developers (who use Wikipedia for training models).
A Warm Welcome, Not a Cold Start: Eliciting New Editors' Interests via Questionnaires
By Ramtin Yazdanian (presenting, Ecole Polytechnique Federale de Lausanne)
Every day, thousands of users sign up as new Wikipedia contributors. Once joined, these users have to decide which articles to contribute to, which users to reach out to and learn from or collaborate with, etc. Any such task is a hard and potentially frustrating one given the sheer size of Wikipedia. Supporting newcomers in their first steps by recommending articles they would enjoy editing or editors they would enjoy collaborating with is thus a promising route toward converting them into long-term contributors. Standard recommender systems, however, rely on users' histories of previous interactions with the platform. As such, these systems cannot make high-quality recommendations to newcomers without any previous interactions -- the so-called cold-start problem. Our aim is to address the cold-start problem on Wikipedia by developing a method for automatically building short questionnaires that, when completed by a newly registered Wikipedia user, can be used for a variety of purposes, including article recommendations that can help new editors get started. Our questionnaires are constructed based on the text of Wikipedia articles as well as the history of contributions by the already onboarded Wikipedia editors. We have assessed the quality of our questionnaire-based recommendations in an offline evaluation using historical data, as well as an online evaluation with hundreds of real Wikipedia newcomers, concluding that our method provides cohesive, human-readable questions that perform well against several baselines. By addressing the cold-start problem, this work can help with the sustainable growth and maintenance of Wikipedia's diverse editor community.
This is just a reminder that the event below will be happening tomorrow, February 20.
On Thu, Feb 14, 2019 at 11:20 AM Janna Layton jlayton@wikimedia.org wrote:
Hello everyone,
The next Research Showcase, “The_Tower_of_Babel.jpg” and “A Warm Welcome, Not a Cold Start,” will be live-streamed next Wednesday, February 20, 2019, at 11:30 AM PST/19:30 UTC. The first presentation is about how images are used across language editions, and the second is about new editors.
YouTube stream: https://www.youtube.com/watch?v=_jpJIFXwlEg
As usual, you can join the conversation on IRC at #wikimedia-research. You can also watch our past research showcases here: https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase
This month's presentations:
The_Tower_of_Babel.jpg: Diversity of Visual Encyclopedic Knowledge Across Wikipedia Language Editions
By Shiqing He (presenting, University of Michigan), Brent Hecht (presenting, Northwestern University), Allen Yilun Lin (Northwestern University), Eytan Adar (University of Michigan), ICWSM'18.
Across all Wikipedia language editions, millions of images augment text in critical ways. This visual encyclopedic knowledge is an important form of wikiwork for editors, a critical part of reader experience, an emerging resource for machine learning, and a lens into cultural differences. However, Wikipedia research--and cross-language edition Wikipedia research in particular--has thus far been limited to text. In this paper, we assess the diversity of visual encyclopedic knowledge across 25 language editions and compare our findings to those reported for textual content. Unlike text, translation in images is largely unnecessary. Additionally, the Wikimedia Foundation, through the Wikipedia Commons, has taken steps to simplify cross-language image sharing. While we may expect that these factors would reduce image diversity, we find that cross-language image diversity rivals, and often exceeds, that found in text. We find that diversity varies between language pairs and content types, but that many images are unique to different language editions. Our findings have implications for readers (in what imagery they see), for editors (in deciding what images to use), for researchers (who study cultural variations), and for machine learning developers (who use Wikipedia for training models).
A Warm Welcome, Not a Cold Start: Eliciting New Editors' Interests via Questionnaires
By Ramtin Yazdanian (presenting, Ecole Polytechnique Federale de Lausanne)
Every day, thousands of users sign up as new Wikipedia contributors. Once joined, these users have to decide which articles to contribute to, which users to reach out to and learn from or collaborate with, etc. Any such task is a hard and potentially frustrating one given the sheer size of Wikipedia. Supporting newcomers in their first steps by recommending articles they would enjoy editing or editors they would enjoy collaborating with is thus a promising route toward converting them into long-term contributors. Standard recommender systems, however, rely on users' histories of previous interactions with the platform. As such, these systems cannot make high-quality recommendations to newcomers without any previous interactions -- the so-called cold-start problem. Our aim is to address the cold-start problem on Wikipedia by developing a method for automatically building short questionnaires that, when completed by a newly registered Wikipedia user, can be used for a variety of purposes, including article recommendations that can help new editors get started. Our questionnaires are constructed based on the text of Wikipedia articles as well as the history of contributions by the already onboarded Wikipedia editors. We have assessed the quality of our questionnaire-based recommendations in an offline evaluation using historical data, as well as an online evaluation with hundreds of real Wikipedia newcomers, concluding that our method provides cohesive, human-readable questions that perform well against several baselines. By addressing the cold-start problem, this work can help with the sustainable growth and maintenance of Wikipedia's diverse editor community.
-- Janna Layton (she, her) Administrative Assistant - Audiences & Technology Wikimedia Foundation https://wikimediafoundation.org/
The Research Showcase will be starting in about 30 minutes! Info below:
On Thu, Feb 14, 2019 at 11:20 AM Janna Layton jlayton@wikimedia.org wrote:
Hello everyone,
The next Research Showcase, “The_Tower_of_Babel.jpg” and “A Warm Welcome, Not a Cold Start,” will be live-streamed next Wednesday, February 20, 2019, at 11:30 AM PST/19:30 UTC. The first presentation is about how images are used across language editions, and the second is about new editors.
YouTube stream: https://www.youtube.com/watch?v=_jpJIFXwlEg
As usual, you can join the conversation on IRC at #wikimedia-research. You can also watch our past research showcases here: https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase
This month's presentations:
The_Tower_of_Babel.jpg: Diversity of Visual Encyclopedic Knowledge Across Wikipedia Language Editions
By Shiqing He (presenting, University of Michigan), Brent Hecht (presenting, Northwestern University), Allen Yilun Lin (Northwestern University), Eytan Adar (University of Michigan), ICWSM'18.
Across all Wikipedia language editions, millions of images augment text in critical ways. This visual encyclopedic knowledge is an important form of wikiwork for editors, a critical part of reader experience, an emerging resource for machine learning, and a lens into cultural differences. However, Wikipedia research--and cross-language edition Wikipedia research in particular--has thus far been limited to text. In this paper, we assess the diversity of visual encyclopedic knowledge across 25 language editions and compare our findings to those reported for textual content. Unlike text, translation in images is largely unnecessary. Additionally, the Wikimedia Foundation, through the Wikipedia Commons, has taken steps to simplify cross-language image sharing. While we may expect that these factors would reduce image diversity, we find that cross-language image diversity rivals, and often exceeds, that found in text. We find that diversity varies between language pairs and content types, but that many images are unique to different language editions. Our findings have implications for readers (in what imagery they see), for editors (in deciding what images to use), for researchers (who study cultural variations), and for machine learning developers (who use Wikipedia for training models).
A Warm Welcome, Not a Cold Start: Eliciting New Editors' Interests via Questionnaires
By Ramtin Yazdanian (presenting, Ecole Polytechnique Federale de Lausanne)
Every day, thousands of users sign up as new Wikipedia contributors. Once joined, these users have to decide which articles to contribute to, which users to reach out to and learn from or collaborate with, etc. Any such task is a hard and potentially frustrating one given the sheer size of Wikipedia. Supporting newcomers in their first steps by recommending articles they would enjoy editing or editors they would enjoy collaborating with is thus a promising route toward converting them into long-term contributors. Standard recommender systems, however, rely on users' histories of previous interactions with the platform. As such, these systems cannot make high-quality recommendations to newcomers without any previous interactions -- the so-called cold-start problem. Our aim is to address the cold-start problem on Wikipedia by developing a method for automatically building short questionnaires that, when completed by a newly registered Wikipedia user, can be used for a variety of purposes, including article recommendations that can help new editors get started. Our questionnaires are constructed based on the text of Wikipedia articles as well as the history of contributions by the already onboarded Wikipedia editors. We have assessed the quality of our questionnaire-based recommendations in an offline evaluation using historical data, as well as an online evaluation with hundreds of real Wikipedia newcomers, concluding that our method provides cohesive, human-readable questions that perform well against several baselines. By addressing the cold-start problem, this work can help with the sustainable growth and maintenance of Wikipedia's diverse editor community.
-- Janna Layton (she, her) Administrative Assistant - Audiences & Technology Wikimedia Foundation https://wikimediafoundation.org/