Dear folks,
Is there a good way to query a user's edit history, e.g., their edit count during a given period?
My current solution uses the usercontribs API (https://www.mediawiki.org/wiki/API:Usercontribs),
but the process stalls, perhaps because of a query limit.
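For reference, a minimal sketch of what I am trying: paging through
list=usercontribs with the API's continuation parameter, which I
understand is the intended way past the per-request cap (Python with the
requests library; the username and period below are placeholders):

import requests

API = "https://en.wikipedia.org/w/api.php"

def count_edits(user, start, end):
    """Count a user's edits between two ISO 8601 timestamps, following
    API continuation to get past the per-request result cap."""
    session = requests.Session()
    params = {
        "action": "query",
        "list": "usercontribs",
        "ucuser": user,
        "ucstart": start,  # newer timestamp (results run newest to oldest)
        "ucend": end,      # older timestamp
        "uclimit": "max",  # 500 per request for most clients
        "ucprop": "timestamp",
        "format": "json",
        "formatversion": "2",
    }
    total = 0
    while True:
        data = session.get(API, params=params).json()
        total += len(data["query"]["usercontribs"])
        if "continue" not in data:
            return total
        params.update(data["continue"])

# Placeholder username and period:
# print(count_edits("ExampleUser", "2019-02-01T00:00:00Z", "2019-01-01T00:00:00Z"))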
Thanks,
Haifeng Zhang
Apologies for cross-posting
====
2nd Language, Data and Knowledge (LDK) conference
May 20-22, 2019
Leipzig, Germany
====
Call for Participation
With the advent of digital technologies, language data is becoming more
and more valuable. In that context, we would like to draw your attention
to the 2nd Language, Data and Knowledge (LDK) conference, which will be
held in Leipzig from May 20th to 22nd, 2019.
This biennial conference series aims to bring together researchers from
across the disciplines concerned with language data in data science and
knowledge-based applications.
We invite practitioners, professionals, researchers and anyone with an
interest in language data to be part of this event in Leipzig and catch
up on the latest research outcomes in the areas of acquisition,
provenance, representation, maintenance, usability and quality of
language data, as well as its legal, organizational and infrastructure
aspects.
# Keynote Speakers
We already have the following keynote speakers confirmed:
- Keynote #1: Christian Bizer, Mannheim University
- Keynote #2: Christiane Fellbaum, Princeton University
- Keynote #3: Eduard Werner, Leipzig University
# Associated Events
The following events are co-located with LDK 2019:
3rd Summer Datathon on Linguistic Linked Open Data (SD-LLOD-19) from May
12th to 17th, 2019 at the Leibniz Center for Informatics in Wadern, Germany
Please find more information about the Datathon here:
https://datathon2019.linguistic-lod.org/
Workshops on May 20th, 2019
- Translation Inference across Dictionaries (TIAD)
- 3rd Workshop on Humanities in the Semantic Web (WHiSe III)
- 2nd OntoLex Face to Face Meeting
Please find more information about the workshops here:
http://2019.ldk-conf.org/co-located-events/
DBpedia Community Meeting on May 23rd, 2019
- 13th DBpedia Community Meeting
https://wiki.dbpedia.org/meetings/Leipzig2019
Further information on the LDK program can be found here:
http://2019.ldk-conf.org/events-programme/
To register for the LDK conference and its associated events, please go
to http://2019.ldk-conf.org/registration/.
We are looking forward to meeting you in Leipzig.
Kind regards
Your LDK-Team
Hi everyone,
We’re preparing for the March 2019 research newsletter [1] and looking for contributors. Please take a look at https://etherpad.wikimedia.org/p/WRN201903 and add your name next to any paper you are interested in covering. Our target publication date is March 30 (UTC). If you can't make this deadline but would like to cover a particular paper in the subsequent issue, leave a note next to the paper's entry in the etherpad. As usual, short notes and one-paragraph reviews are most welcome.
Highlights from this month:
- Analysis of the Wikipedia Network of Mathematicians
- Detecting pages to protect in Wikipedia across multiple languages
- How the medium shapes the message: Printing and the rise of the arts and sciences
- Trajectories of Blocked Community Members: Redemption, Recidivism and Departure
- What is the central bank of Wikipedia?
- Who Counts as a Notable Sociologist on Wikipedia? Gender, Race, and the “Professor Test”
Mohammed Abdulai and Tilman Bayer
[1] http://meta.wikimedia.org/wiki/Research:Newsletter
I noticed just now that the Foundation is soliciting applications for a new CTO:
https://www.linkedin.com/feed/update/urn:li:activity:6515003866130505729
Can we please hire a CTO who will put reader privacy above the
interests of any state or non-state actors, whether or not they have
infiltrated staff, contractor, and NDA-signatory ranks, and whether or
not it interferes with reader statistics and analytics?
In particular, I would like to repeat my request that we not log
personally identifiable information that might increase our subpoena
burden or result in privacy-violation incidents. Fuzzing geolocation
is okay, but we should not be transmitting IP addresses into logs,
even across a LAN, for example, and we certainly shouldn't be
purchasing hardware with backdoor coprocessors that waste electricity
and expose us to government or similar intrusions:
https://lists.wikimedia.org/pipermail/analytics/2017-January/005696.html
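To make the request concrete, here is a minimal sketch of the kind of
mitigation I mean: truncating client IPs before a log line is ever
emitted (hypothetical field names; not a description of any existing
Wikimedia system):

import ipaddress
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("requests")

def truncate_ip(addr):
    """Zero the host bits so the logged value identifies a network,
    not an individual reader (/24 for IPv4, /48 for IPv6)."""
    ip = ipaddress.ip_address(addr)
    prefix = 24 if ip.version == 4 else 48
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)

def log_request(client_ip, path):
    # Only the truncated form ever reaches a log handler, so raw
    # addresses are never written to disk or sent over the wire.
    logger.info("request path=%s client=%s", path, truncate_ip(client_ip))

log_request("203.0.113.42", "/wiki/Main_Page")  # logs client=203.0.113.0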
Best regards,
Jim
Hello everyone,
I'm just sending a reminder that the below Showcase will be starting in
half an hour.
-Janna Layton
--
Janna Layton (she, her)
Administrative Assistant - Audiences & Technology
Wikimedia Foundation <https://wikimediafoundation.org/>
The Scoring Platform Team at the Wikimedia Foundation will be performing
maintenance on WikiLabels <https://labels.wmflabs.org>. We are moving the
database to a different server. This work will take place from 14:00 to
14:30 UTC (7 am Pacific) on Tuesday, March 19. WikiLabels may go down in
the process. Please let me know if you have any questions.
--
*James Hare* (he/him)
Associate Product Manager
Wikimedia Foundation <https://wikimediafoundation.org/>
Hi all,
The next Research Showcase, “Learning How to Correct a Knowledge Base
from the Edit History” and “TableNet: An Approach for Determining
Fine-grained Relations for Wikipedia Tables” will be live-streamed
this Wednesday, March 20, 2019, at 11:30 AM PDT/18:30 UTC (please note
the change in time in UTC due to daylight saving changes in the U.S.).
The first presentation is about using edit history to automatically
correct constraint violations in Wikidata, and the second is about
interlinking Wikipedia tables.
YouTube stream: https://www.youtube.com/watch?v=6p62PMhkVNM
As usual, you can join the conversation on IRC at #wikimedia-research.
You can also watch our past research showcases at
https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase .
This month's presentations:
Learning How to Correct a Knowledge Base from the Edit History
By Thomas Pellissier Tanon (Télécom ParisTech), Camille Bourgaux (DI
ENS, CNRS, ENS, PSL Univ. & Inria), Fabian Suchanek (Télécom
ParisTech), WWW'19.
The curation of Wikidata (and other knowledge bases) is crucial to
keep the data consistent, to fight vandalism, and to correct good-faith
mistakes. However, manual curation of the data is costly. In this
work, we propose to take advantage of the edit history of the
knowledge base in order to learn how to correct constraint violations
automatically. Our method is based on rule mining, and uses the edits
that solved violations in the past to infer how to solve similar
violations in the present. For example, our system is able to learn
that the value of the [[d:Property:P21|sex or gender]] property
[[d:Q467|woman]] should be replaced by [[d:Q6581072|female]]. We
provide [https://tools.wmflabs.org/wikidata-game/distributed/#game=43
a Wikidata game] that suggests our corrections to the users in order
to improve Wikidata. Both the evaluation of our method on past
corrections and the Wikidata game statistics show significant
improvements over baselines.
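As a toy illustration of the replacement idea (not the authors'
implementation; the data structures are hypothetical), a mined rule
might be applied like this:

# Rule learned from past edits: for property P21 (sex or gender),
# the value Q467 (woman) should be replaced by Q6581072 (female).
RULES = {("P21", "Q467"): "Q6581072"}

def correct(statements):
    """Apply replacement rules to (property, value) statements."""
    return [(p, RULES.get((p, v), v)) for p, v in statements]

print(correct([("P21", "Q467"), ("P31", "Q5")]))
# -> [('P21', 'Q6581072'), ('P31', 'Q5')]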
TableNet: An Approach for Determining Fine-grained Relations for
Wikipedia Tables
By Besnik Fetahu
Wikipedia tables represent an important resource, where information is
organized with respect to table schemas consisting of columns. In turn,
each column may contain instance values that point to other Wikipedia
articles or primitive values (e.g., numbers, strings, etc.). In this
work, we focus on the problem of interlinking Wikipedia tables for two
types of table relations: equivalent and subPartOf. Through such
relations, we can further harness semantically related information by
accessing related tables or facts therein. Determining the relation
type of a table pair is not trivial, as it is dependent on the
schemas, the values therein, and the semantic overlap of the cell
values in the corresponding tables. We propose TableNet, an approach
that constructs a knowledge graph of interlinked tables with subPartOf
and equivalent relations. TableNet consists of two main steps: (i) for
any source table we provide an efficient algorithm to find all
candidate related tables with high coverage, and (ii) a neural-based
approach that takes into account the table schemas and the
corresponding table data to determine with high accuracy the
relation for a table pair. We perform an extensive experimental
evaluation on the entire Wikipedia with more than 3.2 million tables.
We show that we retain more than 88% of relevant candidate table
pairs for alignment. Consequently, we are able to align tables with
subPartOf or equivalent relations with an accuracy of 90%.
Comparisons with existing competitors show that TableNet has superior
performance in terms of coverage and alignment accuracy.
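As a similar toy sketch (again not the authors' code), the resulting
graph of interlinked tables could be represented as a simple adjacency
map over predicted relations (the table names are made up):

from collections import defaultdict

# Made-up pairwise predictions: (source table, target table, relation).
predictions = [
    ("List_of_physicists", "List_of_scientists", "subPartOf"),
    ("Nobel_laureates_2001", "2001_Nobel_Prize_winners", "equivalent"),
]

# Assemble the graph of interlinked tables as an adjacency map.
graph = defaultdict(list)
for src, dst, rel in predictions:
    graph[src].append((rel, dst))
    if rel == "equivalent":  # equivalence is symmetric
        graph[dst].append((rel, src))

for table, edges in graph.items():
    print(table, "->", edges)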
Best,
Leila
Hi folks,
My work needs a random sample of new editors for each month, e.g., 100 editors per month.
Do any of you have good suggestions for how to do this efficiently?
I could use the dump files, but are there other options?
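For concreteness, a rough sketch of one API-based option I am
considering: pull account creations for a month from the new-user log
and sample from them (Python with requests; the wiki and month are
placeholders, and one would still need to filter to accounts that
actually edited, e.g., via list=users with usprop=editcount):

import random
import requests

API = "https://en.wikipedia.org/w/api.php"

def accounts_created(start, end):
    """Collect account names from the new-user log between two ISO 8601
    timestamps, following API continuation past the per-request limit."""
    session = requests.Session()
    params = {
        "action": "query",
        "list": "logevents",
        "letype": "newusers",
        "lestart": start,  # newer timestamp (entries run newest to oldest)
        "leend": end,      # older timestamp
        "lelimit": "max",
        "format": "json",
        "formatversion": "2",
    }
    names = []
    while True:
        data = session.get(API, params=params).json()
        for entry in data["query"]["logevents"]:
            title = entry.get("title")  # e.g. "User:SomeNewAccount"
            if title and ":" in title:
                names.append(title.split(":", 1)[1])
        if "continue" not in data:
            return names
        params.update(data["continue"])

# Sample 100 accounts created in January 2019 (placeholder month):
accounts = accounts_created("2019-02-01T00:00:00Z", "2019-01-01T00:00:00Z")
sample = random.sample(accounts, min(100, len(accounts)))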
Thanks,
Haifeng Zhang
***Apologies for cross-posting***
Dear Colleagues,
The organizing committee of the 10th Annual International Conference on
Social Media and Society (#SMSociety) invites you to visit Toronto on July
19-21, 2019. This year's conference will be held at the Ted Rogers School of
Management and hosted by Ryerson University Social Media Lab. The 2019
conference theme is "Rethinking Privacy and Trust in the Social Media Age."
The conference offers an intensive 3-day program including hands-on
workshops, full & work-in-progress papers, panels, and posters featuring the
latest in social media research. This year, we are honoured to have two
highly distinguished scholars as our keynote speakers:
* Tarleton Gillespie, a principal researcher at Microsoft Research New
England, and an affiliated professor in the Department of Communication and
Department of Information Science at Cornell University. His new book,
Custodians of the Internet: Platforms, Content Moderation, and the Hidden
Decisions that Shape Social Media (Yale University Press) was published in
June 2018.
* Valerie Steeves, a professor in the Department of Criminology at the
University of Ottawa. As the lead researcher of MediaSmarts' Young Canadians
in a Wired World research project, she has been tracking young people's use
of new media since 2000. She is also the principal investigator of The
eQuality Project, a seven-year partnership exploring young people's
experiences of privacy and equality in networked spaces. More info at:
https://socialmediaandsociety.org/2019/announcement-of-2019-keynotes/
Submission Deadline:
There is still time to submit a poster, panel or hands-on workshop proposal
(due March 18, midnight, Hawaiian time). We hope you will consider submitting
your work to the conference and joining our interdisciplinary community of
researchers. More info at: https://socialmediaandsociety.org/submit/
If you have any questions, please contact us via email at
ask@socialmediaandsociety.org or on
Twitter at @SocMediaConf <https://twitter.com/SocMediaConf>
--
2019 SMSociety Organizing Committee