Dear all,
I thank you for your efforts. I am honoured to inform you that we are working to launch a research unit dedicated to Data Science research in Tunisia. The unit is named "Data Engineering and Semantics" and belongs to the Computer Science and Communications Department, Faculty of Sciences of Sfax, University of Sfax. The group will be led by Dr. Mohamed BEN AOUICHA (https://scholar.google.ca/citations?user=XPVUu-gAAAAJ&hl=fr). He was the person behind the University of Sfax project on the use of Wikipedia and Wiktionary in semantic technologies, particularly semantic similarity measures.
As explained in my previous emails, we need the research unit so that Wikimedia researchers at the University of Sfax are affiliated with an official research structure. The main goal of the unit is to promote and monitor research efforts on Data Engineering and Semantic Technologies at the University of Sfax. Research in Data Engineering and Semantic Technologies is needed to improve the usage and processing of knowledge bases, including wikis. Concretely, we will work in this unit to develop tools that improve the linked data quality of Wikidata and to use Wikidata for a variety of purposes, including clinical decision support, Natural Language Processing and Education. We will also analyze the output of wikis, identify deficiencies and work to solve them.
However, this will not be possible without letters of support from several institutions. I would like to ask whether your research institution can send us a letter of support so that we can obtain the approval of the Tunisian Ministry of Higher Education to create our new research unit.
Yours Sincerely,
Houcemeddine Turki (he/him)
Medical Student, Faculty of Medicine of Sfax, University of Sfax, Tunisia
GLAM and Education Coordinator, Wikimedia TN User Group
Member, Wiki Project Med
Member, WikiIndaba Steering Committee
Member, Wikimedia and Library User Group Steering Committee
____________________
+21629499418
Hi all,
The next Research Showcase will be live-streamed next Wednesday, October
16, at 9:30 AM PDT/16:30 UTC.
YouTube stream: https://www.youtube.com/watch?v=KZ35weAVlIU
As usual, you can join the conversation on IRC at #wikimedia-research. You
can also watch our past Research Showcases here:
https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase
This month's presentations:
Elections Without Fake: Deploying Real Systems to Counter Misinformation
Campaigns
By Fabrício Benevenuto, Computer Science Department, Universidade Federal
de Minas Gerais (UFMG), Brazil
The political debate and electoral dispute in the online space during the
2018 Brazilian elections were marked by an information war. In order to
mitigate the misinformation problem, we created the project Elections
Without Fake <http://www.eleicoes-sem-fake.dcc.ufmg.br/> and developed a
few technological solutions able to reduce the abuse of misinformation
campaigns in the online space. In particular, we created a system to monitor
public groups on WhatsApp and a system to monitor ads on Facebook. Our
systems proved to be fundamental for fact-checking and investigative
journalism, and are currently being used by over 150 journalists with
editorial lines and various fact-checking agencies.
More info on second talk by Francesca Spezzano to come
--
Janna Layton (she, her)
Administrative Assistant - Product & Technology
Wikimedia Foundation <https://wikimediafoundation.org/>
Hi everybody,
the Analytics team is going to stop HDFS and Yarn services for a
(hopefully) brief time window tomorrow, Tue Oct 15th, from 14:30 to 15:30
CEST.
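
For readers in other time zones, the window above can be converted to UTC with a few lines of Python. This is only an illustrative sketch: the helper name `window_in_utc` is hypothetical, the year 2019 is an assumption (the announcement gives only "Tue Oct 15th"), and CEST is modelled as a fixed UTC+2 offset rather than looked up in a time zone database.

```python
from datetime import datetime, timedelta, timezone

# CEST is UTC+2; a fixed offset avoids depending on a time zone database.
CEST = timezone(timedelta(hours=2), name="CEST")


def window_in_utc(start, end):
    """Convert a (start, end) pair of timezone-aware datetimes to UTC."""
    return start.astimezone(timezone.utc), end.astimezone(timezone.utc)


# Maintenance window from the announcement: Tue Oct 15th, 14:30 to 15:30 CEST.
# The year 2019 is an assumption; the announcement itself does not state it.
start = datetime(2019, 10, 15, 14, 30, tzinfo=CEST)
end = datetime(2019, 10, 15, 15, 30, tzinfo=CEST)

utc_start, utc_end = window_in_utc(start, end)
print(f"{utc_start:%Y-%m-%d %H:%M} to {utc_end:%H:%M} UTC")
```

Under these assumptions the window corresponds to 12:30 to 13:30 UTC.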
We are going to swap the Zookeeper cluster from the one currently used by
all the Kafka production services to a dedicated one within the Analytics
VLAN. More details in https://phabricator.wikimedia.org/T217057
Side effects: the move should be transparent to all users, except the ones
relying on history in Yarn (for example, checking it via yarn.wikimedia.org).
The history is in fact stored in zookeeper, and to keep things simple we
are not copying znodes over to the new cluster. Please let us know if this
impacts you in any way.
As usual, if the time window affects your work please let us know and we'll
find a new maintenance window.
Thanks!
Luca (on behalf of the Analytics team).
Hello researchers,
A lot of research on Wikipedia is published in English and also uses the
English Wikipedia as its source of data, or recruits its participants
via English Wikipedia [0].
A frequent criticism I meet when discussing such research with non-en.wp
community members is that their Wikipedia is different and that the results
of en.wp-based research are problematic/incomparable/totally useless.
So I want to ask:
- Do you know of research comparing different Wikis, preferably across
language versions? [1]
- How would you deal with such criticism, particularly of the "if it is not
about 'my' wp it is useless"-kind [2]?
Kind Regards,
Jan
____
[0] Plausible given that academic fields, particularly Computer Science,
publish mainly in English, given en.wp's size, and given that the WMF as an
actor is US-based.
[1] I know of »revisiting "The Rise and Decline" in a Population of Peer
Production Projects« (https://dl.acm.org/citation.cfm?id=3173929),
comparing different Wikia-Wikis; Research like "limits of
self-organization" (https://firstmonday.org/article/view/1405/1323) that
refer to general principles of peer production. Comparisons of Wikipedias
across languages and the impact of their different contexts, languages and
regulations would be very interesting to me.
[2] I'm aware that making heterogeneous things comparable is seen as a core
academic/scientific activity in STS research (Law, SL Star, Turnbull…), so I
do not want to claim that transferring results to a different setting is
unproblematic, but it is certainly not "totally useless" either.
--
Jan Dittrich
UX Design/ Research
Wikimedia Deutschland e. V. | Tempelhofer Ufer 23-24 | 10963 Berlin
Tel. (030) 219 158 26-0
https://wikimedia.de
Our vision is a world in which all people can share in, use, and add to the
knowledge of humanity. Help us get there!
https://spenden.wikimedia.de
Wikimedia Deutschland — Gesellschaft zur Förderung Freien Wissens e. V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 B. Recognized as a non-profit by
the Finanzamt für Körperschaften I Berlin, tax number 27/029/42207.
Hello,
Several publications are in preparation... I will get back to you...
Thanks for your interest,
Ludovic BOCKEN
lbocken(a)gmail.com
www.ludovicbocken.com
Skype: ludovic.bocken
http://www.linkedin.com/in/ludovicbocken
2222 Rue Hochelaga,
Montréal, QC H2K 4N8
+1 (514) 649 0755
*Confidentiality notice*
This message, transmitted by fax, is confidential, and its contents may be
protected by professional secrecy. It is intended for the exclusive use of
its recipient. Any other person is hereby notified that it is strictly
prohibited to disseminate, distribute, or reproduce it. If the recipient
cannot be reached or is unknown to you, please inform the sender
immediately and destroy this message and any copy of it.
On Sat, 28 Sep 2019 at 08:00, <wiki-research-l-request(a)lists.wikimedia.org>
wrote:
> Message: 1
> Date: Fri, 27 Sep 2019 07:47:00 -0700
> From: Morten Wang <nettrom(a)gmail.com>
> To: Research into Wikimedia content and communities
> <wiki-research-l(a)lists.wikimedia.org>
> Subject: Re: [Wiki-research-l] Standardization of Wikipedia articles
> according to the lexical constancy of their introductions and body
> texts
> Message-ID:
> <CALkHm5AvzdMzxCLM2T=ZWVFwdRLWAZH0+N6dLpym-nN09d=
> 6rQ(a)mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> Hi Ludovic,
>
> This work sounds interesting, I'm looking forward to learning more about it
> as your papers come out!
>
> I read through the post on LinkedIn and from how I interpret it you are
> only looking at two quality classes (Featured Articles vs other articles).
> This seems somewhat odd to me and I'd like to know more about why? The
> current trend when it comes to predicting article quality in the English
> Wikipedia does not limit the prediction problem to just FAs vs the rest,
> instead it's using the whole quality scale[1]. See the list below for some
> papers along this line of research.
>
> I'm also really curious about what "standardize the cognitive accessibility
> of Wikipedia" means? That might mean more than just "article quality",
> hence why I'm asking.
>
> All that being said, I think the approach sounds interesting and probably
> adds some signal, so I'm curious to learn more how it works and performs.
>
> References:
>
> - Warncke-Wang, M., Cosley, D., & Riedl, J. Tell me more: an actionable
> quality model for Wikipedia. OpenSym/WikiSym 2013. [We argue that
> metadata isn't useful because contributors can't change it]
> - Warncke-Wang, M., Ayukaev, V. R., Hecht, B., & Terveen, L. G. The
> success and failure of quality improvement projects in peer production
> communities. CSCW 2015. [See the Appendix for details of the improved
> model and how to get good training data]
> - https://www.mediawiki.org/wiki/ORES builds upon the 2015 paper and is
> a readily accessible API, reference datasets are available on figshare
> <https://figshare.com/articles/English_Wikipedia_Quality_Asssessment_Dataset…>
> and also in the GitHub repository
> <https://github.com/wikimedia/articlequality>. Now the benchmark to
> compare against, as in the three other papers listed below.
> - Dang, Q. V., & Ignat, C. L. Measuring quality of collaboratively
> edited documents: the case of Wikipedia. CIC 2016. [Shows that adding
> readability features can improve predictions]
> - Dang, Q. V., & Ignat, C. L. An end-to-end learning solution for
> assessing the quality of Wikipedia articles. OpenSym 2017. [Shows the
> performance of RNNs, also contains an important discussion of
> performance, interpretability, etc]
>
> I also came across this recent paper by Schmidt and Zangerle that reports
> significant improvements, but haven't yet had the time to read the paper
> closely:
>
> - Schmidt, M., & Zangerle, E. Article quality classification on
> Wikipedia: introducing document embeddings and content features. OpenSym
> 2019.
>
> Footnotes:
>
> 1. Typically without A-class articles due to how few of them there are.
>
>
> Cheers,
> Morten
>
Hello,
I am finishing my PhDs and I think that you could be interested in my latest
major work on the quality of Wikipedia:
https://www.linkedin.com/pulse/standardization-wikipedia-articles-according…
and in a future collaboration.
I would be very grateful for your feedback! Several publications are in
preparation... Let me know if you are interested in following this thread...
Have a nice week,
Ludovic BOCKEN
lbocken(a)gmail.com
www.ludovicbocken.com
Skype: ludovic.bocken
http://www.linkedin.com/in/ludovicbocken
2222 Rue Hochelaga,
Montréal, QC H2K 4N8
+1 (514) 649 0755
Hi everybody,
stat1005 was replaced almost a year ago by stat1007 to allow GPU research
and testing (https://phabricator.wikimedia.org/T148843). After a
long journey we are happy to add stat1005 back in the pool of available
Analytics client hosts. I have updated the documentation in:
https://wikitech.wikimedia.org/wiki/Stat1005
https://wikitech.wikimedia.org/wiki/Analytics/Data_access#Analytics_clients
The host is now a Hadoop client like stat1004, and also offers an AMD GPU
(https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/AMD_GPU). For
the moment we are limiting access to the GPU to people who explicitly
need it, since it is still a testing environment and we'd like to give some
priority to people who already have projects relying on it. The end goal
is to give access to the GPU by default to everybody, so stay tuned. If you
wish to participate in the testing efforts, please reach out to the
Analytics team!
(https://wikitech.wikimedia.org/wiki/Analytics/Data_access#GPU_usage).
Last but not least: stat1005 is running Debian 10 (Buster) with
openjdk-8 instead of the version shipped by Debian (openjdk-11), since the
Hadoop cluster is not ready to migrate yet. Everything seems to be running
fine in our tests, but please report to us anything that looks strange.
Thanks in advance!
Luca (on behalf of the Analytics team)
Dear all,
I thank you for your efforts. As requested by many interested scientists, I have put the full text of the work on Zenodo. It is currently available at https://zenodo.org/record/3461198.
Yours Sincerely,
Houcemeddine Turki (he/him)
Medical Student, Faculty of Medicine of Sfax, University of Sfax, Tunisia
GLAM and Education Coordinator, Wikimedia TN User Group
Member, Wiki Project Med
Member, WikiIndaba Steering Committee
Member, Wikimedia and Library User Group Steering Committee
____________________
+21629499418
-------- Original message --------
From: Houcemeddine Turki <turkiabdelwaheb(a)hotmail.fr>
Date: 2019/09/25 13:12 (GMT+01:00)
To: wiki-research-l(a)lists.wikimedia.org, Paladox via Wikitech-l <wikitech-l(a)lists.wikimedia.org>, wikidata(a)lists.wikimedia.org, wikimedia-medicine(a)lists.wikimedia.org
Subject: Our research paper about Wikidata and Health has been published
Dear all,
I thank you for your efforts. I am honoured to inform you that our latest research paper about Wikidata and Health has been published in the Journal of Biomedical Informatics (IF=2.9). The paper is available at https://doi.org/10.1016/j.jbi.2019.103292. It is the first in our series of research publications about the Medical Wikidata project. If you would like to have the paper, please contact me and I will send you the PDF. Our next papers will work on finding methods to improve the coverage and quality of medical information in Wikidata. We are quite sure that our work is important, as it will provide trustworthy reference medical information that can be used by physicians and computer programs to process medical data and enhance the efficiency of health care.
Yours Sincerely,
Houcemeddine Turki (he/him)
Medical Student, Faculty of Medicine of Sfax, University of Sfax, Tunisia
GLAM, Research and Education Coordinator, Wikimedia TN User Group
Member, Wiki Project Med
Member, WikiIndaba Steering Committee
Member, Wikimedia and Library User Group Steering Committee
____________________
+21629499418
The August 2019 issue of the Wikimedia Research Newsletter is out:
https://meta.wikimedia.org/wiki/Research:Newsletter/2019/August
22 recent publications about Wikipedia's gender gaps and potential gender biases, summarized and reviewed in the latest edition of our monthly research newsletter.
In this issue:
1 Female and nonwhite US sociologists less likely to have Wikipedia articles than scholars of similar citation impact
2 "Mapping and Bridging the Gender Gap: An Ethnographic Study of Indian Wikipedians and Their Motivations to Contribute"
3 "Investigating the Gender Pronoun Gap in Wikipedia"
4 Safety and women editors on Wikipedia
5 "Hacking History: Redressing Gender Inequities on Wikipedia Through an Editathon"
6 "'(Weitergeleitet von Journalistin)': The Gendered Presentation of Professions on Wikipedia"
7 "Striking result": No bias against contributions by female editors in quality assessment
8 Other recent publications
8.1 "Unexpected forms of bias" on Wikipedia favor female over male CEOs
8.2 English Wikipedia biased against conservative and female topics, at least when compared to US magazines
8.3 "Gender and deletion on Wikipedia"
8.4 "Deleted gender wars"
8.5 "Gender gap through time and space: A journey through Wikipedia biographies via the Wikidata Human Gender Indicator"
8.6 Special issue of the "Nordic journal for information science and dissemination of culture"
8.7 "Breastfeeding, Authority, and Genre: Women's Ethos in Wikipedia and Blogs"
8.8 "Cyberfeminism on Wikipedia: Visibility and deliberation in feminist Wikiprojects"
8.9 "Women and Wikipedia. Diversifying Editors and Enhancing Content through Library Edit-a-Thons"
8.10 "Similar Gaps, Different Origins? Women Readers and Editors at Greek Wikipedia"
8.11 "Writing Women in Mathematics into Wikipedia"
8.12 "How do students trust Wikipedia? An examination across genders"
8.13 "Breaking the glass ceiling on Wikipedia"
8.14 Using Wikipedia for "Analyzing Gender Stereotyping in Bollywood Movies"
8.15 Contrary to expectations, "no evidence of discrimination of female users based on their usernames"
8.16 Some informative non-research overview publications
*** 23 recent publications were covered or listed in this issue ***
Masssly and Tilman Bayer
---
Wikimedia Research Newsletter
https://meta.wikimedia.org/wiki/Research:Newsletter/
* Follow us on Twitter: @WikiResearch
* Like us on Facebook: Facebook.com/WikiResearch/
* Receive this newsletter by mail: Research-newsletter Mailing List - Wikimedia
* Subscribe to the RSS feed:
http://blog.wikimedia.org/c/research-2/wikimedia-research-newsletter/feed