Hello,
Several publications are in preparation... I will get back to you...
Thanks for your interest,
Ludovic BOCKEN lbocken@gmail.com www.ludovicbocken.com Skype: ludovic.bocken http://www.linkedin.com/in/ludovicbocken 2222 Rue Hochelaga, Montréal, QC H2K 4N8 +1 (514) 649 0755
*Confidentiality notice*
This message, transmitted by fax, is confidential, and its content may be protected by professional secrecy. It is for the exclusive use of its recipient. Any other person is hereby notified that disseminating, distributing, or reproducing it is strictly prohibited. If the recipient cannot be reached or is unknown to you, please inform the sender immediately and destroy this message and any copy of it.
On Sat, 28 Sep 2019 at 08:00, wiki-research-l-request@lists.wikimedia.org wrote:
Send Wiki-research-l mailing list submissions to wiki-research-l@lists.wikimedia.org
To subscribe or unsubscribe via the World Wide Web, visit https://lists.wikimedia.org/mailman/listinfo/wiki-research-l or, via email, send a message with subject or body 'help' to wiki-research-l-request@lists.wikimedia.org
You can reach the person managing the list at wiki-research-l-owner@lists.wikimedia.org
When replying, please edit your Subject line so it is more specific than "Re: Contents of Wiki-research-l digest..."
Today's Topics:
- Re: Standardization of Wikipedia articles according to the lexical constancy of their introductions and body texts (Morten Wang)
Message: 1
Date: Fri, 27 Sep 2019 07:47:00 -0700
From: Morten Wang <nettrom@gmail.com>
To: Research into Wikimedia content and communities <wiki-research-l@lists.wikimedia.org>
Subject: Re: [Wiki-research-l] Standardization of Wikipedia articles according to the lexical constancy of their introductions and body texts
Message-ID: <CALkHm5AvzdMzxCLM2T=ZWVFwdRLWAZH0+N6dLpym-nN09d= 6rQ@mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"
Hi Ludovic,
This work sounds interesting, I'm looking forward to learning more about it as your papers come out!
I read through the post on LinkedIn, and as I interpret it you are only looking at two quality classes (Featured Articles vs. all other articles). This seems somewhat odd to me, and I'd like to know more about why. The current trend in predicting article quality in the English Wikipedia does not limit the prediction problem to FAs vs. the rest; instead it uses the whole quality scale[1]. See the list below for some papers along this line of research.
I'm also really curious about what "standardize the cognitive accessibility of Wikipedia" means. That might mean more than just "article quality", which is why I'm asking.
All that being said, I think the approach sounds interesting and probably adds some signal, so I'm curious to learn more about how it works and performs.
References:
- Warncke-Wang, M., Cosley, D., & Riedl, J. Tell me more: an actionable quality model for Wikipedia. OpenSym/WikiSym 2013. [We argue that metadata isn't useful because contributors can't change it]
- Warncke-Wang, M., Ayukaev, V. R., Hecht, B., & Terveen, L. G. The success and failure of quality improvement projects in peer production communities. CSCW 2015. [See the Appendix for details of the improved model and how to get good training data]
- https://www.mediawiki.org/wiki/ORES builds upon the 2015 paper and is a readily accessible API. Reference datasets are available on figshare <https://figshare.com/articles/English_Wikipedia_Quality_Asssessment_Dataset/...> and also in the GitHub repository https://github.com/wikimedia/articlequality. It is now the benchmark to compare against, as in the three other papers listed below.
- Dang, Q. V., & Ignat, C. L. Measuring quality of collaboratively edited documents: the case of Wikipedia. CIC 2016. [Shows that adding readability features can improve predictions]
- Dang, Q. V., & Ignat, C. L. An end-to-end learning solution for assessing the quality of Wikipedia articles. OpenSym 2017. [Shows the performance of RNNs, also contains an important discussion of performance, interpretability, etc.]
I also came across this recent paper by Schmidt and Zangerle that reports significant improvements, but I haven't yet had time to read it closely:
- Schmidt, M., & Zangerle, E. Article quality classification on Wikipedia: introducing document embeddings and content features. OpenSym 2019.
Footnotes:
[1] Typically without A-class articles, because there are so few of them.
Cheers, Morten
On Mon, 23 Sep 2019 at 13:09, Ludovic Bocken lbocken@gmail.com wrote:
Hello,
I am finishing my PhDs and I think you may be interested in my latest major work on the quality of Wikipedia:
https://www.linkedin.com/pulse/standardization-wikipedia-articles-according-...
and in a future collaboration.
I would be very grateful for your feedback! Several publications are in preparation... Let me know if you are interested in following this thread...
Have a nice week,
Ludovic BOCKEN lbocken@gmail.com www.ludovicbocken.com Skype: ludovic.bocken http://www.linkedin.com/in/ludovicbocken 2222 Rue Hochelaga, Montréal, QC H2K 4N8 +1 (514) 649 0755
End of Wiki-research-l Digest, Vol 169, Issue 12