Re: [Wiki-research-l] Identifying Wikipedia stubs in various languages

20 Sep 2016


en:WP:DYK has a measure of 1,500+ characters of prose, which is a useful
cutoff. There is weaponised JavaScript to measure that at en:WP:Did you
know/DYKcheck.
This probably doesn't translate to CJK languages, which have radically different
information content per character.
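For illustration, a length cutoff of this kind could be sketched roughly as below. This is not the DYKcheck script; it assumes the prose has already been extracted as plain text (markup, references, and tables stripped), and the function names and the use of 1,500 as a stub threshold are assumptions made for the example.

```python
import re

# Assumption: reuse the 1,500-character DYK prose figure as a stub cutoff.
PROSE_THRESHOLD = 1500


def prose_length(plain_text: str) -> int:
    """Count characters of prose after collapsing runs of whitespace."""
    return len(re.sub(r"\s+", " ", plain_text).strip())


def looks_like_stub(plain_text: str, threshold: int = PROSE_THRESHOLD) -> bool:
    """Return True when the prose is shorter than the threshold."""
    return prose_length(plain_text) < threshold
```

The threshold is a parameter because, as noted, a single character count is unlikely to carry over to CJK languages; it would have to be chosen per language.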
cheers
stuart
--
...let us be heard from red core to black sky
On Tue, Sep 20, 2016 at 9:26 PM, Robert West <west@cs.stanford.edu> wrote:
...
Hi everyone,
Does anyone know if there's a straightforward (ideally
language-independent) way of identifying stub articles in Wikipedia?
Whatever works is ok, whether it's publicly available data or data
accessible only on the WMF cluster.
I've found lists for various languages (e.g., Italian
https://it.wikipedia.org/wiki/Categoria:Stub or English
https://en.wikipedia.org/wiki/Category:All_stub_articles), but the
lists are in different formats, so separate code is required for each
language, which doesn't scale.
I guess in the worst case, I'll have to grep for the respective stub
templates in the respective wikitext dumps, but even this requires knowing
for each language what the respective template is. So if anyone could point
me to a list of stub templates in different languages, that would also be
appreciated.
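As a rough sketch of that worst-case approach, the check could look like the following. The only pattern filled in is for English, assuming the common convention that stub template names end in "stub" (e.g. {{France-geo-stub}}); the `STUB_PATTERNS` table and `has_stub_template` are hypothetical names, and entries for other languages would still have to be supplied by hand.

```python
import re

# Hypothetical per-language table. Only "en" is filled in, assuming English
# stub template names end in "stub"; other languages need their own pattern.
STUB_PATTERNS = {
    "en": re.compile(r"\{\{\s*[^{}|]*stub\s*(?:\|[^{}]*)?\}\}", re.IGNORECASE),
}


def has_stub_template(wikitext: str, lang: str) -> bool:
    """Return True if the wikitext contains a stub template for this language."""
    pattern = STUB_PATTERNS[lang]  # raises KeyError for languages not yet listed
    return pattern.search(wikitext) is not None
```

Keeping the patterns in one table means the scanning code stays the same across languages, and only the table grows.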
Thanks!
Bob
--
Up for a little language game? -- http://www.unfun.me

Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
