New subject: Identifying Wikipedia stubs in various languages

20 Sep 2016

Hi everyone,

Does anyone know if there's a straightforward (ideally
language-independent) way of identifying stub articles in Wikipedia?

Whatever works is ok, whether it's publicly available data or data
accessible only on the WMF cluster.

I've found lists for various languages (e.g., Italian
<https://it.wikipedia.org/wiki/Categoria:Stub> or English
<https://en.wikipedia.org/wiki/Category:All_stub_articles>), but the lists
are in different formats, so separate code is required for each language,
which doesn't scale.

I guess in the worst case, I'll have to grep for the stub templates in
each language's wikitext dump, but even this requires knowing what the
stub template is called in each language. So if anyone could point me to
a list of stub templates in different languages, that would also be
appreciated.
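
To make the worst case concrete, here is a minimal Python sketch of the
per-language template matching I have in mind. Everything in it is an
assumption for illustration: the names STUB_PATTERNS and is_stub are
hypothetical, and the two patterns are guesses (English stub templates
ending in "stub", e.g. {{France-stub}}, and Italian using {{S|...}}), not
a verified list. That missing list is exactly what I'm asking for.

```python
import re

# Hypothetical per-language stub-template patterns. These are assumptions
# for illustration, not a verified list of stub templates.
STUB_PATTERNS = {
    # Assumed: English stub templates end in "stub", e.g. {{stub}} or
    # {{France-stub}}, optionally followed by parameters.
    "en": re.compile(r"\{\{[^{}|]*\bstub\s*(\||\}\})", re.IGNORECASE),
    # Assumed: Italian marks stubs with {{S|topic}}.
    "it": re.compile(r"\{\{\s*[Ss]\s*(\||\}\})"),
}


def is_stub(wikitext: str, lang: str) -> bool:
    """Return True if the wikitext contains a stub template for `lang`."""
    pattern = STUB_PATTERNS.get(lang)
    if pattern is None:
        # No known pattern: this is the part that doesn't scale.
        raise KeyError(f"no stub pattern known for language {lang!r}")
    return pattern.search(wikitext) is not None
```

Every new language needs its own hand-written entry in STUB_PATTERNS,
which is why I'd much prefer something language-independent.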

Thanks!
Bob

-- 
Up for a little language game? -- http://www.unfun.me