Yes, that's the idea more or less, but I'm not sure that our search
engine is able to search for headings, though I might be wrong. I suspect,
however, that it will be required to process dumps article by article (or at
least a random sample), and in big projects this could be extremely time
consuming. But maybe there's a faster way of which I am not aware?
--
Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
“We're living in pieces,
I want to live in peace.” – T. Moore
2015-07-13 23:41 GMT+03:00 Pine W <wiki.pine(a)gmail.com>:
Would it be possible to run a search on the full text of Wikipedias for
lines that start and end with "==", "===", "====", and lines that start
with ";", then make a list of those strings, and count the number of times
that each title appears in the list?
Pine
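
The counting step described above can be sketched roughly as follows. This
is only an illustration under stated assumptions, not a tested pipeline: the
function name count_section_titles and the idea of passing in an iterable of
raw wikitext strings (one per article) are hypothetical, and reading the
articles out of a dump is left out entirely. The message names heading
levels "==" to "===="; the sketch also accepts deeper levels up to "======".

```python
import re
from collections import Counter

# A line that starts and ends with the same run of 2 to 6 "=" signs,
# e.g. "== History ==" or "===Early life===". Group 2 is the title.
HEADING_RE = re.compile(r"^(={2,6})\s*(.+?)\s*\1\s*$")

# A line that starts with ";". Everything after the ";" is kept as the
# title (assumption: no attempt is made to split off a ": definition").
SEMICOLON_RE = re.compile(r"^;\s*(.+?)\s*$")


def count_section_titles(pages):
    """Count heading-like lines across an iterable of wikitext strings."""
    counts = Counter()
    for text in pages:
        for line in text.splitlines():
            heading = HEADING_RE.match(line)
            if heading:
                counts[heading.group(2)] += 1
                continue
            term = SEMICOLON_RE.match(line)
            if term:
                counts[term.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Two made-up article bodies, for illustration only.
    sample_pages = [
        "== History ==\nSome text.\n=== Early life ===\n;Notes\n",
        "==History==\n== References ==\n",
    ]
    for title, n in count_section_titles(sample_pages).most_common():
        print(n, title)
```

Because the function takes one article at a time, it could be fed either a
full dump or a random sample of articles; Counter.most_common(N) then gives
the N most frequent titles.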
On Jul 13, 2015 10:29 AM, "Jonathan Morgan" <jmorgan(a)wikimedia.org>
wrote:
>
> Cross-posting this request to wiki-research-l. Anyone have data on
> frequently used section titles in articles (any language), or know of
> datasets/publications that examined this?
>
> I'm not aware of any off the top of my head, Amir.
>
> - Jonathan
>
> ---------- Forwarded message ----------
> From: Amir E. Aharoni <amir.aharoni(a)mail.huji.ac.il>
> Date: Sat, Jul 11, 2015 at 3:29 AM
> Subject: [Wikitech-l] statistics about frequent section titles
> To: Wikimedia developers <wikitech-l(a)lists.wikimedia.org>
>
>
> Hi,
>
> Did anybody ever try to collect statistics about frequent section
> titles in Wikimedia projects?
>
> For Wikipedia, for example, titles such as "Biography", "Early life",
> "Bibliography", "External links", "References", "History", etc., appear
> in a lot of articles, and their counterparts appear in a lot of languages.
>
> There are probably similar things in Wikivoyage, Wiktionary and
> possibly other projects.
>
> Did anybody ever try to collect statistics of the most frequent section
> titles in each language and project?
>
> --
> Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
> http://aharoni.wordpress.com
> “We're living in pieces,
> I want to live in peace.” – T. Moore
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
>
>
> --
> Jonathan T. Morgan
> Senior Design Researcher
> Wikimedia Foundation
> User:Jmorgan (WMF)
>
>
> _______________________________________________
> Wiki-research-l mailing list
> Wiki-research-l(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l