Following on from my previous posts about trying to classify the scope and coverage of humanities subjects in Wikipedia, I have a practical question: is it possible to query the Wikipedia database in such a way as to get a list of all articles (current version)? Even better would be a second, larger list that indexes each article against the categories it belongs to. Example:
List 1
Name, ID
Thomas Aquinas, 1
William of Ockham, 2

List 2
ID, category
1, 1225 births
1, 1274 deaths
[...]
2, 1285 births
2, 1347 deaths
2, 13th century philosophers
and so on. I appreciate that the second list may be up to 20 times the size of the first, thus 60 million rows (assuming roughly 3 million articles). Perhaps there is a way to limit the number of categories; I don't know.
This would allow me to see exactly what is there under the humanities. My hunch, from using the random article function, is that most articles in Wikipedia are obscure stubs, and that the coverage of humanities subjects, and possibly of other areas, is actually no different from that of a conventional encyclopedia.
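To make the two-list idea concrete, here is a minimal sketch in Python of how such lists could be joined once they exist. It loads the example rows above into an in-memory SQLite database and answers two questions: which articles sit in a given category, and how many category rows each article has. The table names `articles` and `article_categories`, and the helper functions, are hypothetical and chosen only for illustration; they are not the actual Wikipedia database schema, and the rows elided by "[...]" are simply left out.

```python
import sqlite3

# Toy stand-ins for List 1 and List 2 from the post.
# Table and column names are hypothetical, not Wikipedia's real schema.
ARTICLES = [(1, "Thomas Aquinas"), (2, "William of Ockham")]
ARTICLE_CATEGORIES = [
    (1, "1225 births"),
    (1, "1274 deaths"),
    # rows elided as "[...]" in the post are not reconstructed here
    (2, "1285 births"),
    (2, "1347 deaths"),
    (2, "13th century philosophers"),
]


def build_db():
    """Load both lists into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE article_categories (article_id INTEGER, category TEXT)"
    )
    conn.executemany("INSERT INTO articles VALUES (?, ?)", ARTICLES)
    conn.executemany(
        "INSERT INTO article_categories VALUES (?, ?)", ARTICLE_CATEGORIES
    )
    return conn


def articles_in(conn, category):
    """Return the names of articles tagged with the category, sorted by name."""
    rows = conn.execute(
        "SELECT a.name FROM articles AS a "
        "JOIN article_categories AS c ON c.article_id = a.id "
        "WHERE c.category = ? ORDER BY a.name",
        (category,),
    )
    return [name for (name,) in rows]


def categories_per_article(conn):
    """Return a dict mapping each article name to its number of category rows."""
    rows = conn.execute(
        "SELECT a.name, COUNT(c.category) FROM articles AS a "
        "LEFT JOIN article_categories AS c ON c.article_id = a.id "
        "GROUP BY a.id ORDER BY a.name"
    )
    return dict(rows)


if __name__ == "__main__":
    conn = build_db()
    print(articles_in(conn, "13th century philosophers"))
    print(categories_per_article(conn))
```

Limiting the number of categories, as wondered above, would in this sketch amount to adding a `WHERE c.category IN (...)` filter before the join, so that only rows for the categories of interest are ever loaded.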
On 20.09.2010 21:19, Peter Damian wrote:
Following on from my previous posts about trying to classify the scope and coverage of humanities subjects in Wikipedia, I have a practical question: is it possible to query the Wikipedia database in such a way as to get a list of all articles (current version)? Even better, with a second, larger list that indexes each article with a list of categories it belongs to.
This is not a foundation issue. Please reserve this list for global affairs of the WMF. Go to the enWP-list with your local stuff.
Ciao Henning (deWP)
---- Original Message ----- From: "Henning Schlottmann" h.schlottmann@gmx.net To: foundation-l@lists.wikimedia.org Sent: Monday, September 20, 2010 8:26 PM Subject: Re: [Foundation-l] Classifying what is on Wikipedia
This is not a foundation issue. Please reserve this list for global affairs of the WMF. Go to the enWP-list with your local stuff.
I have just explained in the previous thread why it is a foundation issue. It affects all Wikipedia projects equally.
Hoi, That may be, but such practical things are local issues. Thanks, GerardM
On 20 September 2010 21:43, Peter Damian peter.damian@btinternet.com wrote:
I have just explained in the previous thread why it is a foundation issue. It affects all Wikipedia projects equally.
foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Questions that affect all Wikipedias are suitable for wikipedia-l. Research and data-related questions such as this might also be appropriate for the research list.
Peter D - this is a fascinating question, but this thread may be more suited to the research list until you find the technical answers you are looking for.
SJ
On Mon, Sep 20, 2010 at 4:22 PM, Gerard Meijssen gerard.meijssen@gmail.com wrote:
Hoi, That may be, but such practical things are local issues. Thanks, GerardM
wikimedia-l@lists.wikimedia.org