Hi,
I would like to download all articles about cities and then do some
machine learning fun with the text.
For example:
https://en.wikipedia.org/wiki/New_York_City
https://en.wikipedia.org/wiki/Rio_de_Janeiro
...
Do you have an idea how I can get a list of articles which are about a city?
Regards,
Thomas Güttler
On Wed, Aug 12, 2020 at 9:18 PM Thomas Güttler Lists <guettliml@thomas-guettler.de> wrote:
> Do you have an idea how I can get a list of articles which are about a city?
Since this would be a huge list anyway, you probably don't really want a list of _all_ of them, and there are many ways to approach it. For a start, you can try https://w.wiki/ZQd as an example.
-- [[cs:User:Mormegil | Petr Kadlec]]
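For readers who want to script this rather than use the web interface, here is a minimal Python sketch of one possible query against the Wikidata Query Service. It is not the query behind the short link above (its contents are not shown in this thread). It assumes that "city" means items that are an instance of Q515 or one of its subclasses, and that only English Wikipedia articles are wanted. The function names and the User-Agent string are made up for illustration, and a query like this may time out without a small LIMIT.

```python
import json
import urllib.parse
import urllib.request

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_city_query(limit=100):
    """Build a SPARQL query for cities that have an English Wikipedia article.

    Assumption: "city" = instance of (P31) city (Q515) or any subclass (P279).
    """
    return f"""
SELECT ?city ?article WHERE {{
  ?city wdt:P31/wdt:P279* wd:Q515 .
  ?article schema:about ?city ;
           schema:isPartOf <https://en.wikipedia.org/> .
}}
LIMIT {int(limit)}
""".strip()


def build_request_url(query):
    """Return the GET URL that asks the endpoint for JSON results."""
    params = {"query": query, "format": "json"}
    return WDQS_ENDPOINT + "?" + urllib.parse.urlencode(params)


def fetch_city_articles(limit=100):
    """Run the query and return a list of article URLs (needs network access)."""
    request = urllib.request.Request(
        build_request_url(build_city_query(limit)),
        # Hypothetical User-Agent; replace with your own contact details.
        headers={"User-Agent": "city-list-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        data = json.load(response)
    return [row["article"]["value"] for row in data["results"]["bindings"]]
```

Calling fetch_city_articles(10) should return up to ten article URLs. Raising the limit trades completeness against the risk of a timeout.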
Dear all,
I thank you for your efforts. I am sorry for my previous email message, which was sent by mistake. The SPARQL query you liked is https://w.wiki/ZT2. For any question concerning Wikidata support, I invite you to join the Wikidata Telegram Group. You will find advanced users there who can answer any Wikidata question.
Yours sincerely,
Houcemeddine Turki
Another way to do it is:
Start at https://en.wikipedia.org/wiki/Category:Cities
Click through the category tree, for example from there to "Category:Cities by country", then "Category:Cities in the United States" -> "Category:Cities in the United States by county" -> and so on, until you reach a leaf category such as "Category:Cities in Pemiscot County, Missouri" that contains actual pages instead of more subcategories.
Now go to:
https://en.wikipedia.org/wiki/Special:Export
and paste the Category:... name into the form field "Add pages from category:". Click "Add" and then "Export".
Now you have a single XML file with the contents of the articles of these cities.
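The manual steps above can also be approximated in a script. The Python sketch below does not submit the Special:Export form. It swaps in the MediaWiki Action API instead, using the categorymembers generator together with the export and exportnowrap parameters, which as far as I know returns the same kind of XML dump. It fetches only the direct article members of one category (at most 500, with no continuation and no subcategories). The function names and the User-Agent string are made up for illustration.

```python
import urllib.parse
import urllib.request

# Action API endpoint of the English Wikipedia.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_export_url(category, limit=500):
    """Return an API URL that exports the articles directly in one category."""
    if not category.startswith("Category:"):
        category = "Category:" + category
    params = {
        "action": "query",
        "generator": "categorymembers",
        "gcmtitle": category,
        "gcmnamespace": 0,   # articles only, skip subcategories and files
        "gcmlimit": limit,   # 500 is the usual maximum for normal users
        "export": 1,         # include an XML export of the pages
        "exportnowrap": 1,   # return the bare export XML, not wrapped in a result
    }
    return API_ENDPOINT + "?" + urllib.parse.urlencode(params)


def download_category(category, out_path):
    """Save the XML export for one leaf category to out_path (needs network)."""
    request = urllib.request.Request(
        build_export_url(category),
        # Hypothetical User-Agent; replace with your own contact details.
        headers={"User-Agent": "city-export-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        with open(out_path, "wb") as handle:
            handle.write(response.read())
```

For example, download_category("Cities in Pemiscot County, Missouri", "pemiscot.xml") would write one XML file for that leaf category. Covering the whole tree would mean repeating this for every leaf category found while walking the subcategories.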
https://en.wikipedia.org/wiki/Special:Export
-> "Add pages from category:": enter 'Cities' and click the 'Add' button.