Thank you, your answer helped me to find the next step.
Thomas
On 09.08.20 at 22:12, AntiCompositeNumber wrote:
This is not the best list for this.
<https://www.mediawiki.org/wiki/Mailing_lists/Wikitech-l> would be
better.
If you want to download articles so that you can read them, my
current recommendation is to use <https://www.kiwix.org/en/>. If
you want to do something else, a data dump from
<https://dumps.wikimedia.org/> may be a better idea.
If neither of those options is what you are looking for, you can find
the list of pages by number of views at
<https://pageviews.toolforge.org/topviews/?project=en.wikipedia.org&platform=all-access&date=last-year&excludes=>.
You should limit your requests to a reasonable rate, typically no more
than 60-75 requests/minute (but a slower rate if possible is
appreciated). Make sure to also comply with
<https://meta.wikimedia.org/wiki/User-Agent_policy> so we can yell at
you specifically if there's a problem.
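
As a rough illustration of the two points above (a capped request rate and
an identifying User-Agent), here is a minimal Python sketch. The tool name,
the contact details in USER_AGENT, the article URL pattern and all function
names are assumptions made up for this example; adjust them to your own
setup and check them against the User-Agent policy before use.

```python
import time
import urllib.parse
import urllib.request

# Assumption: placeholder tool name and contact details. Replace them with
# your own so that operators can reach you if there is a problem.
USER_AGENT = "ExampleArticleSync/0.1 (https://example.org/contact; user@example.org)"

# Assumption: 60 requests/minute, the lower end of the range quoted above.
REQUESTS_PER_MINUTE = 60


def min_interval(requests_per_minute):
    """Seconds between request starts needed to stay at or below the rate."""
    return 60.0 / requests_per_minute


def build_request(title, base="https://en.wikipedia.org/wiki/"):
    """Build a GET request for one article with a descriptive User-Agent.

    Assumption: articles are fetched from the regular article URL.
    """
    url = base + urllib.parse.quote(title.replace(" ", "_"), safe="")
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def throttled(items, interval, clock=time.monotonic, sleep=time.sleep):
    """Yield items no faster than one per `interval` seconds."""
    next_allowed = clock()
    for item in items:
        now = clock()
        if now < next_allowed:
            sleep(next_allowed - now)
        # Schedule the next slot relative to when this one was released.
        next_allowed = max(now, next_allowed) + interval
        yield item


def fetch_all(titles):
    """Fetch titles one at a time (no parallel requests), rate limited."""
    interval = min_interval(REQUESTS_PER_MINUTE)
    for title in throttled(titles, interval):
        with urllib.request.urlopen(build_request(title), timeout=30) as resp:
            yield title, resp.read()
```

Usage would look like `for title, html in fetch_all(["Albert Einstein"]): ...`.
At 60 requests/minute, 100,000 articles take roughly 28 hours, which is one
more reason to consider Kiwix or a dump first.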
AntiCompositeNumber
On Sun, Aug 9, 2020 at 3:52 PM Thomas Güttler Lists
<guettliml(a)thomas-guettler.de> wrote:
Hi,
I would like to sync the 100k most popular articles to my local machine.
Maybe I overlooked it, but I could not find suitable documentation about this.
A list of the 100k most popular articles could be a starting point.
With that list, I could download the articles with 100k HTTP calls.
But before I do this, I would like to ask you about the correct way to do
this.
I don't want you to think I am running a DoS attack :-)
If there is a better place for this question, please tell me!
Regards,
Thomas Güttler
--
Thomas Guettler
http://www.thomas-guettler.de/
I am looking for feedback:
https://github.com/guettli/programming-guidelines
_______________________________________________
Wikimedia Cloud Services mailing list
Cloud(a)lists.wikimedia.org (formerly labs-l(a)lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud