Hi Ostapuk Natalia,
Thanks for writing the list. There may be some alternative approaches - I'll follow up with you off list.
-Adam
On Mon, Oct 24, 2016 at 4:11 AM, Остапук Наталья <nataxane@yandex-team.ru> wrote:
Hello,
My name is Natalia, I'm a software developer at Yandex (https://www.yandex.ru/). My team is building a large database of real-world objects, their attributes, and the different types of relationships between them.
Wikipedia is one of our main data sources. Until recently we downloaded data from it semi-manually, but now we want to do it automatically with our bots. This works fine for everything we need except API pages, since robots.txt states: "Friendly, low-speed bots are welcome viewing article pages, but not dynamically-generated pages please."
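For reference, that rule can be checked programmatically. Below is a minimal sketch using Python's standard urllib.robotparser; the user-agent string is a placeholder, not a real bot name, and the exact results depend on the live robots.txt:

    # Minimal sketch: check what Wikipedia's robots.txt permits for a given bot.
    # "ExampleYandexBot" is a placeholder user-agent, not a real bot name.
    from urllib import robotparser

    rp = robotparser.RobotFileParser()
    rp.set_url("https://en.wikipedia.org/robots.txt")
    rp.read()  # fetch and parse the live robots.txt

    agent = "ExampleYandexBot"
    # Article pages are welcome for friendly, low-speed bots...
    print(rp.can_fetch(agent, "https://en.wikipedia.org/wiki/Yandex"))
    # ...while dynamically-generated pages such as api.php are expected
    # to be disallowed under the quoted policy.
    print(rp.can_fetch(agent, "https://en.wikipedia.org/w/api.php?action=query"))
    # Crawl-delay for this agent, if the file specifies one (Python 3.6+).
    print(rp.crawl_delay(agent))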
Our bot honors Crawl-delay, so it won't query you more often than you allow. Moreover, we plan to make no more than 20-30 requests per 5 minutes.
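For concreteness, pacing a client at 30 requests per 5 minutes means one request every 10 seconds. A hypothetical Python sketch of such a limiter, using the requests library (the User-Agent string, contact address, and example query are all placeholders):

    # Hypothetical sketch: stay under 30 requests per 5 minutes (one request
    # every 10 seconds) against the MediaWiki API. Names are illustrative.
    import time
    import requests

    API = "https://en.wikipedia.org/w/api.php"
    HEADERS = {"User-Agent": "ExampleYandexBot/1.0 (contact@example.com)"}
    MIN_INTERVAL = 10.0  # seconds between requests -> 30 requests / 5 min

    _last_request = 0.0

    def throttled_get(params):
        """Issue one API request, sleeping first to keep requests spaced out."""
        global _last_request
        wait = MIN_INTERVAL - (time.monotonic() - _last_request)
        if wait > 0:
            time.sleep(wait)
        _last_request = time.monotonic()
        resp = requests.get(API, params=params, headers=HEADERS, timeout=30)
        resp.raise_for_status()
        return resp.json()

    # Example: fetch basic page info for one article.
    data = throttled_get({"action": "query", "titles": "Yandex",
                          "prop": "info", "format": "json"})
    print(data["query"]["pages"])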
Is there some way to add an exception to robots.txt for our bot?
Thank you for your time.
Best regards, Ostapuk Natalia
Mediawiki-api mailing list
Mediawiki-api@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-api