Ryan Shaw wrote:
Hello,
I am trying to get the earliest revision dates for a set of approximately 4,000,000 Wikipedia articles. Looking at the MediaWiki API, it seems that the only way to get information on earliest revisions is to query one article at a time. For 4,000,000 articles, this will take far too long...
So, my question is: is there any way to query for first revisions in batches? Alternatively, is there some other source of this information, such as data dumps? Since I am only looking for the timestamp of the first revision of each article, I'd like to avoid downloading complete histories for each article.
Thanks, Ryan
You can download the stub dumps. They contain metadata for all revisions, but no revision text.
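
For what it's worth, here is a rough sketch of how you might pull the first-revision timestamp per page out of a stub dump without loading the whole file into memory. It assumes the usual export layout (a <page> element holding a <title> and one or more <revision> elements, each with a <timestamp>) and that you pass it a gzipped stub history file; the file name in the usage comment is only an example, and earliest_revisions() is a name I made up. Treat it as a starting point, not a tested tool.

```python
import gzip
import sys
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the XML namespace, e.g. '{ns}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def earliest_revisions(stream):
    """Yield (title, earliest_timestamp) for each <page> in the stream.

    Assumes timestamps are ISO 8601 strings in UTC in one fixed format,
    so plain string comparison orders them chronologically.
    """
    context = ET.iterparse(stream, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    title = None
    earliest = None
    for event, elem in context:
        if event != "end":
            continue
        name = local(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "revision":
            for child in elem:
                if local(child.tag) == "timestamp":
                    if earliest is None or child.text < earliest:
                        earliest = child.text
        elif name == "page":
            yield title, earliest
            title, earliest = None, None
            root.clear()  # drop finished pages so memory stays flat


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example (hypothetical file name):
    #   python first_revs.py enwiki-latest-stub-meta-history.xml.gz
    with gzip.open(sys.argv[1], "rb") as f:
        for page_title, timestamp in earliest_revisions(f):
            print(f"{page_title}\t{timestamp}")
```

It takes the minimum over all revisions of a page instead of just the first one listed, since I would not rely on the revisions being in chronological order in the dump.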