Ryan Shaw wrote:
Hello,
I am trying to get the earliest revision dates for a set of approximately
4,000,000 Wikipedia articles. Looking at the MediaWiki API, it seems
that the only way to get information on earliest revisions is to query
on one article at a time. For 4,000,000 articles, this will take far
too long...
So, my question is: is there any way to query for first revisions in
batches? Alternatively, is there some other source of this
information, such as data dumps? Since I am only looking for the
timestamps of the first revision of each article, I'd like to avoid
downloading complete histories for each article.
Thanks,
Ryan
You can download the stub dumps. They contain metadata for all
revisions, but no revision text.
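As a rough sketch of how the stub dump could be reduced to one timestamp per article: the code below streams the XML and keeps the smallest revision timestamp seen for each page. It assumes the usual MediaWiki export layout (page elements containing a title and one or more revision elements, each with a timestamp) and UTC timestamps in the fixed-width form 2001-01-15T12:00:00Z, which sort correctly as plain strings. The function name earliest_revisions is made up for illustration.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace, e.g. '{http://...}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def earliest_revisions(stream):
    """Yield (title, earliest_timestamp) for each <page> in a stub dump.

    `stream` is a binary file-like object holding MediaWiki export XML.
    Assumes fixed-width UTC timestamps, so string comparison matches
    chronological order.
    """
    title = None
    earliest = None
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "timestamp":
            # Take the minimum rather than trusting revision order.
            if earliest is None or elem.text < earliest:
                earliest = elem.text
        elif name == "revision":
            elem.clear()  # drop revision metadata we no longer need
        elif name == "page":
            yield title, earliest
            title, earliest = None, None
            elem.clear()  # free the page's children before moving on
```

To run this on a real dump, open the compressed file with gzip.open(path, "rb") and pass the result to earliest_revisions; the path is whatever stub history file you downloaded. Memory use stays small because each page is cleared once it has been reported, although for a full-size dump you may also want to discard the emptied page elements from the root.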