Page content in histories is not stored on the toolserver. The toolserver does NOT get a full data dump. That's the main problem here.
Someone with a fast connection should run this query. I haven't been able to convince any such person yet.
- White Cat
On Sat, Apr 5, 2008 at 2:56 AM, Chris Howie cdhowie@gmail.com wrote:
On Fri, Apr 4, 2008 at 7:33 PM, White Cat wikipedia.kawaii.neko@gmail.com wrote:
That may be very difficult. Such a query would be very expensive both CPU-wise and BW-wise.
It could be run over several days, giving the server some time between requests to avoid DoSing it. The list of redirects could be obtained with a simple script using < http://en.wikipedia.org/w/api.php?action=query&list=allpages&apfilte...
as a base, setting the apfrom parameter as necessary. Then < http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles...
for looking at revisions. It would not be a great system, but give it a week or so and you'd have a good chunk of data to look at.
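A minimal sketch of the kind of script described above, in Python. The two URLs in the message are truncated, so every parameter beyond action=query, list=allpages and prop=revisions&titles= is an assumption here: apfilterredir=redirects, aplimit, rvprop, rvlimit and format=json are real MediaWiki API parameters, but not necessarily the ones originally intended. The function names, the User-Agent string and the five-second delay are hypothetical. Rather than setting apfrom by hand, this sketch follows the "continue" block that the API returns with each batch, which serves the same purpose.

```python
import json
import time
import urllib.parse
import urllib.request

API = "http://en.wikipedia.org/w/api.php"


def build_url(params):
    """Return the API URL with the given query parameters encoded."""
    return API + "?" + urllib.parse.urlencode(params)


def fetch_json(url):
    """Fetch a URL and decode the JSON response (User-Agent is hypothetical)."""
    req = urllib.request.Request(
        url, headers={"User-Agent": "redirect-survey-sketch/0.1"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def iter_redirects(fetch=fetch_json, delay=5.0):
    """Yield redirect titles batch by batch, pausing between requests."""
    params = {
        "action": "query",
        "list": "allpages",
        "apfilterredir": "redirects",  # assumed filter: redirects only
        "aplimit": "500",              # assumed batch size
        "format": "json",
    }
    while True:
        data = fetch(build_url(params))
        for page in data["query"]["allpages"]:
            yield page["title"]
        if "continue" not in data:
            return  # no further batches
        # Carry the server's continuation values into the next request.
        params.update(data["continue"])
        time.sleep(delay)  # give the server some time between requests


def revisions_url(title, limit=50):
    """Build the URL for one page's revisions (rvprop/rvlimit are assumed)."""
    return build_url({
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "timestamp|content",
        "rvlimit": str(limit),
        "format": "json",
    })
```

Looping over iter_redirects() and calling fetch_json(revisions_url(title)) for each title, with a similar pause between calls, would collect the revision data over several days as suggested. Raising the delay trades run time for a lighter load on the server.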
Optionally, someone with toolserver access could cook up a nice SQL query to kill the DB server with. :)
-- Chris Howie http://www.chrishowie.com http://en.wikipedia.org/wiki/User:Crazycomputers
WikiEN-l mailing list WikiEN-l@lists.wikimedia.org To unsubscribe from this mailing list, visit: https://lists.wikimedia.org/mailman/listinfo/wikien-l