Hi there!
(My first mail seems to have vanished into nirvana.)
For the German Wikipedia CD (see http://meta.wikimedia.org/wiki/Wikipedia_auf_CD) we need a list of all authors for each article (as the GFDL requires). I am not that good at SQL, so can anybody help?
I thought of a statement like
"This article has been edited X times by logged-in users and Y times by anonymous users. The authors are: Userxy, Foouser, Userbar..."
My untested SQL statements so far:
CREATE TABLE edit_count ( article VARCHAR(255), edited_by_users INTEGER, edited_by_IP INTEGER );
CREATE TABLE has_edited ( user VARCHAR(255), article VARCHAR(255) );
INSERT INTO has_edited
  SELECT DISTINCT old_user_text AS user, old_title AS article FROM old WHERE old_namespace=0
  UNION
  SELECT DISTINCT cur_user_text AS user, cur_title AS article FROM cur WHERE cur_namespace=0;
How can I keep the IP addresses out of this list?
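One idea I have not tried on the real database: as far as I understand the schema, anonymous edits are stored with user id 0 (old_user = 0, cur_user = 0) and only the IP in the *_user_text column, so filtering on the id might drop the IPs and give both counts at once. Below is a rough sketch on a toy in-memory SQLite copy that contains only the columns involved. The Python wrapper, the sample rows and the helper names build_toy_db and author_summary are made up for illustration, and the SQL would probably need adapting for our MySQL server.

```python
import sqlite3

def build_toy_db():
    """Create a tiny stand-in for the `old` and `cur` tables (sample data is invented)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE old (old_namespace INTEGER, old_title TEXT,
                          old_user INTEGER, old_user_text TEXT);
        CREATE TABLE cur (cur_namespace INTEGER, cur_title TEXT,
                          cur_user INTEGER, cur_user_text TEXT);
    """)
    conn.executemany("INSERT INTO old VALUES (?, ?, ?, ?)", [
        (0, "Berlin", 7, "Userxy"),
        (0, "Berlin", 0, "192.0.2.1"),   # anonymous edit: assumed to have user id 0
        (0, "Berlin", 7, "Userxy"),
        (0, "Berlin", 9, "Foouser"),
        (1, "Berlin", 5, "Talker"),      # not namespace 0, so it is ignored
    ])
    conn.executemany("INSERT INTO cur VALUES (?, ?, ?, ?)", [
        (0, "Berlin", 3, "Userbar"),
    ])
    return conn

# Every edit in the article namespace: old revisions plus the current one.
ALL_EDITS = """
    SELECT old_title AS article, old_user AS user_id, old_user_text AS user_name
      FROM old WHERE old_namespace = 0
    UNION ALL
    SELECT cur_title, cur_user, cur_user_text
      FROM cur WHERE cur_namespace = 0
"""

def author_summary(conn, article):
    """Return (edits by logged-in users, edits by IPs, sorted list of user names)."""
    by_users, by_ips = conn.execute(
        "SELECT COALESCE(SUM(user_id != 0), 0), COALESCE(SUM(user_id = 0), 0) "
        f"FROM ({ALL_EDITS}) AS e WHERE article = ?",
        (article,),
    ).fetchone()
    authors = [row[0] for row in conn.execute(
        f"SELECT DISTINCT user_name FROM ({ALL_EDITS}) AS e "
        "WHERE user_id != 0 AND article = ? ORDER BY user_name",
        (article,),
    )]
    return by_users, by_ips, authors

if __name__ == "__main__":
    users, ips, names = author_summary(build_toy_db(), "Berlin")
    print(f"This article has been edited {users} times by logged-in users "
          f"and {ips} times by anonymous users. The authors are: {', '.join(names)}")
```

If the assumption about user id 0 holds, the same WHERE condition could be added to both halves of the INSERT into has_edited above.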
What is the best method to get such a list of authors, and how long will it take to compute it for every article? By the way, my little SQL statement to count the number of links to each article has been running for hours and is still not finished.
Thanks, Jakob
wikitech-l@lists.wikimedia.org