Hi
Felipe is analyzing logging.xml in order to evaluate flagged revisions. He has a few questions about what the fields mean; see below.
Is the format of logging.xml documented anywhere?
-- daniel
-------- Original Message --------
Subject: Already started, some doubts
Date: Thu, 18 Jun 2009 11:09:41 +0200
From: Felipe Ortega jfelipe@gsyc.es
Organization: Libresoft (URJC)
To: Daniel Kinzler daniel.kinzler@wikimedia.de
Hi, Daniel.
Anyway, I have already started working. The fresh dump as of 2009-06 is loaded and ready, and so is the metadata parsed from logging.xml.
I have some doubts about the logging info. I first assumed that entries with type "review" are flagged-revision confirmations, and that action "approve-a" marks an automatic approval made by a bot, like:
<logitem>
  <id>11882840</id>
  <timestamp>2009-05-25T03:02:24Z</timestamp>
  <contributor>
    <username>MauritsBot</username>
    <id>627499</id>
  </contributor>
  <comment>(automatisch) [Status:&#32;gesichtet]</comment>
  <type>review</type>
  <action>approve-a</action>
  <logtitle>Caroline Bonaparte</logtitle>
  <params xml:space="preserve">60402136 59720804</params>
</logitem>
Whereas action "approve" corresponds to logged-in users (the target of our study), like:
<logitem>
  <id>11882844</id>
  <timestamp>2009-05-25T03:03:25Z</timestamp>
  <contributor>
    <username>ParaDox</username>
    <id>104845</id>
  </contributor>
  <comment>[Status:&#32;gesichtet]</comment>
  <type>review</type>
  <action>approve</action>
  <logtitle>Datei:ParaDox - Diagramm - Artikel-Sichtungen und Artikel-Entwürfe - 2009-01.svg</logtitle>
  <params xml:space="preserve">60402132 60367249</params>
</logitem>
Finally, I deduced that the first number in the <params> content is the new flagged rev_id for that article, whilst the second number is the previous flagged rev_id (I checked this against 3 random samples).
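To make the assumptions above concrete, here is a minimal Python sketch of how a single <logitem> could be read under them. It is only an illustration, not a confirmed description of the format: the names SAMPLE and parse_review are hypothetical, the interpretation of "approve-a" and of the two <params> numbers is exactly the guess described above, and the fragment is parsed without an XML namespace (a full dump may declare one on its root element, in which case the tag names would need to be qualified).

```python
import xml.etree.ElementTree as ET

# One of the sample entries quoted above, used as test input.
SAMPLE = """<logitem>
  <id>11882840</id>
  <timestamp>2009-05-25T03:02:24Z</timestamp>
  <contributor>
    <username>MauritsBot</username>
    <id>627499</id>
  </contributor>
  <comment>(automatisch) [Status:&#32;gesichtet]</comment>
  <type>review</type>
  <action>approve-a</action>
  <logtitle>Caroline Bonaparte</logtitle>
  <params xml:space="preserve">60402136 59720804</params>
</logitem>"""


def parse_review(logitem_xml):
    """Return a dict for a 'review' log entry, or None for other types."""
    item = ET.fromstring(logitem_xml)
    if item.findtext("type") != "review":
        return None

    # ASSUMPTION (unconfirmed): <params> holds "new_rev_id prev_rev_id".
    params = (item.findtext("params") or "").split()
    new_rev = int(params[0]) if len(params) > 0 else None
    prev_rev = int(params[1]) if len(params) > 1 else None

    return {
        "log_id": int(item.findtext("id")),
        "timestamp": item.findtext("timestamp"),
        "user": item.findtext("contributor/username"),
        "title": item.findtext("logtitle"),
        # ASSUMPTION (unconfirmed): "approve-a" means automatic approval.
        "automatic": item.findtext("action") == "approve-a",
        "new_rev_id": new_rev,
        "prev_rev_id": prev_rev,
    }


if __name__ == "__main__":
    print(parse_review(SAMPLE))
```

If the assumptions hold, filtering on automatic being false would leave only the manual reviews that the study targets.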
Please confirm that all these assumptions are correct so that I can start the EDA stage.
Best,
F.
mediawiki-l@lists.wikimedia.org