Hi
Felipe is working on analyzing logging.xml in order to evaluate flagged
revisions. He has a few questions about what the fields mean; see below.
Is the format of logging.xml documented anywhere?
-- daniel
-------- Original Message --------
Subject: Already started, some doubts
Date: Thu, 18 Jun 2009 11:09:41 +0200
From: Felipe Ortega <jfelipe(a)gsyc.es>
Organization: Libresoft (URJC)
To: Daniel Kinzler <daniel.kinzler(a)wikimedia.de>
Hi, Daniel.
Anyway, I have already started to work. The fresh dump as of 2009-06 is
loaded and ready, and so is the metadata parsed from logging.xml.
I have some doubts about the logging info. I first assumed that entries
with type "review" are flagged-revision confirmations, and that action
"approve-a" marks automatic approvals made by bots, like:
<logitem>
<id>11882840</id>
<timestamp>2009-05-25T03:02:24Z</timestamp>
<contributor>
<username>MauritsBot</username>
<id>627499</id>
</contributor>
<comment>(automatisch) [Status:&#32;gesichtet]</comment>
<type>review</type>
<action>approve-a</action>
<logtitle>Caroline Bonaparte</logtitle>
<params xml:space="preserve">60402136
59720804</params>
</logitem>
Whereas action "approve" corresponds to approvals by logged-in users (the
target of our study), like:
<logitem>
<id>11882844</id>
<timestamp>2009-05-25T03:03:25Z</timestamp>
<contributor>
<username>ParaDox</username>
<id>104845</id>
</contributor>
<comment>[Status:&#32;gesichtet]</comment>
<type>review</type>
<action>approve</action>
<logtitle>Datei:ParaDox - Diagramm - Artikel-Sichtungen und
Artikel-Entwürfe - 2009-01.svg</logtitle>
<params xml:space="preserve">60402132
60367249</params>
</logitem>
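To make the assumed distinction concrete, here is a minimal Python sketch of how a single <logitem> could be classified. It only encodes the assumptions stated in this mail (type "review", action "approve-a" for automatic and "approve" for manual approvals); the function name classify_logitem and the labels it returns are made up for illustration, and the elements in the real dump may carry an XML namespace that this sketch ignores.

```python
import xml.etree.ElementTree as ET


def classify_logitem(xml_text):
    """Return 'automatic', 'manual', or None for one <logitem> snippet.

    Hypothetical helper: the mapping below is the assumption under
    discussion, not a confirmed description of the log format.
    """
    item = ET.fromstring(xml_text)
    # Only "review" entries are treated as flagged-revision confirmations.
    if item.findtext("type") != "review":
        return None
    action = item.findtext("action")
    if action == "approve-a":
        return "automatic"  # assumed: automatic approval
    if action == "approve":
        return "manual"     # assumed: approval by a logged-in user
    return None             # any other review action is left unclassified


sample = """<logitem>
<type>review</type>
<action>approve-a</action>
</logitem>"""

print(classify_logitem(sample))
```

Any review action other than these two is returned as None on purpose, so unexpected values show up in the analysis instead of being silently counted.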
Finally, I deduced that the first number in the <params> content is the
new flagged rev_id for that article, whilst the second number is the
previous flagged rev_id (I checked that with 3 random samples).
Please confirm that all these assumptions are correct, so that I can start
the EDA stage.
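For reference, the <params> reading assumed above could be sketched in Python like this. It treats the content as two whitespace-separated integers, new flagged rev_id first and previous flagged rev_id second, which is only the deduction from the 3 samples and not a documented format; the function name parse_review_params is hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_review_params(xml_text):
    """Return (new_rev_id, prev_rev_id) from one <logitem> snippet.

    Assumption, not a confirmed format: <params> holds exactly two
    whitespace-separated integers, new flagged rev_id first.
    """
    item = ET.fromstring(xml_text)
    raw = item.findtext("params", default="")
    ids = [int(token) for token in raw.split()]
    if len(ids) != 2:
        # Fail loudly so entries that break the assumption are noticed.
        raise ValueError("expected 2 values in <params>, got %d" % len(ids))
    new_rev_id, prev_rev_id = ids
    return new_rev_id, prev_rev_id


sample = (
    '<logitem><params xml:space="preserve">60402136\n'
    "59720804</params></logitem>"
)

print(parse_review_params(sample))
```

Raising on anything other than two values is a deliberate choice here: if some entries carry more or fewer parameters, they surface immediately rather than skewing the counts.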
Best,
F.