----- Original message ---- From: Ariel T. Glenn ariel@wikimedia.org To: Felipe Ortega glimmer_phoenix@yahoo.es CC: Platonides platonides@gmail.com; xmldatadumps-l@lists.wikimedia.org Sent: Mon, 16 May 2011 10:07 Subject: Re: [Xmldatadumps-l] Malformed revision items
On 07-05-2011, Sat, at 16:46 +0100, Felipe Ortega wrote:
----- Original message ----
From: Platonides platonides@gmail.com To: Felipe Ortega glimmer_phoenix@yahoo.es CC: xmldatadumps-l@lists.wikimedia.org Sent: Fri, 6 May 2011 23:39 Subject: Re: [Xmldatadumps-l] Malformed revision items
On 06/05/11 22:16, Felipe Ortega wrote:
complete collection (I haven't checked explicitly, but it looks like there is no pattern in these errors and they are produced randomly).
Let me know if you need more info that can be of help to solve this issue.
Best
Maybe those usernames were RevDeleted? See in http://en.wikipedia.org/wiki/?oldid=233494693 how it is not shown in the wiki either "This is an old revision of this page, as edited by (Username or IP removed) at 07:55, 22 August 2008."
So the dumps are right.
Interesting. I hadn't thought about this possibility and it's a good explanation.
Felipe.
Please excuse my delay in replying.
Indeed these revisions were oversighted and the user name hidden. A number of fields, including the edit summary and the username, can be hidden by oversighters; you'll want to adjust your code to account for these. Typically the XML tag will have the contents 'missing' (although this can also occur for other reasons).
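
For readers parsing the dumps, here is a minimal sketch of one way to account for hidden fields, assuming Python's standard `xml.etree.ElementTree`. The helper names (`field_text`, `parse_revision`), the `(hidden)` placeholder, and the sample XML are illustrative assumptions, not part of any dump tooling. The sketch treats a field as unavailable when its element is absent, is empty, or carries a `deleted` attribute; which of these forms appears in a given dump should be checked against the actual files. Real dumps also declare an XML namespace on the root element, which is left out here for brevity.

```python
import xml.etree.ElementTree as ET

HIDDEN = "(hidden)"  # hypothetical placeholder for hidden or missing fields


def field_text(parent, path, default=HIDDEN):
    """Return the text at `path` under `parent`, or `default` if unavailable."""
    node = parent.find(path)
    # Absent element, or element flagged as deleted (assumed marker).
    if node is None or node.get("deleted") is not None:
        return default
    text = (node.text or "").strip()
    return text if text else default


def parse_revision(rev):
    """Extract id, user and comment from a <revision> element, with defaults."""
    contributor = rev.find("contributor")
    if contributor is None or contributor.get("deleted") is not None:
        user = HIDDEN
    else:
        # Registered users carry <username>; anonymous edits carry <ip>.
        user = (field_text(contributor, "username", default=None)
                or field_text(contributor, "ip"))
    return {
        "id": field_text(rev, "id"),
        "user": user,
        "comment": field_text(rev, "comment"),
    }


# Made-up sample: one ordinary revision and one with hidden fields.
SAMPLE = """<page>
  <revision>
    <id>1</id>
    <contributor><username>Alice</username></contributor>
    <comment>fix typo</comment>
  </revision>
  <revision>
    <id>2</id>
    <contributor deleted="deleted" />
    <comment deleted="deleted" />
  </revision>
</page>"""

revisions = [parse_revision(r) for r in ET.fromstring(SAMPLE).iter("revision")]
for rev in revisions:
    print(rev)
```

Using one explicit placeholder rather than an empty string keeps hidden contributors distinguishable from real ones in later analysis.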
Yep, I'll use some kind of default value for missing fields.
Thanks for your answers.
Felipe.
Ariel