The public and/or mirrored dumps need fixed retention policies attached and need to be checked for compliance.
Private information is present in talk page edits, which get selectively removed or edited.

GDPR requirements cannot truly be met when removal of content is actioned, since the dumps and mirrored information are not updated.
GDPR reports need to be published with all related article refs, and the dumps and mirrors updated to reflect the removals.


On 4 Mar 2019, at 09:24, Ariel Glenn WMF <> wrote:

All of the information in these mirrored dump files is publicly available to any user; no private information is provided. For GDPR-specific issues, please contact

On Mon, Mar 4, 2019 at 11:03 AM colin johnston <> wrote:
How are GDPR issues handled with this mirrored information?
How are retention guidelines followed with this mirrored information?


On 4 Mar 2019, at 08:52, Ariel Glenn WMF <> wrote:

Excuse this very late reply. The index.html page is out of date but the mirrored directories for various current runs are there. I'm checking with a colleague about making sure the index page gets copied over.


On Wed, Feb 6, 2019 at 1:14 PM Mariusz "Nikow" Klinikowski <> wrote:
Greetings XML Dump users and contributors!

Looks like has not been updated since
2017-11-26. I think maybe somebody should delete it from the mirror list or
contact bytemark to notify them?

Best regards,
Mariusz "Nikow" Klinikowski.

Xmldatadumps-l mailing list