OK, here is a report that is a few minutes old: http://tools.wmflabs.org/betacommand-dev/reports/commonswiki_svg_list.txt.7z
On Sat, Jun 20, 2015 at 1:38 AM, Ariel T. Glenn aglenn@wikimedia.org wrote:
On Sat, 20-06-2015, at 00:46 +0200, Federico Leva (Nemo) wrote:
D. Hansen, 19/06/2015 23:09:
One suggestion was to download commonswiki-20150417-all-titles, which I did.
But this file contains deleted names and renamed names, and some names have "File:" at the start while others have neither "File:" nor a similar indicator. A small sample yielded 5 correct names, and around 7 deleted and 7 renamed names.
That shouldn't happen... Was your sample all from the bottom (or top?) of the list? If so, maybe these are recent files, which are typically
more liable to deletion. If this happens throughout the list of titles, then there's something wrong in the query used and a bug should be filed. You can also query the page table (https://www.mediawiki.org/wiki/Manual:Page_table#page_title) yourself on labsdb, e.g. via http://quarry.wmflabs.org/
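As a rough illustration of narrowing the titles list down to SVG files, here is a minimal Python sketch. It assumes (this is not confirmed in this thread) that each line of the all-titles file is tab-separated, with the numeric namespace first and the title second, and that there may be a header line. Namespace 6 is MediaWiki's File namespace. The function name svg_file_titles and the sample data are made up for the example.

```python
import io

FILE_NAMESPACE = "6"  # MediaWiki's numeric ID for the File: namespace


def svg_file_titles(lines):
    """Yield 'File:'-prefixed titles from namespace 6 that end in .svg.

    Assumes each line looks like '<namespace>\t<title>' (an assumption
    about the dump layout); lines without a tab are skipped.
    """
    for line in lines:
        namespace, sep, title = line.rstrip("\n").partition("\t")
        if not sep:
            continue  # blank or malformed line
        # A header line is dropped here too, since its first field is not "6".
        if namespace == FILE_NAMESPACE and title.lower().endswith(".svg"):
            yield "File:" + title


# Made-up sample standing in for the real (much larger) titles file.
sample = io.StringIO(
    "page_namespace\tpage_title\n"
    "0\tMain_Page\n"
    "6\tExample.svg\n"
    "6\tPhoto.jpg\n"
    "14\tSVG_files\n"
    "6\tMap_of_Europe.SVG\n"
)
print(list(svg_file_titles(sample)))
```

For the real dump, the same function can be fed any text-mode file handle, for example one from gzip.open(..., "rt", encoding="utf-8") if the download is gzip-compressed.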
Nemo
The files in this directory may be interesting to you (and I need to do some cleanup on them some day too):
http://dumps.wikimedia.org/other/imageinfo
They are produced a few times a month. For each wiki, the names of all images uploaded locally to the project are saved in the <wikiname>-local-wikiqueries.gz file, and those stored on commons but used on the wiki are in <wikiname>-remote-wikiqueries.gz.
You may still find later that some titles have been renamed or removed by the time you look at the contents.
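One way to check for such stale titles is to ask the live wiki through the MediaWiki action API. The Python sketch below is only an outline under stated assumptions: it builds query URLs in batches of 50 titles (the usual per-request limit for ordinary accounts) and sorts an already-decoded JSON response (formatversion=2) into missing titles and redirected ones. The names build_query_urls and classify are hypothetical, the sample response is made up, and actually fetching the URLs is left out.

```python
from urllib.parse import urlencode

API_URL = "https://commons.wikimedia.org/w/api.php"
BATCH_SIZE = 50  # title limit per request for ordinary (non-bot) accounts


def build_query_urls(titles):
    """Return one action=query URL per batch of up to BATCH_SIZE titles."""
    urls = []
    for start in range(0, len(titles), BATCH_SIZE):
        batch = titles[start:start + BATCH_SIZE]
        params = {
            "action": "query",
            "format": "json",
            "formatversion": "2",
            "redirects": "1",  # follow redirects and report them
            "titles": "|".join(batch),
        }
        urls.append(API_URL + "?" + urlencode(params))
    return urls


def classify(response):
    """Split a decoded API response into (missing titles, {old: new})."""
    query = response.get("query", {})
    renamed = {r["from"]: r["to"] for r in query.get("redirects", [])}
    missing = [p["title"] for p in query.get("pages", []) if p.get("missing")]
    return missing, renamed


# Made-up response in the formatversion=2 shape, for illustration only.
example = {
    "query": {
        "redirects": [{"from": "File:Old.svg", "to": "File:New.svg"}],
        "pages": [
            {"pageid": 1, "ns": 6, "title": "File:New.svg"},
            {"ns": 6, "title": "File:Gone.svg", "missing": True},
        ],
    }
}
print(classify(example))
```

A file that was moved without leaving a redirect would show up as missing rather than renamed, so the two groups are only an approximation of "deleted" and "renamed".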
Hope that helps,
Ariel
Xmldatadumps-l mailing list Xmldatadumps-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l