On Sat, 20-06-2015 at 00:46 +0200, Federico Leva (Nemo) wrote:
> D. Hansen, 19/06/2015 23:09:
> > One suggestion was to download commonswiki-20150417-all-titles,
> > which I did.
> >
> > But this file does contain deleted names and renamed names, and
> > names that partly have "File:" and some that don't have "File:" or a
> > similar indicator at the start.
> > Doing just a small sample resulted in 5 correct names, and around 7
> > deleted and 7 renamed names.
>
> That shouldn't happen... Was your sample all from the bottom (or top?)
> of the list? If so, maybe these are recent files, which are typically
> more liable to deletion. If this happens throughout the list of titles,
> then there's something wrong in the query used and a bug should be
> filed.
> You can also query
> https://www.mediawiki.org/wiki/Manual:Page_table#page_title yourself on
> labsdb, e.g. via http://quarry.wmflabs.org/
>
> Nemo
The files in this directory may be interesting to you (and I need to do
some cleanup on them some day too):
http://dumps.wikimedia.org/other/imageinfo
They are produced a few times a month. For each wiki, the names of all
images uploaded locally to the project are saved in the
<wikiname>-local-wikiqueries.gz file, and those stored on Commons but
used on the wiki are in <wikiname>-remote-wikiqueries.gz.
You may still find later that some titles have been renamed or removed
by the time you look at the contents.
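
For what it's worth, here is a rough Python sketch of reading one of these
files into a set of titles. It assumes the file is gzip-compressed text with
one title per line, which is an assumption and not a description of the real
layout, so look at a few lines first and adjust. The helper names
normalize_title and read_titles and the example path are made up for
illustration. Dropping a leading "File:" and turning spaces into underscores
is only meant to make names with and without the prefix comparable.

```python
import gzip


def normalize_title(raw, prefix="File:"):
    """Strip whitespace, drop a leading namespace prefix, use underscores."""
    title = raw.strip()
    if title.startswith(prefix):
        title = title[len(prefix):]
    return title.replace(" ", "_")


def read_titles(path):
    """Return the set of normalized, non-empty titles in a gzip text file.

    Assumption: one title per line, UTF-8 encoded.
    """
    titles = set()
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            title = normalize_title(line)
            if title:
                titles.add(title)
    return titles


# Hypothetical usage (the path is a placeholder, not a real file name):
# local_titles = read_titles("somewiki-local-wikiqueries.gz")
# print(len(local_titles))
```

Two such sets can then be compared with ordinary set operations, for example
to list names that appear in one file but not in another.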
Hope that helps,
Ariel
_______________________________________________
Xmldatadumps-l mailing list
Xmldatadumps-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l