Re: [Wikitech-ambassadors] [Commons-l] File metadata cleanup drive: We now have numbers for Commons

12 Dec 2014


On Thu, Dec 11, 2014 at 1:16 PM, Guillaume Paumier gpaumier@wikimedia.org wrote:
...
Yesterday, I finished implementing the script for Commons, and started
to run it. As of today, we have accurate numbers for the quantity of
files missing machine-readable metadata on Commons: ~533,000, out of
~24 million [4]. It may seem like a lot, but I personally think it's a
great testament to the dedication of the Commons community.

Wonderful. Thanks!
...
Now that we have numbers, we can work on going through those files and
fixing them. Many of them are missing the {{information}} template,
but many of those are also part of a batch: either they were uploaded
by the same user, or they were mass-uploaded by a bot. In either case,
this makes it easier to parse the information and add the
{{information}} template automatically with a bot, thus avoiding
painful manual work.

I've been poking at all of this with a stick in my free time, and it's true
that a good number of these images are part of a set of images and the
patterns are readily apparent. Magnus's No Information tool on labs is
enormously helpful for retrieving these pattern sets since it's searchable
by file name or the user/bot who uploaded the images[1]. I highly recommend
it.
...
Once you identify a pattern, you're encouraged to add a section to the
Bot Requests page on Commons, so that a bot owner can fix them:
https://commons.wikimedia.org/wiki/Commons:Bots/Work_requests#Adding_the_Inf...

Challenge accepted[2].
[1] https://tools.wmflabs.org/add-information/no_information.php?language=common...
[2] https://commons.wikimedia.org/w/index.php?title=Commons:Bots/Work_requests&a...
-- 
Keegan Peterzell
Community Liaison, Product
Wikimedia Foundation
