Dear all,
Thanks for all your help with answering questions and giving feedback
over the last couple of months. I'm happy to say that we're finally at
a stage where we've hashed 22,452,638 images from Wikimedia Commons
and launched Elog.io in public beta: http://elog.io/
Elog.io consists of an open API and browser plugins that can look up
information about images using a perceptual hash that's easy and
quick to calculate in a browser.
What the browser extensions allow you to do is match an image you find
"in the wild" against Wikimedia Commons. If it can be matched against
an image from Commons, it'll show you the title, author, and license,
and give you links back to Wikimedia, the license, and a quick and
handy "Copy as HTML" to copy the image and attribution as a HTML
snippet for pasting into Word, LibreOffice, Wordpress, etc.
Our API provides lookup functions to find information using a URL (the
Commons' page name URL) or using the perceptual hash. You get
information back as JSON in W3C Media Annotations format. Of course,
the information you get back is no better than what the Commons API
provides, so if you already have a page name URL, you may as well
query it directly, and rely on our API only for searching by
perceptual hashes.
The algorithm we use for calculating perceptual hashes, which you'll
need to query our API, is at http://blockhash.io/
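To give a feel for the hashing side, here's a rough Python sketch of
the idea (the real algorithm at blockhash.io works per colour band
with per-band medians, so use one of its reference implementations in
practice; the lookup endpoint below is a placeholder I made up for
illustration, not a documented URL):

    from statistics import median

    import requests
    from PIL import Image  # Pillow

    def blockhash_sketch(path, grid=16):
        # Grayscale + resize so that each of the grid x grid blocks
        # covers exactly 4x4 pixels.
        img = Image.open(path).convert("L").resize((grid * 4, grid * 4))
        px = list(img.getdata())
        width = grid * 4
        # Sum the pixel values inside each block.
        sums = []
        for by in range(grid):
            for bx in range(grid):
                sums.append(sum(px[(by * 4 + y) * width + (bx * 4 + x)]
                                for y in range(4) for x in range(4)))
        # Each block contributes one bit: 1 if its sum is above the median.
        m = median(sums)
        bits = "".join("1" if s > m else "0" for s in sums)
        return "%064x" % int(bits, 2)  # 256 bits -> 64 hex characters

    # Placeholder lookup call -- see http://elog.io/ for the real endpoint.
    h = blockhash_sketch("photo.jpg")
    resp = requests.get("https://catalog.elog.io/lookup/hash",
                        params={"hash": h})
    print(resp.json())  # JSON in W3C Media Annotations format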
Sincerely,
Jonas
Greetings,
As many of you are aware, we're currently in the process of
collectively adding machine-readable metadata to many files and
templates that don't have them, both on Commons and on all other
Wikimedia wikis with local uploads [1,2]. This makes it much easier to
see and re-use multimedia files consistently with best practices for
attribution across a variety of channels (offline, PDF exports, mobile
platforms, MediaViewer, WikiWand, etc.).
In October, I created a dashboard to track how many files were missing
the machine-readable markers on each wiki [3]. Unfortunately, due to
the size of Commons, I needed to find another way to count them there.
Yesterday, I finished implementing the script for Commons and started
running it. As of today, we have accurate numbers for the quantity of
files missing machine-readable metadata on Commons: ~533,000, out of
~24 million [4]. It may seem like a lot, but I personally think it's a
great testament to the dedication of the Commons community.
Now that we have numbers, we can work on going through those files and
fixing them. Many of them are missing the {{information}} template,
but many of those are also part of a batch: either they were uploaded
by the same user, or they were mass-uploaded by a bot. In either case,
this makes it easier to parse the information and add the
{{information}} template automatically with a bot, thus avoiding
painful manual work.
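As a purely hypothetical illustration of the core of such a bot edit
(the helper and field values below are invented; a real bot, e.g. one
built on pywikibot, would parse each batch's existing text to fill in
the fields):

    # Hypothetical sketch: wrap a bare file description in {{Information}}.
    def wrap_in_information(description, date="", source="", author=""):
        return ("{{Information\n"
                "|description = {{en|1=%s}}\n"
                "|date = %s\n"
                "|source = %s\n"
                "|author = %s\n"
                "}}" % (description.strip(), date, source, author))

    print(wrap_in_information("View of the old town hall.",
                              date="2008-05-01",
                              source="originally uploaded to en.wikipedia",
                              author="[[User:Example|Example]]"))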
I invite you to take a look at the list of files at
https://tools.wmflabs.org/mrmetadata/commons/commons/index.html and
see if you can find such groups and patterns.
Once you identify a pattern, you're encouraged to add a section to the
Bot Requests page on Commons, so that a bot owner can fix them:
https://commons.wikimedia.org/wiki/Commons:Bots/Work_requests#Adding_the_In…
I believe we can make a lot of progress rapidly if we dive into the
list of files and fix all the groups we can find. The list and
statistics will be updated daily so it'll be easy to see our progress.
Let me know if you'd like to help but are unsure how!
[1] https://meta.wikimedia.org/wiki/File_metadata_cleanup_drive
[2] https://blog.wikimedia.org/2014/11/07/cleaning-up-file-metadata-for-humans-…
[3] https://tools.wmflabs.org/mrmetadata/
[4] https://tools.wmflabs.org/mrmetadata/commons/commons/index.html
--
Guillaume Paumier
On Fri, Dec 12, 2014 at 2:41 AM, Ricordisamoa
<ricordisamoa(a)openmailbox.org> wrote:
> On 11/12/2014 at 23:28, Dan Garry wrote:
>>
>> THIS IS AWESOME
>>
>> Do you know when we are going to be able to start querying this via an
>> API in production?
>>
>> The Mobile Apps Team would love to consume this data, as opposed to the
>> present data exposed via the CommonsMetadata API (which is scraped, eugh).
>
> As far as I understand, the information Guillaume is talking about is
> exactly what CommonsMetadata scrapes.
> See https://tools.wmflabs.org/mrmetadata/how_it_works.html:
> «The script needs to go through all file description pages of a wiki, and
> check for machine-readable metadata by querying the CommonsMetadata
> extension.»
That's correct. However, just to be clear, CommonsMetadata doesn't
just scrape the HTML (or the wikitext) wholesale: it scrapes the HTML
for the machine-readable markers and exposes that information through
the API.
Until we have Structured Data (which is /at least/ a year out),
CommonsMetadata is still the best way to access that information.
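For instance, a minimal Python sketch of reading those markers through
the standard API (prop=imageinfo with iiprop=extmetadata) could look
like this:

    import requests

    API = "https://commons.wikimedia.org/w/api.php"

    def extmetadata(title):
        params = {"action": "query", "titles": title, "prop": "imageinfo",
                  "iiprop": "extmetadata", "format": "json"}
        data = requests.get(API, params=params).json()
        page = next(iter(data["query"]["pages"].values()))
        return page["imageinfo"][0]["extmetadata"]

    # Fields such as Artist, LicenseShortName and Credit come from the
    # machine-readable markers on the file description page.
    meta = extmetadata("File:Example.jpg")
    for key in ("Artist", "LicenseShortName", "Credit"):
        if key in meta:
            print(key, "=", meta[key]["value"])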
--
Guillaume Paumier
Now with the correct sender; apologies that this didn't go through at first.
---------- Forwarded message ----------
From: "Jonas Öberg" <jonas(a)shuttleworthfoundation.org>
Date: 11 Dec 2014 16:01
Subject: Re: [Commons-l] Elog.io now up w/ Commons data
To: "Wikimedia Commons Discussion List" <commons-l(a)lists.wikimedia.org>
Cc: <wikitech-l(a)lists.wikimedia.org>
Hi Cornelius!
For images that it matches against the catalog, it should give accurate
information. If it doesn't, use the "report" link to let us know!
You're right, though, that for images it doesn't find in its catalog, we
don't provide any information. That's the equivalent of saying "this
picture may or may not be openly licensed, but right now we have no
information to tell either way".
Sincerely,
Jonas
On 11 Dec 2014 15:57, "Cornelius Kibelka" <cornelius.kibelka(a)wikimedia.de>
wrote:
> Wow, what a nice and interesting browser extension. Congrats!
>
> Just a question: as far as I can see, the tool doesn't give the complete
> and correct licensing information, as the source is missing. Or am I
> mistaken?
>
> Best
> Cornelius
>
> 2014-12-10 19:30 GMT+01:00 Jonas Öberg <jonas(a)commonsmachinery.se>:
>
>> [...]
>
> --
> Cornelius Kibelka
>
> International Affairs
> Werkstudent | student trainee
>
> Wikimedia Deutschland e.V.
> Tempelhofer Ufer 23-24
> 10963 Berlin
>
> Tel.: +49 30 219158260
> http://wikimedia.de
>
> Imagine a world in which every human being has free access to the sum
> of all human knowledge. Help us achieve that!
> http://spenden.wikimedia.de/
>
> Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
> Registered in the register of associations of the Amtsgericht
> Berlin-Charlottenburg under number 23855 B. Recognized as charitable by
> the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.
Hey everyone :)

I've been asked to enable access to the data on Wikidata for Commons.
I'm happy to make that happen. We'll enable access on December 2nd.

What does this mean? You will be able to access data from an item on
Wikidata, like the date of birth of an artist or the name of a city in
different languages. Where and how much you make use of that is up to
you to decide. You will be able to access the data in two ways. The
first one is the #property parser function
(https://meta.wikimedia.org/wiki/Wikidata/Notes/Inclusion_syntax). The
second one is via Lua
(https://www.mediawiki.org/wiki/Extension:Wikibase_Client/Lua).

There are two big caveats at this point:
1) You will only be able to access data for items that are connected
via a sitelink to the page you want to show the data on. We're
currently working on allowing access to data from any item. This
should be available around January/February.
2) You cannot use this to store metadata (like the date a picture
was taken or who took it) about individual files. This will in the
future be stored on Commons itself as part of the structured data
project (https://commons.wikimedia.org/wiki/Commons:Structured_data).
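For illustration, on a page whose item is connected by a sitelink,
{{#property:P571}} would render that item's "inception" value, and a
Lua module could do something along these lines (the property ID is
just an example; see the linked pages for the authoritative syntax):

    local p = {}
    function p.inception(frame)
        -- the item connected via sitelink to the current page (caveat 1)
        local entity = mw.wikibase.getEntityObject()
        if not entity then return "" end
        -- P571 is "inception"; formatPropertyValues renders its value(s)
        return entity:formatPropertyValues("P571").value
    end
    return p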
Please let me know if you have any questions. I am looking forward to
more integration between Commons and Wikidata and all the things this
will make possible. It'd be great if you could help with updating and
expanding https://commons.wikimedia.org/wiki/Commons:Wikidata. The
relevant page on Wikidata is
https://www.wikidata.org/wiki/Wikidata:Wikimedia_Commons.
Cheers
Lydia
--
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 B. Recognized as charitable by
the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.
[sorry for cross-posting]
Hi there,
maybe some of you have seen it already: Wikimedia Deutschland, the German
Commission for UNESCO, and the North Rhine-Westphalian Library Service
Centre have just published a guide on how to correctly use Creative
Commons licenses.
You can read all about it here:
https://blog.wikimedia.org/2014/12/09/using-licenses-easy-and-legal/.
The guide also has a pretty nice Meta page (
https://meta.wikimedia.org/wiki/Open_Content_-_A_Practical_Guide_to_Using_C…)
where you can read the full text or download the PDF. Thanks to Jean-Fred
for turning on the translation tool! I am looking forward to the guide
being available in many, many languages.
If you have any comments or questions, please get in touch with me via
e-mail or the talk page on Meta (
https://meta.wikimedia.org/wiki/Talk:Open_Content_-_A_Practical_Guide_to_Us…
).
Best,
Katja
--
Katja Ullrich
Politik & Gesellschaft
-------------------------------------
Wikimedia Deutschland e.V. | Tempelhofer Ufer 23-24 | 10963 Berlin
Telefon 030 - 219 158 26-0
www.wikimedia.de
Imagine a world in which every human being can freely share in the sum
of all knowledge. Help us achieve that!
http://spenden.wikimedia.de/
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 B. Recognized as charitable by
the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.
> > Message: 4
> > Date: Thu, 4 Dec 2014 14:58:37 -0500
> > From: "Sreejith K." <sreejithk2000(a)gmail.com>
> > To: Wikimedia Commons Discussion List <commons-l(a)lists.wikimedia.org>
> > Subject: Re: [Commons-l] Duplicate removal?
> >
> > I am using Wikimedia APIs to create a gallery of duplicates and
> > routinely clean them. You can see the results here.
> >
> > https://commons.wikimedia.org/wiki/User:Sreejithk2000/Duplicates
> >
> > The page also has a link to the script. If anyone is interested in using
> > this script, let me know and I can work with you to customize it.
> >
> > - Sreejith K.
> >
> >
>
See also https://commons.wikimedia.org/wiki/Special:ListDuplicatedFiles,
which lists the files with the most byte-for-byte duplicates (most of
the time those should really be file redirects).
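For anyone who wants to script something similar, here's a minimal
Python sketch (not Sreejith's actual script) using the standard API's
prop=duplicatefiles module, which matches files by their SHA-1 hash:

    import requests

    API = "https://commons.wikimedia.org/w/api.php"

    def duplicates(title):
        params = {"action": "query", "titles": title,
                  "prop": "duplicatefiles", "dflimit": "max",
                  "format": "json"}
        data = requests.get(API, params=params).json()
        for page in data["query"]["pages"].values():
            for dup in page.get("duplicatefiles", []):
                # "shared" marks duplicates that live on a shared repo
                print(dup["name"],
                      "(shared)" if "shared" in dup else "(local)")

    duplicates("File:Example.jpg")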
--
Thanks Jonas for experimenting with this sort of thing. I always wished we
did something with perceptual hashes internally in addition to the sha1
hashes we use currently.
--bawolff
Hi Jesse,
Thanks for sharing this nice success story!
I’m really happy to hear that over 400 videos were added to articles within three weeks of your event.
This is a great accomplishment, given that there is still not a lot of video on Wikipedia at this time. Nicely done!
I just tweeted it here, if you’d like to retweet:
https://twitter.com/fabriceflorin/status/539847861178212353
I also recommended it to our social media team, and am sharing it with our multimedia and commons mailing lists.
Keep up the great work :)
Be well,
Fabrice
> On Dec 2, 2014, at 5:24 AM, Jesse de Vos <jdvos(a)beeldengeluid.nl> wrote:
>
> Hi all,
>
> I have written a blog post about our positive(!) experiences with organizing a video challenge on the UNESCO World Day for Audiovisual Heritage. You can find it here:
>
> http://www.beeldengeluid.nl/en/blogs/research-amp-development-en/201412/vid…
>
> Most notably, over 400 videos were added to articles within three weeks.
> We're very open to suggestions on how we can improve this type of 'contest', and perhaps there are people who would like to join in for next year's World Day of Audiovisual Heritage (7 October 2015)? :)
>
> Best,
> Jesse
> --
> Kind regards,
>
> Jesse de Vos
> GLAM-wiki coordinator
>
> T 035 - 677 39 37
> Available: Mon, Tue, Thu
>
> Nederlands Instituut voor Beeld en Geluid
> Media Parkboulevard 1, 1217 WE Hilversum | Postbus 1060, 1200 BB Hilversum | beeldengeluid.nl
_______________________________
Fabrice Florin
Product Manager, Multimedia
Wikimedia Foundation
https://www.mediawiki.org/wiki/User:Fabrice_Florin_(WMF)