Hey everyone, first of all, thanks for all the compliments on my tool! I'm really glad people find it useful.
I've made a couple of updates to the tool, including i18n support, category search, and links to PetScan. I've posted an update on Twitter: https://twitter.com/hayify/status/1264647593218408449
Just a couple of general observations and thoughts about this project. When I started development, I had two main goals:
1) To showcase the beauty and richness of the media on Wikimedia Commons. There are many files on Commons that are of exceptional quality, but unfortunately the current search results page doesn't make much of an effort to showcase them, compared to search engines like Google or Yandex. This is why the Structured Search results overview focuses on big thumbnails rather than on metadata (I doubt many users are interested in the file size or resolution, for example, which are shown on the current Commons results page). This has been a pet peeve of mine for many years; I even came across a Phabricator ticket from almost five(!) years ago where I was already proposing something like this (https://phabricator.wikimedia.org/T104565).
2) To showcase the usefulness of structured data. Currently the only way to search by structured data is with the 'haswbstatement' filter in the search engine. I doubt many people know about it (I didn't until Maarten Dammers pointed me to it), and its user-friendliness is pretty low, given that you need to fill in property and item numbers by hand. The best way to find media based on structured data would be a SPARQL endpoint (just as on Wikidata), but unfortunately work on that has been slow (see https://phabricator.wikimedia.org/T141602). So this tool is basically a stopgap for authoring Commons search queries using haswbstatement until we have a proper SPARQL endpoint for SDoC.
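For anyone who hasn't tried the filter: a haswbstatement query is just a string you can put in the normal search box or in the MediaWiki search API. Here's a minimal sketch of building such an API request; the helper function name and the P180=Q146 ("depicts: house cat") property/item pair are illustrative, any pair works the same way.

```python
from urllib.parse import urlencode

def commons_search_url(prop, item, limit=10):
    """Build a Commons search API URL for files whose structured data
    contains the statement prop=item (e.g. P180=Q146, 'depicts: house cat')."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": f"haswbstatement:{prop}={item}",
        "srnamespace": 6,  # namespace 6 = File:
        "srlimit": limit,
        "format": "json",
    }
    return "https://commons.wikimedia.org/w/api.php?" + urlencode(params)

print(commons_search_url("P180", "Q146"))
```

The same `haswbstatement:P180=Q146` string can also be combined with free text in the regular on-wiki search box, since it's just an ordinary search keyword.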
A couple of other points that might be of use:

* Structured Search queries are completely compatible with Wikimedia Commons search engine queries; it's the exact same format. Any query that can be done with a Commons search query can be done in Structured Search (given that it uses the 'File' namespace). I've made a Commons gadget you can use if you want a link from the Commons search results directly to the tool (https://commons.wikimedia.org/wiki/User:Husky/sdsearch.js).
* I'm really pleased with the translations that have already been made. Updating the localisations is still a manual process; I'll change that to something more automatic in the future.
* I don't intend this to be a replacement for the regular search on Commons; I just hope it can serve as an inspiration, and as a solution for people who want something more visual than the current search UI.
I don't read this mailing list very often, so for feature requests or bug reports it's probably best to submit an issue on GitHub (https://github.com/hay/wiki-tools) or reach out to me on Twitter (@hayify) or via Wikimail.
Kind regards, -- Hay
On Sun, May 24, 2020 at 9:20 PM Gerard Meijssen gerard.meijssen@gmail.com wrote:
Hoi, I have been a professional developer for much of my working life. From what I know of what Hay has done, I believe you are wrong, depending on the approach taken. Building this functionality can be an iterative process; it does not need to be all-singing, all-dancing from the start. At one time the WMF used the agile methodology, and you can break a project like this up into functional parts.
- The first part is a hack to replace the current code with
internationalised code and have it localised in Translatewiki.net.
- You then want to build in functionality that measures its use. It also
measures the extent to which labeling on Wikidata expands because of this tool. Strictly speaking, this part is not essential.
- As the tool becomes more popular, it follows that the database may need
tuning to allow for its expanded use.
- A next phase is for the code to be made into an extension, enabling
native use in MediaWiki projects. This does not mean Commons; it may be any language project that cares to use it, particularly the smaller languages (fewer than 100,000 articles).
- Given that measurements are in place, it then follows that we learn what
it takes to expand the usage of images, for our projects and beyond. At first, the small languages take precedence; the primary reason is that for those languages there are few pictures to be found when searching Google or Bing.
- When there is an expressed demand from bigger languages (fewer than 1,000,000
articles), we add these on a first-come, first-served basis. This is to ensure a steady growth path in usage.
- Once we understand the scaling issues, we can expand to Commons itself
and any and all projects.
- Once we consider sharing freely licensed media files a priority, we can
speed the process up within the limits of what is technically feasible.
At the same time, we keep the standalone function available. It will serve a subset of our potential public, and it will help us understand the pent-up demand for a service like this. When the WMF is truly "agile" in its development, what priority this gets is a business decision. Much of what I describe has been done by us before; it is not rocket science. The first phase could be done within a month. Scaling up the usage and integrating it into existing code and projects may indeed take the best part of a year. Again, that is not so much a technical consideration as a business one. As always, technical issues may crop up, and they get addressed as part of an agile process. Thanks, GerardM
On Sun, 24 May 2020 at 20:36, Michael Peel email@mikepeel.net wrote:
Hi Gerard,
I mostly agree with you. However, I disagree with this:
This proof of concept is largely based on existing WMF functionality so it takes very little for the Wikimedia Foundation to adopt the code, do it properly particularly for the Internationalisation.
Turning prototype code into production code is never trivial. When you’re writing a prototype, you get to skip all performance and edge-case concerns, and you don’t need to integrate it into existing code; you’re just interested in getting something working. I hope (and expect) that the WMF will make improvements to Commons’ multilingual search in the future, but it’s definitely not a “very little” amount of work that needs doing: it’s a year or more of developer time.
Thanks, Mike _______________________________________________ Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and https://meta.wikimedia.org/wiki/Wikimedia-l New messages to: Wikimedia-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe