Hey everyone, first of all, thanks for all the compliments on my tool! I'm really glad people find it useful.
I've made a couple of updates to the tool, including i18n support, category search, and links to PetScan. I've posted an update on Twitter: https://twitter.com/hayify/status/1264647593218408449
Just a couple of general observations and thoughts about this project. When I started development, I had two main goals:
1) To showcase the beauty and richness of the media on Wikimedia Commons. There are many files on Commons that are of exceptional quality, but unfortunately the current search results page doesn't make much of an effort to showcase them, compared to search engines like Google or Yandex. This is why the Structured Search results overview focuses on big thumbnails rather than on metadata (I doubt many users are interested in the file size or resolution, for example, which are shown on the current Commons results page). This has been a pet peeve of mine for many years; I even came across a Phabricator ticket from almost five(!) years ago where I was already proposing something like this (https://phabricator.wikimedia.org/T104565).
2) To showcase the usefulness of structured data. Currently the only way to search by structured data is with the 'haswbstatement' filter in the search engine. I doubt many people know about it (I didn't until Maarten Dammers pointed me to it), and its user-friendliness is pretty low, given that you need to fill in property and item numbers by hand. The best way to find media based on structured data would be a SPARQL endpoint (just as on Wikidata), but unfortunately work on that has been slow (see https://phabricator.wikimedia.org/T141602). So this tool is basically a stopgap for authoring Commons search queries using haswbstatement until we have a proper SPARQL endpoint for SDoC.
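For anyone who hasn't tried the filter: a haswbstatement query is just a string you can put in the normal search box or in the MediaWiki search API. Here's a minimal sketch of building such an API request; the helper function name and the P180=Q146 ("depicts: house cat") property/item pair are illustrative, any pair works the same way.

```python
from urllib.parse import urlencode

def commons_search_url(prop, item, limit=10):
    """Build a Commons search API URL for files whose structured data
    contains the statement prop=item (e.g. P180=Q146, 'depicts: house cat')."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": f"haswbstatement:{prop}={item}",
        "srnamespace": 6,  # namespace 6 = File:
        "srlimit": limit,
        "format": "json",
    }
    return "https://commons.wikimedia.org/w/api.php?" + urlencode(params)

print(commons_search_url("P180", "Q146"))
```

The same `haswbstatement:P180=Q146` string can also be combined with free text in the regular on-wiki search box, since it's just an ordinary search keyword.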
A couple of other points that might be of use:

* Structured Search queries are completely compatible with Wikimedia Commons search engine queries; it's the exact same format. Any query that can be done with a Commons search query can be done in Structured Search (given that it uses the 'File' namespace). I've made a Commons gadget you can use if you want a link from the Commons search results directly to the tool (https://commons.wikimedia.org/wiki/User:Husky/sdsearch.js).
* I'm really pleased with the translations that have already been made. Updating the localisations is still a manual process; I'll change that to something more automatic in the future.
* I don't intend this to be a replacement for the regular search on Commons; I just hope it can serve as an inspiration, and as a solution for people who want something more visual than the current search UI.
I don't read this mailing list very often, so for feature requests or bug reports it's probably best to submit an issue on GitHub (https://github.com/hay/wiki-tools) or reach out to me on Twitter (@hayify) or via Wikimail.
Kind regards, -- Hay
On Sun, May 24, 2020 at 9:20 PM Gerard Meijssen gerard.meijssen@gmail.com wrote:
Hoi, I have been a professional developer for much of my working life. From what I know of what Hay has done, I believe you are wrong, depending on the approach taken. Building this functionality can be an iterative process; it does not need to be all-singing, all-dancing from the start. At one time the WMF used the agile methodology, and you can break a project like this up into functional parts.
- The first part is a hack to replace the current code with
internationalised code and have it localised in Translatewiki.net.
- You then want to build in functionality that measures its use. It also
measures the extent to which labeling on Wikidata expands because of this tool. Strictly speaking, this part is not essential.
- As the tool becomes more popular, it follows that the database may need
tuning to allow for its expanded use.
- A next phase is for the code to be made into an extension, enabling
native use in MediaWiki projects. This does not mean Commons; it may be any language project that cares to use it, particularly the smaller languages (fewer than 100,000 articles).
- Given that measurements are in place, it then follows that we learn what
it takes to expand the usage of images, for our projects and beyond. At first, the small languages take precedence; the primary reason is that for those languages there are few pictures to be found when searching Google or Bing.
- When there is an expressed demand from bigger languages (fewer than 1,000,000
articles), we add these on a first-come, first-served basis. This is to ensure a steady growth path in usage.
- Once we understand the scaling issues, we can expand to Commons itself
and any and all projects.
- Once we consider sharing freely licensed media files a priority, we can
speed the process up within the limits of what is technically feasible.
At the same time, we keep the standalone function available. It will serve a subset of our potential public, and it will help us understand the pent-up demand for a service like this. When the WMF is truly "agile" in its development, what priority this gets is a business decision. Much of what I describe has been done by us before; it is not rocket science. The first phase could be done within a month. Scaling up the usage and integrating it into existing code and projects may indeed take the best part of a year. Again, that is not so much a technical consideration as a business one. As always, technical issues may crop up, and they get addressed as part of an agile process. Thanks, GerardM
On Sun, 24 May 2020 at 20:36, Michael Peel email@mikepeel.net wrote:
Hi Gerard,
I mostly agree with you. However, I disagree with this:
This proof of concept is largely based on existing WMF functionality so it takes very little for the Wikimedia Foundation to adopt the code, do it properly particularly for the Internationalisation.
Turning prototype code into production code is never trivial. When you’re writing a prototype, you get to skip all performance and edge-case concerns, and you don’t need to integrate it into existing code; you’re just interested in getting something working. I hope (and expect) that the WMF will make improvements to Commons’ multilingual search in the future, but it’s definitely not a “very little” amount of work that needs doing: it’s a year or more of developer time.
Thanks, Mike _______________________________________________ Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and https://meta.wikimedia.org/wiki/Wikimedia-l New messages to: Wikimedia-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe