Pronunciation recording tool wanted - Wiktionary-l

13 Mar 2013


      In Wiktionary, it's very convenient that some words
have sound illustrations, e.g.
http://en.wiktionary.org/wiki/go%C3%BBter
These audio clips are simple 2-3 second Ogg files, e.g.
http://commons.wikimedia.org/wiki/File:Fr-go%C3%BBter.ogg
but they are limited in number. Recording more of
them would be easy in itself, but getting started
takes time: you have to learn the details, upload
to Commons, specify a license, provide a
description, ... It's not very likely that the
person who does all that also has a good voice in
each desired language.

Here's a better plan:
Provide a tool on the Toolserver, or any other
server, with a simple link syntax that specifies
the language code and the text, e.g.
http://toolserver.org/mytool.php?lang=fr&text=gouter
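
To make the link syntax concrete, here is a rough Python sketch of how such links could be built and read back. The base URL is just the placeholder from the example above, and the function names build_link and parse_link are made up for illustration; nothing here is an existing tool.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Placeholder address taken from the example link; no such tool exists yet.
TOOL_URL = "http://toolserver.org/mytool.php"


def build_link(lang, text):
    """Return a tool link for one word; non-ASCII text is percent-encoded."""
    return TOOL_URL + "?" + urlencode({"lang": lang, "text": text})


def parse_link(url):
    """Return the (lang, text) pair that a tool link asks the user to record."""
    query = parse_qs(urlsplit(url).query)
    return query["lang"][0], query["text"][0]
```

For example, build_link("fr", "goûter") encodes the circumflex the same way the Wiktionary URL above does, and parse_link turns it back into the readable word.
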
The tool uses a cookie to remember that the user
has agreed to submit contributions under CC0.
On the first visit, the user is asked to agree
through a click-through license.
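
The cookie part could look roughly like the following sketch, which uses Python's standard http.cookies module. The cookie name cc0_agreed, the one-year lifetime and both function names are assumptions made for this example only.

```python
from http.cookies import SimpleCookie

COOKIE_NAME = "cc0_agreed"  # hypothetical cookie name


def agreement_cookie(max_age_days=365):
    """Build the Set-Cookie value sent once the user accepts the license."""
    cookie = SimpleCookie()
    cookie[COOKIE_NAME] = "1"
    cookie[COOKIE_NAME]["max-age"] = max_age_days * 24 * 3600
    cookie[COOKIE_NAME]["path"] = "/"
    return cookie[COOKIE_NAME].OutputString()


def has_agreed(cookie_header):
    """Check the Cookie request header; False means show the click-through."""
    cookie = SimpleCookie(cookie_header or "")
    return COOKIE_NAME in cookie and cookie[COOKIE_NAME].value == "1"
```

On each visit the tool would call has_agreed with the incoming Cookie header and only show the license question when it returns False.
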
The user is now prompted with the text (from the URL)
and recording starts when the user presses a button.
The user says the word and presses the button again.
The tool saves the Ogg sound and uploads it to Commons
with the filename fr-gouter-XYZ789.ogg, the CC0
declaration and all metadata, placing it in a
category of recorded but unverified words.
Another user can record the same word, and it will
be given another random letter-digit code.
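
The naming scheme can be sketched in a few lines of Python. The six-character code length is read off the XYZ789 example, and the function name recording_filename is made up; a real tool would also have to check Commons for an existing file with the same name.

```python
import secrets
import string

# Uppercase letters and digits, matching the look of "XYZ789".
ALPHABET = string.ascii_uppercase + string.digits


def recording_filename(lang, text, code_length=6):
    """Return a name like fr-gouter-XYZ789.ogg with a fresh random code."""
    code = "".join(secrets.choice(ALPHABET) for _ in range(code_length))
    return f"{lang}-{text}-{code}.ogg"
```

Two users recording the same word get different codes with very high probability, so their uploads do not overwrite each other.
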
As a separate part of the tool, other volunteers are
asked to verify or rate (1 to 5 stars) the recordings
available in a given language. The ratings are stored
as categories on Commons.
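
One way to store ratings as categories is to encode the stars and the language in the category name, as in the sketch below. The category wording is invented for this example and does not correspond to any existing Commons category.

```python
import re

# Hypothetical category naming scheme, for illustration only.
_RATING_RE = re.compile(r"^Pronunciation recordings rated ([1-5]) stars \((.+)\)$")


def rating_category(lang, stars):
    """Return the category name that records a 1 to 5 star rating."""
    if not 1 <= stars <= 5:
        raise ValueError("stars must be between 1 and 5")
    return f"Pronunciation recordings rated {stars} stars ({lang})"


def parse_rating(category):
    """Return (lang, stars) for a rating category, or None for any other."""
    match = _RATING_RE.match(category)
    if match is None:
        return None
    return match.group(2), int(match.group(1))
```

A bot could then read a file's categories, keep those for which parse_rating returns a value, and pick the best recording per word.
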
Now, a separate procedure (manual or a bot job) can
pick words that need new or improved recordings,
and list them (with links to the tool) on a normal
wiki page.
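
The listing step could be as simple as this sketch, which turns a list of wanted words into wikitext bullet lines with external links to the tool. The base URL is again the placeholder from above, and wanted_list is a made-up name; how the words are selected is left open.

```python
from urllib.parse import urlencode

# Placeholder address taken from the example link; no such tool exists yet.
TOOL_URL = "http://toolserver.org/mytool.php"


def wanted_list(lang, words):
    """Return wikitext: one bullet per word, each linking to the tool."""
    lines = []
    for word in words:
        link = TOOL_URL + "?" + urlencode({"lang": lang, "text": word})
        lines.append(f"* [{link} {word}]")
    return "\n".join(lines)
```

The result can be pasted or bot-written onto a normal wiki page, where each bullet shows the word and opens the recorder for it.
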
I know HTML supports file uploads, but I don't know
how to record sound directly to a web service.
Perhaps this could be a Skype application?
I have no idea. Please just be creative. It should be
solvable, because this is 2013 and not 2003.
-- 
   Lars Aronsson (lars@aronsson.se)
   Aronsson Datateknik - http://aronsson.se