[Wikimedia-l] The case for supporting open source machine translation
Mathieu Stumpf
psychoslave at culture-libre.org
Wed Apr 24 13:37:39 UTC 2013
On 2013-04-24 12:35, Denny Vrandečić wrote:
> 3) Wiktionary could be an even more amazing resource if we would finally
> tackle the issue of structuring its content more appropriately. I think
> Wikidata opened a few venues to structure planning in this direction and
> provide some software, but this would have the potential to provide more
> support for any external project than many other things we could tackle
If you have any information or ideas related to structuring Wiktionary,
please share them at https://meta.wikimedia.org/wiki/Wiktionary_future
> One idea I have been mulling over for years is basically how can we use
> this advantage for the task of creating content available in many
> languages. Wikidata is an obvious attempt at that, but it really goes only
> so far. The system I am really aiming at is a different one, and there has
> been plenty of related work in this direction: imagine a wiki where you
> enter or edit content, sentence by sentence, but the natural language
> representation is just a surface syntax for an internal structure.
I don't understand what you mean. To begin with, I doubt that the sentence
is the right scale at which to translate natural language discourse. Sure,
sometimes you can translate one word with one word in another language.
Sometimes you can translate a sentence with one sentence. Sometimes you
need to take in the whole paragraph, or even more, and sometimes you need
a whole cultural background to get the meaning of a single word in its
current context. To my mind, natural languages involve more than
context-free languages can capture. Could a static "internal structure"
cope with such dynamics?
> Your editing interface is a constrained, but natural language.
This is really where I don't see how you hope to manage that.
> Now, in order to really make this fly, both the rules for the parsers
> (interpreting the input) and the serializer (creating the output) would
> need to be editable by the community - in addition to the content itself.
> There are a number of major challenges involved, but I have by now a fair
> idea of how to tackle most of them (and I don't have the time to detail
> them right now).
Well, I would be curious to have more information, such as references I
should read. Otherwise I'm afraid that what you say sounds like Fermat's
Last Theorem[1] and the famous margin that was too small to contain
Fermat's alleged proof.
[1] https://en.wikipedia.org/wiki/Fermat%27s_Last_Theorem
--
Association Culture-Libre
http://www.culture-libre.org/