Hi, Andrzej
Send Abstract-Wikipedia mailing list submissions to
abstract-wikipedia@lists.wikimedia.org
To subscribe or unsubscribe via the World Wide Web, visit
https://lists.wikimedia.org/mailman/listinfo/abstract-wikipedia
or, via email, send a message with subject or body 'help' to
abstract-wikipedia-request@lists.wikimedia.org
You can reach the person managing the list at
abstract-wikipedia-owner@lists.wikimedia.org
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Abstract-Wikipedia digest..."
Today's Topics:
1. Re: Comprehension questions (Charles Matthews)
2. Natural Language and Mathematics Generation (Adam Sobieski)
3. Re: Natural Language and Mathematics Generation (Charles Matthews)
4. Loose notes (Andy)
----------------------------------------------------------------------
------------------------------
Message: 4
Date: Mon, 3 Aug 2020 12:29:03 +0200
From: Andy <borucki.andrzej@gmail.com>
To: abstract-wikipedia@lists.wikimedia.org
Subject: [Abstract-wikipedia] Loose notes
Message-ID:
<CAE2KeAK00kSL=jJp8gNGPNp_N8KGH0yXXUXKSa6XLM9R-ParvA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hi,
Abstract Wikipedia offers two benefits:
- First, it creates a multilingual corpus for training machine
translation. The big disadvantage of the existing multilingual corpora is
that most of their data comes from movie subtitles, which are very
inaccurate.
- Second, it will provide data for learning Word Sense Disambiguation
(WSD), and in many languages(!).
The abstract form should contain a graph of senses. Will the senses be
chosen from English WordNet/UNL or from English Wiktionary? UNL is a good
piece of work, but it has been inactive for years and is not evolving.
Wiktionary senses have the advantage of being grouped by etymology: quite
different senses fall into different etymology groups.
Will Abstract Wikipedia be linked with Wiktionary? Wiktionary sense numbers
would then need to be persistent, or better, senses should have unique
identifiers. Wiktionary has the advantage that senses are translated into
other languages, and the disadvantage that the translations point to words,
not to senses, in the other language. Alternatively, Abstract Wikipedia
could have its own sense list with identifiers, but then how would it link
with Wiktionary?
Graph: it should be possible to generate text in many/all languages. For
example, English has “I saw”, while Polish has “widziałem” or “widziałam”:
Polish needs the gender. So the abstract form should carry the gender for
the verb, even though some languages do not use it.
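A minimal sketch of this point in Python, for illustration only. The
AbstractClause structure, its field names, the "see.v.01" sense label and
the two tiny renderers are assumptions made up for this example, not an
existing Abstract Wikipedia format:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AbstractClause:
    """Hypothetical abstract form of the clause 'speaker saw'."""
    verb_sense: str               # illustrative sense label, e.g. "see.v.01"
    person: int                   # 1 = first person
    tense: str                    # e.g. "past"
    gender: Optional[str] = None  # "m" or "f"; ignored by some languages


def _check_supported(clause: AbstractClause) -> None:
    # The sketch only knows one clause: first person, past tense, "see".
    if (clause.verb_sense, clause.person, clause.tense) != ("see.v.01", 1, "past"):
        raise NotImplementedError("sketch covers one clause only")


def render_en(clause: AbstractClause) -> str:
    # English does not mark gender on the verb, so the field is unused.
    _check_supported(clause)
    return "I saw"


def render_pl(clause: AbstractClause) -> str:
    # The Polish past tense needs the gender of the subject.
    _check_supported(clause)
    forms = {"m": "widziałem", "f": "widziałam"}
    if clause.gender not in forms:
        raise ValueError("Polish rendering needs gender 'm' or 'f'")
    return forms[clause.gender]
```

The English renderer works with or without the gender field, while the
Polish one refuses to guess, which is why the abstract form has to store it.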
The sense dictionary can grow gradually along with the abstract text. When
I edit abstract text, the editor should require me to add a word and its
senses to the dictionary if the word is not there yet, and let me add a new
sense if the one I need does not exist.
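The editor behaviour described above could look roughly like the Python
sketch below. All names here (Sense, SenseDictionary, check_token,
UnknownSenseError) and the "S1", "S2" identifier scheme are hypothetical,
chosen only to show identifiers that are never renumbered and a check that
rejects unknown senses:

```python
from dataclasses import dataclass
from typing import Dict, List


class UnknownSenseError(Exception):
    """Raised when abstract text uses a word/sense pair not in the dictionary."""


@dataclass(frozen=True)
class Sense:
    sense_id: str  # persistent identifier, assigned once and never reused
    gloss: str     # short description of the meaning


class SenseDictionary:
    def __init__(self) -> None:
        self._entries: Dict[str, List[Sense]] = {}
        self._next_id = 1

    def senses(self, word: str) -> List[Sense]:
        # Return a copy so callers cannot alter the stored list.
        return list(self._entries.get(word, []))

    def add_sense(self, word: str, gloss: str) -> Sense:
        # Adds the word if it is missing, then appends a new sense to it.
        sense = Sense(f"S{self._next_id}", gloss)
        self._next_id += 1
        self._entries.setdefault(word, []).append(sense)
        return sense

    def has_sense(self, word: str, sense_id: str) -> bool:
        return any(s.sense_id == sense_id for s in self._entries.get(word, []))


def check_token(dictionary: SenseDictionary, word: str, sense_id: str) -> None:
    # Editor-side check: the author must add the word or sense first.
    if not dictionary.has_sense(word, sense_id):
        raise UnknownSenseError(f"{word!r} has no sense {sense_id!r}")
```

An editor would call check_token for every sense-tagged word on save and,
on failure, offer add_sense so the dictionary grows with the text.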
What is needed:
- abstract text = corpus
- a growing dictionary of senses
- a growing mapping from these senses to national-language sense
dictionaries
- possibly a link with the Wiktionaries
Best regards,
Andrzej
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.wikimedia.org/pipermail/abstract-wikipedia/attachments/20200803/2075227b/attachment.html>
------------------------------
End of Abstract-Wikipedia Digest, Vol 2, Issue 4
************************************************