Hi Amir,
I understand the process is different from usual research. In fact, I've seen Wikipedia grow from an unknown website into the biggest encyclopedia it is now. I use it daily in multiple languages and love it. I know what crowdsourcing can achieve.
It's also possible that the mere *finding* of these stumbling blocks by such a big, diverse, open, and active community, will itself be a contribution to the scientific knowledge around this subject.
I disagree here. It would be a contribution to scientific knowledge if and only if it wasn't discovered before. My email was precisely about that: capitalizing on the knowledge that has already been discovered, to avoid making the same mistakes again. It would not matter for a small project, but this one is really ambitious. We are speaking of 40 years of work by a horde of talented and very knowledgeable people, so this isn't to be dismissed easily.
The thing is, my previous email was a bit abstract, because it was a review of the paper, not of the project itself. I should have given more examples to illustrate where the problem lies.
Let's start with a simple example, in English, with the corresponding Wikidata entities in parentheses. I'm also using pseudo-Turtle notation with made-up relationships.
France (Q142) is a country (Q6256). <Q142> <rel_is> <Q6256> .
Creating the English sentence is straightforward with the naive approach presented in the paper.
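For instance, a naive renderer for this sentence could be as small as the following sketch (the relation and lexicon names are made up for illustration; this is not the paper's actual design):

    # Minimal sketch of a naive English renderer (invented names, illustration only).
    LABELS_EN = {"Q142": "France", "Q6256": "country"}

    def render_is_a_en(subject_qid, class_qid):
        # "X is a Y." with a naive indefinite-article choice and no other agreement logic.
        subj = LABELS_EN[subject_qid]
        obj = LABELS_EN[class_qid]
        article = "an" if obj[0].lower() in "aeiou" else "a"
        return f"{subj} is {article} {obj}."

    print(render_is_a_en("Q142", "Q6256"))  # France is a country.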
What is the French equivalent? La France est un pays.
More information is required in the abstract representation: the text generator needs to know the gender of both nouns (France and pays). So we need to extend the model as follows:
<Q142> <rel_gender> <Q1775415> . <Q6256> <rel_gender> <Q499327> .
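A French renderer then has to consult this gender information just to pick the articles (a sketch in the same spirit as above, with invented names, and ignoring complications such as elision):

    # Sketch of a French renderer that needs grammatical gender for agreement.
    LABELS_FR = {"Q142": "France", "Q6256": "pays"}
    GENDER = {"Q142": "Q1775415", "Q6256": "Q499327"}  # from the triples above

    def render_is_a_fr(subject_qid, class_qid):
        # Both the subject's definite article and the class's indefinite article
        # depend on grammatical gender.
        subj_article = "La" if GENDER[subject_qid] == "Q1775415" else "Le"
        obj_article = "une" if GENDER[class_qid] == "Q1775415" else "un"
        return f"{subj_article} {LABELS_FR[subject_qid]} est {obj_article} {LABELS_FR[class_qid]}."

    print(render_is_a_fr("Q142", "Q6256"))  # La France est un pays.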
Fine! Now what about Chinese? 法國是一個國家。
What we have in the middle of the sentence is a classifier (個). The model needs the following update:
<Q6256> <rel_use_classifier> <Q63153> .
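And the Chinese renderer has to look that classifier up (again an invented-names sketch; real classifier choice depends on the noun and the construction):

    # Sketch of a Chinese renderer: this construction needs a numeral and a classifier.
    LABELS_ZH = {"Q142": "法國", "Q6256": "國家"}
    CLASSIFIER = {"Q6256": "個"}  # classifier for the class noun, cf. rel_use_classifier above

    def render_is_a_zh(subject_qid, class_qid):
        # "X 是 一 + classifier + Y。" Note that 一 and the classifier appear in this
        # particular construction, not every time 是 (to be) is used.
        clf = CLASSIFIER[class_qid]
        return f"{LABELS_ZH[subject_qid]}是一{clf}{LABELS_ZH[class_qid]}。"

    print(render_is_a_zh("Q142", "Q6256"))  # 法國是一個國家。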
To handle these 3 languages, the model already has 3 additional triples just to account for linguistic facts occurring in these languages. Wikipedia exists in more than 300 languages, and the world has about 6000 of them, each with particularities that must be taken into account. Fortunately, many of them overlap across languages. Nonetheless, the World Atlas of Language Structures (https://wals.info/chapter/s1) counts 144 distinct language features. Some are related to speech, but this means there are probably something like a hundred features that must be taken into account in the data model to produce valid natural language sentences. Note that in the Chinese example, there is also a number (一, one) showing up. This is a phenomenon that must be taken into account, and it does not always appear when 是 (to be) is used. How complex will the "lambda" system need to be just to deal with this issue? Hint: very. It also needs to be compatible with dozens of other phenomena.
Then each of those features requires extensive and complete data. For French, the gender of every noun entity *must* be present, otherwise there is a fifty-fifty chance of producing a wrong sentence each time a noun entity is encountered. For Chinese and Japanese, classifier information must be present for every noun, in case one must be enumerated. Where will the project get this data from? (We are speaking of millions of items, most of them not referenced in existing dictionaries.) How will it be encoded? Those are real questions that must be answered.
Suppose now we have done the work for "renderers" in these three languages. They all use a more or less similar A X B structure where X is a verb meaning "to be".
What would be the Japanese equivalent? The most natural structure would be something like: フランスは国(だ)。
What is at play here is a topicalization (Q63105) of France, followed by a predicate (it's a country). This is very different from the previous structure and, not surprisingly, needs its own representation. To make the situation more difficult, the previous (A be B) structure also exists in Japanese, but would lead to a totally different sentence if used.
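A Japanese renderer therefore has to select a construction, not just fill a template (same kind of invented-names sketch as above):

    # Sketch of a Japanese renderer: the triple is realized as a topic-comment
    # construction rather than by reusing the "A is a B" template of the other languages.
    LABELS_JA = {"Q142": "フランス", "Q6256": "国"}

    def render_is_a_ja(subject_qid, class_qid):
        # Topic marker は + predicate noun (+ optional copula だ): フランスは国だ。
        # Mechanically reusing the English/French/Chinese template (for instance
        # inserting a numeral as in the Chinese sentence) would yield a different,
        # less natural sentence, so this construction needs its own rule or representation.
        return f"{LABELS_JA[subject_qid]}は{LABELS_JA[class_qid]}だ。"

    print(render_is_a_ja("Q142", "Q6256"))  # フランスは国だ。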
The paper states that Figures 1 and 2 are examples that will be more complex in real life. Yet the use of any existing formalism is dismissed, which means all the situations I illustrated in this email will need to be dealt with in an ad hoc fashion. Moreover, changing the formalism (be it ad hoc or not) will require changing every piece of code/data that uses it. This will happen every time a language with unsupported feature(s) is added to the project. It's not hard to see how this will waste a huge amount of time and goodwill from the people involved. The very code-focused tone of the paper, the English-centric approach used in the examples and the lack of references show that the complexity of the task on the NLP front is not sufficiently conceptualized.
Best Regards, Louis Lecailliez
________________________________
From: Abstract-Wikipedia <abstract-wikipedia-bounces@lists.wikimedia.org> on behalf of abstract-wikipedia-request@lists.wikimedia.org
Sent: Saturday, 4 July 2020 15:06
To: abstract-wikipedia@lists.wikimedia.org
Subject: Abstract-Wikipedia Digest, Vol 1, Issue 6
Today's Topics:
1. Re: NLP issues severely overlooked (Charles Matthews)
2. Use case: generation of short description (Jakob Voß)
3. Re: NLP issues severely overlooked (Amir E. Aharoni)
----------------------------------------------------------------------
Message: 1
Date: Sat, 4 Jul 2020 14:05:09 +0100 (BST)
From: Charles Matthews <charles.r.matthews@ntlworld.com>
To: "General public mailing list for the discussion of Abstract Wikipedia (aka Wikilambda)" <abstract-wikipedia@lists.wikimedia.org>
Subject: Re: [Abstract-wikipedia] NLP issues severely overlooked
It is interesting to be on a list where one can hear about software issues, and then computational linguistic problems. I'm not an expert in either area.
I do have 17 years of varied Wikimedia experience (and I use my real name there).
On 04 July 2020 at 12:25 Louis Lecailliez louis.lecailliez@outlook.fr wrote:
<snip>
Nothing precise is said about linguistic resources in the AW paper except for "These function finally can call the lexicographic knowlegde stored in Wikidata.", which is not very convincing: first because the Wiktionary projects themselves severely lack content and structure, for those that have any content at all; secondly because specialized NLP resources are missing there too (note: I'm not familiar with Wikidata so I could be wrong, however I have never seen it cited for the kind of NLP resources I'm talking about).
I can comment about this. Besides Wiktionary, there is the "lexeme" namespace of Wikidata. It is a relatively new part of Wikidata, dealing with verbal forms.
To finish on a positive note, I would like to highlight the points I really like in the paper. First, its collaborative and open nature, like all Wikimedia projects, gives it a much higher chance of success than its predecessors.
It is worth saying, for context, that there is a certain style or philosophy coming from the wiki side: more precisely, from the wikis before Wikipedia. There is the slogan "what is the simplest thing that would actually work?" You might argue, plausibly, that Wikipedia, at nearly 20 years old, shows that there is a bit more to engineering than that.
On the other hand, looking at Wikidata at seven years old, there is some point to the comment. It has a rather simple approach to linked structured data, compared to the Semantic Web environment. (Really, just write a very large piece of JSON and try to cope with it!) But the number of binary relations used (8K, if you count the "external links" handling) is now quite large, and has grown organically. The data modelling is in a sense primitive, sometimes non-existent. But the range of content handled really is encyclopedic. And in an area like scientific bibliography, at a scale of tens of millions of entities, the advantages of not much ontological fussiness begin to be seen.
Wikidata started as an index of all Wikipedia articles, and is now five times the size needed for that: a very enriched "index".
I suppose the NLP required to code up, for example, 50K chemistry articles about molecules, might be a problem that could be solved, leaving aside the general problems for the moment.
In any case, there is a certain approach, neither academic nor commercial, that comes with Wikimedia and its communities, and it will be interesting to see how the issues are addressed.
Charles Matthews (in Cambridge UK)
------------------------------
Message: 2
Date: Sat, 4 Jul 2020 08:18:56 +0200
From: Jakob Voß <jakob.voss@gbv.de>
To: abstract-wikipedia@lists.wikimedia.org
Subject: [Abstract-wikipedia] Use case: generation of short description
Hi,
I want to auto-generate disambiguation descriptions for African politicians to be added to Wikidata, e.g. from the country Mozambique (Q1029) the following descriptions should be generated:
Mozambican politician (en)
Mosambikanischer Politiker (de)
politico mozambicano (it)
...
This could be extended to other professions. My questions:
- Can anyone point me to data sources where to best look up country adjectives such as "Mozambican"?
- Where/how to best store the lexical information for best reuse with other renderers
- If I create small renderers for these short descriptions, what architecture do you prefer for best reuse?
My just-get-it-done solution would be a set of CSV files and a few lines of Perl code, but maybe this use case can be aligned with Abstract Wikipedia to learn more about it.
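Purely as an illustration of the shape such a renderer could take (hypothetical data layout, not a recommendation for the final architecture):

    # Sketch: generate short descriptions from a tiny per-language lexicon.
    # Hypothetical data; real adjectives would come from CSV files or from lexemes.
    COUNTRY_ADJECTIVE = {
        "Q1029": {"en": "Mozambican", "de": "mosambikanischer", "it": "mozambicano"},
    }
    PROFESSION = {
        "politician": {"en": "politician", "de": "Politiker", "it": "politico"},
    }
    # Word order and capitalization differ per language.
    PATTERN = {"en": "{adj} {noun}", "de": "{adj_cap} {noun}", "it": "{noun} {adj}"}

    def short_description(country_qid, profession, lang):
        adj = COUNTRY_ADJECTIVE[country_qid][lang]
        noun = PROFESSION[profession][lang]
        return PATTERN[lang].format(adj=adj, adj_cap=adj.capitalize(), noun=noun)

    for lang in ("en", "de", "it"):
        print(short_description("Q1029", "politician", lang))
    # Mozambican politician / Mosambikanischer Politiker / politico mozambicano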
Looking forward to collaborating, Jakob
------------------------------
Message: 3
Date: Sat, 4 Jul 2020 18:03:24 +0300
From: "Amir E. Aharoni" <amir.aharoni@mail.huji.ac.il>
To: "General public mailing list for the discussion of Abstract Wikipedia (aka Wikilambda)" <abstract-wikipedia@lists.wikimedia.org>
Subject: Re: [Abstract-wikipedia] NLP issues severely overlooked
Hi,
Thanks a lot for the sources. I am not one of the people implementing Wikilambda; I am just very curious about it as a member of the wider Wikimedia community. But there's a good chance that they will be useful to the people who do work on the implementation.
I will dare to add a little thought I have about it, however. It's possible that the challenge of building a well-functioning natural language generator is underestimated by the founders, and that they don't pay enough attention to existing work (although, knowing Denny, there is a good chance that he actually is aware of at least some of it). But there is something that the wide Wikimedia community has that I'm not sure that the past projects in this field did: The community itself. A big, worldwide, and diverse group of passionate volunteers, who love the idea of spreading free knowledge and who love their languages. Quite a lot of them also know some programming, and in the past they proved unbelievably creative and productive when writing code for Wikimedia projects as a community, in the form of templates, modules, gadgets, bots, extensions, and other tools. I'm quite sure that once the new tools become usable, this community will start doing creative things again, and it will also start reporting bugs and limitations.
So yes, while it's possible that along the way both the core developers and the volunteer community will find all kinds of stumbling blocks, I'm pretty sure that they will also have all kinds of surprising success stories. It's also possible that the mere *finding* of these stumbling blocks by such a big, diverse, open, and active community, will itself be a contribution to the scientific knowledge around this subject. And don't underestimate the "open" part—that's where we really shine. This won't be a theoretical work in a lab, published in a paywalled and copyright-restricted academic journal, but fully optimized for accessibility to everyone.
Yes, this whole email from me is incredibly naïve, but it's the same attitude that got us to writing the biggest and most multilingual encyclopedia in history, so maybe we can do something cool again :)
-- Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי http://aharoni.wordpress.com “We're living in pieces, I want to live in peace.” – T. Moore
On Sat, 4 Jul 2020 at 14:26, Louis Lecailliez <louis.lecailliez@outlook.fr> wrote:
Hello,
My name is Louis Lecailliez, a PhD student in educational technology at Kyoto University. I'm a Computer Science and NLP graduate. One thing I do is work on modelling language learners' knowledge as graphs.
The Abstract Wikipedia project is really interesting. There are however two very concerning issues I spotted when reading the associated paper draft (https://arxiv.org/abs/2004.04733). The following email could be read as negative, but please don't take it as such: my purpose is to avoid spending people's effort and money on things that can (and need to!) be fixed upfront.
- Issues with NLP
The main issue is that the difficulty of the NLP task of generating natural text from an abstract representation is severely overlooked. This stems from the other main problem: the paper is not based on the decades of previous work in that space.
As I understand it, the main value proposition of Abstract Wikipedia (AW) is a computer representation of encyclopedic knowledge that can be projected into different existing natural languages, with the goal of supporting a huge number of them. Plus, an editor to make this happen easily.
This is in fact surprisingly close to what the Universal Networking Language (UNL) project, which started 20 years ago, aims to do. UNL provides a language-agnostic representation of text that uses hypergraphs. A piece of software (called an EnConverter) produces UNL graphs from natural text in a given language. Another kind of software, called a DeConverter, does the reverse, that is, it produces natural text from the abstract representation. This is exactly the function of the "renderers" in the AW paper. The way of doing it is also similar: by applying successive transformations until the final text string is produced. In general, that kind of abstract meaning representation is called an interlingua, and is widely used in Machine Translation (MT) systems.
Disregarding two decades of work, in the UNL case, on the same problem space (rule-based machine translation, here with an abstract language as the fixed source language), which was itself based on a few more decades of work, doesn't seem to be a wise way to start a new project. For a start, the graph representation used in AW will likely not be expressive enough to encode linguistic knowledge; this is why UNL uses hypergraphs instead of graphs.
The problem is glaring when looking at the reference list: it is bloated with irrelevant references (such as those to programming languages [27, 37, 41, 77], Turing completeness being the worst offender [11, 17, 23, ...]) while containing only two references [7, 85] to the really hard part of the project: generating natural language from the abstract representation. There are a few more relevant references about natural language generation, but this isn't enough.
Interestingly, [85] is a UNL paper, but not the main one. Moreover, it is cited in Section 9, "Opening future research". It should instead be placed in a "Previous work" section, which is missing from the paper.
To fill part of this section yet to be written, I propose the following references:
[1*] Uchida, H., Zhu, M., & Della Senta, T. (1999). A gift for a millennium. IAS/UNU, Tokyo. https://www.researchgate.net/profile/Hiroshi_Uchida2/publication/239328725_A...
[2*] Wang-Ju Tsai (2004). La coédition langue-UNL pour partager la révision entre langues d'un document multilingue. [Language-UNL coedition to share revisions in a multilingual document] Thèse de doctorat. Grenoble. https://pdfs.semanticscholar.org/b030/ea4662e393657b9a134c006ca5b08e8a23b3.p...
[3*] Boitet, C., & Tsai, W. J. (2002). La coédition langue<—>UNL pour partager la révision entre les langues d'un document multilingue: un concept unificateur. Proc. TALN-02, Nancy, 22-26. http://www.afcp-parole.org/doc/Archives_JEP/2002_XXIVe_JEP_Nancy/talnrecital...
[4*] Tomokiyo, M., Mangeot, M., & Boitet, C. (2019). Development of a classifiers/quantifiers dictionary towards French-Japanese MT. arXiv preprint arXiv:1902.08061. https://arxiv.org/pdf/1902.08061.pdf
[5*] Boguslavsky, I. (2005). Some controversial issues of UNL: Linguistic aspects. Research on Computer Science, 12, 77-100. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.212.2058&rep=re...
[6*] Boitet, C. (2002). A rationale for using UNL as an interlingua and more in various domains. In Proc. LREC-02 First International Workshop on UNL, other Interlinguas, and their Applications, Las Palmas (pp. 26-31). https://www.cicling.org/2005/unl-book/Papers/003.pdf
[7*] Dhanabalan, T., & Geetha, T. V. (2003, December). UNL deconverter for Tamil. In International Conference on the Convergences of Knowledge, Culture, Language and Information Technologies. http://www.cfilt.iitb.ac.in/convergence03/all%20data/paper%20032-372.pdf
[8*] Singh, S., Dalal, M., Vachhani, V., Bhattacharyya, P., & Damani, O. P. (2007). Hindi generation from Interlingua (UNL). Machine Translation Summit XI. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.78.979&rep=rep1...
[9*] Banarescu, L., Bonial, C., Cai, S., Georgescu, M., Griffitt, K., Hermjakob, U., ... & Schneider, N. (2013, August). Abstract meaning representation for sembanking. In Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse (pp. 178-186). https://www.aclweb.org/anthology/W13-2322.pdf
[10*] Berment, V., & Boitet, C. (2012). Heloise—An Ariane-G5 Compatible Environment for Developing Expert MT Systems Online. In Proceedings of COLING 2012: Demonstration Papers (pp. 9-16). https://www.aclweb.org/anthology/C12-3002.pdf
[11*] Berment, V. (2005). Online Translation Services for the Lao Language. In Proceedings of the First International Conference on Lao Studies, De Kalb, Illinois, USA (pp. 1-11). https://www.researchgate.net/profile/Vincent_Berment/publication/242140227_O...
[1*] is the paper that describes UNL. [2*] is a doctoral thesis discussing a core problem AW is trying to address too. [3*] is a short paper done in the scope of [2*]; even if you don't understand French you can have a look at the figures: two of them are about an editor similar in principle to what AW wants to incorporate. [5*] gives insights about UNL expressivity issues, 10 years after the project's start. [6*] is more on UNL, with a short history and the context in which it is used.
[4*] shows how deep natural language conversion goes: this paper addresses the issue of classifiers in French and Japanese. This is just one linguistic issue, and there are dozens if not hundreds of such issues. An important point is that both of the languages involved need to be taken into account when modelling the abstract encoding, otherwise too much information is lost to produce a correct output.
[7*] and [8*] are very valuable examples of real-world deconverter systems for UNL. As is visible in [7*]'s Figure 1 and [8*]'s Figure 2, the process is *way* more complicated than a single "renderers" box. Moreover, there are very distinct identifiable steps, informed by linguistics. The AW paper does not describe any such structuring of the natural text generation processing steps; everything is supposed to happen in some unstructured "lambda" system. Also missing are the specialized resources (UNL-Hindi dictionary, Tamil word dictionary, co-occurrence dictionary, etc.) required for the task. Nothing precise is said about linguistic resources in the AW paper except for "These function finally can call the lexicographic knowlegde stored in Wikidata.", which is not very convincing: first because the Wiktionary projects themselves severely lack content and structure, for those that have any content at all; secondly because specialized NLP resources are missing there too (note: I'm not familiar with Wikidata so I could be wrong, however I have never seen it cited for the kind of NLP resources I'm talking about).
[10*] is a translation system built with "specialised languages for linguistic programming (SLLPs)", which is the kind of service Wikilambda is supposed to provide for Abstract Wikipedia. [11*] gives an estimate of 2500 hours for the development (by a specialist) of three linguistic modules for Lao processing.
So, with regard to the difficulty of the task and the previous work in the literature, the AW paper does not provide any convincing evidence that the technology on which it is supposed to be built can even reach the state of the art. Dismissing every existing formalism and software system on the grounds of "no consensus commiting to any specific linguistic theory" is not going to work: this will result in an ad hoc, implementation-driven formalism that will have a hard time fulfilling its goal. The NLP part (generating sentences from the abstract representation) is the hardest part of the project, yet it's by far the least convincing one. "Abstract Wikipedia is indeed firmly within this tradition, and in preparation for this project we studied numerous predecessors." I would like to believe so, but the lack of corresponding references as well as the lack of a previous work section tends to prove the contrary.
While I can't advocate for a switch to UNL, as I'm not a specialist in it, it would be smart to capitalize on the work done on it by highly skilled (PhD-level) individuals. As the UNL system is built on hypergraphs, it could probably be made interoperable with RDF knowledge graphs fairly easily if named graphs are used. By having a UNL/RDF specification (yet to be written), the vision exposed in the AW paper might be reached sooner by reusing existing software (we are speaking of thousands of man-years of work as per [11*]), and almost as importantly, an existing formalism that has been "debugged" for decades. There are probably other systems I'm unaware of that are worth investigating too, some, like [9*], having more specialized usage. In any case, there is a strong need to ground the paper and the project in the existing (huge) literature.
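To give a rough idea of what such an encoding could look like (purely illustrative pseudo-TriG with made-up names, not a draft of that specification), a UNL-style relation carrying attributes could be wrapped in a named graph, with the attributes attached to the graph's identifier:

    <g1> { <Q142> <rel_is> <Q6256> . }
    <g1> <unl_attribute> <attr_topic> . <g1> <unl_attribute> <attr_present> .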
- Other issues
"In order to evaluate a function call, an evaluator can choose from a multitude of backends: it may evaluate the function call in the browser, in the cloud, on the servers of the Wikimedia Foundation, on a distributed peer-to-peer evaluation platform, or natively on the user’s machine in a dedicated hosting runtime, which could be a mobile app or a server on the user’s computer."
This part is big technical creep. There is no reason to turn the project into a distributed heterogeneous computing platform with a dedicated runtime, which could be a research project on its own, when the stated goal is to provide abstract multilingual encyclopedic content. All the computation can be done on servers (the cloud is servers too) and cached. This is way easier to implement, test and deliver than ten different backends with varying degrees of implementation progress, incompatibilities and runtime characteristics.
The paper presents AW as sitting on top of WL. Both are big projects. Sitting a big project on top of another one is really risky, as it means a significant milestone must first be reached in the dependency (here WL), which would likely take some years, before work on the other project can even start. AW can be realised with current tools and engineering practices.
"One obstacle in the democratization of programming has been that almost every programming language requires first to learn some basic English."
This strong claim needs to be sourced. Programming languages, save for a few keywords, don't rely much on English. The general failure of localized versions of programming languages (such as French BASIC), as well as the heavy use of existing programming languages in countries that don't even use the Latin alphabet (China, Russia), tends to prove that English is not at all a bottleneck for the democratization of programming. [53] is cited later in the paper but is a pop-linguistics article from an online newspaper, not an academic article.
- Final words
To finish on a positive note, I would like to highlight the points I really like in the paper. First, its collaborative and open nature, like all Wikimedia projects, gives it a much higher chance of success than its predecessors. If UNL is not too well known, it's not because it didn't yield research achievements, but because one selected institution per language works on it and keeps the resources and software within the lab walls. Secondly, there are some very welcome out-of-scope features: conversion from natural language, and the restriction to encyclopedic-style text. This will allow for a more focused effort towards the end goal, making it more achievable. And finally, the choice to go with a symbolic/rule-based system with a touch of other ML where useful. This is, as said in the paper, a big win for explainability and for using human contributions to build the system. It will also keep the computing cost to a saner baseline than what current deep learning models require.
I think the project can succeed thanks to its openness, yet there are real dangers visible in the paper: on the NLP side, reinventing a wheel that took 40 years to build, and on the technical side, losing time and effort on a project not required per se for AW to be built.
As I spent a significant amount of time (~10 hours) gathering references and writing this email (which is 5 pages long in Word), I would like to be mentioned as a co-author in the final paper if any ideas or references presented here are used in it.
Best regards, Louis Lecailliez
PS: Typos
- "These two projects will considerably expand the capabilities of the
Wikimedia platform to enable every single human being to freely share share in the sum of all knowledge." => duplicate share
- "The content is than turned into" => The content is then turned into
- "[26] Charles J Fillmore, Russell Lee-Goldman, and Russell Rhodes. The
framenet constructicon. Sign-based construction grammar, pages 309–372, 2012." => The framenet construction
- "These function finally can call the lexicographic knowlegde stored in
Wikidata." => These function finally can call the lexicographic knowledge stored in Wikidata
- "[102] George Kinsley Zipf. Human Behavior and the Pirnciple of Least
Effort. Addison-Wesley, 1949." => [102] George Kinsley Zipf. Human Behavior and the Principle of Least Effort. Addison-Wesley, 1949.
- "Allowing the individual language Wikipedias to call Wikilambda has an
addtional benefit." => Allowing the individual language Wikipedias to call Wikilambda has an additional benefit. _______________________________________________ Abstract-Wikipedia mailing list Abstract-Wikipedia@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/abstract-wikipedia
------------------------------
On 04 July 2020 at 17:12 Louis Lecailliez louis.lecailliez@outlook.fr wrote:
Hi Amir, I understand the process is different from usual research. In fact, I've seen Wikipedia grow from an unknown website into the biggest encyclopedia it is now. I use it daily in multiple languages and love it. I know what crowdsourcing can achieve.
Um - I really do prefer it if Wikipedia is described as "collaborative and crowdfunded", which is more accurate. What the Abstract Wikipedia can be is collaborative; and funded by some mechanism. I think the distinction is more than a pedantic one.
Charles
On Sat, 4 Jul 2020 at 21:10, Louis Lecailliez <louis.lecailliez@outlook.fr> wrote:
Hi Amir,
I understand the process is different from usual research. In fact, I've seen Wikipedia grow from an unknown website into the biggest encyclopedia it is now. I use it daily in multiple languages and love it. I know what crowdsourcing can achieve.
It's also possible that the mere *finding* of these stumbling blocks by
such a big, diverse, open, and active community, will itself be a contribution to the scientific knowledge around this subject.
I disagree here. It would be a contribution to scientific knowledge if and only if it wasn't discovered before.
Of course. We don't actually disagree.
The content of Wikipedia is by definition secondary (or even tertiary), but its process is innovative, and it was the subject of thousands of academic papers.
The process of developing renderers for multiple languages by a large and open community of volunteers, rather than a small group of paid academics, is probably going to be of interest to researchers, too.
And the output of the process can be useful and innovative, too. Of course, there are things that a team of trained and well-read academics can do and a community of untrained amateurs cannot, no matter how large and motivated it is. But it works the other way, too: there are things that very well-trained academics cannot (or would not) do, but a large motivated community can (and would). Of course, it will work best if there is collaboration between professionals and amateurs.
On 05 July 2020 at 10:25 "Amir E. Aharoni" amir.aharoni@mail.huji.ac.il wrote:
Of course, there are things that a team of trained and well-read academics can do and a community of untrained amateurs cannot, no matter how large and motivated it is. But it works the other way, too: there are things that very well-trained academics cannot (or would not) do, but a large motivated community can (and would). Of course, it will work best if there is collaboration between professionals and amateurs.
It is worth giving some background here. A precursor of AW is the Wikimedia "article placeholder" project. User:Frimelle wrote a dissertation on it, and I see it is online:
https://commons.wikimedia.org/wiki/File:Generating_Article_Placeholders_from...
That was in 2016. In any case, the Wikimedians and academics are not really disjoint groups of people.
Charles
Louis,
Thank you for the information about the Universal Networking Language [1] and the World Atlas of Language Structures [2].
Semantic Modeling
Do you opine that adding attributes to objects, relations and expressions enhances expressiveness for various features of natural language?
r.@a1.@a2(o1(icl>domain1).@a3.@a4, o2(icl>domain2).@a5.@a6).@a7.@a8
I wonder whether there exist mappings or workarounds with which to obtain such expressiveness for models such as Wikidata’s.
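One workaround that comes to mind (a rough sketch only; the relation names are invented and this is not how Wikidata actually models statements) is to reify each attributed relation as a statement node and attach the UNL-style attributes to it, much like qualifiers, reusing the pseudo-Turtle notation from earlier in the thread:

    <stmt1> <rel_subject> <o1> . <stmt1> <rel_relation> <r> . <stmt1> <rel_object> <o2> .
    <stmt1> <rel_attribute> <a1> . <stmt1> <rel_attribute> <a2> .
    <o1> <rel_restriction> <domain1> . <o2> <rel_restriction> <domain2> .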
Scripting Environments for Natural Language Generation
Supposing that Wikilambda could be JavaScript / WebAssembly based, and observing that Lua / WebAssembly solutions exist, we can note that scripting engines such as V8 are easy to embed and easy to add global objects and APIs to. Just as Web browsers provide scripting environments and APIs for functions, we can envision providing scripting environments and APIs for natural language generation functions.
I wonder what you might think about scripting environments and APIs for natural language generation scenarios?
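To make the question concrete, here is a hypothetical sketch (invented names, not a Wikilambda design; written in Python only for concision, the same shape applies to a JavaScript or Lua environment) of the kind of global object such an environment could hand to community-written renderer scripts:

    # Hypothetical sketch of an NLG scripting API (invented names, illustration only).
    class NLGContext:
        """Global object exposed to renderer scripts, wrapping lexical lookups."""
        def __init__(self, lexicon):
            self._lexicon = lexicon  # {(qid, lang): {"lemma": ..., "features": {...}}}
        def lexeme(self, qid, lang):
            return self._lexicon[(qid, lang)]

    # A community-written renderer would only see the context object:
    def render_is_a_en(ctx, subject_qid, class_qid):
        subj = ctx.lexeme(subject_qid, "en")["lemma"]
        obj = ctx.lexeme(class_qid, "en")["lemma"]
        article = "an" if obj[0].lower() in "aeiou" else "a"
        return f"{subj} is {article} {obj}."

    ctx = NLGContext({("Q142", "en"): {"lemma": "France", "features": {}},
                      ("Q6256", "en"): {"lemma": "country", "features": {}}})
    print(render_is_a_en(ctx, "Q142", "Q6256"))  # France is a country.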
Best regards, Adam
[1] https://en.wikipedia.org/wiki/Universal_Networking_Language [2] https://wals.info/
From: Louis Lecailliezmailto:louis.lecailliez@outlook.fr Sent: Saturday, July 4, 2020 2:10 PM To: abstract-wikipedia@lists.wikimedia.orgmailto:abstract-wikipedia@lists.wikimedia.org Subject: Re: [Abstract-wikipedia] NLP issues severely overlooked (Amir E. Aharoni)
Hi Amir,
I understand the process is different that usual research. In fact I've seen Wikipedia grown from an unknown website to the biggest encyclopedia it is now. I use it daily in multiple languages and love it. I know what crowd sourcing could achieve.
It's also possible that the mere *finding* of these stumbling blocks by such a big, diverse, open, and active community, will itself be a contribution to the scientific knowledge around this subject.
I disagree here. It would be contribution to scientic knowledge if and only if it wasn't discovered before. My email was precisely about that: capitalizing on the knowledge that has already been discovered, to avoid making the same mistake them again. It would not matter for a small project, but this one is really ambitious. We are speaking of 40 years of work by a horde of talented and very knowledgeable people, so this isn't to be dismissed easily.
This thing is, my previous email was a bit abstract, because it were a review of the paper, not of the project itself. I should have made more examples to illustrate where the problem lies.
Let's start with a simple example, in English, with corresponding Wikidata entities in-between parenthesis. I'm also using pseudo-turtle notation with made up relationships.
France (Q142) is a country (Q6256). <Q142> <rel_is> <Q6256> .
Creating the English sentence is straightforward with the naive approach presented in the paper.
What is the French equivalent? La France est un pays.
More information is required in the abstract representation: the text generator needs to know about the gender of both nouns (France and pays). So we need to extend the model as such:
<Q142> <rel_gender> <Q1775415> . <Q6256> <rel_gender> <Q499327> .
Fine! Now what about Chinese? 法國是一個國家。
What we have in the middle of the sentence is a classifier (個). The model needs the following update:
<Q499327> <rel_use_classifier> <Q63153> .
To handle these 3 languages, the model has already 3 additional triples just for accounting for linguistic facts occuring in these languages. Wikipedia exists in more than 300 languages, and the world has about 6000 of them, each of them having particularities that must be taken into account. Fortunately they recoup themselves in-between languages. Nonetheless the World Atlas Language Structures (https://wals.info/chapter/s1) count 144 distinct language features. Some are related to speech, but this means there is probably something like a hundred features that must be taken into account in the data model to produce valid natural language sentence. Note that in the Chinese example, there is also a number (一, one) showing up. This is a phenomenon that must be taken into account; and it's not always appearing when using 是 (to be). How complex the "lambda" system will be just to deal with this issue? Hint: very much. It also needs to be compatible with dozen of other phenomena.
Then each of those features require extensive and complete data. For French, the gender of every noun entity *must* be present, otherwise there is half a chance of producing a wrong sentence each time a noun entity is encountered. For Chinese and Japanese, classifier information must be present for all noun, in case one must be enumerated. Where does the project will get the data from? (we are speaking of millions of item, most not referenced in existing dictionaries) How will this be encoded? Those are real questions that must be answered.
Suppose now we have done the work for "renderers" in these three languages. They both use the more or less similar A X B structure where X is a verb meaning "to be".
What would be the Japanese equivalent? The more natural structure would be like: フランスは国(だ)。
What is a play here is a topicalization (Q63105) of France, followed by a predicate (it's a country). This is very different from the previous structure, which, not surprisingly enough, needs it's own representation. To make situation more difficult, the previous (A be B) structure can also exists in Japanese, but would lead to a totally different sentence if used.
The paper states that Figure 1 and 2 are examples that will be more complex in real life. Yet, the use of any existing formalism is dismissed, which mean all the situations I illustrated in this email will need to be dealt with in an ad hoc fashion. Moreover, changing formalism (be it ad hoc or not) will require to change every piece of code/data using it. This will happen everytime a language with unsupported feature(s) is added to the project. It's not hard to see how this will waste a huge amount of time and goodwill from involved people. The very code focussed tone of the paper, the english-centric approach used in the examples and the lack of references shows that the complexity of the task on the NLP front is not sufficiently conceptualized.
Best Regards, Louis Lecailliez
De : Abstract-Wikipedia abstract-wikipedia-bounces@lists.wikimedia.org de la part de abstract-wikipedia-request@lists.wikimedia.org abstract-wikipedia-request@lists.wikimedia.org Envoyé : samedi 4 juillet 2020 15:06 À : abstract-wikipedia@lists.wikimedia.org abstract-wikipedia@lists.wikimedia.org Objet : Abstract-Wikipedia Digest, Vol 1, Issue 6
Send Abstract-Wikipedia mailing list submissions to abstract-wikipedia@lists.wikimedia.org
To subscribe or unsubscribe via the World Wide Web, visit https://lists.wikimedia.org/mailman/listinfo/abstract-wikipedia or, via email, send a message with subject or body 'help' to abstract-wikipedia-request@lists.wikimedia.org
You can reach the person managing the list at abstract-wikipedia-owner@lists.wikimedia.org
When replying, please edit your Subject line so it is more specific than "Re: Contents of Abstract-Wikipedia digest..."
Today's Topics:
1. Re: NLP issues severely overlooked (Charles Matthews) 2. Use case: generation of short description (Jakob Voß) 3. Re: NLP issues severely overlooked (Amir E. Aharoni)
----------------------------------------------------------------------
Message: 1 Date: Sat, 4 Jul 2020 14:05:09 +0100 (BST) From: Charles Matthews charles.r.matthews@ntlworld.com To: "General public mailing list for the discussion of Abstract Wikipedia (aka Wikilambda)" abstract-wikipedia@lists.wikimedia.org Subject: Re: [Abstract-wikipedia] NLP issues severely overlooked Message-ID: 2126327926.39940.1593867909152@mail2.virginmedia.com Content-Type: text/plain; charset="utf-8"
It is interesting to be on a list where one can hear about software issues, and then computational linguistic problems. I'm not an expert in either area.
I do have 17 years of varied Wikimedia experience (and I use my real name there).
On 04 July 2020 at 12:25 Louis Lecailliez louis.lecailliez@outlook.fr wrote:
<snip>
Nothing precise is said about linguistic resources in the AW paper except for "These function finally can call the lexicographic knowlegde stored in Wikidata.", which is not very convincing: first because Wiktionary projects themselves severely lacks content and structure for those who has some content at all, secondly since specialized NLP ressources are missing there too (note: I'm not familiar with Wikidata so I could be wrong, however I never saw it cited for the kind of NLP resources I'm talking about).
I can comment about this. Besides Wiktionary, there is the "lexeme" namespace of Wikidata. It is a relatively new part of Wikidata, dealing with verbal forms.
To finish on a positive note, I would like to highlight the points I really like in the paper. First, its collaborative and open nature, like all Wikimedia projects, gives him a much higher chance of success than its predecessors.
It is worth saying, for context, that there is a certain style or philosophy coming from the wiki side: more precisely, from the wikis before Wikipedia. There is the slogan "what is the simplest thing that would actually work?" You might argue, plausibly, that Wikipedia at nearly 20 years old, shows that there is a bit more to engineering than that.
On the other hand, looking at Wikidata at seven years old, there is some point to the comment. It has a rather simple approach to linked structured data, compared to the Semantic Web environment. (Really, just write a very large piece of JSON and try to cope with it!) But the number of binary relations used (8K, if you count the "external links" handling) is now quite large, and has grown organically. The data modelling is in a sense primitive, sometimes non-existent. But the range of content handled really is encyclopedic. And in an area like scientific bibliography, at a scale of tens of millions of entities, the advantages of not much ontological fussiness begin to be seen.
Wikidata started as an index of all Wikipedia articles, and is now five times the size needed for that: a very enriched "index".
I suppose the NLP required to code up, for example, 50K chemistry articles about molecules, might be a problem that could be solved, leaving aside the general problems for the moment.
In any case, there is a certain approach, neither academic nor commercial, that comes with Wikimedia and its communities, and it will be interesting to see how the issues are addressed.
Charles Matthews (in Cambridge UK) -------------- next part -------------- An HTML attachment was scrubbed... URL: https://lists.wikimedia.org/pipermail/abstract-wikipedia/attachments/20200704/1113bab0/attachment-0001.html
------------------------------
Message: 2 Date: Sat, 4 Jul 2020 08:18:56 +0200 From: Jakob Voß jakob.voss@gbv.de To: abstract-wikipedia@lists.wikimedia.org Subject: [Abstract-wikipedia] Use case: generation of short description Message-ID: 4403bbda-040b-6c89-9cb6-6540139250dc@gbv.de Content-Type: text/plain; charset="utf-8"
Hi,
I want to auto-generate disambiguation description for African politicians to be added to Wikidata, e.g. from the country Mozambique (Q1029) the following descriptions should be generated:
Mozambican politician (en) Mosambikanischer Politiker (de) politico mozambicano (it) ...
This could be extended to other professions. My questions:
- Can anyone point me to data sources where to best look up country adjectives such as "Mozambican"?
- Where/how to best store the lexical information for best reuse with other renderers
- If a create small renderers for this short descriptions, what architecture do you prefer for best reuse?
My just-get-it-done solution would be a set of CSV files and a few lines of Perl code, but maybe this use case can be aligned with Abstract Wikidata to better learn about it.
Looking forward to collaborate, Jakob
------------------------------
Message: 3 Date: Sat, 4 Jul 2020 18:03:24 +0300 From: "Amir E. Aharoni" amir.aharoni@mail.huji.ac.il To: "General public mailing list for the discussion of Abstract Wikipedia (aka Wikilambda)" abstract-wikipedia@lists.wikimedia.org Subject: Re: [Abstract-wikipedia] NLP issues severely overlooked Message-ID: CACtNa8t6kbWe21C980h1MxiWNfUp+0eDE82vPMjDUX2UCgb2gw@mail.gmail.com Content-Type: text/plain; charset="utf-8"
Hi,
Thanks a lot for the sources. I am not one of the people implementing Wikilambda, but I am just very curious about it as a member of the wider Wikimedia community. But there's a good chance that they will be useful to people who do work on the implementation.
I will dare to add a little thought I have about it, however. It's possible that the challenge of building a well-functioning natural language generator is underestimated by the founders, and that they don't pay enough attention to existing work (although, knowing Denny, there is a good chance that he actually is aware of at least some of it). But there is something that the wide Wikimedia community has that I'm not sure that the past projects in this field did: The community itself. A big, worldwide, and diverse group of passionate volunteers, who love the idea of spreading free knowledge and who love their languages. Quite a lot of them also know some programming, and in the past they proved unbelievably creative and productive when writing code for Wikimedia projects as a community, in the form of templates, modules, gadgets, bots, extensions, and other tools. I'm quite sure that once the new tools become usable, this community will start doing creative things again, and it will also start reporting bugs and limitations.
So yes, while it's possible that along the way both the core developers and the volunteer community will find all kinds of stumbling blocks, I'm pretty sure that they will also have all kinds of surprising success stories. It's also possible that the mere *finding* of these stumbling blocks by such a big, diverse, open, and active community, will itself be a contribution to the scientific knowledge around this subject. And don't underestimate the "open" part—that's where we really shine. This won't be a theoretical work in a lab, published in a paywalled and copyright-restricted academic journal, but fully optimized for accessibility to everyone.
Yes, this whole email from me is incredibly naïve, but it's the same attitude that got us to writing the biggest and most multilingual encyclopedia in history, so maybe we can do something cool again :)
-- Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי http://aharoni.wordpress.com “We're living in pieces, I want to live in peace.” – T. Moore
בתאריך שבת, 4 ביולי 2020 ב-14:26 מאת Louis Lecailliez < louis.lecailliez@outlook.fr>:
Hello,
my name is Louis Lecailliez, PhD student at Kyoto University in education technology. I'm a Computer Science and NLP graduate. One thing I do is working on language learner's knowledge modelling as graphs.
The Abstract Wikipedia project is really interesting. There is however two very concerning issues I spotted when reading the associated paper draft ( https://arxiv.org/abs/2004.04733). The following email could be read as negative, but please don't take it as such: my purpose is to avoid spending people efforts and money for things that can (need to!) be fixed upfront.
- Issues with NLP
The main issue is that the difficulty of the NLP task of generating natural text from an abstract representation is severely overlooked. This stems from the other main problem: the paper is not based on the decades of previous work in that space.
As I understand it, the main value proposition of Abstract Wikipedia (AW) is a computer representation of encyclopedic knowledge that can be projected into different existing natural languages, with the goal of supporting a huge number of them. Plus, an editor to make this happen easily.
This is in fact surprisingly extremely close to what the Universal Networking Language (UNL) project, which started 20 years ago, aims to do. UNL provides a language agnostic representation of text that uses hypergraph. Software (called EnConverter) produce UNL graphs from natural text in a given language. Another kind of software called DeConverter do the reverse, that is producing natural text from the abstract representation. This is exactly the same function of the "renderers" in the AW paper. The way of doing it is also similar: by applying successive transformations until the final text string is produced. In general, that kind of abstract meaning representation is called an Interlingua, and is widely used in Machine Translation (MT) systems.
Disregarding two decades of work, in the UNL case, on the same problem space (rule-based machine translation, here from an abstract language as fixed source language), which was itself based on few other decades of work, doesn't seem to be a wise move to start a new project. For a start, the graph representation used in the AW will likely not be expressive enough to encode linguistic knowledge; this is why UNL uses hypergraphs instead of graphs.
The problem is glaring when looking at the references list: the list is bloated with irrelevant references (such as those to programming languages [27, 37, 41, 77], Turing completeness being the worst offender [11, 17, 23, ...]) while containing only two references [7, 85] to the really hard part of the project: generating natural language from the abstract representation. There are few more relevant references about natural language generation, but this isn't enough.
Interestingly, [85] is an UNL paper, but not the main one. Moreover, it is cited in Section 9 "Opening future research". This should be instead placed in a "Previous work" section which is missing from the paper.
To fill a part of this section yet to be written, I propose the following references: [*1] Uchida, H., Zhu, M., & Della Senta, T. (1999). A gift for a millennium. IAS/UNU, Tokyo.
https://www.researchgate.net/profile/Hiroshi_Uchida2/publication/239328725_A... [*2] Wang-Ju Tsai (2004) La coédition langue-UNL pour partager la révision entre langues d'un document multilingue. [Language-UNL coedition to share revisions in a multilingual document] Thèse de doctorat. Grenoble.
https://pdfs.semanticscholar.org/b030/ea4662e393657b9a134c006ca5b08e8a23b3.p... [3*] Boitet, C., & Tsai, W. J. (2002). La coédition langue<—> UNL pour partager la révision entre les langues d'un document multilingue: un concept unificateur. Proc. TALN-02, Nancy, 22-26.
http://www.afcp-parole.org/doc/Archives_JEP/2002_XXIVe_JEP_Nancy/talnrecital... [4*] Tomokiyo, M., Mangeot, M., & Boitet, C. (2019). Development of a classifiers/quantifiers dictionary towards French-Japanese MT. arXiv preprint arXiv:1902.08061. https://arxiv.org/pdf/1902.08061.pdf [5*] Boguslavsky, I. (2005). Some controversial issues of UNL: Linguistic aspects. Research on Computer Science, 12, 77-100.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.212.2058&rep=re... [6*] Boitet, C. (2002). A rationale for using UNL as an interlingua and more in various domains. In Proc. LREC-02 First International Workshop on UNL, other Interlinguas, and their Applications, Las Palmas (pp. 26-31). https://www.cicling.org/2005/unl-book/Papers/003.pdf [7*] Dhanabalan, T., & Geetha, T. V. (2003, December). UNL deconverter for Tamil. In International Conference on the Convergences of Knowledge, Culture, Language and Information Technologies. http://www.cfilt.iitb.ac.in/convergence03/all%20data/paper%20032-372.pdf [8*] Singh, S., Dalal, M., Vachhani, V., Bhattacharyya, P., & Damani, O. P. (2007). Hindi generation from Interlingua (UNL). Machine Translation Summit XI.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.78.979&rep=rep1... [9*] Banarescu, L., Bonial, C., Cai, S., Georgescu, M., Griffitt, K., Hermjakob, U., ... & Schneider, N. (2013, August). Abstract meaning representation for sembanking. In Proceedings of the 7th linguistic annotation workshop and interoperability with discourse (pp. 178-186). https://www.aclweb.org/anthology/W13-2322.pdf [10*] Berment, V., & Boitet, C. (2012). Heloise—An Ariane-G5 Compatible Rnvironment for Developing Expert MT Systems Online. In Proceedings of COLING 2012: Demonstration Papers (pp. 9-16). https://www.aclweb.org/anthology/C12-3002.pdf [11*] Berment, V. (2005). Online Translation Services for the Lao Language. In Proceedings of the First International Conference on Lao Studies. De Kalb, Illinois, USA (pp. 1-11).
https://www.researchgate.net/profile/Vincent_Berment/publication/242140227_O...
[*1] is the paper that describes UNL. [2*] is a doctoral thesis discussing a core problem AW is trying to address too. [3*] is a short paper done in the scope of [2*], even if you don't understand French you can have a look at the figures: two of them are about an editor similar in principe to what AW wants to incorporate. [5*] Insights about UNL expressivity issues, 10 years after the project's start. [6*] More UNL, with short history and context in which it is used.
[4*] shows how deep natural language conversion goes: this paper addresses the issue of classifiers in French and Japanese. This is just one linguistic issue and there are dozens if not hundreds of such. An important point is that both of the languages involved need to be taken into account when modelling the abstract encoding, otherwise too much information is lost for producing a correct output.
[7*] [8*] are very valuable examples of real world deconverter systems for UNL. As it's visible on [7*]'s Figure 1 and [8*]'s Figure 2, the process is *way* more complicated than a single "renderers" box. Moreover, there are very distinct identifiable steps, informed by linguistics. The AW does not describe any such structuration of natural text generation processing steps, everything is supposed to be happening in some unstructured "lambda" system. Also missing are the specialized resources (UNL-Hindi dictionary, Tamil Word dictionary, co-occurrence dictionary, etc.) required for the task. Nothing precise is said about linguistic resources in the AW paper except for "These function finally can call the lexicographic knowlegde stored in Wikidata.", which is not very convincing: first because Wiktionary projects themselves severely lacks content and structure for those who has some content at all, secondly since specialized NLP ressources are missing there too (note: I'm not familiar with Wikidata so I could be wrong, however I never saw it cited for the kind of NLP resources I'm talking about).
[10*] is a translation system built with "specialised languages for linguistic programming (SLLPs)" which is the service Wikilambda is supposed to provide for Abstract Wikipedia. [11*] gives the estimation of 2500 hours for the development (by a specialist) of three linguistic modules for Lao processing.
So, in regard to the difficulty of the task, and previous work in the literature, the AW paper does not provide any convincing evidence that the technology on which it is supposed to be built can even reach the state-of-art. Dismissing every existing formal and software systems on the ground of "no consensus commiting to any specific linguistic theory" is not gonna work: this will result in ad hoc implementation-driven formalism that will have hard time fullfilling its goal. The NLP part (generating sentences from abstract representation) is the hardest of the project, yet it’s by far the least convincing one. "Abstract Wikipedia is indeed firmly within this tradition, and in preparation for this project we studied numerous predecessors." I would like to believe so, but the lack of corresponding reference as well as lack of previous work section tends to prove the contrary.
While I can't advice for a switch to UNL, as I'm not specialist of it, it would be smart to capitalize on the work done on it by highly skilled (PhD level) individuals. As the UNL system is built on hypergraphs, it probably could be made interoperable easily with RDF knowledge graphs if named graphs are used. By having a UNL/RDF specification (yet to be written), the vision exposed in the AW paper may be reached sooner by reusing existing software (we are speaking of thousands man-year of work as per [11*]), and almost as importantly, an existing formalism that has been "debugged" for decades. There are probably other systems I'm unaware of that are worth investigating too, some like [9*] having more specialized usage. In any case, there is a strong need to back the paper and the project on the existing (huge) literature.
- Other issues
"In order to evaluate a function call, an evaluator can choose from a multitude of backends: it may evaluate the function call in the browser, in the cloud, on the servers of the Wikimedia Foundation, on a distributed peer-to-peer evaluation platform, or natively on the user’s machine in a dedicated hosting runtime, which could be a mobile app or a server on the user’s computer."
This part is big technical creep. There is no reason to turn the project into a distributed heterogenous computing platform with a dedicated runtime, which could be a research project on its own, when the stated goal is to provide abstract multilingual encyclopedic content. All the computation can be done on servers (cloud is servers too) and cached. This is way easier to implement, test and deliver than to implement 10 different backends with various progress in implementation, incompatibilities and runtime characteristics.
The paper presents AW as sitting on top of WL. Both are big projects. Sitting a big project on top of another one is really risky, as it means a significant milestone must first be reached in the dependency (here WL), which would likely take some years, before work on the other project can even start. AW can be realised with current tools and engineering practices.
"One obstacle in the democratization of programming has been that almost every programming language requires first to learn some basic English."
This strong claim needs to be sourced. Programming languages, save for a few keywords, do not rely much on English. The broad failure of localized versions of programming languages (such as French BASIC), as well as the heavy use of existing programming languages in countries that do not even use the Latin alphabet (China, Russia), tends to show that English is not at all a bottleneck for the democratization of programming. [53] is cited later in the paper, but it is a pop-linguistics article from an online newspaper, not an academic one.
- Final words
To finish on a positive note, I would like to highlight the points I really like in the paper. First, its collaborative and open nature, like all Wikimedia projects, gives it a much higher chance of success than its predecessors. If UNL is not well known, it is not because it didn't yield research achievements, but because a single selected institution per language works on it and keeps the resources and software within the lab walls. Secondly, some features are explicitly declared out of scope, which is very welcome: conversion from natural language, and the restriction to encyclopedic-style text. This allows a more focused effort towards the end goal, making it more achievable. And finally, the choice to go with a symbolic/rule-based system, with a touch of other ML where useful. This is, as the paper says, a big win for explainability and for using human contributions to build the system. It will also keep the computing cost at a saner baseline than what current deep-learning models require.
I think the project can succeed thanks to its openness, yet there are real dangers visible in the paper: on the NLP side, reinventing a wheel that took 40 years to build, and on the technical side, losing time and effort on a sub-project that is not required per se for AW to be built.
As I spent a significant amount of time (~10 hours) gathering references and writing this email (which is 5 pages long in Word), I would like to be mentioned as a co-author of the final paper if any of the ideas or references presented here are used in it.
Best regards, Louis Lecailliez
PS: 4. Typos
- "These two projects will considerably expand the capabilities of the
Wikimedia platform to enable every single human being to freely share share in the sum of all knowledge." => duplicate share
- "The content is than turned into" => The content is then turned into
- "[26] Charles J Fillmore, Russell Lee-Goldman, and Russell Rhodes. The
framenet constructicon. Sign-based construction grammar, pages 309–372, 2012." => The framenet construction
- "These function finally can call the lexicographic knowlegde stored in
Wikidata." => These function finally can call the lexicographic knowledge stored in Wikidata
- "[102] George Kinsley Zipf. Human Behavior and the Pirnciple of Least
Effort. Addison-Wesley, 1949." => [102] George Kinsley Zipf. Human Behavior and the Principle of Least Effort. Addison-Wesley, 1949.
- "Allowing the individual language Wikipedias to call Wikilambda has an
addtional benefit." => Allowing the individual language Wikipedias to call Wikilambda has an additional benefit.
Yes, thank you for the UNL background, that is extremely helpful. I've been reading some of the articles Louis provided as references, and it seems to me, from this perhaps naive point of view, that a lot of the complexity is associated with disambiguation of meaning. For nouns, I think Wikidata items (and their relations to lexeme senses) solve that problem, but we are still missing, I think, a lot of the detail needed to do the same for adjectives and verbs (at least). So there is definitely some room for finding better ways to model this, but maybe Wikidata could be expanded to handle the adjective/verb cases too. In general, the concept of a single meaning associated with a Wikidata item as its identifier, with a collection of attributes and relationships attached to that item, is a powerful one that could resolve many such issues.
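As a rough illustration of the noun case (my own sketch, not an existing Wikidata API; all identifiers are only indicative), one can think of a lexeme whose senses point at the Wikidata item that carries the meaning, so that disambiguation amounts to following that link:

// Hypothetical data model: a lexeme's senses link to Wikidata items, so the
// item carries the meaning and the lexeme carries the language-specific forms.
// All identifiers below are illustrative.

interface Sense {
  senseId: string;               // e.g. "L1234-S1"
  linkedItem: string;            // the Q-item carrying the meaning
}

interface Form {
  representation: string;        // surface string
  grammaticalFeatures: string[]; // e.g. ["singular"]
}

interface Lexeme {
  lexemeId: string;              // e.g. "L1234"
  language: string;              // e.g. "fr"
  lemma: string;
  senses: Sense[];
  forms: Form[];
}

// Disambiguation for nouns: given an item, pick the lexeme in the target
// language that has a sense linked to that item.
function lexemeForItem(lexemes: Lexeme[], item: string, language: string): Lexeme | undefined {
  return lexemes.find(
    (lex) => lex.language === language && lex.senses.some((s) => s.linkedItem === item)
  );
}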
Arthur
On Sun, Jul 5, 2020 at 6:55 PM Adam Sobieski adamsobieski@hotmail.com wrote:
Louis,
Thank you for the information about the Universal Networking Language [1] and the World Atlas of Language Structures [2].
Semantic Modeling
Do you opine that adding attributes to objects, relations and expressions enhances expressiveness for various features of natural language?
r.@a1.@a2(o1(icl>domain1).@a3.@a4, o2(icl>domain2).@a5.@a6).@a7.@a8
I wonder whether there exist mappings or workarounds with which to obtain such expressiveness for models such as Wikidata’s.
Scripting Environments for Natural Language Generation
Supposing that Wikilambda could be JavaScript / WebAssembly based, and observing that Lua / WebAssembly solutions exist, we can note that scripting engines such as V8 are easy to embed and to add global objects and APIs to. In the same way that Web browsers provide scripting environments and APIs for functions, we can envision providing scripting environments and APIs for natural language generation functions.
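As a purely hypothetical sketch of what such an environment might expose to contributor scripts (none of these names exist, and this is not a proposed Wikilambda API):

// Hypothetical global API injected into user scripts by a natural-language
// generation scripting environment, in the way browsers inject `document`
// and `fetch`. Every name here is made up for illustration.

interface LexiconEntry {
  lemma: string;
  features: Record<string, string>;
}

interface NlgApi {
  lookupLexeme(item: string, language: string): LexiconEntry | undefined;
  inflect(entry: LexiconEntry, features: Record<string, string>): string;
  emit(text: string): void;  // append text to the rendered output
}

// A contributor-written renderer would then only use the injected API:
function renderIsA(api: NlgApi, subject: string, object: string, language: string): void {
  const subj = api.lookupLexeme(subject, language);
  const obj = api.lookupLexeme(object, language);
  if (subj === undefined || obj === undefined) {
    return;  // missing lexical data; a real renderer would report this
  }
  api.emit(api.inflect(subj, {}) + " is a " + api.inflect(obj, { number: "singular" }) + ".");
}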
I wonder what you might think about scripting environments and API for natural language generation scenarios?
Best regards,
Adam
[1] https://en.wikipedia.org/wiki/Universal_Networking_Language
[2] https://wals.info/
That's a good idea, but I think you would need more than that. Take FrameNet, for example, but now starting from verbs instead of nouns. FrameNet has a very detailed model for dealing with verbs, their semantic arguments and the way they surface in morphosyntax. Nonetheless, to apply such a model in a text comprehension and/or generation task, you need more than that. You need to know prototypical fillers for the positions, which, in turn, are associated with other frames and, therefore, participate in other clusters of the network of frames. Moreover, you'd want those prototypical fillers to serve as starting points for analogical extensions in the model, since not every sentence is a prototypical combination of words. In other words, the collection of attributes and relations you refer to should be defined in a way that allows it to be analogically extended to other collections not originally assigned to the item you're looking at.
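To make that concrete, here is a rough, illustrative data-structure sketch of a frame whose elements carry prototypical fillers that themselves point to other frames. This is my own sketch, not FrameNet's actual schema, and the frame and element names are only examples.

// Illustrative only: a frame whose elements list prototypical fillers, and
// fillers that link back to other frames so the network can be traversed
// for analogical extension.

interface Filler {
  lemma: string;          // a prototypical word filling the element
  evokesFrame?: string;   // another frame this filler participates in
}

interface FrameElement {
  name: string;                   // e.g. "Ingestor"
  prototypicalFillers: Filler[];
}

interface Frame {
  name: string;           // e.g. "Ingestion"
  lexicalUnits: string[]; // words that evoke the frame
  elements: FrameElement[];
}

// Analogical extension, very crudely: a candidate filler is plausible for an
// element if it shares a frame with one of the element's prototypical fillers.
function isPlausibleFiller(element: FrameElement, candidate: Filler): boolean {
  return element.prototypicalFillers.some(
    (p) => p.evokesFrame !== undefined && p.evokesFrame === candidate.evokesFrame
  );
}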
Cheers
Tiago
Brainstorming: by analogy with what the document object model (DOM) [1] is for XML and attributed trees, perhaps we could create and specify an object model for sets of attributed predicate calculus expressions.
With an attributed predicate calculus object model (e.g. “APCOM”) for sets of attributed predicate calculus expressions:
{
  r1.@a1(o1(icl>domain1).@a2, o2(icl>domain2).@a3).@a4
  r2.@a5(o3(icl>domain3).@a6, o4(icl>domain4).@a7).@a8
  r3.@a9(o5(icl>domain5).@a10, o6(icl>domain6).@a11, o7(icl>domain7).@a12).@a13
}.@a14
developers could more conveniently utilize sets of attributed predicate calculus expressions from JavaScript and Lua.
Drawing from XML, we can consider that objects, relations and attributes could be, instead of plain text strings, uniform resource identifiers (URIs). "r1" could be a URI, "a1" could be a URI, "o1" could be a URI, and so forth.
We can also consider that the attributes in a model could have values:
{
  r1.[@a1=v1](o1(icl>domain1).[@a2=v2], o2(icl>domain2).[@a3=v3]).[@a4=v4]
  r2.[@a5=v5](o3(icl>domain3).[@a6=v6], o4(icl>domain4).[@a7=v7]).[@a8=v8]
  r3.[@a9=v9](o5(icl>domain5).[@a10=v10], o6(icl>domain6).[@a11=v11], o7(icl>domain7).[@a12=v12]).[@a13=v13]
}.[@a14=v14]
We can consider creating a scripting API (e.g. "APCOM") for such a semantic model, for developers' convenience. We can also consider adding attribute-value pairs to a semantic model.
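For example, a minimal object-model sketch along these lines (my own sketch; all names are hypothetical and offered only to make the shape of such an "APCOM" concrete):

// Minimal sketch of an object model for sets of attributed predicate
// calculus expressions. Identifiers may be plain strings or URIs; attributes
// may optionally carry values. All names are made up for illustration.

interface Attribute {
  name: string;            // e.g. "a1", or a URI
  value?: string;          // optional value, for the attribute-value variant
}

interface Term {
  id: string;              // e.g. "o1", or a URI
  restriction?: string;    // e.g. "icl>domain1"
  attributes: Attribute[];
}

interface Expression {
  relation: Term;                     // e.g. "r1" with its attributes
  arguments: (Term | Expression)[];   // arguments may themselves be expressions
  attributes: Attribute[];            // attributes on the whole expression
}

interface ExpressionSet {
  expressions: Expression[];
  attributes: Attribute[];            // attributes on the set, like ".@a14" above
}

// Example corresponding to the first expression in the attribute-value form above:
const example: Expression = {
  relation: { id: "r1", attributes: [{ name: "a1", value: "v1" }] },
  arguments: [
    { id: "o1", restriction: "icl>domain1", attributes: [{ name: "a2", value: "v2" }] },
    { id: "o2", restriction: "icl>domain2", attributes: [{ name: "a3", value: "v3" }] },
  ],
  attributes: [{ name: "a4", value: "v4" }],
};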
Best regards, Adam
[1] https://en.wikipedia.org/wiki/Document_Object_Model
בתאריך שבת, 4 ביולי 2020 ב-14:26 מאת Louis Lecailliez < louis.lecailliez@outlook.frmailto:louis.lecailliez@outlook.fr>:
Hello,
my name is Louis Lecailliez, PhD student at Kyoto University in education technology. I'm a Computer Science and NLP graduate. One thing I do is working on language learner's knowledge modelling as graphs.
The Abstract Wikipedia project is really interesting. There is however two very concerning issues I spotted when reading the associated paper draft ( https://arxiv.org/abs/2004.04733). The following email could be read as negative, but please don't take it as such: my purpose is to avoid spending people efforts and money for things that can (need to!) be fixed upfront.
- Issues with NLP
The main issue is that the difficulty of the NLP task of generating natural text from an abstract representation is severely overlooked. This stems from the other main problem: the paper is not based on the decades of previous work in that space.
As I understand it, the main value proposition of Abstract Wikipedia (AW) is a computer representation of encyclopedic knowledge that can be projected into different existing natural languages, with the goal of supporting a huge number of them. Plus, an editor to make this happen easily.
This is in fact surprisingly extremely close to what the Universal Networking Language (UNL) project, which started 20 years ago, aims to do. UNL provides a language agnostic representation of text that uses hypergraph. Software (called EnConverter) produce UNL graphs from natural text in a given language. Another kind of software called DeConverter do the reverse, that is producing natural text from the abstract representation. This is exactly the same function of the "renderers" in the AW paper. The way of doing it is also similar: by applying successive transformations until the final text string is produced. In general, that kind of abstract meaning representation is called an Interlingua, and is widely used in Machine Translation (MT) systems.
Disregarding two decades of work, in the UNL case, on the same problem space (rule-based machine translation, here from an abstract language as fixed source language), which was itself based on few other decades of work, doesn't seem to be a wise move to start a new project. For a start, the graph representation used in the AW will likely not be expressive enough to encode linguistic knowledge; this is why UNL uses hypergraphs instead of graphs.
The problem is glaring when looking at the references list: the list is bloated with irrelevant references (such as those to programming languages [27, 37, 41, 77], Turing completeness being the worst offender [11, 17, 23, ...]) while containing only two references [7, 85] to the really hard part of the project: generating natural language from the abstract representation. There are few more relevant references about natural language generation, but this isn't enough.
Interestingly, [85] is an UNL paper, but not the main one. Moreover, it is cited in Section 9 "Opening future research". This should be instead placed in a "Previous work" section which is missing from the paper.
To fill a part of this section yet to be written, I propose the following references: [*1] Uchida, H., Zhu, M., & Della Senta, T. (1999). A gift for a millennium. IAS/UNU, Tokyo.
https://www.researchgate.net/profile/Hiroshi_Uchida2/publication/239328725_A... [*2] Wang-Ju Tsai (2004) La coédition langue-UNL pour partager la révision entre langues d'un document multilingue. [Language-UNL coedition to share revisions in a multilingual document] Thèse de doctorat. Grenoble.
https://pdfs.semanticscholar.org/b030/ea4662e393657b9a134c006ca5b08e8a23b3.p... [3*] Boitet, C., & Tsai, W. J. (2002). La coédition langue<—> UNL pour partager la révision entre les langues d'un document multilingue: un concept unificateur. Proc. TALN-02, Nancy, 22-26.
http://www.afcp-parole.org/doc/Archives_JEP/2002_XXIVe_JEP_Nancy/talnrecital... [4*] Tomokiyo, M., Mangeot, M., & Boitet, C. (2019). Development of a classifiers/quantifiers dictionary towards French-Japanese MT. arXiv preprint arXiv:1902.08061. https://arxiv.org/pdf/1902.08061.pdf [5*] Boguslavsky, I. (2005). Some controversial issues of UNL: Linguistic aspects. Research on Computer Science, 12, 77-100.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.212.2058&rep=re... [6*] Boitet, C. (2002). A rationale for using UNL as an interlingua and more in various domains. In Proc. LREC-02 First International Workshop on UNL, other Interlinguas, and their Applications, Las Palmas (pp. 26-31). https://www.cicling.org/2005/unl-book/Papers/003.pdf [7*] Dhanabalan, T., & Geetha, T. V. (2003, December). UNL deconverter for Tamil. In International Conference on the Convergences of Knowledge, Culture, Language and Information Technologies. http://www.cfilt.iitb.ac.in/convergence03/all%20data/paper%20032-372.pdf [8*] Singh, S., Dalal, M., Vachhani, V., Bhattacharyya, P., & Damani, O. P. (2007). Hindi generation from Interlingua (UNL). Machine Translation Summit XI.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.78.979&rep=rep1... [9*] Banarescu, L., Bonial, C., Cai, S., Georgescu, M., Griffitt, K., Hermjakob, U., ... & Schneider, N. (2013, August). Abstract meaning representation for sembanking. In Proceedings of the 7th linguistic annotation workshop and interoperability with discourse (pp. 178-186). https://www.aclweb.org/anthology/W13-2322.pdf [10*] Berment, V., & Boitet, C. (2012). Heloise—An Ariane-G5 Compatible Rnvironment for Developing Expert MT Systems Online. In Proceedings of COLING 2012: Demonstration Papers (pp. 9-16). https://www.aclweb.org/anthology/C12-3002.pdf [11*] Berment, V. (2005). Online Translation Services for the Lao Language. In Proceedings of the First International Conference on Lao Studies. De Kalb, Illinois, USA (pp. 1-11).
https://www.researchgate.net/profile/Vincent_Berment/publication/242140227_O...
[*1] is the paper that describes UNL. [2*] is a doctoral thesis discussing a core problem AW is trying to address too. [3*] is a short paper done in the scope of [2*], even if you don't understand French you can have a look at the figures: two of them are about an editor similar in principe to what AW wants to incorporate. [5*] Insights about UNL expressivity issues, 10 years after the project's start. [6*] More UNL, with short history and context in which it is used.
[4*] shows how deep natural language conversion goes: this paper addresses the issue of classifiers in French and Japanese. This is just one linguistic issue and there are dozens if not hundreds of such. An important point is that both of the languages involved need to be taken into account when modelling the abstract encoding, otherwise too much information is lost for producing a correct output.
[7*] [8*] are very valuable examples of real world deconverter systems for UNL. As it's visible on [7*]'s Figure 1 and [8*]'s Figure 2, the process is *way* more complicated than a single "renderers" box. Moreover, there are very distinct identifiable steps, informed by linguistics. The AW does not describe any such structuration of natural text generation processing steps, everything is supposed to be happening in some unstructured "lambda" system. Also missing are the specialized resources (UNL-Hindi dictionary, Tamil Word dictionary, co-occurrence dictionary, etc.) required for the task. Nothing precise is said about linguistic resources in the AW paper except for "These function finally can call the lexicographic knowlegde stored in Wikidata.", which is not very convincing: first because Wiktionary projects themselves severely lacks content and structure for those who has some content at all, secondly since specialized NLP ressources are missing there too (note: I'm not familiar with Wikidata so I could be wrong, however I never saw it cited for the kind of NLP resources I'm talking about).
[10*] is a translation system built with "specialised languages for linguistic programming (SLLPs)" which is the service Wikilambda is supposed to provide for Abstract Wikipedia. [11*] gives the estimation of 2500 hours for the development (by a specialist) of three linguistic modules for Lao processing.
So, in regard to the difficulty of the task, and previous work in the literature, the AW paper does not provide any convincing evidence that the technology on which it is supposed to be built can even reach the state-of-art. Dismissing every existing formal and software systems on the ground of "no consensus commiting to any specific linguistic theory" is not gonna work: this will result in ad hoc implementation-driven formalism that will have hard time fullfilling its goal. The NLP part (generating sentences from abstract representation) is the hardest of the project, yet it’s by far the least convincing one. "Abstract Wikipedia is indeed firmly within this tradition, and in preparation for this project we studied numerous predecessors." I would like to believe so, but the lack of corresponding reference as well as lack of previous work section tends to prove the contrary.
While I can't advice for a switch to UNL, as I'm not specialist of it, it would be smart to capitalize on the work done on it by highly skilled (PhD level) individuals. As the UNL system is built on hypergraphs, it probably could be made interoperable easily with RDF knowledge graphs if named graphs are used. By having a UNL/RDF specification (yet to be written), the vision exposed in the AW paper may be reached sooner by reusing existing software (we are speaking of thousands man-year of work as per [11*]), and almost as importantly, an existing formalism that has been "debugged" for decades. There are probably other systems I'm unaware of that are worth investigating too, some like [9*] having more specialized usage. In any case, there is a strong need to back the paper and the project on the existing (huge) literature.
- Other issues
"In order to evaluate a function call, an evaluator can choose from a multitude of backends: it may evaluate the function call in the browser, in the cloud, on the servers of the Wikimedia Foundation, on a distributed peer-to-peer evaluation platform, or natively on the user’s machine in a dedicated hosting runtime, which could be a mobile app or a server on the user’s computer."
This part is a big case of technical scope creep. There is no reason to turn the project into a distributed, heterogeneous computing platform with a dedicated runtime, which could be a research project on its own, when the stated goal is to provide abstract multilingual encyclopedic content. All the computation can be done on servers (the cloud is servers too) and cached. This is way easier to implement, test and deliver than implementing 10 different backends with varying progress in implementation, incompatibilities and runtime characteristics.
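For illustration, a minimal sketch of that simpler server-side approach (the helper functions are hypothetical placeholders, not anything from the AW paper):

import functools

def fetch_abstract_content(content_id: str) -> str:
    # Placeholder: look up the abstract content (e.g. in a database).
    return f"abstract-content-for-{content_id}"

def render(abstract_content: str, language: str) -> str:
    # Placeholder: the actual renderer, whatever formalism it ends up using.
    return f"[{language}] rendering of {abstract_content}"

@functools.lru_cache(maxsize=100_000)
def rendered_article(content_id: str, language: str) -> str:
    # Computed once on the server; identical requests are served from the cache
    # until the entry is invalidated after an edit.
    return render(fetch_abstract_content(content_id), language)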
The paper presents AW as sitting on top of WL. Both are big projects. Sitting a big project on top of another one is really risky, as it means a significant milestone must first be reached in the dependency (here WL), which would likely take some years, before work on the other project can even start. AW can be realised with current tools and engineering practices.
"One obstacle in the democratization of programming has been that almost every programming language requires first to learn some basic English."
This strong affirmation needs to be sourced. Programming languages, save for a few keywords, don't rely much on English. The near-total failure of localized versions of programming languages (such as French BASIC), as well as the heavy use of existing programming languages in countries that don't even use the Latin alphabet (China, Russia), tends to show that English is not at all a bottleneck for the democratization of programming. [53] is cited later in the paper, but it is a pop-linguistics article from an online newspaper, not an academic article.
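As a small illustration of how little English is actually involved: Python 3, like most modern languages, accepts identifiers in nearly any script, so only the handful of keywords and built-in names remains English.

# Identifiers in Chinese and Russian; only the keywords and built-ins (for, in, print) are English.
国家数 = 0                      # "number of countries", in Chinese
страны = ["France", "Japon"]    # "countries", in Russian
for страна in страны:
    国家数 += 1
print(国家数)                   # prints 2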
- Final words
To finish on a positive note, I would like to highlight the points I really like in the paper. First, its collaborative and open nature, like all Wikimedia projects, gives it a much higher chance of success than its predecessors. If UNL is not too well known, it's not because it didn't yield research achievements, but because one selected institution per language works on it and keeps the resources and software within the lab walls. Secondly, there are some very welcome scope restrictions: conversion from natural language is excluded, and the text is restricted to encyclopedic style. This will allow a more focused effort towards the end goal, making it more achievable. And finally, there is the choice to go with a symbolic/rule-based system, with a touch of ML where useful. This is, as said in the paper, a big win for explainability and for using human contributions to build the system. It will also keep the computing cost at a saner baseline than what current deep learning models require.
I think the project can succeed thanks to its openness, yet there are real dangers visible in the paper: on the NLP side, reinventing a wheel that took 40 years to build, and on the technical side, losing time and effort on a sub-project that is not required per se for AW to be built.
As I spent a significant amount of time (~10 hours) gathering references and writing this email (which is 5 pages long in Word), I would like to be mentioned as a co-author in the final paper if any ideas or references presented here are used in it.
Best regards, Louis Lecailliez
PS: 4. Typos
- "These two projects will considerably expand the capabilities of the
Wikimedia platform to enable every single human being to freely share share in the sum of all knowledge." => duplicate share
- "The content is than turned into" => The content is then turned into
- "[26] Charles J Fillmore, Russell Lee-Goldman, and Russell Rhodes. The
framenet constructicon. Sign-based construction grammar, pages 309–372, 2012." => The framenet construction
- "These function finally can call the lexicographic knowlegde stored in
Wikidata." => These function finally can call the lexicographic knowledge stored in Wikidata
- "[102] George Kinsley Zipf. Human Behavior and the Pirnciple of Least
Effort. Addison-Wesley, 1949." => [102] George Kinsley Zipf. Human Behavior and the Principle of Least Effort. Addison-Wesley, 1949.
- "Allowing the individual language Wikipedias to call Wikilambda has an
addtional benefit." => Allowing the individual language Wikipedias to call Wikilambda has an additional benefit.
Hi Louis, all,
Louis, thanks for raising that important issue!
I have been looking into a number of related NLG systems, and one thing I noticed is that many of these projects were developed very much in isolation from each other, and often without much concern for ongoing linguistic research. That is what I tried to capture in the research paper by stating that there is no consensus on this, and that it seems too early to commit to a specific solution.
I had a quick look at UNL, but the project looked pretty stale to me - I could not see any activity in the last decade. Furthermore, the page didn't provide access to the source code and instead mentioned that part of the technology is under patents, which is quite a red flag for me; I usually don't look into something like that any further, in order to be able to honestly say that I didn't get any ideas from those patents. If I am mistaken, and there is a freely usable write-up or implementation, I'd be happy to come back and read up more.
Thank you for the annotated bibliography! That is super useful.
But I did look in detail into a (small) number of other, similar systems, such as Grammatical Framework or KPML. Tiago mentioned FrameNet, and I learned a lot about that too. Getting an overview of the whole field has been a rather frustrating experience, especially since neither the major textbook in that area - Dale & Reiter - nor the 2018 update to that book by Gatt & Krahmer covers these systems, and research work in that area often omits these practical systems. Accordingly, when I talk with professors and researchers in this area, also about the proposal here, they are more focussed on specific issues and don't know that much about the concrete systems (which is understandable - the flow from research to practical systems is more established in many other areas). Never mind that when you get to the linguistic side of it, instead of the computer science part, there are even more competing theories, many of which are aimed toward much more encompassing goals, covering the whole of language and natural language understanding, which we want to shy away from.
The paper was never meant to be a comprehensive account of the state of the art in natural language generation. That's what Dale & Reiter and Gatt & Krahmer have aimed for, and their works are hundreds of pages long. I had the feeling my paper was already too long, and putting in an overview of the state of the art would have made it at least double the length.
So, given that (and other reasons, as laid out in the paper), a system which could support any of these approaches seemed a more promising way forward. So far, for my own prototype, I have been mostly following Grammatical Framework (because it has a very accessible book, the software is free, the community was friendly, etc.), and it worked well enough to leave me convinced that the whole thing is worth trying out. But I don't know whether that's the best approach.
As mentioned by Chris Cooley, the goal will be to create a new wiki, a library of functions, that can support any of these approaches. My dream would be - and I see that Chris had already suggested that - that experts like you and your colleagues create an overview of the state of the art that will be accessible to the community and that will allow us to make a well-informed decision when the time comes as to which path to explore first. In a parallel track, we will be creating the function wiki, and then, when the time is ripe we can bring these two strands of work together. So, would you be willing to work on that?
How does this sound for a plan?
Some further points:
This is way easier to implement, test and deliver than to implement 10
different backends with various progress in implementation, incompatibilities and runtime characteristics.
Regarding your point about evaluation environments: I agree, it would be a huge task if the WMF core team were to develop all these different environments. But that's not the plan. The goal is really that *others* will hopefully build these :) All we need to do is to make sure that's possible and encouraged and simple enough. But yeah, not the core team.
The paper presents AW as sitting on top on WL. Both are big projects.
Sitting a big project on top of another one is really risky, as it means a significant milestone must first be reached in the dependency (here WL), which would likely took some years, before even starting the work on the other project.
Yes, that's correct. That is exactly the time that allows us to do the appropriate state-of-the-art analysis. I hope it won't take us years, but that we will be faster.
AW can be realised with current tools and engineering practices.
Only if you commit to a specific implementation, which I am hesitant to do.
[English is an obstacle to programming] This strong affirmation needs to
be sourced.
https://dl.acm.org/doi/10.1145/3051457.3051464
As I spend a significant time (~10 hours) gathering references and
writing this email (which is 5 pages long in Word), I would like to be mentioned as co-author in the final paper if any idea or references presented here is used in it.
Thank you for your detailed comments, which will certainly improve the second version of the paper. I am happy to mention you in the acknowledgments. For co-authorship, I usually go for a more substantial engagement ;) If you're willing to write up the "Previous work" section along the lines you mentioned above (maybe with Tiago? maybe with others joining?), as a comprehensive overview of existing systems, then I am open to talking about co-authorship :)
For French, the gender of every noun entity *must* be present ... For
Chinese and Japanese, classifier information must be present for all noun, in case one must be enumerated.
That's exactly the goal of the lexicographic project on Wikidata, as was pointed out:
https://www.wikidata.org/wiki/Lexeme:L12449
You'll find plenty of lexemes with their classifiers, forms, etc. The lexicographic project was started with Abstract Wikipedia in mind, knowing that exactly this kind of data would be needed.
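For instance, this data can already be pulled programmatically; here is a minimal sketch using the standard wbgetentities API (the exact JSON field names reflect my understanding of the current lexeme format and should be double-checked):

import requests  # third-party: pip install requests

def lexeme_forms(lexeme_id="L12449"):
    # Fetch a lexeme from Wikidata and list its forms with their grammatical features.
    resp = requests.get(
        "https://www.wikidata.org/w/api.php",
        params={"action": "wbgetentities", "ids": lexeme_id, "format": "json"},
        headers={"User-Agent": "abstract-wikipedia-example/0.1"},
    )
    entity = resp.json()["entities"][lexeme_id]
    for form in entity.get("forms", []):
        spellings = ", ".join(r["value"] for r in form["representations"].values())
        print(spellings, form.get("grammaticalFeatures", []))

lexeme_forms()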
Yet, the use of any existing formalism is dismissed, which mean all the
situations I illustrated in this email will need to be dealt with in an ad hoc fashion.
No, it doesn't have to be ad hoc at all; that's exactly what we can start working on now, long before we get to the point where we would need to make such an ad hoc decision. I hope you'll join us to figure out the best way!
Thanks to Charles, Amir, Tiago, Christopher, Arthur, and Adam for your beautiful answers, which raised a number of great points much better than I ever could have. And thanks to Louis for starting this more than interesting thread! Let's continue in this vein!
Cheers, Denny
On Sun, Jul 5, 2020 at 9:49 PM Adam Sobieski adamsobieski@hotmail.com wrote:
Brainstorming: resembling what the document object model (DOM) [1] is for XML and attributed trees, perhaps we could create and specify an object model for sets of attributed predicate calculus expressions.
With an attributed predicate calculus object model (e.g. “APCOM”) for sets of attributed predicate calculus expressions:
{
r1.@a1(o1(icl>domain1).@a2, o2(icl>domain2).@a3).@a4
r2.@a5(o3(icl>domain3).@a6, o4(icl>domain4).@a7).@a8
r3.@a9(o5(icl>domain5).@a10, o6(icl>domain6).@a11, o7(icl>domain7).@a12).@a13
}.@a14
developers could more conveniently utilize sets of attributed predicate calculus expressions from JavaScript and Lua.
Drawing from XML, we can consider that objects, relations and attributes could be, instead of plain text strings, uniform resource identifiers (URIs). “r1” could be a URI, “a1” could be a URI, “o1” could be a URI, and so forth.
We can also consider that the attributes in a model could have values:
{
r1.[@a1=v1](o1(icl>domain1).[@a2=v2], o2(icl>domain2).[@a3=v3]).[@a4=v4]
r2.[@a5=v5](o3(icl>domain3).[@a6=v6], o4(icl>domain4).[@a7=v7]).[@a8=v8]
r3.[@a9=v9](o5(icl>domain5).[@a10=v10], o6(icl>domain6).[@a11=v11], o7(icl>domain7).[@a12=v12]).[@a13=v13]
}.[@a14=v14]
We can consider creating a scripting API (e.g. “APCOM”) for a semantic model, for the convenience of developers. We can also consider adding attribute-value pairs to a semantic model.
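A minimal sketch of what such an object model could look like (in Python rather than JavaScript/Lua; all class and field names here are invented purely for illustration):

from dataclasses import dataclass, field

@dataclass
class Node:
    # An object or relation; the identifier may be a plain string or a URI.
    id: str
    attributes: dict = field(default_factory=dict)   # e.g. {"a1": None} or {"a1": "v1"}

@dataclass
class Expression:
    relation: Node
    arguments: list                                   # list of Node
    attributes: dict = field(default_factory=dict)

@dataclass
class ExpressionSet:
    expressions: list                                 # list of Expression
    attributes: dict = field(default_factory=dict)

# The first expression of the example above, r1.@a1(o1.@a2, o2.@a3).@a4:
e1 = Expression(relation=Node("r1", {"a1": None}),
                arguments=[Node("o1", {"a2": None}), Node("o2", {"a3": None})],
                attributes={"a4": None})
model = ExpressionSet([e1], attributes={"a14": None})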
Best regards,
Adam
[1] https://en.wikipedia.org/wiki/Document_Object_Model
*From: *Tiago Timponi Torrent tiago.torrent@ufjf.edu.br *Sent: *Sunday, July 5, 2020 9:06 PM *To: *General public mailing list for the discussion of Abstract Wikipedia (aka Wikilambda) abstract-wikipedia@lists.wikimedia.org *Subject: *Re: [Abstract-wikipedia] NLP issues severely overlooked (Amir E. Aharoni)
That’s a good idea, but I think you would need more than that. Take FrameNet, for example, but now departing from verbs instead of nouns. FrameNet has a very detailed model for dealing with verbs, their semantic arguments and the way they surface in morphosyntax. Nonetheless, to apply such a model in a text comprehension and/or generation task, you need more than that. You need to know prototypical fillers for the positions, which, in turn, are associated to other frames and, therefore, participate in other clusters of the network of frames. Moreover, you’d want those prototypical fillers to function as departing points for analogical extensions in the model, since not every sentence is a prototypical combination of words. In other words, the collection of attributes and relations you refer to should be defined in a way that they can be analogically extended to other collections not originally assigned to the item you’re looking at.
Cheers
Tiago
Em dom, 5 de jul de 2020 às 20:03, Arthur Smith arthurpsmith@gmail.com escreveu:
Yes, thank you for the UNL background, that is extremely helpful. I've been reading some of the articles Louis provided as references, and it seems to me, from this perhaps naive point of view, that a lot of the complexity is associated with disambiguation of meaning - for nouns I think Wikidata items (and their relations to lexeme senses) solve that problem, but we are still missing a lot of the detail needed to do the same with adjectives and verbs (at least). So there is definitely some room for finding better ways to model this - but maybe Wikidata could be expanded to handle the adjective/verb cases too. In general, the concept of a single meaning associated with a Wikidata item as its identifier, and a collection of attributes and relationships attached to that item, is a powerful one that could resolve many such issues.
Arthur
On Sun, Jul 5, 2020 at 6:55 PM Adam Sobieski adamsobieski@hotmail.com wrote:
Louis,
Thank you for the information about the Universal Networking Language [1] and the World Atlas of Language Structures [2].
Semantic Modeling
Do you opine that adding attributes to objects, relations and expressions enhances expressiveness for various features of natural language?
r.@a1.@a2(o1(icl>domain1).@a3.@a4, o2(icl>domain2).@a5.@a6).@a7.@a8
I wonder whether there exist mappings or workarounds with which to obtain such expressiveness for models such as Wikidata’s.
Scripting Environments for Natural Language Generation
Supposing that Wikilambda could be JavaScript / WebAssembly based, and observing that Lua / WebAssembly solutions exist, we can note that scripting engines such as V8 are easy to use and to add global objects and API to. Resembling how Web browsers provide scripting environments and API for functions, we can envision providing scripting environments and API for natural language generation functions.
I wonder what you might think about scripting environments and API for natural language generation scenarios?
Best regards,
Adam
[1] https://en.wikipedia.org/wiki/Universal_Networking_Language
*From: *Louis Lecailliez louis.lecailliez@outlook.fr *Sent: *Saturday, July 4, 2020 2:10 PM *To: *abstract-wikipedia@lists.wikimedia.org *Subject: *Re: [Abstract-wikipedia] NLP issues severely overlooked (Amir E. Aharoni)
Hi Amir,
I understand the process is different that usual research. In fact I've seen Wikipedia grown from an unknown website to the biggest encyclopedia it is now. I use it daily in multiple languages and love it. I know what crowd sourcing could achieve.
It's also possible that the mere *finding* of these stumbling blocks by
such a big, diverse, open, and active community, will itself be a contribution to the scientific knowledge around this subject.
I disagree here. It would be contribution to scientic knowledge if and only if it wasn't discovered before. My email was precisely about that: capitalizing on the knowledge that has already been discovered, to avoid making the same mistake them again. It would not matter for a small project, but this one is really ambitious. We are speaking of 40 years of work by a horde of talented and very knowledgeable people, so this isn't to be dismissed easily.
This thing is, my previous email was a bit abstract, because it were a review of the paper, not of the project itself. I should have made more examples to illustrate where the problem lies.
Let's start with a simple example, in English, with corresponding Wikidata entities in-between parenthesis. I'm also using pseudo-turtle notation with made up relationships.
France (Q142) is a country (Q6256).
<Q142> <rel_is> <Q6256> .
Creating the English sentence is straightforward with the naive approach presented in the paper.
What is the French equivalent?
La France est un pays.
More information is required in the abstract representation: the text generator needs to know about the gender of both nouns (France and pays). So we need to extend the model as such:
<Q142> <rel_gender> <Q1775415> .
<Q6256> <rel_gender> <Q499327> .
Fine! Now what about Chinese?
法國是一個國家。
What we have in the middle of the sentence is a classifier (個). The model needs the following update:
<Q499327> <rel_use_classifier> <Q63153> .
To handle these 3 languages, the model has already 3 additional triples just for accounting for linguistic facts occuring in these languages. Wikipedia exists in more than 300 languages, and the world has about 6000 of them, each of them having particularities that must be taken into account. Fortunately they recoup themselves in-between languages. Nonetheless the World Atlas Language Structures ( https://wals.info/chapter/s1) count 144 distinct language features. Some are related to speech, but this means there is probably something like a hundred features that must be taken into account in the data model to produce valid natural language sentence.
Note that in the Chinese example, there is also a number (一, one) showing up. This is a phenomenon that must be taken into account; and it's not always appearing when using 是 (to be). How complex the "lambda" system will be just to deal with this issue? Hint: very much. It also needs to be compatible with dozen of other phenomena.
Then each of those features require extensive and complete data. For French, the gender of every noun entity *must* be present, otherwise there is half a chance of producing a wrong sentence each time a noun entity is encountered. For Chinese and Japanese, classifier information must be present for all noun, in case one must be enumerated. Where does the project will get the data from? (we are speaking of millions of item, most not referenced in existing dictionaries) How will this be encoded? Those are real questions that must be answered.
Suppose now we have done the work for "renderers" in these three languages. They both use the more or less similar A X B structure where X is a verb meaning "to be".
What would be the Japanese equivalent?
The more natural structure would be like:
フランスは国(だ)。
What is a play here is a topicalization (Q63105) of France, followed by a predicate (it's a country). This is very different from the previous structure, which, not surprisingly enough, needs it's own representation. To make situation more difficult, the previous (A be B) structure can also exists in Japanese, but would lead to a totally different sentence if used.
The paper states that Figure 1 and 2 are examples that will be more complex in real life. Yet, the use of any existing formalism is dismissed, which mean all the situations I illustrated in this email will need to be dealt with in an ad hoc fashion. Moreover, changing formalism (be it ad hoc or not) will require to change every piece of code/data using it. This will happen everytime a language with unsupported feature(s) is added to the project. It's not hard to see how this will waste a huge amount of time and goodwill from involved people. The very code focussed tone of the paper, the english-centric approach used in the examples and the lack of references shows that the complexity of the task on the NLP front is not sufficiently conceptualized.
Best Regards,
Louis Lecailliez
Message: 1 Date: Sat, 4 Jul 2020 14:05:09 +0100 (BST) From: Charles Matthews charles.r.matthews@ntlworld.com To: "General public mailing list for the discussion of Abstract Wikipedia (aka Wikilambda)" <abstract-wikipedia@lists.wikimedia.org> Subject: Re: [Abstract-wikipedia] NLP issues severely overlooked
It is interesting to be on a list where one can hear about software issues, and then computational linguistic problems. I'm not an expert in either area.
I do have 17 years of varied Wikimedia experience (and I use my real name there).
On 04 July 2020 at 12:25 Louis Lecailliez louis.lecailliez@outlook.fr
wrote:
<snip>
Nothing precise is said about linguistic resources in the AW paper
except for "These function finally can call the lexicographic knowlegde stored in Wikidata.", which is not very convincing: first because Wiktionary projects themselves severely lacks content and structure for those who has some content at all, secondly since specialized NLP ressources are missing there too (note: I'm not familiar with Wikidata so I could be wrong, however I never saw it cited for the kind of NLP resources I'm talking about).
I can comment about this. Besides Wiktionary, there is the "lexeme" namespace of Wikidata. It is a relatively new part of Wikidata, dealing with verbal forms.
To finish on a positive note, I would like to highlight the points I
really like in the paper. First, its collaborative and open nature, like all Wikimedia projects, gives him a much higher chance of success than its predecessors.
It is worth saying, for context, that there is a certain style or philosophy coming from the wiki side: more precisely, from the wikis before Wikipedia. There is the slogan "what is the simplest thing that would actually work?" You might argue, plausibly, that Wikipedia at nearly 20 years old, shows that there is a bit more to engineering than that.
On the other hand, looking at Wikidata at seven years old, there is some point to the comment. It has a rather simple approach to linked structured data, compared to the Semantic Web environment. (Really, just write a very large piece of JSON and try to cope with it!) But the number of binary relations used (8K, if you count the "external links" handling) is now quite large, and has grown organically. The data modelling is in a sense primitive, sometimes non-existent. But the range of content handled really is encyclopedic. And in an area like scientific bibliography, at a scale of tens of millions of entities, the advantages of not much ontological fussiness begin to be seen.
Wikidata started as an index of all Wikipedia articles, and is now five times the size needed for that: a very enriched "index".
I suppose the NLP required to code up, for example, 50K chemistry articles about molecules, might be a problem that could be solved, leaving aside the general problems for the moment.
In any case, there is a certain approach, neither academic nor commercial, that comes with Wikimedia and its communities, and it will be interesting to see how the issues are addressed.
Charles Matthews (in Cambridge UK)
Denny Vrandečić wrote:
Accordingly, when I talk with the professors and researchers in this area, also about the proposal here, they are more focussed on specific issues, and don't know that much about the concrete systems (which is understandable - the flow from research to practical systems is a more established flow in many areas). Never mind that when you get to the linguistic side of it, instead of the computer science part, there are even more competing theories, many of which are aimed toward much more encompassing goals and are about covering the whole of language and natural language understanding, which we want to be shying away from.
Well, there is a whole research community at the crossroads of computer science and linguistics: computational linguistics. The annual ACL conference is taking place just this week: https://acl2020.org/ The CL community may have its own quirks, but at least an understanding of both the linguistic problems and the issues of technical implementation should be there.
As mentioned by Chris Cooley, the goal will be to create a new wiki, a library of functions, that can support any of these approaches. My dream would be - and I see that Chris had already suggested that - that experts like you and your colleagues create an overview of the state of the art that will be accessible to the community and that will allow us to make a well-informed decision when the time comes as to which path to explore first.
I cannot tell anyone how to organize references to scholarly publications and software artifacts, but I would at least recommend using Wikidata to do so. We can get nice overviews with Scholia, once the references are collected and organized in Wikidata. The current coverage of natural language generation, however, is rather shallow:
https://scholia.toolforge.org/topic/Q1513879
Even if Wikidata is not the best tool to collect references, it will surely play some kind of role in Abstract Wikipedia, so it makes sense to get used to it.
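For illustration, such an overview can be queried directly from the Wikidata Query Service (a minimal sketch; P921 is the "main subject" property that, as far as I know, topic pages like the one above are built on):

import requests  # third-party: pip install requests

# Q1513879 = "natural language generation", the topic behind the Scholia page above.
QUERY = """
SELECT ?work ?workLabel WHERE {
  ?work wdt:P921 wd:Q1513879 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 50
"""

resp = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "abstract-wikipedia-example/0.1"},
)
for row in resp.json()["results"]["bindings"]:
    print(row["workLabel"]["value"])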
Jakob
On 09 July 2020 at 09:27 Jakob Voß jakob.voss@gbv.de wrote:
I cannot force anyone how to organize references to scholarly publications and software artifacts but I would at least recommend to use Wikidata to do so. We can get nice overviews with Scholia, once the references are collected and organized in Wikidata. The current coverage of natural language generation however is rather shallow:
https://scholia.toolforge.org/topic/Q1513879
Even if Wikidata is not the best tool to collect references, it will surely play some kind of role in Abstract Wikipedia, so it makes sense to get used to it.
As a side issue, maybe (or maybe not), I will mention here a proposal I recently heard: to import the DBLP computer science bibliography into Wikidata. For those familiar with WikiCite, the idea would simply be to do with DBLP what has been done with PubMed. This would be a one-year project, basically, using known techniques on a dataset of 4M items.
Rather than explaining more about the bot work that would be involved there, let me just make a basic remark: WikiCite, Scholia and the big effort so far have all concentrated on the biomedical area. In the middle of a global pandemic, it is not likely that I shall be thought to be criticising that emphasis.
But if Abstract Wikipedia means that more attention will be given to DBLP and Wikidata, and any other repositories in the area that will be relevant, that would hardly be a bad thing.
Charles
To preface: an opportunity I see with this project, given its particular nature (open, with the potential for large-scale collaboration), is the ability to engage with language in a manner more grounded in a linguistic tradition (like a FrameNet) than many other NL*X* projects.
Well, there is a whole research community at the crossroad between
computer science and lingustics with Computational Linguistics. The annual ACL conference is taking place just this week: https://acl2020.org/ The CL community may have its own quirks but at least an understanding of both linguistic problems and issues of technical implementations should be there.
I think it is important to involve the computational linguistics community because they have much of the experience in natural language generation. However, I think it is also important to involve red-blooded linguists and *linguist* computational linguists when discussions of theory and linguistic problems come into play. Much of computational linguistics is quite divorced from linguistics and the linguistic tradition in general.
Thanks,
Chris Cooley
Hi, Chris, hi all
As a linguist computational linguist, I couldn't agree more! The good news is that this year's ACL had a great collection of papers discussing precisely how to bring linguistics back into computational linguistics, especially with regard to meaning. By the end of this week they should make all the video-recorded presentations accessible to those who did not register for the conference. Then I'll send links to the papers I think may be of interest to this group.
Cheers
Tiago
D'oh, the idea to use Wikidata for collecting the bibliography is so obvious and good that I am wondering why it wasn't my default assumption.
Oh, wait I am supposed to say: "Yes, obviously, we should collect the literature with Wikidata, when we write up the State of the Art."
Thanks Jakob!
Tiago, by the way, can you compile a list of interesting talks from ACL for us?
Thank you!
And yes, working with the ACL community, particularly with SIGGEN, would be great! I should probably write an email to their mailing list. Is anyone who has experience with SIGGEN willing to check the mail for tone? I'd like to make a good impression.
Denny
Hey!
Here's a list of what got my attention during the conference. Please be reminded that this is ACL, so "Language Model" tends to mean BERT and the like.
Logical Natural Language Generation from Open-Domain Tables https://www.aclweb.org/anthology/2020.acl-main.708.pdf Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang
AMR-To-Text Generation with Graph Transformer https://www.mitpressjournals.org/doi/pdf/10.1162/tacl_a_00297 Tianming Wang, Xiaojun Wan, Hanqi Jin
Structural Information Preserving for Graph-to-Text Generation https://www.aclweb.org/anthology/2020.acl-main.712.pdf (Linfeng Song, Ante Wang, Jinsong Su, Yue Zhang, Kun Xu, Yubin Ge, Dong Yu)
The other things I found interesting are either not publicly available yet, namely Kathy McKeown's plenary "Rewriting the Past: Assessing the Field through the Lens of Language Generation", or only partially available (the livestream of the talks is not), namely this tutorial on Commonsense Reasoning: https://tinyurl.com/acl2020-commonsense
Hope that helps!
Cheers!
Hi Denny,
yes, the main problem of most of the systems presented in research papers (UNL or not) is that they are locked inside the institutions that made them. A lot of UNL webpages have gone down since the last time I checked (recently), and the system was in fact designed so that it could work over the web while not letting third parties access the code and data. This is of course the exact reverse of the technical and philosophical approach taken here, and very sad, as decades of accumulated knowledge are lost; the papers are far from sufficient to re-create even a fraction of the said systems.
There is also, I guess, a lot of interesting work that has not been translated into English at all (notably in linguistics), as making an academic career in the national language was an option in a lot of places until very recently.
So, would you be willing to work on that?
Yes, of course, I wouldn't have posted to the mailing list otherwise. I like the dual, concurrent approach you are proposing (the linguistic/theoretical track and the function wiki track in parallel). Note though that I'm not an expert by any means in natural language generation; it just happens that I stumbled upon UNL recently, and it has too much in common with this project on the abstract representation/NLG side not to mention it. I also have some researchers' names in mind, as I have met some who worked on the referenced works.
Concerning the paper authorship, I understand your stance, and yes, I'm willing to do more work and write about previous works with those interested. Just to have an idea, what is the expected timeframe for a revision?
Lexicographic data in Wikidata totally flew under my radar. This is indeed something that will be needed in the future, and something I can directly contribute to! As mentioned in [1], the license seems to be an issue, notably for importing existing resources; is there any “fix” planned for that?
All in all, I'm very pleased to see that a lot of aspects are more thought through than I assumed from reading the paper alone, and I'm more confident in the project's success now.
Best regards, Louis Lecailliez
[1] http://www2.imm.dtu.dk/pubdb/edoc/imm7154.pdf
________________________________ De : Abstract-Wikipedia abstract-wikipedia-bounces@lists.wikimedia.org de la part de Denny Vrandečić dvrandecic@wikimedia.org Envoyé : mercredi 8 juillet 2020 22:37 À : General public mailing list for the discussion of Abstract Wikipedia (aka Wikilambda) abstract-wikipedia@lists.wikimedia.org Objet : Re: [Abstract-wikipedia] NLP issues severely overlooked (Amir E. Aharoni)
Hi Louis, all,
Louis, thanks for raising that important issue!
I have been looking in a number of related NLG systems, and one thing I noticed is a pattern of many of these projects being developed very much in isolation from each other, and also often without much concern for ongoing linguistic research. That is what I tried to capture in the research paper by stating that there is no consensus on this, and that it seems too early to commit to a specific solution.
I had given a quick look to UNL, but the project looked pretty stale to me - I could not see any activity in the last decade. Furthermore, the page didn't provide access to the source code and instead mentioned that part of the technology is under patents, which is quite a red flag for me, and I usually don't look into something like that any further, in order to honestly be able to say that I didn't get any ideas from those patents. If I am mistaken, and there is a freely usable write-up or implementation, I'd be happy to come back and read up more.
Thank you for the annotated bibliography! That is super useful.
But I did look into detail into a (small) number of other, similar systems, such as Grammatical Framework or KPML. Tiago mentioned FrameNet, and I learned a lot about that too. To get an overview of the whole field has been a rather frustrating experience, especially since the major textbook in that area - Dale & Reiter - doesn't cover these systems, nor the 2018 update to that book by Gatt & Krahmer, and it seems that research work in that area often omits these practical systems. Accordingly, when I talk with the professors and researchers in this area, also about the proposal here, they are more focussed on specific issues, and don't know that much about the concrete systems (which is understandable - the flow from research to practical systems is a more established flow in many areas). Never mind that when you get to the linguistic side of it, instead of the computer science part, there are even more competing theories, many of which are aimed toward much more encompassing goals and are about covering the whole of language and natural language understanding, which we want to be shying away from.
The goal of the paper was never meant to be a comprehensive account of the state of the art in natural language generation. That's what Dale & Reiter and Gatt & Krahmer have aimed for, and their works are hundreds of pages. I had the feeling my paper was already too long, and putting in an overview of the state of the art would have made it at least double the length.
So, given that (and other reasons, as lined out in the paper), it seems that a system which could support any of these approaches seemed a more promising way. So far, for my own prototype, I have been mostly following Grammatical Framework (because it has a very accessible book, the software is free, the community was friendly, etc.), and it worked good enough to leave me convinced that the whole thing is worth trying out. But I don't know whether that's the best approach.
As mentioned by Chris Cooley, the goal will be to create a new wiki, a library of functions, that can support any of these approaches. My dream would be - and I see that Chris had already suggested that - that experts like you and your colleagues create an overview of the state of the art that will be accessible to the community and that will allow us to make a well-informed decision when the time comes as to which path to explore first. In a parallel track, we will be creating the function wiki, and then, when the time is ripe we can bring these two strands of work together. So, would you be willing to work on that?
How does this sound for a plan?
Some further points:
This is way easier to implement, test and deliver than to implement 10 different backends with various progress in implementation, incompatibilities and runtime characteristics.
Regarding your point about evaluation environments: I agree, it would be a huge task if the WMF core team were to develop all these different environments. But that's not the plan. The goal is really that *others* will hopefully build these :) All we need to do is to make sure that's possible and encouraged and simple enough. But yeah, not the core team.
The paper presents AW as sitting on top on WL. Both are big projects. Sitting a big project on top of another one is really risky, as it means a significant milestone must first be reached in the dependency (here WL), which would likely took some years, before even starting the work on the other project.
Yes, that's correct. That is exactly the time that allows us do the appropriate state of the art analysis. I hope it won't take us years, but that we will be faster.
AW can be realised with current tools and engineering practices.
Only if you commit to a specific implementation, which I am hesitant to do.
[English is an obstacle to programming] This strong affirmation needs to be sourced.
https://dl.acm.org/doi/10.1145/3051457.3051464
As I spend a significant time (~10 hours) gathering references and writing this email (which is 5 pages long in Word), I would like to be mentioned as co-author in the final paper if any idea or references presented here is used in it.
Thank you for your detailed comments, which will certainly improve the second version of the paper. I am happy to mention you in the acknowledgments. For co-authorship, I usually go for a more substantial engagement ;) If you're willing to write up the "Previous work" section along the lines you mentioned above (maybe with Tiago? maybe with others who want to join?), aiming for a comprehensive overview of existing systems, then I am open to talking about co-authorship :)
For French, the gender of every noun entity *must* be present ... For Chinese and Japanese, classifier information must be present for all noun, in case one must be enumerated.
That's exactly the goal of the lexicographic project on Wikidata, as was pointed out:
https://www.wikidata.org/wiki/Lexeme:L12449
You'll find plenty of Lexemes with their classifiers, forms, etc. The lexicographic project was started with Abstract Wikipedia in mind, knowing that exactly this kind of data will be needed.
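To make that concrete, here is a minimal sketch of how a renderer could already pull such data through the standard wbgetentities API (the classifier property ID "P0000" and the helper names below are placeholders, not the real ones):

// Minimal sketch: fetch a lexeme (e.g. L12449) and read a classifier statement from it.
// "P0000" is a placeholder; the actual property ID for classifiers would have to be looked up.
async function fetchLexeme(id: string): Promise<any> {
  const url = `https://www.wikidata.org/w/api.php?action=wbgetentities&ids=${id}&format=json&origin=*`;
  const response = await fetch(url);
  const json = await response.json();
  return json.entities[id];
}

async function classifierOf(lexemeId: string): Promise<string | undefined> {
  const lexeme = await fetchLexeme(lexemeId);
  const statements = lexeme.claims?.["P0000"] ?? [];     // placeholder property ID
  return statements[0]?.mainsnak?.datavalue?.value?.id;  // points at another lexeme or item
}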
> Yet, the use of any existing formalism is dismissed, which means all the situations I illustrated in this email will need to be dealt with in an ad hoc fashion.
No, not at all - it doesn't have to be ad hoc. That's exactly what we can start working on now, long before we get to the point where we would otherwise have to make that ad hoc decision. I hope you'll join us to figure out the best way!
Thanks to Charles, Amir, Tiago, Christopher, Arthur, and Adam for your beautiful answers, which made a number of great points much better than I ever could. And thanks to Louis for starting this more than interesting thread! Let's continue in this vein!
Cheers, Denny
On Sun, Jul 5, 2020 at 9:49 PM Adam Sobieski <adamsobieski@hotmail.com> wrote:
Brainstorming: resembling what the document object model (DOM) [1] is for XML and attributed trees, perhaps we could create and specify an object model for sets of attributed predicate calculus expressions.
With an attributed predicate calculus object model (e.g. “APCOM”) for sets of attributed predicate calculus expressions:
{
r1.@a1(o1(icl>domain1).@a2, o2(icl>domain2).@a3).@a4
r2.@a5(o3(icl>domain3).@a6, o4(icl>domain4).@a7).@a8
r3.@a9(o5(icl>domain5).@a10, o6(icl>domain6).@a11, o7(icl>domain7).@a12).@a13
}.@a14
developers could more conveniently utilize sets of attributed predicate calculus expressions from JavaScript and Lua.
Drawing from XML, we can consider that objects, relations, and attributes could be, instead of plain text strings, uniform resource identifiers (URIs). “r1” could be a URI, “a1” could be a URI, “o1” could be a URI, and so forth.
We can also consider that the attributes in a model could have values:
{
r1.[@a1=v1](o1(icl>domain1).[@a2=v2], o2(icl>domain2).[@a3=v3]).[@a4=v4]
r2.[@a5=v5](o3(icl>domain3).[@a6=v6], o4(icl>domain4).[@a7=v7]).[@a8=v8]
r3.[@a9=v9](o5(icl>domain5).[@a10=v10], o6(icl>domain6).[@a11=v11], o7(icl>domain7).[@a12=v12]).[@a13=v13]
}.[@a14=v14]
We can consider creating a scripting API (e.g. “APCOM”) for a semantic model for the convenience of developers. We can also consider adding attribute-value pairs to a semantic model.
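As a first sketch, such an object model could be as simple as the following (TypeScript, every name purely illustrative):

// Purely illustrative sketch of an "APCOM"-style object model.
type Uri = string;

interface Attributed {
  attributes: Map<Uri, string | null>;   // attribute URI -> optional value (the .[@a=v] part)
}

interface Term extends Attributed {
  object: Uri;                           // e.g. "o1"
  inclusion?: Uri;                       // e.g. the "icl>domain1" restriction
}

interface Expression extends Attributed {
  relation: Uri;                         // e.g. "r1"
  arguments: Term[];                     // the terms o1, o2, ...
}

interface ExpressionSet extends Attributed {
  expressions: Expression[];             // the whole { ... } set, itself attributable
}

A DOM-like API would then add traversal and mutation methods on top of these structures.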
Best regards,
Adam
[1] https://en.wikipedia.org/wiki/Document_Object_Model
From: Tiago Timponi Torrent <tiago.torrent@ufjf.edu.br>
Sent: Sunday, July 5, 2020 9:06 PM
To: General public mailing list for the discussion of Abstract Wikipedia (aka Wikilambda) <abstract-wikipedia@lists.wikimedia.org>
Subject: Re: [Abstract-wikipedia] NLP issues severely overlooked (Amir E. Aharoni)
That’s a good idea, but I think you would need more than that. Take FrameNet, for example, but now departing from verbs instead of nouns. FrameNet has a very detailed model for dealing with verbs, their semantic arguments and the way they surface in morphosyntax. Nonetheless, to apply such a model in a text comprehension and/or generation task, you need more than that. You need to know prototypical fillers for the positions, which, in turn, are associated with other frames and, therefore, participate in other clusters of the network of frames. Moreover, you’d want those prototypical fillers to function as points of departure for analogical extensions in the model, since not every sentence is a prototypical combination of words. In other words, the collection of attributes and relations you refer to should be defined in a way that they can be analogically extended to other collections not originally assigned to the item you’re looking at.
Cheers
Tiago
On Sun, Jul 5, 2020 at 8:03 PM, Arthur Smith <arthurpsmith@gmail.com> wrote:
Yes, thank you for the UNL background, that is extremely helpful. I've been reading some of the articles Louis provided as references, and it seems to me, from this perhaps naive point of view, that a lot of the complexity is associated with disambiguation of meaning - for nouns I think Wikidata items (and their relations to lexeme senses) solve that problem, but I think we are still missing a lot of the detail needed to do the same with adjectives and verbs (at least). So there is definitely some room for finding better ways to model - but maybe Wikidata could be expanded to handle the adjective/verb cases too. In general the concept of a single meaning associated with a Wikidata item as its identifier, and a collection of attributes and relationships attached to that item, is a powerful one that could resolve many such issues.
Arthur
On Sun, Jul 5, 2020 at 6:55 PM Adam Sobieski <adamsobieski@hotmail.com> wrote:
Louis,
Thank you for the information about the Universal Networking Language [1] and the World Atlas of Language Structures [2].
Semantic Modeling
Do you opine that adding attributes to objects, relations and expressions enhances expressiveness for various features of natural language?
r.@a1.@a2(o1(icl>domain1).@a3.@a4, o2(icl>domain2).@a5.@a6).@a7.@a8
I wonder whether there exist mappings or workarounds with which to obtain such expressiveness for models such as Wikidata’s.
Scripting Environments for Natural Language Generation
Supposing that Wikilambda could be JavaScript / WebAssembly based, and observing that Lua / WebAssembly solutions exist, we can note that scripting engines such as V8 are easy to use and to add global objects and API to. Resembling how Web browsers provide scripting environments and API for functions, we can envision providing scripting environments and API for natural language generation functions.
I wonder what you might think about scripting environments and API for natural language generation scenarios?
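For instance, from a contributor's point of view, the host environment could expose a small global API along these lines (purely illustrative, not an existing interface):

// Purely illustrative: a global object a host (e.g. an embedded V8) might inject for NLG scripts.
declare const nlg: {
  lexeme(id: string): Promise<{ lemma: string; forms: Record<string, string> }>;
  render(pattern: string, slots: Record<string, string>, language: string): string;
};

async function describeCountry(countryLexeme: string, language: string): Promise<string> {
  const country = await nlg.lexeme(countryLexeme);                      // hypothetical lookup
  return nlg.render("is-a-country", { subject: country.lemma }, language);
}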
Best regards,
Adam
[1] https://en.wikipedia.org/wiki/Universal_Networking_Language
[2] https://wals.info/
From: Louis Lecailliez <louis.lecailliez@outlook.fr>
Sent: Saturday, July 4, 2020 2:10 PM
To: abstract-wikipedia@lists.wikimedia.org
Subject: Re: [Abstract-wikipedia] NLP issues severely overlooked (Amir E. Aharoni)
Hi Amir,
I understand the process is different that usual research. In fact I've seen Wikipedia grown from an unknown website to the biggest encyclopedia it is now. I use it daily in multiple languages and love it. I know what crowd sourcing could achieve.
It's also possible that the mere *finding* of these stumbling blocks by such a big, diverse, open, and active community, will itself be a contribution to the scientific knowledge around this subject.
I disagree here. It would be contribution to scientic knowledge if and only if it wasn't discovered before. My email was precisely about that: capitalizing on the knowledge that has already been discovered, to avoid making the same mistake them again. It would not matter for a small project, but this one is really ambitious. We are speaking of 40 years of work by a horde of talented and very knowledgeable people, so this isn't to be dismissed easily.
This thing is, my previous email was a bit abstract, because it were a review of the paper, not of the project itself. I should have made more examples to illustrate where the problem lies.
Let's start with a simple example, in English, with corresponding Wikidata entities in-between parenthesis. I'm also using pseudo-turtle notation with made up relationships.
France (Q142) is a country (Q6256).
<Q142> <rel_is> <Q6256> .
Creating the English sentence is straightforward with the naive approach presented in the paper.
What is the French equivalent?
La France est un pays.
More information is required in the abstract representation: the text generator needs to know about the gender of both nouns (France and pays). So we need to extend the model as such:
<Q142> <rel_gender> <Q1775415> .
<Q6256> <rel_gender> <Q499327> .
Fine! Now what about Chinese?
法國是一個國家。
What we have in the middle of the sentence is a classifier (個). The model needs the following update:
<Q499327> <rel_use_classifier> <Q63153> .
To handle these 3 languages, the model has already 3 additional triples just for accounting for linguistic facts occuring in these languages. Wikipedia exists in more than 300 languages, and the world has about 6000 of them, each of them having particularities that must be taken into account. Fortunately they recoup themselves in-between languages. Nonetheless the World Atlas Language Structures (https://wals.info/chapter/s1) count 144 distinct language features. Some are related to speech, but this means there is probably something like a hundred features that must be taken into account in the data model to produce valid natural language sentence.
Note that in the Chinese example, there is also a number (一, one) showing up. This is a phenomenon that must be taken into account; and it's not always appearing when using 是 (to be). How complex the "lambda" system will be just to deal with this issue? Hint: very much. It also needs to be compatible with dozen of other phenomena.
Then each of those features require extensive and complete data. For French, the gender of every noun entity *must* be present, otherwise there is half a chance of producing a wrong sentence each time a noun entity is encountered. For Chinese and Japanese, classifier information must be present for all noun, in case one must be enumerated. Where does the project will get the data from? (we are speaking of millions of item, most not referenced in existing dictionaries) How will this be encoded? Those are real questions that must be answered.
Suppose now we have done the work for "renderers" in these three languages. They both use the more or less similar A X B structure where X is a verb meaning "to be".
What would be the Japanese equivalent?
The more natural structure would be like:
フランスは国(だ)。
What is at play here is a topicalization (Q63105) of France, followed by a predicate (it's a country). This is very different from the previous structure and, not surprisingly, needs its own representation. To make the situation more difficult, the previous (A be B) structure also exists in Japanese, but would lead to a totally different sentence if used.
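Just to make the combinatorial problem concrete, even a naive renderer for this single triple already needs per-language data and a per-language sentence pattern; an illustrative sketch only, not a proposal:

// Illustrative only: one "A is a B" fact, but every language needs its own data and pattern.
type Entity = { label: string; gender?: "masc" | "fem"; classifier?: string };

function renderIsA(a: Entity, b: Entity, language: string): string {
  switch (language) {
    case "en": return `${a.label} is a ${b.label}.`;
    case "fr": {                                        // needs grammatical gender of both nouns
      const subjArt = a.gender === "fem" ? "La" : "Le";
      const objArt = b.gender === "fem" ? "une" : "un";
      return `${subjArt} ${a.label} est ${objArt} ${b.label}.`;
    }
    case "zh": return `${a.label}是一${b.classifier ?? "個"}${b.label}。`;  // needs a classifier
    case "ja": return `${a.label}は${b.label}だ。`;      // topicalized structure, no shared pattern
    default: throw new Error(`no renderer for ${language}`);
  }
}

And this is before numbers, plurals, definiteness, honorifics or word order variation even enter the picture.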
The paper states that Figures 1 and 2 are examples that will be more complex in real life. Yet, the use of any existing formalism is dismissed, which means all the situations I illustrated in this email will need to be dealt with in an ad hoc fashion. Moreover, changing the formalism (be it ad hoc or not) will require changing every piece of code and data using it. This will happen every time a language with unsupported feature(s) is added to the project. It's not hard to see how this will waste a huge amount of time and goodwill from the people involved. The very code-focussed tone of the paper, the English-centric approach used in the examples and the lack of references show that the complexity of the task on the NLP front is not sufficiently conceptualized.
Best Regards,
Louis Lecailliez
From: Abstract-Wikipedia <abstract-wikipedia-bounces@lists.wikimedia.org> on behalf of abstract-wikipedia-request@lists.wikimedia.org
Sent: Saturday, July 4, 2020 15:06
To: abstract-wikipedia@lists.wikimedia.org
Subject: Abstract-Wikipedia Digest, Vol 1, Issue 6
Today's Topics:
1. Re: NLP issues severely overlooked (Charles Matthews)
2. Use case: generation of short description (Jakob Voß)
3. Re: NLP issues severely overlooked (Amir E. Aharoni)
----------------------------------------------------------------------
Message: 1
Date: Sat, 4 Jul 2020 14:05:09 +0100 (BST)
From: Charles Matthews <charles.r.matthews@ntlworld.com>
To: "General public mailing list for the discussion of Abstract Wikipedia (aka Wikilambda)" <abstract-wikipedia@lists.wikimedia.org>
Subject: Re: [Abstract-wikipedia] NLP issues severely overlooked
It is interesting to be on a list where one can hear about software issues, and then computational linguistic problems. I'm not an expert in either area.
I do have 17 years of varied Wikimedia experience (and I use my real name there).
> On 04 July 2020 at 12:25 Louis Lecailliez <louis.lecailliez@outlook.fr> wrote:
> <snip>
> Nothing precise is said about linguistic resources in the AW paper except for "These function finally can call the lexicographic knowlegde stored in Wikidata.", which is not very convincing: first because Wiktionary projects themselves severely lacks content and structure for those who has some content at all, secondly since specialized NLP ressources are missing there too (note: I'm not familiar with Wikidata so I could be wrong, however I never saw it cited for the kind of NLP resources I'm talking about).
I can comment about this. Besides Wiktionary, there is the "lexeme" namespace of Wikidata. It is a relatively new part of Wikidata, dealing with verbal forms.
> To finish on a positive note, I would like to highlight the points I really like in the paper. First, its collaborative and open nature, like all Wikimedia projects, gives him a much higher chance of success than its predecessors.
It is worth saying, for context, that there is a certain style or philosophy coming from the wiki side: more precisely, from the wikis before Wikipedia. There is the slogan "what is the simplest thing that would actually work?" You might argue, plausibly, that Wikipedia, at nearly 20 years old, shows that there is a bit more to engineering than that.
On the other hand, looking at Wikidata at seven years old, there is some point to the comment. It has a rather simple approach to linked structured data, compared to the Semantic Web environment. (Really, just write a very large piece of JSON and try to cope with it!) But the number of binary relations used (8K, if you count the "external links" handling) is now quite large, and has grown organically. The data modelling is in a sense primitive, sometimes non-existent. But the range of content handled really is encyclopedic. And in an area like scientific bibliography, at a scale of tens of millions of entities, the advantages of not much ontological fussiness begin to be seen.
Wikidata started as an index of all Wikipedia articles, and is now five times the size needed for that: a very enriched "index".
I suppose the NLP required to code up, for example, 50K chemistry articles about molecules, might be a problem that could be solved, leaving aside the general problems for the moment.
In any case, there is a certain approach, neither academic nor commercial, that comes with Wikimedia and its communities, and it will be interesting to see how the issues are addressed.
Charles Matthews (in Cambridge UK)
------------------------------
Message: 2
Date: Sat, 4 Jul 2020 08:18:56 +0200
From: Jakob Voß <jakob.voss@gbv.de>
To: abstract-wikipedia@lists.wikimedia.org
Subject: [Abstract-wikipedia] Use case: generation of short description
Hi,
I want to auto-generate disambiguation descriptions for African politicians to be added to Wikidata, e.g. from the country Mozambique (Q1029) the following descriptions should be generated:
Mozambican politician (en)
Mosambikanischer Politiker (de)
politico mozambicano (it)
...
This could be extended to other professions. My questions:
- Can anyone point me to data sources where best to look up country adjectives such as "Mozambican"?
- Where/how to best store the lexical information for best reuse with other renderers?
- If I create small renderers for these short descriptions, what architecture do you prefer for best reuse?
My just-get-it-done solution would be a set of CSV files and a few lines of Perl code, but maybe this use case can be aligned with Abstract Wikidata to better learn about it.
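For example (all data and names made up, just to illustrate the shape of it), the renderer could be little more than a per-language adjective table plus a per-language composition rule:

// Illustrative sketch: country adjectives per language plus a composition rule per language.
const countryAdjective: Record<string, Record<string, string>> = {
  // Mozambique (Q1029); forms as needed before a masculine noun
  Q1029: { en: "Mozambican", de: "mosambikanischer", it: "mozambicano" },
};

const compose: Record<string, (adj: string, noun: string) => string> = {
  en: (adj, noun) => `${adj} ${noun}`,      // Mozambican politician
  de: (adj, noun) => `${adj} ${noun}`,      // mosambikanischer Politiker
  it: (adj, noun) => `${noun} ${adj}`,      // politico mozambicano
};

function shortDescription(countryId: string, noun: Record<string, string>, lang: string): string {
  return compose[lang](countryAdjective[countryId][lang], noun[lang]);
}

// shortDescription("Q1029", { en: "politician", de: "Politiker", it: "politico" }, "it")
//   -> "politico mozambicano"

It breaks down as soon as gender agreement matters (e.g. "mosambikanische Politikerin"), which is exactly where shared lexeme data would help.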
Looking forward to collaborate, Jakob
------------------------------
Message: 3
Date: Sat, 4 Jul 2020 18:03:24 +0300
From: "Amir E. Aharoni" <amir.aharoni@mail.huji.ac.il>
To: "General public mailing list for the discussion of Abstract Wikipedia (aka Wikilambda)" <abstract-wikipedia@lists.wikimedia.org>
Subject: Re: [Abstract-wikipedia] NLP issues severely overlooked
Hi,
Thanks a lot for the sources. I am not one of the people implementing Wikilambda, but I am just very curious about it as a member of the wider Wikimedia community. But there's a good chance that they will be useful to people who do work on the implementation.
I will dare to add a little thought I have about it, however. It's possible that the challenge of building a well-functioning natural language generator is underestimated by the founders, and that they don't pay enough attention to existing work (although, knowing Denny, there is a good chance that he actually is aware of at least some of it). But there is something that the wide Wikimedia community has that I'm not sure that the past projects in this field did: The community itself. A big, worldwide, and diverse group of passionate volunteers, who love the idea of spreading free knowledge and who love their languages. Quite a lot of them also know some programming, and in the past they proved unbelievably creative and productive when writing code for Wikimedia projects as a community, in the form of templates, modules, gadgets, bots, extensions, and other tools. I'm quite sure that once the new tools become usable, this community will start doing creative things again, and it will also start reporting bugs and limitations.
So yes, while it's possible that along the way both the core developers and the volunteer community will find all kinds of stumbling blocks, I'm pretty sure that they will also have all kinds of surprising success stories. It's also possible that the mere *finding* of these stumbling blocks by such a big, diverse, open, and active community, will itself be a contribution to the scientific knowledge around this subject. And don't underestimate the "open" part—that's where we really shine. This won't be a theoretical work in a lab, published in a paywalled and copyright-restricted academic journal, but fully optimized for accessibility to everyone.
Yes, this whole email from me is incredibly naïve, but it's the same attitude that got us to writing the biggest and most multilingual encyclopedia in history, so maybe we can do something cool again :)
-- Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי http://aharoni.wordpress.com “We're living in pieces, I want to live in peace.” – T. Moore
On Sat, Jul 4, 2020 at 14:26, Louis Lecailliez <louis.lecailliez@outlook.fr> wrote:
Hello,
my name is Louis Lecailliez, a PhD student in educational technology at Kyoto University. I'm a Computer Science and NLP graduate. One thing I do is work on modelling language learners' knowledge as graphs.
The Abstract Wikipedia project is really interesting. There are, however, two very concerning issues I spotted when reading the associated paper draft (https://arxiv.org/abs/2004.04733). The following email could be read as negative, but please don't take it as such: my purpose is to avoid wasting people's effort and money on things that can (need to!) be fixed upfront.
- Issues with NLP
The main issue is that the difficulty of the NLP task of generating natural text from an abstract representation is severely overlooked. This stems from the other main problem: the paper is not based on the decades of previous work in that space.
As I understand it, the main value proposition of Abstract Wikipedia (AW) is a computer representation of encyclopedic knowledge that can be projected into different existing natural languages, with the goal of supporting a huge number of them. Plus, an editor to make this happen easily.
This is in fact surprisingly close to what the Universal Networking Language (UNL) project, which started 20 years ago, aims to do. UNL provides a language-agnostic representation of text that uses hypergraphs. Software (called an EnConverter) produces UNL graphs from natural text in a given language. Another kind of software, called a DeConverter, does the reverse, that is, produces natural text from the abstract representation. This is exactly the same function as the "renderers" in the AW paper. The way of doing it is also similar: by applying successive transformations until the final text string is produced. In general, that kind of abstract meaning representation is called an interlingua, and is widely used in Machine Translation (MT) systems.
Disregarding two decades of work, in the UNL case, on the same problem space (rule-based machine translation, here from an abstract language as the fixed source language), which was itself based on a few more decades of work, doesn't seem to be a wise way to start a new project. For a start, the graph representation used in AW will likely not be expressive enough to encode linguistic knowledge; this is why UNL uses hypergraphs instead of graphs.
The problem is glaring when looking at the reference list: it is bloated with irrelevant references (such as those to programming languages [27, 37, 41, 77], Turing completeness being the worst offender [11, 17, 23, ...]) while containing only two references [7, 85] to the really hard part of the project: generating natural language from the abstract representation. There are a few more relevant references about natural language generation, but this isn't enough.
Interestingly, [85] is a UNL paper, but not the main one. Moreover, it is cited in Section 9 "Opening future research". It should instead be discussed in a "Previous work" section, which is missing from the paper.
To fill a part of this section yet to be written, I propose the following references:
[1*] Uchida, H., Zhu, M., & Della Senta, T. (1999). A gift for a millennium. IAS/UNU, Tokyo. https://www.researchgate.net/profile/Hiroshi_Uchida2/publication/239328725_A...
[2*] Wang-Ju Tsai (2004). La coédition langue-UNL pour partager la révision entre langues d'un document multilingue. [Language-UNL coedition to share revisions in a multilingual document] Thèse de doctorat. Grenoble. https://pdfs.semanticscholar.org/b030/ea4662e393657b9a134c006ca5b08e8a23b3.p...
[3*] Boitet, C., & Tsai, W. J. (2002). La coédition langue <-> UNL pour partager la révision entre les langues d'un document multilingue: un concept unificateur. Proc. TALN-02, Nancy, 22-26. http://www.afcp-parole.org/doc/Archives_JEP/2002_XXIVe_JEP_Nancy/talnrecital...
[4*] Tomokiyo, M., Mangeot, M., & Boitet, C. (2019). Development of a classifiers/quantifiers dictionary towards French-Japanese MT. arXiv preprint arXiv:1902.08061. https://arxiv.org/pdf/1902.08061.pdf
[5*] Boguslavsky, I. (2005). Some controversial issues of UNL: Linguistic aspects. Research on Computer Science, 12, 77-100. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.212.2058&rep=re...
[6*] Boitet, C. (2002). A rationale for using UNL as an interlingua and more in various domains. In Proc. LREC-02 First International Workshop on UNL, other Interlinguas, and their Applications, Las Palmas (pp. 26-31). https://www.cicling.org/2005/unl-book/Papers/003.pdf
[7*] Dhanabalan, T., & Geetha, T. V. (2003, December). UNL deconverter for Tamil. In International Conference on the Convergences of Knowledge, Culture, Language and Information Technologies. http://www.cfilt.iitb.ac.in/convergence03/all%20data/paper%20032-372.pdf
[8*] Singh, S., Dalal, M., Vachhani, V., Bhattacharyya, P., & Damani, O. P. (2007). Hindi generation from Interlingua (UNL). Machine Translation Summit XI. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.78.979&rep=rep1...
[9*] Banarescu, L., Bonial, C., Cai, S., Georgescu, M., Griffitt, K., Hermjakob, U., ... & Schneider, N. (2013, August). Abstract meaning representation for sembanking. In Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse (pp. 178-186). https://www.aclweb.org/anthology/W13-2322.pdf
[10*] Berment, V., & Boitet, C. (2012). Heloise - An Ariane-G5 Compatible Environment for Developing Expert MT Systems Online. In Proceedings of COLING 2012: Demonstration Papers (pp. 9-16). https://www.aclweb.org/anthology/C12-3002.pdf
[11*] Berment, V. (2005). Online Translation Services for the Lao Language. In Proceedings of the First International Conference on Lao Studies. De Kalb, Illinois, USA (pp. 1-11). https://www.researchgate.net/profile/Vincent_Berment/publication/242140227_O...
[1*] is the paper that describes UNL. [2*] is a doctoral thesis discussing a core problem AW is trying to address too. [3*] is a short paper done in the scope of [2*]; even if you don't understand French you can have a look at the figures: two of them are about an editor similar in principle to what AW wants to incorporate. [5*] gives insights about UNL expressivity issues, 10 years after the project's start. [6*] is more on UNL, with a short history and the context in which it is used.
[4*] shows how deep natural language conversion goes: this paper addresses the issue of classifiers in French and Japanese. This is just one linguistic issue, and there are dozens if not hundreds of such issues. An important point is that both of the languages involved need to be taken into account when modelling the abstract encoding, otherwise too much information is lost to produce a correct output.
[7*] and [8*] are very valuable examples of real-world deconverter systems for UNL. As is visible in [7*]'s Figure 1 and [8*]'s Figure 2, the process is *way* more complicated than a single "renderers" box. Moreover, there are very distinct identifiable steps, informed by linguistics. The AW paper does not describe any such structuring of the natural text generation processing steps; everything is supposed to happen in some unstructured "lambda" system. Also missing are the specialized resources (UNL-Hindi dictionary, Tamil word dictionary, co-occurrence dictionary, etc.) required for the task. Nothing precise is said about linguistic resources in the AW paper except for "These function finally can call the lexicographic knowlegde stored in Wikidata.", which is not very convincing: first because the Wiktionary projects themselves severely lack content and structure, for those that have any content at all; secondly because specialized NLP resources are missing there too (note: I'm not familiar with Wikidata so I could be wrong, however I never saw it cited for the kind of NLP resources I'm talking about).
[10*] is a translation system built with "specialised languages for linguistic programming (SLLPs)", which is the service Wikilambda is supposed to provide for Abstract Wikipedia. [11*] gives an estimate of 2,500 hours for the development (by a specialist) of three linguistic modules for Lao processing.
So, with regard to the difficulty of the task and previous work in the literature, the AW paper does not provide any convincing evidence that the technology on which it is supposed to be built can even reach the state of the art. Dismissing every existing formalism and software system on the grounds of "no consensus commiting to any specific linguistic theory" is not going to work: this will result in an ad hoc, implementation-driven formalism that will have a hard time fulfilling its goal. The NLP part (generating sentences from the abstract representation) is the hardest part of the project, yet it's by far the least convincing one. "Abstract Wikipedia is indeed firmly within this tradition, and in preparation for this project we studied numerous predecessors." I would like to believe so, but the lack of corresponding references, as well as the lack of a previous work section, tends to suggest the contrary.
While I can't advise switching to UNL, as I'm not a specialist in it, it would be smart to capitalize on the work done on it by highly skilled (PhD-level) individuals. As the UNL system is built on hypergraphs, it could probably be made interoperable with RDF knowledge graphs fairly easily if named graphs are used. By having a UNL/RDF specification (yet to be written), the vision exposed in the AW paper may be reached sooner by reusing existing software (we are speaking of thousands of person-years of work, as per [11*]) and, almost as importantly, an existing formalism that has been "debugged" for decades. There are probably other systems I'm unaware of that are worth investigating too, some, like [9*], having more specialized usage. In any case, there is a strong need to ground the paper and the project in the existing (huge) literature.
- Other issues
"In order to evaluate a function call, an evaluator can choose from a multitude of backends: it may evaluate the function call in the browser, in the cloud, on the servers of the Wikimedia Foundation, on a distributed peer-to-peer evaluation platform, or natively on the user’s machine in a dedicated hosting runtime, which could be a mobile app or a server on the user’s computer."
This part is major technical scope creep. There is no reason to turn the project into a distributed heterogeneous computing platform with a dedicated runtime, which could be a research project on its own, when the stated goal is to provide abstract multilingual encyclopedic content. All the computation can be done on servers (the cloud is servers too) and cached. This is way easier to implement, test and deliver than to implement 10 different backends with various progress in implementation, incompatibilities and runtime characteristics.
The paper presents AW as sitting on top of WL. Both are big projects. Sitting a big project on top of another one is really risky, as it means a significant milestone must first be reached in the dependency (here WL), which would likely take some years, before work on the other project can even start. AW can be realised with current tools and engineering practices.
"One obstacle in the democratization of programming has been that almost every programming language requires first to learn some basic English."
This strong affirmation needs to be sourced. Programming languages, save for a few keywords, don't rely much on English. The near-total failure of localized versions of programming languages (such as French Basic), as well as the heavy use of existing programming languages in countries that don't even use the Latin alphabet (China, Russia), tends to show that English is not at all a bottleneck for the democratization of programming. [53] is cited later in the paper, but it is a pop-linguistics article from an online newspaper, not an academic article.
- Final words
To finish on a positive note, I would like to highlight the points I really like in the paper. First, its collaborative and open nature, like all Wikimedia projects, gives it a much higher chance of success than its predecessors. If UNL is not too well known, it's not because it didn't yield research achievements, but because one selected institution per language is working on it and keeps the resources and software within the lab walls. Secondly, there are some very welcome out-of-scope decisions: no conversion from natural language, and a restriction to encyclopedic-style text. This will allow for more focused effort towards the end goal, making it more achievable. And finally, the choice to go with a symbolic/rule-based system, with a touch of other ML where useful. This is, as said in the paper, a big win for explainability and for using human contributions to build the system. This will also keep the computing cost at a more sane baseline than what current deep learning models require.
I think the project can succeed thanks to its openness, yet there are real dangers visible in the paper: on the NLP side, reinventing a wheel that took 40 years to build; and on the technical side, losing time and effort on a project that is not required per se for AW to be built.
As I spent a significant amount of time (~10 hours) gathering references and writing this email (which is 5 pages long in Word), I would like to be mentioned as a co-author in the final paper if any ideas or references presented here are used in it.
Best regards, Louis Lecailliez
PS: Typos
- "These two projects will considerably expand the capabilities of the Wikimedia platform to enable every single human being to freely share share in the sum of all knowledge." => duplicate "share"
- "The content is than turned into" => The content is then turned into
- "[26] Charles J Fillmore, Russell Lee-Goldman, and Russell Rhodes. The framenet constructicon. Sign-based construction grammar, pages 309–372, 2012." => The framenet construction
- "These function finally can call the lexicographic knowlegde stored in Wikidata." => These function finally can call the lexicographic knowledge stored in Wikidata
- "[102] George Kinsley Zipf. Human Behavior and the Pirnciple of Least Effort. Addison-Wesley, 1949." => [102] George Kinsley Zipf. Human Behavior and the Principle of Least Effort. Addison-Wesley, 1949.
- "Allowing the individual language Wikipedias to call Wikilambda has an addtional benefit." => Allowing the individual language Wikipedias to call Wikilambda has an additional benefit.
End of Abstract-Wikipedia Digest, Vol 1, Issue 6 ************************************************
--
Tiago Timponi Torrent
PPG-Linguística - FrameNet Brasil
Universidade Federal de Juiz de Fora
Quick side question: is there a role for formal ontology (FOL, DL or CL type of thing) in computational linguistics?
Mike
On 7/9/2020 8:22 AM, Louis Lecailliez wrote:
Hi Denny,
yes, the main problem with most of the systems presented in research papers (UNL or not) is that they are locked inside the institutions that made them. A lot of UNL webpages have gone down since the last time I checked (recently), and the system was in fact designed in a way that it could work over the web without letting third parties access the code and data. This is of course the exact reverse of the technical and philosophical approach taken here, and very sad, as decades of accumulated knowledge are lost; the papers are far from sufficient to re-create even a fraction of the said systems.
There is also, I guess, a lot of interesting work that is not translated into English at all (notably in linguistics), as making an academic career in the national language was an option in a lot of places until very recently.
> So, would you be willing to work on that?
Yes, of course, I wouldn't have posted on the mailing list otherwise. I like the dual, concurrent approach to linguistics/theory you are proposing. Note though that I'm not an expert by any means in natural language generation; it just happens that I stumbled upon UNL recently, and it has too much in common with this project on the abstract representation/NLG side not to mention it. I also had some researchers' names in mind, as I have met some who worked on the referenced works.
Concerning the paper authorship, I understand your stance, and yes, I'm willing to work more and write about previous work with those interested. Just to have an idea, what is the expected timeframe for a revision?
Lexicographic data in Wikidata totally flew under my radar. This is indeed something that will be needed in the future, and something I can directly contribute to as well! As mentioned by [1], the license seems to be an issue, notably for importing existing resources; is there any “fix” planned for that?
All in all, I'm very pleased to see that a lot of aspects are more planned out than I assumed from reading the paper alone, and I'm more confident in the project's success now.
Best regards, Louis Lecailliez
[1] http://www2.imm.dtu.dk/pubdb/edoc/imm7154.pdf
But I did look into detail into a (small) number of other, similar systems, such as Grammatical Framework or KPML. Tiago mentioned FrameNet, and I learned a lot about that too. To get an overview of the whole field has been a rather frustrating experience, especially since the major textbook in that area - Dale & Reiter - doesn't cover these systems, nor the 2018 update to that book by Gatt & Krahmer, and it seems that research work in that area often omits these practical systems. Accordingly, when I talk with the professors and researchers in this area, also about the proposal here, they are more focussed on specific issues, and don't know that much about the concrete systems (which is understandable - the flow from research to practical systems is a more established flow in many areas). Never mind that when you get to the linguistic side of it, instead of the computer science part, there are even more competing theories, many of which are aimed toward much more encompassing goals and are about covering the whole of language and natural language understanding, which we want to be shying away from.
The goal of the paper was never meant to be a comprehensive account of the state of the art in natural language generation. That's what Dale & Reiter and Gatt & Krahmer have aimed for, and their works are hundreds of pages. I had the feeling my paper was already too long, and putting in an overview of the state of the art would have made it at least double the length.
So, given that (and other reasons, as lined out in the paper), it seems that a system which could support any of these approaches seemed a more promising way. So far, for my own prototype, I have been mostly following Grammatical Framework (because it has a very accessible book, the software is free, the community was friendly, etc.), and it worked good enough to leave me convinced that the whole thing is worth trying out. But I don't know whether that's the best approach.
As mentioned by Chris Cooley, the goal will be to create a new wiki, a library of functions, that can support any of these approaches. My dream would be - and I see that Chris had already suggested that - that experts like you and your colleagues create an overview of the state of the art that will be accessible to the community and that will allow us to make a well-informed decision when the time comes as to which path to explore first. In a parallel track, we will be creating the function wiki, and then, when the time is ripe we can bring these two strands of work together. So, would you be willing to work on that?
How does this sound for a plan?
Some further points:
This is way easier to implement, test and deliver than to implement
10 different backends with various progress in implementation, incompatibilities and runtime characteristics.
Regarding your point about evaluation environments: I agree, it would be a huge task if the WMF core team were to develop all these different environments. But that's not the plan. The goal is really that *others* will hopefully build these :) All we need to do is to make sure that's possible and encouraged and simple enough. But yeah, not the core team.
The paper presents AW as sitting on top on WL. Both are big
projects. Sitting a big project on top of another one is really risky, as it means a significant milestone must first be reached in the dependency (here WL), which would likely took some years, before even starting the work on the other project. Yes, that's correct. That is exactly the time that allows us do the appropriate state of the art analysis. I hope it won't take us years, but that we will be faster.
AW can be realised with current tools and engineering practices.
Only if you commit to a specific implementation, which I am hesitant to do.
[English is an obstacle to programming] This strong affirmation
needs to be sourced.
https://dl.acm.org/doi/10.1145/3051457.3051464
As I spend a significant time (~10 hours) gathering references and
writing this email (which is 5 pages long in Word), I would like to be mentioned as co-author in the final paper if any idea or references presented here is used in it. Thank you for your detailed comments, which will certainly improve the second version of the paper. I am happy to mention you in the acknowledgments. For co-authorship, I usually go for a more substantial engagement ;) If you're willing to write up the "Previous work" section along the lines you mentioned above (maybe with Tiago? Maybe with others to join?), but for a comprehensive overview of existing systems, then I am open to talk about co-authorship :)
For French, the gender of every noun entity *must* be present ...
For Chinese and Japanese, classifier information must be present for all noun, in case one must be enumerated. That's exactly the goal of the lexicographic project on Wikidata, as was pointed out: https://www.wikidata.org/wiki/Lexeme:L12449
You'll find plenty of Lexemes with their classifiers, forms, etc. The lexicographic project was started with the Abstract Wikipedia in mind, knowing that exactly that will be needed.
Yet, the use of any existing formalism is dismissed, which mean all
the situations I illustrated in this email will need to be dealt with in an ad hoc fashion. No, not at all it doesn't have to be ad-hoc, that's exactly what we can start working on now, long before we get to the point that we need to make that ad-hoc decision. I hope you'll join us to figure out the best way!
Thanks to Charles, Amir, Tiago, Christopher, Arthur, and Adam for your beautiful answers, who raised a number of great replies much better than I ever could. And thanks to Louis for starting this more than interesting thread! Let's continue in this vein!
Cheers, Denny
On Sun, Jul 5, 2020 at 9:49 PM Adam Sobieski <adamsobieski@hotmail.com mailto:adamsobieski@hotmail.com> wrote:
Brainstorming: resembling what the document object model (DOM) [1] is for XML and attributed trees, perhaps we could create and specify an object model for sets of attributed predicate calculus expressions. With an attributed predicate calculus object model (e.g. “APCOM”) for sets of attributed predicate calculus expressions: { r1.@a1(o1(icl>domain1).@a2, o2(icl>domain2).@a3).@a4 r2.@a5(o3(icl>domain3).@a6, o4(icl>domain4).@a7).@a8 r3.@a9(o5(icl>domain5).@a10, o6(icl>domain6).@a11, o7(icl>domain7).@a12).@a13 }.@a14 developers could more conveniently utilize sets of attributed predicate calculus expressions from JavaScript and Lua. Drawing from XML, we can consider that objects, relations, attributes could be, instead of plain text strings, uniform resource identifiers (URI’s). “r1” could be a URI, “a1” could be a URI, “o1” could be a URI, and so forth. We can also consider that the attributes in a model could have values: { r1.[@a1=v1](o1(icl>domain1).[@a2=v2], o2(icl>domain2).[@a3=v3]).[@a4=v4] r2.[@a5=v5](o3(icl>domain3).[@a6=v6], o4(icl>domain4).[@a7=v7]).[@a8=v8] r3.[@a9=v9](o5(icl>domain5).[@a10=v10], o6(icl>domain6).[@a11=v11], o7(icl>domain7).[@a12=v12]).[@a13=v13] }.[@a14=v14] We can consider creating a scripting API (e.g. “APCOM”) for a semantic model to convenience developers. We can also consider adding attribute-value pairs to a semantic model. Best regards, Adam [1] https://en.wikipedia.org/wiki/Document_Object_Model *From: *Tiago Timponi Torrent <mailto:tiago.torrent@ufjf.edu.br> *Sent: *Sunday, July 5, 2020 9:06 PM *To: *General public mailing list for the discussion of Abstract Wikipedia (aka Wikilambda) <mailto:abstract-wikipedia@lists.wikimedia.org> *Subject: *Re: [Abstract-wikipedia] NLP issues severely overlooked (Amir E. Aharoni) That’s a good idea, but I think you would need more than that. Take FrameNet, for example, but now departing from verbs instead of nouns. FrameNet has a very detailed model for dealing with verbs, their semantic arguments and the way they surface in morphosyntax. Nonetheless, to apply such a model in a text comprehension and/or generation task, you need more than that. You need to know prototypical fillers for the positions, which, in turn, are associated to other frames and, therefore, participate in other clusters of the network of frames. Moreover, you’d want those prototypical fillers to function as departing points for analogical extensions in the model, since not every sentence is a prototypical combination of words. In other words, the collection of attributes and relations you refer to should be defined in a way that they can be analogically extended to other collections not originally assigned to the item you’re looking at. Cheers Tiago Em dom, 5 de jul de 2020 às 20:03, Arthur Smith <arthurpsmith@gmail.com <mailto:arthurpsmith@gmail.com>> escreveu: Yes, thank you for the UNL background, that is extremely helpful. I've been reading some of the articles Louis provided as references, and it seems to me from just this perhaps naive point of view, that a lot of the complexity is associated with disambiguation of meaning - for nouns I think Wikidata items (and their relations to lexeme senses) solve that problem, but we are still missing I think a lot of the detail needed to do the same with adjectives and verbs (at least). So there is definitely some room for finding better ways to model - but maybe Wikidata could be expanded to handle the adjective/verb cases too. 
In general the concept of a single meaning associated with a Wikidata item as its identifier and a collection of attributes and relationships attached to that item is a powerful one that could resolve many such issues. Arthur On Sun, Jul 5, 2020 at 6:55 PM Adam Sobieski <adamsobieski@hotmail.com <mailto:adamsobieski@hotmail.com>> wrote: Louis, Thank you for the information about the Universal Networking Language [1] and the World Atlas of Language Structures [2]. Semantic Modeling Do you opine that adding attributes to objects, relations and expressions enhances expressiveness for various features of natural language? r.@a1.@a2(o1(icl>domain1).@a3.@a4, o2(icl>domain2).@a5.@a6).@a7.@a8 I wonder whether there exist mappings or workarounds with which to obtain such expressiveness for models such as Wikidata’s. Scripting Environments for Natural Language Generation Supposing that Wikilambda could be JavaScript / WebAssembly based, and observing that Lua / WebAssembly solutions exist, we can note that scripting engines such as V8 are easy to use and to add global objects and API to. Resembling how Web browsers provide scripting environments and API for functions, we can envision providing scripting environments and API for natural language generation functions. I wonder what you might think about scripting environments and API for natural language generation scenarios? Best regards, Adam [1] https://en.wikipedia.org/wiki/Universal_Networking_Language [2] https://wals.info/ <https://wals.info/> *From: *Louis Lecailliez <mailto:louis.lecailliez@outlook.fr> *Sent: *Saturday, July 4, 2020 2:10 PM *To: *abstract-wikipedia@lists.wikimedia.org <mailto:abstract-wikipedia@lists.wikimedia.org> *Subject: *Re: [Abstract-wikipedia] NLP issues severely overlooked (Amir E. Aharoni) Hi Amir, I understand the process is different that usual research. In fact I've seen Wikipedia grown from an unknown website to the biggest encyclopedia it is now. I use it daily in multiple languages and love it. I know what crowd sourcing could achieve. > It's also possible that the mere *finding* of these stumbling blocks by such a big, diverse, open, and active community, will itself be a contribution to the scientific knowledge around this subject. I disagree here. It would be contribution to scientic knowledge if and only if it wasn't discovered before. My email was precisely about that: capitalizing on the knowledge that has already been discovered, to avoid making the same mistake them again. It would not matter for a small project, but this one is really ambitious. We are speaking of 40 years of work by a horde of talented and very knowledgeable people, so this isn't to be dismissed easily. This thing is, my previous email was a bit abstract, because it were a review of the paper, not of the project itself. I should have made more examples to illustrate where the problem lies. Let's start with a simple example, in English, with corresponding Wikidata entities in-between parenthesis. I'm also using pseudo-turtle notation with made up relationships. France (Q142) is a country (Q6256). <Q142> <rel_is> <Q6256> . Creating the English sentence is straightforward with the naive approach presented in the paper. What is the French equivalent? La France est un pays. More information is required in the abstract representation: the text generator needs to know about the gender of both nouns (France and pays). So we need to extend the model as such: <Q142> <rel_gender> <Q1775415> . <Q6256> <rel_gender> <Q499327> . Fine! 
Yep. Some applications use them. Back in the early 2000s, there was a big trend of investigating the interface between ontologies and the lexicon (ontolex). I'd say that most recent NLG systems focus instead on common-sense knowledge (KGs and the like), but the key issue of the ontolex problem still remains: language is not only about expressing facts, it's about how you construe them.
Cheers
Tiago
On Tue, Jul 14, 2020 at 16:36, Mike Bennett <mbennett@hypercube.co.uk> wrote:
Quick side question: is there a role for formal ontology (FOL, DL or CL type of thing) in computational linguistics?
Mike
On 7/9/2020 8:22 AM, Louis Lecailliez wrote:
Hi Denny,
yes, the main problem with most of the systems presented in research papers (UNL or not) is that they are locked inside the institutions that made them. A lot of UNL webpages have gone down since the last time I checked (recently), and the system was in fact designed so that it could work over the web while not letting third parties access the code and data. This is of course the exact reverse of the technical and philosophical approach taken here, and it is very sad, as decades of accumulated knowledge are lost; the papers are far from sufficient to re-create even a fraction of the said systems.
There is also, I guess, a lot of interesting work that has not been translated into English at all (notably in linguistics), as making an academic career in the national language was an option in a lot of places until very recently.
> So, would you be willing to work on that?
Yes, of course; I wouldn't have posted on the mailing list otherwise. I like the dual, concurrent linguistics/theory approach you are proposing. Note though that I'm not an expert by any means in natural language generation; it just happens that I stumbled upon UNL recently, and its abstract representation and NLG side have too much in common with this project not to mention it. I also had some researchers' names in mind, as I have met some who worked on the referenced works.
Concerning the paper authorship, I understand your stance, and yes, I'm willing to do more work and write about previous works with those interested. Just to have an idea, what is the expected timeframe for a revision?
Lexicographic data in Wikidata totally flew under my radar. This is indeed something that will be needed in the future, and something I can directly contribute to as well! As mentioned in [1], the license seems to be an issue, notably for importing existing resources; is there any “fix” planned for that?
All in all, I'm very pleased to see that many aspects are planned out more than I had assumed from reading the paper alone, and I'm more confident in the project's success now.
Best regards, Louis Lecailliez
[1] http://www2.imm.dtu.dk/pubdb/edoc/imm7154.pdf
*From:* Abstract-Wikipedia <abstract-wikipedia-bounces@lists.wikimedia.org> on behalf of Denny Vrandečić <dvrandecic@wikimedia.org> *Sent:* Wednesday, July 8, 2020 22:37 *To:* General public mailing list for the discussion of Abstract Wikipedia (aka Wikilambda) <abstract-wikipedia@lists.wikimedia.org> *Subject:* Re: [Abstract-wikipedia] NLP issues severely overlooked (Amir E. Aharoni)
Hi Louis, all,
Louis, thanks for raising that important issue!
I have been looking into a number of related NLG systems, and one thing I noticed is a pattern of many of these projects being developed very much in isolation from each other, and also often without much concern for ongoing linguistic research. That is what I tried to capture in the research paper by stating that there is no consensus on this, and that it seems too early to commit to a specific solution.
I had given a quick look to UNL, but the project looked pretty stale to me - I could not see any activity in the last decade. Furthermore, the page didn't provide access to the source code and instead mentioned that part of the technology is under patents, which is quite a red flag for me, and I usually don't look into something like that any further, in order to honestly be able to say that I didn't get any ideas from those patents. If I am mistaken, and there is a freely usable write-up or implementation, I'd be happy to come back and read up more.
Thank you for the annotated bibliography! That is super useful.
But I did look in detail into a (small) number of other, similar systems, such as Grammatical Framework or KPML. Tiago mentioned FrameNet, and I learned a lot about that too. Getting an overview of the whole field has been a rather frustrating experience, especially since the major textbook in that area - Dale & Reiter - doesn't cover these systems, nor does the 2018 update to that book by Gatt & Krahmer, and it seems that research work in that area often omits these practical systems. Accordingly, when I talk with the professors and researchers in this area, also about the proposal here, they are more focussed on specific issues, and don't know that much about the concrete systems (which is understandable - the flow from research to practical systems is more established in many areas). Never mind that when you get to the linguistic side of it, instead of the computer science part, there are even more competing theories, many of which are aimed toward much more encompassing goals and are about covering the whole of language and natural language understanding, which we want to shy away from.
The goal of the paper was never meant to be a comprehensive account of the state of the art in natural language generation. That's what Dale & Reiter and Gatt & Krahmer have aimed for, and their works are hundreds of pages. I had the feeling my paper was already too long, and putting in an overview of the state of the art would have made it at least double the length.
So, given that (and other reasons, as laid out in the paper), a system which could support any of these approaches seemed a more promising way forward. So far, for my own prototype, I have been mostly following Grammatical Framework (because it has a very accessible book, the software is free, the community was friendly, etc.), and it worked well enough to leave me convinced that the whole thing is worth trying out. But I don't know whether that's the best approach.
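To make the renderer idea a little more concrete, here is a purely illustrative sketch in Python (this is not how my prototype or the eventual function wiki works; the tiny lexicon and all the function names are made up):

# Purely illustrative sketch: one abstract statement plus per-language
# renderer functions. The lexicon and all names here are made up.
LEXICON = {
    ("Q142", "en"): {"label": "France"},
    ("Q6256", "en"): {"label": "country", "article": "a"},
    ("Q142", "fr"): {"label": "France", "gender": "f"},
    ("Q6256", "fr"): {"label": "pays", "gender": "m"},
}

def render_is_a_en(subject, cls):
    s, c = LEXICON[(subject, "en")], LEXICON[(cls, "en")]
    return f"{s['label']} is {c['article']} {c['label']}."

def render_is_a_fr(subject, cls):
    s, c = LEXICON[(subject, "fr")], LEXICON[(cls, "fr")]
    subject_article = "La" if s["gender"] == "f" else "Le"
    class_article = "une" if c["gender"] == "f" else "un"
    return f"{subject_article} {s['label']} est {class_article} {c['label']}."

RENDERERS = {"en": render_is_a_en, "fr": render_is_a_fr}

def render(statement, language):
    subject, relation, cls = statement
    assert relation == "is_a"   # this toy only knows one constructor
    return RENDERERS[language](subject, cls)

print(render(("Q142", "is_a", "Q6256"), "en"))  # France is a country.
print(render(("Q142", "is_a", "Q6256"), "fr"))  # La France est un pays.

Of course, everything this sketch hides is the hard part: where the lexical data comes from, and how the per-language functions get contributed and maintained for hundreds of languages.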
As mentioned by Chris Cooley, the goal will be to create a new wiki, a library of functions, that can support any of these approaches. My dream would be - and I see that Chris had already suggested that - that experts like you and your colleagues create an overview of the state of the art that will be accessible to the community and that will allow us to make a well-informed decision when the time comes as to which path to explore first. In a parallel track, we will be creating the function wiki, and then, when the time is ripe we can bring these two strands of work together. So, would you be willing to work on that?
How does this sound for a plan?
Some further points:
> This is way easier to implement, test and deliver than to implement 10 different backends with various progress in implementation, incompatibilities and runtime characteristics.
Regarding your point about evaluation environments: I agree, it would be a huge task if the WMF core team were to develop all these different environments. But that's not the plan. The goal is really that *others* will hopefully build these :) All we need to do is to make sure that's possible and encouraged and simple enough. But yeah, not the core team.
> The paper presents AW as sitting on top of WL. Both are big projects. Sitting a big project on top of another one is really risky, as it means a significant milestone must first be reached in the dependency (here WL), which would likely take some years, before even starting the work on the other project.

Yes, that's correct. That is exactly the time that allows us to do the appropriate state-of-the-art analysis. I hope it won't take us years, but that we will be faster.
> AW can be realised with current tools and engineering practices.
Only if you commit to a specific implementation, which I am hesitant to do.
> [English is an obstacle to programming] This strong affirmation needs to be sourced.
https://dl.acm.org/doi/10.1145/3051457.3051464
> As I spent a significant amount of time (~10 hours) gathering references and writing this email (which is 5 pages long in Word), I would like to be mentioned as co-author in the final paper if any idea or reference presented here is used in it.

Thank you for your detailed comments, which will certainly improve the second version of the paper. I am happy to mention you in the acknowledgments. For co-authorship, I usually go for a more substantial engagement ;) If you're willing to write up the "Previous work" section along the lines you mentioned above (maybe with Tiago? Maybe with others to join?), but as a comprehensive overview of existing systems, then I am open to talking about co-authorship :)
> For French, the gender of every noun entity *must* be present ... For Chinese and Japanese, classifier information must be present for all nouns, in case one must be enumerated.

That's exactly the goal of the lexicographic project on Wikidata, as was pointed out: https://www.wikidata.org/wiki/Lexeme:L12449
You'll find plenty of Lexemes with their classifiers, forms, etc. The lexicographic project was started with the Abstract Wikipedia in mind, knowing that exactly that will be needed.
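For example, a quick Python sketch against the public API can already pull that lexeme and print its forms and their grammatical features (the JSON field names here are from memory, so please double-check them before relying on this):

# Rough sketch: fetch a Wikidata lexeme and print its lemmas and forms.
# L12449 is the lexeme linked above; field names are from memory and may
# need to be checked against the live API.
import requests

API = "https://www.wikidata.org/w/api.php"
resp = requests.get(API, params={
    "action": "wbgetentities",
    "ids": "L12449",
    "format": "json",
})
entity = resp.json()["entities"]["L12449"]

for lang, lemma in entity["lemmas"].items():
    print("lemma:", lang, lemma["value"])

for form in entity.get("forms", []):
    reps = ", ".join(r["value"] for r in form["representations"].values())
    print("form:", reps, "features:", form.get("grammaticalFeatures", []))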
> Yet, the use of any existing formalism is dismissed, which means all the situations I illustrated in this email will need to be dealt with in an ad hoc fashion.

No, not at all; it doesn't have to be ad hoc. That's exactly what we can start working on now, long before we get to the point where we would need to make that ad hoc decision. I hope you'll join us to figure out the best way!
Thanks to Charles, Amir, Tiago, Christopher, Arthur, and Adam for your beautiful answers, which raised a number of great points much better than I ever could have. And thanks to Louis for starting this more than interesting thread! Let's continue in this vein!
Cheers, Denny
On Sun, Jul 5, 2020 at 9:49 PM Adam Sobieski adamsobieski@hotmail.com wrote:
Brainstorming: resembling what the document object model (DOM) [1] is for XML and attributed trees, perhaps we could create and specify an object model for sets of attributed predicate calculus expressions.
With an attributed predicate calculus object model (e.g. “APCOM”) for sets of attributed predicate calculus expressions:
{
r1.@a1(o1(icl>domain1).@a2, o2(icl>domain2).@a3).@a4
r2.@a5(o3(icl>domain3).@a6, o4(icl>domain4).@a7).@a8
r3.@a9(o5(icl>domain5).@a10, o6(icl>domain6).@a11, o7(icl>domain7).@a12).@a13
}.@a14
developers could more conveniently utilize sets of attributed predicate calculus expressions from JavaScript and Lua.
Drawing from XML, we can consider that objects, relations, and attributes could be, instead of plain text strings, uniform resource identifiers (URIs): “r1” could be a URI, “a1” could be a URI, “o1” could be a URI, and so forth.
We can also consider that the attributes in a model could have values:
{
r1.[@a1=v1](o1(icl>domain1).[@a2=v2], o2(icl>domain2).[@a3=v3]).[@a4=v4]
r2.[@a5=v5](o3(icl>domain3).[@a6=v6], o4(icl>domain4).[@a7=v7]).[@a8=v8]
r3.[@a9=v9](o5(icl>domain5).[@a10=v10], o6(icl>domain6).[@a11=v11], o7(icl>domain7).[@a12=v12]).[@a13=v13]
}.[@a14=v14]
We can consider creating a scripting API (e.g. “APCOM”) for a semantic model to make things more convenient for developers. We can also consider adding attribute-value pairs to a semantic model.
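As a very rough sketch of the shape such an object model could take (Python is used here purely for illustration; the class and field names are placeholders, not a proposed API):

# Rough illustrative sketch of an object model for sets of attributed
# predicate calculus expressions. All class and field names are placeholders.
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union

Value = Union[str, int, float, bool]

@dataclass
class Term:
    uri: str                                  # an object or relation URI
    restriction: Optional[str] = None         # e.g. "icl>domain1"
    attributes: Dict[str, Optional[Value]] = field(default_factory=dict)

@dataclass
class Expression:
    relation: Term
    arguments: List[Term]
    attributes: Dict[str, Optional[Value]] = field(default_factory=dict)

@dataclass
class ExpressionSet:
    expressions: List[Expression]
    attributes: Dict[str, Optional[Value]] = field(default_factory=dict)

# r1.[@a1=v1](o1(icl>domain1).[@a2=v2], o2(icl>domain2).[@a3=v3]).[@a4=v4]
example = ExpressionSet(
    expressions=[
        Expression(
            relation=Term("r1", attributes={"a1": "v1"}),
            arguments=[
                Term("o1", "icl>domain1", {"a2": "v2"}),
                Term("o2", "icl>domain2", {"a3": "v3"}),
            ],
            attributes={"a4": "v4"},
        )
    ],
    attributes={"a14": "v14"},
)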
Best regards,
Adam
[1] https://en.wikipedia.org/wiki/Document_Object_Model
*From: *Tiago Timponi Torrent tiago.torrent@ufjf.edu.br *Sent: *Sunday, July 5, 2020 9:06 PM *To: *General public mailing list for the discussion of Abstract Wikipedia (aka Wikilambda) abstract-wikipedia@lists.wikimedia.org *Subject: *Re: [Abstract-wikipedia] NLP issues severely overlooked (Amir E. Aharoni)
That's a good idea, but I think you would need more than that. Take FrameNet, for example, but now starting from verbs instead of nouns. FrameNet has a very detailed model for dealing with verbs, their semantic arguments and the way they surface in morphosyntax. Nonetheless, to apply such a model in a text comprehension and/or generation task, you need more than that. You need to know prototypical fillers for the positions, which, in turn, are associated with other frames and, therefore, participate in other clusters of the network of frames. Moreover, you'd want those prototypical fillers to function as starting points for analogical extensions in the model, since not every sentence is a prototypical combination of words. In other words, the collection of attributes and relations you refer to should be defined in a way that they can be analogically extended to other collections not originally assigned to the item you're looking at.
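Just to make that point more concrete, here is a toy sketch (Python, with invented names; this is not FrameNet's actual data model) of frame elements carrying prototypical fillers:

# Toy sketch only: a frame whose frame elements carry prototypical fillers.
# Names are invented; this is not FrameNet's actual data model, and the
# Wikidata ids are just examples.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class FrameElement:
    name: str
    prototypical_fillers: List[str] = field(default_factory=list)  # e.g. QIDs

@dataclass
class Frame:
    name: str
    elements: Dict[str, FrameElement] = field(default_factory=dict)
    related_frames: List[str] = field(default_factory=list)

commerce_buy = Frame(
    name="Commerce_buy",
    elements={
        "Buyer": FrameElement("Buyer", prototypical_fillers=["Q5"]),         # human
        "Goods": FrameElement("Goods", prototypical_fillers=["Q2424752"]),   # product (example id)
    },
    related_frames=["Commerce_sell", "Getting"],
)

# A comprehension or generation component could use the prototypical fillers
# as anchors for analogical extension to less prototypical sentences.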
Cheers
Tiago
On Sun, Jul 5, 2020 at 20:03, Arthur Smith <arthurpsmith@gmail.com> wrote:
Yes, thank you for the UNL background, that is extremely helpful. I've been reading some of the articles Louis provided as references, and it seems to me, from this perhaps naive point of view, that a lot of the complexity is associated with disambiguation of meaning - for nouns I think Wikidata items (and their relations to lexeme senses) solve that problem, but I think we are still missing a lot of the detail needed to do the same with adjectives and verbs (at least). So there is definitely some room for finding better ways to model - but maybe Wikidata could be expanded to handle the adjective/verb cases too. In general the concept of a single meaning associated with a Wikidata item as its identifier and a collection of attributes and relationships attached to that item is a powerful one that could resolve many such issues.
Arthur
On Sun, Jul 5, 2020 at 6:55 PM Adam Sobieski <adamsobieski@hotmail.com> wrote:
Louis,
Thank you for the information about the Universal Networking Language [1] and the World Atlas of Language Structures [2].
Semantic Modeling
Do you opine that adding attributes to objects, relations and expressions enhances expressiveness for various features of natural language?
r.@a1.@a2(o1(icl>domain1).@a3.@a4, o2(icl>domain2).@a5.@a6).@a7.@a8
I wonder whether there exist mappings or workarounds with which to obtain such expressiveness for models such as Wikidata’s.
Scripting Environments for Natural Language Generation
Supposing that Wikilambda could be JavaScript / WebAssembly based, and observing that Lua / WebAssembly solutions exist, we can note that scripting engines such as V8 are easy to use and easy to add global objects and APIs to. Resembling how Web browsers provide scripting environments and APIs for functions, we can envision providing scripting environments and APIs for natural language generation functions.
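As a rough illustration of that idea (with Python standing in for whatever engine would actually be used, and an invented API; a real deployment would of course need proper sandboxing):

# Rough illustration only: inject a natural language generation API object
# into a contributor's scripting environment. Python stands in for
# JavaScript/Lua here, and the API names are invented.
class NLGApi:
    """Host-provided API object exposed to contributor scripts."""
    def __init__(self, lexicon):
        self.lexicon = lexicon

    def label(self, item, language):
        return self.lexicon[(item, language)]

    def emit(self, text):
        print(text)

USER_SCRIPT = """
name = nlg.label("Q142", "en")
nlg.emit(name + " is a country.")
"""

lexicon = {("Q142", "en"): "France"}
# The host decides exactly which globals the script can see; note that an
# empty __builtins__ is not a real sandbox, it only illustrates the idea.
exec(USER_SCRIPT, {"nlg": NLGApi(lexicon), "__builtins__": {}})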
I wonder what you might think about scripting environments and API for natural language generation scenarios?
Best regards,
Adam
[1] https://en.wikipedia.org/wiki/Universal_Networking_Language
[2] https://wals.info/
Today's Topics:
- Re: NLP issues severely overlooked (Charles Matthews)
- Use case: generation of short description (Jakob Voß)
- Re: NLP issues severely overlooked (Amir E. Aharoni)
Message: 1 Date: Sat, 4 Jul 2020 14:05:09 +0100 (BST) From: Charles Matthews <charles.r.matthews@ntlworld.com> To: "General public mailing list for the discussion of Abstract Wikipedia (aka Wikilambda)" <abstract-wikipedia@lists.wikimedia.org> Subject: Re: [Abstract-wikipedia] NLP issues severely overlooked
It is interesting to be on a list where one can hear about software issues, and then computational linguistic problems. I'm not an expert in either area.
I do have 17 years of varied Wikimedia experience (and I use my real name there).
> On 04 July 2020 at 12:25 Louis Lecailliez <louis.lecailliez@outlook.fr> wrote:
> <snip>
> Nothing precise is said about linguistic resources in the AW paper except for "These function finally can call the lexicographic knowlegde stored in Wikidata.", which is not very convincing: first because the Wiktionary projects themselves severely lack content and structure, for those which have some content at all; secondly because specialized NLP resources are missing there too (note: I'm not familiar with Wikidata so I could be wrong, however I never saw it cited for the kind of NLP resources I'm talking about).
I can comment about this. Besides Wiktionary, there is the "lexeme" namespace of Wikidata. It is a relatively new part of Wikidata, dealing with verbal forms.
> To finish on a positive note, I would like to highlight the points I really like in the paper. First, its collaborative and open nature, like all Wikimedia projects, gives it a much higher chance of success than its predecessors.
It is worth saying, for context, that there is a certain style or philosophy coming from the wiki side: more precisely, from the wikis before Wikipedia. There is the slogan "what is the simplest thing that would actually work?" You might argue, plausibly, that Wikipedia, at nearly 20 years old, shows that there is a bit more to engineering than that.
On the other hand, looking at Wikidata at seven years old, there is some point to the comment. It has a rather simple approach to linked structured data, compared to the Semantic Web environment. (Really, just write a very large piece of JSON and try to cope with it!) But the number of binary relations used (8K, if you count the "external links" handling) is now quite large, and has grown organically. The data modelling is in a sense primitive, sometimes non-existent. But the range of content handled really is encyclopedic. And in an area like scientific bibliography, at a scale of tens of millions of entities, the advantages of not much ontological fussiness begin to be seen.
Wikidata started as an index of all Wikipedia articles, and is now five times the size needed for that: a very enriched "index".
I suppose the NLP required to code up, for example, 50K chemistry articles about molecules, might be a problem that could be solved, leaving aside the general problems for the moment.
In any case, there is a certain approach, neither academic nor commercial, that comes with Wikimedia and its communities, and it will be interesting to see how the issues are addressed.
Charles Matthews (in Cambridge UK)
Actually, Ontolex keeps being developed. I stumbled upon the latest development, Lexicog (https://www.w3.org/2019/09/lexicog/), this week, which tries to make the model more applicable to human-targeted dictionaries. This iteration included Kernerman as a co-author, so the spec likely takes industry needs into account more than previous ones did. Representing dictionaries as graphs and using them is something I have been working on for a few years, so I keep an eye on what's going on there.
That being said, I don't think Ontolex fits the bill for the kind of details we need in this project. Moreover, it doesn't take into account logographic writing systems, where the reading of a sequence of logograms has an impact on the meaning that is expressed.
Best regards, Louis Lecailliez
------------------------------
Message: 2 Date: Sat, 4 Jul 2020 08:18:56 +0200 From: Jakob Voß <jakob.voss@gbv.de> To: <abstract-wikipedia@lists.wikimedia.org> Subject: [Abstract-wikipedia] Use case: generation of short description
Hi,
I want to auto-generate disambiguation descriptions for African politicians to be added to Wikidata; e.g. from the country Mozambique (Q1029), the following descriptions should be generated:
- Mozambican politician (en)
- Mosambikanischer Politiker (de)
- politico mozambicano (it)
- ...
This could be extended to other professions. My questions:
- Can anyone point me to data sources where to best look up country adjectives such as "Mozambican"?
- Where/how to best store the lexical information for reuse with other renderers?
- If I create small renderers for these short descriptions, what architecture do you prefer for best reuse?
My just-get-it-done solution would be a set of CSV files and a few lines of Perl code, but maybe this use case can be aligned with Abstract Wikipedia to learn more about it.
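For illustration, here is a rough sketch of that just-get-it-done approach (in Python rather than Perl; the CSV file name, its column layout, and the adjective and label data are made-up placeholders, not an existing dataset):

import csv

# Hypothetical input file: one row per (country QID, language code, country adjective),
# e.g. "Q1029,en,Mozambican", "Q1029,de,Mosambikanischer", "Q1029,it,mozambicano".
ADJECTIVES_CSV = "country_adjectives.csv"

# Hypothetical per-language profession labels; a real run would look these up in Wikidata.
# Q82955 is used here as the item for "politician".
PROFESSION_LABELS = {
    "Q82955": {"en": "politician", "de": "Politiker", "it": "politico"},
}

# Per-language word order for the "country adjective + profession" pattern.
TEMPLATES = {
    "en": "{adj} {profession}",
    "de": "{adj} {profession}",
    "it": "{profession} {adj}",  # Italian puts the adjective after the noun here
}

def load_adjectives(path):
    # Returns {country_qid: {lang: adjective}} read from the CSV.
    table = {}
    with open(path, newline="", encoding="utf-8") as f:
        for country, lang, adjective in csv.reader(f):
            table.setdefault(country, {})[lang] = adjective
    return table

def describe(country_qid, profession_qid, lang, adjectives):
    adj = adjectives[country_qid][lang]
    profession = PROFESSION_LABELS[profession_qid][lang]
    return TEMPLATES[lang].format(adj=adj, profession=profession)

adjectives = load_adjectives(ADJECTIVES_CSV)
for lang in ("en", "de", "it"):
    print(lang, describe("Q1029", "Q82955", lang, adjectives))

Even this toy version immediately runs into the agreement questions discussed elsewhere in this thread (German case endings and capitalization, gender in Italian and German), which is exactly why the question of where to store the lexical information matters.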
Looking forward to collaborating, Jakob
------------------------------
Message: 3 Date: Sat, 4 Jul 2020 18:03:24 +0300 From: "Amir E. Aharoni" <amir.aharoni@mail.huji.ac.il> To: "General public mailing list for the discussion of Abstract Wikipedia (aka Wikilambda)" <abstract-wikipedia@lists.wikimedia.org> Subject: Re: [Abstract-wikipedia] NLP issues severely overlooked Message-ID: <CACtNa8t6kbWe21C980h1MxiWNfUp+0eDE82vPMjDUX2UCgb2gw@mail.gmail.com> Content-Type: text/plain; charset="utf-8"
Hi,
Thanks a lot for the sources. I am not one of the people implementing Wikilambda, but I am just very curious about it as a member of the wider Wikimedia community. But there's a good chance that they will be useful to people who do work on the implementation.
I will dare to add a little thought I have about it, however. It's possible that the challenge of building a well-functioning natural language generator is underestimated by the founders, and that they don't pay enough attention to existing work (although, knowing Denny, there is a good chance that he actually is aware of at least some of it). But there is something that the wide Wikimedia community has that I'm not sure that the past projects in this field did: The community itself. A big, worldwide, and diverse group of passionate volunteers, who love the idea of spreading free knowledge and who love their languages. Quite a lot of them also know some programming, and in the past they proved unbelievably creative and productive when writing code for Wikimedia projects as a community, in the form of templates, modules, gadgets, bots, extensions, and other tools. I'm quite sure that once the new tools become usable, this community will start doing creative things again, and it will also start reporting bugs and limitations.
So yes, while it's possible that along the way both the core developers and the volunteer community will find all kinds of stumbling blocks, I'm pretty sure that they will also have all kinds of surprising success stories. It's also possible that the mere *finding* of these stumbling blocks by such a big, diverse, open, and active community, will itself be a contribution to the scientific knowledge around this subject. And don't underestimate the "open" part—that's where we really shine. This won't be a theoretical work in a lab, published in a paywalled and copyright-restricted academic journal, but fully optimized for accessibility to everyone.
Yes, this whole email from me is incredibly naïve, but it's the same attitude that got us to writing the biggest and most multilingual encyclopedia in history, so maybe we can do something cool again :)
-- Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי http://aharoni.wordpress.com “We're living in pieces, I want to live in peace.” – T. Moore
On Saturday, 4 July 2020 at 14:26, Louis Lecailliez <louis.lecailliez@outlook.fr> wrote:
Hello,
my name is Louis Lecailliez, a PhD student in educational technology at Kyoto University. I'm a Computer Science and NLP graduate. One thing I do is work on modelling language learners' knowledge as graphs.
The Abstract Wikipedia project is really interesting. There are, however, two very concerning issues I spotted when reading the associated paper draft (https://arxiv.org/abs/2004.04733). The following email could be read as negative, but please don't take it as such: my purpose is to avoid wasting people's effort and money on things that can (and need to!) be fixed upfront.
- Issues with NLP
The main issue is that the difficulty of the NLP task of generating natural text from an abstract representation is severely overlooked. This stems from the other main problem: the paper is not based on the decades of previous work in that space.
As I understand it, the main value proposition of Abstract Wikipedia (AW) is a computer representation of encyclopedic knowledge that can be projected into different existing natural languages, with the goal of supporting a huge number of them. Plus, an editor to make this happen easily.
This is in fact surprisingly close to what the Universal Networking Language (UNL) project, which started 20 years ago, aims to do. UNL provides a language-agnostic representation of text based on hypergraphs. Software (called an EnConverter) produces UNL graphs from natural text in a given language. Another kind of software, called a DeConverter, does the reverse, that is, producing natural text from the abstract representation. This is exactly the function of the "renderers" in the AW paper. The way of doing it is also similar: by applying successive transformations until the final text string is produced. In general, that kind of abstract meaning representation is called an interlingua, and is widely used in Machine Translation (MT) systems.
Disregarding two decades of work on the same problem space in the UNL case (rule-based machine translation, here from an abstract language as the fixed source language), which was itself based on a few more decades of work, does not seem a wise way to start a new project. For a start, the graph representation used in AW will likely not be expressive enough to encode linguistic knowledge; this is why UNL uses hypergraphs instead of graphs.
The problem is glaring when looking at the reference list: it is bloated with irrelevant references (such as those to programming languages [27, 37, 41, 77], Turing completeness being the worst offender [11, 17, 23, ...]) while containing only two references [7, 85] to the really hard part of the project: generating natural language from the abstract representation. There are a few more relevant references about natural language generation, but this isn't enough.
Interestingly, [85] is a UNL paper, but not the main one. Moreover, it is cited in Section 9, "Opening future research". It should instead be placed in a "Previous work" section.
-- Tiago Timponi Torrent PPG-Linguística - FrameNet Brasil Universidade Federal de Juiz de Fora http://tiagotorrent.com
It does, indeed, but it is not getting that much attention at LREC, for instance... and sure, it does not seem to be of great help in this project. Totally agree.
On Wed, 15 Jul 2020 at 12:17, Louis Lecailliez <louis.lecailliez@outlook.fr> wrote:
Actually, Ontolex keeps being developed. I stumbled upon the latest development, Lexicog (https://www.w3.org/2019/09/lexicog/), this week, which tries to make the model more applicable to human-targeted dictionaries. This iteration includes Kernerman as a co-author, so the spec likely takes industry needs into account more than previous ones did. Representing dictionaries as graphs and using them is something I have been working on for a few years, so I keep an eye on what's going on there.
That being said, I don't think Ontolex fits the bill for the kind of detail we need in this project. Moreover, it doesn't take into account logographic writing systems, where the reading of a sequence of logograms has an impact on the meaning that is expressed.
Best regards, Louis Lecailliez
*From: *Abstract-Wikipedia <abstract-wikipedia-bounces@lists.wikimedia.org> on behalf of Tiago Timponi Torrent <tiago.torrent@ufjf.edu.br> *Sent: *Tuesday 14 July 2020 23:35
*To: *General public mailing list for the discussion of Abstract Wikipedia (aka Wikilambda) <abstract-wikipedia@lists.wikimedia.org> *Subject: *Re: [Abstract-wikipedia] NLP issues severely overlooked (Amir E. Aharoni)
Yep. Some applications use them. Back in the early 2000s, there was a big trend of investigating the interface between ontologies and the lexicon (ontolex). I'd say that most recent NLG systems focus on common-sense knowledge (KGs and the like); nonetheless, the key issue of the ontolex problem still remains: language is not only about expressing facts, it's about how you construe them.
Cheers
Tiago
On Tue, 14 Jul 2020 at 16:36, Mike Bennett <mbennett@hypercube.co.uk> wrote:
Quick side question: is there a role for formal ontology (FOL, DL or CL type of thing) in computational linguistics?
Mike
On 7/9/2020 8:22 AM, Louis Lecailliez wrote:
Hi Denny,
yes, the main problem of most of the systems presented in research papers (UNL or not) is that they are locked inside the institutions that made them. A lot of UNL webpages have gone down since the last time I checked (recently), and the system was in fact designed so it could work over the web without letting third parties access the code and data. This is of course the exact reverse of the technical and philosophical approach taken here, and very sad, as decades of accumulated knowledge are lost; the papers are far from sufficient to re-create even a fraction of the said systems.
There is also, I guess, a lot of interesting work that has not been translated into English at all (notably in linguistics), as making an academic career in the national language was an option in a lot of places until very recently.
So, would you be willing to work on that?
Yes, of course, I wouldn't have posted to the mailing list otherwise. I like the dual, concurrent linguistics/theory approach you are proposing. Note though that I'm not an expert by any means in natural language generation; it just happens that I stumbled upon UNL recently and it has too much in common with this project on the abstract representation/NLG side not to mention it. I also had some researchers' names in mind, as I have met some who worked on the referenced works.
Concerning the paper authorship, I understand your stance, and yes, I'm willing to work more and write about previous work with those interested. Just to have an idea, what is the expected timeframe for a revision?
Lexicographic data in Wikidata totally flew under my radar. This is indeed something that will be needed in the future, and an area where I can directly contribute too! As mentioned in [1], the license seems to be an issue, notably for importing existing resources; is there any “fix” planned for that?
All in all, I'm very pleased to see that a lot of aspects are better planned out than I assumed from reading the paper alone, and I'm more confident in its success now.
Best regards, Louis Lecailliez
[1] http://www2.imm.dtu.dk/pubdb/edoc/imm7154.pdf
*From: *Abstract-Wikipedia <abstract-wikipedia-bounces@lists.wikimedia.org> on behalf of Denny Vrandečić <dvrandecic@wikimedia.org> *Sent: *Wednesday 8 July 2020 22:37 *To: *General public mailing list for the discussion of Abstract Wikipedia (aka Wikilambda) <abstract-wikipedia@lists.wikimedia.org> *Subject: *Re: [Abstract-wikipedia] NLP issues severely overlooked (Amir E. Aharoni)
Hi Louis, all,
Louis, thanks for raising that important issue!
I have been looking into a number of related NLG systems, and one thing I noticed is a pattern of many of these projects being developed very much in isolation from each other, and also often without much concern for ongoing linguistic research. That is what I tried to capture in the research paper by stating that there is no consensus on this, and that it seems too early to commit to a specific solution.
I had given UNL a quick look, but the project looked pretty stale to me - I could not see any activity in the last decade. Furthermore, the page didn't provide access to the source code and instead mentioned that part of the technology is under patents, which is quite a red flag for me, and I usually don't look into something like that any further, in order to honestly be able to say that I didn't get any ideas from those patents. If I am mistaken, and there is a freely usable write-up or implementation, I'd be happy to come back and read up more.
Thank you for the annotated bibliography! That is super useful.
But I did look in detail into a (small) number of other, similar systems, such as Grammatical Framework or KPML. Tiago mentioned FrameNet, and I learned a lot about that too. To get an overview of the whole field has been a rather frustrating experience, especially since the major textbook in that area - Dale & Reiter - doesn't cover these systems, nor does the 2018 update to that book by Gatt & Krahmer, and it seems that research work in that area often omits these practical systems. Accordingly, when I talk with the professors and researchers in this area, also about the proposal here, they are more focussed on specific issues, and don't know that much about the concrete systems (which is understandable - the flow from research to practical systems is a more established flow in many areas). Never mind that when you get to the linguistic side of it, instead of the computer science part, there are even more competing theories, many of which are aimed toward much more encompassing goals and are about covering the whole of language and natural language understanding, which we want to shy away from.
The goal of the paper was never meant to be a comprehensive account of the state of the art in natural language generation. That's what Dale & Reiter and Gatt & Krahmer have aimed for, and their works are hundreds of pages. I had the feeling my paper was already too long, and putting in an overview of the state of the art would have made it at least double the length.
So, given that (and other reasons, as laid out in the paper), a system which could support any of these approaches seemed a more promising way forward. So far, for my own prototype, I have been mostly following Grammatical Framework (because it has a very accessible book, the software is free, the community was friendly, etc.), and it worked well enough to leave me convinced that the whole thing is worth trying out. But I don't know whether that's the best approach.
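For readers who have not looked at Grammatical Framework, a very small sketch of the general idea it is built around, the split between one shared abstract construction and per-language linearization rules, might help. This is plain Python pretending to be GF, not Denny's prototype and not real GF code; every lexicon entry and function name below is invented for illustration:

from dataclasses import dataclass

# One shared, language-independent abstract construction: "X is a Y".
@dataclass
class IsA:
    subject: str
    category: str

# Per-language lexicons carrying the language-specific facts this thread keeps
# coming back to: French needs grammatical gender, Chinese needs a classifier.
LEXICON = {
    "en": {"france": "France", "country": "country"},
    "fr": {"france": ("France", "f"), "country": ("pays", "m")},
    "zh": {"france": "法國", "country": ("國家", "個")},
}

def linearize_en(expr):
    lex = LEXICON["en"]
    return f"{lex[expr.subject]} is a {lex[expr.category]}."

def linearize_fr(expr):
    lex = LEXICON["fr"]
    subject, subject_gender = lex[expr.subject]
    noun, noun_gender = lex[expr.category]
    subject_article = "Le" if subject_gender == "m" else "La"
    noun_article = "un" if noun_gender == "m" else "une"
    # Both article choices depend on gender facts absent from the abstract construction.
    return f"{subject_article} {subject} est {noun_article} {noun}."

def linearize_zh(expr):
    lex = LEXICON["zh"]
    noun, classifier = lex[expr.category]
    # "One" plus the noun's classifier are inserted before the noun.
    return f"{lex[expr.subject]}是一{classifier}{noun}。"

fact = IsA("france", "country")
for linearize in (linearize_en, linearize_fr, linearize_zh):
    print(linearize(fact))

The point is only that the abstract layer stays constant while everything language-specific lives in the concrete layer; GF does this with typed abstract and concrete grammars rather than ad hoc functions like these.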
As mentioned by Chris Cooley, the goal will be to create a new wiki, a library of functions, that can support any of these approaches. My dream would be - and I see that Chris had already suggested that - that experts like you and your colleagues create an overview of the state of the art that will be accessible to the community and that will allow us to make a well-informed decision when the time comes as to which path to explore first. In a parallel track, we will be creating the function wiki, and then, when the time is ripe we can bring these two strands of work together. So, would you be willing to work on that?
How does this sound for a plan?
Some further points:
This is way easier to implement, test and deliver than to implement 10
different backends with various progress in implementation, incompatibilities and runtime characteristics.
Regarding your point about evaluation environments: I agree, it would be a huge task if the WMF core team were to develop all these different environments. But that's not the plan. The goal is really that *others* will hopefully build these :) All we need to do is to make sure that's possible and encouraged and simple enough. But yeah, not the core team.
The paper presents AW as sitting on top of WL. Both are big projects.
Sitting a big project on top of another one is really risky, as it means a significant milestone must first be reached in the dependency (here WL), which would likely take some years, before even starting the work on the other project. Yes, that's correct. That is exactly the time that allows us to do the appropriate state-of-the-art analysis. I hope it won't take us years, but that we will be faster.
AW can be realised with current tools and engineering practices.
Only if you commit to a specific implementation, which I am hesitant to do.
[English is an obstacle to programming] This strong affirmation needs
to be sourced.
https://dl.acm.org/doi/10.1145/3051457.3051464
As I spent a significant time (~10 hours) gathering references and writing this email (which is 5 pages long in Word), I would like to be mentioned as co-author in the final paper if any ideas or references presented here are used in it. Thank you for your detailed comments, which will certainly improve the second version of the paper. I am happy to mention you in the acknowledgments. For co-authorship, I usually go for a more substantial engagement ;) If you're willing to write up the "Previous work" section along the lines you mentioned above (maybe with Tiago? Maybe with others to join?), but for a comprehensive overview of existing systems, then I am open to talk about co-authorship :)
For French, the gender of every noun entity *must* be present ... For
Chinese and Japanese, classifier information must be present for all noun, in case one must be enumerated. That's exactly the goal of the lexicographic project on Wikidata, as was pointed out: https://www.wikidata.org/wiki/Lexeme:L12449
You'll find plenty of Lexemes with their classifiers, forms, etc. The lexicographic project was started with the Abstract Wikipedia in mind, knowing that exactly that will be needed.
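For anyone who wants to look at that lexeme data programmatically, a small sketch follows (Python; it assumes the usual Special:EntityData JSON layout for lexemes, with lemmas, forms and grammaticalFeatures, so adjust it if that layout differs):

import json
import urllib.request

LEXEME_ID = "L12449"  # the lexeme cited above
URL = f"https://www.wikidata.org/wiki/Special:EntityData/{LEXEME_ID}.json"

request = urllib.request.Request(URL, headers={"User-Agent": "lexeme-sketch/0.1 (example)"})
with urllib.request.urlopen(request) as response:
    entity = json.load(response)["entities"][LEXEME_ID]

lemmas = ", ".join(lemma["value"] for lemma in entity["lemmas"].values())
print(f"{LEXEME_ID}: {lemmas} (lexical category {entity['lexicalCategory']})")

# Each form lists its representations plus grammatical features as Q-ids
# (gender, number, case, ...); this is the kind of per-word data a renderer would consult.
for form in entity.get("forms", []):
    representations = ", ".join(rep["value"] for rep in form["representations"].values())
    features = ", ".join(form.get("grammaticalFeatures", []))
    print(f"  {form['id']}: {representations} [{features}]")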
Yet, the use of any existing formalism is dismissed, which means all the situations I illustrated in this email will need to be dealt with in an ad hoc fashion. No, not at all; it doesn't have to be ad hoc. That's exactly what we can start working on now, long before we get to the point where we need to make that ad hoc decision. I hope you'll join us to figure out the best way!
Thanks to Charles, Amir, Tiago, Christopher, Arthur, and Adam for your beautiful answers, which raised a number of great points much better than I ever could have. And thanks to Louis for starting this more than interesting thread! Let's continue in this vein!
Cheers, Denny
On Sun, Jul 5, 2020 at 9:49 PM Adam Sobieski adamsobieski@hotmail.com wrote:
Brainstorming: resembling what the document object model (DOM) [1] is for XML and attributed trees, perhaps we could create and specify an object model for sets of attributed predicate calculus expressions.
With an attributed predicate calculus object model (e.g. “APCOM”) for sets of attributed predicate calculus expressions:
{
r1.@a1(o1(icl>domain1).@a2, o2(icl>domain2).@a3).@a4
r2.@a5(o3(icl>domain3).@a6, o4(icl>domain4).@a7).@a8
r3.@a9(o5(icl>domain5).@a10, o6(icl>domain6).@a11, o7(icl>domain7).@a12).@a13
}.@a14
developers could more conveniently utilize sets of attributed predicate calculus expressions from JavaScript and Lua.
Drawing from XML, we can consider that objects, relations, and attributes could be uniform resource identifiers (URIs) instead of plain text strings. “r1” could be a URI, “a1” could be a URI, “o1” could be a URI, and so forth.
We can also consider that the attributes in a model could have values:
{
r1.[@a1=v1](o1(icl>domain1).[@a2=v2], o2(icl>domain2).[@a3=v3]).[@a4=v4]
r2.[@a5=v5](o3(icl>domain3).[@a6=v6], o4(icl>domain4).[@a7=v7]).[@a8=v8]
r3.[@a9=v9](o5(icl>domain5).[@a10=v10], o6(icl>domain6).[@a11=v11], o7(icl>domain7).[@a12=v12]).[@a13=v13]
}.[@a14=v14]
We can consider creating a scripting API (e.g. “APCOM”) for a semantic model to convenience developers. We can also consider adding attribute-value pairs to a semantic model.
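Purely as a thought experiment, the shape of such an object model could look roughly like the plain data structures below (Python stand-ins for the JavaScript/Lua API being imagined here; every class and field name is invented, and relation-level and expression-level attributes are kept separate to mirror the notation above):

from dataclasses import dataclass, field
from typing import Dict, List, Union

Attrs = Dict[str, str]  # attribute name -> value; names could equally be URIs

@dataclass
class Obj:
    name: str                 # e.g. "o1", or a URI
    restriction: str = ""     # e.g. "icl>domain1"
    attrs: Attrs = field(default_factory=dict)

@dataclass
class Expression:
    relation: str             # e.g. "r1", or a URI
    arguments: List[Union["Expression", Obj]]
    relation_attrs: Attrs = field(default_factory=dict)  # attributes on the relation itself
    attrs: Attrs = field(default_factory=dict)            # attributes on the whole expression

@dataclass
class ExpressionSet:
    expressions: List[Expression]
    attrs: Attrs = field(default_factory=dict)

# r1.[@a1=v1](o1(icl>domain1).[@a2=v2], o2(icl>domain2).[@a3=v3]).[@a4=v4], in a set with @a14=v14
example = ExpressionSet(
    expressions=[
        Expression(
            relation="r1",
            arguments=[
                Obj("o1", "icl>domain1", {"a2": "v2"}),
                Obj("o2", "icl>domain2", {"a3": "v3"}),
            ],
            relation_attrs={"a1": "v1"},
            attrs={"a4": "v4"},
        )
    ],
    attrs={"a14": "v14"},
)
print(example)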
Best regards,
Adam
[1] https://en.wikipedia.org/wiki/Document_Object_Model
*From: *Tiago Timponi Torrent tiago.torrent@ufjf.edu.br *Sent: *Sunday, July 5, 2020 9:06 PM *To: *General public mailing list for the discussion of Abstract Wikipedia (aka Wikilambda) abstract-wikipedia@lists.wikimedia.org *Subject: *Re: [Abstract-wikipedia] NLP issues severely overlooked (Amir E. Aharoni)
That’s a good idea, but I think you would need more than that. Take FrameNet, for example, but now starting from verbs instead of nouns. FrameNet has a very detailed model for dealing with verbs, their semantic arguments and the way they surface in morphosyntax. Nonetheless, to apply such a model in a text comprehension and/or generation task, you need more than that. You need to know prototypical fillers for the positions, which, in turn, are associated with other frames and, therefore, participate in other clusters of the network of frames. Moreover, you’d want those prototypical fillers to function as starting points for analogical extensions in the model, since not every sentence is a prototypical combination of words. In other words, the collection of attributes and relations you refer to should be defined in a way that they can be analogically extended to other collections not originally assigned to the item you’re looking at.
Cheers
Tiago
On Sun, 5 Jul 2020 at 20:03, Arthur Smith <arthurpsmith@gmail.com> wrote:
Yes, thank you for the UNL background, that is extremely helpful. I've been reading some of the articles Louis provided as references, and it seems to me, from this perhaps naive point of view, that a lot of the complexity is associated with disambiguation of meaning - for nouns I think Wikidata items (and their relations to lexeme senses) solve that problem, but I think we are still missing a lot of the detail needed to do the same with adjectives and verbs (at least). So there is definitely some room for finding better ways to model - but maybe Wikidata could be expanded to handle the adjective/verb cases too. In general, the concept of a single meaning associated with a Wikidata item as its identifier, with a collection of attributes and relationships attached to that item, is a powerful one that could resolve many such issues.
Arthur
On Sun, Jul 5, 2020 at 6:55 PM Adam Sobieski adamsobieski@hotmail.com wrote:
Louis,
Thank you for the information about the Universal Networking Language [1] and the World Atlas of Language Structures [2].
Semantic Modeling
Do you opine that adding attributes to objects, relations and expressions enhances expressiveness for various features of natural language?
r.@a1.@a2(o1(icl>domain1).@a3.@a4, o2(icl>domain2).@a5.@a6).@a7.@a8
I wonder whether there exist mappings or workarounds with which to obtain such expressiveness for models such as Wikidata’s.
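One hedged guess at what such a mapping could look like: Wikidata statements can carry qualifiers, so an attributed relation might conceivably be flattened into a statement-plus-qualifiers record along the lines sketched below (the property names are placeholders, not real Wikidata properties, and this is only an illustration of the idea, not a claim that it is adequate):

# Flattening a simplified attributed relation r.@a1(o1.@a2, o2.@a3).@a4
# into one Wikidata-style "statement with qualifiers" record.
# "P_rel", "P_a1", ... are placeholder property ids, not real Wikidata properties.
statement = {
    "subject": "o1",
    "property": "P_rel",   # stands in for the relation r
    "value": "o2",
    "qualifiers": {
        "P_a1": "v1",      # was attached to the relation
        "P_a2": "v2",      # was attached to the first argument
        "P_a3": "v3",      # was attached to the second argument
        "P_a4": "v4",      # was attached to the whole expression
    },
}

# The obvious loss in this flattening is scope: qualifiers hang off the statement as
# a whole, so which element an attribute originally belonged to has to be re-encoded
# somehow (here, only in the property naming), which is why it may not be expressive enough.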
Scripting Environments for Natural Language Generation
Supposing that Wikilambda could be JavaScript / WebAssembly based, and observing that Lua / WebAssembly solutions exist, we can note that scripting engines such as V8 are easy to use and to add global objects and API to. Resembling how Web browsers provide scripting environments and API for functions, we can envision providing scripting environments and API for natural language generation functions.
I wonder what you might think about scripting environments and API for natural language generation scenarios?
Best regards,
Adam
[1] https://en.wikipedia.org/wiki/Universal_Networking_Language
*From: *Louis Lecailliez louis.lecailliez@outlook.fr *Sent: *Saturday, July 4, 2020 2:10 PM *To: *abstract-wikipedia@lists.wikimedia.org *Subject: *Re: [Abstract-wikipedia] NLP issues severely overlooked (Amir E. Aharoni)
Hi Amir,
I understand the process is different that usual research. In fact I've seen Wikipedia grown from an unknown website to the biggest encyclopedia it is now. I use it daily in multiple languages and love it. I know what crowd sourcing could achieve.
It's also possible that the mere *finding* of these stumbling blocks by
such a big, diverse, open, and active community, will itself be a contribution to the scientific knowledge around this subject.
I disagree here. It would be contribution to scientic knowledge if and only if it wasn't discovered before. My email was precisely about that: capitalizing on the knowledge that has already been discovered, to avoid making the same mistake them again. It would not matter for a small project, but this one is really ambitious. We are speaking of 40 years of work by a horde of talented and very knowledgeable people, so this isn't to be dismissed easily.
This thing is, my previous email was a bit abstract, because it were a review of the paper, not of the project itself. I should have made more examples to illustrate where the problem lies.
Let's start with a simple example, in English, with corresponding Wikidata entities in-between parenthesis. I'm also using pseudo-turtle notation with made up relationships.
France (Q142) is a country (Q6256).
<Q142> <rel_is> <Q6256> .
Creating the English sentence is straightforward with the naive approach presented in the paper.
What is the French equivalent?
La France est un pays.
More information is required in the abstract representation: the text generator needs to know about the gender of both nouns (France and pays). So we need to extend the model as such:
<Q142> <rel_gender> <Q1775415> .
<Q6256> <rel_gender> <Q499327> .
Fine! Now what about Chinese?
法國是一個國家。
What we have in the middle of the sentence is a classifier (個). The model needs the following update:
<Q499327> <rel_use_classifier> <Q63153> .
To handle these 3 languages, the model has already 3 additional triples just for accounting for linguistic facts occuring in these languages. Wikipedia exists in more than 300 languages, and the world has about 6000 of them, each of them having particularities that must be taken into account. Fortunately they recoup themselves in-between languages. Nonetheless the World Atlas Language Structures ( https://wals.info/chapter/s1) count 144 distinct language features. Some are related to speech, but this means there is probably something like a hundred features that must be taken into account in the data model to produce valid natural language sentence.
Note that in the Chinese example, there is also a number (一, one) showing up. This is a phenomenon that must be taken into account; and it's not always appearing when using 是 (to be). How complex the "lambda" system will be just to deal with this issue? Hint: very much. It also needs to be compatible with dozen of other phenomena.
Then each of those features require extensive and complete data. For French, the gender of every noun entity *must* be present, otherwise there is half a chance of producing a wrong sentence each time a noun entity is encountered. For Chinese and Japanese, classifier information must be present for all noun, in case one must be enumerated. Where does the project will get the data from? (we are speaking of millions of item, most not referenced in existing dictionaries) How will this be encoded? Those are real questions that must be answered.
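To make the bookkeeping concrete, here is a tiny sketch of how the per-entity data burden grows as languages are added. The relation names are the made-up ones from the pseudo-Turtle above, not real Wikidata properties; the classifier fact is attached directly to the noun item for simplicity, and the per-language requirements are deliberately simplified:

# Facts accumulated so far for the France example, loosely mirroring the triples above.
FACTS = {
    ("Q142", "rel_is"): "Q6256",               # the encyclopedic fact itself
    ("Q142", "rel_gender"): "Q1775415",        # added only so French can be rendered
    ("Q6256", "rel_gender"): "Q499327",        # added only so French can be rendered
    ("Q6256", "rel_use_classifier"): "Q63153", # added only so Chinese can be rendered
}

# Extra lexical facts each language's renderer needs about any noun entity it touches.
REQUIRED_PER_LANGUAGE = {
    "fr": ["rel_gender"],
    "zh": ["rel_use_classifier"],
    "ja": ["rel_use_classifier"],
}

def missing_facts(entity, languages, facts):
    # Lists the lexical facts still needed before `entity` can be rendered safely.
    return [(lang, relation)
            for lang in languages
            for relation in REQUIRED_PER_LANGUAGE[lang]
            if (entity, relation) not in facts]

print(missing_facts("Q6256", ["fr", "zh", "ja"], FACTS))        # [] - fully covered
print(missing_facts("Q_new_noun", ["fr", "zh", "ja"], FACTS))   # everything still missing

Multiply that second line by millions of items and a few hundred languages and you get the scale of the data acquisition problem being described here.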
Suppose now we have done the work for "renderers" in these three languages. They both use the more or less similar A X B structure where X is a verb meaning "to be".
What would be the Japanese equivalent?
The more natural structure would be something like:
フランスは国(だ)。
What is at play here is a topicalization (Q63105) of France, followed by a predicate (it's a country). This is very different from the previous structure and, not surprisingly, needs its own representation. To make the situation more difficult, the previous (A be B) structure also exists in Japanese, but would lead to a totally different sentence if used.
The paper states that Figures 1 and 2 are examples that will be more complex in real life. Yet, the use of any existing formalism is dismissed, which means all the situations I illustrated in this email will need to be dealt with in an ad hoc fashion. Moreover, changing formalism (be it ad hoc or not) will require changing every piece of code/data using it. This will happen every time a language with unsupported feature(s) is added to the project. It's not hard to see how this will waste a huge amount of time and goodwill from the people involved. The very code-focussed tone of the paper, the English-centric approach used in the examples and the lack of references show that the complexity of the task on the NLP front is not sufficiently conceptualized.
Best Regards,
Louis Lecailliez
I'm a bit late to the party, but as a Chinese and Japanese user I should point out that, where the meaning of the same logogram varies with pronunciation, the vast majority of cases are either:
(1) the different readings are in complementary distribution and can thus be inferred accurately from context, e.g. Mandarin 創意 chuàng yì (creativity) vs 創傷 chuāng shāng (trauma) e.g. Cantonese 畫畫 waahk wáa (draw[v.] picture[n.])
or: (2) the different readings reflect preferences that have minimal influence on meaning. e.g. Japanese 家 ie vs uchi (home, house). "ie" leans more towards "house" and "uchi" leans more towards "home", but in most contexts they are interchangeable.
Where both of the above were false, it is likely that native speakers would have innovated with the written form to disambiguate corresponding divergences in the spoken language. e.g. Middle Chinese 無 mju (doesn't exist [v.]) > Modern Cantonese 無 mòuh (negation prefix in compound words), 冇 móuh (doesn't have [v.])
More abstractly, we have a one-to-many mapping between the written word and the spoken word. Fundamentally, the question we need to ask ourselves is, which written or spoken dialect of a language are we aiming for? Once we make that decision, the answer to ideograms with multiple readings will fall into place.
In general, the opposite issue is usually more pertinent: words and phrases that are distinguished in writing but not when spoken out loud. For languages that have a millennia-long written tradition (e.g. Greek; any kind of Chinese) homophones are a much bigger issue, encouraging more compact use of the written language than the spoken language and creating challenges for algorithmic parsing.
--Deryck
On Wed, 15 Jul 2020 at 16:17, Louis Lecailliez louis.lecailliez@outlook.fr wrote:
Actually Ontolex keep being developped. I stumbled upon the lastest development Lexicog (https://www.w3.org/2019/09/lexicog/) this week, with tries to make the model more applicable to human-targeted dictionaries. This iteration included Kernerman as co-author, so the spec is likely taking the industry needs into account more than previous ones. Representing dictionaries as graphs and using them is sometimes I work on since a few years so I keep an eye on what's going on there.
That being said I don't think Ontolex fits the bills for the kind of details we need in this project. Morevoer it doesn't take in account logographic writing systems where the reading of a sequence of logograms has an impact on the meaning that is expressed.
Best regards, Louis Lecailliez
*De :* Abstract-Wikipedia abstract-wikipedia-bounces@lists.wikimedia.org de la part de Tiago Timponi Torrent tiago.torrent@ufjf.edu.br *Envoyé :* mardi 14 juillet 2020 23:35 *À :* General public mailing list for the discussion of Abstract Wikipedia (aka Wikilambda) abstract-wikipedia@lists.wikimedia.org *Objet :* Re: [Abstract-wikipedia] NLP issues severely overlooked (Amir E. Aharoni)
Yep. Some applications use them. Back in the early 2000, there was a big trend in investigating the interface between ontologies and the lexicon (ontolex). Nonetheless, I’d say that most recent NLG systems focus on common sense knowledge (KGs and the like), nonetheless the key issue of the ontolex problem still remains: Language is not only about expressing facts, it’s about how you construe them.
Cheers
Tiago
Em ter, 14 de jul de 2020 às 16:36, Mike Bennett mbennett@hypercube.co.uk escreveu:
Quick side question: is there a role for formal ontology (FOL, DL or CL type of thing) in computational linguistics?
Mike
On 7/9/2020 8:22 AM, Louis Lecailliez wrote:
Hi Denny,
yes, the main problem of most of the systems presented in research papers (UNL or not) is that they are locked in the institutions that made them. A lot of UNL webpages went down since last time I checked (recently), and the system was in fact designed in a way it could work over the web while not letting third-parties access code and data. This is of course the exact reverse of the technical and philosophical approach taken here, and very sad as decade of accumulated knowledge is lost; the papers are far from sufficient to re-create even of fraction of the said systems
There is also, I guess, a lot of interesting work that is not translated in English at all (notably in linguistics), as making an academic career in the national language was an option in a lot of places until very recently.
So, would you be willing to work on that?
Yes, of course, I wouldn't have posted in the mailing list otherwise. I like the dual, concurrent approach of linguistic/theory you are proposing. Note though that I'm not an expert be any mean in natural language generation, it just happens I stumbled upon UNL recently and it has too much in common on the abstract representation/NLG with this project not to mention it. I also had some researchers name in mind as I met some who worked on the referenced works.
Concerning the paper authorship, I understand your stance, and yes I'm willing to work more and write about previous works with those interested. Just to have an idea, what it is expected timeframe for a revision?
Lexicographic data in Wikidata totally flew under my radar. This is indeed something that will be needed in the future, and where I can directly contribute too! As mentioned by [1] the license seems to be an issue notably for importing existing resources, is there any “fix” planned for that?
All in all, I'm very pleased to see lot of aspects are more planned than it I assumed to be from reading the paper alone, and I’m more confident in the success now.
Best regards, Louis Lecailliez
[1] http://www2.imm.dtu.dk/pubdb/edoc/imm7154.pdf
*De :* Abstract-Wikipedia abstract-wikipedia-bounces@lists.wikimedia.org abstract-wikipedia-bounces@lists.wikimedia.org de la part de Denny Vrandečić dvrandecic@wikimedia.org dvrandecic@wikimedia.org *Envoyé :* mercredi 8 juillet 2020 22:37 *À :* General public mailing list for the discussion of Abstract Wikipedia (aka Wikilambda) abstract-wikipedia@lists.wikimedia.org abstract-wikipedia@lists.wikimedia.org *Objet :* Re: [Abstract-wikipedia] NLP issues severely overlooked (Amir E. Aharoni)
Hi Louis, all,
Louis, thanks for raising that important issue!
I have been looking in a number of related NLG systems, and one thing I noticed is a pattern of many of these projects being developed very much in isolation from each other, and also often without much concern for ongoing linguistic research. That is what I tried to capture in the research paper by stating that there is no consensus on this, and that it seems too early to commit to a specific solution.
I had given a quick look to UNL, but the project looked pretty stale to me
- I could not see any activity in the last decade. Furthermore, the page
didn't provide access to the source code and instead mentioned that part of the technology is under patents, which is quite a red flag for me, and I usually don't look into something like that any further, in order to honestly be able to say that I didn't get any ideas from those patents. If I am mistaken, and there is a freely usable write-up or implementation, I'd be happy to come back and read up more.
Thank you for the annotated bibliography! That is super useful.
But I did look into detail into a (small) number of other, similar systems, such as Grammatical Framework or KPML. Tiago mentioned FrameNet, and I learned a lot about that too. To get an overview of the whole field has been a rather frustrating experience, especially since the major textbook in that area - Dale & Reiter - doesn't cover these systems, nor the 2018 update to that book by Gatt & Krahmer, and it seems that research work in that area often omits these practical systems. Accordingly, when I talk with the professors and researchers in this area, also about the proposal here, they are more focussed on specific issues, and don't know that much about the concrete systems (which is understandable - the flow from research to practical systems is a more established flow in many areas). Never mind that when you get to the linguistic side of it, instead of the computer science part, there are even more competing theories, many of which are aimed toward much more encompassing goals and are about covering the whole of language and natural language understanding, which we want to be shying away from.
The goal of the paper was never meant to be a comprehensive account of the state of the art in natural language generation. That's what Dale & Reiter and Gatt & Krahmer have aimed for, and their works are hundreds of pages. I had the feeling my paper was already too long, and putting in an overview of the state of the art would have made it at least double the length.
So, given that (and other reasons, as lined out in the paper), it seems that a system which could support any of these approaches seemed a more promising way. So far, for my own prototype, I have been mostly following Grammatical Framework (because it has a very accessible book, the software is free, the community was friendly, etc.), and it worked good enough to leave me convinced that the whole thing is worth trying out. But I don't know whether that's the best approach.
As mentioned by Chris Cooley, the goal will be to create a new wiki, a library of functions, that can support any of these approaches. My dream would be - and I see that Chris had already suggested that - that experts like you and your colleagues create an overview of the state of the art that will be accessible to the community and that will allow us to make a well-informed decision when the time comes as to which path to explore first. In a parallel track, we will be creating the function wiki, and then, when the time is ripe we can bring these two strands of work together. So, would you be willing to work on that?
How does this sound for a plan?
Some further points:
This is way easier to implement, test and deliver than to implement 10
different backends with various progress in implementation, incompatibilities and runtime characteristics.
Regarding your point about evaluation environments: I agree, it would be a huge task if the WMF core team were to develop all these different environments. But that's not the plan. The goal is really that *others* will hopefully build these :) All we need to do is to make sure that's possible and encouraged and simple enough. But yeah, not the core team.
The paper presents AW as sitting on top on WL. Both are big projects.
Sitting a big project on top of another one is really risky, as it means a significant milestone must first be reached in the dependency (here WL), which would likely took some years, before even starting the work on the other project. Yes, that's correct. That is exactly the time that allows us do the appropriate state of the art analysis. I hope it won't take us years, but that we will be faster.
AW can be realised with current tools and engineering practices.
Only if you commit to a specific implementation, which I am hesitant to do.
[English is an obstacle to programming] This strong affirmation needs
to be sourced.
https://dl.acm.org/doi/10.1145/3051457.3051464
As I spend a significant time (~10 hours) gathering references and
writing this email (which is 5 pages long in Word), I would like to be mentioned as co-author in the final paper if any idea or references presented here is used in it. Thank you for your detailed comments, which will certainly improve the second version of the paper. I am happy to mention you in the acknowledgments. For co-authorship, I usually go for a more substantial engagement ;) If you're willing to write up the "Previous work" section along the lines you mentioned above (maybe with Tiago? Maybe with others to join?), but for a comprehensive overview of existing systems, then I am open to talk about co-authorship :)
For French, the gender of every noun entity *must* be present ... For
Chinese and Japanese, classifier information must be present for all noun, in case one must be enumerated. That's exactly the goal of the lexicographic project on Wikidata, as was pointed out: https://www.wikidata.org/wiki/Lexeme:L12449
You'll find plenty of Lexemes with their classifiers, forms, etc. The lexicographic project was started with the Abstract Wikipedia in mind, knowing that exactly that will be needed.
Yet, the use of any existing formalism is dismissed, which mean all
the situations I illustrated in this email will need to be dealt with in an ad hoc fashion. No, not at all it doesn't have to be ad-hoc, that's exactly what we can start working on now, long before we get to the point that we need to make that ad-hoc decision. I hope you'll join us to figure out the best way!
Thanks to Charles, Amir, Tiago, Christopher, Arthur, and Adam for your beautiful answers, who raised a number of great replies much better than I ever could. And thanks to Louis for starting this more than interesting thread! Let's continue in this vein!
Cheers, Denny
On Sun, Jul 5, 2020 at 9:49 PM Adam Sobieski adamsobieski@hotmail.com wrote:
Brainstorming: resembling what the document object model (DOM) [1] is for XML and attributed trees, perhaps we could create and specify an object model for sets of attributed predicate calculus expressions.
With an attributed predicate calculus object model (e.g. “APCOM”) for sets of attributed predicate calculus expressions:
{
r1.@a1(o1(icl>domain1).@a2, o2(icl>domain2).@a3).@a4
r2.@a5(o3(icl>domain3).@a6, o4(icl>domain4).@a7).@a8
r3.@a9(o5(icl>domain5).@a10, o6(icl>domain6).@a11, o7(icl>domain7).@a12).@a13
}.@a14
developers could more conveniently utilize sets of attributed predicate calculus expressions from JavaScript and Lua.
Drawing from XML, we can consider that objects, relations, attributes could be, instead of plain text strings, uniform resource identifiers (URI’s). “r1” could be a URI, “a1” could be a URI, “o1” could be a URI, and so forth.
We can also consider that the attributes in a model could have values:
{
r1.[@a1=v1](o1(icl>domain1).[@a2=v2], o2(icl>domain2).[@a3=v3]).[@a4=v4]
r2.[@a5=v5](o3(icl>domain3).[@a6=v6], o4(icl>domain4).[@a7=v7]).[@a8=v8]
r3.[@a9=v9](o5(icl>domain5).[@a10=v10], o6(icl>domain6).[@a11=v11], o7(icl>domain7).[@a12=v12]).[@a13=v13]
}.[@a14=v14]
We can consider creating a scripting API (e.g. “APCOM”) for a semantic model to convenience developers. We can also consider adding attribute-value pairs to a semantic model.
Best regards,
Adam
[1] https://en.wikipedia.org/wiki/Document_Object_Model
*From: *Tiago Timponi Torrent tiago.torrent@ufjf.edu.br *Sent: *Sunday, July 5, 2020 9:06 PM *To: *General public mailing list for the discussion of Abstract Wikipedia (aka Wikilambda) abstract-wikipedia@lists.wikimedia.org *Subject: *Re: [Abstract-wikipedia] NLP issues severely overlooked (Amir E. Aharoni)
That’s a good idea, but I think you would need more than that. Take FrameNet, for example, but now departing from verbs instead of nouns. FrameNet has a very detailed model for dealing with verbs, their semantic arguments and the way they surface in morphosyntax. Nonetheless, to apply such a model in a text comprehension and/or generation task, you need more than that. You need to know prototypical fillers for the positions, which, in turn, are associated to other frames and, therefore, participate in other clusters of the network of frames. Moreover, you’d want those prototypical fillers to function as departing points for analogical extensions in the model, since not every sentence is a prototypical combination of words. In other words, the collection of attributes and relations you refer to should be defined in a way that they can be analogically extended to other collections not originally assigned to the item you’re looking at.
Cheers
Tiago
Em dom, 5 de jul de 2020 às 20:03, Arthur Smith arthurpsmith@gmail.com escreveu:
Yes, thank you for the UNL background, that is extremely helpful. I've been reading some of the articles Louis provided as references, and it seems to me from just this perhaps naive point of view, that a lot of the complexity is associated with disambiguation of meaning - for nouns I think Wikidata items (and their relations to lexeme senses) solve that problem, but we are still missing I think a lot of the detail needed to do the same with adjectives and verbs (at least). So there is definitely some room for finding better ways to model - but maybe Wikidata could be expanded to handle the adjective/verb cases too. In general the concept of a single meaning associated with a Wikidata item as its identifier and a collection of attributes and relationships attached to that item is a powerful one that could resolve many such issues.
Arthur
On Sun, Jul 5, 2020 at 6:55 PM Adam Sobieski adamsobieski@hotmail.com wrote:
Louis,
Thank you for the information about the Universal Networking Language [1] and the World Atlas of Language Structures [2].
Semantic Modeling
Do you opine that adding attributes to objects, relations and expressions enhances expressiveness for various features of natural language?
r.@a1.@a2(o1(icl>domain1).@a3.@a4, o2(icl>domain2).@a5.@a6).@a7.@a8
I wonder whether there exist mappings or workarounds with which to obtain such expressiveness for models such as Wikidata’s.
Scripting Environments for Natural Language Generation
Supposing that Wikilambda could be JavaScript / WebAssembly based, and observing that Lua / WebAssembly solutions exist, we can note that scripting engines such as V8 are easy to use and to add global objects and API to. Resembling how Web browsers provide scripting environments and API for functions, we can envision providing scripting environments and API for natural language generation functions.
I wonder what you might think about scripting environments and API for natural language generation scenarios?
Best regards,
Adam
[1] https://en.wikipedia.org/wiki/Universal_Networking_Language
*From: *Louis Lecailliez louis.lecailliez@outlook.fr *Sent: *Saturday, July 4, 2020 2:10 PM *To: *abstract-wikipedia@lists.wikimedia.org *Subject: *Re: [Abstract-wikipedia] NLP issues severely overlooked (Amir E. Aharoni)
Hi Amir,
I understand the process is different that usual research. In fact I've seen Wikipedia grown from an unknown website to the biggest encyclopedia it is now. I use it daily in multiple languages and love it. I know what crowd sourcing could achieve.
It's also possible that the mere *finding* of these stumbling blocks by
such a big, diverse, open, and active community, will itself be a contribution to the scientific knowledge around this subject.
I disagree here. It would be contribution to scientic knowledge if and only if it wasn't discovered before. My email was precisely about that: capitalizing on the knowledge that has already been discovered, to avoid making the same mistake them again. It would not matter for a small project, but this one is really ambitious. We are speaking of 40 years of work by a horde of talented and very knowledgeable people, so this isn't to be dismissed easily.
This thing is, my previous email was a bit abstract, because it were a review of the paper, not of the project itself. I should have made more examples to illustrate where the problem lies.
Let's start with a simple example, in English, with corresponding Wikidata entities in-between parenthesis. I'm also using pseudo-turtle notation with made up relationships.
France (Q142) is a country (Q6256).
<Q142> <rel_is> <Q6256> .
Creating the English sentence is straightforward with the naive approach presented in the paper.
What is the French equivalent?
La France est un pays.
More information is required in the abstract representation: the text generator needs to know about the gender of both nouns (France and pays). So we need to extend the model as such:
<Q142> <rel_gender> <Q1775415> .
<Q6256> <rel_gender> <Q499327> .
Fine! Now what about Chinese?
法國是一個國家。
What we have in the middle of the sentence is a classifier (個). The model needs the following update:
<Q499327> <rel_use_classifier> <Q63153> .
To handle these 3 languages, the model has already 3 additional triples just for accounting for linguistic facts occuring in these languages. Wikipedia exists in more than 300 languages, and the world has about 6000 of them, each of them having particularities that must be taken into account. Fortunately they recoup themselves in-between languages. Nonetheless the World Atlas Language Structures ( https://wals.info/chapter/s1) count 144 distinct language features. Some are related to speech, but this means there is probably something like a hundred features that must be taken into account in the data model to produce valid natural language sentence.
Note that in the Chinese example, there is also a number (一, one) showing up. This is a phenomenon that must be taken into account; and it's not always appearing when using 是 (to be). How complex the "lambda" system will be just to deal with this issue? Hint: very much. It also needs to be compatible with dozen of other phenomena.
Then each of those features require extensive and complete data. For French, the gender of every noun entity *must* be present, otherwise there is half a chance of producing a wrong sentence each time a noun entity is encountered. For Chinese and Japanese, classifier information must be present for all noun, in case one must be enumerated. Where does the project will get the data from? (we are speaking of millions of item, most not referenced in existing dictionaries) How will this be encoded? Those are real questions that must be answered.
Suppose now we have done the work for "renderers" in these three languages. They both use the more or less similar A X B structure where X is a verb meaning "to be".
What would be the Japanese equivalent?
The more natural structure would be like:
フランスは国(だ)。
What is a play here is a topicalization (Q63105) of France, followed by a predicate (it's a country). This is very different from the previous structure, which, not surprisingly enough, needs it's own representation. To make situation more difficult, the previous (A be B) structure can also exists in Japanese, but would lead to a totally different sentence if used.
The paper states that Figure 1 and 2 are examples that will be more complex in real life. Yet, the use of any existing formalism is dismissed, which mean all the situations I illustrated in this email will need to be dealt with in an ad hoc fashion. Moreover, changing formalism (be it ad hoc or not) will require to change every piece of code/data using it. This will happen everytime a language with unsupported feature(s) is added to the project. It's not hard to see how this will waste a huge amount of time and goodwill from involved people. The very code focussed tone of the paper, the english-centric approach used in the examples and the lack of references shows that the complexity of the task on the NLP front is not sufficiently conceptualized.
Best Regards,
Louis Lecailliez
*De :* Abstract-Wikipedia abstract-wikipedia-bounces@lists.wikimedia.org de la part de abstract-wikipedia-request@lists.wikimedia.org < abstract-wikipedia-request@lists.wikimedia.org> *Envoyé :* samedi 4 juillet 2020 15:06 *À :* abstract-wikipedia@lists.wikimedia.org < abstract-wikipedia@lists.wikimedia.org> *Objet :* Abstract-Wikipedia Digest, Vol 1, Issue 6
Send Abstract-Wikipedia mailing list submissions to abstract-wikipedia@lists.wikimedia.org
To subscribe or unsubscribe via the World Wide Web, visit https://lists.wikimedia.org/mailman/listinfo/abstract-wikipedia or, via email, send a message with subject or body 'help' to abstract-wikipedia-request@lists.wikimedia.org
You can reach the person managing the list at abstract-wikipedia-owner@lists.wikimedia.org
When replying, please edit your Subject line so it is more specific than "Re: Contents of Abstract-Wikipedia digest..."
Today's Topics:
- Re: NLP issues severely overlooked (Charles Matthews)
- Use case: generation of short description (Jakob Voß)
- Re: NLP issues severely overlooked (Amir E. Aharoni)
Message: 1 Date: Sat, 4 Jul 2020 14:05:09 +0100 (BST) From: Charles Matthews charles.r.matthews@ntlworld.com To: "General public mailing list for the discussion of Abstract Wikipedia (aka Wikilambda)" < abstract-wikipedia@lists.wikimedia.org> Subject: Re: [Abstract-wikipedia] NLP issues severely overlooked Message-ID: 2126327926.39940.1593867909152@mail2.virginmedia.com Content-Type: text/plain; charset="utf-8"
It is interesting to be on a list where one can hear about software issues, and then computational linguistic problems. I'm not an expert in either area.
I do have 17 years of varied Wikimedia experience (and I use my real name there).
On 04 July 2020 at 12:25 Louis Lecailliez louis.lecailliez@outlook.fr wrote:
<snip>
Nothing precise is said about linguistic resources in the AW paper except for "These function finally can call the lexicographic knowlegde stored in Wikidata.", which is not very convincing: first because the Wiktionary projects themselves severely lack content and structure, for those that have any content at all; secondly because specialized NLP resources are missing there too (note: I'm not familiar with Wikidata so I could be wrong, however I have never seen it cited for the kind of NLP resources I'm talking about).
I can comment about this. Besides Wiktionary, there is the "lexeme" namespace of Wikidata. It is a relatively new part of Wikidata, dealing with verbal forms.
To finish on a positive note, I would like to highlight the points I really like in the paper. First, its collaborative and open nature, like all Wikimedia projects, gives it a much higher chance of success than its predecessors.
It is worth saying, for context, that there is a certain style or philosophy coming from the wiki side: more precisely, from the wikis before Wikipedia. There is the slogan "what is the simplest thing that would actually work?" You might argue, plausibly, that Wikipedia, at nearly 20 years old, shows that there is a bit more to engineering than that.
On the other hand, looking at Wikidata at seven years old, there is some point to the comment. It has a rather simple approach to linked structured data, compared to the Semantic Web environment. (Really, just write a very large piece of JSON and try to cope with it!) But the number of binary relations used (8K, if you count the "external links" handling) is now quite large, and has grown organically. The data modelling is in a sense primitive, sometimes non-existent. But the range of content handled really is encyclopedic. And in an area like scientific bibliography, at a scale of tens of millions of entities, the advantages of not much ontological fussiness begin to be seen.
Wikidata started as an index of all Wikipedia articles, and is now five times the size needed for that: a very enriched "index".
I suppose the NLP required to code up, for example, 50K chemistry articles about molecules, might be a problem that could be solved, leaving aside the general problems for the moment.
In any case, there is a certain approach, neither academic nor commercial, that comes with Wikimedia and its communities, and it will be interesting to see how the issues are addressed.
Charles Matthews (in Cambridge UK)
Hi Louis,
OK, here's a suggestion for the paper: I will finish the v2 - I hope this month or the next - and then make the sources available as a git repository, so that anyone who wants to work on it can do so.
Does this sound good?
Cheers, Denny
On Thu, Jul 9, 2020 at 5:22 AM Louis Lecailliez louis.lecailliez@outlook.fr wrote:
Hi Denny,
yes, the main problem of most of the systems presented in research papers (UNL or not) is that they are locked in the institutions that made them. A lot of UNL webpages have gone down since the last time I checked (recently), and the system was in fact designed in a way that it could work over the web while not letting third parties access the code and data. This is of course the exact reverse of the technical and philosophical approach taken here, and very sad, as decades of accumulated knowledge are lost; the papers are far from sufficient to re-create even a fraction of the said systems.
There is also, I guess, a lot of interesting work that is not translated into English at all (notably in linguistics), as making an academic career in the national language was an option in a lot of places until very recently.
So, would you be willing to work on that?
Yes, of course; I wouldn't have posted on the mailing list otherwise. I like the dual, concurrent linguistics/theory approach you are proposing. Note though that I'm not an expert by any means in natural language generation; it just happens that I stumbled upon UNL recently, and it has too much in common with this project on the abstract representation/NLG side not to mention it. I also have some researchers' names in mind, as I met some who worked on the referenced works.
Concerning the paper authorship, I understand your stance, and yes, I'm willing to work more and write about previous works with those interested. Just to have an idea, what is the expected timeframe for a revision?
Lexicographic data in Wikidata totally flew under my radar. This is indeed something that will be needed in the future, and where I can directly contribute too! As mentioned in [1], the license seems to be an issue, notably for importing existing resources; is there any “fix” planned for that?
All in all, I'm very pleased to see that a lot of aspects are more planned out than I assumed from reading the paper alone, and I’m more confident in the project's success now.
Best regards, Louis Lecailliez
[1] http://www2.imm.dtu.dk/pubdb/edoc/imm7154.pdf
*From:* Abstract-Wikipedia abstract-wikipedia-bounces@lists.wikimedia.org on behalf of Denny Vrandečić dvrandecic@wikimedia.org *Sent:* Wednesday, 8 July 2020 22:37 *To:* General public mailing list for the discussion of Abstract Wikipedia (aka Wikilambda) abstract-wikipedia@lists.wikimedia.org *Subject:* Re: [Abstract-wikipedia] NLP issues severely overlooked (Amir E. Aharoni)
Hi Louis, all,
Louis, thanks for raising that important issue!
I have been looking into a number of related NLG systems, and one thing I noticed is a pattern of many of these projects being developed very much in isolation from each other, and also often without much concern for ongoing linguistic research. That is what I tried to capture in the research paper by stating that there is no consensus on this, and that it seems too early to commit to a specific solution.
I had given a quick look to UNL, but the project looked pretty stale to me - I could not see any activity in the last decade. Furthermore, the page didn't provide access to the source code and instead mentioned that part of the technology is under patents, which is quite a red flag for me, and I usually don't look into something like that any further, in order to honestly be able to say that I didn't get any ideas from those patents. If I am mistaken, and there is a freely usable write-up or implementation, I'd be happy to come back and read up more.
Thank you for the annotated bibliography! That is super useful.
But I did look in detail into a (small) number of other, similar systems, such as Grammatical Framework or KPML. Tiago mentioned FrameNet, and I learned a lot about that too. Getting an overview of the whole field has been a rather frustrating experience, especially since the major textbook in that area - Dale & Reiter - doesn't cover these systems, nor does the 2018 update to that book by Gatt & Krahmer, and it seems that research work in that area often omits these practical systems. Accordingly, when I talk with the professors and researchers in this area, also about the proposal here, they are more focussed on specific issues, and don't know that much about the concrete systems (which is understandable - the flow from research to practical systems is more established in many areas). Never mind that when you get to the linguistic side of it, instead of the computer science part, there are even more competing theories, many of which are aimed toward much more encompassing goals and are about covering the whole of language and natural language understanding, which we want to shy away from.
The goal of the paper was never meant to be a comprehensive account of the state of the art in natural language generation. That's what Dale & Reiter and Gatt & Krahmer have aimed for, and their works are hundreds of pages. I had the feeling my paper was already too long, and putting in an overview of the state of the art would have made it at least double the length.
So, given that (and other reasons, as laid out in the paper), a system which could support any of these approaches seemed a more promising way. So far, for my own prototype, I have been mostly following Grammatical Framework (because it has a very accessible book, the software is free, the community was friendly, etc.), and it worked well enough to leave me convinced that the whole thing is worth trying out. But I don't know whether that's the best approach.
As mentioned by Chris Cooley, the goal will be to create a new wiki, a library of functions, that can support any of these approaches. My dream would be - and I see that Chris had already suggested that - that experts like you and your colleagues create an overview of the state of the art that will be accessible to the community and that will allow us to make a well-informed decision when the time comes as to which path to explore first. In a parallel track, we will be creating the function wiki, and then, when the time is ripe we can bring these two strands of work together. So, would you be willing to work on that?
How does this sound for a plan?
Some further points:
This is way easier to implement, test and deliver than to implement 10 different backends with various progress in implementation, incompatibilities and runtime characteristics.
Regarding your point about evaluation environments: I agree, it would be a huge task if the WMF core team were to develop all these different environments. But that's not the plan. The goal is really that *others* will hopefully build these :) All we need to do is to make sure that's possible and encouraged and simple enough. But yeah, not the core team.
The paper presents AW as sitting on top of WL. Both are big projects. Sitting a big project on top of another one is really risky, as it means a significant milestone must first be reached in the dependency (here WL), which would likely take some years, before even starting the work on the other project.
Yes, that's correct. That is exactly the time that allows us to do the appropriate state-of-the-art analysis. I hope it won't take us years, but that we will be faster.
AW can be realised with current tools and engineering practices.
Only if you commit to a specific implementation, which I am hesitant to do.
[English is an obstacle to programming] This strong affirmation needs to be sourced.
https://dl.acm.org/doi/10.1145/3051457.3051464
As I spent a significant amount of time (~10 hours) gathering references and writing this email (which is 5 pages long in Word), I would like to be mentioned as co-author in the final paper if any idea or reference presented here is used in it.
Thank you for your detailed comments, which will certainly improve the second version of the paper. I am happy to mention you in the acknowledgments. For co-authorship, I usually go for a more substantial engagement ;) If you're willing to write up the "Previous work" section along the lines you mentioned above (maybe with Tiago? maybe with others joining?), but as a comprehensive overview of existing systems, then I am open to talking about co-authorship :)
For French, the gender of every noun entity *must* be present ... For
Chinese and Japanese, classifier information must be present for all noun, in case one must be enumerated.
That's exactly the goal of the lexicographic project on Wikidata, as was pointed out:
https://www.wikidata.org/wiki/Lexeme:L12449
You'll find plenty of Lexemes with their classifiers, forms, etc. The lexicographic project was started with the Abstract Wikipedia in mind, knowing that exactly that will be needed.
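For what it's worth, lexemes are reachable through the ordinary wbgetentities API, so renderers can look up forms and their grammatical features programmatically. A rough Python sketch (the JSON field names reflect the lexeme data model as I understand it today and should be double-checked against the live data):

# Rough sketch: fetch a lexeme and list its lemmas and forms, with the
# grammatical features (Q-ids for categories such as number or case) per form.
import json
import urllib.request

def fetch_lexeme(lexeme_id: str) -> dict:
    url = ("https://www.wikidata.org/w/api.php"
           f"?action=wbgetentities&ids={lexeme_id}&format=json")
    req = urllib.request.Request(url, headers={"User-Agent": "abstract-wikipedia-example/0.1"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["entities"][lexeme_id]

lexeme = fetch_lexeme("L12449")
for lang, lemma in lexeme.get("lemmas", {}).items():
    print(f"lemma ({lang}): {lemma['value']}")
for form in lexeme.get("forms", []):
    spellings = ", ".join(r["value"] for r in form["representations"].values())
    print(f"{form['id']}: {spellings}  features: {form['grammaticalFeatures']}")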
Yet, the use of any existing formalism is dismissed, which means all the situations I illustrated in this email will need to be dealt with in an ad hoc fashion.
No, not at all - it doesn't have to be ad hoc; that's exactly what we can start working on now, long before we get to the point where we need to make that decision. I hope you'll join us to figure out the best way!
Thanks to Charles, Amir, Tiago, Christopher, Arthur, and Adam for your beautiful answers, which made a number of great points much better than I ever could. And thanks to Louis for starting this more than interesting thread! Let's continue in this vein!
Cheers, Denny
On Sun, Jul 5, 2020 at 9:49 PM Adam Sobieski adamsobieski@hotmail.com wrote:
Brainstorming: resembling what the document object model (DOM) [1] is for XML and attributed trees, perhaps we could create and specify an object model for sets of attributed predicate calculus expressions.
With an attributed predicate calculus object model (e.g. “APCOM”) for sets of attributed predicate calculus expressions:
{
r1.@a1(o1(icl>domain1).@a2, o2(icl>domain2).@a3).@a4
r2.@a5(o3(icl>domain3).@a6, o4(icl>domain4).@a7).@a8
r3.@a9(o5(icl>domain5).@a10, o6(icl>domain6).@a11, o7(icl>domain7).@a12).@a13
}.@a14
developers could more conveniently utilize sets of attributed predicate calculus expressions from JavaScript and Lua.
Drawing from XML, we can consider that objects, relations and attributes could be, instead of plain text strings, uniform resource identifiers (URIs). “r1” could be a URI, “a1” could be a URI, “o1” could be a URI, and so forth.
We can also consider that the attributes in a model could have values:
{
r1.[@a1=v1](o1(icl>domain1).[@a2=v2], o2(icl>domain2).[@a3=v3]).[@a4=v4]
r2.[@a5=v5](o3(icl>domain3).[@a6=v6], o4(icl>domain4).[@a7=v7]).[@a8=v8]
r3.[@a9=v9](o5(icl>domain5).[@a10=v10], o6(icl>domain6).[@a11=v11], o7(icl>domain7).[@a12=v12]).[@a13=v13]
}.[@a14=v14]
We can consider creating a scripting API (e.g. “APCOM”) for a semantic model, for the convenience of developers. We can also consider adding attribute-value pairs to a semantic model.
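To make the shape of such an object model concrete, here is a toy Python data model for these attributed expressions; the class and field names are invented for illustration and are not a proposed API:

# Toy data model for sets of attributed predicate calculus expressions (illustration only).
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Obj:
    uri: str                                   # e.g. "o1", ideally a full URI
    restriction: str = ""                      # e.g. "icl>domain1"
    attributes: Dict[str, str] = field(default_factory=dict)   # @a = v pairs

@dataclass
class Relation:
    uri: str                                   # e.g. "r1", ideally a full URI
    args: List[Obj] = field(default_factory=list)
    attributes: Dict[str, str] = field(default_factory=dict)

@dataclass
class ExpressionSet:
    expressions: List[Relation] = field(default_factory=list)
    attributes: Dict[str, str] = field(default_factory=dict)   # set-level @a = v

model = ExpressionSet(
    expressions=[
        Relation("r1",
                 args=[Obj("o1", "icl>domain1", {"a2": "v2"}),
                       Obj("o2", "icl>domain2", {"a3": "v3"})],
                 attributes={"a1": "v1", "a4": "v4"}),
    ],
    attributes={"a14": "v14"},
)
print(model.expressions[0].attributes["a1"])   # -> "v1"

A JavaScript or Lua counterpart would look much the same.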
Best regards,
Adam
[1] https://en.wikipedia.org/wiki/Document_Object_Model
*From: *Tiago Timponi Torrent tiago.torrent@ufjf.edu.br *Sent: *Sunday, July 5, 2020 9:06 PM *To: *General public mailing list for the discussion of Abstract Wikipedia (aka Wikilambda) abstract-wikipedia@lists.wikimedia.org *Subject: *Re: [Abstract-wikipedia] NLP issues severely overlooked (Amir E. Aharoni)
That’s a good idea, but I think you would need more than that. Take FrameNet, for example, but now starting from verbs instead of nouns. FrameNet has a very detailed model for dealing with verbs, their semantic arguments and the way they surface in morphosyntax. Nonetheless, to apply such a model in a text comprehension and/or generation task, you need more than that. You need to know prototypical fillers for the positions, which, in turn, are associated with other frames and, therefore, participate in other clusters of the network of frames. Moreover, you’d want those prototypical fillers to function as starting points for analogical extensions in the model, since not every sentence is a prototypical combination of words. In other words, the collection of attributes and relations you refer to should be defined in a way that they can be analogically extended to other collections not originally assigned to the item you’re looking at.
Cheers
Tiago
On Sun, 5 Jul 2020 at 20:03, Arthur Smith arthurpsmith@gmail.com wrote:
Yes, thank you for the UNL background, that is extremely helpful. I've been reading some of the articles Louis provided as references, and it seems to me from just this perhaps naive point of view, that a lot of the complexity is associated with disambiguation of meaning - for nouns I think Wikidata items (and their relations to lexeme senses) solve that problem, but we are still missing I think a lot of the detail needed to do the same with adjectives and verbs (at least). So there is definitely some room for finding better ways to model - but maybe Wikidata could be expanded to handle the adjective/verb cases too. In general the concept of a single meaning associated with a Wikidata item as its identifier and a collection of attributes and relationships attached to that item is a powerful one that could resolve many such issues.
Arthur
On Sun, Jul 5, 2020 at 6:55 PM Adam Sobieski adamsobieski@hotmail.com wrote:
Louis,
Thank you for the information about the Universal Networking Language [1] and the World Atlas of Language Structures [2].
Semantic Modeling
Do you opine that adding attributes to objects, relations and expressions enhances expressiveness for various features of natural language?
r.@a1.@a2(o1(icl>domain1).@a3.@a4, o2(icl>domain2).@a5.@a6).@a7.@a8
I wonder whether there exist mappings or workarounds with which to obtain such expressiveness for models such as Wikidata’s.
Scripting Environments for Natural Language Generation
Supposing that Wikilambda could be JavaScript / WebAssembly based, and observing that Lua / WebAssembly solutions exist, we can note that scripting engines such as V8 are easy to use and to add global objects and API to. Resembling how Web browsers provide scripting environments and API for functions, we can envision providing scripting environments and API for natural language generation functions.
I wonder what you might think about scripting environments and API for natural language generation scenarios?
Best regards,
Adam
[1] https://en.wikipedia.org/wiki/Universal_Networking_Language
[2] https://wals.info/
Hello Denny,
sounds very good to me!
I'll also put the references I collected on the wiki once there is a page for that.
Best regards, Louis Lecailliez
------------------------------
Message: 2 Date: Sat, 4 Jul 2020 08:18:56 +0200 From: Jakob Voß jakob.voss@gbv.de To: abstract-wikipedia@lists.wikimedia.org Subject: [Abstract-wikipedia] Use case: generation of short description Message-ID: 4403bbda-040b-6c89-9cb6-6540139250dc@gbv.de Content-Type: text/plain; charset="utf-8"
Hi,
I want to auto-generate disambiguation descriptions for African politicians to be added to Wikidata; e.g. from the country Mozambique (Q1029) the following descriptions should be generated:
- Mozambican politician (en)
- Mosambikanischer Politiker (de)
- politico mozambicano (it)
- ...
This could be extended to other professions. My questions:
- Can anyone point me to data sources where to best look up country adjectives such as "Mozambican"?
- Where/how to best store the lexical information for best reuse with other renderers?
- If I create small renderers for these short descriptions, what architecture do you prefer for best reuse?
My just-get-it-done solution would be a set of CSV files and a few lines of Perl code, but maybe this use case can be aligned with Abstract Wikipedia to learn more about it.
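For illustration, here is roughly what that just-get-it-done version could look like in Python; the adjective table is made up for this example, and the hard-coded word-order and capitalization rules obviously would not scale, which is rather the point of this thread:

# Sketch only: per-language short descriptions from (country, profession) pairs.
COUNTRY_ADJECTIVE = {                          # (country QID, language) -> adjective
    ("Q1029", "en"): "Mozambican",
    ("Q1029", "de"): "mosambikanischer",       # already inflected (masc. nom. sg.)
    ("Q1029", "it"): "mozambicano",
}
PROFESSION = {                                 # (profession QID, language) -> noun
    ("Q82955", "en"): "politician",            # Q82955 = politician
    ("Q82955", "de"): "Politiker",
    ("Q82955", "it"): "politico",
}

def short_description(country: str, profession: str, lang: str) -> str:
    adjective = COUNTRY_ADJECTIVE[(country, lang)]
    noun = PROFESSION[(profession, lang)]
    if lang == "it":
        description = f"{noun} {adjective}"    # Italian: adjective follows the noun
    else:
        description = f"{adjective} {noun}"    # English, German: adjective precedes
    if lang == "de":
        description = description[0].upper() + description[1:]
    return description

for lang in ("en", "de", "it"):
    print(short_description("Q1029", "Q82955", lang))   # matches the examples above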
Looking forward to collaborating, Jakob
------------------------------
Message: 3 Date: Sat, 4 Jul 2020 18:03:24 +0300 From: "Amir E. Aharoni" amir.aharoni@mail.huji.ac.il To: "General public mailing list for the discussion of Abstract Wikipedia (aka Wikilambda)" abstract-wikipedia@lists.wikimedia.org Subject: Re: [Abstract-wikipedia] NLP issues severely overlooked Message-ID: CACtNa8t6kbWe21C980h1MxiWNfUp+0eDE82vPMjDUX2UCgb2gw@mail.gmail.com Content-Type: text/plain; charset="utf-8"
Hi,
Thanks a lot for the sources. I am not one of the people implementing Wikilambda, but I am just very curious about it as a member of the wider Wikimedia community. But there's a good chance that they will be useful to people who do work on the implementation.
I will dare to add a little thought I have about it, however. It's possible that the challenge of building a well-functioning natural language generator is underestimated by the founders, and that they don't pay enough attention to existing work (although, knowing Denny, there is a good chance that he actually is aware of at least some of it). But there is something that the wide Wikimedia community has that I'm not sure that the past projects in this field did: The community itself. A big, worldwide, and diverse group of passionate volunteers, who love the idea of spreading free knowledge and who love their languages. Quite a lot of them also know some programming, and in the past they proved unbelievably creative and productive when writing code for Wikimedia projects as a community, in the form of templates, modules, gadgets, bots, extensions, and other tools. I'm quite sure that once the new tools become usable, this community will start doing creative things again, and it will also start reporting bugs and limitations.
So yes, while it's possible that along the way both the core developers and the volunteer community will find all kinds of stumbling blocks, I'm pretty sure that they will also have all kinds of surprising success stories. It's also possible that the mere *finding* of these stumbling blocks by such a big, diverse, open, and active community, will itself be a contribution to the scientific knowledge around this subject. And don't underestimate the "open" part—that's where we really shine. This won't be a theoretical work in a lab, published in a paywalled and copyright-restricted academic journal, but fully optimized for accessibility to everyone.
Yes, this whole email from me is incredibly naïve, but it's the same attitude that got us to writing the biggest and most multilingual encyclopedia in history, so maybe we can do something cool again :)
-- Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי http://aharoni.wordpress.com “We're living in pieces, I want to live in peace.” – T. Moore
On Sat, 4 Jul 2020 at 14:26, Louis Lecailliez louis.lecailliez@outlook.fr wrote:
Hello,
my name is Louis Lecailliez, PhD student at Kyoto University in education technology. I'm a Computer Science and NLP graduate. One thing I do is working on language learner's knowledge modelling as graphs.
The Abstract Wikipedia project is really interesting. There are, however, two very concerning issues I spotted when reading the associated paper draft (https://arxiv.org/abs/2004.04733). The following email could be read as negative, but please don't take it as such: my purpose is to avoid spending people's effort and money on things that can (and need to!) be fixed upfront.
- Issues with NLP
The main issue is that the difficulty of the NLP task of generating natural text from an abstract representation is severely overlooked. This stems from the other main problem: the paper is not based on the decades of previous work in that space.
As I understand it, the main value proposition of Abstract Wikipedia (AW) is a computer representation of encyclopedic knowledge that can be projected into different existing natural languages, with the goal of supporting a huge number of them. Plus, an editor to make this happen easily.
This is in fact surprisingly close to what the Universal Networking Language (UNL) project, which started 20 years ago, aims to do. UNL provides a language-agnostic representation of text based on hypergraphs. A piece of software (called an EnConverter) produces UNL graphs from natural text in a given language. Another kind of software, called a DeConverter, does the reverse, that is, producing natural text from the abstract representation. This is exactly the function of the "renderers" in the AW paper. The way of doing it is also similar: by applying successive transformations until the final text string is produced. In general, that kind of abstract meaning representation is called an interlingua, and it is widely used in machine translation (MT) systems.
Disregarding two decades of work, in the UNL case, on the same problem space (rule-based machine translation, here with an abstract language as the fixed source language), which was itself based on a few more decades of work, does not seem a wise way to start a new project. For a start, the graph representation used in AW will likely not be expressive enough to encode linguistic knowledge; this is why UNL uses hypergraphs instead of graphs.
The problem is glaring when looking at the reference list: it is bloated with irrelevant references (such as those to programming languages [27, 37, 41, 77], Turing completeness being the worst offender [11, 17, 23, ...]) while containing only two references [7, 85] to the really hard part of the project: generating natural language from the abstract representation. There are a few more relevant references about natural language generation, but this isn't enough.
Interestingly, [85] is a UNL paper, but not the main one. Moreover, it is cited in Section 9, "Opening future research". It should instead be discussed in a "Previous work" section, which is missing from the paper.
To fill a part of this section yet to be written, I propose the following references:
[1*] Uchida, H., Zhu, M., & Della Senta, T. (1999). A gift for a millennium. IAS/UNU, Tokyo. https://www.researchgate.net/profile/Hiroshi_Uchida2/publication/239328725_A...
[2*] Wang-Ju Tsai (2004). La coédition langue-UNL pour partager la révision entre langues d'un document multilingue [Language-UNL coedition to share revisions in a multilingual document]. Doctoral thesis, Grenoble. https://pdfs.semanticscholar.org/b030/ea4662e393657b9a134c006ca5b08e8a23b3.p...
[3*] Boitet, C., & Tsai, W. J. (2002). La coédition langue<->UNL pour partager la révision entre les langues d'un document multilingue: un concept unificateur [Language<->UNL coedition to share revisions between the languages of a multilingual document: a unifying concept]. Proc. TALN-02, Nancy, 22-26. http://www.afcp-parole.org/doc/Archives_JEP/2002_XXIVe_JEP_Nancy/talnrecital...
[4*] Tomokiyo, M., Mangeot, M., & Boitet, C. (2019). Development of a classifiers/quantifiers dictionary towards French-Japanese MT. arXiv preprint arXiv:1902.08061. https://arxiv.org/pdf/1902.08061.pdf
[5*] Boguslavsky, I. (2005). Some controversial issues of UNL: Linguistic aspects. Research on Computer Science, 12, 77-100. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.212.2058&rep=re...
[6*] Boitet, C. (2002). A rationale for using UNL as an interlingua and more in various domains. In Proc. LREC-02 First International Workshop on UNL, other Interlinguas, and their Applications, Las Palmas (pp. 26-31). https://www.cicling.org/2005/unl-book/Papers/003.pdf
[7*] Dhanabalan, T., & Geetha, T. V. (2003, December). UNL deconverter for Tamil. In International Conference on the Convergences of Knowledge, Culture, Language and Information Technologies. http://www.cfilt.iitb.ac.in/convergence03/all%20data/paper%20032-372.pdf
[8*] Singh, S., Dalal, M., Vachhani, V., Bhattacharyya, P., & Damani, O. P. (2007). Hindi generation from Interlingua (UNL). Machine Translation Summit XI. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.78.979&rep=rep1...
[9*] Banarescu, L., Bonial, C., Cai, S., Georgescu, M., Griffitt, K., Hermjakob, U., ... & Schneider, N. (2013, August). Abstract Meaning Representation for sembanking. In Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse (pp. 178-186). https://www.aclweb.org/anthology/W13-2322.pdf
[10*] Berment, V., & Boitet, C. (2012). Heloise - An Ariane-G5 compatible environment for developing expert MT systems online. In Proceedings of COLING 2012: Demonstration Papers (pp. 9-16). https://www.aclweb.org/anthology/C12-3002.pdf
[11*] Berment, V. (2005). Online translation services for the Lao language. In Proceedings of the First International Conference on Lao Studies, De Kalb, Illinois, USA (pp. 1-11). https://www.researchgate.net/profile/Vincent_Berment/publication/242140227_O...
[1*] is the paper that describes UNL. [2*] is a doctoral thesis discussing a core problem AW is trying to address too. [3*] is a short paper done in the scope of [2*]; even if you don't understand French, you can have a look at the figures: two of them are about an editor similar in principle to what AW wants to incorporate. [5*] gives insights about UNL expressivity issues, ten years after the project's start. [6*] offers more on UNL, with a short history and the context in which it is used.
[4*] shows how deep natural language conversion goes: this paper addresses the issue of classifiers in French and Japanese. This is just one linguistic issue, and there are dozens if not hundreds of them. An important point is that both of the languages involved need to be taken into account when modelling the abstract encoding, otherwise too much information is lost to produce a correct output.
[7*] and [8*] are very valuable examples of real-world deconverter systems for UNL. As is visible in [7*]'s Figure 1 and [8*]'s Figure 2, the process is *way* more complicated than a single "renderers" box. Moreover, there are very distinct identifiable steps, informed by linguistics. The AW paper does not describe any such structuring of the natural-text generation process; everything is supposed to happen in some unstructured "lambda" system. Also missing are the specialized resources (UNL-Hindi dictionary, Tamil word dictionary, co-occurrence dictionary, etc.) required for the task. Nothing precise is said about linguistic resources in the AW paper except for "These function finally can call the lexicographic knowlegde stored in Wikidata.", which is not very convincing: first because the Wiktionary projects themselves severely lack content and structure (for those that have any content at all), and secondly because specialized NLP resources are missing there too (note: I'm not familiar with Wikidata, so I could be wrong; however, I have never seen it cited for the kind of NLP resources I'm talking about).
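To make the contrast concrete, here is a deliberately toy sketch, in Python, of what "distinct identifiable steps, informed by linguistics" means in code. The stage names, the tiny dictionaries, and the example sentence are invented for illustration; real deconverters such as [7*] and [8*] have many more stages, each backed by much larger resources.

    # Toy lexical resources: each one is a separate, inspectable artefact.
    LEMMAS = {"Q6256": {"lemma": "pays", "gender": "m"}}           # concept -> French lemma + features
    DETERMINERS = {("indef", "m"): "un", ("indef", "f"): "une"}    # agreement rules
    COPULA = {"present": "est"}                                    # (tiny) inflection table

    def lexical_selection(concept):
        # Needs a concept->lemma dictionary for the target language.
        return LEMMAS[concept]

    def syntactic_linearization(subject_np, noun):
        # Needs gender information for every noun to pick the determiner.
        det = DETERMINERS[("indef", noun["gender"])]
        return [subject_np, COPULA["present"], det, noun["lemma"]]

    def orthographic_postprocessing(tokens):
        # Spacing, elision, capitalization, and punctuation would live here.
        return " ".join(tokens) + "."

    def deconvert(subject_np, concept):
        # The subject NP is pre-rendered here only to keep the sketch short.
        noun = lexical_selection(concept)
        return orthographic_postprocessing(syntactic_linearization(subject_np, noun))

    print(deconvert("Le Japon", "Q6256"))   # -> "Le Japon est un pays."

With this kind of structure, every missing dictionary entry or rule shows up as a specific, reportable gap in a specific stage, instead of a silent failure inside one monolithic renderer.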
[10*] is a translation system built with "specialised languages for linguistic programming (SLLPs)", which is the kind of service Wikilambda is supposed to provide for Abstract Wikipedia. [11*] gives an estimate of 2,500 hours for the development (by a specialist) of three linguistic modules for Lao processing.
So, with regard to the difficulty of the task and the previous work in the literature, the AW paper does not provide any convincing evidence that the technology on which it is supposed to be built can even reach the state of the art. Dismissing every existing formalism and software system on the grounds of "no consensus commiting to any specific linguistic theory" is not going to work: it will result in an ad hoc, implementation-driven formalism that will have a hard time fulfilling its goal. The NLP part (generating sentences from the abstract representation) is the hardest part of the project, yet it is by far the least convincing one. "Abstract Wikipedia is indeed firmly within this tradition, and in preparation for this project we studied numerous predecessors." I would like to believe so, but the lack of corresponding references, as well as the lack of a previous-work section, tends to prove the contrary.
While I can't advise a switch to UNL, as I'm not a specialist in it, it would be smart to capitalize on the work done on it by highly skilled (PhD-level) individuals. As the UNL system is built on hypergraphs, it could probably be made easily interoperable with RDF knowledge graphs if named graphs are used. With a UNL/RDF specification (yet to be written), the vision exposed in the AW paper might be reached sooner by reusing existing software (we are speaking of thousands of person-years of work, as per [11*]) and, almost as importantly, an existing formalism that has been "debugged" for decades. There are probably other systems I'm unaware of that are worth investigating too, some, like [9*], having more specialized usage. In any case, there is a strong need to ground the paper and the project in the existing (huge) literature.
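Purely as a hypothetical illustration of the named-graph idea (the prefixes, relation names, and encoding below are invented; defining them properly is exactly what a UNL/RDF specification would have to do):

    # Quads as plain tuples: (subject, predicate, object, named_graph).
    # The relations inside a UNL-style scope become the triples of one named
    # graph, and the graph name itself can then appear in further triples,
    # which is one way a hyperedge over a whole sub-graph could live in RDF.
    quads = [
        ("ex:border", "unl:agt", "wd:Q142", "ex:scope01"),   # agent: France
        ("ex:border", "unl:obj", "wd:Q183", "ex:scope01"),   # object: Germany
        # An attribute of the whole proposition (tense, modality, ...) attaches
        # to the scope, not to any single node; here it sits in the default graph.
        ("ex:scope01", "unl:attr", "unl:past", None),
    ]

    def triples_in(graph_name, data):
        # All triples asserted inside one named graph (i.e. one scope).
        return [(s, p, o) for (s, p, o, g) in data if g == graph_name]

    print(triples_in("ex:scope01", quads))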
- Other issues
"In order to evaluate a function call, an evaluator can choose from a multitude of backends: it may evaluate the function call in the browser, in the cloud, on the servers of the Wikimedia Foundation, on a distributed peer-to-peer evaluation platform, or natively on the user’s machine in a dedicated hosting runtime, which could be a mobile app or a server on the user’s computer."
This part is major technical creep. There is no reason to turn the project into a distributed heterogeneous computing platform with a dedicated runtime, which would be a research project on its own, when the stated goal is to provide abstract multilingual encyclopedic content. All the computation can be done on servers (the cloud is servers too) and cached. This is far easier to implement, test, and deliver than ten different backends with varying degrees of completeness, incompatibilities, and different runtime characteristics.
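A minimal sketch of the simpler server-side alternative (the function names are hypothetical; a real deployment would use a persistent server-side cache rather than an in-process one):

    from functools import lru_cache

    def render_from_abstract_content(item_id: str, language: str) -> str:
        # Stand-in for the real renderer pipeline (hypothetical).
        return f"[{language} article generated for {item_id}]"

    @lru_cache(maxsize=100_000)
    def rendered_article(item_id: str, language: str) -> str:
        # Render once on the server, serve later requests from the cache;
        # evict the entry when the underlying abstract content changes.
        return render_from_abstract_content(item_id, language)

    print(rendered_article("Q142", "fr"))   # computed
    print(rendered_article("Q142", "fr"))   # served from cache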
The paper presents AW as sitting on top of WL. Both are big projects. Sitting a big project on top of another one is really risky, as it means a significant milestone must first be reached in the dependency (here WL), which would likely take some years, before work on the other project can even start. AW can be realised with current tools and engineering practices.
"One obstacle in the democratization of programming has been that almost every programming language requires first to learn some basic English."
This strong affirmation needs to be sourced. Programming languages, save for a few keywords, don't rely much on English. The general failure of localized versions of programming languages (such as French BASIC), as well as the heavy use of existing programming languages in countries that don't even use the Latin alphabet (China, Russia), tends to prove that English is not at all a bottleneck for the democratization of programming. [53] is cited later in the paper, but it is a pop-linguistics article from an online newspaper, not an academic article.
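As a small illustration of how little English a programming language actually imposes: Python 3, for instance, accepts identifiers in any script (PEP 3131), so apart from a few dozen English keywords and standard-library names, a program can be written in one's own language:

    # Valid Python 3 (PEP 3131): identifiers written entirely in Chinese.
    def 面积(长, 宽):
        # Compute the area of a rectangle from its length (长) and width (宽).
        return 长 * 宽

    print(面积(3, 4))   # -> 12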
- Final words
To finish on a positive note, I would like to highlight the points I really like in the paper. First, its collaborative and open nature, like all Wikimedia projects, gives it a much higher chance of success than its predecessors had. If UNL is not too well known, it's not because it didn't yield research achievements, but because one selected institution per language works on it and keeps the resources and software within the lab's walls. Secondly, some very welcome things are declared out of scope: conversion from natural language, and anything beyond encyclopedic-style text. This will allow a more focused effort towards the end goal, making it more achievable. And finally, the choice to go with a symbolic, rule-based system, with a touch of ML where useful. This is, as said in the paper, a big win for explainability and for using human contributions to build the system. It will also keep the computing cost at a saner baseline than what current deep-learning models require.
I think the project can succeed thanks to its openness, yet there are real dangers visible in the paper: on the NLP side, reinventing a wheel that took 40 years to build; and on the technical side, losing time and effort on a sub-project that is not required per se for AW to be built.
As I spent a significant amount of time (~10 hours) gathering references and writing this email (which is 5 pages long in Word), I would like to be mentioned as a co-author of the final paper if any idea or reference presented here is used in it.
Best regards, Louis Lecailliez
PS: Typos
- "These two projects will considerably expand the capabilities of the
Wikimedia platform to enable every single human being to freely share share in the sum of all knowledge." => duplicate share
- "The content is than turned into" => The content is then turned into
- "[26] Charles J Fillmore, Russell Lee-Goldman, and Russell Rhodes. The
framenet constructicon. Sign-based construction grammar, pages 309–372, 2012." => The framenet construction
- "These function finally can call the lexicographic knowlegde stored in
Wikidata." => These function finally can call the lexicographic knowledge stored in Wikidata
- "[102] George Kinsley Zipf. Human Behavior and the Pirnciple of Least
Effort. Addison-Wesley, 1949." => [102] George Kinsley Zipf. Human Behavior and the Principle of Least Effort. Addison-Wesley, 1949.
- "Allowing the individual language Wikipedias to call Wikilambda has an
addtional benefit." => Allowing the individual language Wikipedias to call Wikilambda has an additional benefit. _______________________________________________ Abstract-Wikipedia mailing list Abstract-Wikipedia@lists.wikimedia.orgmailto:Abstract-Wikipedia@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/abstract-wikipedia
End of Abstract-Wikipedia Digest, Vol 1, Issue 6 ************************************************
--
Tiago Timponi Torrent
PPG-Linguística - FrameNet Brasil
Universidade Federal de Juiz de Fora