Dear Mr. Massaro,
Thank you for your answer. I have just discussed the issue with Mahir Morshed, a proficient contributor to Lexicographical Data with advanced knowledge of natural language processing techniques. We found that the problem is caused by Unicode NFC normalization, which appears to alter the order of Arabic diacritics within words. Could you tell me whether this can be fixed now?
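
A minimal Python sketch of the effect, using only the standard unicodedata module. The variable names are illustrative, and this is an assumption about how the behaviour can be reproduced locally, not the Wikifunctions code path itself. In Unicode, fatha (U+064E) has canonical combining class 30 and shadda (U+0651) has class 33, so canonical ordering places the fatha before the shadda regardless of the order in which they were typed:

```python
import unicodedata

REH = "\u0631"     # ARABIC LETTER REH
SHADDA = "\u0651"  # ARABIC SHADDA, canonical combining class 33
FATHA = "\u064E"   # ARABIC FATHA, canonical combining class 30

# Letter followed by shadda, then fatha (the order described in this thread).
typed = REH + SHADDA + FATHA

# NFC applies canonical ordering: marks are sorted by combining class.
normalized = unicodedata.normalize("NFC", typed)


def codepoints(text):
    """Return the code points of a string as readable U+XXXX labels."""
    return [f"U+{ord(ch):04X}" for ch in text]


print("typed:     ", codepoints(typed))
print("normalized:", codepoints(normalized))
print("classes:   ", unicodedata.combining(SHADDA), unicodedata.combining(FATHA))
```

Any function that expects the shadda to come directly after the letter would therefore need to accept both orders, or reorder the marks itself after normalization.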
Yours Sincerely,
Houcemeddine Turki

From: Cory Massaro <cmassaro@wikimedia.org>
Sent: Monday, 20 September 2021 19:54
To: General public mailing list for the discussion of Abstract Wikipedia and Wikifunctions <abstract-wikipedia@lists.wikimedia.org>
Subject: [Abstract-wikipedia] Re: Abstract Wikipedia and Arabic
 
Hi Houcemeddine,

It is awesome that you're writing these functions already! I'm curious about your first point concerning the order of diacritics in Arabic. Can you say more? Can you perhaps give an example of what you tried to do, what you expected to happen, and the result you got? I would like to try to reproduce (and hopefully fix) that problem!

Thank you,
Cory

On Sun, Sep 19, 2021 at 3:48 AM Houcemeddine A. Turki <turkiabdelwaheb@hotmail.fr> wrote:
Dear all,
I thank you for your contributions to the Wikifunctions Project. As an end user of the Wikifunctions Project, I have been invited to speak at WikiArabia about Wikifunctions and Abstract Wikipedia in Arabic. That is why I developed and implemented several linguistic functions for Arabic Languages:
  • Root and Pattern-Based Generator of Lexemes for Arabic Languages (Z10157)
  • Pattern-Root Compatibility Verifier for Arabic Languages (Z10160)
  • IPA Generator for Diacritized Arabic Script Texts in Tunisian Arabic (Z10163)
This involved writing Python code for the three functions, developing test functions, and describing the developed functions. While developing the functions, I found several issues that could be solved in the next few months:
  1. When a word assigns two Arabic diacritics to one letter, the system does not handle it correctly. For example, كَرَّر has two Arabic diacritics (a shaddah and a fatha) on its second letter. The shaddah should be below the fatha, as its effect should come first. The Wikifunctions compilers do not handle this properly, which can harm the processing of languages that use the Arabic script. This should be fixed.
  2. Source code has to be indented by hand after it is pasted into the field. There is no automatic indentation for pasted source code. This harms the user experience.
  3. The mobile edition of the website does not work. Lucas Werkmeister has raised a ticket about this (T291325).
  4. All these linguistic functions are taken from reference grammar books. It would be useful to have a way to assign a Wikidata item as a reference for a Wikifunctions function.
  5. The website is significantly slow. Effort should be made to make this project faster.
  6. It would be useful to align inputs with their corresponding Wikidata items to give the functions better semantics.
  7. System messages are not very user-friendly. This can be fixed.
  8. The token for the connection to NotWikiLambda does not allow long sessions. It disconnects almost every fifteen minutes.
Yours Sincerely,
Houcemeddine Turki
_______________________________________________
Abstract-Wikipedia mailing list -- abstract-wikipedia@lists.wikimedia.org
List information: https://lists.wikimedia.org/postorius/lists/abstract-wikipedia.lists.wikimedia.org/