Hello fellow developers,
I think the subject summarizes it all, so here are more details on what
I'm trying to do and what I'm looking for.
# Context
You might skip this section if you are not interested in contextual
verbiage. If you would like to react to anything stated in this section,
please change the email subject to reflect that.
I'm currently thinking about ways to improve the factorization of
knowledge stored in Wiktionary.
I'm experimenting with several approaches there. On the one hand, I
just began a Wikiversity project
<https://fr.wikiversity.org/wiki/Recherche:Recueil_lexicologique_%C3%A0_l%E2%80%99usage_des_Wiktionnaires>
(in French) to establish a specification of how a DBMS should be
structured to be useful for Wiktionaries. It mainly grew out of my
view that the current data model
<https://www.mediawiki.org/wiki/Extension:WikibaseLexeme/Data_Model>
proposed for the Wikidata for Wiktionary
<https://www.wikidata.org/wiki/Wikidata:Wiktionary> does not fit the needs
of Wiktionary contributors. I did make some alternative proposals
<https://www.mediawiki.org/wiki/Extension_talk:WikibaseLexeme/Data_Model>,
and tried to gather initial feedback from the French Wiktionary
<https://fr.wiktionary.org/wiki/Wiktionnaire:Wikid%C3%A9mie/septembre_2017#Vers_la_conception_d.E2.80.99une_base_de_donn.C3.A9e_relationnelle_con.C3.A7u_pour_servir_de_support_aux_Wiktionnaires>
on this model too, which led me to create the Wikiversity research
project, because it was pointed out to me that I had skipped the step
of "specify the needs extensively before you model".
Now, on the other hand, I'm also trying to factorize some data within
Wiktionary with the currently available tools. One driving topic for
that is fixing the gender gap
<https://fr.wiktionary.org/wiki/Discussion_Projet:Parit%C3%A9_des_genres>,
and more broadly the inflection-form gap. That is, a feminine form will
generally be summarized in a laconic "feminine form of *some-term*",
rather than being treated as an entry of its own. That's all the more
problematic in cases where a word only shares a subset of the relevant
definitions depending on which gender (or inflection form) it applies to.
# What I'm trying to do
I am trying to factorize data which pertains to several
inflection forms. This way each form can use it to build a stand-alone
article about a term. The current approach tends to gather
everything under a single lemma, although some statements only
pertain to some specific forms.
So far I have experimented with transclusion of subpages to share
definitions, examples and so on between inflection forms. From a
reading point of view, it works. But from an editing point of view,
it's anything but fine.
What I think would be interesting is to store this data in a Scribunto
data module (at least for now), and enable users to change it while
editing a lexical entry article. With the visual editor, that might be
through something like a template popup. Wikitext editors will
probably be skilled enough to edit the relevant module, but for the sake
of convenience, it might be interesting to allow passing a parameter to
the template, which at publishing time would modify the data module and
be removed from the generated wikitext.
Let's take an example to make this a bit clearer: the French pair
"contributeur/contributrice". In both articles, I would like the
definition to be generated by transclusion with something like
{{definition|vocable=contributrice|lang=French|gloss=contributor}}. Note
that this template might, by default, take into account the name of the
calling page, thus avoiding the "vocable" parameter. Also, the lang
would be required in "contributrice", as this is a vocable which exists
in at least French and Italian. But it would not be required in
"contributeur", nor "contributore". Finally, gloss is a string whose
purpose is to distinguish a given term in case of homonymy. When no
homonym exists, it might be skipped. So in "contributeur", one might
simply use {{definition}}, but in "contributrice", one should at least
use {{definition|lang=French}}. Now, that's for the purely read-only
side of the data.
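To make the lookup rules above concrete, here is a minimal Python sketch
of how {{definition}} could resolve its parameters. It is only an
illustration: a real implementation would presumably be a Lua module
under Scribunto, and the names STORE, resolve_definition and
AmbiguousEntry, as well as the data layout and the Italian placeholder
text, are assumptions of mine, not anything that exists today.

```python
# Hypothetical stand-in for the per-vocable data modules: vocable -> entries.
STORE = {
    "contributeur": [
        {"lang": "French", "gloss": "contributor",
         "definition": "A person who contributes"},
    ],
    "contributrice": [
        {"lang": "French", "gloss": "contributor",
         "definition": "A person who contributes"},
        {"lang": "Italian", "gloss": "contributor",
         "definition": "(placeholder for the Italian definition)"},
    ],
}


class AmbiguousEntry(LookupError):
    """Raised when lang/gloss are not enough to pick a single entry."""


def resolve_definition(page_name, vocable=None, lang=None, gloss=None,
                       store=STORE):
    """Return the definition selected by the template parameters.

    vocable defaults to the calling page name; lang and gloss only
    narrow the candidates when they are given.
    """
    entries = store.get(vocable or page_name, [])
    matches = [
        entry for entry in entries
        if (lang is None or entry["lang"] == lang)
        and (gloss is None or entry["gloss"] == gloss)
    ]
    if not matches:
        raise LookupError("no entry matches the given parameters")
    if len(matches) > 1:
        raise AmbiguousEntry("several entries match; add lang or gloss")
    return matches[0]["definition"]
```

With this sketch, resolve_definition("contributeur") is enough on its
own, while resolve_definition("contributrice") raises AmbiguousEntry
until lang="French" is passed, which mirrors the two cases described
above.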
On the backend side, my idea would be to store this data, at least for
now, in a Scribunto data module. So for example in
"Module:Vocable/contributrice", one might store all descriptive data
about this vocable. I haven't thought yet about the exact structure of
what would be stored in this kind of module, but the idea is that the
various templates such as *definition*, *example*, and so on would serve
as an interface to these modules, so most contributors would not need to
care about this structure.
So, precisely, in the case of {{definition}}, one should be able to edit
the wikitext of the "contributeur" article and write something like
{{definition|value=A person who contributes}}. On publish, it would
store the given value in the appropriate module and change the wikicode
so that it only retains {{definition}}. Also, if someone wrote
{{definition|lang=French|value=Someone who [[contribute]]s}}, then the
same module entry should be changed and the resulting wikicode should
replace the template invocation with {{definition|lang=French}}. The
same parameter conservation should apply to the gloss parameter.
Of course, from a visual editor point of view, all that should be even
easier, with the "value" parameter being mandatory and always filled
with the matching module value.
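The publish-time behaviour I have in mind can be sketched roughly as
follows, again in Python and purely as an illustration. The function
name publish, the regular expressions and the (vocable, lang, gloss)
key used for the store are all hypothetical, and I am not claiming that
MediaWiki currently offers a place to run such a step. The sketch also
ignores nested templates, which a real parser would have to handle.

```python
import re

# Matches {{definition}} or {{definition|...}} without nested templates.
TEMPLATE = re.compile(r"\{\{definition((?:\|[^{}]*)?)\}\}")
# A pipe that separates parameters, i.e. one not inside a [[...]] link.
PARAM_SEP = re.compile(r"\|(?![^\[\]]*\]\])")


def publish(page_name, wikitext, store):
    """Move every value= parameter into store and strip it from the text.

    store is a plain dict standing in for the data module; it is keyed
    by a hypothetical (vocable, lang, gloss) tuple.
    """
    def rewrite(match):
        params = {}
        # The first split item is the empty string before the leading pipe.
        for part in PARAM_SEP.split(match.group(1))[1:]:
            name, _, val = part.partition("=")
            params[name.strip()] = val.strip()
        value = params.pop("value", None)
        if value is not None:
            key = (params.get("vocable", page_name),
                   params.get("lang"), params.get("gloss"))
            store[key] = value
        # Keep every other parameter (vocable, lang, gloss) in place.
        kept = ["definition"] + [f"{k}={v}" for k, v in params.items()]
        return "{{" + "|".join(kept) + "}}"

    return TEMPLATE.sub(rewrite, wikitext)
```

For instance, publishing the "contributeur" page with
{{definition|lang=French|value=Someone who [[contribute]]s}} would leave
{{definition|lang=French}} in the wikitext and record the value under
the key ("contributeur", "French", None) in this sketch.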
Well, at least that is my current goal. If you have other
suggestions, I would be glad to read them. In any case, I would also be
very interested to know whether what I just described is currently
possible. If it is, what documentation or existing modules should I look
at in order to achieve it? If it's not, what about adding software
support in order to make it possible?
Kind regards,
mathieu