Saluton ĉiuj kundisvolvantoj (hello, fellow developers),
I think the subject summarizes it all, so here are more details on what I'm trying to do and what I'm looking for.
# Context
You might skip this section if you are not interested in contextual verbiage. If you would like to react to anything stated in this section, please change the email subject to reflect that.
I'm currently thinking about ways to improve the factorization of knowledge stored in Wiktionary.
I'm taking a multi-pronged experimental approach. On the one hand, I just began a Wikiversity project https://fr.wikiversity.org/wiki/Recherche:Recueil_lexicologique_%C3%A0_l%E2%80%99usage_des_Wiktionnaires (in French) to establish a specification of how a DBMS should be structured to be useful for Wiktionaries. It mainly grew out of my view that the current data model https://www.mediawiki.org/wiki/Extension:WikibaseLexeme/Data_Model proposed for the Wikidata for Wiktionary project https://www.wikidata.org/wiki/Wikidata:Wiktionary does not fit the needs of Wiktionary contributors. I made some alternative proposals https://www.mediawiki.org/wiki/Extension_talk:WikibaseLexeme/Data_Model, and tried to gather initial feedback from the French Wiktionary https://fr.wiktionary.org/wiki/Wiktionnaire:Wikid%C3%A9mie/septembre_2017#Vers_la_conception_d.E2.80.99une_base_de_donn.C3.A9e_relationnelle_con.C3.A7u_pour_servir_de_support_aux_Wiktionnaires on this model too. That led me to create the Wikiversity research project, because it was pointed out to me that one should "specify extensively the needs before you model".
Now, on the other hand, I'm also trying to factorize some data within Wiktionary with the currently available tools. One driving topic for that is fixing the gender gap https://fr.wiktionary.org/wiki/Discussion_Projet:Parit%C3%A9_des_genres, and more broadly the inflection-form gap. That is, a feminine form will generally be summarized in a laconic "feminine form of *some-term*", rather than being treated as an entry of its own. That's all the more problematic in cases where a word only shares a subset of the relevant definitions depending on which gender (or inflection form) it applies to.
# What I'm trying to do
I am trying to factorize data which pertains to several inflection forms, so that each form can use it to build a stand-alone article about a term. The current approach tends to gather everything under a single lemma, although some statements only pertain to some specific forms.
So far I have experimented with transclusion of subpages to share definitions, examples and so on between inflection forms. From a reader's point of view, it works. But from an editing point of view, it's anything but fine.
What I think would be interesting is to store this data in a Scribunto data module (at least for now), and enable users to change it while editing a lexical entry article. In the visual editor, that might be through something like a template popup. Wikitext editors will probably be skilled enough to edit the relevant module, but for the sake of convenience, it might be interesting to allow passing a parameter to the template, which at publishing time would modify the data module and be removed from the generated wikitext.
Let's take an example to make this a bit clearer: the French pair "contributeur/contributrice". In both articles, I would like the definition to be generated by transclusion with something like {{definition|vocable=contributrice|lang=French|gloss=contributor}}. Note that this template might, by default, take into account the name of the calling page, thus making the "vocable" parameter unnecessary. Also, lang would be required in "contributrice", as this is a vocable which exists in at least French and Italian. But it would not be required in "contributeur", nor in "contributore". Finally, gloss is a string whose purpose is to distinguish a given term in case of homonymy. When no homonym exists, it might be skipped. So in "contributeur", one might simply use {{definition}}, but in "contributrice", one should at least use {{definition|lang=French}}. So much for the purely read-only side of the data.
On the backend side, my idea would be to store this data, at least for now, in a Scribunto data module. So for example in "Module:Vocable/contributrice", one might store all descriptive data about this vocable. I haven't yet thought about the exact structure of what would be stored in this kind of module, but the idea is that the various templates such as *definition*, *example*, and so on would serve as an interface to these modules, so most contributors would not need to care about this structure.
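To make the lookup rules concrete, here is a minimal sketch, written in JavaScript purely for illustration (the real thing would live in a Scribunto module). The data shape, the placeholder definitions and the names `vocables` and `resolveDefinition` are all assumptions: the message explicitly leaves the stored structure open.

```javascript
// Hypothetical shape for what "Module:Vocable/<name>" pages might hold.
// The definitions are placeholder strings, not real dictionary content.
const vocables = {
  contributrice: [
    { lang: 'French', gloss: 'contributor', definition: 'French placeholder definition' },
    { lang: 'Italian', gloss: 'contributor', definition: 'Italian placeholder definition' },
  ],
  contributeur: [
    { lang: 'French', gloss: 'contributor', definition: 'French placeholder definition' },
  ],
};

// Mimics {{definition|vocable=...|lang=...|gloss=...}}: `vocable` defaults to
// the calling page name, while `lang` and `gloss` only narrow the candidates.
function resolveDefinition(data, pageName, params = {}) {
  const vocable = params.vocable || pageName;
  const candidates = (data[vocable] || []).filter(
    (entry) =>
      (params.lang === undefined || entry.lang === params.lang) &&
      (params.gloss === undefined || entry.gloss === params.gloss)
  );
  if (candidates.length === 0) {
    throw new Error(`No entry found for "${vocable}"`);
  }
  if (candidates.length > 1) {
    throw new Error(`"${vocable}" is ambiguous: pass lang and/or gloss`);
  }
  return candidates[0].definition;
}
```

With this data, `resolveDefinition(vocables, 'contributeur')` succeeds without parameters, whereas `resolveDefinition(vocables, 'contributrice')` throws until `{ lang: 'French' }` is supplied, which matches the behaviour described for the two articles.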
So, precisely, in the case of {{definition}}, one should be able to edit the wikitext of the "contributeur" article and write something like {{definition|value=A person who contributes}}. On publish, it would store the given value in the appropriate module and change the wikicode so that it only retains {{definition}}. Also, if someone wrote {{definition|lang=French|value=Someone who [[contribute]]s}}, then the same module entry should be changed and the template invocation in the resulting wikicode should be replaced with {{definition|lang=French}}. The same parameter preservation should apply to the gloss parameter.
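The wikitext side of that publish step can be sketched as a pure function. This is only an illustration in JavaScript under simplifying assumptions: it handles a single, already isolated invocation, and the helper names `splitTopLevel` and `extractValue` are hypothetical. Storing the extracted value in the module is deliberately left out.

```javascript
// Splits a template invocation's inner text on top-level pipes only, so that
// pipes inside [[links]] or nested {{templates}} are left alone.
function splitTopLevel(inner) {
  const parts = [];
  let depth = 0;
  let current = '';
  for (let i = 0; i < inner.length; i++) {
    const pair = inner.slice(i, i + 2);
    if (pair === '[[' || pair === '{{') {
      depth++;
      current += pair;
      i++; // skip the second character of the opening pair
    } else if (pair === ']]' || pair === '}}') {
      depth--;
      current += pair;
      i++; // skip the second character of the closing pair
    } else if (inner[i] === '|' && depth === 0) {
      parts.push(current);
      current = '';
    } else {
      current += inner[i];
    }
  }
  parts.push(current);
  return parts;
}

// Takes one invocation such as "{{definition|lang=French|value=...}}" and
// returns the value to store plus the invocation without the value parameter.
function extractValue(invocation) {
  const inner = invocation.slice(2, -2);
  const [name, ...params] = splitTopLevel(inner);
  let value = null;
  const kept = params.filter((param) => {
    const match = /^\s*value\s*=([\s\S]*)$/.exec(param);
    if (match) {
      value = match[1].trim();
      return false; // drop the value parameter from the wikitext
    }
    return true; // keep lang, gloss and any other parameter
  });
  return { value, wikitext: `{{${[name, ...kept].join('|')}}}` };
}
```

For `{{definition|lang=French|value=Someone who [[contribute]]}}`, this yields the value `Someone who [[contribute]]` and the rewritten invocation `{{definition|lang=French}}`, so `lang` (and likewise `gloss`) is preserved.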
Of course, from a visual editor point of view, all that should be even easier, with the "value" parameter being mandatory and always pre-filled with the matching module value.
Well, at least that is my current goal. If you have other suggestions, I would be glad to read them. In any case, I would also be very interested to know whether what I just described is currently possible. If it is, what documentation or existing modules should I look at in order to achieve it? If it's not, what about adding software support to make it possible?
Kind regards, mathieu
Well, I have investigated a bit, and so far the only reference I found that describes saving/retrieving data from a Scribunto module is SemanticScribunto.
https://github.com/SemanticMediaWiki/SemanticScribunto https://upload.wikimedia.org/wikipedia/mediawiki/7/75/EMWCon_Spring_2017_-_I...
It's built on Semantic MediaWiki, which I don't know much about. It all seems interesting, although far larger than what I was looking for, at least to begin with. What I would like is a way to quickly prototype and discard trials. This extension seems fine, but it requires installing an extension, so it would be more difficult to put it on an existing wiki in order to get feedback on prototypes. Maybe the path of least resistance would be to install a MediaWiki instance on Toolforge…
On 17/09/2017 at 14:05, mathieu stumpf guntz wrote:
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
On 19/09/2017 at 00:38, mathieu stumpf guntz wrote:
> Well, I have investigated a bit, and so far the only reference I found that describes saving/retrieving data from a Scribunto module is SemanticScribunto.
Hmm, it was a bit late when I wrote that; of course *loading* data from a Scribunto module is no big deal. The real issue standing in my way is the ability to *save* data. As far as I know, a module currently has no way to save data. No doubt it's a good thing that modules can't make arbitrary changes to any page on the wiki. But having the ability to write a limited number of bytes to a single data module per script call, possibly with other safeguard limits, wouldn't be that risky, would it?
So if this is something already possible, then please let me know.
If it's not, please give me some feedback on the proposal to add such a function, and if I should document such a proposal elsewhere, please let me know.
Kind regards, mathieu
On Tue, Sep 19, 2017 at 2:48 AM, mathieu stumpf guntz <psychoslave@culture-libre.org> wrote:
> But having the ability to write a limited number of bytes to a single data module per script call, possibly with other safeguard limits, wouldn't be that risky, would it?
It would break T67258 https://phabricator.wikimedia.org/T67258. I also think it's probably a very bad idea to be trying to have page parses make edits to the wiki.
> If it's not, please give me some feedback on the proposal to add such a function, and if I should document such a proposal elsewhere, please let me know.
You're free to file a task in Phabricator, but it will be closed as Declined. There are too many potential issues there for far too little benefit.
Your initial idea of somehow hooking into the editor (whether that's the wikitext editor or VE) with JavaScript to allow humans to make edits to the data module while editing another page was much better.
On 19/09/2017 at 15:29, Brad Jorsch (Anomie) wrote:
> On Tue, Sep 19, 2017 at 2:48 AM, mathieu stumpf guntz <psychoslave@culture-libre.org> wrote:
>> But having the ability to write a limited number of bytes to a single data module per script call, possibly with other safeguard limits, wouldn't be that risky, would it?
> It would break T67258 https://phabricator.wikimedia.org/T67258. I also think it's probably a very bad idea to be trying to have page parses make edits to the wiki.
Well, actually, depending on what you mean by having page parses make edits to the wiki, I'm not sure that what I'm looking for falls under this umbrella.
What I would like is a way to do something like

    local data = mw.loadData( 'Module:Name/data/entry' )
    -- do some stuff with `data`
    mw.saveData( 'Module:Name/data/entry', data )

That's it.
>> If it's not, please give me some feedback on the proposal to add such a function, and if I should document such a proposal elsewhere, please let me know.
> You're free to file a task in Phabricator, but it will be closed as Declined. There are too many potential issues there for far too little benefit.
If you do think it's a horrible, awkward, awfully disgusting idea, I would be interested to know more technical details on what problems it might lead to (not the global result of nightmarish hell on earth that I'm obviously targeting).
> Your initial idea of somehow hooking into the editor (whether that's the wikitext editor or VE) with JavaScript to allow humans to make edits to the data module while editing another page was much better.
I hadn't even thought about JS, actually. For removing the updating parameter from the wikitext, I had in mind some in-place template substitution. Does JavaScript allow changing another page on the wiki, especially a data module, at some point when the user edits or saves an article?
On Tue, Sep 19, 2017 at 8:12 PM, mathieu stumpf guntz <psychoslave@culture-libre.org> wrote:
> Well, actually, depending on what you mean by having page parses make edits to the wiki, I'm not sure that what I'm looking for falls under this umbrella.
> What I would like is a way to do something like
>
>     local data = mw.loadData( 'Module:Name/data/entry' )
>     -- do some stuff with `data`
>     mw.saveData( 'Module:Name/data/entry', data )
>
> That's it.
And that, like all Scribunto modules, would run during the parse of the page.
> If you do think it's a horrible, awkward, awfully disgusting idea, I would be interested to know more technical details on what problems it might lead to (not the global result of nightmarish hell on earth that I'm obviously targeting).
Confusion for users when purging a page results in edits somewhere else. Misattribution of the resulting edits. Opportunities for vandals to misuse it. Performance issues. And T67258.
>> Your initial idea of somehow hooking into the editor (whether that's the wikitext editor or VE) with JavaScript to allow humans to make edits to the data module while editing another page was much better.
> I hadn't even thought about JS, actually. For removing the updating parameter from the wikitext, I had in mind some in-place template substitution. Does JavaScript allow changing another page on the wiki, especially a data module, at some point when the user edits or saves an article?
That's what your original message sounded like you were talking about. A JavaScript gadget of whatever sort could make edits by using the action API.
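As a rough illustration of that suggestion, a gadget could overwrite a data module through the action API's `action=edit` with a CSRF token. The sketch below assumes the gadget has loaded the `mediawiki.api` module and passes in `new mw.Api()`; the function names `buildEditParams` and `saveDataModule`, and the idea of sending the whole module source as `text`, are illustrative assumptions rather than an existing tool.

```javascript
// Builds the parameters of an action API `action=edit` request that
// replaces the content of a data module with new Lua source.
function buildEditParams(moduleTitle, luaSource, summary) {
  return {
    action: 'edit',
    title: moduleTitle,
    text: luaSource,
    summary: summary,
    format: 'json',
  };
}

// `api` is expected to behave like `new mw.Api()` inside a gadget. It is
// passed in as an argument so the function can be exercised outside a wiki.
function saveDataModule(api, moduleTitle, luaSource, summary) {
  return api.postWithToken('csrf', buildEditParams(moduleTitle, luaSource, summary));
}
```

In a gadget this might be called as `saveDataModule(new mw.Api(), 'Module:Vocable/contributeur', newSource, 'Update definition')`. Because the request is made with the editing user's own token, the resulting edit would be recorded under that user's account, like any other edit they make.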
On 20/09/2017 at 14:56, Brad Jorsch (Anomie) wrote:
> On Tue, Sep 19, 2017 at 8:12 PM, mathieu stumpf guntz <psychoslave@culture-libre.org> wrote:
>> Well, actually, depending on what you mean by having page parses make edits to the wiki, I'm not sure that what I'm looking for falls under this umbrella.
>> What I would like is a way to do something like
>>
>>     local data = mw.loadData( 'Module:Name/data/entry' )
>>     -- do some stuff with `data`
>>     mw.saveData( 'Module:Name/data/entry', data )
>>
>> That's it.
> And that, like all Scribunto modules, would run during the parse of the page.
OK. I probably lack knowledge about the page generation pipeline, so if there is information you think I should be aware of to understand the problems it might cause, please provide it; links are an appropriate answer as far as I'm concerned.
>> If you do think it's a horrible, awkward, awfully disgusting idea, I would be interested to know more technical details on what problems it might lead to (not the global result of nightmarish hell on earth that I'm obviously targeting).
> Confusion for users when purging a page results in edits somewhere else.
How would purging a page lead to such a result? In my proposal, templates with an updating parameter would never actually be recorded in the wikitext, but removed through an in-place template replacement.
> Misattribution of the resulting edits.
That sounds more like a point of specification that an implementation must respect than a technical impossibility, or maybe I'm missing something.
> Opportunities for vandals to misuse it.
Sure, but vandals can misuse anything, including bots, which in this regard offer far more latitude.
> Performance issues.
Well, it's hard to gauge from a mere specification, isn't it?
> And T67258.
That should be discussed on the ticket itself.
>>> Your initial idea of somehow hooking into the editor (whether that's the wikitext editor or VE) with JavaScript to allow humans to make edits to the data module while editing another page was much better.
>> I hadn't even thought about JS, actually. For removing the updating parameter from the wikitext, I had in mind some in-place template substitution. Does JavaScript allow changing another page on the wiki, especially a data module, at some point when the user edits or saves an article?
> That's what your original message sounded like you were talking about. A JavaScript gadget of whatever sort could make edits by using the action API.
OK, then that might be an avenue worth exploring for me. I guess https://www.mediawiki.org/wiki/Manual:Interface/JavaScript is a good starting point, but any documentation link is welcome.