Hello!
I noticed that at sr.wikipedia there is a "Variant" option under "Internationalization" in the preferences. How is that different from the 'sr', 'sr-ec' and 'sr-el' which are shown under the "Language" option (also under "Internationalization")?
I'm interested in this because there are some differences between "Brazilian Portuguese" ('pt-br') and "Portuguese of Portugal" ('pt') which usually cause trouble for the admins at the Portuguese projects, who need to warn users not to change the wording of texts from one variant to the other (this happens often, mainly in anonymous contributions), because some differences between the variants seem [at first glance] to be typos, and they want to "correct" them...
So, I would like to know if there is currently any feature which could help us avoid the problem of having a divided community of users ('pt' x 'pt-br') "fighting" with each other ad infinitum... (and to avoid proposals like that [1] of a new "Brazilian Wikipedia", which IMHO will not have any good result, and is not the best way of solving the problem...)
I found [http://strategy.wikimedia.org/w/index.php?title=Proposal_talk%3AA_Brazilian_... a comment] about the existence of "on-the-fly translation" for some languages (Chinese and Serbian), but I don't know how it works, or whether it solves or improves the situation.
And before this I was also thinking of using (a possibly enhanced version of) a procedure like this: considering that it is currently possible to show a system message using {{int:MESSAGE}} in the wikitext in a way that the result changes according to the user's language, would it be possible to create new messages in the "MediaWiki:" namespace just for defining language variants of words which usually appear in the content of the projects? For example, would it be possible to create "MediaWiki:WORD/pt-br" and "MediaWiki:WORD/pt", and use them (with {{int:WORD}}) instead of the actual word variant in wikitext? This isn't likely to be the best solution, but it could be a first step towards one...
Any thoughts on how the Portuguese community could improve the situation at the pt.* projects? (Is there any other list I should ask about this?)
Helder
[1] http://strategy.wikimedia.org/wiki/Proposal:A_Brazilian_Portuguese_Wikipedia
2009/9/9 Helder Geovane Gomes de Lima heldergeovane@gmail.com:
Hello!
I noticed that at sr.wikipedia there is a "Variant" option under "Internationalization" in the preferences. How is that different from the 'sr', 'sr-ec' and 'sr-el' which are shown under the "Language" option (also under "Internationalization")?
I'm interested in this because there are some differences between "Brazilian Portuguese" ('pt-br') and "Portuguese of Portugal" ('pt') which usually cause trouble for the admins at the Portuguese projects, who need to warn users not to change the wording of texts from one variant to the other (this happens often, mainly in anonymous contributions), because some differences between the variants seem [at first glance] to be typos, and they want to "correct" them...
sr-ec and sr-el refer to the Latin and Cyrillic variants of Serbian (not sure which is which), and AFAIK the software can convert everything, even article text, because the conversion rules are so simple that a computer can execute them. Basically, sr-ec and sr-el have the same text in the same language, but in different alphabets. (This is my understanding, which may be completely wrong; in that case, please correct me.)
The differences between pt and pt-br are more delicate than that, and the two can't be autoconverted by a computer, because of differences in spelling, word usage and grammar(?).
So, I would like to know if there is currently any feature which could help us avoid the problem of having a divided community of users ('pt' x 'pt-br') "fighting" with each other ad infinitum... (and to avoid proposals like that [1] of a new "Brazilian Wikipedia", which IMHO will not have any good result, and is not the best way of solving the problem...)
No. We already offer users the choice between having the interface in pt or pt-br (or any other language, really), but such a choice doesn't exist for the content.
I found [http://strategy.wikimedia.org/w/index.php?title=Proposal_talk%3AA_Brazilian_... a comment] about the existence of "on-the-fly translation" for some languages (Chinese and Serbian), but I don't know how it works, or whether it solves or improves the situation.
That's the alphabet variant thing I mentioned earlier. If the majority of the differences between pt and pt-br can be summed up with simple rules that a computer can handle, we might be able to work something out. However, that's usually not the case; I don't know Portuguese, but I do know that handling even simple differences between en-us and en-gb is too complex already: a system that would successfully convert 'realise' to 'realize' may also try to wrongfully convert 'disguise'.
And before this I was also thinking of using (a possibly enhanced version of) a procedure like this: considering that it is currently possible to show a system message using {{int:MESSAGE}} in the wikitext in a way that the result changes according to the user's language, would it be possible to create new messages in the "MediaWiki:" namespace just for defining language variants of words which usually appear in the content of the projects? For example, would it be possible to create "MediaWiki:WORD/pt-br" and "MediaWiki:WORD/pt", and use them (with {{int:WORD}}) instead of the actual word variant in wikitext? This isn't likely to be the best solution, but it could be a first step towards one...
This sounds like it could work, but only if the /langcode trick actually works (I don't know what that depends on) and if there's a relatively small set of words that makes a relatively big difference (otherwise it'd be more trouble than it's worth IMO; but that's up to the community).
Roan Kattouw (Catrope)
2009/9/9 Roan Kattouw roan.kattouw@gmail.com:
2009/9/9 Helder Geovane Gomes de Lima heldergeovane@gmail.com:
So, I would like to know if there is currently any feature which could help us avoid the problem of having a divided community of users ('pt' x 'pt-br') "fighting" with each other ad infinitum... (and to avoid proposals like that [1] of a new "Brazilian Wikipedia", which IMHO will not have any good result, and is not the best way of solving the problem...)
No. We already offer users the choice between having the interface in pt or pt-br (or any other language, really), but such a choice doesn't exist for the content.
This is a community issue. Having a single pt:wp is a win because there's more content in one place and it avoids local-POV bias, same as there's one en:wp rather than US-English and Commonwealth-English.
So you need a community rule.
The rule we have on en:wp is:
1. It doesn't matter.
2. Use the variant spoken in the location, if relevant.
3. Don't change articles from one to the other except per 2.
4. Try not to worry too much about it.
Number 4 is the important step ;-) It should be simple enough to let new users know the rule and "not to worry about which variant" :-)
- d.
Roan Kattouw wrote:
That's the alphabet variant thing I mentioned earlier. If the majority of the differences between pt and pt-br can be summed up with simple rules that a computer can handle, we might be able to work something out. However, that's usually not the case; I don't know Portuguese, but I do know that handling even simple differences between en-us and en-gb is too complex already: a system that would successfully convert 'realise' to 'realize' may also try to wrongfully convert 'disguise'.
I don't know why you're writing this nonsense, you obviously haven't looked at the code at all.
The language variant system that we have could easily convert between US and UK English. In fact it already does convert between a language pair with a far more complex relationship, that is Simplified and Traditional Chinese.
The language conversion system is very simple, it's just a table of translated pairs, where the longest match takes precedence. The translation table in one direction (e.g. UK -> US) can be different to the table in the other direction (US -> UK). You would not list "ize -> ise", you would list every word in the dictionary with an -ize ending that can be translated to -ise without controversy. The current software could handle 50k pairs or so without serious performance problems, and it could be extended and optimised to allow millions of pairs if there was a need for that.
It's possible to handle any pair of languages which are separated only by vocabulary, and transliteration or spelling. It's only differences in grammar, such as word order, that would give it trouble.
-- Tim Starling
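To illustrate the table-driven approach Tim describes, here is a toy PHP sketch (not the actual LanguageConverter code; the word list is invented for illustration). PHP's strtr() conveniently applies the same longest-match rule when given an array:

<?php
// One direction only (en-gb -> en-us); the reverse direction
// would be a separate, independent table.
$gbToUs = array(
    'realise'  => 'realize',
    'realised' => 'realized',
    'colour'   => 'color',
    // ...one entry per whole dictionary word: a bare 'ise' => 'ize'
    // rule would wrongly rewrite words like 'disguise'.
);

// strtr() with an array scans left to right, preferring the longest key.
echo strtr( 'I realised the colour was wrong, despite the disguise.', $gbToUs );
// Output: I realized the color was wrong, despite the disguise.
// Caveat: with no word-break handling, 'uncoloured' would also become
// 'uncolored'; that limitation comes up later in this thread.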
Nice! ;-)
Do you think tables like these http://pt.wiktionary.org/wiki/Wikcion%C3%A1rio:Vers%C3%B5es_da_língua_portuguesa/Tabela http://pt.wikipedia.org/wiki/Wikipedia:Vers%C3%B5es_da_língua_portuguesa/tabela could be a starting point for a similar conversion system for pt <-> pt-br?
Meanwhile, I was also trying to adapt the Template:LangSwitch from Wikimedia Commons (http://commons.wikimedia.org/wiki/Template:LangSwitch), in order to be able to use the template syntax like this: {{Language variations| pt = word 1| pt-br = word 2}}
For this, I've created two pages:
* MediaWiki:Lang, with 'pt'
* MediaWiki:Lang/pt-br, with 'pt-br'
and the template code is essentially:
{{#switch:{{int:Lang}}
|pt-br={{{pt-br|}}}
|pt
|#default={{{pt|}}}
}}
But I wasn't able to create a "default" parameter so that we could set which of the variants is shown by default to anonymous users. It would be good if we could use {{Language variations| default = pt-br | pt = word 1| pt-br = word 2}} to get:
(a) word 2, for anonymous users;
(b) word 1, for logged-in users who choose 'pt' in their preferences;
(c) word 2, for logged-in users who choose 'pt-br' in their preferences.
Option (a) would be necessary if we don't want to change an existing text from 'pt-br' to 'pt' (for anonymous users) just because we want logged-in users to be able to choose the "content variant".
Is there any way of detecting whether the reader is logged in, with something in the style of {{#if: <what?> | foo | bar}}? (The problem with {{int:Lang}} is that for anonymous users and for users who choose 'pt' the result is the same: 'pt', so I can't distinguish these two cases in the template...)
Anyway, I think it would be better to have some kind of automated conversion system, even if it doesn't convert all cases (at least for the words in the tables above it would be useful).
Thank you for all,
Helder
Helder Geovane Gomes de Lima wrote:
But I wasn't able to create a "default" parameter so that we could set which of the variants is shown by default to anonymous users. It would be good if we could use {{Language variations| default = pt-br | pt = word 1| pt-br = word 2}} to get:
(a) word 2, for anonymous users;
(b) word 1, for logged-in users who choose 'pt' in their preferences;
(c) word 2, for logged-in users who choose 'pt-br' in their preferences.
Option (a) would be necessary if we don't want to change an existing text from 'pt-br' to 'pt' (for anonymous users) just because we want logged-in users to be able to choose the "content variant".
There's no difference. Anonymous users get the default language. What you could do is have three "languages": pt (generic Portuguese, default), pt-pt and pt-br.
Is there any way of detecting whether the reader is logged in, with something in the style of {{#if: <what?> | foo | bar}}?
No.
On Wed, Sep 9, 2009 at 6:50 PM, Tim Starling tstarling@wikimedia.org wrote:
I don't know why you're writing this nonsense, you obviously haven't looked at the code at all.
This paragraph is unnecessary.
The language variant system that we have could easily convert between US and UK English. In fact it already does convert between a language pair with a far more complex relationship, that is Simplified and Traditional Chinese.
The language conversion system is very simple, it's just a table of translated pairs, where the longest match takes precedence. The translation table in one direction (e.g. UK -> US) can be different to the table in the other direction (US -> UK). You would not list "ize -> ise", you would list every word in the dictionary with an -ize ending that can be translated to -ise without controversy. The current software could handle 50k pairs or so without serious performance problems, and it could be extended and optimised to allow millions of pairs if there was a need for that.
It's possible to handle any pair of languages which are separated only by vocabulary, and transliteration or spelling. It's only differences in grammar, such as word order, that would give it trouble.
Is there any reason nobody's tried adding such support for us/uk English? It would resolve some long-standing tension on enwiki. Would anons have to be given one variant or the other, or would they get untransformed text or what? Does the variant transformation apply to the edit page as well?
On 9/10/09 10:06 AM, Aryeh Gregor wrote:
On Wed, Sep 9, 2009 at 6:50 PM, Tim Starlingtstarling@wikimedia.org wrote:
I don't know why you're writing this nonsense, you obviously haven't looked at the code at all.
This paragraph is unnecessary.
Seriously! Please read things aloud before clicking send. You will hopefully then be able to better detect when it's time to take a break, eat some fruit and take it down a notch.
The variant system seems poorly understood by most people (including me), which often tends to cause a feature (like this one, for instance) to be under-utilized...
Perhaps we need more information on what it intends to provide to the user. All I can find on Google about this topic are blurbs about configuration variables and lots of people confused as to what language variants even are...
Is there some awesome documentation somewhere I have yet to find?
- Trevor
On Thu, Sep 10, 2009 at 1:39 PM, Trevor Parscal tparscal@wikimedia.org wrote:
Is there some awesome documentation somewhere I have yet to find?
Nope, but there's a bug asking for documentation :)
https://bugzilla.wikimedia.org/show_bug.cgi?id=19044
I certainly agree that it's completely undocumented and thus not usable to many people. The vast majority of devs--myself included--don't even understand how it works, much less how to use it. Maybe if we had docs, it'd be more usable outside of the (very) small minority who do use and maintain it.
-Chad
2009/9/10 Trevor Parscal tparscal@wikimedia.org:
On 9/10/09 10:06 AM, Aryeh Gregor wrote:
On Wed, Sep 9, 2009 at 6:50 PM, Tim Starlingtstarling@wikimedia.org wrote:
I don't know why you're writing this nonsense, you obviously haven't looked at the code at all.
This paragraph is unnecessary.
Seriously! Please read things aloud before clicking send. You will hopefully then be able to better detect when it's time to take a break, eat some fruit and take it down a notch.
In Tim's defense: I had indeed not looked at the code at all, and what I wrote was incorrect, so what he wrote was completely true. I also mentioned that my understanding of the variant conversion system was limited, and that I might be completely wrong. Turns out I was, and Tim corrected me. It's true that he probably didn't use the most friendly tone in the world, but I've seen much worse, so I don't really care. Let's just drop this before it turns into a flame war; I'd like to keep those off wikitech-l.
The variant system seems poorly understood by most people (including me), which often tends to cause a feature (like this one, for instance) to be under-utilized...
Seems I'm not the only one who had a completely wrong idea about how variants work. We definitely need more documentation and fame for this system, so its potential doesn't go to waste.
Roan Kattouw (Catrope)
On Thu, Sep 10, 2009 at 6:44 PM, Roan Kattouw roan.kattouw@gmail.com wrote:
Seems I'm not the only one who had a completely wrong idea about how variants work. We definitely need more documentation and fame for this system, so its potential doesn't go to waste.
I theoretically knew that it was just a string-replace system, but it didn't occur to me that it would be useful for more than transliteration. It makes sense now that Tim pointed that out. How would it handle word breaks, though? It would just ignore them, so color -> colour also changes uncolored -> uncoloured? What about things like HTML id's or even attribute/property names (<span style="color:red">)? I'm sure I could dig through the code to find the answers to these, but actually I'm not even sure offhand where the code *is*.
Hello!
I think the code is here: http://svn.wikimedia.org/doc/LanguageConverter_8php-source.html#l00018 http://svn.wikimedia.org/doc/LanguageZh_8php-source.html#l00009
and a comment at http://svn.wikimedia.org/doc/LanguageConverter_8php-source.html#l00258 says:
/* we convert everything except:
   1. html markups (anything between < and >)
   2. html entities
   3. place holders created by the parser
*/
So, I don't think it will convert <span style="color:red">. But I'm not sure, because I'm still learning PHP...
By the way, I can't understand Chinese, but (after using an on-line translator) I think the page they have for documenting the system is this: http://zh.wikipedia.org/wiki/Help:%E4%B8%AD%E6%96%87%E7%BB%B4%E5%9F%BA%E7%99...
Helder
Aryeh Gregor wrote:
On Thu, Sep 10, 2009 at 6:44 PM, Roan Kattouw roan.kattouw@gmail.com wrote:
Seems I'm not the only one who had a completely wrong idea about how variants work. We definitely need more documentation and fame for this system, so its potential doesn't go to waste.
I theoretically knew that it was just a string-replace system, but it didn't occur to me that it would be useful for more than transliteration. It makes sense now that Tim pointed that out. How would it handle word breaks, though? It would just ignore them, so color -> colour also changes uncolored -> uncoloured?
Neither of the implementations so far has required any knowledge of word breaks, and so it has not been implemented. In theory you could just list every larger word that contains a smaller transformed word, e.g.
humor -> humour
humorous -> humorous
But it might be better to just add a word segmentation feature.
What about things like HTML id's or even attribute/property names (<span style="color:red">)? I'm sure I could dig through the code to find the answers to these, but actually I'm not even sure offhand where the code *is*.
languages/LanguageConverter.php. There are some rather inelegant regexes to deal with cases like these; they seem to work. The converter operates at a near-HTML stage of the parser, so it's not too hard to skip attributes.
Note that the FastStringSearch extension is important for achieving good performance, especially in Chinese.
-- Tim Starling
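For what it's worth, here is a crude PHP sketch of that tag-skipping idea (a hypothetical helper, not the actual LanguageConverter regexes): split the near-HTML text on tags and apply the replacement table only to the text segments, so attribute values such as style="color:red" are never touched.

<?php
// Convert only text outside of HTML tags, leaving tag names and
// attributes (e.g. style="color:red") untouched.
function convertOutsideTags( $html, array $table ) {
    // Capture the tags so they survive the split.
    $parts = preg_split( '/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE );
    foreach ( $parts as $i => $part ) {
        if ( $part === '' || $part[0] === '<' ) {
            continue; // markup: leave as-is
        }
        $parts[$i] = strtr( $part, $table ); // text: apply the pair table
    }
    return implode( '', $parts );
}

echo convertOutsideTags( '<span style="color:red">colour</span>',
    array( 'colour' => 'color' ) );
// Output: <span style="color:red">color</span>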
The differences between the UK and American varieties of English are not limited just to spelling and vocabulary.
Ariel
On Thu, Sep 10, 2009 at 2:23 PM, Ariel T. Glenn ariel@wikimedia.org wrote:
The differences between the UK and American varieties of English are not limited just to spelling and vocabulary.
Those account for the large majority of the more noticeable differences, however.
2009/9/10 Aryeh Gregor Simetrical+wikilist@gmail.com:
On Thu, Sep 10, 2009 at 2:23 PM, Ariel T. Glenn ariel@wikimedia.org wrote:
The differences between the UK and American varieties of English are not limited just to spelling and vocabulary.
Those account for the large majority of the more noticeable differences, however.
I think this is also the case for Portuguese ('pt' x 'pt-br'). So, even if the table doesn't solve every case, what it does solve is sufficiently good...
2009/9/10 Aryeh Gregor Simetrical+wikilist@gmail.com:
Is there any reason nobody's tried adding such support for us/uk English? It would resolve some long-standing tension on enwiki. Would anons have to be given one variant or the other, or would they get untransformed text or what? Does the variant transformation apply to the edit page as well?
I have the same questions...
Helder
It might be possible to make it apply to the edit page as well, but in zh.wp, sr.wp, and kk.wp currently it does not. I'm guessing (could be wrong) that it would eat up a lot more resources.
Mark
skype: node.ue
Ariel T. Glenn wrote:
The differences between the UK and American varieties of English are not limited just to spelling and vocabulary.
Note that the -{...}- structure is available in wikitext to translate article-specific fragments of text, so you can also translate worldview:
A popular game played with a bat and ball is -{en-gb:Cricket; en-us:Baseball}-.
-- Tim Starling
Tim Starling wrote:
Ariel T. Glenn wrote:
The differences between the UK and American varieties of English are not limited just to spelling and vocabulary.
Note that the -{...}- structure is available in wikitext to translate article-specific fragments of text, so you can also translate worldview:
A popular game played with a bat and ball is -{en-gb:Cricket; en-us:Baseball}-.
That reminds me... some time ago, someone proposed to enable LanguageConverter on Commons (but without any automatic conversion, presumably) and to (ab?)use it to replace the existing autotranslation hacks based on {{int:lang}}. Would that be in any sense feasible?
There would presumably be two major use cases: the easy one, which I do believe the converter should handle just fine, would be to replace the current http://commons.wikimedia.org/wiki/Template:LangSwitch, generally used to autotranslate short phrases, with syntax like:
-{de:Eigene Arbeit; en:Own work; fi:Oma teos; fr:Travail personnel; etc.}-
(See http://commons.wikimedia.org/wiki/Template:Own for the source of the example.)
The not-so-simple case would be replacing http://commons.wikimedia.org/wiki/Template:Autotranslate, which is used to translate entire templates, usually (though by no means necessarily) combined with a long list of links to the various translations so that users can easily browse them if the automatically chosen version is no good or something. A naive implementation of that would look something like:
-{af: {{GFDL/af}}; als: {{GFDL/als}}; an: {{GFDL/an}}; ar: {{GFDL/ar}}; ast: {{GFDL/ast}}; be: {{GFDL/be}}; be-tarask: {{GFDL/be-tarask}}; <!-- ...and so on for about 70 more languages -->}-
(Source: http://commons.wikimedia.org/wiki/Template:GFDL.)
I'd like to hope that there might be some better way of doing it, though, even if I can't offhand think of what it might look like.
Still, would something like that work, even in theory, and would it be an improvement over the way these things are currently done (which is hacky enough itself)?
Ilmari Karonen wrote:
A popular game played with a bat and ball is -{en-gb:Cricket; en-us:Baseball}-.
That reminds me... some time ago, someone proposed to enable LanguageConverter on Commons (but without any automatic conversion, presumably) and to (ab?)use it to replace the existing autotranslation hacks based on {{int:lang}}. Would that be in any sense feasible?
There would presumably be two major use cases: the easy one, which I do believe the converter should handle just fine, would be to replace the current http://commons.wikimedia.org/wiki/Template:LangSwitch, generally used to autotranslate short phrases, with syntax like:
-{de:Eigene Arbeit; en:Own work; fi:Oma teos; fr:Travail personnel; etc.}-
(See http://commons.wikimedia.org/wiki/Template:Own for the source of the example.)
I don't think it's really a saner syntax.
The not-so-simple case would be replacing http://commons.wikimedia.org/wiki/Template:Autotranslate, which is used to translate entire templates, usually (though by no means necessarily) combined with a long list of links to the various translations so that users can easily browse them if the automatically chosen version is no good or something. A naive implementation of that would look something like:
-{af: {{GFDL/af}}; als: {{GFDL/als}}; an: {{GFDL/an}}; ar: {{GFDL/ar}}; ast: {{GFDL/ast}}; be: {{GFDL/be}}; be-tarask: {{GFDL/be-tarask}}; <!-- ...and so on for about 70 more languages -->}-
(Source: http://commons.wikimedia.org/wiki/Template:GFDL.)
I'd like to hope that there might be some better way of doing it, though, even if I can't offhand think of what it might look like.
Still, would something like that work, even in theory, and would it be an improvement over the way these things are currently done (which is hacky enough itself)?
I don't think so. It's terribly ugly. You would want something like {{GFDL/{{ENABLEDVARIANT}}}} (no, such a magic word doesn't seem to exist yet). But you would still have the problem of having people *choose* them. You wouldn't put dozens of tabs up to choose the variant. Which in fact isn't a variant at all.
These are languages; the variant system is not appropriate for them.
"Platonides" Platonides@gmail.com wrote in message news:h8eg97$eh0$1@ger.gmane.org...
Ilmari Karonen wrote:
I don't think it's really a saner syntax.
That's not the point. It's a *safer* syntax. Using {{int:lang}} breaks cache integrity: if you put {{SomeTemplate/{{int:lang}}}} (or equally some {{USERLANGUAGE}} magic word if it existed) on a page and save it, the link that's added to the templatelinks table is the template subpage the *editor* gets, but a viewer with a different language can get a different page. I assume (before Tim shouts at me too, no I haven't read the code either) that "The converter operates at a near-HTML stage of the parser" implies that it's *way* after template expansion... are the "-{...}-" strings stripmarked-out at that stage? Essentially, the key is that they can't affect the transclusion structure of the rest of the page.
-{af: {{GFDL/af}}; als: {{GFDL/als}}; an: {{GFDL/an}}; ar: {{GFDL/ar}}; ast: {{GFDL/ast}}; be: {{GFDL/be}}; be-tarask: {{GFDL/be-tarask}}; <!-- ...and so on for about 70 more languages -->}-
The above raises the question, of course: would this switch actually work? And if it does, how does it affect the cache and link tables? More investigation needed, methinks....
I think the obstructions to implementing en-gb/en-us conversion on enwiki would be social rather than technical. They've just gone through six months of hell over date autoformatting, culminating in a decision to scrap the system entirely and hence not support users being able to choose between American and International *date formats*. If they don't even want to support those, getting a full language conversion supported *would* be like herding cats...
--HM
Happy-melon wrote:
Ilmari Karonen wrote:
-{af: {{GFDL/af}}; als: {{GFDL/als}}; an: {{GFDL/an}}; ar: {{GFDL/ar}}; ast: {{GFDL/ast}}; be: {{GFDL/be}}; be-tarask: {{GFDL/be-tarask}}; <!-- ...and so on for about 70 more languages -->}-
The above begs the question, of course, would this switch actually work? And if it does, how does it affect the cache and linktables? More investigation needed, methinks....
Indeed, that was what I was wondering about too. Without actually trying it out, my guess would be that it would indeed work, but at a cost: it'd first parse all the 75 or so subtemplates and then throw all but one of them away.
Of course, that's what one would have to do anyway, to get full link table consistency.
It does seem to me that it might not be *that* inefficient, *if* the page were somehow cached in its pre-languageconverted state but after the expensive template parsing has been done. Does such a cache actually exist, or, if not, could one be added with reasonable ease?
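As a sketch of that two-stage idea (all names below are hypothetical stand-ins, not existing MediaWiki APIs): cache the page once after the expensive template expansion, then run the cheap variant conversion on every request.

<?php
// Hypothetical stand-ins for the parser and the variant converter.
function parseToNearHtml( $title ) {
    return "<p>expanded content of $title</p>"; // pretend this is expensive
}
function convertToVariant( $html, $variant ) {
    // A real implementation would pick a conversion table based on $variant.
    return strtr( $html, array( 'colour' => 'color' ) );
}

// Two-stage flow: cache after the expensive parse, convert per view.
function getPageHtml( array &$cache, $title, $variant ) {
    if ( !isset( $cache[$title] ) ) {
        $cache[$title] = parseToNearHtml( $title ); // once per edit
    }
    return convertToVariant( $cache[$title], $variant ); // once per view
}

$cache = array();
echo getPageHtml( $cache, 'Example', 'en-us' );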
Hoi,
If we are to do this for English, with digitise and digitize, we have to keep in mind that this ONLY deals with differences between GB and US English. There are other varieties of English that may make this more complicated.
Given the size of the GB and US populations it would split the cache and effectively double the cache size. There are more languages where this would provide serious benefits. I can easily imagine that the German, Spanish and Portuguese communities would be interested. Then there are many of the "other" languages that may have an interest. The first order of business is not whether it can be done, but who will implement and maintain the language part of this.
Thanks, GerardM
Hoi,
What you see is that I care about performance. This is not the kind of feature that should be implemented without careful consideration of the consequences. So yes, performance may override language support for some time. When it is clear that this feature is to be implemented, it will take careful planning and the implementation of extra infrastructure before it goes live. There is an end to the time that performance remains a valid argument against this kind of language support.
In summary YES, but...
Thanks, GerardM
2009/9/12 Domas Mituzas midom.lists@gmail.com
Given the size of the GB and US populations it would split the cache and effectively double the cache size.
Did I just see you putting performance ahead of language support? Just checkin'
Domas
Hi! A user of my extension reported an error that occurs when calling a registered parser hook under MediaWiki 1.15.1 on PHP 5.3:
function qp_RenderPoll( $input, $argv, &$parser ) { ... }
Parameter 3 expected to be a reference, value given on line 3243 in Parser.php
There are two fixes proposed: http://www.mediawiki.org/wiki/Extension:AccessControl#See_also
A patch to Parser.php: I've checked revision 55629 and don't see the patch integrated there.
A patch to the extension: http://www.mediawiki.org/wiki/Extension_talk:Group_Based_Access_Control#Prob...
For me it's easy to release a new version; the patch is one letter. But I've studied other extensions, and for example the very professional Semantic MediaWiki also declares the third parameter of its tag hook method by reference:
static public function doAskHook( $querytext, $params, &$parser ) { ... }
That's maybe not the latest version - v1.4.1 - but not too old, though. I will check the latest SMW, too.
What should I do: wait for the Parser.php patch, or use the supposedly less efficient passing of the large Parser object by value?
Dmitriy
On 9/12/09 7:06 AM, Dmitriy Sintsov wrote:
For me it's easy to release a new version; the patch is one letter. But I've studied other extensions, and for example the very professional Semantic MediaWiki also declares the third parameter of its tag hook method by reference: static public function doAskHook( $querytext, $params, &$parser ) {
Most of those are old leftovers from PHP 4, which would copy objects instead of using sane value-reference semantics. These are being cleaned up in dev work for 1.16 as people are more actively testing with PHP 5.3.0.
References for hook parameters should *only* be used for out-parameters where the hook needs to be able to return a new value to the caller, *not* for simply passing objects.
What should I do: wait for the Parser.php patch, or use the supposedly less efficient passing of the large Parser object by value?
Using a PHP reference here is actually less efficient and more error-prone.
Passing objects "by value" in PHP 5 works the same as it does in Java, Python, etc -- that is, you're actually passing around a reference to the object... but if you were to, say, assign a different object to the variable in your function, *that* change would not propagate back to the caller, as you're passing the reference by value.
Yeah I know, it's confusing. ;) Just stay away from references unless you're realllllly sure you need em. ;)
-- brion
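A tiny self-contained demo of the PHP 5 semantics Brion describes (class and function names are made up for illustration): mutating a passed object is visible to the caller even without &, while reassigning the parameter is not.

<?php
class Counter {
    public $n = 0;
}

// No '&' needed: $c receives a copy of the object *handle*.
function bump( $c ) {
    $c->n++;            // mutation: visible to the caller
    $c = new Counter(); // reassignment: NOT visible to the caller
    $c->n = 100;
}

$counter = new Counter();
bump( $counter );
echo $counter->n; // prints 1, not 100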
Hi! I've read that usability is important for MediaWiki. Why not integrate wikitext syntax highlighting then? That would greatly improve editing of pages. There is Extension:WikEd, which has most of the work implemented already: http://www.mediawiki.org/wiki/File:WikEd_screenshot.png I know that it is possible to install the extension separately, but I remember that important extensions are sometimes integrated into MediaWiki. There's also FCKeditor, but it cannot perform diffs, and visual editing is not always desirable (non-technical users like it, but I personally prefer to edit wikitext).
It just strikes me every time I edit PHP source with highlighting, then edit wikitext without highlighting :-(
Dmitriy
2009/9/15 Dmitriy Sintsov questpc@rambler.ru:
Hi! I've read that usability is important for MediaWiki. Why not integrate wikitext syntax highlighting then?
We're planning to do exactly that in our third release (Citron). Right now, we're working on bugfixing and deploying our second release (Babaco).
That would greatly improve editing of pages. There is Extension:WikEd, which has most of the work implemented already: http://www.mediawiki.org/wiki/File:WikEd_screenshot.png
Yeah, we knew about wikEd, and we'll definitely be looking at it.
I know that it is possible to install the extension separately, but I remember that important extensions are sometimes integrated into MediaWiki. There's also FCKeditor, but it cannot perform diffs, and visual editing is not always desirable (non-technical users like it, but I personally prefer to edit wikitext).
There's still quite a few issues with FCKeditor, and as far as I know it's been decided that the usability project is not gonna cover WYSIWYG; I'm not entirely sure of the official stance here, you'd have to ask Naoko.
Roan Kattouw (Catrope)
* Roan Kattouw roan.kattouw@gmail.com [Tue, 15 Sep 2009 13:34:07 +0200]:
We're planning to do exactly that in our third release (Citron). Right now, we're working on bugfixing and deploying our second release (Babaco).
What do these codenames mean? Is Citron v1.16 and Babaco v1.17, or is it something else? Will these improvements be announced when available?
Yeah, we knew about wikEd, and we'll definitely be looking at it.
Thanks!
There's still quite a few issues with FCKeditor, and as far as I know it's been decided that the usability project is not gonna cover WYSIWYG; I'm not entirely sure of the official stance here, you'd have to ask Naoko.
I am not really a supporter of WYSIWYG either. Source markup is better, but with syntax highlighting :-)
Dmitriy
2009/9/15 Dmitriy Sintsov questpc@rambler.ru:
* Roan Kattouw roan.kattouw@gmail.com [Tue, 15 Sep 2009 13:34:07 +0200]:
We're planning to do exactly that in our third release (Citron). Right now, we're working on bugfixing and deploying our second release (Babaco).
What do these codenames mean? Is Citron v1.16 and Babaco v1.17, or is it something else?
No. Acai, Babaco and Citron are names used by the usability initiative, and these "releases" aren't related to MediaWiki versions or releases. Basically, each is a set of new features that we're deploying. Acai is already live, Babaco will hopefully go live in a few weeks, and Citron has yet to be developed. For details as to what's in each release, see http://usability.wikimedia.org/wiki/Releases .
Will these improvements be announced when available?
Absolutely.
Roan Kattouw (Catrope)
On Tue, 15 Sep 2009 13:34:07 +0200, Roan Kattouw roan.kattouw@gmail.com wrote:
There's still quite a few issues with FCKeditor, and as far as I know it's been decided that the usability project is not gonna cover WYSIWYG; I'm not entirely sure of the official stance here, you'd have to ask Naoko.
Full wysiwyg has lots of fun problems, mainly because the strategy of translating between wiki markup and HTML leads to a lot of edge cases which end up breaking things. Folks have been trying to tackle it for years and still aren't quite there; Wikia's current work with FCKeditor is pretty good but still has a lot of things that just don't work... conversion can be lossy, and the handling of templates, tables, extensions, etc. would lead to most pages having to be edited in source mode at exactly the times you least want to touch the raw markup.
Instead, we've got the Usability project focusing on things we think we can really deliver, providing most of what's actually useful about a wysiwyg environment:
* modernizing the look, feel, and interaction model (more live, less post-and-wait)
* getting the scariest parts of the markup out of your face
* providing humane user interfaces for tasks like finding links and categories, uploading/picking/sizing images, filling out templates, creating and editing tables
* context-aware editing (an editor that knows what section you're in, where this link points to and if it exists, what fields this template needs, etc)
-- brion
Not sure if this was considered:
* Categories (in the page, not in templates), language links, and magic words (NOTOC) are position-independent within a page
* They are also relatively easy to extract from the wikitext (regexp should do, after removing HTML comments and nowiki; I did something like that in JS a while ago, should be much easier if supported from PHP; see the sketch below this message)
* They clutter the text (even though they tend to be towards the end of the text), and might scare off newbies
* They can be represented in separate visual elements (toggles, lists, or some JS as I did with hotcat)
Any plans of separating these on edit, then re-attach them to the text on saving? It's low-hanging fruit IMHO.
Cheers, Magnus
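As a rough sketch of the extraction Magnus describes (the function name and regex are illustrative only; real code would also need to honour localized namespace aliases and <nowiki> sections, as he notes):

<?php
// Pull position-independent [[Category:...]] links out of wikitext,
// returning both the category names and the remaining text.
function extractCategories( $wikitext ) {
    // Drop HTML comments first so commented-out categories are ignored.
    $wikitext = preg_replace( '/<!--.*?-->/s', '', $wikitext );

    $pattern = '/\[\[\s*Category\s*:\s*([^\]|]+)(?:\|[^\]]*)?\]\]\n?/i';
    preg_match_all( $pattern, $wikitext, $matches );

    return array(
        'categories' => array_map( 'trim', $matches[1] ),
        'text'       => preg_replace( $pattern, '', $wikitext ),
    );
}

$result = extractCategories(
    "Some text.\n[[Category:Examples|sort key]]\n<!-- [[Category:Hidden]] -->" );
print_r( $result['categories'] ); // Array ( [0] => Examples )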
On 16/09/2009, at 10:01 AM, Magnus Manske wrote:
Any plans of separating these on edit, then re-attach them to the text on saving? It's low-hanging fruit IMHO.
I've added an experimental AJAX management interface for categories (like HotCat, but good). It requires the JS2 system to be enabled (which might be a while).
I'm not sure that it's strictly necessary to actually pull it all out of the wikitext, because it usually just sits at the bottom, out of everybody's way.
-- Andrew Garrett agarrett@wikimedia.org http://werdn.us/
On 16/09/2009, at 10:01 AM, Magnus Manske wrote:
Not sure if this was considered:
- Categories (in the page, not in templates),
CategorySelect handles this. It separates categories from the wikitext and lets you add categories without having to edit the whole page. Categories remain available to be edited on the edit page if you like. It also provides an easier interface for adding categories, and combines with category suggestions to optionally let you select one instead of having to type it. Things like sort parameters are still supported and there's the option to switch back to code view.
code: https://svn.wikia-code.com/wikia/trunk/extensions/wikia/CategorySelect/ help: http://help.wikia.com/wiki/Help:CategorySelect
Angela
On 16/09/2009, at 2:19 PM, Daniel Schwen wrote:
(like HotCat, but good).
Care to elaborate? I'm not too fond of this kind of innuendo without concrete points. FWIW HotCat is working quite well on Commons.
I'm sure it works fine, but:
* It reloads the page 3 times, actually stepping through an edit form, modifying wikitext with JS, and saving.
* It is 1100 lines of ugly code.
* It isn't localised.
* It breaks if translations or aliases of the Category namespace are used.
* It adds random text to the category display, instead of using nice icons.
* It doesn't prompt for an edit summary, nor does it provide any sort of confirmation.
By contrast, my newer version:
* Submits an edit through the API, and selectively reloads the category section with no user disruption other than a progress spinner.
* Is 300 lines of simple jQuery code.
* Is fully localisable.
* Has full support for translations and aliases of the category namespace.
* Uses icons for the actions.
* Prompts for confirmation and an edit summary before making an edit.
-- Andrew Garrett agarrett@wikimedia.org http://werdn.us/
On Wed, Sep 16, 2009 at 7:01 PM, Magnus Manske magnusmanske@googlemail.com wrote:
Any plans of separating these on edit, then re-attach them to the text on saving? It's low-hanging fruit IMHO.
Good question. IMHO, all position independent stuff (ie, metadata) would be better off saved separately, and edited separately. As a legacy solution, users could still type [[Category:Blah]] in the main edit box, but at save time, it would be moved to the metadata area.
References are another example of location-independent metadata that should be dealt with like that. But honestly, the Usability people seem to be doing a pretty good job and know what they're doing. Do they need more ideas from us?
Steve
On Thu, Sep 17, 2009 at 4:00 AM, Steve Bennett stevagewp@gmail.com wrote:
On Wed, Sep 16, 2009 at 7:01 PM, Magnus Manske magnusmanske@googlemail.com wrote:
Any plans of separating these on edit, then re-attach them to the text on saving? It's low-hanging fruit IMHO.
Good question. IMHO, all position independent stuff (ie, metadata) would be better off saved separately, and edited separately. As a legacy solution, users could still type [[Category:Blah]] in the main edit box, but at save time, it would be moved to the metadata area.
I agree that all this should be stored separately; however, that would mean a (major) rewrite of code (not least the dumping process) and thinking (e.g. the category table suddenly becomes the authoritative storage for that data, not the wikitext). I was thinking about a quickly implemented solution that could simulate these effects for the user without major code revisions.
References are another example of location-independent metadata that should be dealt with like that. But honestly, the Usability people seem to be doing a pretty good job and know what they're doing. Do they need more ideas from us?
Yes, they do a pretty good job; that doesn't mean they have all the answers (who has? except me, of course! :-) or all the ideas. At the very least, publicly mentioning our ideas can reinforce their decision to implement them.
Cheers, Magnus
2009/9/17 Steve Bennett stevagewp@gmail.com:
But honestly, the Usability people seem to be doing a pretty good job and know what they're doing. Do they need more ideas from us?
We /always/ welcome more ideas, we just can't guarantee we'll agree with you, or that we'll have the time or resources to implement it :)
Roan Kattouw (Catrope)
On 17/09/2009, at 4:00 AM, Steve Bennett wrote:
On Wed, Sep 16, 2009 at 7:01 PM, Magnus Manske magnusmanske@googlemail.com wrote:
Any plans of separating these on edit, then re-attach them to the text on saving? It's low-hanging fruit IMHO.
Good question. IMHO, all position independent stuff (ie, metadata) would be better off saved separately, and edited separately. As a legacy solution, users could still type [[Category:Blah]] in the main edit box, but at save time, it would be moved to the metadata area.
We have an excellent system for storing, versioning, backing up, restoring, editing, and retrieving raw wikitext in one blob.
Why do you want to scrap this, and embark on a several-month-long development effort to reimplement it with more complexity? There certainly isn't any significant technical benefit in it, besides the fact that it seems a little cleaner from the outside.
It's definitely much easier and much safer to store it all together, and then separate it at edit time. We can reliably parse this stuff out and present another interface for editing it.
-- Andrew Garrett agarrett@wikimedia.org http://werdn.us/
On Thu, Sep 17, 2009 at 8:39 AM, Andrew Garrett agarrett@wikimedia.org wrote:
Why do you want to scrap this, and embark on a several-month-long development effort to reimplement it with more complexity? There certainly isn't any significant technical benefit in it, besides the fact that it seems a little cleaner from the outside.
If, for instance, all category links were stored only in categorylinks, we wouldn't have the problem of denormalization. categorylinks isn't fully reliable right now because it duplicates information from the page text, and might fall out of sync. Storing things in a single blob instead of broken into tables -- or duplicating data in multiple places -- is a violation of the relational model and tends to hurt correctness, performance, or both.
On Wed, Sep 16, 2009 at 11:00 PM, Steve Bennett stevagewp@gmail.com wrote:
Good question. IMHO, all position independent stuff (ie, metadata) would be better off saved separately, and edited separately. As a legacy solution, users could still type [[Category:Blah]] in the main edit box, but at save time, it would be moved to the metadata area.
This is simply not feasible as long as we have templates. Templates can contain both position-dependent data and metadata, and the template reference needs to be stored with the data to preserve the position. This means the metadata inherently depends on the page text.
Aryeh Gregor wrote:
On Wed, Sep 16, 2009 at 11:00 PM, Steve Bennett stevagewp@gmail.com wrote:
Good question. IMHO, all position-independent stuff (i.e., metadata) would be better off saved separately, and edited separately. As a legacy solution, users could still type [[Category:Blah]] in the main edit box, but at save time, it would be moved to the metadata area.
This is simply not feasible as long as we have templates. Templates can contain both position-dependent data and metadata, and the template reference needs to be stored with the data to preserve the position. This means the metadata inherently depends on the page text.
And if it were changed, templates *should* be able to continue being a source of metadata. So I think that requirement blocks changing the way categories are stored.
On Fri, Sep 18, 2009 at 6:16 AM, Platonides Platonides@gmail.com wrote:
And if it were changed, templates *should* be able to continue being a source of metadata. So I think that requirement blocks changing the way categories are stored.
You're right, I completely overlooked that.
So, milder proposal: how do people feel about moving all (non-templated) metadata to the bottom of the page at save time? It's not a huge benefit by itself, but it allows that metadata to be edited separately without having to guess where it came from. And dear god do we need to get references separated out from the main text...
Somehow I suspect the most controversial thing about doing that would be the massive war that would erupt between those who want interwiki links first and those who want categories first...
Steve
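(To make the proposal concrete, here is a minimal sketch; the article text, category, and interwiki link are invented for the example.)

Before saving:

The '''Example River''' is a river in Exampleland.
[[Category:Rivers]]
It flows into the Example Sea.
[[de:Beispielfluss]]

After saving, with the position-independent metadata moved to the bottom:

The '''Example River''' is a river in Exampleland.
It flows into the Example Sea.

[[Category:Rivers]]
[[de:Beispielfluss]]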
On Thu, Sep 17, 2009 at 6:40 PM, Steve Bennett stevagewp@gmail.com wrote:
On Fri, Sep 18, 2009 at 6:16 AM, Platonides Platonides@gmail.com wrote:
And if it were changed, templates *should* be able to continue being a source of metadata. So I think that requirement blocks changing the way categories are stored.
You're right, I completely overlooked that.
So, milder proposal: how do people feel about moving all (non-templated) metadata to the bottom of the page at save time? It's not a huge benefit by itself, but it allows that metadata to be edited separately without having to guess where it came from. And dear god do we need to get references separated out from the main text...
Somehow I suspect the most controversial thing about doing that would be the massive war that would erupt between those who want interwiki links first and those who want categories first...
Steve
I could be wrong here, but wasn't Cite recently changed to allow for putting <ref>s inside <references>? If so, I think we could start encouraging people to put all of their refs down in the <references> block and just use <ref name="something"/> inline. /That/ would be a vast improvement, if nothing else.
-Chad
On Fri, Sep 18, 2009 at 8:44 AM, Chad innocentkiller@gmail.com wrote:
I could be wrong here, but wasn't Cite recently changed to allow for putting <ref>s inside <references>? If so, I think we could start encouraging people to put all of their refs down in the <references> block and just use <ref name="something"/> inline. /That/ would be a vast improvement, if nothing else.
Good question; it's certainly been raised quite a few times. I was astonished at how much opposition there was the most recent time.
Ah yes, here it is: http://en.wikipedia.org/wiki/Wikipedia_talk:Citing_sources/Archive_26#Result
So, apparently the code has been written but not implemented on en yet. It will allow this:
Some<ref name="foo" /> text.

...

<references>
<ref name="foo">Moo</ref>
</references>
Steve
On 17/09/2009, at 11:40 PM, Steve Bennett wrote:
So, milder proposal: how do people feel about moving all (non-templated) metadata to the bottom of the page at save time? It's not a huge benefit by itself, but it allows that metadata to be edited separately without having to guess where it came from.
I'm fine with moving the metadata down there on demand when we actually edit it separately and can't be bothered to put it back where it came from. I think if we do that, we'd avoid the need to separate it all on-save.
And dear god do we need to get references separated out from the main text...
A fix for this went live today. You can now put your <ref name=""> tags into the <references> tag, and then reference them by name.
Might be fun to run a bot or something to move them down there; it'd be a cheap usability improvement.
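(A sketch of the rewrite such a bot would perform; the claim text and ref name are invented:

Before:

Some claim.<ref name="smith">Smith (2008), ''Example Book'', p. 3.</ref> More text.

<references />

After:

Some claim.<ref name="smith" /> More text.

<references>
<ref name="smith">Smith (2008), ''Example Book'', p. 3.</ref>
</references>)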
Somehow I suspect the most controversial thing about doing that would be the massive war that would erupt between those who want interwiki links first and those who want categories first...
We'll argue about anything ;)
-- Andrew Garrett agarrett@wikimedia.org http://werdn.us/
On Fri, Sep 18, 2009 at 8:46 AM, Andrew Garrett agarrett@wikimedia.org wrote:
A fix for this went live today. You can now put your <ref name=""> tags into the <references> tag, and then reference them by name.
Oh, so it did. And it works!
http://en.wikipedia.org/w/index.php?title=Gippsland_Lakes_Discovery_Trail&am...
What's great about this kind of improvement is that it lets you see where the next possible improvements could be:
- Separate out infoboxes
- Separate out images
*shrug*
Anyway, I look forward to a new era of editing without massive cite templates in my face!
(And yes, a bot should go through and move them all...or at least ones where the definition > X characters)
Steve
On Thu, Sep 17, 2009 at 4:43 PM, Steve Bennett stevagewp@gmail.com wrote:
On Fri, Sep 18, 2009 at 8:46 AM, Andrew Garrett agarrett@wikimedia.org wrote:
A fix for this went live today. You can now put your <ref name=""> tags into the <references> tag, and then reference them by name.
Oh, so it did. And it works!
http://en.wikipedia.org/w/index.php?title=Gippsland_Lakes_Discovery_Trail&am...
What's great about this kind of improvement is that it lets you see where the next possible improvements could be:
- Separate out infoboxes
- Separate out images
*shrug*
Anyway, I look forward to a new era of editing without massive cite templates in my face!
(And yes, a bot should go through and move them all...or at least ones where the definition > X characters)
Before we get all excited about having a bot move all the references, it is worth noting that there is a rare use case that is known to fail when moved inside the references block (bug 20707).
Specifically, it consists of nested refs constructed with the #tag syntax. (Ordinarily, the parser and Cite make it impossible to place a <ref> inside another <ref>, but someone figured out that you can work around this by exploiting the out-of-order parser evaluation created by #tag to create nested refs.) These nested ref constructions universally fail when moved into the references block, and often do so in a not very informative way.
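(For the curious, the workaround looks roughly like this; the note text and name here are invented:

A claim.{{#tag:ref|An explanatory note.<ref>A citation supporting the note.</ref>|name=note1}}

Because #tag builds the outer ref only after the inner <ref> has already been processed, the nesting survives, whereas a literal <ref> inside another <ref> would be rejected.)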
In practice it is very rare to have a ref be placed inside the content of another ref, so the problem of nested refs will almost never come up, but it is something to be aware of if one is considering any mass effort to relocate refs inside the references block.
It is actually a rather tricky problem to solve. The right answer is probably to build in generic support for nested references, but that would require significant changes to Cite's data stack and probably a couple of modifications and/or new hooks in the parser itself. If I reach a point of having more free time, this is an issue I've been planning to look at.
-Robert Rohde
On Fri, Sep 18, 2009 at 12:41 PM, Robert Rohde rarohde@gmail.com wrote:
In practice it is very rare to have a ref be placed inside the content of another ref, so the problem of nested refs will almost never come up, but it is something to be aware of if one is considering any mass effort to relocate refs inside the references block.
Hmm, doesn't seem completely unbelievable...I won't contrive an example now, but I can imagine one.
But anyway can a bot detect these cases and just ignore them?
Steve
On Thu, Sep 17, 2009 at 8:27 PM, Steve Bennett stevagewp@gmail.com wrote:
On Fri, Sep 18, 2009 at 12:41 PM, Robert Rohde rarohde@gmail.com wrote:
In practice it is very rare to have a ref be placed inside the content of another ref, so the problem of nested refs will almost never come up, but it is something to be aware of if one is considering any mass effort to relocate refs inside the references block.
Hmm, doesn't seem completely unbelievable...I won't contrive an example now, but I can imagine one.
But anyway can a bot detect these cases and just ignore them?
Yes, the nested ref syntax is sufficiently weird that it shouldn't be too hard to train a bot to recognize and ignore those cases.
Of course, you'd also have to build consensus for any project to mass move refs.
-Robert Rohde
On Fri, Sep 18, 2009 at 2:12 PM, Robert Rohde rarohde@gmail.com wrote:
Of course, you'd also have to build consensus for any project to mass move refs.
Yeah, the strange thing is that, in theory, we should have consensus about how to do things. But without a bot enforcing style rules, the issue never comes to a head, so different communities within the encyclopaedia can each do things their own way and pretend that everything's fine. Then a bot comes along and everyone gets upset...with the bot.
:)
Steve
Steve Bennett stevagewp@gmail.com wrote:
[...] Anyway, I look forward to a new era of editing without massive cite templates in my face!
(And yes, a bot should go through and move them all...or at least ones where the definition > X characters)
Does this mean that we can switch off section editing as soon as it becomes useless?
Tim
Tim Landscheidt wrote:
Steve Bennett stevagewp@gmail.com wrote:
[...] Anyway, I look forward to a new era of editing without massive cite templates in my face!
(And yes, a bot should go through and move them all...or at least ones where the definition > X characters)
Does this mean that we can switch off section editing as soon as it becomes useless?
I don't see how it would become useless. In fact, I would like to see table editing or template editing done the same way (I haven't understood whether that will be done or not).
On 18/09/2009, at 2:40 PM, Tim Landscheidt wrote:
Steve Bennett stevagewp@gmail.com wrote:
[...] Anyway, I look forward to a new era of editing without massive cite templates in my face!
(And yes, a bot should go through and move them all...or at least ones where the definition > X characters)
Does this mean that we can switch off section editing as soon as it becomes useless?
Actually, the navigable TOC stuff being worked on by the usability folks looks like a fine replacement for section editing (not that I think it should be turned off, per se).
-- Andrew Garrett agarrett@wikimedia.org http://werdn.us/
On Fri, Sep 18, 2009 at 11:40 PM, Tim Landscheidt tim@tim-landscheidt.de wrote:
Does this mean that we can switch off section editing as soon as it becomes useless?
What you're saying is there's no point editing just a section if it relies on references defined elsewhere. But that's not true. It's useful in the following circumstances:
* You're not creating or editing any references
* You're only creating new, locally-defined references
* You're only editing locally-defined references
Section editing is unhelpful in this case:
* You're editing existing, remotely-defined references
The right solution is a second edit window, used just for the remotely-defined references.
Steve
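(Concretely, the problematic case looks like this; the section and ref name are invented:

== Geography ==
The trail is 18 km long.<ref name="parks2009" />

When editing just this section, the definition of "parks2009" is out of reach, since it lives inside the <references> block at the bottom of the page.)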
2009/9/17 Steve Bennett stevagewp@gmail.com:
So, milder proposal: how do people feel about moving all (non-templated) metadata to the bottom of the page at save time? It's not a huge benefit by itself, but it allows that metadata to be edited separately without having to guess where it came from. And dear god do we need to get references separated out from the main text...
I don't know about other wikis, but quite a few en:wp cleanup bots and semibots (notably AutoWikiBrowser) do this automatically when doing other stuff.
Somehow I suspect the most controversial thing about doing that would be the massive war that would erupt between those who want interwiki links first and those who want categories first...
%-D
If it's made a preference, don't forget "random" and "interleave"!
- d.
* brion@wikimedia.org [Wed, 16 Sep 2009 00:35:16 +0000]:
Full wysiwyg has lots of fun problems, mainly because the strategy of translating between wiki markup and HTML leads to a lot of edge cases which end up breaking things. Folks have been trying to tackle it for years and still aren't quite there; Wikia's current work with FCKeditor is pretty good but still has a lot of things that just don't work... conversion can be lossy, and the handling of templates, tables, extensions, etc. would lead to most pages having to be edited in source mode at exactly the times you least want to touch the raw markup.
Instead, we've got the Usability project focusing on things we think we can really deliver, providing most of what's actually useful about a wysiwyg environment:
- modernizing the look, feel, and interaction model (more live, less post-and-wait)
- getting the scariest parts of the markup out of your face
- providing humane user interfaces for tasks like finding links and categories, uploading/picking/sizing images, filling out templates, creating and editing tables
- context-aware editing (an editor that knows what section you're in, where this link points to and if it exists, what fields this template needs, etc)
I hope that wikitext syntax highlighting fits some of these tasks. It was mentioned in the third step on the Usability Initiative site.
Dmitriy
I've read that usability is important for MediaWiki. Why not integrate wikitext syntax highlighting, then? That would greatly improve editing of pages. There is Extension:WikEd, which has most of the work implemented already: http://www.mediawiki.org/wiki/File:WikEd_screenshot.png I know that it should be possible to install the extension separately, but I remember that important extensions are sometimes integrated into MediaWiki. There's also FCKeditor, but it cannot perform diffs, and visual editing is not always desirable (non-technical users like it, but I personally prefer to edit wikitext).
It just comes to me every time I edit PHP source with highlighting and then edit wikitext without highlighting :-(
Dmitriy
See: http://usability.wikimedia.org/wiki/Releases
This is listed as one of the features of the Citron release.
V/r,
Ryan Lane
* "Lane, Ryan" Ryan.Lane@ocean.navo.navy.mil [Tue, 15 Sep 2009 12:41:26 -0500]:
See: http://usability.wikimedia.org/wiki/Releases
This is listed as one of the features of the Citron release.
Thanks. I've figured out that it will be http://www.mediawiki.org/wiki/Extension:UsabilityInitiative and then probably moved to core. I had just confused the codename with a MediaWiki release number. (You know how developers of operating systems love to give their systems codenames; Fedora and Windows releases usually come with one.)
Dmitriy
Are these releases in any way connected to MediaWiki releases though?
I understand that all of that gets released on Wikimedia projects, but it'd be great to have the rest of the MW user base benefit from these as well (I have a personal interest here, as you can imagine ;)).
Thank you,
Sergey
-- Sergey Chernyshev http://www.sergeychernyshev.com/
On Tue, Sep 15, 2009 at 2:15 PM, Dmitriy Sintsov questpc@rambler.ru wrote:
- "Lane, Ryan" Ryan.Lane@ocean.navo.navy.mil [Tue, 15 Sep 2009
12:41:26 -0500]:
See: http://usability.wikimedia.org/wiki/Releases
This is listed as one of the features of the Citron release.
Thanks. I've figured out that it will be http://www.mediawiki.org/wiki/Extension:UsabilityInitiative and then probably moved to core. I had just confused the codename with a MediaWiki release number. (You know how developers of operating systems love to give their systems codenames; Fedora and Windows releases usually come with one.)
Dmitriy
Are these releases in any way connected to MediaWiki releases though?
I understand that all of that gets released on Wikimedia projects, but it'd be great to have the rest of the MW user base benefit from these as well (I have a personal interest here, as you can imagine ;)).
AFAIK the releases are not connected to MediaWiki releases, but instead are phases of the usability initiative.
V/r,
Ryan Lane
2009/9/15 Lane, Ryan Ryan.Lane@ocean.navo.navy.mil:
Are these releases in any way connected to MediaWiki releases though?
I understand that all of that gets released on Wikimedia projects, but it'd be great to have the rest of the MW user base benefit from these as well (I have a personal interest here, as you can imagine ;)).
AFAIK the releases are not connected to MediaWiki releases, but instead are phases of the usability initiative.
That is correct. However, all of the features we code are in the UsabilityInitiative extension, which is available from SVN just like any other extension, so other MW users will benefit (and, in fact, they can already do so). It's not compatible with MW 1.15, however. The Vector skin is in MW core, and will be part of the 1.16 release.
Roan Kattouw (Catrope)
On Tue, Sep 15, 2009 at 12:17 PM, Roan Kattouw roan.kattouw@gmail.com wrote:
2009/9/15 Lane, Ryan Ryan.Lane@ocean.navo.navy.mil:
Are these releases in any way connected to MediaWiki releases though?
I understand that all of that gets released on Wikimedia projects, but it'd be great to have the rest of the MW user base benefit from these as well (I have a personal interest here, as you can imagine ;)).
AFAIK the releases are not connected to MediaWiki releases, but instead are phases of the usability initiative.
That is correct. However, all of the features we code are in the UsabilityInitiative extension, which is available from SVN just like any other extension, so other MW users will benefit (and, in fact, they can already do so). It's not compatible with MW 1.15, however. The Vector skin is in MW core, and will be part of the 1.16 release.
Is there a road map somewhere for features you plan to include but haven't gotten to yet?
-Robert Rohde
2009/9/15 Robert Rohde rarohde@gmail.com:
Is there a road map somewhere for features you plan to include but haven't gotten to yet?
http://usability.wikimedia.org/wiki/Releases
Roan Kattouw (Catrope)
Doesn't having geographically located page caches reduce the doubling effect in any given location?
Squids located in the US should be caching more en-US than en-GB, and those in Europe should have more en-GB than en-US.
Jared
On 12 September 2009, Gerard Meijssen wrote (Re: [Wikitech-l] Language variants):
Hoi,
If we are to do this for English, with digitise and digitize, we have to keep in mind that this ONLY deals with differences between GB and US English. There are other varieties of English that may make this more complicated.
Given the size of the GB and US populations, it would split the cache and effectively double the cache size. There are more languages where this would provide serious benefits; I can easily imagine that the German, Spanish and Portuguese communities would be interested, and then there are many of the "other" languages that may have an interest. The first order of business is not whether it can be done, but who will implement and maintain the language part of this.
Thanks, GerardM
2009/9/12 Ilmari Karonen nospam@vyznev.net
Happy-melon wrote:
Ilmari Karonen wrote:
-{af: {{GFDL/af}}; als: {{GFDL/als}}; an: {{GFDL/an}}; ar: {{GFDL/ar}}; ast: {{GFDL/ast}}; be: {{GFDL/be}}; be-tarask: {{GFDL/be-tarask}}; <!-- ...and so on for about 70 more languages -->}-
The above begs the question, of course: would this switch actually work? And if it does, how does it affect the cache and link tables? More investigation needed, methinks....
Indeed, that was what I was wondering about too. Without actually trying it out, my guess would be that it would indeed work, but at a cost: it'd first parse all the 75 or so subtemplates and then throw all but one of them away. Of course, that's what one would have to do anyway, to get full link table consistency.
It does seem to me that it might not be *that* inefficient, *if* the page were somehow cached in its pre-language-converted state but after the expensive template parsing has been done. Does such a cache actually exist, or, if not, could one be added with reasonable ease?
-- Ilmari Karonen
Jared,
Doesn't having geographically located page caches reduce the doubling effect in any given location?
Squids located in the US should be caching more en-US than en-GB, and those in Europe should have more en-GB than en-US.
It doesn't happen with LRU: an object accessed 100 times over an hour will be cached the same way as an object accessed once. One would need an ARC*-kind of thing to handle it better, and ARC doesn't work well with COSS**, and ...
BR, Domas
* http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache ** http://devel.squid-cache.org/coss/coss-notes.txt
2009/9/9 Tim Starling tstarling@wikimedia.org
The language variant system that we have could easily convert between US and UK English. In fact it already does convert between a language pair with a far more complex relationship, that is Simplified and Traditional Chinese.
The language conversion system is very simple: it's just a table of translated pairs, where the longest match takes precedence. The translation table in one direction (e.g. UK -> US) can be different from the table in the other direction (US -> UK). You would not list "ize -> ise"; you would list every word in the dictionary with an -ize ending that can be translated to -ise without controversy. The current software could handle 50k pairs or so without serious performance problems, and it could be extended and optimised to allow millions of pairs if there was a need for that.
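(As a sketch, reusing the -{...}- switch syntax shown earlier in the thread; pt/pt-br variants are not actually enabled, so this is purely illustrative, with "utilizador"/"usuário" as an example pair:

-{pt: utilizador; pt-br: usuário}-

Readers whose variant is pt would see "utilizador"; readers on pt-br would see "usuário". A site-wide table of such pairs would make the conversion automatic for plain article text.)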
Hello again!
What would be needed in order to use pages like MediaWiki:Conversiontable/pt and MediaWiki:Conversiontable/pt-br at the Wikimedia projects in Portuguese for the conversion? Is it easy to have the language conversion enabled? Could we gradually create the conversion tables?
Sorry for so many questions...
Helder