Hi,
In MediaWiki, the <nowiki> tag was created for writing characters without having them interpreted as wiki syntax. An obvious and direct use case for this is writing help pages about editing wiki pages in wiki syntax, for example:
Writing <nowiki>'''words between three apostrophes'''</nowiki> will show them in bold font: '''words between three apostrophes'''.
Another related use case is demonstrating how templates work:
<nowiki>This sentence shows the template used at the end.{{Citation needed|reason=Reliable source needed for the whole sentence|date=October 2018}}</nowiki>
However, <nowiki> has less trivial use cases that are not quite the same as demonstrating wiki syntax. One such usage I'm aware of is linking a part of a long compound German word, for example "[[Schnee]]<nowiki />reichtum". It produces the desired effect, but it is a bit of a hack: the word "nowiki" doesn't have anything to do with dividing compound words. This use is quite common in the German Wikipedia because of the nature of the German language, which has a lot of long compound words.
Are there other languages where comparable hacks with <nowiki> exist, dictated by the nature of the language or by any local policies?
-- Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי http://aharoni.wordpress.com “We're living in pieces, I want to live in peace.” – T. Moore
Amir E. Aharoni amir.aharoni@mail.huji.ac.il wrote (Thu, Oct 4, 2018, 16:18):
<nowiki>This sentence shows the template used at the end.{{Citation needed|reason=Reliable source needed for the whole sentence|date=October 2018}}</nowiki>
However, <nowiki> has less trivial use cases, that are not quite the same as demonstrating wiki syntax. One such usage I'm aware of is linking a part of a long compound German word, for example "[[Schnee]]<nowiki />reichtum". It produces the desired effect, however it is a bit of a hack: the word "nowiki" doesn't have anything to do with dividing compound words. This use is quite common in the German Wikipedia because of the nature of the German language, which has a lot of long compound words.
We have a lot of them in Hungarian Wikipedia, and we have just decided to eradicate them, because this is a non-desired effect. :-)
Thanks. Can you please give some particular examples?
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Here is a list of removals. :-) https://hu.wikipedia.org/w/index.php?title=Speci%C3%A1lis:Szerkeszt%C5%91_k%...
https://hu.wikipedia.org/w/index.php?title=Grafikus_matroid&diff=prev&am... illustrates another use: separating - and { in the unusual case where this string is wanted and you *don't* want language converter markup, i.e. `-<nowiki/>{foo}-` is different from `-{foo}-`. You don't usually notice this because language conversion is disabled on many wikis, but it can cause problems if unbalanced syntax is used inside a template argument, like: `{{foo|-{bar}}`. Here you need to use `{{foo|-<nowiki/>{bar}}`, even if LanguageConverter is not enabled.
Amir -- in German, shouldn't they be tweaking the "linktrail" setting on dewiki, instead of using `<nowiki/>`? What are the cases where they *do* want the link to include the entire word? Can they be automatically distinguished? --scott
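To make the `-<nowiki/>{` workaround above concrete, here is a minimal Python sketch of a helper that inserts the separator wherever a hyphen is directly followed by a single opening brace. The function name and the rule "a single `{` but not `{{`" are assumptions made for illustration; this is not how the MediaWiki preprocessor decides, and it would also rewrite intentional `-{...}-` LanguageConverter markup, so it would only make sense on text where that markup is not wanted.

```python
import re

# A hyphen immediately followed by exactly one "{" (not "{{", which would
# start a template or parameter). This rule is a simplification assumed for
# this sketch, not the preprocessor's own logic.
HYPHEN_BRACE = re.compile(r"-(?=\{(?!\{))")


def guard_hyphen_brace(wikitext):
    """Insert <nowiki/> between "-" and "{" so the pair is not read as "-{"."""
    return HYPHEN_BRACE.sub("-<nowiki/>", wikitext)


if __name__ == "__main__":
    print(guard_hyphen_brace("{{foo|-{bar}}"))
```

Running the helper twice gives the same result as running it once, because after the first pass the hyphen is followed by `<` rather than `{`.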
I'm really not an expert on German. However, I have been slowly analyzing common trails in some other languages with the purpose of doing smarter link trailing some day. It's a very crazy and long-term pet project :) In theory, I could do it for German, too.
I’m not an expert on dewiki, but I assume they still want word-ending links for simple stuff like [[Gesetz]]e (plural), [[Finger]]s (genitive). I would guess these cases are still more common than the long compound words where the <nowiki/> trick is used.
On 04.10.2018 at 18:24, Lucas Werkmeister wrote:
I’m not an expert on dewiki, but I assume they still want word-ending links for simple stuff like [[Gesetz]]e (plural), [[Finger]]s (genitive). I would guess these cases are still more common than the long compound words where the <nowiki/> trick is used.
Well, linktrail is a regex, so it could be changed to only match up to a certain length :)
But which length to pick is tricky to decide, and the effect may be surprising or unpredictable. Limiting it to one letter is certainly not enough, and then what about [[Heimat]]losigkeit and such...
All silliness aside: while dewiki has many uses for only linking part of a compound, it has MANY MORE uses for linking all of it.
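As a rough illustration of the length-limited linktrail idea, the following Python sketch splits the text that follows `]]` into a trail and a remainder. The letter class `a-zäöüß` is assumed to approximate the German linktrail, and the choice to drop the trail entirely when the run of letters exceeds the cap (rather than linking only its first few letters) is one possible reading of the proposal, not something the thread settles.

```python
import re

# Run of lowercase letters right after "]]", then everything else.
# The letter class is an assumption approximating the German linktrail.
TRAIL = re.compile(r"^([a-zäöüß]*)(.*)$", re.DOTALL)


def split_trail(after_link, max_len=None):
    """Return (trail, rest) for the text that directly follows "]]".

    With max_len set, a run of letters longer than max_len is not treated
    as a trail at all (an assumption of this sketch).
    """
    trail, rest = TRAIL.match(after_link).groups()
    if max_len is not None and len(trail) > max_len:
        return "", after_link
    return trail, rest


if __name__ == "__main__":
    for text in ["e und", "en,", "losigkeit"]:
        print(repr(text), split_trail(text), split_trail(text, max_len=2))
```

With a cap of 2, "[[Gesetz]]e" and "[[Arbeit]]en" keep their trails, while "[[Heimat]]losigkeit" links only "Heimat", which is exactly the kind of behaviour change that might surprise editors who rely on the full-word link.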
Hey!
The syntax "[[Schnee]]<nowiki />reichtum" is quite common in the German community. There are not many other ways to achieve the same: <span /> or &shy; can be used instead.[1] The latter is often the better alternative, but an auto-replacement is not possible. For example, "[[Bund]]<nowiki />estag" must become "[[Bund]]es&shy;tag".
Not long ago <b/> was often used. This became a problem with the recent parser updates. All <b/> got replaced with <nowiki />, as far as I'm aware.
in German, shouldn't they be tweaking the "linktrail" setting on dewiki, instead of using `<nowiki/>`? What are cases where they *do* want the link to include the entire word?
The software feature exists because of English [[word ending]]s. The same exists in German ("viele [[Wiki]]s, viele [[Tisch]]e, viele [[Arbeit]]en"), but is overshadowed by the fact that German is a language with many compounds. From my experience, the fact that all linktrails, no matter how long, become part of the link is almost always a problem. It enlarges the click region, which is good, but surprises readers when they end up at an unexpected article. I guess it would actually be a net gain if the feature got turned off or tuned down on German wikis. For example, we could limit the length of the linktrail to 2 characters.
Is somebody interested in creating usage statistics for these linktrails in the German Wikipedia main namespace?
Best Thiemo
[1] https://de.wikipedia.org/wiki/Wikipedia:Verlinken#Verlinkung_von_Teilw%C3%B6...
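For the usage statistics asked about above, a first approximation could be a histogram of linktrail lengths over page source. The Python sketch below is only that: the link regex is a simplification (it ignores nowiki regions, comments, file links and templates), the letter class `a-zäöüß` is an assumed stand-in for the German linktrail, and how the wikitext is obtained (dump, API) is left open.

```python
import re
from collections import Counter

# [[target]] or [[target|label]] immediately followed by lowercase letters.
# Simplified on purpose; real wikitext has many more cases.
LINK_WITH_TRAIL = re.compile(r"\[\[[^\[\]|]+(?:\|[^\[\]]*)?\]\]([a-zäöüß]+)")


def trail_length_histogram(wikitext):
    """Count how often each linktrail length occurs in the given wikitext."""
    return Counter(len(m.group(1)) for m in LINK_WITH_TRAIL.finditer(wikitext))


if __name__ == "__main__":
    sample = (
        "viele [[Wiki]]s, viele [[Tisch]]e, viele [[Arbeit]]en, "
        "[[Heimat]]losigkeit und [[Schnee]]<nowiki />reichtum"
    )
    for length, count in sorted(trail_length_histogram(sample).items()):
        print(length, count)
```

A distribution like this would show how many existing links a cap of, say, 2 characters would change.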
I once wrote some very, very silly code that kind of works for Hebrew, and could possibly be adapted to other languages:
https://github.com/amire80/znavot
Pull requests welcome :)
The problem with all anti-linktrail practices is that they make search (or search and replace) in the source very hard. This applies both to bot owners and to humans who use the insource: regex search. I think a brand new approach would be necessary. For example, [[foo]]bar would behave as now and generate a linktrail, while [[foo]|]bar (a pipe character between the brackets) would not. Another idea would be ]]] (3 brackets), but it could conflict with embedded brackets such as an image description with linked text. Then all antisemantic workarounds for avoiding linktrailing would be unnecessary. We should always keep in mind that we are trying to approach a semantic wiki (although this is only partial, except for Wikibase).
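To show why searching is awkward today, here is a Python sketch of a single pattern that tries to catch the separators mentioned in this thread (`<nowiki />`, `<span />`, `<b/>`, a soft hyphen as entity or literal character, and `-{}-`). The list is an assumption and surely incomplete, and the pattern is written for Python's `re` module, not for the on-wiki insource: search, whose regex dialect differs.

```python
import re

# Separators seen in this thread; the list is illustrative, not exhaustive.
SEPARATORS = [
    r"<nowiki\s*/>",
    r"<span\s*/>",
    r"<b\s*/>",
    r"&shy;",
    "\u00ad",  # literal soft hyphen character
    r"-\{\}-",
]

# link, optional real trail, one separator, then the rest of the word
BROKEN_TRAIL = re.compile(
    r"\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]([a-zäöüß]*)("
    + "|".join(SEPARATORS)
    + r")([a-zäöüß]+)"
)


def find_broken_trails(wikitext):
    """Return (target, trail, separator, rest) for each separated link."""
    return [m.groups() for m in BROKEN_TRAIL.finditer(wikitext)]


if __name__ == "__main__":
    text = "[[Schnee]]<nowiki />reichtum, [[Bund]]es&shy;tag, [[Gesetz]]e"
    for hit in find_broken_trails(text):
        print(hit)
```

Every new workaround needs another alternative in the list, which is the maintenance burden a dedicated syntax would avoid.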
On 04.10.2018 at 18:58, Thiemo Kreuz wrote:
The syntax "[[Schnee]]<nowiki />reichtum" is quite common in the German community. There are not many other ways to achieve the same: <span /> or &shy; can be used instead.[1] The latter is often the better alternative, but an auto-replacement is not possible. For example, "[[Bund]]<nowiki />estag" must become "[[Bund]]es&shy;tag".
We could introduce new syntax for this, such as &nope; or even &nowiki;.
Or how about {{}} for "this is a syntactic element, but it does nothing"? But if that is mixed in with template expansion, it won't work if it expands to nothing, since template expansion happens before link parsing, right? For better or worse...
-{}- is already commonly used on LanguageConverter wikis for "this is a syntactic element but does nothing except separate a word". The preprocessor already understands it on all wikis, as well. (But then we explicitly serialize it to literally `-{}-` if your content language doesn't have variants defined.) --scott
On Thu, Oct 4, 2018 at 2:25 PM Daniel Kinzler dkinzler@wikimedia.org wrote:
Or how about {{}} for "this is a syntactic element, but it does nothing"?
Just make a template with a nice name ( {{~}} or something) and put the nowiki in that.
I'm personally a fan of <!>.
I came across it years ago--it's a null comment. Can't find the reference at the moment though.
-Chad
Found it :)
https://www.w3.org/MarkUp/SGML/sgml-lex/sgml-lex
Search for "empty comment declaration" :)
-Chad
Alas, no longer valid in XML or HTML5. (HTML5 will still parse it as an empty comment, but with an "incorrectly-opened-comment" error.)
-- Brian
... And, more importantly, its form doesn't say "separate the trail from the link". Just like <nowiki>, it only *happened* to do it (I tried on Wikipedia, and it doesn't do it now).
The point I'm trying to make in this thread is that <nowiki> happens to do certain things other than showing wiki syntax without parsing, and is used for them as if it's *intended* for it, but this is a hack. If a certain functionality is needed, such as separating the trail from the link, then it's worth considering creating a piece of syntax for it.
-- Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי http://aharoni.wordpress.com “We're living in pieces, I want to live in peace.” – T. Moore
The relevant Parsoid feature request for having VE use linktrails is https://phabricator.wikimedia.org/T50463 since in general Parsoid just generates [[Book|books]] when VE gives it `<a href="./Book">books</a>`.
If VE gives Parsoid `<a href="./Book">book</a>s` it will assume that's what the author actually meant, and will generate `[[Book]]<nowiki/>s` using a very general mechanism used for a number of other syntax conflicts (like if you actually want to start a line with the literal character `*`).
I don't think the answer is to invent new syntax for linktrail separation -- we have quite enough different ways of escaping and/or token-breaking already, as partially enumerated in this thread. The only one I would be happy to facilitate would be `-{}-` since it is already an odd parser corner case -- it is parsed by the wikitext preprocessor but then spit back out as literal text by the second parsing phase unless LanguageConverter is enabled for the specific page language. It would simplify the parser if the LanguageConverter constructs were "always on" instead of being en/disabled on a page-by-page basis. --scott
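The serialization behaviour described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Parsoid's implementation: the function name is hypothetical, the trail check only looks for a lowercase ASCII letter, and first-letter case normalization of titles is ignored.

```python
import re

# Would the text after the link be absorbed as a link trail?
# Assumed here: any lowercase ASCII letter directly after "]]".
TRAIL_START = re.compile(r"^[a-z]")


def serialize_link(target, label, following):
    """Serialize <a href="./target">label</a>following back to wikitext."""
    link = f"[[{target}]]" if label == target else f"[[{target}|{label}]]"
    if TRAIL_START.match(following):
        # Break the token so the following letters stay outside the link.
        return link + "<nowiki/>" + following
    return link + following


if __name__ == "__main__":
    print(serialize_link("Book", "books", ""))
    print(serialize_link("Schnee", "Schnee", "reichtum"))
```

The `<nowiki/>` here is used purely as a generic token breaker, the same role it plays for the other syntax conflicts mentioned in the message.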
On Sun, Oct 7, 2018 at 12:23 PM Amir E. Aharoni < amir.aharoni@mail.huji.ac.il> wrote:
... And, more importantly, its form doesn't say "separate the trail from the link". Just like <nowiki>, it only *happened* to do it (I tried on Wikipedia, and it doesn't do it now).
The point I'm trying to make in this thread is that <nowiki> happens to do certain things other than showing wiki syntax without parsing, and is used for them as if it's *intended* for it, but this is a hack. If a certain functionality is needed, such as separating the trail from the link, then it's worth considering creating a piece of syntax for it.
-- Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי http://aharoni.wordpress.com “We're living in pieces, I want to live in peace.” – T. Moore
On Sun, 7 Oct 2018 at 19:08, bawolff <bawolff+wn@gmail.com> wrote:
Alas, no longer valid in XML or HTML5. (Although HTML5 will still parse it as an empty comment, but with an "incorrectly-opened-comment" error.)
-- Brian
On Sat, Oct 6, 2018 at 6:57 AM Chad innocentkiller@gmail.com wrote:
Found it :)
https://www.w3.org/MarkUp/SGML/sgml-lex/sgml-lex
Search for "empty comment declaration" :)
-Chad
On Fri, Oct 5, 2018, 11:50 PM Chad innocentkiller@gmail.com wrote:
I'm personally a fan of <!>.
I came across it years ago--it's a null comment. Can't find the reference at the moment though.
-Chad
On Thu, Oct 4, 2018, 2:25 PM Daniel Kinzler dkinzler@wikimedia.org wrote:
On 04.10.2018 at 18:58, Thiemo Kreuz wrote:
The syntax "[[Schnee]]<nowiki />reichtum" is quite common in the German community. There are not many other ways to achieve the same: <span /> or &shy; can be used instead.[1] The latter is often the better alternative, but an auto-replacement is not possible. For example, "[[Bund]]<nowiki />estag" must become "[[Bund]]es&shy;tag".
We could introduce new syntax for this, such as &nope; or even &nowiki;.
Or how about {{}} for "this is a syntactic element, but it does nothing"?
But if that is mixed in with template expansion, it won't work if it expands to nothing, since template expansion happens before link parsing, right? For better or worse...
-- Daniel Kinzler Principal Software Engineer, MediaWiki Platform Wikimedia Foundation
We have the same in Norwegian, but linking on part of a composite is almost always wrong. Either you link on the whole composite or no part of the composite. If you link on a part of a composite, then in nearly all cases I have seen the link is placed on the wrong term.
Some examples of the insanity users write:
- [[absorpsjon]]s[[Spektrallinje|linjene]]
- [[Autentisering]]s[[Protokoll (datamaskiner)|protokollen]]
- [[Sykepleie|sykehjem]]s[[Hjemmesykepleie|omsorg]]
From an article messed up by VE (yes, it does mess up articles sometimes!):
- ma[[Øssur Havgrímsson|ge]]<nowiki/>e[[Øssur Havgrímsson|evner og]]
- og[[Øssur Havgrímsson|i]]<nowiki/>t[[Øssur Havgrímsson|det samme]]
I have no clue what the previous means…
Things like the following are quite common:
- [[Alexander Kielland]]<nowiki/>s
- [[De forente nasjoner|FN]]<nowiki/>s
Usually it comes from user errors while using VE. This kind of error is quite common, and I asked (several years ago) whether it could be fixed in VE, but was told "no".
Anyhow I just started a bot to clean up some of the mess…
On Thu, Oct 4, 2018 at 6:59 PM Thiemo Kreuz thiemo.kreuz@wikimedia.de wrote:
Hey!
The syntax "[[Schnee]]<nowiki />reichtum" is quite common in the German community. There are not many other ways to achieve the same: <span /> or &shy; can be used instead.[1] The latter is often the better alternative, but an auto-replacement is not possible. For example, "[[Bund]]<nowiki />estag" must become "[[Bund]]es&shy;tag".
Not long ago <b/> was often used. This became a problem with the recent parser updates. All <b/> got replaced with <nowiki />, as far as I'm aware.
in German, shouldn't they be tweaking the "linktrail" setting on dewiki, instead of using `<nowiki/>`? What are cases where they *do* want the link to include the entire word?
The software feature exists because of English [[word ending]]s. The same exists in German ("viele [[Wiki]]s, viele [[Tisch]]e, viele [[Arbeit]]en"), but is overshadowed by the fact that German is a language with many composites. From my experience, the fact that all linktrails, no matter how long, become part of the link is almost always a problem. It enlarges the click region, which is good, but surprises readers when they end up at an unexpected article. I guess it would actually be a net gain if the feature were turned off or tuned down in German wikis. For example, we could limit the length of the linktrail to 2 characters.
Is somebody interested in creating usage statistics for these linktrails in the German Wikipedia main namespace?
Best Thiemo
[1] https://de.wikipedia.org/wiki/Wikipedia:Verlinken#Verlinkung_von_Teilw%C3%B6...
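To make the linktrail behaviour and the proposed length limit concrete, here is a minimal Python sketch. It is not MediaWiki's parser: the `render` function, the `max_trail` parameter, and the simplified German trail pattern (lowercase ASCII letters plus ä, ö, ü, ß) are assumptions for illustration. It also assumes that a trail longer than the limit is left entirely outside the link, which the proposal above does not spell out.

```python
import re

# [[target]] or [[target|label]], followed by an optional run of trail letters.
# The trail character class is a simplified stand-in for a German linktrail.
LINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]([a-zäöüß]*)")
NOWIKI = re.compile(r"<nowiki\s*/>")


def render(wikitext, max_trail=None):
    """Render links to HTML-like text; `max_trail` is a hypothetical setting."""
    def repl(m):
        target, label, trail = m.group(1), m.group(2), m.group(3)
        if label is None:
            label = target
        if max_trail is not None and len(trail) > max_trail:
            # Assumed behaviour: a trail that is too long stays outside the link.
            return f'<a href="{target}">{label}</a>{trail}'
        return f'<a href="{target}">{label}{trail}</a>'

    # Links are handled first, so an empty <nowiki/> right after "]]" stops the
    # trail from matching; the tag itself is dropped afterwards.
    return NOWIKI.sub("", LINK.sub(repl, wikitext))


print(render("viele [[Tisch]]e"))
print(render("[[Schnee]]reichtum"))
print(render("[[Schnee]]reichtum", max_trail=2))
print(render("[[Schnee]]<nowiki />reichtum"))
```

With the limit set to 2, short inflectional endings such as "e" or "en" are still linked, while the second half of a compound is not.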
Interesting: today this was a topic in the German main forum: https://de.wikipedia.org/wiki/Wikipedia:Fragen_zur_Wikipedia#Anwendung_von_%...
Also today, more than one user who did nothing but remove <nowiki/> was indefinitely blocked: https://de.wikipedia.org/wiki/Benutzer:Entgr%C3%A4ten40
The only thing more dangerous than running a bot on nowiki is running a bot on dewiki. Nope, it never touches dewiki.
On Fri, Oct 5, 2018 at 12:49 AM Roul P. perhelion1@gmail.com wrote:
Interesting: today this was a topic in the German main forum: https://de.wikipedia.org/wiki/Wikipedia:Fragen_zur_Wikipedia#Anwendung_von_%...
Also today, more than one user who did nothing but remove <nowiki/> was indefinitely blocked: https://de.wikipedia.org/wiki/Benutzer:Entgr%C3%A4ten40
On Thu, 4 Oct 2018 at 23:29, John Erling Blad jeblad@gmail.com wrote:
Usually it comes from user errors while using VE. This kind of error is quite common, and I asked (several years ago) whether it could be fixed in VE, but was told "no".
I'd really appreciate it if you could give me more information on this. Could you link to the task for this request? There is T128060 https://phabricator.wikimedia.org/T128060 from early 2016 ("VisualEditor makes it easy to create partially linked words, when the user expects a fully linked one") but I don't see you on there, and I want to make sure I understand your request.
Here's how the linking feature works right now for adding links to words which presently have no links:
- If you put your cursor inside a word without highlighting anything, and add a link, the link is added to the entire word.
- If you highlight some text, and add a link, the link is added to the highlighted text.
How would you propose this feature be changed?
Thanks, Dan
On Fri, 5 Oct 2018 at 16:59, Dan Garry <dgarry@wikimedia.org> wrote:
On Thu, 4 Oct 2018 at 23:29, John Erling Blad jeblad@gmail.com wrote:
Usually it comes from user errors while using VE. This kind of error is quite common, and I asked (several years ago) whether it could be fixed in VE, but was told "no".
I'd really appreciate it if you could give me more information on this.
This is very frequent. I know that in the Hebrew Wikipedia it happens up to 20 times a day (I actually counted this for many months), and this is never intentional or desirable. Never, ever. 100% of cases. The same must be true for many other languages, but probably not for all. In wikis bigger than the Hebrew Wikipedia it probably happens much more often than 20 times a day.
It is possibly the most frequent reason for automatic insertion of <nowiki> tags (although this may be different by language).
How does it happen? Several ways:
- People add a word ending to an existing link. English has very few word endings (-s, -ing, -ed, -able, and not much more), but many other languages have more.
- People highlight only a part of a word when they add a link, even though they should have highlighted the whole word.
- In particular, people highlight the part of the word without an ending. For example, "Dogs" is written, and people highlight "Dog".
- People sometimes actually want to write two separate words and forget to write a space. (This may sound silly, but I saw this happening very often.)
- People write a compound word and link a part of the word. Sometimes it's intentional, although as we can see in other emails in this thread not everybody agrees about the desirability of this. This works very differently in different languages. German has a lot of them, English has far fewer, Hebrew has almost none.
It's worth running proper user testing.
Here's how the linking feature works right now for adding links to words which presently have no links:
- If you put your cursor inside a word without highlighting anything, and add a link, the link is added to the entire word.
- If you highlight some text, and add a link, the link is added to the highlighted text.
I know this, and I like how it works, but the fact is that there are many other users who don't know this. Simply searching wikitext for "]]<nowiki/>" will show how often this happens.
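The search suggested above could be done on a local copy of a page's wikitext with a few lines of Python. This is only a sketch: the function name is hypothetical, and the pattern also accepts the spaced `<nowiki />` spelling, which goes slightly beyond the literal search string.

```python
import re

# Matches a link's closing brackets immediately followed by an empty nowiki tag.
NOWIKI_AFTER_LINK = re.compile(r"\]\]<nowiki\s*/>")


def count_nowiki_after_link(wikitext: str) -> int:
    """Count places where an empty <nowiki/> directly follows a link."""
    return len(NOWIKI_AFTER_LINK.findall(wikitext))


sample = (
    "[[Alexander Kielland]]<nowiki/>s and "
    "[[De forente nasjoner|FN]]<nowiki />s, but also plain [[Book]]s."
)
print(count_nowiki_after_link(sample))
```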
How would you propose this feature be changed?
One possibility is to not add <nowiki/> after a link. I proposed it, but it was declined: https://phabricator.wikimedia.org/T141689 . The declining comment links to T128060, which you mentioned in your email, and it's still not resolved.
Other than stopping it entirely, I cannot think of many other possibilities. Maybe we could show a warning, although I suspect that many users will ignore it or find it unnecessarily intrusive. I'm not a real designer, and it's possible that a real designer can come up with something better.
Another thing we could consider is to link the whole word *by default*, and to add another function that separates a link from the trail. I'd further suggest the separation be done internally not by "<nowiki/>", but by some other syntax that looks more semantic, for example "{{#sep}}" (this should be a magic word and not a template!). My educated guess is that separating the word from the link is much less frequent than wanting to link the whole word. Part of my motivation for starting this thread was to understand how this works in different languages.
In my opinion we should first try to process the whole linked phrase by inflection aka affix rules, and if that fails aka no link target can be found – then and only then should regexps for prefix and linktrails be applied. If applying prefix or linktrails creates a word that can be inflected, and it links to the same target, then move the strings into the linked phrase. If the link uses the pipe-form, then move the strings into the second part of the link, aka the link text.
Links using the pipe-form should not have the link target inflected. This is important, as this is the natural escape route if inflection gives the wrong target for whatever reason.
Inflected links should go to the target with the smallest difference. This is a non-trivial problem. We often link _phrases_ and those could be processed by several rules, each with some kind of weight rules. An edit distance would probably not be sufficient.
Perhaps most important: VisualEditor should not insert <nowiki/>. If users need this escape route, then let them do it themselves in WikitextEditor.
Hi,
Links using the pipe-form should not have the link target inflected. This is important, as this is the natural escape route if inflection gives wrong target for whatever reason.
This is what I think is particularly odd about linktrails: Why do links like [[Examples|Example]]s have a linktrail? I wouldn't expect it and I don't think anyone would, on the contrary I still remember discovering this really weird behavior years ago.
I know parser changes are difficult, but adding linktrails only to links without | seems like the easiest and expected solution for this whole problem to me, even if it isn't the most elegant one.
Regards, MGChecker
-----Original Message-----
From: Wikitech-l [mailto:wikitech-l-bounces@lists.wikimedia.org] On Behalf Of John Erling Blad
Sent: Friday, 5 October 2018 20:48
To: Wikimedia developers
Subject: Re: [Wikitech-l] non-obvious uses of <nowiki> in your language
In my opinion we should first try to process the whole linked phrase by inflection aka affix rules, and if that fails aka no link target can be found – then and only then should regexps for prefix and linktrails be applied. If applying prefix or linktrails creates a word that can be inflected, and it links to the same target, then move the strings into the linked phrase. If the link uses the pipe-form, then move the strings into the second part of the link, aka the link text.
Links using the pipe-form should not have the link target inflected. This is important, as this is the natural escape route if inflection gives the wrong target for whatever reason.
Inflected links should go to the target with the smallest difference. This is a non-trivial problem. We often link _phrases_ and those could be processed by several rules, each with some kind of weight rules. An edit distance would probably not be sufficient.
Perhaps most important: VisualEditor should not insert <nowiki/>. If users need this escape route, then let them do it themselves in WikitextEditor.
MGChecker wrote:
Links using the pipe-form should not have the link target inflected. This is important, as this is the natural escape route if inflection gives wrong target for whatever reason.
This is what I think is particularly odd about linktrails: Why do links like [[Examples|Example]]s have a linktrail? I wouldn't expect it and I don't think anyone would, on the contrary I still remember discovering this really weird behavior years ago.
I'm not sure I understand. I would expect a link trail with "[[Examples|Example]]s" since there is a link trail with "[[Example]]s". I'm not sure why anyone would associate link trail behavior with the presence or lack of a pipe character. The defining characteristic of link trails is text being adjacent to "]]", as far as I know.
Is the particular case you mention common? It seems like it would be much more common for a user to simply write "[[Examples]]" currently to achieve the same output.
MZMcBride
MZMcBride wrote:
I'm not sure I understand. I would expect a link trail with "[[Examples|Example]]s" since there is a link trail with "[[Example]]s". I'm not sure why anyone would associate link trail behavior with the presence or lack of a pipe character. The defining characteristic of link trails is text being adjacent to "]]", as far as I know.
Yeah, currently there is a link trail with "[[Example]]s", but I neither consider this intuitive nor helpful. If I specify target and link text separately, why would I want a link trail? I could write it as part of the target instead. I think for most people writing something like [[Examples|Example]]s is the first thing they try to avoid link trails. In my opinion, link trailing doesn't make anything easier if target and link text are specified separately. To be clear: I propose to change the current parser behavior to avoid unwanted link trails.
Is the particular case you mention common? It seems like it would be much more common for a user to simply write "[[Examples]]" currently to achieve the same output.
As the case I mentioned shouldn't be common and is clearly more complicated than needed, I think a behavior change wouldn't have that much impact.
Regards, MGChecker
I'm not sure how much impact it would have on existing link specifications to make the change, but I think MGChecker has a good solution. The "[[target|linktext]]extra" format allows you to specify exactly what part of the text should have a link, while "[[target]]extra" would be understood as a shortcut to "[[target|targetextra]]". This solves the linktrails problem without introducing any extra tags or using nowiki in weird ways.
Looking at some examples in this thread:
- [[Schnee]]<nowiki />reichtum would be [[Schnee|Schnee]]reichtum
- [[Gesetz]]e and [[Finger]]s are fine
- [[Heimat]]losigkeit is fine
- [[absorpsjon]]s[[Spektrallinje|linjene]] might work as intended, but if the middle "s" isn't supposed to be linked then [[Absorpsjon|absorpsjon]]s[[Spektrallinje|linjene]] would do the trick
- ma[[Øssur Havgrímsson|ge]]<nowiki/>e[[Øssur Havgrímsson|evner og]] is still something of a mystery, but ma[[Øssur Havgrímsson|ge]]e[[Øssur Havgrímsson|evner og]] would probably do what is intended
- [[Alexander Kielland]]<nowiki/>s would be [[Alexander Kielland|Alexander Kielland]]s
- [[De forente nasjoner|FN]]s would be fine
This isn't really about <nowiki> anymore (sorry Amir!), but I think it could solve the linktrails syntax issue. The problem, as I alluded to earlier, is what changing the syntax would do to existing links. It would be possible, though, to automatically convert existing "[[target|linktext]]extra" to "[[target|linktextextra]]" if target and linktext are different, or to "[[target]]extra" if target and linktext are the same (possibly modulo whatever minor differences are allowed, like upper/lowercase, though there are rare instances of articles that differ only by upper/lowercase).
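The automatic conversion described above could look roughly like the following Python sketch. Everything here is hypothetical: the `migrate` function is not an existing tool, the trail pattern is ASCII-only for simplicity, and "the same" is taken to mean exact string equality, so the upper/lowercase question is deliberately left open.

```python
import re

LINK = re.compile(
    r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]"  # [[target]] or [[target|linktext]]
    r"(<nowiki\s*/>)?"                        # optional empty nowiki after the link
    r"([a-z]*)"                               # trailing letters (ASCII only here)
)


def migrate(wikitext):
    """Rewrite links so they render the same under the proposed rules."""
    def repl(m):
        target, text, breaker, extra = m.groups()
        if not extra:
            return m.group(0)  # nothing follows the link, nothing to convert
        if breaker:
            # The trail was deliberately excluded; the piped form now says so.
            label = text if text is not None else target
            return f"[[{target}|{label}]]{extra}"
        if text is None:
            return m.group(0)  # [[target]]extra keeps its meaning
        if text == target:
            return f"[[{target}]]{extra}"
        return f"[[{target}|{text}{extra}]]"

    return LINK.sub(repl, wikitext)


for old in ("[[Schnee]]<nowiki />reichtum", "[[Gesetz]]e",
            "[[Examples|Example]]s", "[[De forente nasjoner|FN]]<nowiki/>s"):
    print(old, "->", migrate(old))
```

A real migration would also need the per-language linktrail pattern and a decision on case-insensitive first letters, both of which this sketch leaves out.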
Are their any other linktrails setting other than off and on? We'd want to make sure any changes didn't do weird things to Chinese or other spaceless languages.
—Trey
Trey Jones Sr. Software Engineer, Search Platform Wikimedia Foundation
On Sat, Oct 13, 2018 at 9:29 PM, MGChecker hgasuser@gmail.com wrote:
MZMcBrider wrote: I'm not sure I understand. I would expect a link trail with "[[Examples|Example]]s" since there is a link trail with "[[Example]]s". I'm not sure why anyone would associate link trail behavior with the presence or lack of a pipe character. The defining characteristic of link trails is text being adjacent to "]]", as far as I know.
Yeah, currently there is a link trail with "[[Example]]s", but I neither consider this intuitive nor helpful. If I specify target and link text separately, why would I want a link trail? I could write it as part of the target instead. I think for most people writing something like [[Examples|Example]]s is the first thing they try to avoid link trails. In my opinion, link trailing doesn't make anything easier if target and link text are specified separately. To be clear: I propose to change the current parser behavior to avoid unwanted link trails.
Is the particular case you mention common? It seems like it would be much more common for a user to simply write "[[Examples]]" currently to achieve the same output.
As the case I mentioned shouldn't be common and is clearly more complicated than needed, I think a behavior change wouldn't have that much impact.
Regards, MGChecker
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
On 2018-10-15 16:34, Trey Jones wrote:
I'm not sure how much impact it would have on existing link specifications to make the change, but I think MGChecker has a good solution. The "[[target|linktext]]extra" format allows you to specify exactly what part of the text should have a link, while "[[target]]extra" would be understood as a shortcut to "[[target|targetextra]]". This solves the linktrails problem without introducing any extra tags or using nowiki in weird ways.
Sounds like a cute small syntax improvement! :)
Are there any other linktrails settings other than off and on? We'd want to make sure any changes didn't do weird things to Chinese or other spaceless languages.
There are two things to consider:
* Linktrails are language-specific. For example, in English, only ASCII a-z are handled in linktrails, while Polish also allows accented letters ęóąśłżźćńĘÓĄŚŁŻŹĆŃ. Chinese actually effectively disables linktrails (disallows everything). This is defined using $linkTrail variables in files like MessagesEn.php etc.
* There is also something called "linkprefix", used by e.g. Arabic (MessagesAr.php uses $linkPrefixExtension = true). I am not sure how this feature works, but it probably complicates everything a bit.
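To make the language-specific behaviour concrete, here is a minimal Python sketch. The table `TRAIL_CHARS` and the helper `split_trail` are hypothetical names for illustration; the character sets come only from the description above and are not copied from the actual $linkTrail definitions in the Messages*.php files.

```python
import re

# Characters allowed in a link trail, per language (illustrative only).
TRAIL_CHARS = {
    "en": "a-z",
    "pl": "a-zęóąśłżźćńĘÓĄŚŁŻŹĆŃ",
    "zh": "",  # nothing allowed, so link trails are effectively disabled
}


def split_trail(lang: str, text_after_link: str) -> tuple[str, str]:
    """Split the text following "]]" into (trail, remainder) for a language."""
    chars = TRAIL_CHARS[lang]
    if not chars:
        return "", text_after_link
    match = re.match(f"[{chars}]*", text_after_link)
    return match.group(0), text_after_link[match.end():]


print(split_trail("en", "s and more"))
print(split_trail("pl", "ów, dalej"))
print(split_trail("zh", "abc"))
```

The same text after a link can therefore yield a different trail depending on the content language, which is why any syntax change would need checking per language.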
Thanks for the technical details, Bartosz!
One would hope (but should confirm) that link prefixes are treated with the same basic logic as link postfixes/trails, so assuming pre- and post-link trails are enabled, "pre[[target]]post" is all linked, but "pre[[target|linktext]]post" is only linked on "linktext", and intermediate cases can be spelled out as "[[target|pre+target]]post" or "pre[[target|target+post]]".
Overall, it sounds like reasonable default shortcut behavior that can easily be overridden with a fully-specified link.
Sounds like a cute small syntax improvement! :)
Exactly!
On Mon, Oct 15, 2018 at 12:01 PM, Bartosz Dziewoński matma.rex@gmail.com wrote:
On 2018-10-15 16:34, Trey Jones wrote:
I'm not sure how much impact it would have on existing link specifications to make the change, but I think MGChecker has a good solution. The "[[target|linktext]]extra" format allows you to specify exactly what part of the text should have a link, while "[[target]]extra" would be understood as a shortcut to "[[target|targetextra]]". This solves the linktrails problem without introducing any extra tags or using nowiki in weird ways.
Sounds like a cute small syntax improvement! :)
Are there any other linktrails settings other than off and on? We'd want to make sure any changes didn't do weird things to Chinese or other spaceless languages.
There are two things to consider:
- Linktrails are language-specific. For example, in English, only ASCII a-z are handled in linktrails, while Polish also allows accented letters ęóąśłżźćńĘÓĄŚŁŻŹĆŃ. Chinese actually effectively disables linktrails (disallows everything). This is defined using $linkTrail variables in files like MessagesEn.php etc.
- There is also something called "linkprefix", used by e.g. Arabic (MessagesAr.php uses $linkPrefixExtension = true). I am not sure how this feature works, but it probably complicates everything a bit.
-- Bartosz Dziewoński
T129778
On Fri, Oct 5, 2018 at 3:59 PM Dan Garry dgarry@wikimedia.org wrote:
On Thu, 4 Oct 2018 at 23:29, John Erling Blad jeblad@gmail.com wrote:
Usually it comes from user errors while using VE. This kind of error is quite common, and I asked (several years ago) whether it could be fixed in VE, but was told "no".
I'd really appreciate it if you could give me more information on this. Could you link to the task for this request? There is T128060 (https://phabricator.wikimedia.org/T128060) from early 2016 ("VisualEditor makes it easy to create partially linked words, when the user expects a fully linked one") but I don't see you on there, and I want to make sure I understand your request.
Here's how the linking feature works right now for adding links to words which presently have no links:
- If you put your cursor inside a word without highlighting anything, and add a link, the link is added to the entire word.
- If you highlight some text, and add a link, the link is added to the highlighted text.
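The two cases above can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not VisualEditor code: the function name `link_span` is hypothetical, and treating a "word" as a run of alphanumeric characters is an assumption.

```python
def link_span(text: str, start: int, end: int) -> tuple[int, int]:
    """Return the (start, end) range to link; start == end means a bare cursor."""
    if start != end:
        # Some text is highlighted: link exactly that selection.
        return start, end
    # Bare cursor: grow the range outwards to cover the surrounding word.
    while start > 0 and text[start - 1].isalnum():
        start -= 1
    while end < len(text) and text[end].isalnum():
        end += 1
    return start, end


sentence = "a fully linked word"
print(link_span(sentence, 4, 4))  # cursor inside "fully"
print(link_span(sentence, 2, 4))  # "fu" highlighted
```

The second call is the situation that produces a partially linked word: the selection covers only part of "fully", so only that part is linked.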
How would you propose this feature be changed?
Thanks, Dan
-- Dan Garry Lead Product Manager, Editing Wikimedia Foundation