In my opinion we should first try to process the whole linked phrase by
inflection (affix rules), and only if that fails, that is, if no link
target can be found, should the regexps for prefix and linktrails be
applied. If applying the prefix or linktrail creates a word that can be
inflected, and it links to the same target, then move the strings into the
linked phrase. If the link uses the pipe form, then move the strings into
the second part of the link, that is, the link text.
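A minimal Python sketch of one reading of this order. Everything in it is an assumption made for illustration: the title set, the three English affix rules, the ASCII linktrail pattern, and the names `inflect_to_title` and `resolve`. Prefixes and pipe-form links are left out to keep it short.

```python
import re

# Hypothetical toy data: page titles that exist on the wiki.
TITLES = {"Dog", "House"}

# Hypothetical affix rules: (suffix to strip, replacement).
AFFIX_RULES = [("s", ""), ("ing", ""), ("ed", "")]

# Assumed linktrail pattern: lowercase ASCII letters directly after the link.
LINKTRAIL = re.compile(r"^[a-z]+")


def inflect_to_title(phrase):
    """Return an existing title reachable from phrase via one affix rule, else None."""
    if phrase in TITLES:
        return phrase
    for suffix, repl in AFFIX_RULES:
        if phrase.endswith(suffix):
            candidate = phrase[: len(phrase) - len(suffix)] + repl
            if candidate in TITLES:
                return candidate
    return None


def resolve(phrase, following_text):
    """Return (target, linked_phrase) for a link followed by following_text."""
    # Step 1: inflect the linked phrase on its own.
    target = inflect_to_title(phrase)
    # Step 2: look at the linktrail. It is absorbed into the linked phrase
    # only if the extended word inflects to a target, and that target is
    # the same one (or the phrase alone found none).
    match = LINKTRAIL.match(following_text)
    if match:
        extended = phrase + match.group(0)
        extended_target = inflect_to_title(extended)
        if extended_target is not None and target in (None, extended_target):
            return extended_target, extended
    return target, phrase
```

With these toy rules, `resolve("Hous", "es")` finds no target for "Hous" alone, so the trail is pulled in and "Houses" links to "House"; `resolve("Dog", "s bark")` extends the linked phrase to "Dogs" because both forms reach the same target.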
Links using the pipe form should not have the link target inflected. This
is important, as it is the natural escape route if inflection gives the
wrong target for whatever reason.
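As a sketch of that escape route in Python: the regular expression below is a simplified assumption about link syntax (it ignores namespaces, anchors and nesting), and `link_target` and the `inflect` callback are hypothetical names.

```python
import re

# Simplified pattern for [[target]] and [[target|text]].
LINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]*))?\]\]")


def link_target(wikitext_link, inflect):
    """Return the page a link points to; piped targets are taken verbatim."""
    match = LINK.fullmatch(wikitext_link)
    if match is None:
        raise ValueError("not a wiki link: " + wikitext_link)
    target, text = match.group(1), match.group(2)
    if text is not None:
        # Pipe form: the author chose the target explicitly, so no inflection.
        return target
    return inflect(target)
```

Here `[[Dogs]]` would be passed through `inflect`, while `[[Dogs|dogs]]` always points at "Dogs".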
Inflected links should go to the target with the smallest difference. This
is a non-trivial problem. We often link _phrases_, and those could be
processed by several rules, each with some kind of weight. An edit
distance would probably not be sufficient.
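A small Python sketch of what "smallest difference" could mean with weighted rules instead of edit distance. It assumes each candidate target comes with the weights of the rules used to reach it; the function name, the weights and the tie-break are all hypothetical.

```python
def best_target(candidates):
    """Pick the title whose chain of rules has the lowest total weight.

    candidates maps a candidate title to the list of weights of the
    (hypothetical) rules applied to get from the linked phrase to it.
    """
    # Ties are broken alphabetically so the result is deterministic.
    return min(candidates, key=lambda title: (sum(candidates[title]), title))
```

For the linked phrase "Walking", a candidate "Walking" reached with no rules beats "Walk" reached with one suffix rule, whatever the string distance between them is.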
Perhaps most important: VisualEditor should not insert <nowiki/>. If
users need this escape route, let them do it themselves in
WikitextEditor.
On Fri, Oct 5, 2018 at 6:17 PM Amir E. Aharoni <amir.aharoni(a)mail.huji.ac.il>
wrote:
On Fri, 5 Oct 2018 at 16:59, Dan Garry <dgarry(a)wikimedia.org> wrote:
On Thu, 4 Oct 2018 at 23:29, John Erling Blad <jeblad(a)gmail.com> wrote:
> Usually it comes from user errors while using VE. This kind of errors are
> quite common, and I asked (several years ago) whether it could be fixed in
> VE, but was told "no".
I'd really appreciate it if you could give me more information on this.
This is very frequent. I know that in the Hebrew Wikipedia it happens up to
20 times a day (I actually counted this for many months), and this is never
intentional or desirable. Never, ever. 100% of cases. The same must be true
for many other languages, but probably not for all. In wikis bigger than
the Hebrew Wikipedia it probably happens much more often than 20 times a
day.
It is possibly the most frequent reason for automatic insertion of <nowiki>
tags (although this may differ by language).
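A rough way to count such cases in a piece of wikitext, sketched in Python. It assumes the unwanted pattern is a link closed by "]]" and immediately followed by <nowiki/>; the function name and the tolerance for a space before the slash are assumptions.

```python
import re

# A closing link bracket immediately followed by <nowiki/> (or <nowiki />).
LINK_NOWIKI = re.compile(r"\]\]<nowiki\s*/>")


def count_link_nowiki(wikitext):
    """Count links that are directly followed by an empty nowiki tag."""
    return len(LINK_NOWIKI.findall(wikitext))
```

This only counts occurrences; it cannot tell an intentional separation from an accidental one.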
How does it happen? Several ways:
* People add a word ending to an existing link. English has very few word
endings (-s, -ing, -ed, -able, and not much more), but many other languages
have more.
* People highlight only a part of a word when they add a link, even though
they should have highlighted the whole word.
* In particular, people highlight the part of the word without an ending.
For example, "Dogs" is written, and people highlight "Dog".
* People sometimes actually want to write two separate words and forget to
write a space. (This may sound silly, but I have seen it happen very often.)
* People write a compound word and link a part of the word. Sometimes it's
intentional, although, as we can see in other emails in this thread, not
everybody agrees about the desirability of this. This works very
differently in different languages. German has a lot of compounds, English
has far fewer, Hebrew has almost none.
It's worth running proper user testing
Here's how the linking feature works right now for adding links to words
which presently have no links:
- If you put your cursor inside a word without highlighting anything,
and add a link, the link is added to the entire word.
- If you highlight some text, and add a link, the link is added to the
highlighted text.
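The two cases above can be sketched in Python as follows. This is not VisualEditor's code; `link_range` is a hypothetical helper, and treating a "word" as a run of alphanumeric characters is an assumption.

```python
def link_range(text, start, end):
    """Return the (start, end) offsets that a new link would cover.

    start == end means a bare cursor with nothing highlighted.
    """
    if start != end:
        # Some text is highlighted: link exactly the selection.
        return start, end
    # Bare cursor: grow the range outwards to the edges of the word.
    left = start
    while left > 0 and text[left - 1].isalnum():
        left -= 1
    right = end
    while right < len(text) and text[right].isalnum():
        right += 1
    return left, right
```

With the text "Dogs bark", a cursor at offset 2 gives (0, 4), the whole word, while highlighting offsets 0 to 3 links only "Dog" and leaves the "s" outside the link.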
I know this, and I like how it works, but the fact is that there are many
other users who don't know this. Simply searching wikitext for
"]]<nowiki/>" will show how often this happens.
How would you propose this feature be changed?
One possibility is to not add <nowiki/> after a link. I proposed it, but it
was declined: https://phabricator.wikimedia.org/T141689 . The declining
comment links to T128060, which you mentioned in your email, and it's still
not resolved.
Other than stopping it altogether, I cannot think of many other
possibilities. Maybe we could show a warning, although I suspect that many
users will ignore it or find it unnecessarily intrusive. I'm not a real
designer, and it's possible that a real designer can come up with something
better.
Another thing we could consider is to link the whole word *by default*, and
to add another function that separates a link from the trail. I'd further
suggest the separation be done internally not by "<nowiki/>", but by some
other syntax that looks more semantic, for example "{{#sep}}" (this should
be a magic word and not a template!). My educated guess is that separating
the word from the link is much less frequent than wanting to link the whole
word. Part of my motivation for starting this thread was to understand how
this works in different languages.
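A Python sketch of how the serialization could look under this proposal. "{{#sep}}" is only the syntax suggested above, not an existing magic word, and `serialize_link` is a hypothetical name.

```python
# Proposed separator; it does not exist in MediaWiki today.
SEPARATOR = "{{#sep}}"


def serialize_link(linked, trail, separate=False):
    """Write a link and the letters that directly follow it as wikitext."""
    if trail and separate:
        # Explicit request: keep the trail outside the link.
        return "[[" + linked + "]]" + SEPARATOR + trail
    # Default: no separator, so the linktrail makes the whole word a link.
    return "[[" + linked + "]]" + trail
```

By default "Dog" plus the trail "s" becomes `[[Dog]]s`, which renders as one linked word; only an explicit request yields `[[Dog]]{{#sep}}s`.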
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l