On Fri, Jun 13, 2008 at 9:42 AM, Simetrical <Simetrical+wikilist@gmail.com> wrote:
On Fri, Jun 13, 2008 at 12:55 AM, Danny B. <Wikipedia.Danny.B@email.cz> wrote:
How about dashes?
Em and en dashes should certainly not be included in the link. Hyphens probably shouldn't either. Remember that linktrail characters should be those that should *always* be included in the trail. Those that should only sometimes be included, on a case-by-case basis, can still be manually added with a pipe. A phrase like, say, "[[moth]]-eater" should certainly not become "[[moth|moth-eater]]".
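To make the "always included" rule concrete, here is a minimal sketch in Python rather than MediaWiki's PHP. It assumes the English default trail is lowercase ASCII letters only (a pattern along the lines of /^([a-z]+)(.*)$/sD); split_trail is a hypothetical helper for illustration, not a MediaWiki function.

```python
import re

# Assumed English default: only lowercase ASCII letters directly after "]]".
LINK_TRAIL = re.compile(r"^([a-z]+)(.*)$", re.DOTALL)

def split_trail(text_after_link):
    """Return (trail, rest): what gets absorbed into the link, and what is left."""
    m = LINK_TRAIL.match(text_after_link)
    if m is None:
        # Nothing qualifies as a trail, so the link text stays as written.
        return "", text_after_link
    return m.group(1), m.group(2)

# "[[moth]]s are insects": the "s" joins the link.
print(split_trail("s are insects"))
# "[[moth]]-eater": the hyphen is not a trail character, so nothing is absorbed.
print(split_trail("-eater"))
```

With a pattern like this, "[[moth]]-eater" keeps its link on "moth" alone, and anyone who does want "moth-eater" linked writes the pipe by hand.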
This is a good development, but I see two possible major issues with it:
- Languages that don't use spaces. This is probably not such a problem, since all such languages I know of use their own writing system, which can be specifically checked for. Make sure that links don't automatically cross a boundary between characters in a writing system that uses spaces and one from a writing system that does not. Of course, it should also not automatically include further characters within a writing system that doesn't use spaces.
- Compound words. In English, a phrase like "moth-eater" is hyphenated; in other languages, it might be written as the equivalent of "motheater", for all I know. Some languages may go even further and use much more elaborate compound words: "agglutinative" or "polyagglutinative" languages. If this is correct, such languages should be exempted.
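The writing-system check in the first point could look roughly like the following Python sketch (not MediaWiki code). The list of scripts is a rough, incomplete assumption keyed on Unicode character names, and is_unspaced and split_trail are hypothetical helpers.

```python
import unicodedata

# Assumption: a rough, incomplete list of scripts normally written without
# spaces, identified by the start of each character's Unicode name.
UNSPACED_PREFIXES = ("CJK", "HIRAGANA", "KATAKANA", "THAI", "LAO", "KHMER", "MYANMAR")

def is_unspaced(ch):
    """True for a letter belonging to one of the listed unspaced scripts."""
    return ch.isalpha() and unicodedata.name(ch, "").startswith(UNSPACED_PREFIXES)

def split_trail(link_text, following):
    """Return (trail, rest) without crossing into or out of an unspaced script."""
    # A link that ends in an unspaced script never gets an automatic trail.
    if link_text and is_unspaced(link_text[-1]):
        return "", following
    end = 0
    # Absorb letters only while they come from a script that uses spaces.
    while (end < len(following)
           and following[end].isalpha()
           and not is_unspaced(following[end])):
        end += 1
    return following[:end], following[end:]

print(split_trail("moth", "s\u6771\u4eac"))    # Latin trail stops before the kanji
print(split_trail("\u6771\u4eac", "tower"))    # link ends in kanji: no trail
```

This does nothing for the compound-word case; those languages would still need an explicit exemption.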
Note that this change might be a regression for some languages, even if they didn't previously use a custom link trail. Some languages might have deliberately refrained from including their own alphabet in the linktrail, keeping the English default so that it still worked for English. Such languages will now incorrectly see their own writing system become part of link trails.
Probably as many languages have a use for link prefixes as for link trails. The two should probably work symmetrically unless that causes problems for a particular language.
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Any reason not to use $linkTrail = '/^(\p{L&}*\'?\p{L&}+)(.*)$/usD'; to allow things like [[Verb]]ing's?
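To see what that pattern would and would not absorb, here is a rough Python emulation. Python's re module has no \p{L&}, so the sketch scans by hand, treating \p{L&} as the Unicode categories Lu, Ll and Lt. The function names are hypothetical, and this is not how MediaWiki itself evaluates the trail.

```python
import unicodedata

def is_cased_letter(ch):
    # Stand-in for PCRE's \p{L&}: uppercase, lowercase or titlecase letter.
    return unicodedata.category(ch) in ("Lu", "Ll", "Lt")

def letter_run(text, start):
    """Index just past the run of cased letters that begins at start."""
    end = start
    while end < len(text) and is_cased_letter(text[end]):
        end += 1
    return end

def split_trail(text):
    # Mimics ^(\p{L&}*'?\p{L&}+)(.*)$ : letters, at most one apostrophe,
    # then at least one more letter.
    end = letter_run(text, 0)
    if end < len(text) and text[end] == "'":
        after = letter_run(text, end + 1)
        if after > end + 1:  # the apostrophe only counts if a letter follows it
            end = after
    return text[:end], text[end:]

print(split_trail("ing's nice"))   # the whole "ing's" is absorbed
print(split_trail("''italic''"))   # doubled apostrophes (wiki italics) are left alone
```

A trailing apostrophe with no letter after it, as in [[Verb]]s', stays outside the link under this reading of the pattern.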
Also, as Simetrical stated, it could be very valuable, if somewhat more complicated, to use something like a $linkWord instead of a $linkTrail, to allow us to do really cool stuff like this:
$linkWord = '/^(.*?)(\p{L&}*)Something that signifies the initial link was here(\p{L&}*(?:(?<!\'.+)\')?\p{L&}+)(.*?)$/usD';
Essentially, any letters before the link are included, as are any letters after the link, allowing an apostrophe iff there is not another apostrophe before the link. This is probably slow, might not even work (PCRE does not accept an unbounded lookbehind such as (?<!'.+)), and is in short a bad idea, but it would be nice to be able to do things like this.