Sorry about the borked line wrapping in the previous message - I'm resending it so you can read it properly!
----
This is a proposal to try and bring order to the messy area of interwiki linking and interwiki prefixes, particularly for non-WMF users of MediaWiki.
At the moment, anyone who installs MediaWiki gets a default interwiki table that is hopelessly out of date. Some of the URLs listed there have seemingly been broken for 7 years [1]. Meanwhile, WMF wikis have access to a nice, updated interwiki map, stored on Meta, that is difficult for anyone else to use. Clearly something needs to be done.
What I propose we do to improve the situation is along the lines of bug 58369:
1. Split the existing interwiki map on Meta [2] into a "global interwiki map", located on MediaWiki.org (draft at [3]), and a "WMF-specific interwiki map" on Meta (draft at [4]). Wikimedia-specific interwiki prefixes, like bugzilla:, gerrit:, and irc: would be located in the map on Meta, whereas general-purpose interwikis, like orthodoxwiki: and wikisource: would go to the "global map" at MediaWiki.org.
2. Create a bot, similar to l10n-bot, that periodically updates the default interwiki data in mediawiki/core based on the contents of the global map. (Right now, the default map is duplicated in two different formats [5] [6], which is quite messy.)
3. Write a version of the rebuildInterwiki.php maintenance script [7] that can be bundled with MediaWiki, and which can be run by server admins to pull in new entries to their interwiki table from the global map.
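The update step in point 3 could be sketched roughly as follows. This is only an illustration, not how rebuildInterwiki.php works: the function name `merge_interwiki`, the dict-based tables, the lower-casing of prefixes, and the example.org URLs are all assumptions. It adds prefixes that the local table lacks and reports, without overwriting, any prefix whose URL differs from the global map.

```python
def merge_interwiki(local, global_map):
    """Return (merged, added, conflicts) without mutating the inputs.

    Both arguments map an interwiki prefix to a URL pattern in which
    "$1" stands for the page title.
    """
    merged = dict(local)
    added = []
    conflicts = []
    for prefix, url in sorted(global_map.items()):
        key = prefix.lower()  # assumption: prefixes are compared in lower case
        if key not in merged:
            merged[key] = url          # new prefix: pull it in
            added.append(key)
        elif merged[key] != url:
            conflicts.append(key)      # local admin decides; never overwrite
    return merged, added, conflicts


# Hypothetical data, for illustration only.
local_table = {"wikisource": "https://old.example.org/wiki/$1"}
global_table = {
    "wikisource": "https://new.example.org/wiki/$1",
    "orthodoxwiki": "https://orthodox.example.org/$1",
}
merged, added, conflicts = merge_interwiki(local_table, global_table)
print("added:", added)
print("needs review:", conflicts)
```

Leaving conflicting entries untouched is a deliberate choice in this sketch: a server admin may have customised a prefix locally, so the script only reports the difference.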
This way, fresh installations of MediaWiki get a set of current, useful interwiki prefixes, and they have the ability to pull in updates as required. It also has the benefit of separating out the WMF-specific stuff from the global MediaWiki logic, which is a win for external users of MW.
Two other things it would be nice to do:
* Define a proper scope for the interwiki map. At the moment it is a bit unclear what should and shouldn't be there. The fact that we currently have a Linux users' group from New Zealand and someone's personal blog on the map suggests the scope of the map has not been well thought out over the years. My suggested criterion at [3] is:
"Most well-established and active wikis should have interwiki prefixes, regardless of whether or not they are using MediaWiki software. Sites that are not wikis may be acceptable in some cases, particularly if they are very commonly linked to (e.g. Google, OEIS)."
* Take this opportunity to CLEAN UP the global interwiki map!
** Many of the links are long dead.
** Many new wikis have sprung up in the last few years that deserve to be added.
** Broken prefixes can be moved to the WMF-specific map so existing links on WMF sites can be cleaned up and dealt with appropriately.
** We could add API URLs to fill the iw_api column in the database (currently empty by default).
I'm interested to hear your thoughts on these ideas.
Sorry for the long message, but I really think this topic has been neglected for far too long.
TTO
----
PS. I am aware of an RFC on MediaWiki.org relating to this, but I can't see that gaining traction any time soon. This proposal would be a more light-weight way of dealing with the problem at hand.
[1] https://gerrit.wikimedia.org/r/#/c/84303/
[2] https://meta.wikimedia.org/wiki/Interwiki_map
[3] https://www.mediawiki.org/wiki/User:This,_that_and_the_other/Interwiki_map
[4] https://meta.wikimedia.org/wiki/User:This,_that_and_the_other/Local_interwik...
[5] http://git.wikimedia.org/blob/mediawiki%2Fcore.git/master/maintenance%2Finte...
[6] http://git.wikimedia.org/blob/mediawiki%2Fcore.git/master/maintenance%2Finte...
[7] https://git.wikimedia.org/blob/mediawiki%2Fextensions%2FWikimediaMaintenance...
You might want to have a look at https://www.mediawiki.org/wiki/Requests_for_comment/New_sites_system . That's more future proof than using the current interwiki system IMO. Also we already use a subset of that for Wikidata.
Cheers,
Marius
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
On 16/01/14 22:06, This, that and the other wrote:
"Most well-established and active wikis should have interwiki prefixes, regardless of whether or not they are using MediaWiki software. Sites that are not wikis may be acceptable in some cases, particularly if they are very commonly linked to (e.g. Google, OEIS)."
I think the interwiki map should be retired. I think broken links should be removed from it, and no new wikis should be added.
Interwiki prefixes, local namespaces and article titles containing a plain colon intractably conflict. Every time you add a new interwiki prefix, main namespace articles which had that prefix in their title become inaccessible and need to be recovered with a maintenance script.
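To make the collision concrete, here is a minimal sketch of how one might list the titles that a newly added prefix would shadow. The function name `shadowed_titles` and the sample titles are hypothetical, and the case-insensitive comparison of the text before the first colon is an assumption about prefix matching, not a copy of MediaWiki's title parser.

```python
def shadowed_titles(titles, new_prefix):
    """Return the titles whose text before the first colon equals new_prefix."""
    wanted = new_prefix.strip().lower()
    hits = []
    for title in titles:
        head, sep, _rest = title.partition(":")
        # Only titles that actually contain a colon can collide.
        if sep and head.strip().lower() == wanted:
            hits.append(title)
    return hits


# Hypothetical main-namespace titles, for illustration only.
titles = ["Gerrit: a history", "Gerrit", "Star Trek: Voyager", "Gerrit:Setup"]
print(shadowed_titles(titles, "gerrit"))
```

With these sample titles, "Gerrit: a history" and "Gerrit:Setup" are reported, while "Gerrit" (no colon) and "Star Trek: Voyager" (different prefix) are not.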
There is a very good, standardised system for linking to arbitrary remote wikis -- URLs. URLs have the advantage of not sharing a namespace with local article titles.
Even the introduction of new WMF-to-WMF interwiki prefixes has caused the breakage of large numbers of article titles. I can see that is convenient, but I think it should be replaced even in that use case. UI convenience, link styling and rel=nofollow can be dealt with in other ways.
-- Tim Starling
On Thu, Jan 16, 2014 at 10:56 PM, Tim Starling tstarling@wikimedia.org wrote:
I think the interwiki map should be retired. I think broken links should be removed from it, and no new wikis should be added.
Interwiki prefixes, local namespaces and article titles containing a plain colon intractably conflict. Every time you add a new interwiki prefix, main namespace articles which had that prefix in their title become inaccessible and need to be recovered with a maintenance script.
There is a very good, standardised system for linking to arbitrary remote wikis -- URLs. URLs have the advantage of not sharing a namespace with local article titles.
Even the introduction of new WMF-to-WMF interwiki prefixes has caused the breakage of large numbers of article titles. I can see that is convenient, but I think it should be replaced even in that use case. UI convenience, link styling and rel=nofollow can be dealt with in other ways.
These are some good points. I've run into a problem many times when importing pages (e.g. templates and/or their documentation) from Wikipedia, that pages like [[Wikipedia:Signatures]] become interwiki links to Wikipedia mainspace rather than redlinks. Also, usually I end up accessing interwiki prefixes through templates like Template:w (https://meta.wikimedia.org/wiki/Template:W) anyway. It would be a simple matter to make those templates generate URLs rather than interwiki links. The only other way to prevent these conflicts from happening would be to use a different delimiter besides a single colon; but what would that replacement be?
Before retiring the interwiki map, we could run a bot to edit all the pages that use interwiki links, and convert the interwiki links to template uses. A template would have the same advantage as an interwiki link in making it easy to change the URLs if the site were to switch domains or change its URL scheme.
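The bot's conversion step might look something like the sketch below. It is an assumption-laden illustration: the function name `interwiki_to_template`, the regular expression, and the convention that the template is named after the prefix are all hypothetical, and a real bot would also need to handle targets containing "=" or nested markup, which this does not.

```python
import re

# [[prefix:target]] or [[prefix:target|label]]; links starting with a colon do not match.
LINK_RE = re.compile(r"\[\[([^\[\]|:]+):([^\[\]|]*)(?:\|([^\[\]]*))?\]\]")


def interwiki_to_template(wikitext, prefixes):
    """Rewrite links using one of the given prefixes as {{prefix|target|label}}."""
    wanted = {p.lower() for p in prefixes}

    def repl(match):
        prefix = match.group(1).strip().lower()
        if prefix not in wanted:
            return match.group(0)  # namespace or unknown prefix: leave untouched
        parts = [prefix, match.group(2)]
        if match.group(3) is not None:
            parts.append(match.group(3))
        return "{{" + "|".join(parts) + "}}"

    return LINK_RE.sub(repl, wikitext)


text = "See [[gerrit:12345]] and [[w:Main Page|the main page]], not [[Help:Contents]]."
print(interwiki_to_template(text, ["gerrit", "w"]))
```

Passing the list of interwiki prefixes explicitly is what keeps local namespaces such as Help: out of the conversion.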
"Tim Starling" wrote in message news:lba9ld$8pj$1@ger.gmane.org...
I think the interwiki map should be retired. I think broken links should be removed from it, and no new wikis should be added.
Interwiki prefixes, local namespaces and article titles containing a plain colon intractably conflict. Every time you add a new interwiki prefix, main namespace articles which had that prefix in their title become inaccessible and need to be recovered with a maintenance script.
There is a very good, standardised system for linking to arbitrary remote wikis -- URLs. URLs have the advantage of not sharing a namespace with local article titles.
Even the introduction of new WMF-to-WMF interwiki prefixes has caused the breakage of large numbers of article titles. I can see that is convenient, but I think it should be replaced even in that use case. UI convenience, link styling and rel=nofollow can be dealt with in other ways.
-- Tim Starling
The one main advantage of interwiki mapping is the convenience you mention. They save a great amount of unnecessary typing and remembering of URLs. Whenever we go to any WMF wiki, we can simply type [[gerrit:12345]] and know that the link will point where we want it to.
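As a rough illustration of what the prefix saves the editor from typing, the sketch below expands a prefixed link into a URL by substituting the title for "$1" in the stored pattern. The function name `expand_interwiki`, the example.org URL patterns, and the simplified encoding (spaces to underscores, then percent-encoding) are assumptions and do not reproduce MediaWiki's exact behaviour.

```python
from urllib.parse import quote


def expand_interwiki(link, table):
    """Expand 'prefix:Title' to a URL, or return None if the prefix is unknown."""
    prefix, sep, title = link.partition(":")
    pattern = table.get(prefix.strip().lower()) if sep else None
    if pattern is None:
        return None
    # Simplified encoding: underscores for spaces, keep "/" and ":" readable.
    encoded = quote(title.strip().replace(" ", "_"), safe="/:")
    return pattern.replace("$1", encoded)


# Hypothetical table; the real URL patterns live in the interwiki map.
table = {
    "gerrit": "https://gerrit.example.org/r/$1",
    "w": "https://wiki.example.org/wiki/$1",
}
print(expand_interwiki("gerrit:12345", table))
print(expand_interwiki("w:Main Page", table))
```

If a target site moves, only the pattern in the table changes; every existing [[prefix:...]] link keeps working.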
Some possible alternatives to our current system would include:
* to make people manually type out URLs everywhere (silly)
* to use cross-wiki linking templates instead of interwikis. This has its own set of problems: cross-wiki transclusion is another area in sore need of attention (see bug 4547); we need to decide which wikis get their own linking templates; how do we deal with collisions between local and global (cross-wiki) templates? etc. To me, it doesn't seem worth the effort.
* to introduce a new syntax for interwiki links that does not collide with internal links (too ambitious?)
I personally favour keeping interwikis as we know them, as collisions are very rare, and none of the alternatives seem viable or practical. Maybe the advent of interactive editing systems like VisualEditor and Flow will make them obsolete, but until then, editors need the convenience and flexibility that they offer when writing wikitext.
It seems as though your proposal, Tim, relates to the WMF cluster. I'd be interested to know what your thoughts are with relation to the interwiki table in external MediaWiki installations.
TTO
On 01/16/2014 07:56 PM, Tim Starling wrote:
I think the interwiki map should be retired. I think broken links should be removed from it, and no new wikis should be added.
Interwiki prefixes, local namespaces and article titles containing a plain colon intractably conflict. Every time you add a new interwiki prefix, main namespace articles which had that prefix in their title become inaccessible and need to be recovered with a maintenance script.
There is a very good, standardised system for linking to arbitrary remote wikis -- URLs. URLs have the advantage of not sharing a namespace with local article titles.
The underlying issue here is that we are still using wikitext as our primary storage format, rather than treating it as the textual user interface it is. With HTML storage this issue disappears, as interwiki links are stored with full URLs. When using the wikitext editor, prefixes are introduced correctly and on demand, so you get the convenience without the conflicts.
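The "prefixes introduced on demand" idea can be pictured with the sketch below: a full URL is what gets stored, and a prefix is chosen only when wikitext is generated for an editor. This is purely illustrative and is not Parsoid's implementation; the function name `url_to_interwiki`, the example.org table, and the restriction to patterns ending in "$1" are assumptions.

```python
def url_to_interwiki(url, table):
    """Return '[[prefix:rest]]' for the longest matching pattern, else None."""
    best = None
    for prefix, pattern in table.items():
        base, sep, tail = pattern.partition("$1")
        if not sep or tail:
            continue  # this sketch only handles patterns that end in "$1"
        if url.startswith(base) and len(url) > len(base):
            # Prefer the most specific (longest) base URL.
            if best is None or len(base) > len(best[1]):
                best = (prefix, base)
    if best is None:
        return None
    return "[[" + best[0] + ":" + url[len(best[1]):] + "]]"


# Hypothetical table, for illustration only.
table = {"w": "https://wiki.example.org/wiki/$1"}
print(url_to_interwiki("https://wiki.example.org/wiki/Main_Page", table))
print(url_to_interwiki("https://elsewhere.example.net/page", table))
```

Because the stored form is the URL, adding a prefix later cannot make an existing local title unreachable; it only changes how links are displayed in the wikitext editor.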
Currently Flow is the only project using HTML storage. We are working on preparing this for MediaWiki proper though, so in the longer term the interwiki conflict issue should disappear.
Gabriel
----- Original Message -----
From: "Gabriel Wicke" gwicke@wikimedia.org
Currently Flow is the only project using HTML storage. We are working on preparing this for MediaWiki proper though, so in the longer term the interwiki conflict issue should disappear.
Where, by "HTML storage" I hope you actually mean "something that isn't HTML" storage, since HTML is a *presentation* markup language, not a semantic one, and thus singularly unsuited to use for the sort of semantic storage a wiki engine requires...
Cheers, -- jra
On 01/17/2014 10:17 AM, Jay Ashworth wrote:
----- Original Message -----
From: "Gabriel Wicke" gwicke@wikimedia.org
Currently Flow is the only project using HTML storage. We are working on preparing this for MediaWiki proper though, so in the longer term the interwiki conflict issue should disappear.
Where, by "HTML storage" I hope you actually mean "something that isn't HTML" storage, since HTML is a *presentation* markup language, not a semantic one, and thus singularly unsuited to use for the sort of semantic storage a wiki engine requires...
I mean our HTML5+RDFa DOM spec format [1], which is semantic markup that also displays as expected. It exposes all the semantic information of Wikitext in RDFa, which is why Parsoid can provide a wikitext editing interface to it.
Gabriel
[1]: https://www.mediawiki.org/wiki/Parsoid/MediaWiki_DOM_spec
Tim Starling wrote:
I can see that is convenient, but I think it should be replaced even in that use case. UI convenience, link styling and rel=nofollow can be dealt with in other ways.
Re: https://meta.wikimedia.org/wiki/Interwiki_map
It's not just convenience. Interwiki links are an easy way to implement global (across all Wikimedia wikis) templates. They're very simple linker templates, but templates just the same.
Instead of {{bugzilla|}} for Bugzilla, you use [[bugzilla:]]. Instead of updating dozens of templates on hundreds of wikis indefinitely, you can update a centralized interwiki map. The centralized map also helps avoid conflicts. And if one day one of the targets moves and doesn't leave a redirect (boo!), we can theoretically update the interwiki map and all of the links across Wikimedia wikis will continue to work. I believe we use this feature occasionally.
We could make parser functions such as "{{#bugzilla:}}", but depending on who you ask, wikitext as a written form is on its way out. I'm not sure the investment is worth the return.
I suppose it's possible that people are using interwiki markup to disable the typical link icons, but instead we should be discussing link icons generally in the user interface. This is pretty far removed from interwiki links, in my opinion. I do know that people occasionally use redirection to get around weird link generation behavior when using interwiki markup. As I recall, space interpretation was the center of that (i.e., query paths containing "_" v. "+" v. "%20" v. " " &c.).
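For readers unfamiliar with the space problem, the four spellings mentioned above can be produced with Python's standard library as shown below. The title "Main Page" is just an example; which spelling a given target site accepts is site-specific and is not something this snippet can tell you.

```python
from urllib.parse import quote, quote_plus

title = "Main Page"
variants = {
    "underscore": title.replace(" ", "_"),  # wiki-style page titles
    "plus": quote_plus(title),              # form-encoded query strings
    "percent": quote(title),                # generic percent-encoding
    "raw": title,                           # unencoded, as typed
}
for name, value in variants.items():
    print(name, "->", value)
```

A link generator that picks one of these for every target will be wrong for some sites, which is presumably why people resorted to redirects.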
Regarding rel=nofollow and link trustworthiness: I'm not sure any sane search engine continues to trust user input these days. I thought lessons of the past taught developers that people are pretty unscrupulous. :-)
MZMcBride