Re Denny
- Questions by Bawulff I redacted from my answer (because I was
[..]
Most of all, we need global identifiers for the different wikis. We could add a table that contains only the mapping of local prefixes to global identifiers, but we think that the current interwiki table could use some love anyway, and thus we decided to restructure it as a whole. This has now led to the above-mentioned RFC, but the original blocker is: for providing language links from a central source -- Wikidata -- we need to have global wiki identifiers.
In some ways we already have that. There is the iw_wikiid field added for the GSoC project, which was never merged. Wiki ids should be unique within the wiki farm (since they correspond to db names).
I probably misunderstand. If currently something is not set up as an interlanguage link and neither as an interwiki link, it will become a normal link, not an interwiki link (i.e. it will point to the local page foo:some page in the main namespace). Did you mean something else?
Interlanguage links are only interlanguage on subject namespace pages. On talk pages they're normal interwikis, so a link that is only an interlanguage link and not an interwiki link does not make sense.
I would really like to see the interlanguage stuff re-done. Preferably with a means to configure multiple types of interwikis-that-go-in-sidebars so people could have interproject links and what not. Commons might have a section (portlet) in the sidebar for each of the sister projects, and each section contains the language links for that project
The issue I was trying to deal with was storage. Currently we 100% assume that the interwiki list is a table and there will only ever be one of them.
Do we really assume that? Certainly that's the default config, but I don't think that is the config used on WMF. As far as I'm aware, Wikimedia uses a cdb database file (via $wgInterwikiCache), which contains all the interwikis for all sites. From what I understand, it supports doing various "scope" levels of interwikis, including per db, per site (Wikipedia, Wiktionary, etc), or global interwikis that act on all sites.
We did not know about that database. Who can tell us more about it? This would be very interesting for optimizing our syncing code.
It still wouldn't help us with the global identifiers, but it would be good to know more about it.
I've tried to add a brief bit on the RFC page (mostly gleaned from the docs), though I was kind of rushed. It's basically a cdb file that has all the interwiki links for several wikis.
--bawolff
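To make the "scope" levels mentioned above concrete, here is a minimal sketch of how a lookup against a single shared key/value file might work. A plain dict stands in for the cdb file, and the key layout, the narrowest-scope-first order, and the names INTERWIKI_CACHE and lookup_interwiki are all assumptions made for illustration, not a description of the actual $wgInterwikiCache format.

```python
from typing import Optional

# Stand-in for the cdb file: one flat key/value store shared by several wikis.
# The key layout (db scope, "_site" scope, "__global" scope) is an assumption
# made for this sketch only.
INTERWIKI_CACHE = {
    "__global:mw": "https://www.mediawiki.org/wiki/$1",
    "_wikipedia:wikt": "https://en.wiktionary.org/wiki/$1",
    "enwiki:fr": "https://fr.wikipedia.org/wiki/$1",
    "enwiktionary:fr": "https://fr.wiktionary.org/wiki/$1",
}


def lookup_interwiki(prefix: str, wiki_id: str, site: str) -> Optional[str]:
    """Return the URL pattern for a prefix, trying the narrowest scope first."""
    candidate_keys = (
        f"{wiki_id}:{prefix}",    # per-db entry
        f"_{site}:{prefix}",      # per-site entry (Wikipedia, Wiktionary, ...)
        f"__global:{prefix}",     # entry that applies to all wikis
    )
    for key in candidate_keys:
        url = INTERWIKI_CACHE.get(key)
        if url is not None:
            return url
    return None
```

With this layout, the same prefix can resolve differently per wiki: `lookup_interwiki("fr", "enwiki", "wikipedia")` finds the per-db entry, while `lookup_interwiki("mw", "enwiki", "wikipedia")` falls through to the global one.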
On Tue, 21 Aug 2012 15:33:59 -0700, bawolff bawolff+wn@gmail.com wrote:
I would really like to see the interlanguage stuff re-done. Preferably with a means to configure multiple types of interwikis-that-go-in-sidebars so people could have interproject links and what not. Commons might have a section (portlet) in the sidebar for each of the sister projects, and each section contains the language links for that project
That's why we turned the language link boolean into a string ;)
Also part of my skinning rewrite plans (if I ever get back to that) actually touched the area of letting us have more than just language links in the sidebar/wherever... https://www.mediawiki.org/wiki/User:Dantman/Skinning_system/Link_lists_rewri... https://www.mediawiki.org/wiki/User:Dantman/Skinning_system/Customization#Al...
That said there are some other parts of interlanguage I'd like changed too. It would be nice if we could drop the in-wiki [[en:Foo]] for a proper interface and have templates use something like {{#languagelink:en:Foo}}.
2012/8/22 Daniel Friesen daniel@nadir-seen-fire.com:
That said there are some other parts of interlanguage I'd like changed too. It would be nice if we could drop the in-wiki [[en:Foo]] for a proper interface and have templates use something like {{#languagelink:en:Foo}}.
While I see that this would be nice in the long run, its possible effect on the existing text in the Wikipedias means I would strongly suggest not making it a blocker for Wikidata.
My assumption would be: once we have Wikidata running, the number of language links in the Wikipedias will drop considerably, after which a change to the syntax of interlanguage links would become much more feasible. Changing the syntax for language links just so we can externalize them to Wikidata is not the way I would suggest to go.
Does this sound reasonable?
Cheers, Denny
On Wed, 22 Aug 2012 09:56:17 -0700, Denny Vrandečić denny.vrandecic@wikimedia.de wrote:
While I see that this would be nice in the long run, its possible effect on the existing text in the Wikipedias means I would strongly suggest not making it a blocker for Wikidata.
My assumption would be: once we have Wikidata running, the number of language links in the Wikipedias will drop considerably, after which a change to the syntax of interlanguage links would become much more feasible. Changing the syntax for language links just so we can externalize them to Wikidata is not the way I would suggest to go.
Does this sound reasonable?
Cheers, Denny
Sure, that was already the plan. The last two of the three paragraphs were just extra notes on future thoughts.
The only part that matters now is that we changed the way to answer "Is this an interlanguage link?" to "Does the type equal 'languagelink'?" instead of "Is this languagelink boolean true?", so that in the future there is room for extensions to put something like 'interproject', 'sistersite', etc. in there.
And ideally we will also make the local identifier/prefix's type a hint rather than the only way to determine what a link's type is (i.e., ideally the site link will have its own type column instead of relying on the site_{identifier,prefix} table's type column).
Though we still need to deal with the discussion on what to do about the old iwlinks and langlinks tables.
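As a rough illustration of the type check described above, here is a small sketch in which a link's own type wins and the identifier/prefix type is only a fallback hint. The class names SiteIdentifier and SiteLink, their fields, and the fallback rule are hypothetical and chosen for this example only; they do not reflect the actual table or class layout.

```python
from dataclasses import dataclass
from typing import Optional

LANGUAGE_LINK = "languagelink"


@dataclass
class SiteIdentifier:
    """Hypothetical row mapping a local prefix to a global wiki id."""
    prefix: str
    global_id: str
    type_hint: str  # e.g. "languagelink", "interproject", "sistersite"


@dataclass
class SiteLink:
    """Hypothetical link record with an optional type of its own."""
    identifier: SiteIdentifier
    title: str
    link_type: Optional[str] = None  # None means: fall back to the hint

    def effective_type(self) -> str:
        # The link's own type takes precedence; the prefix type is only a hint.
        if self.link_type is not None:
            return self.link_type
        return self.identifier.type_hint


def is_language_link(link: SiteLink) -> bool:
    # A string comparison instead of a boolean flag leaves room for other types.
    return link.effective_type() == LANGUAGE_LINK
```

Because the check is a string comparison, an extension could introduce a new type such as 'interproject' without touching is_language_link.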
wikitech-l@lists.wikimedia.org