Hi all,
Thanks to Daniel (F.) for structuring the discussion, which is currently ongoing here:
https://www.mediawiki.org/wiki/Requests_for_comment/New_sites_system
I hope that the requirements and use cases section is complete. If not, please chime in now. We will build on the use cases and their discussion there.
I also created a first draft for a schema, which was very quickly ripped apart and replaced by a much better one on the discussion page. There are also other discussions going on there. Please join in if you are interested in the sites table, so that we can reach consensus on the topic.
Furthermore, I want to address the unanswered questions Rob raised:
* Re Tim's July 18th comment and Rob's following comment: where is the calling code?
The code calling the sites table is in the Wikibase library, basically all the files starting with Site*. But since they are part of the patchset, you have probably seen them. The sites info is used in:
* most importantly Wikibase/lib/includes/SiteLink.php, where a site link (e.g. the link from a Wikidata item to a Wikipedia article) is defined using the sites data. Site links are the most prominent objects depending on this data and are used basically everywhere on the repository; Wikibase/repo/includes/api/ApiSetSiteLink.php is a good example of that. A rough sketch of the call pattern follows this list.
* some utils in Wikibase/lib/includes/Utils.php
* further, a few places on the client, like LangLinkHandler and the hooks
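To make the dependency concrete, here is a rough sketch of the call pattern. The class and method names are illustrative only, not necessarily the actual Wikibase API:

    <?php
    // Illustrative only: resolve a global site id to a site object,
    // then build the link from a Wikidata item to a page on that site.
    $site = Sites::singleton()->getSiteByGlobalId( 'enwiki' ); // assumed lookup helper
    $link = new SiteLink( $site, 'Berlin' );                   // site + page name
    echo $link->getUrl(); // e.g. https://en.wikipedia.org/wiki/Berlin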
* Questions by Bawolff that I left out of my earlier answer (because I was focusing on other things):
> First and foremost, I'm a little confused as to what the actual use cases here are. Could we get a short summary, for those who aren't entirely following how Wikidata will work, of why the current interwiki situation is insufficient?
Most of all, we need global identifiers for the different wikis. We could add a table that only contains a mapping of the local prefixes to global identifiers, but we think that the current interwiki table could use some love anyway, and thus we decided to restructure it as a whole. This has now led to the above-mentioned RFC, but the original blocker remains: to provide language links from a central source -- Wikidata -- we need global wiki identifiers.
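To make that concrete, here is a purely illustrative sketch of such a prefix-to-global-id mapping; all identifiers are invented for the example:

    <?php
    // Local interwiki prefixes differ per wiki; the global id must not.
    $localPrefixToGlobalId = array(
        'en'   => 'enwiki',        // English Wikipedia
        'de'   => 'dewiki',        // German Wikipedia
        'wikt' => 'enwiktionary',  // English Wiktionary
    );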
>> - Site definitions can exist that are not used as "interlanguage link" and not used as "interwiki link"
> And if we put one of those on a talk page, what would happen? Or if foo was one such link, doing [[:foo:some page]] (current behaviour is that it becomes an interwiki link).
I probably misunderstand. If something is currently set up as neither an interlanguage link nor an interwiki link, it will become a normal link, not an interwiki link (i.e. it will point to the local page foo:some page in the main namespace). Did you mean something else?
> Although to be fair, I do see how the current way we distinguish between interwiki and interlang links is a bit hacky.
Agreed, the way it is currently done in core is a bit hacky.
And in fact we are making this more flexible with the type system. The MediaWiki site type could, for instance, form both "nice" URLs and index.php ones, and a Gerrit type could have the logic to distinguish between a Gerrit commit number and a sha1 hash.
> I must admit I do like this idea. In particular, the current situation where we treat the value of an interwiki link as a title (aka spaces -> underscores etc.) even for sites that do not use such conventions has always bothered me. Having interwikis that support URL rewriting based on the value does sound cool, but I certainly wouldn't want said code in a db blob (and just using an integer site_type identifier is quite far from giving us that, but it's still a step in a positive direction), which raises the question of where such rewriting code would go.
A handler class for each type of site, which would construct links to that type of site based on the data about that site. Roughly along the lines of the sketch below.
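Here is what such handlers could look like; all names and URL patterns below are my own invention, not settled design:

    <?php
    // Hypothetical per-type handler: turns site data plus a target
    // string into a URL for that kind of site.
    interface SiteLinkHandler {
        public function makeUrl( array $siteData, $target );
    }

    class MediaWikiSiteHandler implements SiteLinkHandler {
        public function makeUrl( array $siteData, $target ) {
            $title = str_replace( ' ', '_', $target ); // MediaWiki title convention
            if ( isset( $siteData['articlepath'] ) ) {
                // "Nice" URL, e.g. https://en.wikipedia.org/wiki/$1
                return str_replace( '$1', rawurlencode( $title ), $siteData['articlepath'] );
            }
            // Fall back to an index.php style URL.
            return $siteData['scriptpath'] . '/index.php?title=' . rawurlencode( $title );
        }
    }

    class GerritSiteHandler implements SiteLinkHandler {
        public function makeUrl( array $siteData, $target ) {
            // A purely numeric target is treated as a change number,
            // anything else as a sha1 hash (URL shapes assumed).
            if ( ctype_digit( $target ) ) {
                return $siteData['base'] . '/#/c/' . $target . '/';
            }
            return $siteData['base'] . '/#/q/' . $target . ',n,z';
        }
    }

    // Example use:
    $handler = new GerritSiteHandler();
    echo $handler->makeUrl( array( 'base' => 'https://gerrit.wikimedia.org/r' ), '12345' );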
>> The issue I was trying to deal with was storage. Currently we 100% assume that the interwiki list is a table and there will only ever be one of them.
> Do we really assume that? Certainly that's the default config, but I don't think that is the config used at WMF. As far as I'm aware, Wikimedia uses a cdb database file (via $wgInterwikiCache), which contains all the interwikis for all sites. From what I understand, it supports various "scope" levels of interwikis, including per db, per site (Wikipedia, Wiktionary, etc.), or global interwikis that act on all sites.
We did not know about that database. Who can tell us more about it? It would be very interesting for optimizing our syncing code.
It still wouldn't help us with the global identifiers, but it would be good to know more about it.
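For anyone who wants to poke at such a file, a minimal sketch using PHP's dba extension with the cdb handler; the path and key layout here are assumptions on my part (the authoritative format is whatever core's Interwiki code writes and reads):

    <?php
    // Hypothetical: open an interwiki cdb read-only and look up one prefix.
    $db = dba_open( '/path/to/interwiki.cdb', 'r', 'cdb' );
    if ( $db !== false ) {
        // Key layout assumed to be "<wiki id>:<prefix>"; the real file
        // reportedly also has special keys for site-wide and global scopes.
        $entry = dba_fetch( 'enwiki:de', $db );
        var_dump( $entry );
        dba_close( $db );
    }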
Cheers, Denny
--
Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de

Wikimedia Deutschland - Society for the Promotion of Free Knowledge e.V. Registered in the register of associations at the Amtsgericht Berlin-Charlottenburg under number 23855 B. Recognized as charitable by the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.
2012/8/14 Daniel Friesen <lists@nadir-seen-fire.com>:
> On Tue, 14 Aug 2012 07:32:07 -0700, Jeroen De Dauw <jeroendedauw@gmail.com> wrote:
>> Hey,
>>> You mention using a global id to refer to sites for making links, and synchronization of the sites table.
>>> So you're saying that this part of Wikidata only works within Wikimedia projects, right?
>>> Does Wikidata overall only function within Wikimedia projects, or is there a different mechanism to deal with clients from external wikis?
>> The software we're writing is completely Wikimedia agnostic, and the actual Wikidata project will obviously be usable outside of Wikimedia projects. We will allow links to non-Wikimedia sites (although we have not agreed on how open this will be), and non-Wikimedia sites will be able to access all data stored within Wikidata (including our "equivalent links" using the sites table). Does that answer your question, or am I missing something?
>> Cheers
>>
>> --
>> Jeroen De Dauw
>> http://www.bn2vs.com
>> Don't panic. Don't be evil.
> Ok, so the data is available to 3rd party wikis.
>
> I was asking how you planned to handle sites in 3rd party wikis. Do you have a separate mechanism to handle links from 3rd party clients? Or are they supposed to sync their sites from Wikimedia's Wikidata?
> --
> ~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]