CCing wikidata.
I don't think this is a good approach. We shouldn't be breaking the API just because there is a new under-the-hood feature (wikibase). From the API client's perspective, it should work as before, plus there should be an extra flag indicating whether the sitelink is stored in Wikidata or locally. Sitelinks might be the first such change, but not the last - e.g. categories could follow.
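For illustration, such a flag might surface in prop=langlinks output like this. The "origin" field is hypothetical - nothing like it exists at this point - while the surrounding structure is the module's real JSON shape:

    # Hypothetical prop=langlinks JSON, as a Python literal. The "origin"
    # key is the proposed (non-existent) flag; "lang" and "*" are the
    # module's real fields.
    hypothetical_response = {
        "query": {
            "pages": {
                "736": {
                    "pageid": 736,
                    "ns": 0,
                    "title": "Albert Einstein",
                    "langlinks": [
                        {"lang": "de", "*": "Albert Einstein", "origin": "local"},
                        {"lang": "fr", "*": "Albert Einstein", "origin": "wikidata"},
                    ],
                }
            }
        }
    }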
As for the implementation, it seems the hook approach might not satisfy all the usage scenarios:
* Given a set of pages (pageset), give all the sitelinks (possibly filtered by a set of wanted languages). Rendering a page for the UI would use this approach with just one page.
* langbacklinks - get a list of pages linking to a site.
* filtering based on having/not having a specific langlink, for other modules. E.g. list all pages that have/don't have a link to a site X.
* alllanglinks (not yet implemented, but might be, to match the corresponding allcategories, ...) - list all existing langlinks on the site.

Rough sketches of these queries follow below.
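A minimal sketch of the four scenarios against today's API, assuming the requests library and the English Wikipedia endpoint. The parameter names (lllang, lbllang, lbltitle, apfilterlanglinks) are the modules' real ones; error handling and continuation are omitted:

    # Rough sketches of the four scenarios above.
    import requests

    API = "https://en.wikipedia.org/w/api.php"

    def query(params):
        return requests.get(
            API, params={**params, "action": "query", "format": "json"}
        ).json()

    # Scenario 1: langlinks for a set of pages, filtered to one wanted language.
    links = query({"prop": "langlinks", "titles": "Foo|Bar", "lllang": "de"})

    # Scenario 2: langbacklinks - pages that link to de:Foo.
    backlinks = query({"list": "langbacklinks", "lbllang": "de", "lbltitle": "Foo"})

    # Scenario 3: allpages can filter on pages having (or lacking) any langlink;
    # filtering on a link to one specific site is the harder case raised above.
    filtered = query({"list": "allpages", "apfilterlanglinks": "withlanglinks"})

    # Scenario 4: alllanglinks does not exist yet at the time of this thread.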
We could debate the need for some of these scenarios, but I feel that we shouldn't be breaking the existing API.
On Thu, Apr 25, 2013 at 2:24 PM, Brad Jorsch bjorsch@wikimedia.org wrote:
Language links added by Wikidata are currently stored in the parser cache and in the langlinks table in the database, which means they work the same as in-page langlinks, but also that the page must be reparsed whenever the Wikidata langlinks change. The Wikidata team has proposed to remove the need for the page reparse, at the cost of changing the behavior of the API with regard to langlinks.
Gerrit change 59997[1] (still in review) will make the following behavioral changes:
- action=parse will return only the in-page langlinks by default. Inclusion of Wikidata langlinks may be requested using a new parameter.
- list=allpages with apfilterlanglinks will only consider in-page langlinks.
- list=langbacklinks will only consider in-page langlinks.
- prop=langlinks will only list in-page langlinks.
Gerrit change 60034[2] (still in review) will make the following behavioral changes:
- prop=langlinks will have a new parameter to request inclusion of the Wikidata langlinks in the result.
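To make the two changes concrete, a client-side sketch: the action=parse flag is assumed here to be effectivelanglinks (the announcement does not name it; this matches the flag action=parse gained in 1.22), and the prop=langlinks parameter llexternal is purely hypothetical:

    # Sketch of both changes from the client side. "effectivelanglinks" is an
    # assumed name for the new action=parse parameter; "llexternal" is a
    # hypothetical name for the prop=langlinks one.
    import requests

    API = "https://en.wikipedia.org/w/api.php"

    # Change 1: action=parse returns in-page langlinks only, unless the new
    # parameter asks for the extension-supplied links as well.
    parsed = requests.get(API, params={
        "action": "parse", "page": "Foo", "prop": "langlinks",
        "effectivelanglinks": "1",  # assumed parameter name
        "format": "json",
    }).json()

    # Change 2: prop=langlinks gains an opt-in for Wikidata-supplied links.
    queried = requests.get(API, params={
        "action": "query", "prop": "langlinks", "titles": "Foo",
        "llexternal": "1",  # hypothetical parameter name
        "format": "json",
    }).json()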
A future change, not coded yet, will allow for Wikidata to flag its langlinks in various ways. For example, it could indicate which of the other-language articles are Featured Articles.
At this time, it seems likely that the first change will make it into 1.22wmf3.[3] The timing of the second and third changes is less certain.
--
Brad Jorsch
Software Engineer
Wikimedia Foundation
On 25.04.2013 21:43, Yuri Astrakhan wrote:
> CCing wikidata.
> I don't think this is a good approach. We shouldn't be breaking the API just because there is a new under-the-hood feature (wikibase).
This is not a breaking change to the MediaWiki API at all. The hook did not exist before. Things not using the hook keep working exactly as before.
Only once Wikidata starts using the hook does the behavior of *Wikipedia's* API change (from including external links to not including them by default).
One could actually see this as fixing a bug: currently, "external" language links are mis-reported as being "local" language links. This is being fixed.
> From the API client's perspective, it should work as before, plus there should be an extra flag indicating whether the sitelink is stored in Wikidata or locally. Sitelinks might be the first such change, but not the last - e.g. categories could follow.
The "external" links could be included per default by ApiQueryLangLinks; I did not do this for performance reasons (considering the hook makes paging a lot more difficult, and may result in a lot more database queries).
Anomie said he'd think about making this less costly.
> As for the implementation, it seems the hook approach might not satisfy all the usage scenarios:
> - Given a set of pages (pageset), give all the sitelinks (possibly filtered by a set of wanted languages). Rendering a page for the UI would use this approach with just one page.
You want the hook to work on a more complex structure, changing the link sets for multiple pages?
Possible, but I don't think it would be helpful. For any non-trivial set of pages, we'd be in danger of running out of memory, and some kind of chunking would be needed, complicating things even more. Also, implementing a handler for a hook that takes such a complex structure is quite painful and error-prone. Assembling the result from multiple calls to a simple hook makes more sense to me, and that is what I implemented in Idfcdc53af.
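As a rough Python model of the two designs (the function names are illustrative, not MediaWiki's actual hook interface):

    # Python stand-ins for the two hook designs discussed above; names are
    # illustrative only.

    def language_links_hook(title, links):
        """Per-page hook: handlers may add or remove langlinks for one title
        (e.g. Wikibase would merge in that page's sitelinks here)."""
        return links

    # The implemented design: a simple hook called once per page, with the
    # API module assembling the combined result.
    def effective_langlinks(pages):
        return {title: language_links_hook(title, list(links))
                for title, links in pages.items()}

    # The rejected design: one hook call receiving the whole page set.
    # Handlers must then do their own batching/chunking, and a large set
    # risks exhausting memory before anything can be returned.
    def bulk_language_links_hook(all_pages_and_links):
        return all_pages_and_links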
> - langbacklinks - get a list of pages linking to a site.
Yes, that would only consider locally defined links. As I understand it, this query is mainly used to find and fix broken links, so it makes sense to only include the ones that are actually defined (and fixable) locally.
> - filtering based on having/not having a specific langlink, for other modules. E.g. list all pages that have/don't have a link to a site X.
Same as above.
> - alllanglinks (not yet implemented, but might be, to match the corresponding allcategories, ...) - list all existing langlinks on the site.
Same as above. I believe the sensible semantics is "list all langlinks *defined* on the site". At least by default.
For alllanglinks, I can imagine how to do this efficiently for the wikibase case, but not for a generic hook that can manipulate sitelinks.
> We could debate the need for some of these scenarios, but I feel that we shouldn't be breaking the existing API.
Again: it doesn't. The API reports what is defined and stored locally, as before. Wikidata starting to use the new hook may break expectations about the data returned by Wikipedia's API, but that's a separate issue, I think.
--
daniel