On 25.04.2013 21:43, Yuri Astrakhan wrote:
CCing wikidata.
> I don't think this is a good approach. We shouldn't be breaking API just because
> there is a new under-the-hood feature (wikibase).
This is not a breaking change to the MediaWiki API at all. The hook did not
exist before. Things not using the hook keep working exactly as before.
Only once Wikidata starts using the hook does the behavior of *Wikipedia's* API
change (from including external language links to excluding them by default).
One could actually see this as fixing a bug: currently, "external" language
links are mis-reported as being "local" language links. This is being fixed.
> From the API client's
> perspective, it should work as before, plus there should be an extra flag
> notifying if the sitelink is stored in wikidata or locally. Sitelinks might be
> the first, but not the last change - e.g. categories, etc.
The "external" links could be included by default by ApiQueryLangLinks; I did
not do this for performance reasons (invoking the hook makes paging a lot
more difficult, and may result in many more database queries).
Anomie said he'd think about making this less costly.
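As a hypothetical illustration of the extra flag Yuri describes (the field name `external` is an assumption for this sketch, not the actual API output), a client could then filter locally defined links from Wikidata-provided ones:

```python
# Hypothetical API response fragment for prop=langlinks. The "external"
# flag name is an assumption for illustration, not the actual field name.
response = {
    "query": {
        "pages": {
            "42": {
                "title": "Example",
                "langlinks": [
                    # defined locally on the wiki
                    {"lang": "de", "title": "Beispiel", "external": False},
                    # injected from Wikidata via the hook
                    {"lang": "fr", "title": "Exemple", "external": True},
                ],
            }
        }
    }
}

# A client that only cares about locally defined (and locally fixable)
# links can simply filter on the flag:
local_links = [
    ll
    for page in response["query"]["pages"].values()
    for ll in page["langlinks"]
    if not ll["external"]
]
```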
> As for the implementation, it seems the hook approach might not satisfy all the
> usage scenarios:
> * Given a set of pages (pageset), give all the sitelinks (possibly filtered with
>   a set of wanted languages). Rendering page for the UI would use this approach
>   with just one page.
You want the hook to work on a more complex structure, changing the link sets
for multiple pages?
Possible, but I don't think it's helpful. For any non-trivial set of pages,
we'd
be in danger of running out of memory, and some kind of chunking would be
needed, complicating things even more. Also, implementing a handler for a hook
that handles such a complex structure is quite painful and error-prone.
Assembling the result from multiple calls to a simple hook seems to make more
sense to me, which is what I implemented in Idfcdc53af.
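A rough sketch of that per-page approach (the names and signatures here are illustrative, not the actual MediaWiki hook interface):

```python
from typing import Callable, Dict, List, Tuple

# A language link as a (language code, page title) pair; this is a toy
# model of the design, not the real MediaWiki types.
LangLink = Tuple[str, str]
Handler = Callable[[str, List[LangLink]], List[LangLink]]

def assemble_langlinks(
    pages: Dict[str, List[LangLink]],
    handlers: List[Handler],
) -> Dict[str, List[LangLink]]:
    """Run each simple, single-page handler once per page and collect the
    results, instead of handing one huge multi-page structure to the hook."""
    result: Dict[str, List[LangLink]] = {}
    for title, links in pages.items():
        for handler in handlers:
            links = handler(title, links)
        result[title] = links
    return result

# Example handler: a Wikidata-like extension that supplies an "external"
# link for pages that define none locally.
def add_external(title: str, links: List[LangLink]) -> List[LangLink]:
    return links if links else [("de", title)]

merged = assemble_langlinks({"Foo": [], "Bar": [("fr", "Bar")]}, [add_external])
```

Each handler only ever sees one page's links, so handlers stay simple and memory use is bounded by the largest single page rather than the whole pageset.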
> * langbacklinks - get a list of pages linking to a site.
Yes, that would only consider locally defined links. As I understand it, this
query is mainly used to find and fix broken links, so it makes sense to only
include the ones that are actually defined (and fixable) locally.
> * filtering based on having/not having specific langlink for other modules. E.g.
> list all pages that have/don't have a link to a site X.
Same as above.
> * alllanglinks (not yet implemented, but might be to match corresponding
> allcategories, ...) - list all existing langlinks in the site.
Same as above. I believe the sensible semantics is "list all langlinks *defined*
on the site", at least by default.
For alllanglinks, I can imagine how to do this efficiently for the wikibase
case, but not for a generic hook that can manipulate sitelinks.
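One reason a generic, link-manipulating hook is hard to support efficiently here (a toy model, not actual MediaWiki code): once the hook can add or remove links after the database query, paging can no longer be pushed into the query itself.

```python
# Toy model of the paging problem. The database alone can page with a
# plain offset query:
def db_query(rows, offset, limit):
    return rows[offset:offset + limit]

rows = [("en", "A"), ("en", "B"), ("de", "C"), ("de", "D")]

# Without a hook, the second page is just another offset query:
page2 = db_query(rows, 2, 2)

# A hook that filters or injects rows breaks the correspondence between
# result offsets and database offsets: to find where page 2 starts, all
# earlier rows must be fetched and passed through the hook again.
def hook(links):
    return [l for l in links if l[0] != "en"] + [("fr", "E")]

all_after_hook = hook(rows)
page2_after_hook = all_after_hook[2:4]
```

In the wikibase case the external links live in known tables, so a query can be written against them directly; an arbitrary hook offers no such handle.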
> We could debate the need of some of these scenarios, but I feel that we
> shouldn't be breaking existing API.
Again: it doesn't. The API reports what is defined and stored locally, as
before. Wikidata starting to use the new hook may break expectations about the
data returned by Wikipedia's API, but that's a separate issue, I think.
-- daniel
--
Daniel Kinzler, Software Architect
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.