Hi all. I'm writing to get input on a conceptual issue regarding the resolution of redirects.
I'm currently in the process of implementing redirects for Wikibase Items (bugzilla 66067). My present task is to add support for redirect resolution to the EntityLookup service interface (and possibly the related EntityRevisionLookup service interface; bugzilla 66075).
Currently, the two interfaces in question look like this (with some irrelevant stuff omitted):
interface EntityLookup {
    public function getEntity( EntityId $entityId, $revision = 0 );
    public function hasEntity( EntityId $entityId );
}

interface EntityRevisionLookup extends EntityLookup {
    public function getEntityRevision( EntityId $entityId, $revisionId = 0 );
    public function getLatestRevisionId( EntityId $entityId );
}
Note that getEntityRevision returns an EntityRevision object (an Entity with some revision meta data), while getEntity just returns an Entity object.
Also note that the $revision parameter in EntityLookup::getEntity is deprecated and being removed (see patch Iafdcb5b38), while $revision in EntityRevisionLookup::getEntityRevision is supposed to stay.
Presently, an attempt to look up an Entity via an ID that has been turned into a redirect results in an exception being thrown. To implement redirect resolution, my original intention was to leave EntityRevisionLookup as is, and change EntityLookup like this:
interface EntityLookup {
    public function getEntity( EntityId $entityId, $resolveRedirects = 1 );
    public function hasEntity( EntityId $entityId, $resolveRedirects = 1 );
}
...with the $resolveRedirects parameter indicating how many levels of redirects should be resolved before giving up.
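To make the level counting concrete, here is a minimal sketch of what each implementation would have to do. It assumes the lookup signals a redirect by throwing an exception that carries the target ID. RedirectException, loadRaw() and the array-based store are hypothetical stand-ins for illustration, not existing Wikibase code, and plain strings are used in place of EntityId objects.

```php
<?php
// Hypothetical exception type: carries the ID the redirect points to.
class RedirectException extends Exception {
    private $targetId;

    public function __construct( $targetId ) {
        parent::__construct( "Redirect to $targetId" );
        $this->targetId = $targetId;
    }

    public function getTargetId() {
        return $this->targetId;
    }
}

// Toy storage (assumption): string values are entities,
// array values of the form [ 'redirect' => <id> ] are redirects.
function loadRaw( array $store, $id ) {
    if ( !array_key_exists( $id, $store ) ) {
        return null;
    }
    $row = $store[$id];
    if ( is_array( $row ) ) {
        throw new RedirectException( $row['redirect'] );
    }
    return $row;
}

// Follows at most $resolveRedirects levels of redirects, then gives up
// by re-throwing the last RedirectException.
function getEntityFollowingRedirects( array $store, $id, $resolveRedirects = 1 ) {
    while ( true ) {
        try {
            return loadRaw( $store, $id );
        } catch ( RedirectException $ex ) {
            if ( $resolveRedirects <= 0 ) {
                throw $ex;
            }
            $resolveRedirects--;
            $id = $ex->getTargetId();
        }
    }
}
```

With $resolveRedirects = 0 this behaves like the present code (the exception propagates). The counter also bounds the loop, so a redirect cycle cannot spin forever.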
This gives us a convenient way to get the current revision of an entity, following redirects, and it keeps the interface for requesting a specific, or the latest, version of an Entity, with meta info attached.
However, it means we have to implement the logic for redirect resolution in every implementation class, generally using the same code over and over (there are currently three implementations of EntityRevisionLookup: the actual lookup, a caching wrapper, and an in-memory fake).
Also, it does not give us a straightforward way to get the meta-data of the current revision while following redirects. For that, we'd have to modify EntityRevisionLookup::getEntityRevision:
public function getEntityRevision( EntityId $entityId, $revisionId = 0, $resolveRedirects = 0 );
This is ugly, and annoying since we'll want to *either* resolve redirects *or* specify a revision. We could use a special value for $revisionId to indicate that we not only want the current revision (indicated by 0), but also want to have redirects resolved (indicated by "follow" or -1 or whatever):
public function getEntityRevision( EntityId $entityId, $revisionIdOrRedirects = 0 );
That's concise, but somewhat magical. Or we could add another method:
public function getEntityRevisionAfterFollowingAnyRedirects( EntityId $entityId, $resolveRedirects = 1 );
That's not quite obvious, and the awkward name indicates that this isn't really what we want either.
Perhaps we can get around all this mess by making redirect resolution something the interface doesn't know about? An implementation detail? The logic for resolving redirects could be implemented in a Proxy/Wrapper that would implement EntityRevisionLookup (and thus also EntityLookup). The logic would have to be implemented only once, in one implementation class, that could be wrapped around any other implementation.
From the implementation's point of view, this is a lot more elegant, and removes all the issues of how to fit the flag for redirect resolution into the method signatures.
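A rough sketch of such a wrapper, under stated assumptions: the Lookup interface below is a cut-down stand-in for EntityRevisionLookup, redirects are signalled by a hypothetical RedirectException carrying the target ID, and InMemoryLookup is a toy fake that ignores the revision ID. None of these classes are existing Wikibase code.

```php
<?php
// Hypothetical exception type: carries the ID the redirect points to.
class RedirectException extends Exception {
    private $targetId;

    public function __construct( $targetId ) {
        parent::__construct( "Redirect to $targetId" );
        $this->targetId = $targetId;
    }

    public function getTargetId() {
        return $this->targetId;
    }
}

// Cut-down stand-in for EntityRevisionLookup (assumption).
interface Lookup {
    public function getEntityRevision( $entityId, $revisionId = 0 );
}

// Toy fake: rows are either [ 'redirect' => <id> ] or revision data.
// It ignores $revisionId, which a real implementation would not.
class InMemoryLookup implements Lookup {
    private $rows;

    public function __construct( array $rows ) {
        $this->rows = $rows;
    }

    public function getEntityRevision( $entityId, $revisionId = 0 ) {
        if ( !isset( $this->rows[$entityId] ) ) {
            return null;
        }
        $row = $this->rows[$entityId];
        if ( isset( $row['redirect'] ) ) {
            throw new RedirectException( $row['redirect'] );
        }
        return $row;
    }
}

// The wrapper: same interface, redirect logic written exactly once.
class RedirectResolvingLookup implements Lookup {
    private $inner;
    private $maxLevels;

    public function __construct( Lookup $inner, $maxLevels = 1 ) {
        $this->inner = $inner;
        $this->maxLevels = $maxLevels;
    }

    public function getEntityRevision( $entityId, $revisionId = 0 ) {
        // Only follow redirects when the latest revision (0) is requested.
        $levels = ( $revisionId === 0 ) ? $this->maxLevels : 0;
        while ( true ) {
            try {
                return $this->inner->getEntityRevision( $entityId, $revisionId );
            } catch ( RedirectException $ex ) {
                if ( $levels <= 0 ) {
                    throw $ex;
                }
                $levels--;
                $entityId = $ex->getTargetId();
            }
        }
    }
}
```

Because the wrapper only depends on the interface, it could sit in front of the actual lookup, the caching wrapper, or the in-memory fake alike.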
However, this means that the caller does not have control over whether redirects are resolved or not. It would then be the responsibility of bootstrap code to provide an instance that does, or doesn't, do redirect resolution to the appropriate places. That's impractical, since the decision whether redirects should be resolved may be dynamic (e.g. depend on a parameter in a web API call), or the caller may wish to handle redirects explicitly, by first looking up without redirect, and then with redirect resolution, after some special treatment.
So, it seems that the "ugly" variant with an extra parameter in getEntityRevision() is the most practical, even though it's not the most elegant from an OO design perspective.
What's your take on this? Got any better ideas?
-- daniel
Hey,
Resolving redirects and retrieving entities are two different things. Sometimes you want to do both, sometimes just one. Trying to create a general solution by adding one of them to an existing interface dedicated to the other is bound to end up being problematic, as your email illustrates. I suggest only putting them together where there is need to do so, and create new objects and interfaces based on the needs encountered there. Forcing the existing interface to know about redirects would be repeating the mistake of putting the revision id in there, though in this case it'd be worse.
> Perhaps we can get around all this mess by making redirect resolution something the interface doesn't know about? An implementation detail? The logic for resolving redirects could be implemented in a Proxy/Wrapper that would implement EntityRevisionLookup (and thus also EntityLookup). The logic would have to be implemented only once, in one implementation class, that could be wrapped around any other implementation.
> From the implementation's point of view, this is a lot more elegant, and removes all the issues of how to fit the flag for redirect resolution into the method signatures.
> However, this means that the caller does not have control over whether redirects are resolved or not. It would then be the responsibility of bootstrap code to provide an instance that does, or doesn't, do redirect resolution to the appropriate places. That's impractical, since the decision whether redirects should be resolved may be dynamic (e.g. depend on a parameter in a web API call), or the caller may wish to handle redirects explicitly, by first looking up without redirect, and then with redirect resolution, after some special treatment.
I'm not suggesting this is the approach to take, though I disagree with the objections raised against it. First of all, it is not the caller that has the control, it is the thing configuring the object graph being used. If the decision whether redirects should be resolved needs to happen after this configuration, then you can simply have your object require both types of lookups. This would work, though it makes clear that the approach of putting this functionality in a wrapper is odd for this use case. Having a service to resolve redirects and one to look up entities would be a lot more natural.
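As a rough illustration of the two-service idea, assuming nothing about the real Wikibase classes: RedirectResolver, PlainLookup, the array-backed implementations and GetEntitiesHandler are all hypothetical names, IDs are plain strings, and the resolver follows a single level only.

```php
<?php
// Hypothetical service: maps an ID to the ID it redirects to.
interface RedirectResolver {
    public function resolve( $entityId );
}

// Hypothetical service: looks up entities and knows nothing about redirects.
interface PlainLookup {
    public function getEntity( $entityId );
}

class ArrayRedirectResolver implements RedirectResolver {
    private $targets;

    public function __construct( array $targets ) {
        $this->targets = $targets;
    }

    // Returns the redirect target, or the ID itself if it is not a redirect.
    public function resolve( $entityId ) {
        return isset( $this->targets[$entityId] ) ? $this->targets[$entityId] : $entityId;
    }
}

class ArrayLookup implements PlainLookup {
    private $entities;

    public function __construct( array $entities ) {
        $this->entities = $entities;
    }

    public function getEntity( $entityId ) {
        return isset( $this->entities[$entityId] ) ? $this->entities[$entityId] : null;
    }
}

// A consumer that needs the decision at call time requires both services
// and combines them itself.
class GetEntitiesHandler {
    private $resolver;
    private $lookup;

    public function __construct( RedirectResolver $resolver, PlainLookup $lookup ) {
        $this->resolver = $resolver;
        $this->lookup = $lookup;
    }

    public function fetch( $entityId, $followRedirects ) {
        if ( $followRedirects ) {
            $entityId = $this->resolver->resolve( $entityId );
        }
        return $this->lookup->getEntity( $entityId );
    }
}
```

Code that never cares about redirects would depend on the lookup alone, and only the few places with a dynamic flag would take both services.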
Cheers
-- Jeroen De Dauw - http://www.bn2vs.com Software craftsmanship advocate Evil software architect at Wikimedia Germany ~=[,,_,,]:3
On 23.06.2014 15:45, Jeroen De Dauw wrote:
> Hey,
> Resolving redirects and retrieving entities are two different things. Sometimes you want to do both, sometimes just one. Trying to create a general solution by adding one of them to an existing interface dedicated to the other is bound to end up being problematic, as your email illustrates. I suggest only putting them together where there is need to do so, and create new objects and interfaces based on the needs encountered there. Forcing the existing interface to know about redirects would be repeating the mistake of putting the revision id in there, though in this case it'd be worse.
Well, the thing is, in many cases it's supposed to be opaque. Most code should just be based on "I have an ID, give me the entity", and never think about redirects. They should just work.
However, in some cases, we want to explicitly specify whether redirects should be resolved or not. E.g. wbgetentities should have a flag for that. And performing entity edit operations on a redirect should fail.
So, code that knows about redirects should use a different interface than code that doesn't care? Is that what you are saying? This *mostly* aligns with code that uses EntityLookup (doesn't care) and EntityRevisionLookup (does care).
Maybe EntityLookup should always opaquely resolve redirects, while EntityRevisionLookup should do so only when requested. That makes sense from the perspective that a context in which the revision ID is relevant is generally also a context in which redirects should be considered explicitly. I'd have to poke around to see how far that correlation goes.
What do you think of tying redirect handling to explicit revision handling?
-- daniel
PS: yes, resolving a redirect is not the same as getting an entity. Getting an entity may, however, involve resolving a redirect.
wikidata-tech@lists.wikimedia.org