Hi Yuri,
Thanks for writing this up. I'll put some comments and questions inline.
On May 30, 2013, at 7:16 PM, Yuri Astrakhan <yastrakhan(a)wikimedia.org> wrote:
*== Technical Requirements ==*
* Increase Varnish cache hits / minimize cache fragmentation
* Set up and configure new partners without code changes
* Use partner-supplied IP ranges as a preferred alternative to the geo-ip
database for fundraising & analytics teams
Note that a Varnish VMOD to support the latter is being written at the moment by Brandon Black.
*== Current state ==*
Zero domain requests set X-Subdomain="ZERO" and are treated as mobile.
The backend uses the X-Subdomain and X-CS headers to customize the result.
The cache is heavily fragmented due to its variance on both of these
headers in addition to the variance set by MobileFrontend extension and
MediaWiki core.
...and also, variance due to the different hostname (and thus URL).
*== Proposals ==*
In order to reduce Zero-caused fragmentation, we propose to shrink from one
bucket per carrier (X-CS) to three general buckets:
* smart phones bucket -- banner and site modifications are done on the
client in javascript
* feature phones -- HTML only, the banner is inserted via ESI
** for carriers with free images
** for carriers without free images
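To illustrate the bucketing idea above, here is a minimal Python sketch (not actual Varnish configuration) of how the cache variants could collapse from one per carrier to three in total. The function name `zero_cache_bucket`, the bucket labels, and the carrier names are all hypothetical and chosen only for illustration.

```python
def zero_cache_bucket(device_type: str, images_allowed: bool) -> str:
    """Map a Zero request onto one of the three proposed cache buckets.

    device_type is 'mobile' (smart phone) or 'legacy' (feature phone).
    The returned labels are illustrative, not names from the proposal.
    """
    if device_type == "mobile":
        # Smart phones: banner and site changes happen client-side in
        # JavaScript, so all carriers can share one cached object.
        return "smart"
    if device_type == "legacy":
        # Feature phones: HTML only. The variant depends on whether the
        # carrier offers free images, not on which carrier it is.
        return "legacy-images" if images_allowed else "legacy-noimages"
    raise ValueError(f"not a mobile device type: {device_type!r}")


# Hypothetical traffic: (X-CS value, device type, carrier has free images?)
requests = [
    ("carrier-a", "mobile", True),
    ("carrier-b", "mobile", False),
    ("carrier-a", "legacy", True),
    ("carrier-c", "legacy", True),
    ("carrier-b", "legacy", False),
]

# However many carriers appear, at most three distinct buckets result.
buckets = {zero_cache_bucket(dev, img) for _, dev, img in requests}
print(sorted(buckets))
```

The point of the sketch is only that the carrier identifier never enters the bucket key, so adding a new partner does not add a new cache variant.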
*=== Varnish logic ===*
* Parse User-Agent to distinguish between desktop / mobile / feature phone:
X-Device-Type=desktop|mobile|legacy
Using the OpenDDR library?
* Use IP -> X-CS lookup (under development by Ops) to convert the client's IP
into an X-CS header
* If X-CS && X-Device-Type == 'legacy': Use IP -> X-Images lookup (same
lookup plugin, different database file) to determine if the carrier allows
images
Hopefully we can set the X-Images header straight from the ip database.
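The lookup steps in the list above, combined with the idea of reading X-Images from the same IP database, could be sketched roughly as follows. This is Python for readability only; the real logic would live in VCL and the VMOD under development. The carrier names, the "yes"/"no" values for X-Images, the User-Agent heuristics, and the IP ranges (RFC 5737 documentation networks) are all assumptions made for the example.

```python
import ipaddress
from typing import Dict, Optional, Tuple

# Hypothetical carrier database: network -> (X-CS value, free images?).
# These are documentation ranges, not real carrier data.
CARRIER_DB = {
    ipaddress.ip_network("192.0.2.0/24"): ("carrier-a", True),
    ipaddress.ip_network("198.51.100.0/24"): ("carrier-b", False),
}


def device_type(user_agent: str) -> str:
    """Crude stand-in for real User-Agent parsing via a device database."""
    ua = user_agent.lower()
    if "iphone" in ua or "android" in ua:
        return "mobile"
    if "opera mini" in ua or "nokia" in ua or "midp" in ua:
        return "legacy"
    return "desktop"


def lookup_carrier(client_ip: str) -> Optional[Tuple[str, bool]]:
    """Return (X-CS, images allowed) for a carrier IP, or None."""
    addr = ipaddress.ip_address(client_ip)
    for network, entry in CARRIER_DB.items():
        if addr in network:
            return entry
    return None


def request_headers(client_ip: str, user_agent: str) -> Dict[str, str]:
    """Derive the X-* headers described in the list above."""
    headers = {"X-Device-Type": device_type(user_agent)}
    carrier = lookup_carrier(client_ip)
    if carrier is not None:
        cs, images_allowed = carrier
        headers["X-CS"] = cs
        # X-Images is only needed for feature phones on a known carrier.
        if headers["X-Device-Type"] == "legacy":
            headers["X-Images"] = "yes" if images_allowed else "no"
    return headers


print(request_headers("192.0.2.10", "Nokia6300/2.0 Profile/MIDP-2.0"))
```

Keeping the images flag in the same record as the carrier ID means a single lookup per request; a separate database file, as in the proposal, would simply be a second table keyed the same way.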
Since each carrier has its own list of free languages, language links on
feature phones will point to origin, which will either silently redirect
or ask for confirmation.
Perhaps we can store the list of supported languages for the carrier in the ip database as well?
*=== ZERO vs M ===*
Even though I think zero. and m. subdomains should both go the way of the
dodo to make each article have just one canonical location (no more
linking & Google issues), this won't happen until we are fully migrated
to Varnish and make some mobile code changes (and possibly other changes
that I am not aware of).
What do you mean by "until we are fully migrated to Varnish"? MobileFrontend has
always exclusively been on Varnish.
At the same time, we should try to get rid of ZERO wherever possible.
There are two technical differences between m & zero: zero shows a link to
the image instead of the actual image, and a big red zero warning is shown
if the carrier is not detected. There is also an organizational difference
-- some carriers whitelist only zero, some only m, and some both the zero
& m subdomains.
I'm still a little confused about "m" vs "ZERO" and "images" vs "no images". That
probably means others are too. :) Can you elaborate a little on that? I thought that
was pretty much the same, but according to your spreadsheet that doesn't seem to be the
case?
Overall, I think this sounds reasonable; we'll just need to work out the details.
As Arthur also said in this thread, I'd like to keep zero & m completely aligned,
ideally sharing the Varnish cache objects and the mobile device detection at the Varnish
level as much as possible. I don't think we disagree here.
--
Mark Bergsma <mark(a)wikimedia.org>
Lead Operations Architect
Wikimedia Foundation