Hi
I am currently trying to build a cache for "mwoffliner": one cache for images (thumbnails) and one for Parsoid output. For the images/thumbnails it is pretty straightforward thanks to the "last-modified" header.
Unfortunately, for the Parsoid output this seems to be more complicated. Gabriel's htmldumper relies only on the oldid value, but I am not really satisfied by this approach because I want to be able to download a new version of the HTML for the same oldid if necessary (for example if the HTML output was improved by a Parsoid fix).
There is an "age" header, but I don't really understand the fundamental difference from "last-modified". Do we have the same information here, just presented in another way? If yes, why is that better than "last-modified"?
In addition there is the "x-varnish" header, but this is IMO internal information I should not rely on (and BTW, from time to time we get responses with two "x-varnish" header entries, which looks pretty weird to me; see PS).
Finally, my question: could we introduce a "last-modified" HTTP header?
Regards Emmanuel
PS: Here is an example of a request whose response has two "x-varnish" headers:
$ curl -I "http://parsoid-lb.eqiad.wikimedia.org/dewiki/Almer%C3%ADa?oldid=133672544"
HTTP/1.1 200 OK
X-Powered-By: Express
Vary: Accept-Encoding
Access-Control-Allow-Origin: *
Cache-Control: s-maxage=2592000
content-revision-id: 133672544
X-Parsoid-Performance: duration=4063; start=1416051524354
Content-Type: text/html; charset=UTF-8
X-Varnish: 735376643 735208307
Via: 1.1 varnish
Date: Sat, 15 Nov 2014 12:03:47 GMT
X-Varnish: 1047669169
Age: 1499
Via: 1.1 varnish
Connection: keep-alive
X-Cache: cp1058 hit (6), cp1058 frontend miss (0)
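On the "age" versus "last-modified" question: as far as the HTTP caching spec goes, Age is a cache's estimate of how many seconds have passed since the response was generated (or revalidated) at the origin, while Last-Modified describes when the resource itself last changed. The rough Python sketch below subtracts Age from Date for the response above and compares the result with the X-Parsoid-Performance values. Reading "start" as epoch milliseconds and "duration" as milliseconds is an assumption, and the function names are only illustrative.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def approximate_render_time(date_header: str, age_header: str) -> datetime:
    """Subtract the Age value (seconds) from the Date header."""
    return parsedate_to_datetime(date_header) - timedelta(seconds=int(age_header))


def parsoid_finish_time(performance_header: str) -> datetime:
    """Parse 'duration=<ms>; start=<epoch ms>' and return start + duration.

    Assumption: both fields are expressed in milliseconds.
    """
    fields = dict(
        part.strip().split("=", 1) for part in performance_header.split(";")
    )
    finished_ms = int(fields["start"]) + int(fields["duration"])
    return datetime.fromtimestamp(finished_ms / 1000, tz=timezone.utc)


# Header values copied from the curl output above.
rendered = approximate_render_time("Sat, 15 Nov 2014 12:03:47 GMT", "1499")
finished = parsoid_finish_time("duration=4063; start=1416051524354")

print(rendered.isoformat())  # 2014-11-15T11:38:48+00:00
print(abs((finished - rendered).total_seconds()) < 1)  # True
```

For this one response the two timestamps agree to within a second, which suggests that Date minus Age tells a client when the cached HTML was rendered. It cannot be sent back to the server for revalidation, though, which is what a validator such as Last-Modified or ETag allows.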
Emmanuel,
in RESTBase we are exposing an ETag header for each 'render' revision of the HTML, which you will be able to use with an If-None-Match request header to conditionally retrieve a newer version of the page. ETA for the public API is currently January.
Gabriel
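A minimal sketch of how a client such as mwoffliner might use an ETag with If-None-Match, written in Python with only the standard library. The in-memory dict cache and the function names are hypothetical, and nothing here reflects the actual RESTBase URL layout or ETag format, which are not described in this thread.

```python
import urllib.error
import urllib.request


def build_conditional_request(url, cached_etag=None):
    """Build a GET request that sends If-None-Match when an ETag is cached."""
    request = urllib.request.Request(url)
    if cached_etag is not None:
        request.add_header("If-None-Match", cached_etag)
    return request


def fetch_if_changed(url, cache):
    """Return the page body, downloading it only if the server's copy changed.

    `cache` is a plain dict mapping url -> (etag, body); a real tool would
    persist this on disk.
    """
    cached = cache.get(url)
    request = build_conditional_request(url, cached[0] if cached else None)
    try:
        with urllib.request.urlopen(request) as response:
            body = response.read()
            etag = response.headers.get("ETag")
            if etag is not None:
                cache[url] = (etag, body)
            return body
    except urllib.error.HTTPError as error:
        # urllib reports "304 Not Modified" as an HTTPError.
        if error.code == 304 and cached:
            return cached[1]
        raise
```

With this pattern the cache key can stay the oldid-based URL, while the ETag decides whether a re-rendered version of the same oldid has to be downloaded again.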
On Sat, Nov 15, 2014 at 4:07 AM, Emmanuel Engelhart kelson@kiwix.org wrote:
[...]
-- Kiwix - Wikipedia Offline & more
- Web: http://www.kiwix.org
- Twitter: https://twitter.com/KiwixOffline
- more: http://www.kiwix.org/wiki/Communication
Wikitext-l mailing list Wikitext-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitext-l
Hi Gabriel,
Thanks, that sounds like a perfect solution.
Is it already in production? I cannot see it:
$ curl -I "http://parsoid-lb.eqiad.wikimedia.org/frwiki/Wikip%C3%A9dia%3AAccueil_princi..."
HTTP/1.1 200 OK
X-Powered-By: Express
Vary: Accept-Encoding
Access-Control-Allow-Origin: *
Cache-Control: s-maxage=2592000
content-revision-id: 110551673
X-Parsoid-Performance: duration=1321; start=1422795321182
Content-Type: text/html; charset=UTF-8
X-Varnish: 871330818
Via: 1.1 varnish
Date: Sun, 01 Feb 2015 12:55:22 GMT
X-Varnish: 1210324840
Age: 0
Via: 1.1 varnish
Connection: keep-alive
X-Cache: cp1058 miss (0), cp1058 frontend miss (0)
If not, do we have a task somewhere that I can subscribe to or follow, so that I am informed as soon as it is available?
Regards Emmanuel
On 18.12.2014 20:43, Gabriel Wicke wrote:
[...]
On Sun, Feb 1, 2015 at 4:59 AM, Emmanuel Engelhart kelson@kiwix.org wrote:
> Hi Gabriel,
> Thanks, that sounds like a perfect solution.
> Is it already in production? I cannot see it:
Not yet, we are waiting for hardware.
> $ curl -I "http://parsoid-lb.eqiad.wikimedia.org/frwiki/Wikip%C3%A9dia%3AAccueil_principal?oldid=110551673"
> [...]
> If not, do we have a task somewhere that I can subscribe to or follow, so that I am informed as soon as it is available?
Yes: https://phabricator.wikimedia.org/T1228