LS, I have been trying for two days to change one record in nl:wiktionary. The update ends with
Sorry- we have a problem... The wikimedia web server didn't return any response to your request. To get information on what's going on you can visit #wikipedia. An "offsite" status page is hosted on OpenFacts. Generated Tue, 15 Jun 2004 06:52:10 GMT by wikipedia.org (squid/2.5.STABLE4-20040219)
The new content should be ====''[[WikiWoordenboek:Zelfstandig naamwoord|Zelfstandig naamwoord]]''==== and the article is [[Sjabloon:-noun-]].
Thanks, GerardM
On Tue, Jun 15, 2004 at 09:19:20AM +0200, Gerard.Meijssen wrote:
LS, I have been trying for two days to change one record in nl:wiktionary. The update ends with
Sorry- we have a problem... The wikimedia web server didn't return any response to your request. To get information on what's going on you can visit #wikipedia. An "offsite" status page is hosted on OpenFacts. Generated Tue, 15 Jun 2004 06:52:10 GMT by wikipedia.org (squid/2.5.STABLE4-20040219)
the new content should be ====''[[WikiWoordenboek:Zelfstandig naamwoord|Zelfstandig naamwoord]]''==== the article is [[Sjabloon:-noun-]]
When saving a template, all pages using the template are "touched" so that they will be updated the next time they are rendered. Having zillions of pages using that template, this takes a long time. Probably too long.
Regards,
JeLuF
Jens Frank wrote:
On Tue, Jun 15, 2004 at 09:19:20AM +0200, Gerard.Meijssen wrote:
LS, I have been trying for two days to change one record in nl:wiktionary. The update ends with
Sorry- we have a problem... The wikimedia web server didn't return any response to your request. To get information on what's going on you can visit #wikipedia. An "offsite" status page is hosted on OpenFacts. Generated Tue, 15 Jun 2004 06:52:10 GMT by wikipedia.org (squid/2.5.STABLE4-20040219)
the new content should be ====''[[WikiWoordenboek:Zelfstandig naamwoord|Zelfstandig naamwoord]]''==== the article is [[Sjabloon:-noun-]]
When saving a template, all pages using the template are "touched" so that they will be updated the next time they are rendered. Having zillions of pages using that template, this takes a long time. Probably too long.
Regards,
JeLuF
Is this the case in all the Wikipedias? Could this be an explanation of some of the recent performance problems -- some of the templates must be used on thousands, if not tens of thousands, of pages?
-- Neil
Jens Frank wrote:
On Tue, Jun 15, 2004 at 09:19:20AM +0200, Gerard.Meijssen wrote:
LS, I have been trying for two days to change one record in nl:wiktionary. the article is [[Sjabloon:-noun-]]
When saving a template, all pages using the template are "touched" so that they will be updated the next time they are rendered. Having zillions of pages using that template, this takes a long time. Probably too long.
Did whoever came up with that think about the performance implications? o.O
Timwi wrote:
Jens Frank wrote:
On Tue, Jun 15, 2004 at 09:19:20AM +0200, Gerard.Meijssen wrote:
LS, I have been trying for two days to change one record in nl:wiktionary. the article is [[Sjabloon:-noun-]]
When saving a template, all pages using the template are "touched" so that they will be updated the next time they are rendered. Having zillions of pages using that template, this takes a long time. Probably too long.
Did whoever came up with that think about the performance implications? o.O
Of course I did. In MediaWiki 1.2 I deliberately avoided this kind of cache purging, for performance reasons. But this behaviour was very unpopular, and some automated method was often requested. So in 1.3 I invalidated the cache of all pages linking to a given template page, when the template page is updated. The purge operation required is no worse than that for page creation or deletion.
There are a number of workarounds. Firstly, the cache invalidation only occurs when pages in the Template (Sjabloon) namespace are updated. So you could put the -noun- template in [[WikiWoordenboek:-noun]], and include it with {{WikiWoordenboek:-noun}}.
The second alternative is to stop using templates for each and every little link. Are they really necessary?
The third alternative is to get it right the first time, to avoid changing it at all costs, and if you really need to change it, only do so in off-peak times.
In any case, there aren't a zillion pages using that template, only 497. Perhaps we need to do some database optimisation so that relatively small operations like this become possible.
-- Tim Starling
Tim Starling wrote:
Timwi wrote:
Jens Frank wrote:
On Tue, Jun 15, 2004 at 09:19:20AM +0200, Gerard.Meijssen wrote:
LS, I have been trying for two days to change one record in nl:wiktionary. the article is [[Sjabloon:-noun-]]
When saving a template, all pages using the template are "touched" so that they will be updated the next time they are rendered. Having zillions of pages using that template, this takes a long time. Probably too long.
Did whoever came up with that think about the performance implications? o.O
Of course I did. In MediaWiki 1.2 I deliberately avoided this kind of cache purging, for performance reasons.
Jens Frank was talking about "touching". This sounded like a database query to me. Now you are calling it "cache purging". I'm not sure what that is, but if it refers to memcached, it would not require a database query.
I suppose I definitely need to know more about this. Where is the "parser cache" -- is that in memcached or elsewhere? What does "touching" a page mean -- does it involve a database query? Can cache purging be done without a database query? If not, can the cache mechanism be changed (perhaps to use memcached) so database queries can be avoided when all you want to do is invalidate a cache entry?
Timwi
Timwi wrote:
Tim Starling wrote:
Timwi wrote:
Jens Frank wrote:
On Tue, Jun 15, 2004 at 09:19:20AM +0200, Gerard.Meijssen wrote:
LS, I have been trying for two days to change one record in nl:wiktionary. the article is [[Sjabloon:-noun-]]
When saving a template, all pages using the template are "touched" so that they will be updated the next time they are rendered. Having zillions of pages using that template, this takes a long time. Probably too long.
Did whoever came up with that think about the performance implications? o.O
Of course I did. In MediaWiki 1.2 I deliberately avoided this kind of cache purging, for performance reasons.
Jens Frank was talking about "touching". This sounded like a database query to me. Now you are calling it "cache purging". I'm not sure what that is, but if it refers to memcached, it would not require a database query.
"Touch" is the name of a Unix utility which changes the modification time on a file, usually setting it to the present time. Hence "touching" a file is equivalent to making some infinitesimal change to it.
In the cur table, there is a field called "cur_touched". This field is used for cache invalidation. If links on the page change from broken to unbroken or vice versa, cur_touched is set to the present. This field is queried to determine whether the client cache or the parser cache is valid.
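To make the cur_touched check concrete, here is a minimal sketch in Python. It is not MediaWiki's code: the function names, the dictionary standing in for the cur table, and the rule that a copy made at or after the last touch counts as valid are all assumptions made for illustration.

```python
from datetime import datetime

def cache_is_valid(cached_at: datetime, cur_touched: datetime) -> bool:
    # Assumption: a copy made at or after the last touch is still good.
    return cached_at >= cur_touched

def touch_pages(touched: dict, titles: list, now: datetime) -> None:
    # Stand-in for the UPDATE that sets cur_touched on each listed row.
    for title in titles:
        touched[title] = now

# Hypothetical data: one page that includes the template, cached on 14 June.
touched = {"kat": datetime(2004, 6, 14, 12, 0)}
cached_at = datetime(2004, 6, 14, 13, 0)
valid_before = cache_is_valid(cached_at, touched["kat"])

# Saving the template touches every page that uses it.
touch_pages(touched, ["kat"], datetime(2004, 6, 15, 6, 52))
valid_after = cache_is_valid(cached_at, touched["kat"])
```

The cost discussed in this thread is the loop in touch_pages: one row update per page that uses the template.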
The squid cache does not check back with the database on each query; instead, it keeps its own records of which pages are valid and which are invalid. So whenever a page changes, we need not only to update cur_touched, but also to send a message to each squid server informing it that the page has become invalid. This is called "purging".
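As a rough illustration of purging, the sketch below builds one HTTP PURGE request per squid server and page. Squid honours the PURGE method only when its configuration allows it. The server names, paths and helper functions here are hypothetical, and nothing is actually sent over the network.

```python
def build_purge_request(path: str, site_host: str) -> bytes:
    # A PURGE request asks the cache to drop its stored copy of this URL.
    return f"PURGE {path} HTTP/1.0\r\nHost: {site_host}\r\n\r\n".encode("ascii")

def plan_purges(paths, site_host, squid_servers):
    # One request per (squid, page) pair; this only prepares the messages.
    return [(squid, build_purge_request(path, site_host))
            for squid in squid_servers
            for path in paths]

# Hypothetical squid hosts and page paths.
plan = plan_purges(["/wiki/kat", "/wiki/hond"], "nl.wiktionary.org",
                   ["squid1.example", "squid2.example"])
```

The number of messages grows as pages times squid servers, which is why a widely used template is expensive to purge.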
I suppose I definitely need to know more about this. Where is the "parser cache" -- is that in memcached or elsewhere?
The parser cache is a cache of the content HTML of a page, stored in memcached. That is, it omits the sidebar, header and footer. It is stored under an index which indicates what user preferences were used to render it. You can tell when a page is served from the parser cache by looking for an HTML comment at the end of the content area, e.g.
<!-- Saved in parser cache with key enwiki:pcache:idhash:13173-1!1!1!monobook!1!1!0!1!0!1!0 and timestamp 20040622004805 -->
Here 13173 is the ID of the main page.
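For anyone who wants to inspect that comment programmatically, here is a small Python sketch that pulls the key and timestamp apart. The regular expression and the function name are my own. The way the part after the hyphen is split on "!" is a guess at the layout, and the individual option fields are deliberately left unnamed because the message does not say what each one means.

```python
import re
from datetime import datetime

COMMENT_RE = re.compile(
    r"Saved in parser cache with key (\S+) and timestamp (\d{14})")

def parse_parser_cache_comment(html: str):
    match = COMMENT_RE.search(html)
    if match is None:
        return None  # page was not served from the parser cache
    key, stamp = match.groups()
    wiki, _, _, rest = key.split(":", 3)
    # Assumption: everything after the first hyphen encodes render options.
    page_id, _, options = rest.partition("-")
    return {
        "wiki": wiki,
        "page_id": int(page_id),
        "options": options.split("!"),
        "saved_at": datetime.strptime(stamp, "%Y%m%d%H%M%S"),
    }

comment = ("<!-- Saved in parser cache with key "
           "enwiki:pcache:idhash:13173-1!1!1!monobook!1!1!0!1!0!1!0 "
           "and timestamp 20040622004805 -->")
info = parse_parser_cache_comment(comment)
```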
What does "touching" a page mean -- does it involve a database query? Can cache purging be done without a database query? If not, can the cache mechanism be changed (perhaps to use memcached) so database queries can be avoided when all you want to do is invalidate a cache entry?
You can't store anything in memcached unless you're prepared to have it go missing. Memcached is appropriate for caching read operations, but not for replacing write operations.
One alternative is to use a system where two cur_touched fields must be checked to validate a page, instead of one. One for the page, one for the template. This would only be efficient for widely included templates. Templates included in only a few pages would have to be handled the old way.
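A sketch of that two-field check, assuming integer timestamps in the YYYYMMDDHHMMSS style and plain dictionaries in place of database tables (all names and values here are hypothetical):

```python
# Hypothetical state: per-page and per-template "touched" timestamps.
page_touched = {"kat": 20040614120000, "hond": 20040614120500}
template_touched = {"-noun-": 20040614110000}
includes = {"kat": ["-noun-"], "hond": ["-noun-"]}

def cache_is_valid(title: str, cached_at: int) -> bool:
    # Valid only if the copy is at least as new as the page's own
    # timestamp and the timestamp of every template it includes.
    stamps = [page_touched[title]]
    stamps += [template_touched[name] for name in includes[title]]
    return cached_at >= max(stamps)

cached_at = 20040614130000
before = [cache_is_valid(t, cached_at) for t in ("kat", "hond")]

# Saving the template now costs a single write, however many pages use it.
template_touched["-noun-"] = 20040615065210
after = [cache_is_valid(t, cached_at) for t in ("kat", "hond")]
```

The trade-off is that every validity check now has to look up the templates a page includes, which is why this would only pay off for widely included templates.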
Another alternative would be to reduce service. When changing a template, before attempting to invalidate all caches, it would check how many pages are using the template. If it is greater than some critical number, the invalidation could simply be skipped. After a few days, the squid cache and the parser cache will expire, and users will see the updated template.
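In code, the reduced-service idea might look like the sketch below. The threshold value and all names are hypothetical; invalidate_page stands for whatever touching and purging a single page involves.

```python
CRITICAL_PAGE_COUNT = 200  # hypothetical cut-off, not a value from this thread

def invalidate_after_template_edit(pages_using, invalidate_page,
                                   limit=CRITICAL_PAGE_COUNT):
    # Too many dependants: skip invalidation and let the caches expire.
    if len(pages_using) > limit:
        return False
    for title in pages_using:
        invalidate_page(title)
    return True

invalidated = []
small = invalidate_after_template_edit(["kat", "hond"], invalidated.append)
large = invalidate_after_template_edit(
    [f"page{i}" for i in range(497)], invalidated.append)
```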
Please don't reply by saying "yeah, do that!"
-- Tim Starling
Tim Starling wrote:
In the cur table, there is a field called "cur_touched". This field is used for cache invalidation. If links on the page change from broken to unbroken or vice versa, cur_touched is set to the present. This field is queried to determine whether the client cache or the parser cache is valid.
I see. If we are okay with sacrificing a few 304 Not Modifieds, I have an idea of an alternative scheme that can work without having to modify cur_touched of hundreds of pages when a template is changed.
I think we should keep an equivalent of cur_touched in memcached (I'll call it a touched-key); if it is not in memcached, the page is served.
This means that sometimes such a touched-key will expire from the cache and *one* page will be served instead of 304ed, but more importantly, editing a template (or, in fact, creating a heavily-linked-to page) would require only purging (parser cache (memcached) and squids), and no database queries on hundreds of rows.
You can't store anything in memcached unless you're prepared to have it go missing.
Oh, come on, I know that. :) I don't think the few times that a touched-key "goes missing" are that much of a problem. Compared to fully parsed pages, the touched-keys are extremely small, and hence many of them can stay in memcached. If one does go missing, then we'll just have to serve the page; if we're lucky, it's still in the parser cache. If not, then we re-render it, but we also update both the parser cache and the touched-key.
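A minimal sketch of this touched-key scheme, with a dictionary-backed class standing in for memcached and small integers standing in for timestamps. The class, the function names and the status-code return values are assumptions for illustration, not an existing API.

```python
class TouchedKeys:
    """Dict-backed stand-in for memcached; entries may vanish at any time."""
    def __init__(self):
        self._store = {}
    def get(self, title):
        return self._store.get(title)
    def set(self, title, stamp):
        self._store[title] = stamp
    def delete(self, title):
        self._store.pop(title, None)

def respond(keys, title, client_copy_time, now):
    touched = keys.get(title)
    if touched is None:
        # Key missing (expired or invalidated): serve the page, record the key.
        keys.set(title, now)
        return 200
    if client_copy_time is not None and client_copy_time >= touched:
        return 304  # the client's copy is still current
    return 200

def invalidate_for_template(keys, titles):
    # No database rows are written; only the cache entries are dropped.
    for title in titles:
        keys.delete(title)

keys = TouchedKeys()
first = respond(keys, "kat", None, now=100)
second = respond(keys, "kat", 100, now=110)
invalidate_for_template(keys, ["kat"])
third = respond(keys, "kat", 100, now=120)
fourth = respond(keys, "kat", 120, now=130)
```

A touched-key that goes missing behaves exactly like an invalidation: the page is served once in full and the key is set again.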
Memcached is appropriate for caching read operations, but not for replacing write operations.
cur_touched is not the kind of information that is essential to keep (meaning: it is not a problem if this information "goes missing").
Timwi
wikitech-l@lists.wikimedia.org