Hello to all!
I am currently working on interwiki transclusion [1].
In the proposed approach, we currently retrieve distant templates:
1) using wfGetLB('wikiid')->getConnection if the distant wiki has a wikiid in the interwiki table
2) using the API in the opposite case
In case 1, it seems that retrieving a template from a distant DB is just as expensive as retrieving it from the local DB. So, we don't store the wikitext of the template locally.
In case 2, the retrieved wikitext is cached in the transcache table for an arbitrary time.
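
The two retrieval paths described above can be sketched roughly as follows. This is a minimal illustration in Python rather than MediaWiki's PHP, and it is not the proposed implementation: the class name, the two fetch callables, the in-memory dictionary standing in for the transcache table, and the one-hour default expiry are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TemplateFetcher:
    """Hypothetical sketch of the two retrieval paths (not MediaWiki code)."""

    def __init__(
        self,
        interwiki_wikiids: Dict[str, Optional[str]],
        fetch_from_db: Callable[[str, str], str],
        fetch_from_api: Callable[[str, str], str],
        ttl_seconds: float = 3600.0,  # assumed expiry; the thread only says "arbitrary"
        clock: Callable[[], float] = time.time,
    ) -> None:
        # Maps an interwiki prefix to a wikiid, or None when the wiki has none.
        self.interwiki_wikiids = interwiki_wikiids
        self.fetch_from_db = fetch_from_db
        self.fetch_from_api = fetch_from_api
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        # Stand-in for the transcache table: key -> (expiry time, wikitext).
        self._cache: Dict[Tuple[str, str], Tuple[float, str]] = {}

    def get(self, prefix: str, title: str) -> str:
        wikiid = self.interwiki_wikiids.get(prefix)
        if wikiid is not None:
            # Case 1: direct DB access, treated as no costlier than a local
            # read, so nothing is stored locally.
            return self.fetch_from_db(wikiid, title)

        # Case 2: API access, cached until the entry expires.
        key = (prefix, title)
        now = self.clock()
        cached = self._cache.get(key)
        if cached is not None and cached[0] > now:
            return cached[1]
        text = self.fetch_from_api(prefix, title)
        self._cache[key] = (now + self.ttl_seconds, text)
        return text
```

With this shape, swapping the dictionary for memcached or for the transcache table only changes the storage behind `_cache`; the case 1 / case 2 decision stays the same.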
I have two questions about this system:
* Is it better to use the transcache table or to use memcached for the API-retrieved templates?
* Should we cache the DB-retrieved templates with memcached?
An advantage of memcached here is that it is shared by all the WMF wikis, whereas the transcache table is owned by a wiki for itself.
Thanks in advance
Best regards
-- Peter Potrowl http://www.mediawiki.org/wiki/User:Peter17
[1] http://www.mediawiki.org/wiki/User:Peter17/Reasonably_efficient_interwiki_tr...
On Tue, Jul 20, 2010 at 9:07 AM, Peter17 peter017@gmail.com wrote:
Hello to all!
I am currently working on interwiki transclusion [1].
In the proposed approach, we currently retrieve distant templates:
- using wfGetLB('wikiid')->getConnection if the distant wiki has a
wikiid in the interwiki table
That won't work. Wikimedia doesn't use the text table, so you can't just query the text table. When calling things like getRevText() locally, it's actually accessing external storage, not querying the database.
- using the API in the opposite case
In case 1, it seems that retrieving a template from a distant DB is just as expensive as retrieving it from the local DB. So, we don't store the wikitext of the template locally.
In case 2, the retrieved wikitext is cached in the transcache table for an arbitrary time.
Currently the transcache table stores any entries, regardless of where they were transcluded from (farm or remote wiki). This probably doesn't need to be the situation for case 1, like you said.
I have two questions about this system:
- Is it better to use the transcache table or to use memcached for the
API-retrieved templates?
- Should we cache the DB-retrieved templates with memcached?
Memcached, imho. It's faster than the DB and handles expiries on its own. And for sites without Memcached, they can always use CACHE_DB or CACHE_ACCEL.
An advantage of memcached here is that it is shared by all the WMF wikis, whereas the transcache table is owned by a wiki for itself.
It's also generally faster, handles its own expiries and allows for a bit more flexibility in cache times (some things can be cached longer than others, potentially).
-Chad
2010/7/20 Chad innocentkiller@gmail.com:
That won't work. Wikimedia doesn't use the text table, so you can't just query the text table. When calling things like getRevText() locally, it's actually accessing external storage, not querying the database.
Wikimedia actually does use the text table; it's just that old_text contains a URL-like reference to external storage and old_flags contains 'external' to indicate this. You would still need to access the text table on the remote wiki in order to retrieve the text from ES.
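
To illustrate the lookup described above, here is a minimal sketch in Python (not MediaWiki's PHP). It assumes old_flags is a comma-separated list of flags and ignores any other flags, such as compression; the function name, the `fetch_external` callable, and the pointer string in the usage example are hypothetical.

```python
from typing import Callable


def resolve_revision_text(
    old_text: str,
    old_flags: str,
    fetch_external: Callable[[str], str],
) -> str:
    """Return the wikitext for a text-table row (illustrative only).

    Assumption: old_flags is a comma-separated list of flags. When it
    contains 'external', old_text is a pointer into external storage
    rather than the text itself.
    """
    flags = {flag.strip() for flag in old_flags.split(",") if flag.strip()}
    if "external" in flags:
        # old_text is only a pointer; the real text lives in ES.
        return fetch_external(old_text)
    # Otherwise the row holds the text directly.
    return old_text
```

For example, with a dictionary standing in for external storage, `resolve_revision_text("DB://cluster1/12345", "utf-8,external", store.__getitem__)` looks the pointer up in `store`, while a row without the 'external' flag is returned unchanged.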
- Should we cache the DB-retrieved templates with memcached?
No. At Wikimedia, we already cache revision texts in memcached so we don't have to query ES too much. Caching DB-retrieved templates in memcached would duplicate this at Wikimedia at least, and generally not make a great deal of sense even without the duplication.
An advantage of memcached here is that it is shared by all the WMF wikis, whereas the transcache table is owned by a wiki for itself.
It's also generally faster, handles its own expiries and allows for a bit more flexibility in cache times (some things can be cached longer than others, potentially).
Note that memcached is also segmented by the choice of keys: wfMemcKey() prefixes the keys it generates with the wiki ID so wikis don't interfere with each other's keys. To share memcached entries between wikis, you'd either need to not use wfMemcKey() (bad) or hack it to optionally replace the wiki ID with something else (better).
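
The key segmentation described above, and the "optionally replace the wiki ID" variant, can be sketched as follows. This is a Python illustration under assumptions, not wfMemcKey()'s actual implementation: the function name, the `shared_prefix` parameter, the ':' separator, and the 'global' prefix in the usage note are all hypothetical.

```python
from typing import Optional


def make_cache_key(wiki_id: str, *parts: str, shared_prefix: Optional[str] = None) -> str:
    """Build a cache key that is per-wiki by default (illustrative only).

    By default the key starts with the wiki ID, so two wikis never collide.
    Passing shared_prefix replaces the wiki ID, so every wiki that uses the
    same prefix reads and writes the same entry.
    """
    prefix = shared_prefix if shared_prefix is not None else wiki_id
    return ":".join((prefix,) + parts)
```

With this sketch, `make_cache_key("enwiki", "transcache", "Template:Foo")` and the same call for "frwiki" yield different keys, whereas passing `shared_prefix="global"` to both yields one key shared by the two wikis.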
Roan Kattouw (Catrope)
On 20 Jul 2010 at 15:07, Peter17 wrote:
I am currently working on interwiki transclusion [1].
In the proposed approach, we currently retrieve distant templates:
- using wfGetLB('wikiid')->getConnection if the distant wiki has a
wikiid in the interwiki table
- using the API in the opposite case
In case 1, it seems that retrieving a template from a distant DB is just as expensive as retrieving it from the local DB. So, we don't store the wikitext of the template locally.
In case 2, the retrieved wikitext is cached in the transcache table for an arbitrary time.
I have two questions about this system:
- Is it better to use the transcache table or to use memcached for the
API-retrieved templates?
- Should we cache the DB-retrieved templates with memcached?
An advantage of memcached here is that it is shared by all the WMF wikis, whereas the transcache table is owned by a wiki for itself.
Peter, I have a question about your project (I have no opinion on, or knowledge of, the caching question itself).
At Wikisource, we undertake a lot of transclusion, and a fair amount of it utilising Sanbeg's Extension:Labeled Section Transclusion [2]. Do you see that this may be part of what you are looking towards, or is it solely templates?
My reason for asking is that it would be possible to identify components of WS works where one may wish to incorporate part of a page into a page at another wiki.
I also see that there is scope for further display of the same texts in different translations.
Regards, Andrew
[2] http://www.mediawiki.org/wiki/Extension:Labeled_section_transclusion
Hi Peter
For your info: I have been working on a system for transclusion of data records (as opposed to wikitext) from external sources (which may again be wikis). The idea is kind of complementary, but the caching issues may be similar.
Have a look at http://www.mediawiki.org/wiki/Extension:DataTransclusion if you like.
-- daniel
wikitech-l@lists.wikimedia.org