Hi Tim,
thank you for the input.
Wikidata unfortunately will not contain all language links: a wiki can locally override the list (by extending it, suppressing a link from Wikidata, or replacing a link). This is a requirement, as not all language links are necessarily symmetric (although I wish they were). This means there is some interplay between the wikitext and the links coming from Wikidata: an update to the links coming from Wikidata can have different effects on the actually displayed language links, depending on what is in the local wikitext.
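To make that interplay concrete, here is a minimal sketch in Python of how the displayed list could be derived. The names and structures are invented for illustration; the actual Wikibase code works differently:

def effective_language_links(wikidata_links, local_links, suppressed):
    """Compute the language links a page actually displays.

    wikidata_links -- {lang_code: title} coming from the Wikidata item
    local_links    -- {lang_code: title} defined in the local wikitext;
                      these replace or extend the Wikidata links
    suppressed     -- set of lang codes the local wikitext suppresses
    """
    links = dict(wikidata_links)
    for lang in suppressed:          # local suppression wins over Wikidata
        links.pop(lang, None)
    links.update(local_links)        # local definitions replace or extend
    return links

# Example: Wikidata provides de and fr, the local wikitext suppresses fr
# and adds nl -- the same Wikidata update yields different displayed links
# depending on these local effects.
print(effective_language_links(
    {"de": "Berlin", "fr": "Berlin"},
    {"nl": "Berlijn"},
    {"fr"},
))
# -> {'de': 'Berlin', 'nl': 'Berlijn'}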
Now, we could possibly also save the effects defined in the local wikitext (which links are suppressed, which are additionally locally defined) in the DB as well, and then, when the Wikidata links change externally, smartly combine the two and create the new correct list --- but this sounds like a lot of effort. It would potentially save cycles compared to today's situation. The proposed solution, though, does not *add* cycles compared to today: the bots that currently keep the language links in sync basically incur a re-rendering of the page anyway, so we would not be adding any cost on top of that. We do not make matters worse with regard to server costs.
Also, as Daniel mentioned, it would be an optimization that only works for the language links. Once we add further data that will be available to the wikitext, this will not work at all anymore.
I hope this explains why we think that the re-rendering is helpful.
Having said that, here's an alternative scenario: Assuming we do not send any re-rendering jobs to the Wikipedias, what is the worst that would happen?
To answer this, I first need the answer to another question: do the Squids and caches hold their content indefinitely, or would the data, in the worst case, just be out of sync for, say, up to 24 hours on a Wikipedia article that didn't have an edit at all?
If we do not re-render, I assume editors will come up with their own workflows (e.g. changing some values in Wikidata, going to their home wiki, and purging the affected page, or writing a script that gives them a "purge my home wiki page" link on Wikidata; a sketch of such a purge call is below), which is fine, and still cheaper than initiating a re-rendering of all pages every time. It just means that in some cases some pages will not be up to date.
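For illustration, such a script could boil down to a single call to MediaWiki's action=purge API. A minimal Python sketch, where the wiki URL and page title are just placeholders:

import urllib.parse
import urllib.request

def purge(api_url, title):
    """Ask the wiki to re-render and re-cache the given page."""
    data = urllib.parse.urlencode({
        "action": "purge",   # standard MediaWiki purge action
        "titles": title,
        "format": "json",
    }).encode()
    # POST the purge request to the wiki's api.php endpoint
    with urllib.request.urlopen(api_url, data) as resp:
        return resp.read()

# e.g. purge("https://de.wikipedia.org/w/api.php", "Berlin")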
So we could go without re-rendering at all, if there is consensus that this is preferable to the solution we suggested.
Does anyone have any comments, questions, or insights?
Cheers, Denny
2012/11/5 Tim Starling <tstarling@wikimedia.org>:
On 02/11/12 22:35, Denny Vrandečić wrote:
- For re-rendering the page, the wiki needs access to the data.
We are not sure how to do this best: have it per cluster, or in one place only?
Why do you need to re-render a page if only the language links are changed? Language links are only in the navigation area; the wikitext content is not affected.
As I've previously explained, I don't think the langlinks table on the client wiki should be updated. So you only need to purge Squid and add an entry to Special:RecentChanges.
Purging Squid can certainly be done from the context of a wikidatawiki job. For RecentChanges the main obstacle is accessing localisation text. You could use rc_params to store language-independent message parameters, like what we do for log entries.
-- Tim Starling