Ariel T. Glenn wrote:
On 23-09-2010, Thu, at 21:27 -0500, Q wrote:
Given the fact that static dumps have been broken for *years* now, static dumps are on the bottom of WMF's priority list; I thought it would be best if I just went ahead and built something that can be used (and, of course, improved).
Marco
That's what I just said. Work with them to fix it, i.e. volunteer, i.e. you fix it.
Actually it's not so much that they are on the bottom of the list as that there are two people potentially looking at them, and they are Tomasz (who is also doing mobile) and me (and I am doing the XML dumps rather than the HTML ones, until they are reliable and happy).
However, if you are interested in working on these, I am *very* happy to help with suggestions, testing, feedback, etc., even while I am still working on the XML dumps. Do you have time and interest?
Ariel
Most (all?) articles should be already parsed in memcached. I think the bottleneck would be the compression. Note however that the ParserOutput would still need postprocessing, as would ?action=render. The first thing that comes to my mind is to remove the edit links (this use case alone seems enough for implementing editsection stripping). Sadly, we can't (easily) add the edit sections after the rendering.
On Sat, Sep 25, 2010 at 12:56 AM, Platonides platonides@gmail.com wrote:
Most (all?) articles should be already parsed in memcached. I think the bottleneck would be the compression. Note however that the ParserOutput would still need postprocessing, as would ?action=render. The first thing that comes to my mind is to remove the edit links (this use case alone seems enough for implementing editsection stripping). Sadly, we can't (easily) add the edit sections after the rendering.
This should be doable with a simple regex that targets <span class="editsection"> directly.
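A rough sketch of what I mean, in Python just for illustration. It assumes the rendered markup looks like <span class="editsection">[<a ...>edit</a>]</span> with no nested <span> inside it; the function name and the sample HTML are made up for the example, and anything with different markup would need a real HTML parser instead.

```python
import re

# Assumption: the edit link is wrapped in a single, non-nested
# <span class="editsection">...</span>, so a non-greedy match up to the
# first </span> is enough. Trailing whitespace is removed with it.
EDITSECTION_RE = re.compile(r'<span class="editsection">.*?</span>\s*', re.DOTALL)


def strip_editsection(html: str) -> str:
    """Return html with all editsection spans removed (hypothetical helper)."""
    return EDITSECTION_RE.sub("", html)


# Made-up sample heading, modelled on typical rendered output.
sample = (
    '<h2><span class="editsection">[<a href="/w/index.php?title=Foo'
    '&amp;action=edit&amp;section=1" title="Edit section: History">edit</a>]'
    '</span> <span class="mw-headline" id="History">History</span></h2>'
)

print(strip_editsection(sample))
```

The headline span is left alone because the match stops at the first closing </span> after the editsection opening tag.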
Marco
wikitech-l@lists.wikimedia.org