Jan writes:
Ultimately the best solution would be to have a table wanted(title, #pages) with an index on #pages (and a unique index on title); then MySQL wouldn't need to sort at all. I don't know off the top of my head if there are any other queries that depend on 'brokenlinks', but I don't believe so, and if there are not then I would recommend replacing it.
I believe we need the full brokenlinks information in order to initialize the links table once a new article is written, so that "What links here" will immediately work for new articles.
Nevertheless, I think we should have a wanted table as above in addition. Space is really no issue, but time is, and Most Wanted is likely to be one of our more commonly called slow functions.
Axel
On Wed, Jul 10, 2002 at 02:23:13AM +0200, Axel Boldt wrote:
Jan writes:
Ultimately the best solution would be to have a table wanted(title, #pages) with an index on #pages (and a unique index on title); then MySQL wouldn't need to sort at all. I don't know off the top of my head if there are any other queries that depend on 'brokenlinks', but I don't believe so, and if there are not then I would recommend replacing it.
I believe we need the full brokenlinks information in order to initialize the links table once a new article is written, so that "What links here" will immediately work for new articles.
Good grief. How could I miss that? I even remember writing the code that did that! You are of course completely right. Lee already pointed out another problem: pages are not saved until after they are rendered, and by that time the information about the old page (esp. the valid links on that page) is no longer in the internal PHP variables.
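For reference, the step that depends on the full brokenlinks information can be sketched roughly as follows. This is only an illustration in Python against SQLite, not the PHP/MySQL code itself; the column names (from_title, to_title) and the helper functions are assumptions made up for the example.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical, simplified schema: one row per (linking page, target title).
    conn.executescript("""
        CREATE TABLE links (
            from_title TEXT NOT NULL,
            to_title   TEXT NOT NULL
        );
        CREATE TABLE brokenlinks (
            from_title TEXT NOT NULL,
            to_title   TEXT NOT NULL
        );
    """)


def article_created(conn, title):
    """Promote every broken link that points at `title` to a real link."""
    with conn:  # one transaction: copy the rows, then drop them
        conn.execute(
            "INSERT INTO links (from_title, to_title) "
            "SELECT from_title, to_title FROM brokenlinks WHERE to_title = ?",
            (title,),
        )
        conn.execute("DELETE FROM brokenlinks WHERE to_title = ?", (title,))


def what_links_here(conn, title):
    """Return the titles of all pages that link to `title`."""
    rows = conn.execute(
        "SELECT from_title FROM links WHERE to_title = ? ORDER BY from_title",
        (title,),
    )
    return [row[0] for row in rows]
```

Without the per-page rows in brokenlinks there would be nothing to copy into links at this point, which is why a table holding only counts cannot replace it.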
Nevertheless, I think we should have a wanted table as above in addition. Space is really no issue, but time is, and Most Wanted is likely to be one of our more commonly called slow functions.
In fact I advised the same thing by mail to Lee. However, I also agree with Lee that we should now go with the quick fix and get Wikipedia running on the new code base ASAP. Then we will be able to get some real timing data and be sure that this is one of the bottlenecks. I strongly suspect it will be, because it will still be eating a lot of space. If I had the time and Lee would allow it, I would probably go and implement it right away.
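A minimal sketch of the wanted table kept in addition to brokenlinks might look like this. It uses Python with SQLite so that it runs on its own; the real thing would be PHP and MySQL. The column names (from_title, to_title, pages, where pages stands in for #pages) and the function names are assumptions for illustration only.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical schema: brokenlinks keeps one row per link to a missing
    # page; wanted keeps one row per missing title with a precomputed count.
    conn.executescript("""
        CREATE TABLE brokenlinks (
            from_title TEXT NOT NULL,
            to_title   TEXT NOT NULL
        );
        CREATE TABLE wanted (
            title TEXT NOT NULL,
            pages INTEGER NOT NULL
        );
        CREATE UNIQUE INDEX wanted_title ON wanted (title);
        CREATE INDEX wanted_pages ON wanted (pages);
    """)


def rebuild_wanted(conn):
    """Recompute wanted from brokenlinks in a single transaction."""
    with conn:
        conn.execute("DELETE FROM wanted")
        conn.execute("""
            INSERT INTO wanted (title, pages)
            SELECT to_title, COUNT(DISTINCT from_title)
            FROM brokenlinks
            GROUP BY to_title
        """)


def most_wanted(conn, limit=50):
    """Return (title, pages) rows, most-linked missing pages first."""
    return conn.execute(
        "SELECT title, pages FROM wanted ORDER BY pages DESC LIMIT ?",
        (limit,),
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    conn.executemany(
        "INSERT INTO brokenlinks (from_title, to_title) VALUES (?, ?)",
        [("A", "Foo"), ("B", "Foo"), ("A", "Bar")],
    )
    rebuild_wanted(conn)
    print(most_wanted(conn, 10))
```

The sketch rebuilds wanted wholesale for simplicity. Updating the counts whenever a page is saved would presumably be cheaper in practice, but that depends on the save path, so it is left out here.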
-- Jan Hidders
wikitech-l@lists.wikimedia.org