Hi,
after some browsing around in the code, I've found a major bottleneck: the current internal link resolution system.
If you have a local Wikipedia install, try it for yourself: Visit a few very link-heavy pages, like [[List of reference tables]]. You will notice that the rendering speed depends directly on the number of internal links (not on the actual text size or the formatting tags). The link cache seems to work, since page rendering gets faster as you load a page repeatedly, but it remains rather slow for link-heavy pages.
For further proof, edit OutputPage.php and, in the function replaceInternalLinks, just add a "return $s;" at the top. As a result, internal links will no longer be resolved. Now browse a page (ideally, open the Main Page before the change and follow links from there afterwards) and notice that the rendering is lightning fast.
The internal link resolution process is quite complex. I notice that lots of Title objects are created, and that we do lots of ID lookups. I won't speculate too much about the cause until I have had time to examine the code more closely. But it is certain that this is a bottleneck.
I believe we need to make use of the links and brokenlinks tables for the actual page rendering. Right now they seem to be used only for the special pages.
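To illustrate the idea, here is a minimal sketch of resolving all links on a page with two queries instead of one lookup per link. It is written in Python with sqlite3 purely for illustration (the real code is PHP on MySQL), and the column names l_from, l_to, bl_from and bl_to, as well as the function names, are assumptions, not the actual schema or API.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical links/brokenlinks schema."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE links (l_from TEXT, l_to TEXT);
        CREATE TABLE brokenlinks (bl_from TEXT, bl_to TEXT);
        CREATE INDEX links_from ON links (l_from);
        CREATE INDEX brokenlinks_from ON brokenlinks (bl_from);
    """)
    db.executemany(
        "INSERT INTO links VALUES (?, ?)",
        [("Main Page", "Physics"), ("Main Page", "History")],
    )
    db.executemany(
        "INSERT INTO brokenlinks VALUES (?, ?)",
        [("Main Page", "Nonexistent topic")],
    )
    return db


def link_status_for_page(db, page_title):
    """Return {target_title: exists} for every link on the page.

    Two queries cover the whole page, however many links it contains.
    """
    status = {}
    for (target,) in db.execute(
        "SELECT l_to FROM links WHERE l_from = ?", (page_title,)
    ):
        status[target] = True   # target article exists
    for (target,) in db.execute(
        "SELECT bl_to FROM brokenlinks WHERE bl_from = ?", (page_title,)
    ):
        status[target] = False  # target article is missing
    return status
```

The renderer could then consult the returned dictionary for each link it encounters, rather than querying the database once per link.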
Regards,
Erik
Erik Moeller wrote:
Hi,
after some browsing around in the code, I've found a major bottleneck: the current internal link resolution system.
If you have a local Wikipedia install, try it for yourself: Visit a few very link-heavy pages, like [[List of reference tables]]. You will notice that the rendering speed depends directly on the number of internal links (not on the actual text size or the formatting tags). The link cache seems to work, since page rendering gets faster as you load a page repeatedly, but it remains rather slow for link-heavy pages.
I address this issue as a non-techie. I have worked on some very link-heavy pages, such as the ones for the Academy Awards, whose loading time increases with the addition of more data.
1. Is it likely that a technical solution will soon be found for the slow loading?
2. Are there ways in which data could be better organized to minimize the effects of slow loading?
3. Should articles like the Academy Awards listings be broken down to more manageable sizes, and if so, how does one determine optimum sizes?
Eclecticology
Hi,
I address this issue as a non-techie. I have worked on some very link-heavy pages, such as the ones for the Academy Awards whose loading time increases with the addition of more data. 1. Is it likely that a technical solution will soon be found for the slow loading?
I'll try to look into it more closely in the next few days, but I'd appreciate some help from the long-term developers. Another thing I'm currently trying to improve is the speed of the Orphaned Pages special page, which takes an awful lot of time on my local install (possibly because of a missing index on the table).
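As a rough sketch of why an index might matter here: an orphaned-pages query has to check, for every page, whether any link points to it. The example below uses Python with sqlite3 for illustration only; the table names pages and links, the column names, and the index name links_to are assumptions and not the actual schema.

```python
import sqlite3


def find_orphans(db):
    """Return titles of pages that no other page links to, sorted by title."""
    rows = db.execute("""
        SELECT p.title
        FROM pages AS p
        LEFT JOIN links AS l ON l.l_to = p.title
        WHERE l.l_to IS NULL
        ORDER BY p.title
    """)
    return [title for (title,) in rows]


def build_demo_db():
    """Create an in-memory database with hypothetical pages and links tables."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE pages (title TEXT PRIMARY KEY);
        CREATE TABLE links (l_from TEXT, l_to TEXT);
        -- Without an index on the link target, the join above may have to
        -- scan the whole links table once for every page.
        CREATE INDEX links_to ON links (l_to);
    """)
    db.executemany("INSERT INTO pages VALUES (?)", [("A",), ("B",), ("C",)])
    db.executemany("INSERT INTO links VALUES (?, ?)", [("A", "B")])
    return db
```

With the index in place, each page's incoming links can be found by an index lookup, so the total cost should grow far more slowly as the links table gets larger.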
2. Are there ways in which data could be better organized to minimize the effects of slow loading?
3. Should articles like the Academy Awards listings be broken down to more manageable sizes, and if so, how does one determine optimum sizes?
With the current software, it's simply a matter of the number of links. The more links, the slower the page will load. The volume of the actual text content does not matter. So you could try to break collections of links into different sections on separate pages.
But in general I am against such workarounds. We should try to improve the software ASAP. Do you think I could convert you into a developer? We need all the help we can get, and I'd be willing to teach as far as I understand.
Regards,
Erik
erik_moeller@gmx.de wrote:
But in general I am against such workarounds. We should try to improve the software ASAP. Do you think I could convert you into a developer? We need all the help we can get, and I'd be willing to teach as far as I understand.
I would be interested in joining a mailing list aimed at assisting neophyte developers with getting started or in establishing private dialogues with friendly mentors and other neophytes.
I have some obsolete Pentium systems that I could dedicate and set up for local testing and/or development activities.
I have substantial experience and exposure to a variety of software and system development projects and some applicable formal training but very limited actual coding experience.
If you can assist me (and others) with converting to a hands-on developer, I would appreciate it. Gaining traction with free-software-based development has proven more difficult than I expected. Much more so (for me) than my previous experience with limited Windows-based programming.
I will be traveling on business next week and the week after, and will have high-bandwidth internet access. If we could develop a list of software and versions to work with initially in the next few days, I could download all the required components to set up a local system very similar or identical to the initial recommended neophyte's platform.
Regards, Mike Irwin
Hello Michael!
I would be interested in joining a mailing list aimed at assisting neophyte developers with getting started or in establishing private dialogues with friendly mentors and other neophytes.
Hm, how about a wiki page instead? The knowledge of mailing lists tends to get lost in their vast archives.
I have created http://meta.wikipedia.org/wiki/How_to_become_a_Wikipedia_hacker as a skeleton with some content. Please add to the structure, particularly the questions you find relevant for yourself.
I have some obsolete Pentium systems that I could dedicate and set up for local testing and/or development activities.
Great!
I have substantial experience and exposure to a variety of software and system development projects and some applicable formal training but very limited actual coding experience.
The best way to learn coding is to look at code. Again, that's why I think installing Wikipedia is crucial for anyone, and it doesn't take much knowledge.
A problem for me, and I should have written that, is that I have never used PHP, MySQL etc. under Windows -- I'm used to Linux. Do we have any Wikipedia coders using Windows?
Regards,
Erik
On Tue, Nov 12, 2002 at 10:41:55AM +0100, Erik Moeller wrote:
I have substantial experience and exposure to a variety of software and system development projects and some applicable formal training but very limited actual coding experience.
The best way to learn coding is to look at code. Again, that's why I think installing Wikipedia is crucial for anyone, and it doesn't take much knowledge.
A problem for me, and I should have written that, is that I have never used PHP, MySQL etc. under Windows -- I'm used to Linux. Do we have any Wikipedia coders using Windows?
Why should we bother about Windoze at all? Server stuff is coded here.
wikitech-l@lists.wikimedia.org