Many thanks to Neil Harris for the bots; they've been pounding on the site for a while now, and here's what we've learned:
- The size of the database isn't an issue. If Wikipedia doubles, or more, performance won't be affected at all. This is what I would have expected.
- Concurrent access does slow things down, but not pathologically. I've got 16 bots running over two high-speed connections right now, and the server isn't swapping. However, some pages do take longer to serve than when load is light.
- Some special pages are still moderately slow (particularly "wanted" and "random page"), but the real time hogs now are very long pages with lots of links. Some particular hogs are "Current events", "Chinese sovereign", "List of rare diseases" and long history pages with lots of changes like main page and bug reports. The sample scrabble game is the longest page, but it has very few links so it's not as much of a hog, though it's still a problem.
"Current events" strikes me as a particularly big, yet solvable, problem. We should come up with a way of breaking it into manageable pieces. Chinese sovereign can clearly be broken up as well, and we can do things like replace the Scrabble diagrams with images.
Many thanks to Neil Harris for the bots;
Do we have an emergency tool to undo all edits made from a certain IP at a certain time? Just in case the bot falls into the wrong hands, or a script kiddie makes one himself.
I know bots can also be useful. Ben-Zin has made one to automatically generate the year pages. It could also be helpful in other cases, so it would be a good idea to share it with others, but I am afraid someone might misuse it.
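To make the idea concrete, here is a rough sketch of how such an emergency tool might pick what to roll back. It assumes the edit history can be read as simple records of page, revision id, IP and timestamp; the `Revision` record and the `plan_reverts` function are hypothetical names for illustration, not part of the existing wiki software. It only plans the rollback and does not touch the database.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Revision:
    # Hypothetical record; the real history table may look different.
    page: str
    rev_id: int
    ip: str
    timestamp: datetime


def plan_reverts(history, bad_ip, start, end):
    """Map each affected page to the revision id that should be restored.

    A page is included only if its newest revision is a suspect edit
    (made from bad_ip between start and end). The value is the id of the
    newest non-suspect revision, or None if the page has no clean revision
    at all (for example, a page the bot created).
    """
    def is_suspect(rev):
        return rev.ip == bad_ip and start <= rev.timestamp <= end

    # Group revisions per page, oldest first.
    by_page = {}
    for rev in sorted(history, key=lambda r: (r.page, r.timestamp)):
        by_page.setdefault(rev.page, []).append(rev)

    plan = {}
    for page, revs in by_page.items():
        if not is_suspect(revs[-1]):
            # Someone else has edited since; leave it for manual review.
            continue
        target = None
        for rev in reversed(revs):
            if not is_suspect(rev):
                target = rev.rev_id
                break
        plan[page] = target
    return plan
```

Pages where somebody else has already edited after the suspect edits are deliberately skipped, since blindly restoring an old revision there would throw away good work.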
Kurt
On 7/13/02 4:19 AM, "lcrocker@nupedia.com" lcrocker@nupedia.com wrote:
"Current events" strikes me as a particularly big, yet solvable, problem. We should come up with a way of breaking it into manageable pieces. Chinese sovereign can clearly be broken up as well, and we can do things like replace the Scrabble diagrams with images.
We really shouldn't replace the diagrams with images. That would be a major step backward. It's bearable that the page is slow--it should be a test case for improvement.
However, I agree that super-high-traffic pages like "Current events" should be handled expediently.
wikitech-l@lists.wikimedia.org