On Tue, 2003-04-29 at 23:33, Lee Daniel Crocker wrote:
(David A. Wheeler dwheeler@dwheeler.com):
- Perhaps for simple reads of the current article (cur), you
could completely skip using MySQL and use the filesystem instead.
In other words, caching.
Not necessarily; it would also be possible to keep the wiki text in files. But I'm not sure what great benefit this would have, as you still have to go looking up various information to render it.
Yes, various versions of that have been tried and proposed, and more will be. The major hassles are (1) links, which are displayed differently when they point to existing pages, so a page may appear differently from one view to the next depending on the existence of other pages,
That's not a problem; one simply invalidates the caches of all linking pages when creating/deleting.
This is already done in order to handle browser-side caching; each page's cur_touched timestamp is updated whenever a linked page is created or deleted. Simply regenerate the page if cur_touched is more recent than the cached HTML.
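A rough sketch of that freshness check, in Python rather than PHP for brevity. Only cur_touched comes from the discussion above; `Page`, `HtmlCache`, and the `render` callback are hypothetical names made up for illustration, and the cache here is a plain in-memory dict rather than files on disk.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class Page:
    title: str
    wikitext: str
    cur_touched: float  # bumped on edit, or when a linked page is created/deleted


class HtmlCache:
    """Hypothetical rendered-HTML cache keyed by page title."""

    def __init__(self, render: Callable[[str], str]) -> None:
        self._render = render
        # title -> (time the HTML was rendered, rendered HTML)
        self._entries: Dict[str, Tuple[float, str]] = {}
        self.render_count = 0

    def get(self, page: Page, now: float) -> str:
        entry = self._entries.get(page.title)
        # Serve the cached copy unless cur_touched is newer than the render time.
        if entry is not None and page.cur_touched <= entry[0]:
            return entry[1]
        html = self._render(page.wikitext)
        self.render_count += 1
        self._entries[page.title] = (now, html)
        return html
```

With this shape, creating or deleting a linked page needs no cache bookkeeping of its own: bumping cur_touched on the linking pages is enough to make the next `get` re-render them.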
- You could start sending out text ASAP, instead of batching it.
Many browsers start displaying text as it's available, so to users it might _feel_ faster.
A few things (like language links) currently require parsing the entire wikitext before we output the topbar. Hypothetically we could output the topbar after the text and let CSS take care of its location as we do for the sidebar, but this may be problematic (e.g. if its vertical size varies due to word wrap) and would leave users navigationally stranded while the page is loading.
Also, holding text in-memory may create memory pressure that forces more useful stuff out of memory.
Not an issue. HTML is sent out immediately after it's rendered.
Well... many passes of processing are done over the wikitext on its way to HTML, then the whole bunch is dumped out in a chunk.
Things like database updates are deferred until after sending;
I'm not 100% sure how safe this is; if the user closes the connection from their browser deliberately (after all, the page _seems_ to be done loading, why is the icon still spinning?) or due to an automatic timeout, does the script keep running through the end or is it halted in between queries?
One thing that would be nice is if the HTTP connection could be dropped immediately after sending and before those database updates. That's easy to do with threads in Java Servlets, but I haven't found any way to do it with Apache/PHP.
For some things (search index updates) we use INSERT/REPLACE DELAYED queries, whose actual action will happen at some point in the future, taken care of for us by the database. There doesn't seem to be an equivalent for UPDATE queries.
Hypothetically we could have an entirely separate process to perform asynchronous updates and just shove commands at it via a pipe or shared memory, but that's probably more trouble than it's worth.
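To make the idea concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not something that exists in the codebase: a worker thread and an in-process queue stand in for the separate process and the pipe, and `DeferredUpdater` and `apply_update` are hypothetical names.

```python
import queue
import threading
from typing import Callable, Optional


class DeferredUpdater:
    """Hypothetical helper: queue update commands now, apply them later."""

    def __init__(self, apply_update: Callable[[str], None]) -> None:
        self._apply = apply_update
        self._queue: "queue.Queue[Optional[str]]" = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, command: str) -> None:
        # Returns immediately; the request handler never waits on the update.
        self._queue.put(command)

    def _run(self) -> None:
        while True:
            command = self._queue.get()
            try:
                if command is None:  # sentinel pushed by close()
                    return
                self._apply(command)
            finally:
                self._queue.task_done()

    def close(self) -> None:
        # Everything queued before the sentinel is applied first, in order.
        self._queue.put(None)
        self._worker.join()
```

The request path would call `submit()` once the page has been sent and move on. The cost is the extra moving part: the worker has to be started, monitored, and drained on shutdown, which is the "more trouble than it's worth" concern.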
-- brion vibber (brion @ pobox.com)