(David A. Wheeler david_a_wheeler@yahoo.com):
- Perhaps for simple reads of the current article (cur), you
could completely skip using MySQL and use the filesystem instead.
In other words, caching.
Sorry, I wasn't clear. I wasn't thinking of caching - I was thinking of accessing the filesystem INSTEAD of MySQL when getting the current wikitext.
No, you were clear. I am using "caching" in the plain English sense of the word. Using the file system as a cache in front of the database is just one possible implementation of the idea.
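For illustration, here is a minimal sketch of the file system acting as a read-through cache in front of the database. It is not the wiki's actual code: `read_current`, `invalidate`, `cache_path` and the `fetch_from_db` callback are hypothetical names, and the sketch assumes that whatever saves an edit also calls `invalidate` for that title.

```python
import hashlib
import os
import tempfile


def cache_path(cache_dir, title):
    # Hash the title so any characters in it are safe in a file name.
    name = hashlib.sha1(title.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, name + ".txt")


def read_current(title, cache_dir, fetch_from_db):
    """Return the current wikitext, using the file system before the database."""
    path = cache_path(cache_dir, title)
    try:
        with open(path, encoding="utf-8", newline="") as f:
            return f.read()
    except FileNotFoundError:
        pass
    # Cache miss: ask the database (hypothetical callback), then store the result.
    text = fetch_from_db(title)
    # Write to a temporary file and rename it, so readers never see a partial file.
    fd, tmp = tempfile.mkstemp(dir=cache_dir)
    with os.fdopen(fd, "w", encoding="utf-8", newline="") as f:
        f.write(text)
    os.replace(tmp, path)
    return text


def invalidate(title, cache_dir):
    """Drop the cached copy; assumed to be called whenever the article is saved."""
    try:
        os.remove(cache_path(cache_dir, title))
    except FileNotFoundError:
        pass
```

Reading the file directly and never consulting the database is the same idea with the fallback removed; the hard part in both cases is keeping the file in step with edits.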
It isn't. And there's no reason to expect flex to be any faster than any other language.
Actually, for some lexing applications flex can be MUCH faster. That's because it pre-compiles a large set of patterns into C scanner tables, which are then compiled along with the rest of the program. Its "-C" table options (for example -Cf, which generates full, uncompressed tables) can, for some applications, result in blazingly fast scanners.
I suppose that's true. I do want to formalize the wikitext grammar at some point, and using something like Lex/Yacc code compiled and linked into PHP as a module is certainly a possibility.
Oh, one note - if you simply want to record whether a given article exists, and check it quickly, one fancy way of doing this is a Bloom filter. You hash the article title several ways and set the corresponding bits in a compact bit array. A lookup then tells you either that the article definitely doesn't exist or that it probably does, so only the "probably" answers need a real database check.
Yes, that's a very good idea. I just recompiled the PHP on the server to have the shared memory extensions, so putting a Bloom filter into that memory is probably better than a more typical hash table.
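As a rough sketch of what that could look like (in Python for brevity rather than the PHP running on the server), the class below keeps a Bloom filter in a plain `bytearray`, which stands in for the shared memory segment. The class name, the default sizes, and the use of SHA-256 with double hashing are all assumptions made for illustration. Note that a Bloom filter can report an article as present when it isn't (a false positive), but it never misses one that was added.

```python
import hashlib


class BloomFilter:
    """Approximate set of article titles: no false negatives, some false positives."""

    def __init__(self, num_bits=8192, num_hashes=4):
        # Sizes are illustrative; real values depend on the article count
        # and the false-positive rate that is acceptable.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, title):
        # Derive several bit positions from one digest (double hashing).
        digest = hashlib.sha256(title.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, title):
        for pos in self._positions(title):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, title):
        # False means "definitely absent"; True means "probably present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(title)
        )


existing = BloomFilter()
existing.add("Main Page")
print(existing.might_contain("Main Page"))
```

One design caveat: a plain Bloom filter cannot forget a title, so deleting an article would mean rebuilding the filter or switching to a counting variant.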