Simetrical wrote:
On Tue, Aug 5, 2008 at 3:55 PM, Guy Van den Broeck <guyvdb@gmail.com> wrote:
I can't draw any final conclusions with respect to performance because I don't have a site of considerable size. I'd appreciate it if somebody could test this implementation and compare it to the current php installation.
Any site of considerable size is likely to be running wikidiff2, which is written in C++ and almost certainly much faster than your PHP implementation. (Or did you really mean that your PHP code is faster than the C++ library?) So if your target is big sites, you shouldn't have written it in PHP.
Diffing is not a significant bottleneck at present with wikidiff2, however, so a rewrite for performance reasons alone is unlikely to be productive.
The vast majority of wikis are probably not running wikidiff2: they're on shared hosting, or their admins don't know how to compile and install a low-level extension, or don't know they should.
Nor does one have to be a "big" site to benefit from diffs that aren't insanely slow in the "pathological" cases (many bulk changes, or vandalism affecting a large page, both of which are common).
I imagine the Wikimedia sysadmins (for instance) would be uninterested in trying out new solutions when an established one works perfectly well.
The fact that we've got a custom solution means that everyone else gets shafted with the slow diff that's prone to causing timeouts and can even be DoSed easily.
Very slow diffs can be a huge problem with RSS feeds, for instance, which diff many pages for a single request.
In the long run, a hopefully faster, more maintainable diff system could be a good base to work from -- and to build smarter diff tools with improved output or a cleaner degrade/failout for the remaining super-slow cases.
-- brion