On Wed, May 07, 2003 at 11:46:59PM +0200, Tomasz Wegrzanowski wrote:
On Wed, May 07, 2003 at 04:07:00PM -0500, Nick Reinking wrote:
Also, what format is the wikitext stored in the database as? UTF-8? UTF-16?
Most in UTF-8. Some in ISO-8859-1 but they should move to UTF-8 at some point.
As for performance: with what I'm handling now, running over all the .txt data files in the testsuite (x256 = 492672 lines), I'm seeing parsing speeds of about 86600 lines/sec (in an 18KB executable).
On what hardware? What test suite? kB/sec would be a more useful estimate than lines/sec.
Hardware: P4 1.8GHz, 512MB RAM. The files are from CVS under phase3/testsuite/data/*.txt
The file is 28650KB, so 5036KB/sec.
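
For anyone who wants to check the conversion, here is a rough sketch of the arithmetic in Python. The function name is made up for illustration and is not part of the parser; the inputs are just the figures quoted above (492672 lines, 86600 lines/sec, 28650KB).

```python
def throughput_kb_per_sec(total_lines, lines_per_sec, total_kb):
    """Convert a lines/sec figure into KB/sec for a known input size.

    Hypothetical helper, only meant to reproduce the numbers in this thread.
    """
    # Time the run must have taken at the reported line rate.
    elapsed_sec = total_lines / lines_per_sec
    # Same run expressed as data volume per second.
    return total_kb / elapsed_sec


rate = throughput_kb_per_sec(492672, 86600, 28650)
print(f"{rate:.0f} KB/sec")  # prints "5036 KB/sec"
```

That implies a run of roughly 5.7 seconds over the whole input, assuming the lines/sec figure covers the entire run.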