-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Rolf Lampa wrote:
> Is there any cheap trick to make a "bit- or byte-wise wash" of the xml dump-files? Or is there any tool out there capable of streaming through huge files tidying them up? (I can make my own "bitwise wash/mask" with Delphi-code if I just knew for sure what general bit-pattern (if any) to apply on the file).
That kind of verification is already run on the dumps as they're produced, so it shouldn't be possible to get a file containing invalid UTF-8 sequences.
Can you be specific about where in the file they occur?
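
If it helps to pin down the location, here is a minimal sketch of a streaming check, assuming Python is available. It is not the verification the dump process itself uses; the function name first_invalid_utf8 and the 1 MiB chunk size are made up for illustration. It reads the file in chunks, so it works on huge dumps, and reports the byte offset of the first invalid UTF-8 sequence. It only checks UTF-8 well-formedness, not whether every character is legal in XML.

```python
import codecs
import io


def first_invalid_utf8(stream, chunk_size=1 << 20):
    """Return the byte offset of the first invalid UTF-8 sequence, or None.

    `stream` must be a binary file-like object. Hypothetical helper,
    for illustration only.
    """
    decoder = codecs.getincrementaldecoder("utf-8")()
    offset = 0  # number of bytes read before the current chunk
    while True:
        chunk = stream.read(chunk_size)
        # Bytes of an incomplete sequence carried over from earlier chunks.
        pending = len(decoder.getstate()[0])
        try:
            # An empty chunk means end of input, so flush with final=True.
            decoder.decode(chunk, final=not chunk)
        except UnicodeDecodeError as err:
            # err.start is relative to (carried-over bytes + current chunk).
            return offset - pending + err.start
        if not chunk:
            return None
        offset += len(chunk)


# Usage on an in-memory sample; for a real dump, pass open(path, "rb").
sample = io.BytesIO(b"<page>abc\xff</page>")
print(first_invalid_utf8(sample))
```

Run against a dump opened in binary mode, the returned offset can then be used to seek to the spot and look at the surrounding bytes.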
- -- brion vibber (brion @ pobox.com)