Hi again,
I wrote:
When I tried to parse the current German XML dump, I discovered the following malformed sequence (in [[de:India]]):
You can remove the errors with a small Perl script. It is only a workaround for the current dump:
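#!/usr/bin/perl
# Skip the malformed [[got:...] line; replace the malformed [[zh:...] line with </text>.
# The literal brackets are escaped so they are not parsed as a character class.
while (<>) {
    next if /^\[\[got:�/;
    if (/^\[\[zh:�/) { print "</text>"; }
    else             { print; }
}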