If you're reading this, I hit "send" by mistake.
On Tue, 2002-03-12 at 03:29, Tomasz Wegrzanowski wrote:
Generally it worked, but there were some problems:
1)
The conversion script tried to convert .db.orig files.
Now there are articles like "Propozycje Tematow.db.o"
with contents "There was an error converting this file : file not found."
Should be harmless...
2)
Article q{"Casablanca"} (WITH quotes; I don't know how to write
it down, let's say that q{} are stronger quotes ;)), currently
a redirect to q{Casablanca (film)}, was converted to q{\"Casablanca"}.
I wouldn't mind if it were just removed from the database.
!! Okay, that's just wrong. Hmm.
Actually, on my machine I don't even see this article converted. I'm not
sure what's going on there; I seem to have a lot of articles missing.
Possibly some weird PHP problem: as near as I can tell, it's simply not
reading all the files in every directory.
3)
Some articles are weird, like:
Polimery
12c12
< [Średnia masa cząsteczkowa polimerów]?
---
> Masa cząsteczkowa polimerów
That looks like a diff which was imported as an article.
I have no idea what could cause it.
The usemod db format doesn't seem to be quite as consistent as the
conversion script wants... I'll look it over again in the morning.
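
For the stray .db.orig conversions in item 1, one option is to accept only files whose name ends exactly in ".db". The following is a minimal Python sketch of that filter, not the conversion script itself; the helper names is_page_file and page_files are hypothetical, and the ".db" suffix for page files is an assumption inferred from the ".db.orig" names above.

```python
from pathlib import Path


def is_page_file(name: str) -> bool:
    """Return True only when the final extension is '.db' (assumed page suffix)."""
    # Path.suffix looks at the last extension only, so "x.db.orig" gives ".orig".
    return Path(name).suffix == ".db"


def page_files(root: str) -> list[Path]:
    """Collect every page file under root, skipping backups such as *.db.orig."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and is_page_file(p.name)
    )


if __name__ == "__main__":
    for name in ("Propozycje Tematow.db", "Propozycje Tematow.db.orig"):
        print(name, is_page_file(name))
```

Taking the title from Path(name).stem rather than cutting a fixed number of characters off the end would also avoid titles like "Propozycje Tematow.db.o".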
Major problem: the z-with-dot letter (ż) seems to be broken
in links.
Links from
http://local_copy_of_wikipedia/wiki.phtml?title=Wy%C5%BCsze_uczelnie_w_Pols…
to subpages are broken.
They work on the Polish UseMod Wikipedia, so it must be a problem with the conversion script.
Maybe there are some mistakes in the latin2 -> utf8 table?
Ah, looks like parts of the links are accidentally converted from latin2
to utf-8 twice. (Thanks for catching that! It doesn't show up on the
Esperanto database I usually test with, which is already mostly UTF-8
and doesn't change on a second conversion.) I've moved the conversion to
before the link extraction, it seems better now.
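
To illustrate the double conversion described above, here is a minimal Python sketch. It is not the conversion script; the function name latin2_to_utf8 and the sample word are assumptions chosen for illustration. It shows that one latin2 -> UTF-8 pass produces the expected %C5%BC for ż, while a second pass over the already-converted bytes corrupts it.

```python
from urllib.parse import quote


def latin2_to_utf8(raw: bytes) -> bytes:
    """Interpret raw as ISO-8859-2 (latin2) text and re-encode it as UTF-8."""
    return raw.decode("iso-8859-2").encode("utf-8")


# "Wyższe" as it would be stored in a latin2 database: ż is the single byte 0xBF.
original = "Wyższe".encode("iso-8859-2")

once = latin2_to_utf8(original)  # correct UTF-8
twice = latin2_to_utf8(once)     # UTF-8 bytes misread as latin2, then re-encoded

print(quote(once))            # Wy%C5%BCsze
print(quote(twice))           # a longer, wrong escape sequence
print(twice.decode("utf-8"))  # mojibake instead of "Wyższe"
```

Pure ASCII input passes through both steps unchanged, so a test set with few non-ASCII characters may not reveal this kind of bug.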
-- brion vibber (brion @ pobox.com)