[WikiEN-l] [Foundation-l] Old Wikipedia backups discovered
Martin Møller Skarbiniks Pedersen
traxplayer at gmail.com
Sun Dec 19 06:11:06 UTC 2010
On 17 December 2010 21:18, Joseph Reagle <joseph.2008 at reagle.org> wrote:
> On Thursday, December 16, 2010, Federico Leva (Nemo) wrote:
>> I have the first 10K edits up reconstructed in their various pages at:
>> http://cyber.law.harvard.edu/~reagle/wp-redux/
>
> I fixed some of the encoding issues. The DB dump contained different encodings, so the encoding of each diff in the dump is now guessed independently using Python's CharDet (Universal Encoding Detector) library.
>
> So now you can read up on the few "accented" topics in the early Wikipedia including: Göteborg, Köpenhamn, and Křbenhavn.
Should probably be København and not Křbenhavn.
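
A possible explanation, offered as a sketch rather than a diagnosis of the actual dump: the byte 0xF8 is "ø" in ISO-8859-1 but "ř" in ISO-8859-2, so a detector that guesses ISO-8859-2 for Latin-1 bytes would produce exactly this spelling. The byte for "ö" (0xF6) is the same in both encodings, which would explain why Göteborg and Köpenhamn came out fine. The encoding pair and the helper name redecode below are my assumptions; nothing in the thread says which encoding was guessed.

```python
# Assumption: the original bytes were ISO-8859-1 and the detector
# guessed ISO-8859-2. The thread does not confirm either encoding.
original = "København"
raw = original.encode("iso-8859-1")   # b'K\xf8benhavn'
misread = raw.decode("iso-8859-2")    # 0xF8 is read as "ř" here
print(misread)                        # Křbenhavn


def redecode(text, wrong="iso-8859-2", right="iso-8859-1"):
    """Hypothetical helper: undo a wrong guess by re-encoding with the
    wrongly guessed codec and decoding with the intended one."""
    return text.encode(wrong).decode(right)


print(redecode(misread))              # København

# "ö" maps to the same byte in both encodings, so it survives the wrong guess.
print("Göteborg".encode("iso-8859-1").decode("iso-8859-2"))  # Göteborg
```

If that is what happened, overriding the guess for the affected diffs (or re-decoding the stored text as above) should restore the "ø".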
/Martin