Just to narrow this down a little further from the DB server-side: the
eventlogging tables do use utf-8, so the fix probably doesn't require
laborious schema changes (if that's what you meant by changing database
types).
To follow the structure MediaWiki uses, I think the easiest fix is to change
the db column types from varchar to varbinary wherever utf-8 data is stored.
Please let us know if you think that is not appropriate.
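
To make the concern concrete, here is a rough sketch (not taken from the
eventlogging code; the function names and the sample string are made up for
illustration) of what goes wrong when utf-8 bytes are decoded with the wrong
charset somewhere between the client and the table. It also shows why a
byte-oriented column sidesteps the problem: the bytes are stored as-is and
the application alone decides how to decode them.

```python
from __future__ import print_function, unicode_literals


def char_and_byte_lengths(text):
    """Return (number of characters, number of utf-8 bytes) for text."""
    return len(text), len(text.encode("utf-8"))


def simulate_wrong_charset(text):
    """Encode as utf-8, then decode as latin-1, as a misconfigured
    connection might do. The result is mojibake."""
    return text.encode("utf-8").decode("latin-1")


def repair(garbled):
    """Undo the wrong decode: recover the original bytes via latin-1,
    then decode them as utf-8."""
    return garbled.encode("latin-1").decode("utf-8")


# Hypothetical field value: "Zurich" with u-umlaut, plus two CJK characters.
sample = "Z\u00fcrich \u6771\u4eac"

print(char_and_byte_lengths(sample))
garbled = simulate_wrong_charset(sample)
print(garbled == sample)           # False: the text no longer matches
print(repair(garbled) == sample)   # True: the original bytes were preserved
```

The repair only works here because latin-1 maps every byte to a character,
so nothing is lost; with other charsets or truncated values the damage may
not be reversible.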
On Mon, Jun 9, 2014 at 8:13 AM, Sean Pringle <springle(a)wikimedia.org> wrote:
On Fri, Jun 6, 2014 at 6:30 PM, Nuria Ruiz <nuria(a)wikimedia.org> wrote:
Encoding in python2 is a notorious pain and hard to get right, so
fixing this will mean not just "restoring" records from logs but also
changing database connection args, bindings and database types.
Just to narrow this down a little further from the DB server-side: the
eventlogging tables do use utf-8, so the fix probably doesn't require
laborious schema changes (if that's what you meant by changing database
types).
BR
Sean
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics