On 16/10/2012 23:04, Platonides wrote:
On 11/10/12 17:46, Strainu wrote:
I did something last year for exporting the files from WLMRO to Europeana: http://code.google.com/p/wikiro/source/browse/trunk/robots/python/pywikipedi... It was done very quickly and it probably has some bugs. Platonides also has something made for these statistics: http://toolserver.org/~platonides/wlm/users.php , although I couldn't tell you where to get the source from.
It's basically a regex to extract the user from the author field... plus 120 special cases for people who don't put a link to their user page or use a template, plus 4 special cases for users who use custom templates instead of {{Information}}, plus another for Talmoryair, who uses {{Artwork}} instead of {{Information}}. OTOH it has parsed -hopefully quite correctly- 363k images from 15000 authors.
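For illustration only, the approach described above (a regex on the author field plus a table of special cases) might look roughly like the sketch below. This is not the source of the toolserver tool: the patterns, the `SPECIAL_CASES` table and the `extract_user` function are hypothetical names assumed here to show the idea.

```python
import re

# Hypothetical pattern: a wiki link to a user page, e.g. [[User:Foo|Foo]].
USER_LINK = re.compile(r"\[\[\s*[Uu]ser\s*:\s*([^\]|#/]+)")

# Hypothetical pattern: the author parameter of {{Information}}, or the
# artist parameter of {{Artwork}}, up to the end of the line.
AUTHOR_FIELD = re.compile(r"\|\s*(?:[Aa]uthor|[Aa]rtist)[ \t]*=[ \t]*(.*)")

# Hypothetical special cases: raw author text mapped by hand to a user name,
# for uploaders who do not link to their user page.
SPECIAL_CASES = {
    "John Doe (own work)": "JohnDoe",
}


def extract_user(wikitext):
    """Return the uploader's user name, or None if it cannot be determined."""
    field = AUTHOR_FIELD.search(wikitext)
    if field is None:
        return None
    author = field.group(1).strip()
    # Hand-maintained exceptions take priority over the regex.
    if author in SPECIAL_CASES:
        return SPECIAL_CASES[author]
    link = USER_LINK.search(author)
    if link is None:
        return None
    return link.group(1).strip().replace("_", " ")
```

Every file description that returns None here would need either a new pattern or a new entry in the special-case table, which is where the 120 exceptions mentioned above come from.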
Yes, this is what I meant: this is not what I would call a handy solution, neither in principle nor in implementation. What would be the disadvantages of having these two pieces of information in the DB?
Emmanuel
PS: Thank you for offering your help with the scripting, but this is not really my purpose with this email - I really hope to find a clean solution instead of having to code/use some painful scripts on millions of pictures.