Francesco Cosoleto wrote:
I have always wondered why we adopted this solution, because I have
doubts about the amount of RAM required by the mediawiki-messages that
the bot actually uses. I think a list of items not to discard would
have been simpler, although I really do appreciate this more
sophisticated solution.
It requires about 1-2 KB per site in the wikipedia family, and this
family has a total of 255 sites. That looks to me like acceptable memory
usage (recently I reduced the memory required by the wikipedia module by
about 60 KB with r6751, if I remember rightly...). And with diskcache
enabled, 50 MB or more of disk space is wasted (software should use
temporary files only if really needed).
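
As a rough check of the figures above, here is a minimal sketch that multiplies the quoted per-site range by the number of sites. The function name total_kb is hypothetical, and the inputs are the approximate values from this message, not measurements.

```python
# Rough estimate from the figures quoted in this message (not a measurement).
SITES = 255              # sites in the wikipedia family
PER_SITE_KB = (1, 2)     # approximate message cache size per site, in KB


def total_kb(sites, per_site_kb):
    """Return the (low, high) total in KB for a per-site (low, high) range."""
    low, high = per_site_kb
    return sites * low, sites * high


low, high = total_kb(SITES, PER_SITE_KB)
print("about %d-%d KB in total" % (low, high))
```

So the whole family should stay within roughly half a megabyte of RAM under these assumptions.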
Simple test script:
grep --exclude-dir=.svn -rohP "\.mediawiki_message\s*\(\s*[\'\"][^)]+\)" ./ \
  | sort | uniq \
  | sed -e "s/^/ sum += len\(site/" -e "s/$/)/" \
        -e 1i"import wikipedia\nsum = 0\nfor lang in wikipedia.Site('en', 'wikipedia').languages():\n site = wikipedia.getSite(lang, 'wikipedia')" \
        -e "\$a\ print sum" >mwmsg_length.py
(I disabled the 'sp-contributions-older' line to get it to run, as it
raises an exception on wikipedia:gv.)
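
For readers who find the one-liner hard to follow, here is a minimal Python sketch of the same two stages: collecting the distinct .mediawiki_message(...) calls, then wrapping them in a script that sums the message lengths over all languages. The names find_calls and build_script are hypothetical, the regular expression is a port of the grep -P pattern, and the one-space indentation of the generated lines is copied as it appears in this message. Python's sorted() may order entries differently from sort under some locales. The generated text is Python 2 (print statement), as in the original.

```python
import re

# Port of the grep -P pattern: ".mediawiki_message(" followed by a quoted
# key and everything up to the first closing parenthesis.
CALL_RE = re.compile(r"""\.mediawiki_message\s*\(\s*['"][^)]+\)""")


def find_calls(source_text):
    """Return the distinct call snippets, sorted (grep -o | sort | uniq)."""
    return sorted(set(CALL_RE.findall(source_text)))


def build_script(calls):
    """Assemble the text of the generated script (the sed stage)."""
    lines = [
        "import wikipedia",
        "sum = 0",
        "for lang in wikipedia.Site('en', 'wikipedia').languages():",
        " site = wikipedia.getSite(lang, 'wikipedia')",
    ]
    # One "sum += len(site.mediawiki_message(...))" line per distinct call.
    lines += [" sum += len(site%s)" % call for call in calls]
    lines.append(" print sum")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up sample input, only to show the shape of the output.
    sample = "text = site.mediawiki_message('nbytes')\n"
    print(build_script(find_calls(sample)))
```

Unlike the shell version, this sketch reads from a string rather than walking the source tree, so the file traversal (and the .svn exclusion) would still have to be added.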
--
Francesco Cosoleto
History is a gallery of pictures in which there are few originals and
many copies. (Alexis de Tocqueville)