Re: [Wikitech-l] Google Summer of Code: accepted projects

22 Apr 2009


      Aryeh Gregor wrote:
...
I'm not clear on why we don't just make the daemon synchronously
return a result the way ImageMagick effectively does.  Given the level
of reuse of thumbnails, it seems unlikely that the latency is a
significant concern -- virtually no requests will ever actually wait
on it.
( I basically outlined these issues on the soc page but here they are 
again with at bit more clarity )
I recommended that the image daemon run semi-synchronously since the 
changes needed to maintain multiple states and return non-cached 
place-holder images while managing updates and page purges for when the 
updated images are available within the wikimedia server architecture 
probably won't be completed in the summer of code time-line. But if the 
student is up for it the concept would be useful for other components 
like video transformation / transcoding, sequence flattening etc. But 
its not what I would recommend for the summer of code time-line.
== per issues outlined in bug 4854 ==
I don't think its a good idea to invest a lot of energy into a separate 
python based image daemon. It won't avoid all  problems listed in bug 4854
Shell-character-exploit issues should be checked against anyway (since 
not everyone is going to install the daemon)
Other people using mediaWiki won't add a python or java based image 
resize and resolve dependency python or java  component & libraries. It 
won't be easier to install than imagemagick or "php-gd" that are 
repository hosted applications and already present in shared hosting 
environments.
Once you start integrating other libs like (java) Batik it becomes 
difficult to resolve dependencies (java, python etc) and to install you 
have to push out a "new program" that is not integrated into all the 
application repository manages for the various distributions.
Potential to isolate CPU and memory usage should be considered in the 
core medaiWiki image resize support anyway . ie we don't want to crash 
other peoples servers who are using mediaWiki by not checking upper 
bounds of image transforms. Instead we should make the core image 
transform smarter maybe have a configuration var that /attempts/ to bind 
the upper memory for spawned processing and take that into account 
before issuing the shell command for a given large image transformation 
with a given sell application.
== what would probably be better for the image resize efforts should 
focus on ===
(1) making the existing system "more robust" and (2) better taking 
advantage of multi-threaded servers.
(1) right now the system chokes on large images we should deploy support 
for an in-place image resize maybe something like vips (?) 
(http://www.vips.ecs.soton.ac.uk/index.php?title=Speed_and_Memory_Use) 
The system should intelligently call vips to transform the image to a 
reasonable size at time of upload then use those derivative for just in 
time thumbs for articles. ( If vips is unavailable we don't transform 
and we don't crash the apache node.)
(2) maybe spinning out the image transform process early on in the 
parsing of the page with a place-holder and callback so by the time all 
the templates and links have been looked up the image is ready for 
output. (maybe another function wfShellBackgroundExec($cmd, 
$callback_function) (maybe using |pcntl_fork then normal |wfShellExec 
then| ||pcntl_waitpid then callback function ... which sets some var in 
the parent process so that pageOutput knows its good to go) |
If operationally the "daemon" should be on a separate server we should 
still more or less run synchronously ... as mentioned above ... if 
possible the daemon should be php based so we don't explode the 
dependencies for deploying robust image handling with mediaWiki.
peace,
--michael

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

2002

Re: [Wikitech-l] Google Summer of Code: accepted projects