[Labs-l] Out of memory errors
Tim Landscheidt
tim at tim-landscheidt.de
Mon Aug 3 16:27:47 UTC 2015
(anonymous) wrote:
> At least two of my web applications have been hitting out of memory errors
> in the past 24 hours. I'm pretty sure that neither is requesting more
> memory than it did before. The problem is not appearing for similar
> cronjobs run through jsub -l release=trusty.
> Is there something that can be done about this server-side?
There is a recurring problem with
tools-webgrid-lighttpd-1409 (= an instance that runs
standard web services on Trusty).
The basic cause is that these hosts overstate the available
virtual memory, because the jobs running there are expected
to share substantial amounts of memory by using the same
binaries (lighttpd, php-cgi, etc.). If one of those web
services does something different, the formula no longer
holds and the host runs short on real memory.
Or so I thought, because:
| scfc at tools-bastion-01:~$ qconf -se tools-webgrid-lighttpd-1409.eqiad.wmflabs
| hostname tools-webgrid-lighttpd-1409.eqiad.wmflabs
| load_scaling NONE
| complex_values slots=128,release=trusty
| load_values arch=lx26-amd64,num_proc=4,mem_total=7985.183594M, \
| swap_total=487.996094M,virtual_total=8473.179688M, \
| load_avg=1.160000,load_short=1.110000, \
| load_medium=1.160000,load_long=0.930000, \
| mem_free=6150.722656M,swap_free=487.996094M, \
| virtual_free=6638.718750M,mem_used=1834.460938M, \
| swap_used=0.000000M,virtual_used=1834.460938M, \
| cpu=9.400000,m_topology=NONE,m_topology_inuse=NONE, \
| m_socket=0,m_core=0,np_load_avg=0.290000, \
| np_load_short=0.277500,np_load_medium=0.290000, \
| np_load_long=0.232500
| processors 4
| user_lists NONE
| xuser_lists NONE
| projects NONE
| xprojects NONE
| usage_scaling NONE
| report_variables NONE
| scfc at tools-bastion-01:~$
doesn't say anything about more virtual memory than real
memory (8 GByte) being provided, while on the other hand the
mem_free and virtual_free values do not correspond in any
way with:
| scfc at tools-webgrid-lighttpd-1409:~$ free -m
|              total       used       free     shared    buffers     cached
| Mem:          7985       7719        265        482        150       5728
| -/+ buffers/cache:       1840       6144
| Swap:          487          0        487
| scfc at tools-webgrid-lighttpd-1409:~$
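The figures quoted above can be put side by side with a short
script. This is only a minimal sketch: it assumes that the
M-suffixed load_values use the same unit as free -m, the helper
name parse_load_values is made up for illustration, and the two
samples were taken at different moments, so at best rough
agreement can be expected.

```python
import re

# Memory-related load_values copied from the qconf -se output above.
LOAD_VALUES = (
    "arch=lx26-amd64,num_proc=4,mem_total=7985.183594M,"
    "swap_total=487.996094M,virtual_total=8473.179688M,"
    "mem_free=6150.722656M,swap_free=487.996094M,"
    "virtual_free=6638.718750M,mem_used=1834.460938M,"
    "swap_used=0.000000M,virtual_used=1834.460938M"
)

# "free" column of the free -m output above (assumed to be the same unit).
FREE_MEM_ROW = 265.0            # "Mem:" line
FREE_MINUS_CACHE_ROW = 6144.0   # "-/+ buffers/cache:" line


def parse_load_values(text):
    """Return {name: float} for every 'name=<number>M' entry (hypothetical helper)."""
    values = {}
    for item in text.split(","):
        name, _, raw = item.partition("=")
        match = re.fullmatch(r"(\d+(?:\.\d+)?)M", raw.strip())
        if match:
            values[name.strip()] = float(match.group(1))
    return values


lv = parse_load_values(LOAD_VALUES)

# Is more virtual memory advertised than RAM plus swap?
overstatement = lv["virtual_total"] - (lv["mem_total"] + lv["swap_total"])
print(f"virtual_total - (mem_total + swap_total): {overstatement:.3f}")

# How far is the reported mem_free from each "free" figure?
gap_mem_row = lv["mem_free"] - FREE_MEM_ROW
gap_cache_row = lv["mem_free"] - FREE_MINUS_CACHE_ROW
print(f"mem_free - free (Mem: line):           {gap_mem_row:.3f}")
print(f"mem_free - free (-/+ buffers/cache):   {gap_cache_row:.3f}")
```

Run as is, it prints the gap between virtual_total and
mem_total + swap_total, and the distance of mem_free from each
of the two "free" figures. It says nothing about where the out
of memory errors themselves come from.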
Hmmm.
Tim