From: Federico Cozzi
Hello,
I am diagnosing an application problem on a production server.
It is a Java-based web application that randomly freezes because of long
"full garbage collection" pauses (tens of seconds).
This is the vmstat output from one such occurrence:
19:53:17 2010: procs -----------memory---------- ---swap-- -----io--- ---system--- -----cpu------
19:53:17 2010:  r  b    swpd   free  buff  cache    si  so    bi   bo    in     cs us sy id wa st
19:53:17 2010:  0  2 1503264 144232  3488 855996   301   0   312  229  1316   1588  2  1 93  4  0
19:53:22 2010:  0  1 1503264 143616  3552 856040    25   0    41   74  1334   1805  1  1 98  0  0
19:53:27 2010:  1  1 1503264 143252  3684 856288   243   0   253  152  1432   1790  3  2 92  3  0
19:53:32 2010:  0  4 1503264 143248  3808 835928  4403   0  4418  127  1685 222868  4  2 69 25  0
19:53:37 2010:  1  1 1503264 141600  4052 818124  3718   0  4002  176  1492 476444  1  2 73 24  0
19:53:42 2010:  0  1 1503264 140000  4096 780712  7818   0  7826   76  1809  14996  1  1 85 14  0
19:53:47 2010:  0  2 1503264 144044  4292 744532  5985   0  5995  165  1637   1830  0  0 78 21  0
19:53:52 2010:  0  1 1503264 144096  4408 710876  6383   0  6392  161  1776   2408  2  1 81 15  0
19:53:57 2010:  0  0 1503264 142632  4532 694436  3821   0  3896   88  1639   5519 16 11 52 21  0
19:54:02 2010:  0  0 1503264 146224  4592 689132   107   0   117  160  1376   2030  2  2 95  1  0
The full garbage collection started at about 19:53:30 and took 27
seconds.
During that window I see much higher-than-normal "si" values and,
moreover, a huge number of context switches (the "cs" column peaks at
476444). The CPU spent up to 25% of its time in I/O wait.
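
To put a number on the swap-in activity, here is a rough sketch that adds up the "si" column over the samples covering the collection. It assumes the 5-second sampling interval implied by the timestamps and that vmstat reports "si" in KiB per second (its usual default); the helper name swapped_in_kib is just something I made up for this post. Under those assumptions the total comes to roughly 157 MiB swapped in during the pause.

```python
# vmstat samples copied from the output above: (timestamp, si, cs, wa).
# Assumptions: 5-second sampling interval, "si" reported in KiB per second.
SAMPLES = [
    ("19:53:17", 301, 1588, 4),
    ("19:53:22", 25, 1805, 0),
    ("19:53:27", 243, 1790, 3),
    ("19:53:32", 4403, 222868, 25),
    ("19:53:37", 3718, 476444, 24),
    ("19:53:42", 7818, 14996, 14),
    ("19:53:47", 5985, 1830, 21),
    ("19:53:52", 6383, 2408, 15),
    ("19:53:57", 3821, 5519, 21),
    ("19:54:02", 107, 2030, 1),
]
INTERVAL_S = 5


def swapped_in_kib(samples, start, end, interval_s=INTERVAL_S):
    """Approximate KiB swapped in over samples with start <= timestamp <= end.

    Timestamps are zero-padded HH:MM:SS strings, so plain string
    comparison orders them correctly within one day.
    """
    return sum(si * interval_s
               for ts, si, _cs, _wa in samples
               if start <= ts <= end)


# Samples from 19:53:32 to 19:53:57 roughly cover the 27-second collection.
gc_kib = swapped_in_kib(SAMPLES, "19:53:32", "19:53:57")
print(f"swapped in during GC: ~{gc_kib / 1024:.0f} MiB")
```

This is only an estimate: each vmstat line is an average over the preceding interval, so the window boundaries do not line up exactly with the start and end of the collection.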

I think that this server has too little RAM for the application, so
that when a full garbage collection kicks in, the whole Java heap has
to be swapped back in, causing long delays. Is the high number of
context switches consistent with this theory?
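
One way I could check the theory directly is to look at how much of the JVM process is resident and how much is swapped out. Below is a minimal sketch that reads the VmRSS and VmSwap fields from /proc/<pid>/status. It assumes a Linux kernel recent enough to expose VmSwap (older kernels do not have that field, in which case it is simply missing from the result). The function names and the example input are hypothetical, not taken from the server.

```python
def parse_status_kib(status_text, fields=("VmRSS", "VmSwap")):
    """Return {field: KiB} for the requested fields of a /proc/<pid>/status dump.

    Fields that are absent from the text (for example VmSwap on an
    older kernel) are left out of the result.
    """
    result = {}
    for line in status_text.splitlines():
        name, _, rest = line.partition(":")
        if name in fields:
            parts = rest.split()
            if parts:
                result[name] = int(parts[0])  # value is followed by "kB"
    return result


def read_status_kib(pid):
    """Read and parse /proc/<pid>/status for a live process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_status_kib(f.read())


# Made-up example input, only to show the expected line format.
EXAMPLE = "Name:\tjava\nVmRSS:\t  812345 kB\nVmSwap:\t  654321 kB\n"
print(parse_status_kib(EXAMPLE))
```

If VmSwap for the Java process is a large fraction of the configured heap shortly before a full collection, that would support the idea that the collector is paging the heap back in.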

Thanks,
Federico