From: Dave Wright
The current calculation of VM overcommit (particularly with the default
vm.overcommit_ratio of 50) seems to be a hold-over from the days when we
had more swap than physical memory. For example, 1/2 phys mem + swap
made sense when you had 1GB of memory and 2GB of swap. However, I
recently ran into an issue on a server that had 8GB RAM and 2GB swap:
the OOM killer was getting triggered as VM commit hit 6GB, even though
there was plenty of RAM available. Once I figured out what was going
on, I manually tweaked the ratio to 110%.
It looks like current distro recommendations are still "have as much
swap as you have RAM", in which case the current calculation is fine.
But with SSDs becoming more common as boot drives, I think many users
will end up with less swap than RAM - consider a desktop user who
might have 4GB RAM and 1GB swap. I don't think you can expect desktop
users to understand or tweak overcommit_ratio, but I also don't think
having the distros simply change the default from 50 (to 100 or
something else) would cover all the cases well.

Would it make more sense to have the overcommit limit be calculated as:

max commit = min(swap, ram) * overcommit_ratio + max(swap, ram) ?

When swap >= ram, the formula works exactly as it does now, but when
ram >> swap, you are guaranteed to always be able to use your full RAM
(even when swap=0).
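
To make the difference concrete, here is a small userspace sketch of
the two formulas (just an illustration of the arithmetic, not the
kernel code; it ignores details such as hugetlb pages):

#include <stdio.h>

/* Current behaviour: limit = ram * overcommit_ratio/100 + swap */
unsigned long limit_current(unsigned long ram, unsigned long swap,
                            unsigned long ratio)
{
        return ram * ratio / 100 + swap;
}

/* Proposed: scale only the smaller of the two, grant the larger in full */
unsigned long limit_proposed(unsigned long ram, unsigned long swap,
                             unsigned long ratio)
{
        unsigned long lo = ram < swap ? ram : swap;
        unsigned long hi = ram < swap ? swap : ram;

        return lo * ratio / 100 + hi;
}

int main(void)
{
        unsigned long ram = 8192, swap = 2048;  /* MB; the server above */

        printf("current:  %lu MB\n", limit_current(ram, swap, 50));
        printf("proposed: %lu MB\n", limit_proposed(ram, swap, 50));
        return 0;
}

For the 8GB/2GB server this prints 6144 MB for the current formula
(exactly where the OOM killer kicked in) and 9216 MB for the proposed
one, so the full 8GB of RAM would stay usable.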

-Dave Wright
From: Dirk Geschke
Hi all,

I am not on the mailing list and a friend pointed me to this
thread...

We probably had the same problem: we had a Linux machine with 16GB
of RAM and no swap. There was only one big job running on it, and it
did a lot of I/O. This program failed to allocate much memory, and we
thought that was due to the high amount of cached memory in use. To
avoid problems with overcommit we had set overcommit_memory to 2.
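
For reference, strict accounting is enabled via the same /proc/sys/vm
interface that is used for the ratio further below:

qfix:~# echo 2 >/proc/sys/vm/overcommit_memory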

Now that I have seen this thread, it becomes clear: the default value
of overcommit_ratio is 50, so with no swap the commit limit is
16GB * 50/100 = 8GB, and no more than 8GB of memory can be allocated
in total.

After reading this thread I wrote a little program that allocates
memory in 512MB blocks and fills each block with zeros. My test system
has 4GB of RAM, so I started:

qfix:~# free
             total       used       free     shared    buffers     cached
Mem:       4052376     338124    3714252          0          0      17992
-/+ buffers/cache:     320132    3732244
Swap:            0          0          0
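
The allocation program itself is just a few lines; a minimal C
version matching the output below could look like this:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK (512UL * 1024 * 1024)     /* 512MB per allocation */

int main(void)
{
        unsigned long n = 0;
        void *p;

        /* Allocate 512MB blocks until malloc() refuses.  With
         * overcommit_memory=2 the failure shows up right here at
         * allocation time, once the commit limit is reached. */
        while ((p = malloc(BLOCK)) != NULL) {
                /* Fill the block with zeros so every page is
                 * actually touched, not merely reserved. */
                memset(p, 0, BLOCK);
                printf("got %lu * 512MB\n", ++n);
        }
        printf("malloc failure after %lu * 512 MB\n", n);
        return 0;
}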

geschke@qfix:~$ ./a.out
got 1 * 512MB
got 2 * 512MB
got 3 * 512MB
malloc failure after 3 * 512 MB

So 1.5GB is fine, but 2GB of the possible 4GB is not. I guess some
of the 4GB is not usable at all, and therefore the limit is slightly
below 2GB with overcommit_ratio=50: free reports 4052376 kB in total,
half of which is about 2026188 kB, just under 4 * 512MB.
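
By the way, the kernel exports the enforced limit in /proc/meminfo as
CommitLimit, together with the currently committed amount as
Committed_AS, so the numbers can be checked directly:

qfix:~# grep -i commit /proc/meminfo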

Next step is to set overcommit_ratio=100:

qfix:~# echo 100 >/proc/sys/vm/overcommit_ratio

and run the program again:

geschke@qfix:~$ ./a.out
got 1 * 512MB
got 2 * 512MB
got 3 * 512MB
got 4 * 512MB
got 5 * 512MB
got 6 * 512MB
malloc failure after 6 * 512 MB

That is more than 3GB, but I would have expected to get at least
3.5GB, since the limit should now be the full 4052376 kB and the rest
of the system uses only about 340MB:

geschke@qfix:~$ free
             total       used       free     shared    buffers     cached
Mem:       4052376     344976    3707400          0          0      22472
-/+ buffers/cache:     322504    3729872
Swap:            0          0          0

Maybe this is due to a percentage reserved for the root user? In
strict overcommit mode the kernel does leave the last 3% (1/32) of
the limit for root.

However, if I set overcommit_ratio=110 (a limit of about 4457613 kB,
i.e. roughly 4.25GB) I get more than 3.5GB:

geschke@qfix:~$ ./a.out
got 1 * 512MB
got 2 * 512MB
got 3 * 512MB
got 4 * 512MB
got 5 * 512MB
got 6 * 512MB
got 7 * 512MB
malloc failure after 7 * 512 MB

However, now I tested this with a high usage of cached memory; I
did a lot of I/O first:

geschke@qfix:~$ free
             total       used       free     shared    buffers     cached
Mem:       4052376    1945512    2106864          0          0    1621200
-/+ buffers/cache:     324312    3728064
Swap:            0          0          0

A new run gives:

geschke@qfix:~$ ./a.out
got 1 * 512MB
got 2 * 512MB
got 3 * 512MB
got 4 * 512MB
got 5 * 512MB
got 6 * 512MB
got 7 * 512MB
malloc failure after 7 * 512 MB

and:

qfix:~# free
             total       used       free     shared    buffers     cached
Mem:       4052376     346928    3705448          0          0      26008
-/+ buffers/cache:     320920    3731456
Swap:            0          0          0

So cached memory is not really a problem for malloc: the page cache
is simply reclaimed as needed.

But since I am testing anyway, I tried what happens if a lot of
memory is already in use, so I opened a large file with "vi":

geschke@qfix:~$ free
             total       used       free     shared    buffers     cached
Mem:       4052376    1597168    2455208          0          0     391364
-/+ buffers/cache:    1205804    2846572
Swap:            0          0          0

Now I start the program again:

geschke@qfix:~$ ./a.out
got 1 * 512MB
got 2 * 512MB
got 3 * 512MB
got 4 * 512MB
got 5 * 512MB
malloc failure after 5 * 512 MB

Fine: the failure point simply moves down by the amount of memory
already in use. So it seems there is not really a problem with
increasing overcommit_ratio to 100 if there is no swap in the system
and one has set overcommit_memory=2.
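
To keep these settings across reboots, one would typically put them
into /etc/sysctl.conf:

vm.overcommit_memory = 2
vm.overcommit_ratio = 100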

Best regards

Dirk
--
+----------------------------------------------------------------------+
| Dr. Dirk Geschke / Plankensteinweg 61 / 85435 Erding                  |
| Telefon: 08122-559448 / Mobil: 0176-96906350 / Fax: 08122-9818106     |
| dirk@geschke-online.de / dirk@lug-erding.de / kontakt@lug-erding.de   |
+----------------------------------------------------------------------+