From: pfultz2 on
The best way to accomplish COW is to have all strings be immutable and
then have a separate mutable string class for when you want to modify
the string. This wouldn't necessarily get rid of all useless copies,
because there would still be copying when converting from mutable to
immutable, but now you would have to explicitly declare that you will
be modifying the string, which I believe is a better way to program
anyway. Also, in multithreaded environments, a garbage collector could
then be used for immutable string references instead of dealing with
synchronized reference counts.
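
A minimal sketch of that split (the class names and the shared_ptr-based
sharing are my own illustration, not any particular library's design):

#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Immutable string: copying shares the underlying buffer, and there
// is no way to change the characters after construction.
class istring {
    std::shared_ptr<const std::string> data_;
public:
    explicit istring(std::string s)
        : data_(std::make_shared<const std::string>(std::move(s))) {}
    char operator[](std::size_t i) const { return (*data_)[i]; }
    std::size_t size() const { return data_->size(); }
};

// Mutable companion: the only place where modification happens.
// The explicit freeze() is the one conversion that still copies/moves.
class string_builder {
    std::string buf_;
public:
    string_builder& append(const std::string& s) { buf_ += s; return *this; }
    istring freeze() { return istring(std::move(buf_)); }
};

Copying an istring is then just a shared_ptr copy; replace the
shared_ptr with a tracing collector and even the synchronized
reference count disappears, which is the point above.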


--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: Jens Schmidt on
Martin B. wrote:

> When I first heard about overcommit I thought it must be a joke.
> Now I'm pretty much convinced it is an aberration.

It's a very common technique. See e.g. database systems: locks are
granted whenever possible. Only on deadlock does the system say "Haha,
your lock isn't worth anything."

> I mean - it doesn't even kill the application that caused the OOM to
> trigger, it just chooses a random process!

This is an implementation issue.
But: it isn't a specific process that triggers OOM. It is the *sum* of
all requested memory that is too high.
Also, computers are good at loops. If the killed process were
pre-determined, the same thing would happen again next time, e.g. in a
few milliseconds. With a random victim we at least have a chance of
making some progress.

> With regard to C/C++ and overcommit I have one question though:
> Say we have a process that tries to allocate a 500MB buffer on a 32 bit
> system.
> If the address-space of the process is sufficiently fragmented then this
> can never succeed. Will malloc return 0 in this case, or will it also
> return a 'valid' address that will just crash the machine if accessed?

Overcommit is an OS issue; the library knows nothing about it. malloc
must still find a contiguous range of the process's address space for
the buffer, and if fragmentation leaves no such range, it returns 0 up
front. Overcommit only affects whether the pages behind a successfully
returned range are actually backed by memory.
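
To see the distinction in code (a sketch, reusing the 500MB figure from
the question above):

#include <cstdio>
#include <cstdlib>
#include <cstring>

int main() {
    const std::size_t n = 500u * 1024 * 1024;
    // On a sufficiently fragmented 32-bit address space this fails
    // up front: no contiguous range of addresses exists.
    char* p = static_cast<char*>(std::malloc(n));
    if (!p) {
        std::puts("malloc returned 0: no contiguous address range");
        return 1;
    }
    // Under overcommit the real test comes only when the pages are
    // touched; this write is where an OOM kill could strike.
    std::memset(p, 0, n);
    std::free(p);
    return 0;
}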
--
Greetings,
Jens Schmidt



From: Miles Bader on
"Martin B." <0xCDCDCDCD(a)gmx.at> writes:
> With regard to C/C++ and overcommit I have one question though:
> Say we have a process that tries to allocate a 500MB buffer on a
> 32 bit system. If the address-space of the process is
> sufficiently fragmented then this can never succeed. Will malloc
> return 0 in this case, or will it also return a 'valid' address
> that will just crash the machine if accessed?

Of course it will return 0 in such a case... (what a silly question!)

-Miles

--
Erudition, n. Dust shaken out of a book into an empty skull.


From: Herb Sutter on
On Wed, 21 Apr 2010 01:14:50 CST, Mathias Gaunard <loufoque(a)gmail.com>
wrote:
>On 20 Apr, 20:42, Öö Tiib <oot...(a)hot.ee> wrote:
>
>> Interesting. How does it break C and C++ completely?
>
>By having malloc or new return a pointer to some memory that has not
>yet been allocated, and thus accessing it later may result in a crash,
>instead of returning null or throwing an exception.

See also:

To New, Perchance To Throw, Part 2 (CUJ, May 2001)
http://www.gotw.ca/publications/mill16.htm

where it talks about "lazy allocation."

Here's a relevant snip:

---
Note that, if new uses the operating system's facilities directly,
then new will always succeed but any later innocent code like buf[100]
= 'c'; can throw or fail or halt. From a Standard C++ point of view,
both effects are nonconforming, because the C++ standard requires that
if new can't commit enough memory it must fail (this doesn't), and
that code like buf[100] = 'c' shouldn't throw an exception or
otherwise fail (this might).
[...]
The main problem with this approach, besides that it makes C++
standards conformance difficult, is that it makes program correctness
in general difficult, because any access to successfully-allocated
dynamic memory might cause the program to halt. That's just not good.
If allocation fails up front, the program knows that there's not
enough memory to complete an operation, and then the program has the
choice of doing something about it, such as trying to allocate a
smaller buffer size or giving up on only that particular operation, or
at least attempting to clean up some things by unwinding the stack.
But if there's no way to know whether the allocation really worked,
then any attempt to read or write the memory may cause a halt - and
that halt can't be predicted, because it might happen on the first
attempt to use part of the buffer, or on the millionth attempt after
lots of successful operations have used other parts of the buffer.

On the surface, it would appear that our only way to defend against
this is to immediately write to (or read from) the entire block of
memory to force it to really exist. For example:
---
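
(The article goes on to show code for exactly that; the quote cuts off
before it. A minimal sketch of the idea, with the page size assumed
rather than queried from the OS:)

#include <cstddef>

// Write one byte per page so that lazily allocated memory is forced
// into existence now, instead of at some unpredictable later access.
void touch_pages(char* buf, std::size_t size) {
    const std::size_t page = 4096; // assumption; query the OS in real code
    for (std::size_t i = 0; i < size; i += page)
        buf[i] = 0;
    if (size != 0)
        buf[size - 1] = 0; // ensure the last page is touched too
}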

Herb

---
Herb Sutter (herbsutter.wordpress.com) (www.gotw.ca)

Convener, SC22/WG21 (C++) (www.gotw.ca/iso)
Architect, Visual C++ (www.gotw.ca/microsoft)



From: Andre Kaufmann on
Herb Sutter wrote:
> On Wed, 21 Apr 2010 01:14:50 CST, Mathias Gaunard <loufoque(a)gmail.com>
> wrote:

> See also:
>
> To New, Perchance To Throw, Part 2 (CUJ, May 2001)
> http://www.gotw.ca/publications/mill16.htm
>
> where it talks about "lazy allocation."
>
> Here's a relevant snip:
>
> [...]

> If allocation fails up front, the program knows that there's not
> enough memory to complete an operation, and then the program has the
> choice of doing something about it, such as trying to allocate a
> smaller buffer size or giving up on only that particular operation, or
> at least attempting to clean up some things by unwinding the stack.
> But if there's no way to know whether the allocation really worked,
> then any attempt to read or write the memory may cause a halt - and
> that halt can't be predicted, because it might happen on the first
> attempt to use part of the buffer, or on the millionth attempt after
> lots of successful operations have used other parts of the buffer.

Isn't that comparable to the dynamic allocation used by most of the STL containers?

(silly) Example:

#include <list>
#include <map>
#include <set>
#include <string>
#include <utility>
using namespace std;

typedef map<string, pair<list<string>, set<string>>> T;

T list1;
T list2;

......
list2 = list1;

Copying the complete container or adding elements might fail part-way
through, if the expansion of one of the elements fails. It's hard to
predict whether the available memory is sufficient for a copy/add
operation (in this case) due to memory fragmentation etc.

To make a long story short:

I have the feeling that most C++ applications can't handle
insufficient-memory conditions predictably in every situation. For huge
contiguous memory blocks, yes; for node-based containers like the one
above, hardly.
Or, asking the other way round, is there a huge difference in recovering from:

list2 = list1;

and

list1["hello"] = element;

?
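
For what it's worth, a sketch of what recovery looks like in both
cases, assuming the failure actually surfaces as std::bad_alloc - which
is precisely what overcommit takes away:

#include <iostream>
#include <map>
#include <new>
#include <string>
using namespace std;

typedef map<string, string> M; // simplified stand-in for T above

int main() {
    M list1, list2;
    list1["hello"] = "world";
    try {
        // Copy assignment may throw part-way through; list2 is then
        // valid but unspecified, while list1 is untouched.
        list2 = list1;
    } catch (const bad_alloc&) {
        cerr << "copy failed\n";
    }
    try {
        // The insertion done by operator[] has the strong guarantee:
        // the map is unchanged if the insertion itself throws.
        list1["bye"] = "element";
    } catch (const bad_alloc&) {
        cerr << "insert failed\n";
    }
}

So the recovery code itself is no harder in one case than the other;
the trouble is that under overcommit neither catch block may ever run,
and the failure shows up later as a kill on first touch.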



Andre
