The D Programming Language [C++]


From: James Kanze on 8 Dec 2006 16:56

PeteK wrote:
> James Kanze wrote:
> > PeteK wrote:
> > > You can easily get rid of dangling pointers in C++ and turn them into
> > > zombies instead by simply using a bolt-on garbage collector. The
> > > language doesn't stop you doing that.

> > I'm not sure I understand your point. For the purposes of this
> > discussion, there are two fundamental types of data, those with
> > a determinate lifetime, and those with an indeterminate lifetime
> > (from the design point of view). For dynamically allocated
> > objects with an indeterminate lifetime, current C++ requires you
> > to explicitly use a delete expression, and make the lifetime
> > determinate (and risk dangling pointers). Java just does the
> > right thing. For dynamically allocated objects with a
> > determinate lifetime, C++ has a standard "name" for the function
> > terminating the object's lifetime, the destructor. It's a little
> > weird, in that it doesn't have the normal function call syntax,
> > but big deal. Java lacks anything standard, but the convention
> > seems to be established to use the name "dispose()" (although
> > some of the standard classes use this name for other things).
> > In the end, it comes out to the same thing. Or almost---if you
> > want to, you can set state in the dispose() function in Java to
> > ensure that later use is detected. Immediately. To get this in
> > C++, you need something like Purify, and the runtime overhead is
> > high enough that you can't use it in production code. So Java
> > offers a safer solution.

> But in C++ you can
> a. Use a garbage collector
> b. Add a dispose function that sets the state, then calls delete (so
> you can run it non-GC)

> Now you've got exactly the same situation as in Java. C++ doesn't
> prevent you from doing this.

But you'll have to admit that it's a very strange way. In C++,
we have the standard "name", which is that of the destructor;
I'd put such verifications in the destructor.

But I agree that the problem isn't present (except perhaps for
some special cases) if you're using garbage collection with C++.
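A minimal sketch of the dispose-and-check idiom under discussion, not anyone's actual code from this thread. The class `Session`, its members, and the helper function are hypothetical, and `std::shared_ptr` merely stands in for a garbage collector by keeping the storage valid while a stale reference still exists:

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical class, for illustration only.
class Session {
public:
    explicit Session(std::string user) : user_(std::move(user)) {}

    // Ends the logical lifetime. The storage itself stays valid for as
    // long as some owner (here: a shared_ptr) still refers to it.
    void dispose() { disposed_ = true; }

    const std::string& user() const {
        if (disposed_) {
            throw std::logic_error("Session used after dispose()");
        }
        return user_;
    }

private:
    std::string user_;
    bool disposed_ = false;
};

// Returns true if a use after dispose() is caught immediately.
bool zombie_use_is_detected() {
    std::shared_ptr<Session> owner = std::make_shared<Session>("alice");
    std::shared_ptr<Session> stale = owner;  // reference held elsewhere
    owner->dispose();                        // logical end of life
    owner.reset();                           // storage survives via 'stale'
    try {
        stale->user();                       // zombie access
    } catch (const std::logic_error&) {
        return true;
    }
    return false;
}
```

The check only works because the memory cannot be recycled while `stale` exists; with a plain `delete` the same access would be undefined behaviour rather than a detectable error.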

> > > However in Java you are stuck
> > > with the GC system and there's no way to automate the detection of
> > > zombies (big assumption here by someone who's never used it).

> > You can detect them just as easily as in C++. The big
> > difference is that you don't need external instrumentation that
> > makes the detection too slow to be used in production code.

> You missed the word "automate". You need to add dispose functions (and
> the associated checks) by hand. Using purify is somewhat simpler.

Using Purify isn't all that simple, either:-). But the bigger
issue is that the Purify checks aren't there when you need them,
in production code.

> > > In principle it should be possible to pick up all potential
> > > zombies/dangling pointers in C++ by using a sufficiently clever
> > > debugging allocator.

> > I think some systems do this. The trick is to not make the
> > memory available for re-allocation as long as there is a pointer
> > to it still in existence, mark it as freed somehow, and then
> > instrument every single pointer dereference to check for the
> > mark. (It still misses dangling pointers to on stack objects,
> > of course.) The problem is that it has unacceptable runtime
> > cost; ...

> Unacceptable for production use, probably. Unacceptable for debugging
> etc. unlikely. Purify really slows things down, but on the odd
> occasion you need it it's really worth it.

Certainly. I very much like Purify. But I do like to have
important sanity checks in production code, as well.

Note that the issue only affects a few classes. Most objects do
not have (or at least don't need) a deterministic lifetime.

> Also, given that people don't seem to think that GC slows things down
> too much, I can't (off the top of my head) see why freed memory blocks
> can't simply be added to a list then put back into play in a big chunk.
> At this stage a GC-like scan could be run to see if any pointers to
> the deleted memory remain (with optional invalidation of the pointers
> to cause a PE on dereference).

That's certainly an option. I think that the Boehm collector
can be used more or less in this way; if not, it certainly
wouldn't be too difficult to modify it so that it could. About
the only thing you couldn't do with it would be to modify the
pointer values, since at present (and probably for ever), the
collector cannot know with 100% certainty what is a pointer, and
what isn't.
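The deferred-release idea can be sketched roughly as follows. This is an illustration under stated assumptions, not how the Boehm collector or Purify is implemented: the class name `QuarantineAllocator` and its batch limit are invented, and the pointer scan is only marked by a comment because a portable sketch cannot perform one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical debugging allocator; names and policy are illustrative.
class QuarantineAllocator {
public:
    explicit QuarantineAllocator(std::size_t limit) : limit_(limit) {}
    ~QuarantineAllocator() { flush(); }
    QuarantineAllocator(const QuarantineAllocator&) = delete;
    QuarantineAllocator& operator=(const QuarantineAllocator&) = delete;

    void* allocate(std::size_t bytes) { return std::malloc(bytes); }

    // Park the block instead of freeing it, so that its address cannot
    // be handed out again while stale pointers may still exist.
    void deallocate(void* block) {
        if (block == nullptr) return;
        quarantined_.push_back(block);
        if (quarantined_.size() >= limit_) flush();
    }

    // Put all parked blocks back into play in one batch. A real tool
    // would run its GC-like scan for surviving pointers at this point.
    void flush() {
        for (void* block : quarantined_) std::free(block);
        quarantined_.clear();
    }

    std::size_t quarantined() const { return quarantined_.size(); }

private:
    std::size_t limit_;
    std::vector<void*> quarantined_;
};
```

Choosing the batch limit is a trade-off: a larger quarantine catches stale pointers that are used later, at the cost of holding more memory.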

But at that point, why bother with the extra work? I don't
understand your point. If you're running a garbage collector,
why not use it generally? Take advantage of it, and save
yourself some work.

> > ...the standard C++ model requires that all objects have
> > explicit lifetimes, even when the design doesn't require it, so
> > you have to check every single pointer dereference, and not just
> > those where the object by design has a determinate lifetime.
> > And if you think of things like the implementation of a string
> > class, you'll realize that there are a lot of objects which,
> > like the char array in a string, don't need explicit lifetime.

> The thing is that all objects have a logical lifetime.

Nonsense. What's the logical lifetime of the char[] which
contains the characters in a string? In fact, very few objects
have a logical lifetime.

> Accessing them
> after their logical life is over is an error. Java choses to define
> this as "not an error" (although this is probably more to do with Java
> having GC). While it might appear that a string's char array doesn't
> require a specific lifetime, if you've somehow acquired a pointer into
> it then the array is kept alive long after the string is dead.

What does that mean, after the string is dead? The string is
never "dead", in any real sense of the word. At some point, it
ceases to be accessible, but it's not logically dead.

> If you
> use this pointer to access the char array then this is logically wrong,
> but you have absolutely no way of detecting it.

Why is it logically wrong?

> > The essential thing in being able to detect the problem, of
> > course, is not allowing memory to be reused as long as there is
> > still an existing pointer to it. Garbage collection, in sum.
> > (The Boehm collector is often used in this way, as a leak
> > detector, and, with additional instrumentation in user code, to
> > detect dangling pointers.)

> No, there are things you can't detect. If you have an array of ints
> that can legitimately take all integer values, how can you tell that
> you're pointing at an array that will no longer be updated?

But that's the same situation as in C++, or in every other
language. You expect values to be updated every second, say.
For some reason (program error, etc.), the update doesn't occur.
No language feature that I know will protect against that sort
of thing, and no external tool will detect it.

The problem isn't that the object is dead. The problem is that
two different parts of the program disagree as to what is
supposed to be happening with it.

> > > Admittedly this doesn't stop you assigning duff
> > > values to pointers, but that's the price you have to pay for using a
> > > system-level language.

> > There are several issues at stake. The fact that you can have
> > an uninitialized pointer, with undefined contents, can hardly be
> > considered a feature.

> I didn't say it was uninitialised, and even a default initialisation to
> NULL might be no help. After all, address zero was a perfectly valid
> address under DOS*. A system level language must allow the programmer
> to give a pointer any value that could possibly be valid on the target
> system. The problem is that very few programs require that level of
> freedom, but it's a price we have to pay.

> {*Actually I think making NULL equivalent to zero was a mistake. It
> should be a system-defined value.}

NULL is just a macro. A null pointer does contain a
system-defined value. It's not required to be zero, and there
have been systems where it wasn't zero.

What is required is that an integral constant expression which
evaluates to 0 will convert implicitly to whatever the null
pointer value is. A rather peculiar rule, to put it mildly,
since the rules for converting an int to a pointer depend on
whether the int is a constant or not, and if it is a constant,
the value of that constant.
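A small sketch of that asymmetry, assuming a C++11 or later compiler for `static_assert` and `<type_traits>`; the function name `from_literal` is made up for the example:

```cpp
#include <cassert>
#include <type_traits>

// The literal 0 is a null pointer constant, so it converts implicitly
// to the null pointer value, whatever bit pattern that value has.
int* from_literal() {
    int* p = 0;
    return p;
}

// An int that is not a constant has no implicit conversion to a pointer
// type, even when it happens to hold 0 at run time.
static_assert(!std::is_convertible<int, int*>::value,
              "int does not implicitly convert to int*");

// int* from_variable(int zero) { return zero; }  // ill-formed: rejected by the compiler
```

An explicit `reinterpret_cast` from a run-time integer does compile, but its mapping is implementation-defined and is not guaranteed to yield the null pointer value.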

--
James Kanze (GABI Software) email:james.kanze(a)gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: James Kanze on 8 Dec 2006 16:57

Al wrote:

> Niklas Matthies wrote:
> <snip>
> > It's not as private as one might assume; with default security
> > settings you can access it via reflection. For example it's possible
> > to corrupt a String object by replacing its char[] value.

> Sure, you can use reflection to do interesting things. But that's a
> whole other can of worms. It isn't just restricted to private data. If
> Java's reflection is anything like C# then it can be used to bypass a
> whole lot of things that the "static" compiler wouldn't have allowed.
> This is fine. No /basic/ language invariants have been violated.

> In addition, I believe most of these things _are_ covered under the
> security principals, so you could simply restrict code access if you
> want to avoid them.

> One other thing, when you say it's possible to "corrupt" a String
> object, what does that mean, exactly? Do you mean that it is somehow
> possible to corrupt the virtual machine's memory integrity? I highly
> doubt that.

Good question. String is normally an immutable object, and
Java's security model counts on this. For example, you pass a
string to a function which first verifies it for correctness
(e.g.: it's a URL, and the function verifies that you, the user,
have a right to access this URL), then executes some more or
less dangerous action. Like everything else in Java, String is
passed by reference; if you could, in another thread, modify the
contents of the string after the security checks, but before the
action, you could violate security.

You might even be able to violate the memory model. String is a
very special case in Java, because it is not just a library
component; it is also part of the language. As such, the VM
"knows" that it is immutable, and could conceivably just do the
bounds check once, on entering the function, and count on the
length not changing. I don't think that this would be legal,
since I think the compiler is required to treat the String like
any other type, but I'm not sure of it.

--
James Kanze (GABI Software) email:james.kanze(a)gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


From: Francis Glassborow on 8 Dec 2006 17:01

In article <1165518686.608714.296020(a)80g2000cwy.googlegroups.com>, PeteK
<pete_k_1955(a)yahoo.co.uk> writes
>I didn't say it was uninitialised, and even a default initialisation to
>NULL might be no help.
Why do you say that in the context of pointers? Using a compile time
zero (even wrapped up as a NULL) to assign to or initialise a pointer
results in an implementation defined value which is often 'address 0'
but is not required to be.

The real problem is when we actually want to use 'address 0' as might be
the case in a DOS system.

--
Francis Glassborow ACCU
Author of 'You Can Do It!' and "You Can Program in C++"
see http://www.spellen.org/youcandoit
For project ideas and contributions:
http://www.spellen.org/youcandoit/projects


From: David Abrahams on 9 Dec 2006 02:12

Gerhard Menzl <clcppm-poster(a)this.is.invalid> writes:

> peter koch larsen wrote:
>
> You stated that an empty exception specification guarantees the
> function will not throw anything. But what it actually guarantees is
> that no exception exits the function.

Peter's statement was perfectly correct and not misleading at all.

What is the difference between calling these three fs?

void f() throw()
{
    throw 1;
}

void f()
{
    unexpected(); // never returns; calls terminate
}

void f()
{
    g(); // might itself call unexpected or terminate
}

Answer: none.

> Your choice of terms was at least
> misleading: someone unfamiliar with the C++ exception mechanism could
> easily interpret it as describing a compile-time check, which is
> precisely what C++ does not offer. To avoid this confusion, especially
> when comparing C++ with Java, which does have static checks, I think it
> is important to distinguish between "cannot throw" and "will abort if it
> throws".

What's the difference between a function that "will abort if it
throws" and one that "might abort (for whatever reason)?"
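The point both sides agree on, that the specification is enforced at run time rather than checked at compile time, can be sketched as below. The example uses `noexcept`, the C++11 successor to an empty `throw()` specification, which is an assumption about the toolchain since the thread predates it; the function names are hypothetical.

```cpp
#include <cassert>
#include <stdexcept>

void thrower() { throw std::runtime_error("boom"); }

// Accepted by the compiler although the body can throw: the promise is
// enforced at run time (std::terminate is called if an exception tries
// to leave), not verified statically.
void guarded() noexcept { thrower(); }

// What a caller can observe at compile time is only the promise itself.
static_assert(noexcept(guarded()), "nothing propagates out of guarded()");
static_assert(!noexcept(thrower()), "no such promise for thrower()");
```

Actually calling `guarded()` would end the program, which from the caller's side is the same outcome as any other function that may abort.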

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From: PeteK on 10 Dec 2006 00:43

James Kanze wrote:
> > PeteK wrote:
[snip]
>> > > Now you've got exactly the same situation as in Java. C++ doesn't
>> > > prevent you from doing this.
> >
> > But you'll have to admit that it's a very strange way. In C++,
> > we have the standard "name", which is that of the destructor;
> > I'd put such verifications in the destructor.
> >
Yes, but I wasn't saying that you should actually do this. I was
simply pointing out that in C++ you can get the same behaviour as in
Java with the same programmer overhead. Hence I don't see why this is
such a big plus for Java. It's simply a by-product of using GC.

> > But I agree that the problem isn't present (except perhaps for
> > some special cases) if you're using garbage collection with C++.
> >

> >
> > Note that the issue only affects a few classes. Most objects do
> > not have (or at least don't need) a deterministic lifetime.
> >

I see this comment a lot, but I suspect that people think that the only
objects that require deterministic lifetimes are those that manage
external resources. However, they tend to forget that it is equally
important that once the logical owner of an object kills it off,
reading or writing to it is also an error. For example, if your flight
control system is reading its height from one place, but the current
height is being written to another, then you could end up flying into a
mountain.

In fact probably the safest thing to do is run a hybrid system:
Everything you perform checks on is GC.
Everything you don't isn't.

Of course, since the checks are put in by the programmer there's always
a chance that someone will screw up/miss out a check anyway.

> >
> > But at that point, why bother with the extra work. I don't
> > understand your point. If you're running a garbage collector,
> > why not use it generally. Take advantage of it, and save
> > yourself some work.
> >
I've nothing against GC as the base-level memory manager, I just prefer
to explicitly manage the lifetimes of all my classes etc.

I know that you and Andrei appear to have lots of pointer cycles in
your code, so perhaps you're over-sensitive to the dangling pointer
problem. The last time I had a cycle was probably at least 8 years
ago, and it was obvious and easily broken.

OTOH when I use my smart pointers to manage memory-only resources there
have been many times when I've had to add in non-memory resources
later. If I'd relied on GC to clean everything up this would have meant
that I'd have to go back over the whole program and introduce smart
pointers to manage these resources. Using smart pointers throughout
means no extra work, no additional checks, no more worries.
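A rough sketch of that scenario, with invented names: `Handle` stands in for any non-memory resource, and `Record` is a class that at first held only memory and later gained a `Handle` member. `std::shared_ptr` is used here as one possible smart pointer; the post does not say which kind the author uses.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for a non-memory resource (file, socket, lock).
struct Handle {
    static int open_count;
    Handle() { ++open_count; }
    ~Handle() { --open_count; }
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;
};
int Handle::open_count = 0;

// Originally this held only the buffer; the Handle member was added
// later. Code that owns a Record through a smart pointer is unchanged,
// because the destructor still runs when the last owner goes away.
struct Record {
    std::vector<char> buffer = std::vector<char>(64);
    Handle handle;
};

int handles_open_after_last_owner_leaves() {
    {
        std::shared_ptr<Record> a = std::make_shared<Record>();
        std::shared_ptr<Record> b = a;  // second owner
        assert(Handle::open_count == 1);
    }  // last owner gone: Record destroyed, Handle released here
    return Handle::open_count;
}
```

Under a collector without deterministic finalisation, the release of `Handle` would instead need an explicit call at every place that ends the object's logical lifetime.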

> >
>> > > The thing is that all objects have a logical lifetime.
> >
> > Nonsense. What's the logical lifetime of the char[] which
> > contains the characters in a string? In fact, very few objects
> > have a logical lifetime.
> >

The logical lifetime is generally bound to the logical lifetime of the
string (if we skip read-only slicing etc.). The logical lifetime of the
string is bound to the object that contains it or the block of code
that uses it. In C++ we can extend that logical lifetime by using
smart pointers and the like, but there is still a defined point when
the object is logically dead. If we write a logically identical
program in Java then the logical lifetimes of the objects should be
identical. Accessing logically dead data then becomes an error. In
Java you can only detect the error if you instrument the objects
yourself. In C++ you have that choice, but you also have the
opportunity to let the runtime system/specialist tools detect the
problem too.

>> > > Accessing them
>> > > after their logical life is over is an error. Java chooses to define
>> > > this as "not an error" (although this is probably more to do with Java
>> > > having GC). While it might appear that a string's char array doesn't
>> > > require a specific lifetime, if you've somehow acquired a pointer into
>> > > it then the array is kept alive long after the string is dead.
> >
> > What does that mean, after the string is dead? The string is
> > never "dead", in any real sense of the word. At some point, it
> > ceases to be accessible, but it's not logically dead.
> >

But this is the point. Logical lifetime and accessibility are not the
same. If they were you would never get the dangling pointer problem.

Similarly, if I resize an array so that it's now one element shorter
then the previous last element is logically dead. However the chances
are that it will remain in memory at exactly the same point and I can
probably recover its original value. But the fact that it's still
accessible changes nothing. It's still logically dead, as it's no
longer part of the array.
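With `std::vector` as the array (an assumption; the post names no container), the distinction between "still in memory" and "still part of the array" looks roughly like this. The struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct ShrinkReport {
    std::size_t size_after;
    std::size_t capacity_after;
};

ShrinkReport drop_last_element() {
    std::vector<int> values{10, 20, 30};
    values.pop_back();  // 30 is no longer an element of the array
    // The storage normally stays reserved (capacity does not shrink),
    // but index 2 is now outside the valid range [0, size()):
    // values.at(2) would throw std::out_of_range, and values[2] is
    // not something the language lets you rely on.
    return {values.size(), values.capacity()};
}
```

The bytes that held 30 are probably still there, yet nothing in the container's interface treats them as a live element any more.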

>> > > If you
>> > > use this pointer to access the char array then this is logically wrong,
>> > > but you have absolutely no way of detecting it.
> >
> > Why is it logically wrong?
> >

See above. It's basically the same situation.

>>> > > > The essential thing in being able to detect the problem, of
>>> > > > course, is not allowing memory to be reused as long as there is
>>> > > > still an existing pointer to it. Garbage collection, in sum.
>>> > > > (The Boehm collector is often used in this way, as a leak
>>> > > > detector, and, with additional instrumentation in user code, to
>>> > > > detect dangling pointers.)
> >
>> > > No, there are things you can't detect. If you have an array of ints
>> > > that can legitimately take all integer values, how can you tell that
>> > > you're pointing at an array that will no longer be updated?
> >
> > But that's the same situation as in C++, or in every other
> > language. You expect values to be updated every second, say.
> > For some reason (program error, etc.), the update doesn't occur.
> > No language feature that I know will protect against that sort
> > of thing, and no external tool will detect it.
> >

Maybe I was a bit unclear here. The point I was trying to make is that
if the updater is no longer writing to it then in C++ it would delete
the array, but in Java it would leave it lying around. In C++ this may
be caught. In Java it can't be caught.

> > The problem isn't that the object is dead. The problem is that
> > two different parts of the program disagree as to what is
> > supposed to be happening with it.
> >

And I don't write incorrect programs. It's just that the compiler
writer and I disagree about what the compiler output should be :-)
> >
>> > > {*Actually I think making NULL equivalent to zero was a mistake. It
>> > > should be a system-defined value.}
> >
> > NULL is just a macro. A null pointer does contain a
> > system-defined value. It's not required to be zero, and there
> > have been systems where it wasn't zero.
> >
> > What is required is that an integral constant expression which
> > evaluates to 0 will convert implicitly to whatever the null
> > pointer value is. A rather peculiar rule, to put it mildly,
> > since the rules for converting an int to a pointer depend on
> > whether the int is a constant or not, and if it is a constant,
> > the value of that constant.
> >

I'm well aware of this (though I've never actually worked on a system
where the value used wasn't zero). Maybe I should have written it as
"null". One of my pet hates is people writing:

if ( p )
rather than
if( p != NULL )

With a system-defined value they couldn't (portably) do this. But, as
you say, it's just a weird rule anyway.

