From: James Kanze on
On Mar 26, 12:25 am, "Leigh Johnston" <le...(a)i42.co.uk> wrote:
> "George Neuner" <gneun...(a)comcast.net> wrote in message

> news:rq1nq5tskd51cmnf585h1q2elo28euh2kn(a)4ax.com...
> <snip>
>> 'volatile' is necessary for certain uses but is not sufficient for
>> (al)most (all) uses. I would say that for expert uses, some are
>> portable and some are not. For non-expert uses ... I would say that
>> most uses contemplated by non-experts will be neither portable nor
>> sound.

> Whether or not the store that is guaranteed to be emitted by
> the compiler due to the presence of volatile propagates to L1
> cache, L2 cache or main memory is irrelevant as far as
> volatile and multi-threading is concerned as long as CPU
> caches remain coherent.

That depends on the architecture and what the compiler actually
does in the case of volatile. Some of the more recent
processors have a separate cache for each core, at least at the
lowest level, and most access memory through a pipeline which is
unique to the core.

> You could argue that because of this volatile is actually more
> useful for multi-threading than for its more traditional use
> of performing memory mapped I/O with modern CPU architectures.

You'll have to explain that, since none of the compilers I use
generate any sort of fence or membar when volatile is used, and
the processors definitely require it.

--
James Kanze

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: Leigh Johnston on
"James Kanze" <james.kanze(a)gmail.com> wrote in message news:ddf75ee4-b26b-46a0-af32-99ce34954669(a)k19g2000yqn.googlegroups.com...
> On Mar 26, 12:25 am, "Leigh Johnston" <le...(a)i42.co.uk> wrote:
>> "George Neuner" <gneun...(a)comcast.net> wrote in message
>
>> news:rq1nq5tskd51cmnf585h1q2elo28euh2kn(a)4ax.com...
>> <snip>
>>> 'volatile' is necessary for certain uses but is not sufficient for
>>> (al)most (all) uses. I would say that for expert uses, some are
>>> portable and some are not. For non-expert uses ... I would say that
>>> most uses contemplated by non-experts will be neither portable nor
>>> sound.
>
>> Whether or not the store that is guaranteed to be emitted by
>> the compiler due to the presence of volatile propagates to L1
>> cache, L2 cache or main memory is irrelevant as far as
>> volatile and multi-threading is concerned as long as CPU
>> caches remain coherent.
>
> That depends on the architecture and what the compiler actually
> does in the case of volatile. Some of the more recent
> processors have a separate cache for each core, at least at the
> lowest level, and most access memory through a pipeline which is
> unique to the core.
>
>> You could argue that because of this volatile is actually more
>> useful for multi-threading than for its more traditional use
>> of performing memory mapped I/O with modern CPU architectures.
>
> You'll have to explain that, since none of the compilers I use
> generate any sort of fence or membar when volatile is used, and
> the processors definitely require it,

{ quoted signature removed; please remove such extra stuff yourself.
-mod }

I would expect the following property of the volatile keyword on VC++ to be
a common interpretation of the semantics of volatile for most C++ compilers:

"Objects declared as volatile are not used in certain optimizations because
their values can change at any time. The system always reads the current
value of a volatile object at the point it is requested, even if a previous
instruction asked for a value from the same object. Also, the value of the
object is written immediately on assignment. "

It should be obvious how this property can be useful when writing
multi-threaded code, not always useful in isolation perhaps but certainly
useful when used in conjunction with other threading constructs such as
mutexes and fences. Depending on the compiler/platform and on the actual
use-case volatile on its own might not be enough: from what I can tell VC++
volatile does not emit fence instructions for x86 yet the above property
still stands (and there are rare cases when memory barriers are needed on
x86, see
http://bartoszmilewski.wordpress.com/2008/11/05/who-ordered-memory-fences-on-an-x86/).
I agree that this is mostly an implementation specific issue and the current
C++ standard is threading agnostic however saying volatile has absolutely no
use in multi-threading programming is incorrect.

Performance is often cited as another reason not to use volatile; however,
the use of volatile can actually help with multi-threading performance, as
you can perform a safe lock-free check before taking a more expensive lock.
This all depends on the use-case in question, and the only volatile I have in
my entire codebase is for just such a check (for this use-case ordering does
not matter, so no fences are needed).
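
A minimal sketch of the kind of "cheap check before an expensive lock"
described above. It is not the poster's code: the Worker class and its
members are hypothetical, and it assumes a C++0x-style std::atomic with
relaxed ordering in place of volatile, since the mutex (not the flag) is
what orders access to the protected data.

```cpp
#include <atomic>
#include <mutex>
#include <vector>

// Hypothetical "threadable" worker; every name here is illustrative.
class Worker {
public:
    enum Status { Idle, Running, Finished };

    // Called by the worker thread when it has a result.
    void finish(int result) {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            results_.push_back(result);
        }
        // The flag is only a hint; relaxed ordering is enough because the
        // data itself is always read under the mutex.
        status_.store(Finished, std::memory_order_relaxed);
    }

    // Cheap check first: take the lock only if there may be work to collect.
    bool try_collect(std::vector<int>& out) {
        if (status_.load(std::memory_order_relaxed) != Finished)
            return false;                       // fast path, no lock taken
        std::lock_guard<std::mutex> guard(mutex_);
        out = results_;
        return true;
    }

private:
    std::atomic<Status> status_{Idle};
    std::mutex mutex_;
    std::vector<int> results_;
};
```

The atomic guarantees that the flag is really re-read on every call, which
is the property being asked of volatile here, without relying on
implementation-specific behaviour.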

I agree with what Andy said elsewhere in this thread:

"Is volatile sufficient - absolutely not.
Portable - hardly.
Necessary in certain conditions - absolutely."

/Leigh


From: Herb Sutter on
On Sun, 28 Mar 2010 15:25:46 CST, James Kanze <james.kanze(a)gmail.com>
wrote:
>On Mar 26, 12:33 am, Herb Sutter <herb.sut...(a)gmail.com> wrote:
>> Please remember this: Standard ISO C/C++ volatile is useless
>> for multithreaded programming. No argument otherwise holds
>> water; at best the code may appear to work on some
>> compilers/platforms, including all attempted counterexamples
>> I've seen on this thread.
>
>I agree with you in principle, but do be careful as to how you
>formulate this. Standard ISO C/C++ is useless for multithreaded
>programming, at least today. With or without volatile. And in
>Standard ISO C/C++, volatile is useless for just about anything;

All of the above is still true in draft C++0x and C1x, both of which
have concurrency memory models, threads, and mutexes.

>it was always intended to be mainly a hook for implementation
>defined behavior, i.e. to allow things like memory-mapped IO
>while not imposing excessive loss of optimizing possibilities
>everywhere.

Right. And is therefore (still) deliberately underspecified.

>In theory, an implementation could define volatile in a way that
>would make it useful in multithreading---I think Microsoft once
>proposed doing so in the standard.

Yes, back in 2006 I briefly agreed with that before realizing why it
was wrong (earlier in this thread you correctly said I supported it
and then stopped doing so).

>In my opinion, this sort of
>violates the original intention behind volatile, which was that
>volatile is applied to a single object, and doesn't affect other
>objects in the code. But it's certainly something you could
>argue both ways.

No, it's definitely wrong. Briefly, volatile and atomic<> have two
very different purposes, and they impose similar (but different)
constraints. The pitfall in making them both be the same thing (e.g.,
extending volatile to make it strong enough to serve the needs of
atomic<>s as well) is that you end up with a single thing that can be
used for both purposes but is necessarily suboptimal for either one.
That is, you'll nearly always only be using a given variable for one
of the two uses at a time, and if you're using it for hardware access
it'll be a slower volatile because it also has the optimization
restrictions of an atomic<>, and if you're using it for inter-thread
communication it'll be a slower atomic<> because it also has the
optimization restrictions of a volatile. If I write this up I'll
include some examples.
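
As a rough illustration of the two purposes (not the examples promised
above), the sketch below puts them side by side. The names are hypothetical,
and fake_device_register is only a stand-in: a real memory-mapped register
would sit at a fixed, platform-specific address.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Stand-in for a memory-mapped device register (hypothetical).
volatile std::uint32_t fake_device_register = 0;

// Hardware-style access: both stores must be emitted even though the first
// one is immediately overwritten, and the final read must be a real load.
std::uint32_t send_two_commands() {
    fake_device_register = 0x1;   // e.g. "reset"
    fake_device_register = 0x2;   // e.g. "start"
    return fake_device_register;
}

// Inter-thread communication: atomic<> provides an indivisible
// read-modify-write (and ordering when requested), which volatile does
// not promise.
int count_with_threads(int threads, int increments_per_thread) {
    std::atomic<int> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i)
                counter.fetch_add(1, std::memory_order_relaxed);
        });
    }
    for (auto& th : pool) th.join();
    return counter.load();
}
```

With a plain volatile int in place of the atomic counter, the increments
from different threads could be lost; with an atomic in place of the
register, the compiler would be free to merge the two stores.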

> [...]
>> No. The reason that can't use volatiles for synchronization is that
>> they aren't synchronized (QED).
>
>:-). And the reason they're not synchronized is that
>synchronization involves more than one variable, and that it was
>never the intent of volatile to involve more than one variable.

That's part of it, yes.

>(On a lot of modern processors, however, it would be impossible
>to fully implement the original intent of volatile without
>synchronization. The only instructions available on a Sparc,
>for example, to ensure that a store instruction actually results
>in a write to an external device is a membar. And that
>synchronizes *all* accesses of the given type.)
>
> [...]
>> (and it was a mistake to try to add those
>> guarantees to volatile in VC++).
>
>Just curious: is that Microsoft talking, or Herb Sutter (or
>both)?

Both. It was well-intentioned and seemed like a good idea until the
"ugh it pessimizes both uses" problem was understood. FWIW, a number
of people in WG21 suggested combining the two, hence Hans' and Nick's
paper on why volatile shouldn't be strengthened. (That paper is
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2016.html and
I think there are additional reasons against it in addition to the
good ones they give.)

Herb


---
Herb Sutter (herbsutter.wordpress.com) (www.gotw.ca)

Convener, SC22/WG21 (C++) (www.gotw.ca/iso)
Architect, Visual C++ (www.gotw.ca/microsoft)


From: James Kanze on
On Mar 29, 7:45 am, "Leigh Johnston" <le...(a)i42.co.uk> wrote:
> "James Kanze" <james.ka...(a)gmail.com> wrote in
> message news:ddf75ee4-b26b-46a0-af32-99ce34954669(a)k19g2000yqn.googlegroups.com...
>> On Mar 26, 12:25 am, "Leigh Johnston" <le...(a)i42.co.uk> wrote:
>>> "George Neuner" <gneun...(a)comcast.net> wrote in message

>>> news:rq1nq5tskd51cmnf585h1q2elo28euh2kn(a)4ax.com...
>>> <snip>
>>>> 'volatile' is necessary for certain uses but is not
>>>> sufficient for (al)most (all) uses. I would say that for
>>>> expert uses, some are portable and some are not. For
>>>> non-expert uses ... I would say that most uses
>>>> contemplated by non-experts will be neither portable nor
>>>> sound.

>>> Whether or not the store that is guaranteed to be emitted
>>> by the compiler due to the presence of volatile propagates
>>> to L1 cache, L2 cache or main memory is irrelevant as far
>>> as volatile and multi-threading is concerned as long as CPU
>>> caches remain coherent.

>> That depends on the architecture and what the compiler
>> actually does in the case of volatile. Some of the more
>> recent processors have a separate cache for each core, at
>> least at the lowest level, and most access memory through a
>> pipeline which is unique to the core.

>>> You could argue that because of this volatile is actually
>>> more useful for multi-threading than for its more
>>> traditional use of performing memory mapped I/O with modern
>>> CPU architectures.

>> You'll have to explain that, since none of the compilers I
>> use generate any sort of fence or membar when volatile is
>> used, and the processors definitely require it,

> I would expect the following property of the volatile keyword
> on VC++ to be a common interpretation of the semantics of
> volatile for most C++ compilers:

> "Objects declared as volatile are not used in certain
> optimizations because their values can change at any time. The
> system always reads the current value of a volatile object at
> the point it is requested, even if a previous instruction
> asked for a value from the same object. Also, the value of the
> object is written immediately on assignment. "

That's generally the case. For some very imprecise meaning of
"reads" and "writes". On the compilers I have access to, the
meaning is no more than "executes a machine level load or store
instruction". Which is practically meaningless for anything
useful on a modern processor.
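
To make the point about fences concrete, here is a small sketch of handing
one value from one thread to another. It assumes the draft C++0x atomics
library (std::atomic and std::atomic_thread_fence); the function and
variable names are made up for the example. The two fences are exactly what
a bare load or store instruction does not provide.

```cpp
#include <atomic>
#include <thread>

int payload = 0;                    // ordinary, non-atomic data
std::atomic<bool> ready{false};     // flag used to hand the data over

void producer() {
    payload = 42;
    // Release fence: the write to payload may not be reordered after
    // the store to the flag.
    std::atomic_thread_fence(std::memory_order_release);
    ready.store(true, std::memory_order_relaxed);
}

int consumer() {
    while (!ready.load(std::memory_order_relaxed))
        std::this_thread::yield();
    // Acquire fence: the read of payload may not be reordered before
    // the load of the flag.
    std::atomic_thread_fence(std::memory_order_acquire);
    return payload;
}

// Runs one producer and one consumer and returns what the consumer saw.
int run_handoff() {
    int seen = 0;
    std::thread c([&seen] { seen = consumer(); });
    std::thread p(producer);
    p.join();
    c.join();
    return seen;
}
```

Declaring ready as volatile bool and dropping the fences would still emit
the loads and stores, but nothing would then order them with respect to
payload.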

> It should be obvious how this property can be useful when
> writing multi-threaded code, not always useful in isolation
> perhaps but certainly useful when used in conjunction with
> other threading constructs such as mutexes and fences.

It isn't, at least not to me. Perhaps if you could come up with
a small example of where it might be useful.

> Depending on the compiler/platform and on the actual use-case
> volatile on its own might not be enough: from what I can tell
> VC++ volatile does not emit fence instructions for x86 yet the
> above property still stands (and there are rare cases when
> memory barriers are needed on x86,
> see http://bartoszmilewski.wordpress.com/2008/11/05/who-ordered-memory-fe...).
> I agree that this is mostly an implementation specific issue
> and the current C++ standard is threading agnostic however
> saying volatile has absolutely no use in multi-threading
> programming is incorrect.

Given that the exact semantics of volatile and threading are not
really covered by the standard, it's certain that one cannot
make blanket claims: an implementation could define volatile in
a way that would make it useful with its implementation of
threading, say by giving volatile the same meaning that it has
in Java, for example. In practice, however, Posix doesn't, and
I don't know of a compiler under Unix which goes beyond the
Posix guarantees (except when assembler is involved, and then
they give enough guarantees that you don't need volatile). And
while I've yet to find an exact specification for Windows, the
implementation of volatile in VC++ 8.0 doesn't do enough to make
it useful in threading, and Microsoft (in the voice of Herb
Sutter) has said here that it isn't useful (although I don't
know if Herb is speaking for Microsoft here, or simply
expressing his personal opinion).

Anyhow, for the moment, all I can really claim is that it is
useless under the Unix I know (Solaris, HP/UX, AIX and Linux)
and under Windows.

> Performance is often cited as another reason to not use
> volatile however the use of volatile can actually help with
> multi-threading performance as you can perform a safe
> lock-free check before performing a more expensive lock.

Again, I'd like to see how. This sounds like the double-checked
locking idiom, and that's been proven not to work.

> I agree with what Andy said elsewhere in this thread:

> "Is volatile sufficient - absolutely not.
> Portable - hardly.
> Necessary in certain conditions - absolutely."

Yes, but Andy didn't present any facts to back up his statement.

The simplest solution would be to just post a bit of code
showing where or how it might be useful. A good counter example
trumps every argument.

--
James Kanze


From: Leigh Johnston on


"James Kanze" <james.kanze(a)gmail.com> wrote in message
news:36f7e40e-4584-430d-980e-5f7478728d16(a)z3g2000yqz.googlegroups.com...
<snip>
>> Performance is often cited as another reason to not use
>> volatile however the use of volatile can actually help with
>> multi-threading performance as you can perform a safe
>> lock-free check before performing a more expensive lock.
>
> Again, I'd like to see how. This sounds like the double-checked
> locking idiom, and that's been proven not to work.
>

IMO for an OoO CPU the double checked locking pattern can be made to work
with volatile if fences are also used or the lock also acts as a fence (as
is the case with VC++/x86). This is also the counter-example you are
looking for, it should work on some implementations. FWIW VC++ is clever
enough to make the volatile redundant for this example however adding
volatile makes no difference to the generated code (read: no performance
penalty) and I like making such things explicit similar to how one uses
const (doesn't affect the generated output but documents the programmer's
intentions). Which is better: use volatile if there is no noticeable
performance penalty or constantly check your compiler's generated assembler
to check the optimizer is not breaking things? The only volatile in my
entire codebase is for the "status" of my "threadable" base class and I
don't always acquire a lock before checking this status and I don't fully
trust that the optimizer won't cache it for all cases that might crop up as
I develop code. BTW I try and avoid singletons too so I haven't found the
need to use the double checked locking pattern AFAICR.
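
For reference, a sketch of double-checked locking with the ordering made
explicit. It is an assumption-laden illustration rather than the VC++/x86
variant described above: it uses std::atomic with acquire/release from the
draft C++0x library instead of volatile plus fences, and Config,
g_instance and g_init_mutex are hypothetical names.

```cpp
#include <atomic>
#include <mutex>

struct Config { int value = 7; };   // hypothetical lazily created object

std::atomic<Config*> g_instance{nullptr};
std::mutex g_init_mutex;

Config* get_instance() {
    // First check, no lock: acquire pairs with the release store below, so
    // a non-null pointer implies a fully constructed object.
    Config* p = g_instance.load(std::memory_order_acquire);
    if (p == nullptr) {
        std::lock_guard<std::mutex> guard(g_init_mutex);
        // Second check, under the lock, in case another thread got here first.
        p = g_instance.load(std::memory_order_relaxed);
        if (p == nullptr) {
            p = new Config();       // intentionally never deleted in this sketch
            g_instance.store(p, std::memory_order_release);
        }
    }
    return p;
}
```

The acquire/release pair does the job that the lock-as-fence is relied on
to do in the description above; without it, a reader could see the pointer
before the object it points to.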

/Leigh

