From: Andy on
Assuming I've got an atomic & fenced CAS instruction available via a
library; is it possible to use this to write a lock free algorithm in C
++?

Although the memory fence would prevent hardware re-ordering I'm
worried that compiler could reorder memory access rendering the fence
useless. For example consider the below, grossly simplified example:

bool continue = true;
while(continue) {
while(isLocked) { }
// atomically set isLocked to true if it has the value false
if(AtomicTestAndSet(&isLocked, true, false)) {
continute = false;
}
}
SharedStdVector[3] += 10;

Is there anything that prevents the compiler from reordering the
access to SharedStdVector so that it is executed before mutex is
aquired? Is there anything I can do to prevent such reordering?

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: Lailoken on
On Jul 12, 6:22 am, Andy <amsk1...(a)gmail.com> wrote:
> Assuming I've got an atomic & fenced CAS instruction available via a
> library; is it possible to use this to write a lock free algorithm in C
> ++?
>
> Although the memory fence would prevent hardware re-ordering I'm
> worried that compiler could reorder memory access rendering the fence
> useless. For example consider the below, grossly simplified example:
>
> bool continue = true;
> while(continue) {
> while(isLocked) { }
> // atomically set isLocked to true if it has the value false
> if(AtomicTestAndSet(&isLocked, true, false)) {
> continute = false;
> }}
>
> SharedStdVector[3] += 10;
>
> Is there anything that prevents the compiler from reordering the
> access to SharedStdVector so that it is executed before mutex is
> aquired? Is there anything I can do to prevent such reordering?

When implementing Peterson's (or Decker's) algorithms you may, in
addition to hardware memory barriers, also add compiler memory
barriers:

http://en.wikipedia.org/wiki/Memory_barrier#Out-of-order_execution_versus_compiler_reordering_optimizations
http://en.wikipedia.org/wiki/Memory_ordering#Compiler_memory_barrier

It does not always work to only make the variables volatile, and you
may resort to the compiler-specific hacks above.

Or if speed is not an issue, just use some form of existing library
(like boost, etc.)

Marius.


--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: Anthony Williams on
Andy <amsk1982(a)gmail.com> writes:

> Assuming I've got an atomic & fenced CAS instruction available via a
> library; is it possible to use this to write a lock free algorithm in C
> ++?

Lock-free algorithms and anything else that uses atomics for memory
ordering requires the compiler and CPU respect the ordering constraints.

With a C++0x compiler, the std::atomic<> facilities will ensure this
holds.

With current compilers, it may be possible with a library, if the
library uses the appropriate constructs for the compiler. e.g. the
just::thread implementation of std::atomic<> works with gcc 4.3&4.4,
MSVC2008 and MSVC2010, and uses compiler-specific constructs to ensure
that the ordering constaints are enforced. Earlier versions of MSVC such
as MSVC.Net 2003 do not provide the necessary memory ordering guarantees
to be able to implement full memory ordering semantics in a separate
library.

> Although the memory fence would prevent hardware re-ordering I'm
> worried that compiler could reorder memory access rendering the fence
> useless. For example consider the below, grossly simplified example:
>
> bool continue = true;
> while(continue) {
> while(isLocked) { }
> // atomically set isLocked to true if it has the value false
> if(AtomicTestAndSet(&isLocked, true, false)) {
> continute = false;
> }
> }
> SharedStdVector[3] += 10;
>
> Is there anything that prevents the compiler from reordering the
> access to SharedStdVector so that it is executed before mutex is
> aquired? Is there anything I can do to prevent such reordering?

The C++03 standard does not deal with threads. C++0x supports a full
memory model and atomics. Anything else is up to the compiler
implementation.

Anthony
--
Author of C++ Concurrency in Action http://www.stdthread.co.uk/book/
just::thread C++0x thread library http://www.stdthread.co.uk
Just Software Solutions Ltd http://www.justsoftwaresolutions.co.uk
15 Carrallack Mews, St Just, Cornwall, TR19 7UL, UK. Company No. 5478976

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: cpp4ever on
On 07/12/2010 02:22 PM, Andy wrote:
> Assuming I've got an atomic & fenced CAS instruction available via a
> library; is it possible to use this to write a lock free algorithm in C
> ++?
>
> Although the memory fence would prevent hardware re-ordering I'm
> worried that compiler could reorder memory access rendering the fence
> useless. For example consider the below, grossly simplified example:
>
> bool continue = true;
> while(continue) {
> while(isLocked) { }
> // atomically set isLocked to true if it has the value false
> if(AtomicTestAndSet(&isLocked, true, false)) {
> continute = false;
> }
> }
> SharedStdVector[3] += 10;
>
> Is there anything that prevents the compiler from reordering the
> access to SharedStdVector so that it is executed before mutex is
> aquired? Is there anything I can do to prevent such reordering?
>

You need to look up the thread library implementation for the C++ you
are using. This will provide you with various mechanisms for creating
thread safe code. In your case you'd need a mutex, (short for mutual
exclusion).

HTH

cpp4ever

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: Mathias Gaunard on
On Jul 12, 2:22 pm, Andy <amsk1...(a)gmail.com> wrote:
> Assuming I've got an atomic & fenced CAS instruction available via a
> library; is it possible to use this to write a lock free algorithm in C
> ++?
>
> Although the memory fence would prevent hardware re-ordering I'm
> worried that compiler could reorder memory access rendering the fence
> useless.

Compilers will not do reordering when faced with inline assembly or
the calling of non-pure functions whose definitions are not available
in the same translation unit.

Note however that since this is not (yet) standardized, there is no
formal guarantee.


--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]