From: Dmitriy Vyukov on
Will it be possible to issue only compiler ordering barrier (release,
acquire or full), but not hardware ordering barrier in C++0x?

Accesses to volatile variables are ordered only with respect to
accesses to other volatile variables. What I want is to order an access
to a variable with respect to accesses to all other volatile and
non-volatile variables (without any hardware barriers, only compiler
ordering).

Today this can be accomplished with "__asm__ __volatile__
("" : : :"memory")" on gcc, and with _ReadWriteBarrier() on msvc. I am
interested in whether it will be possible to do this in the C++0x
language without compiler dependency.
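
For reference, the standard as eventually published (C++11) provides
std::atomic_signal_fence in <atomic>, which constrains compiler reordering
in the way the gcc and msvc intrinsics above do, and is specified to order
only a thread against a signal handler running in that same thread. A
minimal sketch, assuming hypothetical names g_data, g_flag, publish and
consume, and not a claim about what the draft contained at the time of
this post:

```cpp
#include <atomic>

int g_data = 0;                  // ordinary, non-atomic variable (hypothetical)
std::atomic<int> g_flag{0};      // hypothetical flag guarding g_data

// Writes the data, then sets the flag. The fence keeps the compiler from
// sinking the g_data store below the flag store; it is not a hardware
// barrier and gives no cross-thread guarantee on its own.
void publish(int value) {
    g_data = value;
    std::atomic_signal_fence(std::memory_order_release);
    g_flag.store(1, std::memory_order_relaxed);
}

// Reads the flag, then the data. The fence keeps the compiler from
// hoisting the g_data load above the flag load.
int consume() {
    if (g_flag.load(std::memory_order_relaxed) == 0) {
        return -1;               // nothing published yet
    }
    std::atomic_signal_fence(std::memory_order_acquire);
    return g_data;
}
```

On its own this is only guaranteed between a thread and a signal handler
on that thread; using it across threads relies on the hardware ordering
that the algorithm obtains by other means.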

Such compiler barriers are useful in efficient synchronization
algorithms like SMR+RCU:
http://sourceforge.net/project/showfiles.php?group_id=127837
(fastsmr package)
because they allow one to eliminate all hardware memory barriers from
the fast path.

Dmitriy V'jukov

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: Chris Thomasson on
[This is in response to msg #13688, which has not been processed yet. I
made a mistake; here is the correction.]


> "Dmitriy V'jukov" <dvyukov(a)gmail.com> wrote in message
> news:c5199756-8669-419e-84e9-1de582eff02e(a)y38g2000hsy.googlegroups.com...
> [...]
>> I can prove that I have happens-before relation if I can enforce
>> correct compiler ordering. Consider following code:
>>
>> std::vector<int> g_nonatomic_user_data;
>> std::atomic_int g_atomic1;
>> std::atomic_int g_atomic2;
>>
>> std::vector<int> thread()
>> {
>> g_atomic1.store(1, std::memory_order_relaxed);
>> // compiler store-load fence
>> if (g_atomic2.load(std::memory_order_relaxed))
> [...]
>
> Are you sure that atomic store with 'memory_order_relaxed' semantics
> injects a #StoreLoad after the operation?


Whoops! I did not read the word COMPILER! Sorry about that. Anyway, I
think that 'g_atomic2.load()' can rise above 'g_atomic1.store()'...
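
To make the concern concrete: with only relaxed operations and a
compiler-only fence, the hardware may still let the load complete before
the store becomes visible to other processors. A hedged sketch of the
portable way to forbid that, using std::atomic_thread_fence from the
published C++11 <atomic>; the function names enter_a and enter_b are
hypothetical and not from the quoted code:

```cpp
#include <atomic>

std::atomic<int> g_atomic1{0};
std::atomic<int> g_atomic2{0};

// Announces thread A, then checks for thread B. The seq_cst fence is a
// full hardware fence, so the load cannot be satisfied ahead of the store.
bool enter_a() {
    g_atomic1.store(1, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_seq_cst);
    return g_atomic2.load(std::memory_order_relaxed) != 0;
}

// Mirror image for thread B.
bool enter_b() {
    g_atomic2.store(1, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_seq_cst);
    return g_atomic1.load(std::memory_order_relaxed) != 0;
}
```

With both fences in place, two threads running enter_a() and enter_b()
concurrently cannot both get false. Replacing either fence with a
compiler-only barrier gives up that guarantee unless something else
supplies the hardware ordering.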



From: Chris Thomasson on
"Dmitriy V'jukov" <dvyukov(a)gmail.com> wrote in message
news:c5199756-8669-419e-84e9-1de582eff02e(a)y38g2000hsy.googlegroups.com...
[...]
> I can prove that I have happens-before relation if I can enforce
> correct compiler ordering. Consider following code:
>
> std::vector<int> g_nonatomic_user_data;
> std::atomic_int g_atomic1;
> std::atomic_int g_atomic2;
>
> std::vector<int> thread()
> {
> g_atomic1.store(1, std::memory_order_relaxed);
> // compiler store-load fence
> if (g_atomic2.load(std::memory_order_relaxed))
[...]

Are you sure that atomic store with 'memory_order_relaxed' semantics injects
a #StoreLoad after the operation?



From: Szabolcs Ferenczi on
On May 3, 2:13 pm, Anthony Williams <anthony_w....(a)yahoo.com> wrote:
> Szabolcs Ferenczi <szabolcs.feren...(a)gmail.com> writes:
> > On May 2, 12:43 pm, Anthony Williams <anthony_w....(a)yahoo.com> wrote:
> >> [...]
> >> All accesses to shared data MUST be synchronized with atomics: [...]
>
> > Can you elaborate on this point, please? How can you generally
> > synchronise N processes with the help of atomics? Do you mean only
> > two processes under certain circumstances?
>
> If any thread modifies shared data that is not of type atomic_xxx, the
> developer must ensure appropriate synchronization with any other thread that
> accesses that shared data in order to avoid a data race (and the undefined
> behaviour that comes with that).

It is clear that you must synchronise access to a shared variable.
Normally you must use a Critical Region for that.

I was curious how you synchronise access to shared data with
atomics. Note that atomics only provide this synchronisation for
accesses to the atomics themselves, but you claimed something like: with
atomics you can synchronise access to non-atomic shared data. How? Can
you provide an example, please?

> If you're using atomics, that means you must
> have (at least) a release on some atomic variable by the modifying thread
> after the modify, and an acquire on the same atomic variable by the other
> thread before its access (read or write).

It is a bit too abstract a description for me. E.g. "after the modify"
of what? Can you provide an example? I hope you do not mean that
accessing an atomic variable can help in synchronising access to
another variable that is not an atomic one. Let us see some examples.
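
The release/acquire pattern described in the quoted text can be sketched
as follows. This is a minimal illustration written against the published
C++11 <atomic> and <thread>, not code from the thread; the names
g_payload, g_ready, producer, consumer and run_once are hypothetical:

```cpp
#include <atomic>
#include <thread>

int g_payload = 0;                    // non-atomic shared data
std::atomic<bool> g_ready{false};     // atomic used for synchronisation

// "The modify" is the write to g_payload; the release store comes after it.
void producer() {
    g_payload = 42;
    g_ready.store(true, std::memory_order_release);
}

// The acquire load that reads true synchronizes with the release store,
// so the later read of g_payload is ordered after the write and is not
// a data race.
int consumer() {
    while (!g_ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return g_payload;
}

// Runs one producer and one consumer and returns what the consumer saw.
int run_once() {
    int seen = 0;
    std::thread reader([&seen] { seen = consumer(); });
    std::thread writer(producer);
    writer.join();
    reader.join();
    return seen;
}
```

The atomic itself carries no user data here; it only establishes the
happens-before edge that makes the plain access to g_payload safe.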

> [...]
> Of course, you could also just use a mutex lock or join with the thread doing
> the modification.

That is correct. You can synchronise access with mutexes (implementing
a Critical Region by hand).

I think that by the phrase "join with the thread" you refer to the end
of a structured parallel block, where a shared variable becomes a
non-shared one. Again, an example could help. Please give an example
illustrating what you mean.
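
One reading of "join with the thread", sketched with std::thread from the
published C++11 library (the function name fill_in_worker is
hypothetical): the worker is the only thread touching the data while it
runs, and the completion of the worker synchronizes with the return of
join(), so the joining thread may then read the data without atomics or
a mutex.

```cpp
#include <thread>
#include <vector>

// Lets a worker thread fill a vector, then reads it after joining.
std::vector<int> fill_in_worker() {
    std::vector<int> data;            // touched only by the worker until join()
    std::thread worker([&data] {
        for (int i = 0; i < 3; ++i) {
            data.push_back(i * i);
        }
    });
    worker.join();                    // worker's writes happen-before this returns
    return data;                      // safe: the data is no longer shared
}
```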

Best Regards,
Szabolcs



From: Chris Thomasson on
"Szabolcs Ferenczi" <szabolcs.ferenczi(a)gmail.com> wrote in message
news:e97bdaf4-5f8a-40bd-b058-46b30b1eb01a(a)x35g2000hsb.googlegroups.com...
> On May 3, 2:13 pm, Anthony Williams <anthony_w....(a)yahoo.com> wrote:
>> Szabolcs Ferenczi <szabolcs.feren...(a)gmail.com> writes:
>> > On May 2, 12:43 pm, Anthony Williams <anthony_w....(a)yahoo.com> wrote:
>> >> [...]
>> >> All accesses to shared data MUST be synchronized with atomics: [...]
>>
>> > Can you elaborate on this point, please? How can you generally
>> > synchronise N processes with the help of atomics? Do you mean only
>> > two processes under certain circumstances?
>>
>> If any thread modifies shared data that is not of type atomic_xxx, the
>> developer must ensure appropriate synchronization with any other thread
>> that accesses that shared data in order to avoid a data race (and the
>> undefined behaviour that comes with that).
>
> It is clear that you must synchronise access to a shared variable.
> Normally you must use a Critical Region for that.
>
> I was curious how you synchronise access to shared data with
> atomics.

Really? How do you think some mutexes, semaphores, non-blocking
algorithms, etc., are actually implemented? IMHO, C++ should be at a low
enough level to create fairly efficient custom sync primitives. You can
use atomics to create different forms of reader-writer patterns. Are you
familiar with the basic concepts of RCU?




> Note that atomics only provide this synchronisation for
> accesses to the atomics themselves, but you claimed something like: with
> atomics you can synchronise access to non-atomic shared data. How? Can
> you provide an example, please?

[...]

Here is one way to use atomics:

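#include <cassert>
#include <exception>

/* atomic_word, os_event, ATOMIC_SWAP, ATOMIC_CAS and MEMBAR are
   placeholders for platform-specific primitives. */

#ifndef MUTEX_ERROR_UNEXPECTED
# define MUTEX_ERROR_UNEXPECTED() (assert(false), std::unexpected())
#endif

class mutex {
  enum constant {
    UNLOCKED = 0,
    LOCKED = 1,
    CONTENTION = 2
  };

  atomic_word m_state;
  os_event m_waitset;

public:
  mutex() : m_state(UNLOCKED), m_waitset(os_event_create(...)) {
    if (m_waitset == OS_EVENT_INVALID) {
      throw std::exception();
    }
  }

  ~mutex() throw() {
    if (m_state != UNLOCKED || ! os_event_destroy(m_waitset)) {
      MUTEX_ERROR_UNEXPECTED();
    }
  }

  void lock() throw() {
    /* fast path: previous state was UNLOCKED, so we own the lock */
    if (ATOMIC_SWAP(&m_state, LOCKED)) {
      /* slow path: mark contention and sleep until we swap out UNLOCKED */
      while (ATOMIC_SWAP(&m_state, CONTENTION)) {
        if (! os_event_wait(m_waitset)) {
          MUTEX_ERROR_UNEXPECTED();
        }
      }
    }
    /* acquire: keep the critical section below the atomic store */
    MEMBAR #StoreLoad | #StoreStore;
  }

  bool trylock() throw() {
    if (! ATOMIC_CAS(&m_state, UNLOCKED, LOCKED)) {
      return false;
    }
    MEMBAR #StoreLoad | #StoreStore;
    return true;
  }

  void unlock() throw() {
    /* release: keep the critical section above the atomic store */
    MEMBAR #LoadStore | #StoreStore;
    if (ATOMIC_SWAP(&m_state, UNLOCKED) == CONTENTION) {
      /* somebody may be sleeping; wake a waiter */
      if (! os_event_set(m_waitset)) {
        MUTEX_ERROR_UNEXPECTED();
      }
    }
  }
};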
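
For comparison, here is a hedged sketch of the same idea expressed with
the std::atomic interface of the published C++11 standard instead of the
placeholder primitives above. It is deliberately simplified: the
os_event wait set is replaced by yielding in a loop, so there is no
CONTENTION state, and the names spin_mutex, g_lock, g_counter, add_many
and run_two_threads are hypothetical.

```cpp
#include <atomic>
#include <thread>

class spin_mutex {
    std::atomic<int> m_state{0};      // 0 = unlocked, 1 = locked

public:
    void lock() {
        // acquire: the critical section cannot move above this exchange
        while (m_state.exchange(1, std::memory_order_acquire) != 0) {
            std::this_thread::yield();
        }
    }

    bool try_lock() {
        int expected = 0;
        return m_state.compare_exchange_strong(
            expected, 1,
            std::memory_order_acquire, std::memory_order_relaxed);
    }

    void unlock() {
        // release: the critical section cannot move below this store
        m_state.store(0, std::memory_order_release);
    }
};

spin_mutex g_lock;
long g_counter = 0;                   // non-atomic data protected by g_lock

void add_many(int n) {
    for (int i = 0; i < n; ++i) {
        g_lock.lock();
        ++g_counter;
        g_lock.unlock();
    }
}

// Two threads increment the plain counter under the lock.
long run_two_threads(int n) {
    g_counter = 0;
    std::thread a(add_many, n);
    std::thread b(add_many, n);
    a.join();
    b.join();
    return g_counter;
}
```

The acquire on the exchange and the release on the store play the roles
of the two MEMBAR lines in lock() and unlock() above.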


;^)



Do you think that C++ should _not_ be at a level that is low enough to
create custom non-blocking synchronization primitives?

