|
From: Ole-Hjalmar Kristensen on 31 Mar 2008 03:59 >>>>> "FW" == Florian Weimer <fw(a)deneb.enyo.de> writes: FW> * Ole-Hjalmar Kristensen: >> Second, I cannot find anything in the RM which says you can make *any* >> assumptions about objects which are outside the protected object so I >> cannot see how signaling will help you. FW> I think that's what 9.10 (Ada 95 without TC 1) is about. Yes, you are right. Thanks. I looked in the part describing protected objects without finding it. But one part of 9.10 puzzles me: 9.c Reason: The point of this distinction is so that on multiprocessors with inconsistent caches, the caches only need to be refreshed at the beginning of an entry body, and forced out at the end of an entry body or protected procedure that leaves an entry open. Protected function calls, and protected subprogram calls for entryless protected objects do not require full cache consistency. Entryless protected objects are intended to be treated roughly like atomic objects -- each operation is indivisible with respect to other operations (unless both are reads), but such operations cannot be used to synchronize access to other nonvolatile shared variables. If you do not refresh the cache at the beginning of a protected procedure, how do you avoid reading stale data within the protected object on a multiprocessor? And why the wording "cannot be used to synchronize access to other *nonvolatile* shared variables"? Is the implication that it *can* be used to synchronize other *volatile* shared variables? Btw., I ran a simple test on a SPRAC multiprocessor with an entryless protected object containing a single integer versus an integer declared with pragma atomic, and as expected the pragma atomic solution was much (40x) faster. -- C++: The power, elegance and simplicity of a hand grenade.
From: Ole-Hjalmar Kristensen on 1 Apr 2008 05:02 >>>>> "(b" == (see below) <yaldnif.w(a)blueyonder.co.uk> writes: <snip> (b> I think the wording is trying to cover all the bases. (b> One clue is the phrase "_full_ cache consistency". (b> The caches certainly need to be consistent with respect to the (b> protected data, even for protected functions and procedures, (b> but only entries ensure global consistency and so provide (b> synchronization of data that is not local to the protected object. Yes, that seems reasonable. >> Btw., I ran a simple test on a SPRAC multiprocessor with an entryless >> protected object containing a single integer versus an integer >> declared with pragma atomic, and as expected the pragma atomic >> solution was much (40x) faster. (b> Unfortunately, we can't usefully apply that pragma even (b> to a pair of integers. (I don't mean the pair's components!) Agreed, but you may able to cheat and pack a pair of integers into a 64-bit atomic, and a compare-and-swap is also much cheaper than a protected object it seems. (b> -- (b> Bill Findlay (b> <surname><forename> chez blueyonder.co.uk -- C++: The power, elegance and simplicity of a hand grenade.
From: Ole-Hjalmar Kristensen on 2 Apr 2008 03:22 >>>>> "(b" == (see below) <yaldnif.w(a)blueyonder.co.uk> writes: <snip> (b> Also, simply declaring a variable atomic does not in itself (b> ensure global consistency of view. For that you also need (b> to execute appropriate memory barrier operations for the (b> architecture. (I'm sure you know that.) Yes, I know. So pragma atomic does not help very much by itself in implementing lock-free algorithms. >> Agreed, but you may able to cheat and pack a pair of integers into >> a 64-bit atomic, and a compare-and-swap is also much cheaper than a >> protected object it seems. (b> Yes, but this is completely implementation-dependent. (b> Not much above the semantic level of assembly. (b> I would take a lot of convincing that the performance (b> improvement was actually necessary. I agree that it rarely should be necessary. It would have been nice to have a way of implementing lock-free algorithms in Ada efficiently, but entryless protected objects seems to be the best solution if you want a portable program. (b> -- (b> Bill Findlay (b> <surname><forename> chez blueyonder.co.uk -- C++: The power, elegance and simplicity of a hand grenade.
From: Ole-Hjalmar Kristensen on 4 Apr 2008 02:36 Interesting. I had not thought of Ada.Synchronous_Task_Control. Apart from that, I agree that a "best effort" implementation of standard library like the membar_ops and atomic_ops which are part of the Solaris C library would likely be sufficient. >>>>> "(b" == (see below) <yaldnif.w(a)blueyonder.co.uk> writes: (b> On 02/04/2008 08:22, in article wvbr4pakzpo1.fsf(a)astra06.norway.sun.com, (b> "Ole-Hjalmar Kristensen" (b> <ole-hjalmar.kristensen(a)substitute_employer_here.com> wrote: >> Yes, I know. So pragma atomic does not help very much by itself in >> implementing lock-free algorithms. (b> Yes, it's only one essential aspect. >>>> Agreed, but you may able to cheat and pack a pair of integers into >>>> a 64-bit atomic, and a compare-and-swap is also much cheaper than a >>>> protected object it seems. >> >> (b> Yes, but this is completely implementation-dependent. >> (b> Not much above the semantic level of assembly. >> (b> I would take a lot of convincing that the performance >> (b> improvement was actually necessary. >> >> I agree that it rarely should be necessary. It would have been nice to >> have a way of implementing lock-free algorithms in Ada efficiently, >> but entryless protected objects seems to be the best solution if >> you want a portable program. (b> It would not take too much - perhaps just a standard library including (b> read-barrier and write-barrier operations, and a selection of things like (b> CAS, TAS, EXCH, etc; mapped to m/c codes where these are provided, (b> and implemented using the architecture's native sync operations where not. (b> Is this the basis for an AI? (b> Following on my second post to the point, I was reminded of the package (b> Ada.Synchronous_Task_Control, which must also impose memory barriers is it (b> is to work reliably on a multiprocessor; so I tried that in my Simpson's (b> algorithm test as well. Here is the code: >> with Ada.Synchronous_Task_Control; >> package body Wait_Free_Atomicity is >> >> procedure Sync is >> use Ada.Synchronous_Task_Control; >> Sema : Suspension_Object; >> Bool : Boolean := Current_State(Sema); -- either this >> begin >> -- Set_True(Sema); -- or >> -- Suspend_Until_True(Sema); -- this >> end Sync; >> >> procedure Update (Atomic_Item : in out an_Atomic; Item : in an_Item) is >> Row : constant a_Bistate := not Atomic_Item.Last_Row_Inspected; >> Col : constant a_Bistate := not Atomic_Item.Last_Column_Updated(Row); >> begin >> Atomic_Item.Data_Slot_Matrix(Row, Col) := Item; >> Atomic_Item.Last_Column_Updated(Row) := Col; >> Atomic_Item.Last_Row_Updated := Row; >> Sync; >> end Update; >> >> procedure Inspect (Atomic_Item : in out an_Atomic; Item : out an_Item) is >> Row : constant a_Bistate := Atomic_Item.Last_Row_Updated; >> Col : a_Bistate; >> pragma Atomic (Col); >> begin >> Atomic_Item.Last_Row_Inspected := Row; >> Col := Atomic_Item.Last_Column_Updated(Row); >> Item := Atomic_Item.Data_Slot_Matrix(Row, Col); >> Sync; >> end Inspect; >> >> end Wait_Free_Atomicity; (b> It works nicely, and is an order of magnitude faster than a protected (b> object: (b> 20_000_000 updates (b> with Suspend_Until_True sync: (b> No consistency failures. (b> 7.30 real 14.14 user 0.04 sys (b> with Current_State sync: (b> No consistency failures. (b> 3.57 real 6.97 user 0.02 sys (b> -- (b> Bill Findlay (b> <surname><forename> chez blueyonder.co.uk -- C++: The power, elegance and simplicity of a hand grenade.
From: Ole-Hjalmar Kristensen on 15 Apr 2008 08:05 After reading your post about use of suspension objects, I looked in the RM to see what guarantees we have about the effect of sharing variables between tasks synchronized with suspension objects. I completely agree that to work reliably on a multiprocessor the suspension operations *has* to impose a memory barrier, but I was unable to find any explicit statement about sharing variables synchronized with suspension objects. The sections I have reproduced below is the closest I could get. Since Suspend_Until_True is potentially blocking, it *could* be signaled (as defined in 9.10), but I find it strange that it is not mentioned explicitly. Also, the statement "can be used for two-stage suspend operations and as a simple building block for implementing higher-level queues", seems to indicate that the intent indeed is to be able to use suspension objects to synchronize access to shared variables. Comments, anyone? From the AARM: 9.10 Shared Variables 9.a Reason: The underlying principle here is that for one action to ``signal'' a second, the second action has to follow a potentially blocking operation, whose blocking is dependent on the first action in some way. Protected procedures are not potentially blocking, so they can only be "signalers," they cannot be signaled. and D.10 Synchronous Task Control 1 [This clause describes a language-defined private semaphore (suspension object), which can be used for two-stage suspend operations and as a simple building block for implementing higher-level queues.] .... 9 The procedure Suspend_Until_True blocks the calling task until the state of the object S is true; at that point the task becomes ready and the state of the object becomes false. 10 {potentially blocking operation [Suspend_Until_True]} {blocking, potentially [Suspend_Until_True]} {Program_Error (raised by failure of run-time check)} Program_Error is raised upon calling Suspend_Until_True if another task is already waiting on that suspension object. Suspend_Until_ True is a potentially blocking operation (see 9.5.1). >>>>> "(b" == (see below) <yaldnif.w(a)blueyonder.co.uk> writes: <snip> (b> Following on my second post to the point, I was reminded of the package (b> Ada.Synchronous_Task_Control, which must also impose memory barriers is it (b> is to work reliably on a multiprocessor; so I tried that in my Simpson's (b> algorithm test as well. Here is the code: >> with Ada.Synchronous_Task_Control; >> package body Wait_Free_Atomicity is >> >> procedure Sync is >> use Ada.Synchronous_Task_Control; >> Sema : Suspension_Object; >> Bool : Boolean := Current_State(Sema); -- either this >> begin >> -- Set_True(Sema); -- or >> -- Suspend_Until_True(Sema); -- this >> end Sync; >> >> procedure Update (Atomic_Item : in out an_Atomic; Item : in an_Item) is >> Row : constant a_Bistate := not Atomic_Item.Last_Row_Inspected; >> Col : constant a_Bistate := not Atomic_Item.Last_Column_Updated(Row); >> begin >> Atomic_Item.Data_Slot_Matrix(Row, Col) := Item; >> Atomic_Item.Last_Column_Updated(Row) := Col; >> Atomic_Item.Last_Row_Updated := Row; >> Sync; >> end Update; >> >> procedure Inspect (Atomic_Item : in out an_Atomic; Item : out an_Item) is >> Row : constant a_Bistate := Atomic_Item.Last_Row_Updated; >> Col : a_Bistate; >> pragma Atomic (Col); >> begin >> Atomic_Item.Last_Row_Inspected := Row; >> Col := Atomic_Item.Last_Column_Updated(Row); >> Item := Atomic_Item.Data_Slot_Matrix(Row, Col); >> Sync; >> end Inspect; >> >> end Wait_Free_Atomicity; (b> It works nicely, and is an order of magnitude faster than a protected (b> object: (b> 20_000_000 updates (b> with Suspend_Until_True sync: (b> No consistency failures. (b> 7.30 real 14.14 user 0.04 sys (b> with Current_State sync: (b> No consistency failures. (b> 3.57 real 6.97 user 0.02 sys (b> -- (b> Bill Findlay (b> <surname><forename> chez blueyonder.co.uk -- C++: The power, elegance and simplicity of a hand grenade.
|
Next
|
Last
Pages: 1 2 Prev: Are there noticable differences in Ada acceptance by country? Next: Compiler bug |