From: Heiko Carstens on
On Mon, May 17, 2010 at 02:34:57PM +1000, Anton Blanchard wrote:
>
> When looking at a performance problem on PowerPC, I noticed some awful code
> generation:
>
> c00000000051fc98: 3b 60 00 01 li r27,1
> ...
> c00000000051fca0: 3b 80 00 00 li r28,0
> ...
> c00000000051fcdc: 93 61 00 70 stw r27,112(r1)
> c00000000051fce0: 93 81 00 74 stw r28,116(r1)
> c00000000051fce4: 81 21 00 70 lwz r9,112(r1)
> c00000000051fce8: 80 01 00 74 lwz r0,116(r1)
> c00000000051fcec: 7d 29 07 b4 extsw r9,r9
> c00000000051fcf0: 7c 00 07 b4 extsw r0,r0
>
> c00000000051fcf4: 7c 20 04 ac lwsync
> c00000000051fcf8: 7d 60 f8 28 lwarx r11,0,r31
> c00000000051fcfc: 7c 0b 48 00 cmpw r11,r9
> c00000000051fd00: 40 c2 00 10 bne- c00000000051fd10
> c00000000051fd04: 7c 00 f9 2d stwcx. r0,0,r31
> c00000000051fd08: 40 c2 ff f0 bne+ c00000000051fcf8
> c00000000051fd0c: 4c 00 01 2c isync
>
> We create two constants, write them out to the stack, read them straight back
> in and sign extend them. What a mess.
>
> It turns out this bad code is a result of us defining atomic_t as a
> volatile int.
>
> We removed the volatile attribute from the powerpc atomic_t definition years
> ago, but commit ea435467500612636f8f4fb639ff6e76b2496e4b (atomic_t: unify all
> arch definitions) added it back in.

With your patches we can also revert 39475179d40996b4efa662e3825735a84d2526d1
"[S390] Improve code generated by atomic operations." which was created for
the very same reason.

Thanks!
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Jamie Lokier on
Linus Torvalds wrote:
>
>
> On Mon, 17 May 2010, Anton Blanchard wrote:
> >
> > It turns out this bad code is a result of us defining atomic_t as a
> > volatile int.
>
> Heh. Ok, as you point out in the commit message, I obviously agree with
> this patch. "volatile" on data is evil, with the possible exception of
> "jiffies" type things.

I wonder if

extern unsigned long __nv_jiffies;
#define jiffies (*(volatile unsigned long *)*__nv_jiffies)

would improve any code in the same way as this atomic_t change.

-- Jamie
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Nick Piggin on
On Mon, May 17, 2010 at 08:01:54AM -0700, Linus Torvalds wrote:
>
>
> On Mon, 17 May 2010, Anton Blanchard wrote:
> >
> > It turns out this bad code is a result of us defining atomic_t as a
> > volatile int.
>
> Heh. Ok, as you point out in the commit message, I obviously agree with
> this patch. "volatile" on data is evil, with the possible exception of
> "jiffies" type things.
>
> So applied.

I wonder, Linus, is there a good reason to use volatile for these at
all?

I asked you about it quite a while back, and IIRC you said it might
be OK to remove volatile from bitops, provided that callers were audited
(ie. that nobody used bitops on volatile variables).

For atomic_read it shouldn't matter unless gcc is *really* bad at it.
Ah, for atomic_read, the required semantic is surely ACCESS_ONCE, so
that's where the volatile is needed? (maybe it would be clearer to
explicitly use ACCESS_ONCE?)

The case I was thinking about for bitops was for multiple non-atomic
bitops, which would be nice to combine. In reality a lot of performance
critical code (like page allocator) bites the bullet and does the
open-coded bitwise ops. But it would be nice if that just worked for
__set_bit / __clear_bit too.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Paul E. McKenney on
On Wed, May 19, 2010 at 11:03:27PM +1000, Nick Piggin wrote:
> On Mon, May 17, 2010 at 08:01:54AM -0700, Linus Torvalds wrote:
> >
> >
> > On Mon, 17 May 2010, Anton Blanchard wrote:
> > >
> > > It turns out this bad code is a result of us defining atomic_t as a
> > > volatile int.
> >
> > Heh. Ok, as you point out in the commit message, I obviously agree with
> > this patch. "volatile" on data is evil, with the possible exception of
> > "jiffies" type things.
> >
> > So applied.
>
> I wonder, Linus, is there a good reason to use volatile for these at
> all?
>
> I asked you about it quite a while back, and IIRC you said it might
> be OK to remove volatile from bitops, provided that callers were audited
> (ie. that nobody used bitops on volatile variables).
>
> For atomic_read it shouldn't matter unless gcc is *really* bad at it.
> Ah, for atomic_read, the required semantic is surely ACCESS_ONCE, so
> that's where the volatile is needed? (maybe it would be clearer to
> explicitly use ACCESS_ONCE?)

Explicit use of ACCESS_ONCE() where needed makes a lot of sense to me,
and allows better code to be generated for initialization and cleanup
code where no other task has access to the atomic_t.

> The case I was thinking about for bitops was for multiple non-atomic
> bitops, which would be nice to combine. In reality a lot of performance
> critical code (like page allocator) bites the bullet and does the
> open-coded bitwise ops. But it would be nice if that just worked for
> __set_bit / __clear_bit too.

FWIW, a similar debate in the C-language standards committee seems to
be headed in the direction of allowing combining of adjacent atomic
operations.

Thanx, Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: David Miller on
From: "Paul E. McKenney" <paulmck(a)linux.vnet.ibm.com>
Date: Wed, 19 May 2010 08:01:32 -0700

> On Wed, May 19, 2010 at 11:03:27PM +1000, Nick Piggin wrote:
>> For atomic_read it shouldn't matter unless gcc is *really* bad at it.
>> Ah, for atomic_read, the required semantic is surely ACCESS_ONCE, so
>> that's where the volatile is needed? (maybe it would be clearer to
>> explicitly use ACCESS_ONCE?)
>
> Explicit use of ACCESS_ONCE() where needed makes a lot of sense to me,
> and allows better code to be generated for initialization and cleanup
> code where no other task has access to the atomic_t.

I agree and I want to see this too, but I think with the tree the size
that it is we have to work backwards at this point.

Existing behavior by default, and optimized cases get tagged by using
a new interface (atomic_read_light(), test_bit{,s}_light(), etc.)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/