From: Jan Beulich
>>> On 30.06.10 at 10:05, Peter Zijlstra <peterz(a)infradead.org> wrote:
> On Tue, 2010-06-29 at 15:31 +0100, Jan Beulich wrote:
>> Add optional (alternative instructions based) callout hooks to the
>> contended ticket lock and the ticket unlock paths, to allow hypervisor
>> specific code to be used for reducing/eliminating the bad effects
>> ticket locks have on performance when running virtualized.
>
> Uhm, I'd much rather see a single alternative implementation, not a
> per-hypervisor lock implementation.

How would you imagine this working? I can't see how the mechanism
could be hypervisor agnostic. Just look at the Xen implementation
(patch 2) - do you really see room for meaningful abstraction there?
Not least because not every hypervisor may even have a way to
poll for events (like Xen does), in which case a simple yield may be
needed instead.
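
To make the proposed shape concrete, here is a minimal user-space
sketch of a ticket lock with optional callout hooks. This is not the
patch's code: the kernel patches the callouts in via alternative
instructions, whereas this sketch uses plain function pointers, and
all names (spin_callout, unlock_callout, count_unlock) are
illustrative.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Minimal sketch of a ticket lock with optional callout hooks. */
struct ticket_lock {
	atomic_uint next;   /* next ticket to hand out */
	atomic_uint owner;  /* ticket currently being served */
};

/* NULL on bare metal; a hypervisor would install its own pair. */
static void (*spin_callout)(struct ticket_lock *, unsigned int);
static void (*unlock_callout)(struct ticket_lock *, unsigned int);

static void ticket_lock(struct ticket_lock *l)
{
	unsigned int me = atomic_fetch_add(&l->next, 1);

	while (atomic_load(&l->owner) != me)
		if (spin_callout)	/* contended: let the hypervisor help */
			spin_callout(l, me);
}

static void ticket_unlock(struct ticket_lock *l)
{
	unsigned int next_owner = atomic_fetch_add(&l->owner, 1) + 1;

	if (unlock_callout)		/* e.g. wake a blocked waiter */
		unlock_callout(l, next_owner);
}

/* Illustrative "hypervisor" hook that merely counts unlock callouts. */
static unsigned int unlock_callouts;
static void count_unlock(struct ticket_lock *l, unsigned int next_owner)
{
	(void)l;
	(void)next_owner;
	unlock_callouts++;
}
```

A Xen-style backend would replace count_unlock with code that tickles
the vCPU holding the next ticket, and install a spin_callout that does
a bounded spin before blocking.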

>> For the moment, this isn't intended to be used together with pv-ops,
>> but this is just to simplify initial integration. The ultimate goal
>> for this should still be to replace pv-ops spinlocks.
>
> So why not start by removing that?

Because I wouldn't get around to testing it within the time constraints
I have?

>> +config ENLIGHTEN_SPINLOCKS
>
> Why exactly are these enlightened? I'd say CONFIG_UNFAIR_SPINLOCKS would
> be much better.

The naming certainly isn't significant to me. If consensus can be
reached on any one name, I'll be fine with changing it. I just don't
want to play ping pong here.

>> +#define X86_FEATURE_SPINLOCK_YIELD (3*32+31) /* hypervisor yield interface
> */
>
> That name also sucks chunks, yield isn't a lock related term.

Not sure what's wrong with the name (the behavior *is* a yield of
some sort to the underlying scheduler). But again, any name
acceptable to all relevant parties will be fine with me.

>> +#define ALTERNATIVE_TICKET_LOCK \
>
> But but but, the alternative isn't a ticket lock..!?

??? Of course it is. Or do you mean the macro doesn't
represent the full lock operation? My reading of the name is that
this is the common alternative instruction sequence used in a lock
operation. And just as above - I don't care much about the actual
name, and I'll change any or all of them as long as I'm not going to
be asked to change them back and forth.

Jan

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Jan Beulich
>>> On 30.06.10 at 11:56, Jeremy Fitzhardinge <jeremy(a)goop.org> wrote:
> On 06/30/2010 11:11 AM, Peter Zijlstra wrote:
>> On Wed, 2010-06-30 at 10:00 +0100, Jan Beulich wrote:
>>
>>>>>> On 30.06.10 at 10:05, Peter Zijlstra <peterz(a)infradead.org> wrote:
>>>>>>
>>>> On Tue, 2010-06-29 at 15:31 +0100, Jan Beulich wrote:
>>>>
>>>>> Add optional (alternative instructions based) callout hooks to the
>>>>> contended ticket lock and the ticket unlock paths, to allow hypervisor
>>>>> specific code to be used for reducing/eliminating the bad effects
>>>>> ticket locks have on performance when running virtualized.
>>>>>
>>>> Uhm, I'd much rather see a single alternative implementation, not a
>>>> per-hypervisor lock implementation.
>>>>
>>> How would you imagine this working? I can't see how the mechanism
>>> could be hypervisor agnostic. Just look at the Xen implementation
>>> (patch 2) - do you really see room for meaningful abstraction there?
>>>
>> I tried not to, it made my eyes bleed..
>>
>> But from what I hear all virt people are suffering from spinlocks (and
>> fair spinlocks in particular), so I was thinking it'd be a good idea to
>> get all interested parties to collaborate on one. Fragmentation like
>> this hardly ever works out well.
>>
>
> The fastpath of the spinlocks can be common, but if it ends up spinning
> too long (however that might be defined), then it needs to call out to a
> hypervisor-specific piece of code which is effectively "yield this vcpu
> until it's worth trying again". In Xen we can set up an event channel
> that the waiting CPU can block on, and the current lock holder can
> tickle it when it releases the lock (ideally it would just tickle the
> CPU with the next ticket, but that's a further refinement).

It does tickle just the new owner - that's what the list is for.
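
A hypothetical single-threaded simulation of that tickle step
(illustrative names, not the actual xen.c code): each waiting vCPU
records the ticket it is spinning on, and the unlock path wakes only
the entry whose ticket matches the new owner.

```c
#define MAX_WAITERS 8

/* One record per vCPU currently blocked on this lock. */
struct waiter {
	unsigned int ticket;	/* ticket this vCPU is waiting for */
	int tickled;		/* set once the unlocker wakes it */
};

/* Entries are appended, so scanning from the tail visits the most
 * recently added waiter first (LIFO order, as in the patch). */
static struct waiter waiters[MAX_WAITERS];
static int nwaiters;

static void register_waiter(unsigned int ticket)
{
	waiters[nwaiters].ticket = ticket;
	waiters[nwaiters].tickled = 0;
	nwaiters++;
}

/* Unlock path: wake only the vCPU holding the next ticket, instead
 * of tickling every waiter. */
static void tickle_next_owner(unsigned int next_ticket)
{
	for (int i = nwaiters - 1; i >= 0; i--) {
		if (waiters[i].ticket == next_ticket) {
			waiters[i].tickled = 1;
			return;
		}
	}
	/* No match: the next owner hasn't blocked (yet). */
}
```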

Jan

From: Jan Beulich
>>> On 30.06.10 at 13:48, Peter Zijlstra <peterz(a)infradead.org> wrote:
> On Wed, 2010-06-30 at 12:43 +0100, Jan Beulich wrote:
>
>> It does tickle just the new owner - that's what the list is for.
>
> But if you have a FIFO list you don't need the ticket stuff and can
> implement a FIFO lock instead.

The list is LIFO, not FIFO: only the most recently added entry is a
candidate for needing wakeup (as long as there's no interrupt
re-enabling in irqsave lock paths), and it is only used for tickling,
not for deciding who's going to be the next owner.

Jan

From: Jan Beulich
>>> On 30.06.10 at 12:50, Jeremy Fitzhardinge <jeremy(a)goop.org> wrote:
> On 06/30/2010 11:11 AM, Peter Zijlstra wrote:
>>>> Uhm, I'd much rather see a single alternative implementation, not a
>>>> per-hypervisor lock implementation.
>>>>
>>> How would you imagine this working? I can't see how the mechanism
>>> could be hypervisor agnostic. Just look at the Xen implementation
>>> (patch 2) - do you really see room for meaningful abstraction there?
>>>
>> I tried not to, it made my eyes bleed..
>>
>> But from what I hear all virt people are suffering from spinlocks (and
>> fair spinlocks in particular), so I was thinking it'd be a good idea to
>> get all interested parties to collaborate on one. Fragmentation like
>> this hardly ever works out well.
>>
>
> Yes. Now that I've looked at it a bit more closely I think these
> patches put way too much logic into the per-hypervisor part of the code.

I fail to see that: depending on the hypervisor's capabilities, the
two main functions could be much smaller (in some cases there might
not even be a need for the unlock hook), and hence I continue to
think that all of the code in xen.c is indeed non-generic (though I
won't rule out that a second hypervisor's code might look almost
identical).

>> Ah, right, after looking a bit more at patch 2 I see you indeed
>> implement a ticket like lock. Although why you need both a ticket and a
>> FIFO list is beyond me.
>>
>
> That appears to be a mechanism to allow it to take interrupts while
> spinning on the lock, which is something that stock ticket locks don't
> allow. If that's a useful thing to do, it should happen in the generic
> ticketlock code rather than in the per-hypervisor backend (otherwise we
> end up with all kinds of subtle differences in lock behaviour depending
> on the exact environment, which is just going to be messy). Even if
> interrupts-while-spinning isn't useful on native hardware, it is going
> to be equally applicable to all virtual environments.

While we do interrupt re-enabling in our pv kernels, I intentionally
didn't do this here - it complicates the code quite a bit further, and
that didn't seem right for an initial submission.

The list really is just needed to avoid pointlessly tickling CPUs
that won't own the just-released lock next anyway (or would own
it, but meanwhile went for another lock where they also decided
to go into polling mode).

Jan

From: Jan Beulich
>>> On 30.06.10 at 14:53, Jeremy Fitzhardinge <jeremy(a)goop.org> wrote:
> On 06/30/2010 01:52 PM, Jan Beulich wrote:
>> I fail to see that: Depending on the hypervisor's capabilities, the
>> two main functions could be much smaller (potentially there wouldn't
>> even be a need for the unlock hook in some cases),
>
> What mechanism are you envisaging in that case?

A simple yield is better than not doing anything at all.

>> The list really is just needed to avoid pointlessly tickling CPUs
>> that won't own the just-released lock next anyway (or would own
>> it, but meanwhile went for another one where they also decided
>> to go into polling mode).
>
> Did you measure that it was a particularly common case which was worth
> optimising for?

I didn't measure this particular case. But since the main problem
with ticket locks arises when (host) CPUs are overcommitted, it
certainly is a bad idea to create even more load on the host than
there already is (all the more so since these wakeups come in bursts).

Jan
