From: Thomas Gleixner on
On Tue, 6 Apr 2010, Avi Kivity wrote:

> On 04/06/2010 06:28 PM, Darren Hart wrote:
> > Alan Cox wrote:
> > > On Tue, 06 Apr 2010 15:35:31 +0200
> > > Peter Zijlstra <peterz(a)infradead.org> wrote:
> > >
> > > > On Tue, 2010-04-06 at 16:28 +0300, Avi Kivity wrote:
> > > > > Yes, but that's the best case for spinning. You could simply use a
> > > > > userspace spinlock in this case.
> > > > Userspace spinlocks are evil.. they should _never_ be used.
> > >
> > > That's a gross and inaccurate simplification. For the case Avi is talking
> > > about spinning in userspace makes sense in a lot of environments. Once
> > > you've got one thread pinned per cpu (or gang scheduling >-) ) there are
> > > various environments where it makes complete and utter sense.
> >
> > Hi Alan,
> >
> > Do you feel some of these situations would also benefit from some kernel
> > assistance to stop spinning when the owner schedules out? Or are you saying
> > that there are situations where pure userspace spinlocks will always be the
> > best option?
> >
> > If the latter, I'd think that they would also be situations where
> > sched_yield() is not used as part of the spin loop. If so, then these are
> > not our target situations for FUTEX_LOCK_ADAPTIVE, which hopes to provide a
> > better informed mechanism for making spin or sleep decisions. If sleeping
> > isn't part of the locking construct implementation, then FUTEX_LOCK_ADAPTIVE
> > doesn't have much to offer.
>
> IMO the best solution is to spin in userspace while the lock holder is
> running, fall into the kernel when it is scheduled out.

That's just not realistic as user space has no idea whether the lock
holder is running or not and when it's scheduled out without a syscall :)

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Avi Kivity on
On 04/06/2010 05:09 PM, Peter Zijlstra wrote:
> On Tue, 2010-04-06 at 16:41 +0300, Avi Kivity wrote:
>
>> On 04/06/2010 04:35 PM, Peter Zijlstra wrote:
>>
>>> On Tue, 2010-04-06 at 16:28 +0300, Avi Kivity wrote:
>>>
>>>
>>>> Yes, but that's the best case for spinning. You could simply use a
>>>> userspace spinlock in this case.
>>>>
>>>>
>>> Userspace spinlocks are evil.. they should _never_ be used.
>>>
>>>
>> But in this case they're fastest. If we don't provide a non-evil
>> alternative, people will use them.
>>
>>
> That's what FUTEX_LOCK is about.
>

That works for the uncontended case. For the contended case, the waiter
and the owner have to go into the kernel and back out to transfer
ownership. In the non-adaptive case you have to switch to the idle task
and back as well, and send an IPI. That's a lot of latency if the
unlock happened just after the waiter started the descent into the kernel.

--
error compiling committee.c: too many arguments to function

From: Avi Kivity on
On 04/06/2010 07:14 PM, Thomas Gleixner wrote:
>
>> IMO the best solution is to spin in userspace while the lock holder is
>> running, fall into the kernel when it is scheduled out.
>>
> That's just not realistic as user space has no idea whether the lock
> holder is running or not and when it's scheduled out without a syscall :)
>

The kernel could easily expose this information by writing into the
thread's TLS area.

So:

- the kernel maintains a current_cpu field in a thread's tls
- lock() atomically writes a pointer to the current thread's current_cpu
when acquiring
- the kernel writes an invalid value to current_cpu when switching out
- a contended lock() retrieves the current_cpu pointer, and spins as
long as it is a valid cpu
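
A minimal user-space sketch of the four steps above, under loud assumptions: no such kernel interface exists in this thread, so the scheduler's writes are simulated by sim_switch_in()/sim_switch_out(), and every name here (thread_state, hint_lock, CPU_NONE, hint_lock_spin) is hypothetical. It only illustrates the spin-or-sleep decision; the kernel fallback itself is left to the caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define CPU_NONE (-1)	/* hypothetical "not on any CPU" marker */

/* Hypothetical per-thread slot the kernel would maintain in TLS. */
struct thread_state {
	_Atomic int current_cpu;
};

/* Lock word: NULL when free, else a pointer to the owner's slot. */
struct hint_lock {
	_Atomic(struct thread_state *) owner;
};

/* Stand-ins for what the scheduler would write on a context switch. */
static void sim_switch_in(struct thread_state *t, int cpu)
{
	atomic_store(&t->current_cpu, cpu);
}

static void sim_switch_out(struct thread_state *t)
{
	atomic_store(&t->current_cpu, CPU_NONE);
}

/* Acquire by atomically publishing a pointer to our own slot. */
static bool hint_trylock(struct hint_lock *l, struct thread_state *self)
{
	struct thread_state *expected = NULL;

	return atomic_compare_exchange_strong(&l->owner, &expected, self);
}

static void hint_unlock(struct hint_lock *l)
{
	atomic_store(&l->owner, NULL);
}

enum lock_result { LOCK_ACQUIRED, LOCK_SHOULD_SLEEP };

/*
 * Spin while the owner is on a CPU. Returns LOCK_SHOULD_SLEEP when the
 * owner has been switched out, i.e. when the caller should fall into
 * the kernel. Assumes the owner's slot outlives the lock hold; a real
 * design would have to guarantee that, and would add a pause
 * instruction inside the loop.
 */
static enum lock_result hint_lock_spin(struct hint_lock *l,
				       struct thread_state *self)
{
	for (;;) {
		struct thread_state *owner;

		if (hint_trylock(l, self))
			return LOCK_ACQUIRED;
		owner = atomic_load(&l->owner);
		if (owner == NULL)
			continue;	/* released in between: retry */
		if (atomic_load(&owner->current_cpu) == CPU_NONE)
			return LOCK_SHOULD_SLEEP;
	}
}

/* Single-threaded walk through the three interesting cases. */
static void hint_lock_demo(void)
{
	struct thread_state a = { 0 }, b = { 0 };
	struct hint_lock l = { NULL };

	sim_switch_in(&a, 0);
	sim_switch_in(&b, 1);
	assert(hint_lock_spin(&l, &a) == LOCK_ACQUIRED);
	sim_switch_out(&a);	/* owner preempted while holding the lock */
	assert(hint_lock_spin(&l, &b) == LOCK_SHOULD_SLEEP);
	hint_unlock(&l);
	assert(hint_lock_spin(&l, &b) == LOCK_ACQUIRED);
}
```

The lock word stores a pointer rather than a cpu number so that a waiter always reads the owner's live state; whether the extra dereference is acceptable on the contended path is a design question the sketch does not settle.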

--
error compiling committee.c: too many arguments to function

From: Alan Cox on
> Do you feel some of these situations would also benefit from some kernel
> assistance to stop spinning when the owner schedules out? Or are you
> saying that there are situations where pure userspace spinlocks will
> always be the best option?

There are cases where it's the best option - you are assuming, for example,
that the owner can get scheduled out. E.g. nailing one thread per CPU in some
specialist high performance situations means they can't.

> If the latter, I'd think that they would also be situations where
> sched_yield() is not used as part of the spin loop. If so, then these
> are not our target situations for FUTEX_LOCK_ADAPTIVE, which hopes to
> provide a better informed mechanism for making spin or sleep decisions.
> If sleeping isn't part of the locking construct implementation, then
> FUTEX_LOCK_ADAPTIVE doesn't have much to offer.

I am unsure about the approach. As Avi says, knowing that the lock owner is
scheduled out allows for far better behaviour. It doesn't need complex
per lock stuff or per lock notifier entries on pre-empt either.

A given task is either pre-empted or not and in the normal case of things
you need this within a process so you've got shared pages anyway. So you
only need one instance of the 'is thread X pre-empted' bit somewhere in a
non swappable page.

That gives you something along the lines of

	runaddr = find_run_flag(lock);
	do {
		while (*runaddr == RUNNING) {
			if (trylock(lock))
				return WHOOPEE;
			cpu_relax();
		}
		/* owner is not on a CPU: give ours up, ideally to the owner */
		yield();	/* or yield_on(thread) */
	} while (*runaddr != DEAD);


which, unlike blindly spinning, can avoid the worst of any hit on the CPU
power and would be a bit more guided?
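
For reference, a compilable rendering of the loop above, as a sketch only. Assumptions: the run flag is a plain atomic int that something else (the kernel, in the proposal) keeps up to date; how a waiter finds the owner's flag is not specified in the mail, so the caller passes it in; sched_yield() stands in for a directed yield; and cpu_relax() is a portable placeholder rather than a real pause instruction. The names guided_lock, guided_trylock and guided_unlock are made up for illustration.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical values of the per-thread run flag. */
enum run_state { RUNNING, PREEMPTED, DEAD };

/* Placeholder: a real build would use the architecture's pause hint. */
static inline void cpu_relax(void)
{
	atomic_signal_fence(memory_order_seq_cst);
}

static bool guided_trylock(atomic_flag *held)
{
	return !atomic_flag_test_and_set_explicit(held, memory_order_acquire);
}

static void guided_unlock(atomic_flag *held)
{
	atomic_flag_clear_explicit(held, memory_order_release);
}

/*
 * Spin only while the owner is on a CPU; otherwise yield. Returns
 * false if the owner died while holding the lock. As in the loop
 * above, a preempted owner that never runs again keeps the waiter
 * yielding forever.
 */
static bool guided_lock(atomic_flag *held, _Atomic int *runaddr)
{
	do {
		while (atomic_load(runaddr) == RUNNING) {
			if (guided_trylock(held))
				return true;
			cpu_relax();
		}
		sched_yield();
	} while (atomic_load(runaddr) != DEAD);

	return false;
}

/* Single-threaded walk through the terminating cases. */
static void guided_lock_demo(void)
{
	atomic_flag held = ATOMIC_FLAG_INIT;
	_Atomic int owner = RUNNING;

	assert(guided_lock(&held, &owner));	/* lock was free */
	atomic_store(&owner, DEAD);
	assert(!guided_lock(&held, &owner));	/* held by a dead owner */
	guided_unlock(&held);
	atomic_store(&owner, RUNNING);
	assert(guided_lock(&held, &owner));
}
```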

From: Alan Cox on
> > IMO the best solution is to spin in userspace while the lock holder is
> > running, fall into the kernel when it is scheduled out.
>
> That's just not realistic as user space has no idea whether the lock
> holder is running or not and when it's scheduled out without a syscall :)

Which is the real problem that wants addressing and can be addressed very
cheaply. That would bring us up to par with 1970s RTOS environments ;)

Alan