From: Darren Hart on
On 07/13/2010 04:52 AM, Thomas Gleixner wrote:
> On Tue, 13 Jul 2010, Thomas Gleixner wrote:
>>
>> This code causes braindamage. I really wonder whether we need to
>> remove it according to the "United Nations Convention against Torture
>> and Other Cruel, Inhuman or Degrading Treatment or Punishment".
>>
>
> Ok, finally managed to twist my brain around it. Mike, can you give it
> a test ride ?

Since Mike is out I built this version and ran it a few times. I saw a
100% reproduction rate previously. I haven't seen any errors in a
handful of runs (~10) with this patch.

I do still see the hrtimer latency message on the first run, so this is
likely unrelated to the issue we're addressing:

Jul 13 14:47:59 elm9m94 kernel: hrtimer: interrupt took 123924 ns

As for Thomas's changes, only a couple nitpics below:


> @@ -2255,18 +2265,51 @@ static int futex_wait_requeue_pi(u32 __u
> /* Queue the futex_q, drop the hb lock, wait for wakeup. */
> futex_wait_queue_me(hb,&q, to);
>
> - spin_lock(&hb->lock);
> - ret = handle_early_requeue_pi_wakeup(hb,&q,&key2, to);
> - spin_unlock(&hb->lock);
> - if (ret)
> - goto out_put_keys;
> + /*
> + * Avoid races with requeue and trying to block on two mutexes
> + * (hb->lock and uaddr2's rtmutex) by serializing access to
> + * pi_blocked_on with pi_lock.
> + */
> + raw_spin_lock_irq(&current->pi_lock);
> + if (current->pi_blocked_on) {
> + /* Requeue happened already */

This comment isn't quite accurate. The requeue may be in progress, which
means the q.lock_ptr is not trustworthy as noted below. Consider:

/*
* We have been requeued, or are in the process
* of being requeued.
*/

> + raw_spin_unlock_irq(&current->pi_lock);
> + } else {
> + /*
> + * Setting pi_blocked_on to PI_WAKEUP_INPROGRESS
> + * prevents a concurrent requeue from enqueuein us on

s/enqueuein/enqueueing/ not that my dictionary thinks either one of
them are words ;-)

--
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/