Prev: [PATCH] x86: Fix vsyscall on gcc 4.5 with -Os
Next: [PATCH 10/12] scsi: megaraid_sas - Add input parameter for max_sectors
From: Oleg Nesterov on 21 Jun 2010 14:10 On 06/21, Oleg Nesterov wrote: > > On 06/21, Oleg Nesterov wrote: > > > > So, I am thinking about the first attempt > > > > #define while_each_thread(g, t) \ > > while ((t = next_thread(t)) != g && pid_alive(g)) > > > > again. But this means while_each_thread() can miss more threads > > than it currently can under the same conditions. Correct, but > > not good. > > Not good, but correct ;) Probably it makes sense to fix the problem > anyway, then think about the more optimal fix. > > static inline struct task_struct * > next_thread_careful(const struct task_struct *g, const struct task_struct *t) > { > t = next_thread(t); > /* > * this pairs with the implicit barrier between detach_pid() > * and list_del_rcu(g->thread_group) in __unhash_process(g). > */ > smp_rmb(); > if (likely(pid_alive(g))) > return t; > else > return g; > } > > #define while_each_thread(g, t) \ > while ((t = next_thread_careful(t)) != g) > > I think this should work. detach_pid() does unlock + lock at least > once and thus we have the barrier (this worth a comment or we > can add the explicit wmb() in __unhash_process). > > Paul, Roland, do you see any problems from the correctness pov, > or a better fix for now? > > Perhaps it also makes sense to keep the old variant renamed to > while_each_thread_locked(), I dunno. Well. but current_is_single_threaded() and zap_threads() have to use next_thread() or while_each_thread_locked() in this case... Oleg. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Roland McGrath on 21 Jun 2010 15:10 > Paul, Roland, do you see any problems from the correctness pov, > or a better fix for now? > > Perhaps it also makes sense to keep the old variant renamed to > while_each_thread_locked(), I dunno. Did we verify that only de_thread() can create the situation where a while_each_thread-style loop without either lock can be confused? If that's so, then just changing it to avoid the situation seems like it would be less invasive overall. Thanks, Roland -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Oleg Nesterov on 21 Jun 2010 16:10 On 06/21, Roland McGrath wrote: > > > Paul, Roland, do you see any problems from the correctness pov, > > or a better fix for now? > > > > Perhaps it also makes sense to keep the old variant renamed to > > while_each_thread_locked(), I dunno. > > Did we verify that only de_thread() can create the situation where a > while_each_thread-style loop without either lock can be confused? I think yes, this is is the only case. I mean, while_each_thread(group_leader, t). If g != group_leader, then the lockless while_each_thread() has problems with the plain exit(g). Afaics. The more I think about this, the more I feel confused ;) But if we start from ->group_leader, then while_each_thread() must stop eventually. Otherwise we should assume that the dead (unhashed) tasks can create the circular list, obviously this is not possible. > If > that's so, then just changing it to avoid the situation seems like it > would be less invasive overall. How? We should change ->group_leader uner write_lock_irq(tasklist), synchronize_rcu() is not an option. We can't do call_rcu(release_task), we can't take tasklist for writing in the softirq context. But even if we could, this can't help in fact or I missed something. Oleg. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Eric W. Biederman on 21 Jun 2010 17:30 Oleg Nesterov <oleg(a)redhat.com> writes: > On 06/21, Roland McGrath wrote: >> >> > Paul, Roland, do you see any problems from the correctness pov, >> > or a better fix for now? >> > >> > Perhaps it also makes sense to keep the old variant renamed to >> > while_each_thread_locked(), I dunno. >> >> Did we verify that only de_thread() can create the situation where a >> while_each_thread-style loop without either lock can be confused? > > I think yes, this is is the only case. > > I mean, while_each_thread(group_leader, t). If g != group_leader, then > the lockless while_each_thread() has problems with the plain exit(g). > > Afaics. The more I think about this, the more I feel confused ;) > > But if we start from ->group_leader, then while_each_thread() must > stop eventually. Otherwise we should assume that the dead (unhashed) > tasks can create the circular list, obviously this is not possible. > >> If >> that's so, then just changing it to avoid the situation seems like it >> would be less invasive overall. > > How? We should change ->group_leader uner write_lock_irq(tasklist), > synchronize_rcu() is not an option. We can't do call_rcu(release_task), > we can't take tasklist for writing in the softirq context. But even > if we could, this can't help in fact or I missed something. We already do: call_rcu(&p->rcu, delayed_put_task_struct); in release_task. We don't call release_task until after we have removed it as leader and dropped the write lock. At first glance it sounds like the group leader is safe as a stopping point for a rcu while_each_thread, and I expect the fact that de_thread takes everything down to a single thread, could have nice properties here. If pid_alive were only to fail on the group leader when de_thread is called I think we could legitimately say that an event we won't worry about. It is close enough to a new thread being created anyway. Eric -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Oleg Nesterov on 22 Jun 2010 10:40
On 06/21, Eric W. Biederman wrote: > > Oleg Nesterov <oleg(a)redhat.com> writes: > > >> If > >> that's so, then just changing it to avoid the situation seems like it > >> would be less invasive overall. > > > > How? We should change ->group_leader uner write_lock_irq(tasklist), > > synchronize_rcu() is not an option. We can't do call_rcu(release_task), > > we can't take tasklist for writing in the softirq context. But even > > if we could, this can't help in fact or I missed something. > > We already do: call_rcu(&p->rcu, delayed_put_task_struct); in release_task. > We don't call release_task until after we have removed it as leader and > dropped the write lock. Yes. I meant that while this is needed to ensure that next_thread/etc returns the rcu-safe data, this (or more rcu_call's) can help to fix while_each_thread. I think I was unclear. de_thread() changes ->group_leader, but this does not matter at all. I mentioned this only because we discussed the possibility to check ->group_leader in while_each_thread. What does matter is the single line in __unhash_process() list_del_rcu(&p->thread_group); this breaks while_each_thread(). IOW. Why list_for_each_rcu(head) actually works? It works because this head itself can't be removed from list. while_each_thread(g, t) is almost equal to list_for_each_rcu() and it can only work if g can't be removed from list (OK, strictly speaking until other sub-threads go away, but this doesn't matter). However, g can be removed if a) it is not ->group_leader and exits, or b) its subthread execs. > At first glance it sounds like the group leader is safe as a stopping > point for a rcu while_each_thread, and I expect the fact that > de_thread takes everything down to a single thread, could have nice > properties here. If pid_alive were only to fail on the group leader > when de_thread is called I think we could legitimately say that an event > we won't worry about. It is close enough to a new thread being created > anyway. Not sure I understand exactly... I mean, I don't understand whether you agree or not with the fix which adds pid_alive() check into next_thread_careful(). Oleg. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ |