From: Chris Friesen on
On 06/24/2010 12:07 PM, Paul E. McKenney wrote:

> 3. The thread-group leader might do pthread_exit(), removing itself
> from the thread group -- and might do so while the hapless reader
> is referencing that thread.
>
> But isn't this prohibited? Or is it really legal to do a
> pthread_create() to create a new thread and then have the
> parent thread call pthread_exit()? Not something I would
> consider trying in my own code! Well, I might, just to
> be perverse, but... ;-)

I believe SUS allows the main thread to explicitly call pthread_exit(),
leaving the other threads to run. If the main() routine just returns
then it implicitly calls exit().
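
A minimal userspace sketch of that behavior, assuming a POSIX system with pthreads. The helper name worker_survives_main_exit() and the pipe-based reporting are invented for this illustration. It forks a child so that the child's initial thread can call pthread_exit() without ending the caller:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static int report_fd = -1;	/* write end of the pipe, set in the child */

/* Secondary thread: outlives the initial thread, then reports back. */
static void *worker(void *arg)
{
	struct timespec delay = { 0, 50 * 1000 * 1000 };	/* 50 ms */
	char mark = 'W';

	(void)arg;
	nanosleep(&delay, NULL);	/* let the initial thread exit first */
	if (write(report_fd, &mark, 1) != 1)
		_exit(3);
	return NULL;	/* last thread gone: the process exits with status 0 */
}

/*
 * Returns 1 if the worker still ran after the child's initial thread
 * called pthread_exit(), 0 if it did not, and -1 on setup failure.
 */
int worker_survives_main_exit(void)
{
	int fds[2], status = 0;
	char got = 0;
	ssize_t n;
	pid_t pid;

	if (pipe(fds) != 0)
		return -1;
	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		pthread_t tid;

		close(fds[0]);
		report_fd = fds[1];
		if (pthread_create(&tid, NULL, worker, NULL) != 0)
			_exit(2);
		pthread_exit(NULL);	/* ends only this thread, not the process */
	}
	assert(pid > 0);	/* parent from here on */
	close(fds[1]);
	n = read(fds[0], &got, 1);
	close(fds[0]);
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return n == 1 && got == 'W' && WIFEXITED(status) &&
	       WEXITSTATUS(status) == 0;
}
```

Had the child's initial thread returned from main() or called exit() instead, the worker would normally have been killed before it could write to the pipe.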

Chris

--
Chris Friesen
Software Developer
GENBAND
chris.friesen(a)genband.com
www.genband.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Roland McGrath on
> First, what "bad things" can happen to a reader scanning a thread
> group?
>
> 1. The thread-group leader might do exec(), destroying the old
> list and forming a new one. In this case, we want any readers
> to stop scanning.

This doesn't do anything different (for these concerns) from just all the
other threads happening to exit right before the exec. There is no
"destroying the old" and "forming the new", it's just that all the other
threads are convinced to die now. There is no problem here.

> 2. Some other thread might do exec(), destroying the old list and
> forming a new one. In this case, we also want any readers to
> stop scanning.

Again, the list is not really destroyed, just everybody dies. What is
different here is that ->group_leader changes. This is the only time
that ever happens. Moreover, it's the only time that a task that was
previously pointed to by any ->group_leader can be reaped before the
rest of the group has already been reaped first (and thus the
thread_group made a singleton).

> 3. The thread-group leader might do pthread_exit(), removing itself
> from the thread group -- and might do so while the hapless reader
> is referencing that thread.

This is called the delay_group_leader() case. It doesn't happen in a
way that has the problems you are concerned with. The group_leader
remains in EXIT_ZOMBIE state and can't be reaped until all the other
threads have been reaped. There is no time at which any thread in the
group is in any hashes or accessible by any means after the (final)
group_leader is reaped.
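
A toy userspace model of that rule may help. The helper names mirror kernel helpers as I understand them (delay_group_leader() being "is the leader, and the group is not empty"), but struct task, struct group, the unreaped counter and try_reap() are simplifications invented here, not kernel code:

```c
#include <assert.h>
#include <stdbool.h>

struct task;

struct group {
	struct task *leader;
	int unreaped;		/* threads not yet reaped, leader included */
};

struct task {
	struct group *grp;
};

static bool thread_group_leader(const struct task *p)
{
	return p->grp->leader == p;
}

static bool thread_group_empty(const struct task *p)
{
	return p->grp->unreaped == 1;	/* nobody left but p itself */
}

/* A dead leader must stay a zombie while other threads remain. */
bool delay_group_leader(const struct task *p)
{
	return thread_group_leader(p) && !thread_group_empty(p);
}

/* Hypothetical reaper: refuses to reap a leader that must be delayed. */
bool try_reap(struct task *p)
{
	if (delay_group_leader(p))
		return false;
	p->grp->unreaped--;
	return true;
}

/* Returns how many times reaping the leader was refused. */
int demo_reap_order(void)
{
	struct group grp;
	struct task leader = { &grp }, a = { &grp }, b = { &grp };
	int refused = 0;

	grp.leader = &leader;
	grp.unreaped = 3;

	refused += !try_reap(&leader);	/* a and b remain: refused */
	try_reap(&a);
	refused += !try_reap(&leader);	/* b remains: refused */
	try_reap(&b);
	refused += !try_reap(&leader);	/* group is empty now: reaped */
	assert(grp.unreaped == 0);
	return refused;
}
```

In this model the leader is always the last one reaped, so a reader holding the leader pointer never sees it unlinked while other threads are still reachable.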

> 4. Some other thread might do pthread_exit(), removing itself
> from the thread group, and again might do so while the hapless
> reader is referencing that thread. In this case, we want
> the hapless reader to continue scanning the remainder of the
> thread group.

This is the most normal case (and #1 is effectively just this repeated
by every thread in parallel).

> 5. The thread-group leader might do exit(), destroying the old
> list without forming a new one. In this case, we want any
> readers to stop scanning.

All this means is everybody is convinced to die, and the group_leader
dies too. It is not discernibly different from #6.

> 6. Some other thread might do exit(), destroying the old list
> without forming a new one. In this case, we also want any
> readers to stop scanning.

This just means everybody is convinced to die and is not materially
different from each individual thread all happening to die at the same
time.

You've described all these cases as "we want any readers to stop
scanning". That is far too strong, and sounds like some kind of
guaranteed synchronization, which does not and need not exist. Any
reader that needs a dead thread to be off the list holds siglock
and/or tasklist_lock. For the casual readers that only use
rcu_read_lock, we only "want any readers' loops eventually to
terminate and never to dereference stale pointers". That's why
normal RCU listiness is generally fine.

The only problem we have is in #2. This is only a problem because
readers' loops may be using the old ->group_leader pointer as the
anchor for their circular-list round-robin loop. Once the former
leader is removed from the list, that loop termination condition can
never be met.
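
The failure mode can be reproduced in a small userspace model. This is a sketch, not kernel code: struct node, ring_del() and the step cap are made up for illustration, and ring_del() only imitates the property that matters here, namely that an unlinked node's neighbours skip it while readers keep following next pointers:

```c
#include <assert.h>
#include <stddef.h>

struct node {
	int pid;
	struct node *next;
	struct node *prev;
};

/* Link nodes[0..n-1] into a circular doubly linked list. */
static void ring_init(struct node *nodes, int n)
{
	assert(n > 0);
	for (int i = 0; i < n; i++) {
		nodes[i].pid = 100 + i;
		nodes[i].next = &nodes[(i + 1) % n];
		nodes[i].prev = &nodes[(i + n - 1) % n];
	}
}

/* Unlink a node: its neighbours skip it, its own pointers stay intact. */
static void ring_del(struct node *victim)
{
	victim->prev->next = victim->next;
	victim->next->prev = victim->prev;
}

/*
 * Walk from start until we are back at anchor, the way a reader
 * anchored on the old ->group_leader would. Returns the number of
 * nodes visited, or -1 if max_steps is exhausted first.
 */
int walk_until_anchor(struct node *anchor, struct node *start, int max_steps)
{
	struct node *t = start;
	int visited = 0;

	do {
		visited++;
		if (visited > max_steps)
			return -1;	/* would have looped forever */
		t = t->next;
	} while (t != anchor);
	return visited;
}

/* Four threads; the reader is anchored on nodes[0] and stands on nodes[1]. */
int demo_walk(int remove_anchor)
{
	struct node nodes[4];

	ring_init(nodes, 4);
	if (remove_anchor)
		ring_del(&nodes[0]);	/* former leader leaves the list */
	return walk_until_anchor(&nodes[0], &nodes[1], 1000);
}
```

With the anchor in place the walk visits three nodes and stops. With the anchor removed, the remaining three nodes form a closed ring that never leads back to it, so only the artificial step cap ends the loop.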


Thanks,
Roland
From: Oleg Nesterov on
On 06/24, Paul E. McKenney wrote:
>
> On Wed, Jun 23, 2010 at 05:24:21PM +0200, Oleg Nesterov wrote:
> > It is very possible that I missed something here, my only point is
> > that I think it would be safer to assume nothing about the leaderness.
>
> It is past time that I list out my assumptions more carefully. ;-)
>
> First, what "bad things" can happen to a reader scanning a thread
> group?

(I assume you mean the lockless case)

Currently, the only bad thing is that while_each_thread(g) can loop
forever if we race with exec(), or with exit() if g is not the leader.

And, to simplify, let's consider the same example again:

	t = g;
	do {
		printk("pid %d\n", t->pid);
	} while_each_thread(g, t);


> 1. The thread-group leader might do exec(), destroying the old
> list and forming a new one. In this case, we want any readers
> to stop scanning.

I'd say it is not that we want to stop scanning; rather, it is OK to
stop scanning after we have printed g->pid.

> 2. Some other thread might do exec(), destroying the old list and
> forming a new one. In this case, we also want any readers to
> stop scanning.

The same.

If the code above runs under for_each_process(g), or if it did
"g = find_task_by_pid(tgid)", we will see either the new or the old
leader and print at least its pid.

> 3. The thread-group leader might do pthread_exit(), removing itself
> from the thread group

No. It can exit, but it won't be removed from the thread group. It will
remain a zombie until all sub-threads disappear.

> 4. Some other thread might do pthread_exit(), removing itself
> from the thread group, and again might do so while the hapless
> reader is referencing that thread. In this case, we want
> the hapless reader to continue scanning the remainder of the
> thread group.

Yes.

But, if that thread was used as a starting point g, then

before the patch: loop forever
after the patch: break
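
The "after the patch" behavior can be modeled in userspace. This sketch does not reproduce the patch under discussion: next_careful(), the unlinked flag and struct thr are hypothetical stand-ins, chosen only to show a step function that gives up once the starting point g has left the list:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct thr {
	int pid;
	bool unlinked;		/* set once the thread has left the list */
	struct thr *next;
	struct thr *prev;
};

/* Link t[0..n-1] into a circular doubly linked list. */
static void group_init(struct thr *t, int n)
{
	assert(n > 0);
	for (int i = 0; i < n; i++) {
		t[i].pid = 100 + i;
		t[i].unlinked = false;
		t[i].next = &t[(i + 1) % n];
		t[i].prev = &t[(i + n - 1) % n];
	}
}

/* The thread exits: neighbours skip it, its own pointers stay intact. */
static void thr_exit(struct thr *victim)
{
	victim->prev->next = victim->next;
	victim->next->prev = victim->prev;
	victim->unlinked = true;
}

/* Hypothetical careful step: NULL means "stop scanning". */
static struct thr *next_careful(struct thr *g, struct thr *t)
{
	if (g->unlinked)
		return NULL;	/* the anchor is gone, so break out */
	t = t->next;
	return t == g ? NULL : t;
}

/*
 * Scan a four-thread group starting from a non-leader thread g.
 * If exit_after > 0, g exits right after that many threads have been
 * visited. Returns the number of threads visited.
 */
int scan_with_race(int exit_after)
{
	struct thr group[4];
	struct thr *g = &group[2];
	struct thr *t = g;
	int visited = 0;

	group_init(group, 4);
	do {
		visited++;
		if (visited == exit_after)
			thr_exit(g);	/* race: our starting point exits */
		t = next_careful(g, t);
	} while (t != NULL);
	return visited;
}
```

Without the race all four threads are visited. With g exiting after the second visit, the scan stops early instead of circling the remaining three threads forever.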

> 5. The thread-group leader might do exit(), destroying the old
> list without forming a new one. In this case, we want any
> readers to stop scanning.
>
> 6. Some other thread might do exit(), destroying the old list
> without forming a new one. In this case, we also want any
> readers to stop scanning.

Yes. But again, it is fine to print more pids as long as we know it
is safe to iterate over the exiting thread group. However,
next_thread_careful() can stop earlier than next_thread().
Either way, we can miss none/some/most/all threads if we race with
exit_group().

> Anything else I might be missing?

I think this is all.

Oleg.

From: Oleg Nesterov on
On 06/24, Chris Friesen wrote:
>
> On 06/24/2010 12:07 PM, Paul E. McKenney wrote:
>
> > 3. The thread-group leader might do pthread_exit(), removing itself
> > from the thread group -- and might do so while the hapless reader
> > is referencing that thread.
> >
> > But isn't this prohibited? Or is it really legal to do a
> > pthread_create() to create a new thread and then have the
> > parent thread call pthread_exit()? Not something I would
> > consider trying in my own code! Well, I might, just to
> > be perverse, but... ;-)
>
> I believe SUS allows the main thread to explicitly call pthread_exit(),
> leaving the other threads to run. If the main() routine just returns
> then it implicitly calls exit().

Correct.

But, to clarify, if the main thread does pthread_exit() (sys_exit,
actually), it won't be removed from the group. It will be zombie
until all other threads exit.

Oleg.

From: Eric W. Biederman on
Oleg Nesterov <oleg(a)redhat.com> writes:

> On 06/24, Chris Friesen wrote:
>>
>> On 06/24/2010 12:07 PM, Paul E. McKenney wrote:
>>
>> > 3. The thread-group leader might do pthread_exit(), removing itself
>> > from the thread group -- and might do so while the hapless reader
>> > is referencing that thread.
>> >
>> > But isn't this prohibited? Or is it really legal to do a
>> > pthread_create() to create a new thread and then have the
>> > parent thread call pthread_exit()? Not something I would
>> > consider trying in my own code! Well, I might, just to
>> > be perverse, but... ;-)
>>
>> I believe SUS allows the main thread to explicitly call pthread_exit(),
>> leaving the other threads to run. If the main() routine just returns
>> then it implicitly calls exit().
>
> Correct.
>
> But, to clarify, if the main thread does pthread_exit() (sys_exit,
> actually), it won't be removed from the group. It will be zombie
> until all other threads exit.

That we don't clean up zombie leaders is unfortunate, really; it
means we have the entire de_thread special case. But short of fixing
libpthread to not make bad assumptions, there is little we can do
about it.

I'm only half following this conversation.

If what we are looking for is a stable list_head that won't disappear
on us we should be able to put one in sighand_struct or signal_struct
(I forget which is which at the moment) and have a list_head that
lives for the life of the longest living thread, and that won't get
messed up by things like de_thread, and then next_thread could simply
return NULL when we hit the end of the list.
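
A rough userspace sketch of that idea, under the assumption that the list head lives in a per-group object rather than in any one thread. struct group stands in for whichever of sighand_struct or signal_struct would hold it, and first_thread() and next_thread_or_null() are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>

struct link {
	struct link *next;
	struct link *prev;
};

struct group {			/* stand-in for the shared per-group struct */
	struct link threads;	/* stable head, never a thread itself */
};

struct thr {
	int pid;
	struct link node;
};

static struct thr *thr_of(struct link *l)
{
	return (struct thr *)((char *)l - offsetof(struct thr, node));
}

static void group_init(struct group *grp)
{
	grp->threads.next = grp->threads.prev = &grp->threads;
}

/* Append a thread at the tail of the group's list. */
static void group_add(struct group *grp, struct thr *t)
{
	struct link *head = &grp->threads;

	t->node.prev = head->prev;
	t->node.next = head;
	head->prev->next = &t->node;
	head->prev = &t->node;
}

/* Unlink a thread; its own pointers stay intact for current readers. */
static void group_del(struct thr *t)
{
	t->node.prev->next = t->node.next;
	t->node.next->prev = t->node.prev;
}

static struct thr *first_thread(struct group *grp)
{
	struct link *l = grp->threads.next;

	return l == &grp->threads ? NULL : thr_of(l);
}

/* Reaching the head means the end of the list: return NULL. */
static struct thr *next_thread_or_null(struct group *grp, struct thr *t)
{
	struct link *l = t->node.next;

	return l == &grp->threads ? NULL : thr_of(l);
}

/* Scan three threads; optionally drop the first one in mid-scan. */
int demo_count(int remove_first)
{
	struct group grp;
	struct thr thr[3];
	int visited = 0;

	group_init(&grp);
	assert(first_thread(&grp) == NULL);
	for (int i = 0; i < 3; i++) {
		thr[i].pid = 100 + i;
		group_add(&grp, &thr[i]);
	}
	for (struct thr *t = first_thread(&grp); t != NULL;
	     t = next_thread_or_null(&grp, t)) {
		visited++;
		if (remove_first && t == &thr[1])
			group_del(&thr[0]);	/* old leader vanishes */
	}
	return visited;
}
```

Because the termination test compares against the head rather than against a thread, removing any thread, the former leader included, cannot leave the loop without an exit condition.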

Eric