From: Oleg Nesterov on
On 06/20, Eric W. Biederman wrote:
>
> Unsharing of the pid namespace unlike unsharing of other namespaces
> does not take affect immediately. Instead it affects the children
> created with fork and clone.

Cough. It is too late to me to even try to understand the changelog.

Instead I tried to quickly read the patch. Most probably I missed
somthing, but still I'd like to ask the quiestion.

So. If I understand correctly, the patch is simple:

- unshare(CLONE_NEWPID) changes current->proxy->pid_ns,
but do not change current->pids[] and thus it doesn't
change task_active_pid_ns().

- since copy_process() uses ->proxy->pid_ns for alloc_pid()
the new children will fall into the new ns.

IOW, the caller becomes the "swapper" for the new namespace.

Correct?

If yes, I'm afraid nobody except you will understand this magic ;)

But what if the task T does unshare(CLONE_NEWPID) and then, say,
pthread_create() ? Unless I missed something, the new thread won't
be able to see T ?

OK, suppose it does fork() after unshare(), then another fork().
In this case the second child lives in the same namespace with
init created by the 1st fork, but it is not descendant ? This means
in particular that if the new init exits, zap_pid_ns_processes()->
do_wait() can't work.

I hope I missed something, this all is too subtle for me. And I
still do not understand 4/6 which adds ns->dead.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Oleg Nesterov on
On 06/20, Oleg Nesterov wrote:
>
> On 06/20, Eric W. Biederman wrote:
> >
> > Unsharing of the pid namespace unlike unsharing of other namespaces
> > does not take affect immediately. Instead it affects the children
> > created with fork and clone.
>
> Cough. It is too late to me to even try to understand the changelog.
>
> Instead I tried to quickly read the patch. Most probably I missed
> somthing, but still I'd like to ask the quiestion.
>
> So. If I understand correctly, the patch is simple:
>
> - unshare(CLONE_NEWPID) changes current->proxy->pid_ns,
> but do not change current->pids[] and thus it doesn't
> change task_active_pid_ns().
>
> - since copy_process() uses ->proxy->pid_ns for alloc_pid()
> the new children will fall into the new ns.
>
> IOW, the caller becomes the "swapper" for the new namespace.
>
> Correct?
>
> If yes, I'm afraid nobody except you will understand this magic ;)
>
> But what if the task T does unshare(CLONE_NEWPID) and then, say,
> pthread_create() ? Unless I missed something, the new thread won't
> be able to see T ?

and, in this case the exiting sub-namespace init also kills its
parent?

> OK, suppose it does fork() after unshare(), then another fork().
> In this case the second child lives in the same namespace with
> init created by the 1st fork, but it is not descendant ? This means
> in particular that if the new init exits, zap_pid_ns_processes()->
> do_wait() can't work.
>
> I hope I missed something, this all is too subtle for me. And I
> still do not understand 4/6 which adds ns->dead.

And, forgot to mention. With this change proc_flush_task()->mntput()
becomes even more wrong.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Eric W. Biederman on
Oleg Nesterov <oleg(a)redhat.com> writes:

> On 06/20, Eric W. Biederman wrote:
>>
>> Unsharing of the pid namespace unlike unsharing of other namespaces
>> does not take affect immediately. Instead it affects the children
>> created with fork and clone.
>
> Cough. It is too late to me to even try to understand the changelog.
>
> Instead I tried to quickly read the patch. Most probably I missed
> somthing, but still I'd like to ask the quiestion.
>
> So. If I understand correctly, the patch is simple:
>
> - unshare(CLONE_NEWPID) changes current->proxy->pid_ns,
> but do not change current->pids[] and thus it doesn't
> change task_active_pid_ns().
>
> - since copy_process() uses ->proxy->pid_ns for alloc_pid()
> the new children will fall into the new ns.
>
> IOW, the caller becomes the "swapper" for the new namespace.
>
> Correct?

Roughly. The caller is not in the pid namespace so shows up as pid 0.

> If yes, I'm afraid nobody except you will understand this magic ;)
>
> But what if the task T does unshare(CLONE_NEWPID) and then, say,
> pthread_create() ? Unless I missed something, the new thread won't
> be able to see T ?

Good question. I need to go back and look at that.

> OK, suppose it does fork() after unshare(), then another fork().
> In this case the second child lives in the same namespace with
> init created by the 1st fork, but it is not descendant ? This means
> in particular that if the new init exits, zap_pid_ns_processes()->
> do_wait() can't work.

do_wait() can't work and I missed that dependency the first time
around. Having looked at my earlier bug report from Daniel when
I was playing with this patchset earlier it is clear that he was
triggering the proc_mnt race with such a process.

So except for ptrace I don't think the proc_mnt problem is possible
to trigger in the current code.

> I hope I missed something, this all is too subtle for me. And I
> still do not understand 4/6 which adds ns->dead.

ns->dead is just a flag to say no more processes in the pid namespace.
Which means an unshare into the pid namespace after zap_pid_ns_processes
has been called will fail().

Eric



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/