From: Eric W. Biederman on
Louis Rilling <Louis.Rilling(a)kerlabs.com> writes:

> On 08/07/10 21:39 -0700, Eric W. Biederman wrote:
>>
>> Currently it is possible to put proc_mnt before we have flushed the
>> last process that will use the proc_mnt to flush it's proc entries.
>>
>> This race is fixed by not flushing proc entries for dead pid
>> namespaces, and calling pid_ns_release_proc unconditionally from
>> zap_pid_ns_processes after the pid namespace has been declared dead.
>
> One comment below.
>
>>
>> To ensure we don't unnecessarily leak any dcache entries with skipped
>> flushes pid_ns_release_proc flushes the entire proc_mnt when it is
>> called.
>>
>> Signed-off-by: Eric W. Biederman <ebiederm(a)xmission.com>
>> ---
>> fs/proc/base.c | 9 +++++----
>> fs/proc/root.c | 3 +++
>> kernel/pid_namespace.c | 1 +
>> 3 files changed, 9 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/proc/base.c b/fs/proc/base.c
>> index acb7ef8..e9d84e1 100644
>> --- a/fs/proc/base.c
>> +++ b/fs/proc/base.c
>> @@ -2742,13 +2742,14 @@ void proc_flush_task(struct task_struct *task)
>>
>> for (i = 0; i <= pid->level; i++) {
>> upid = &pid->numbers[i];
>> +
>> + /* Don't bother flushing dead pid namespaces */
>> + if (test_bit(PIDNS_DEAD, &upid->ns->flags))
>> + continue;
>> +
>
> IMHO, nothing prevents zap_pid_ns_processes() from setting PIDNS_DEAD and
> calling pid_ns_release_proc() right now. zap_pid_ns_processes() does not wait
> for EXIT_DEAD (self-reaping) children to be released.

Good point we need something probably a lock to prevent proc_mnt from
going away here. We might do a little better if we were starting with
a specific dentry, those at least have some rcu properties but that isn't
a big help.

Hmm. Perhaps there is a way to completely restructure this flushing
of dentries. It is just an optimization after all so we don't get too many
stale dentries building up.

It might just be worth it simply kill proc_flush_mnt altogether. I know
it is measurable when we don't do the flushing but perhaps there can
be a work struct that periodically wakes up and smacks stale proc dentries.

Right now I really don't think proc_flush_task is worth the hassle it
causes.

Grumble, Grumble more thinking to do.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/