From: David Howells on

The attached patch should suffice to fix get_task_cred(), and should render
Jiri's patch unnecessary.

David
---
From: David Howells <dhowells(a)redhat.com>
Subject: [PATCH] CRED: Move get_task_cred() out of line and make it use atomic_inc_not_zero()

It's possible for get_task_cred() as it currently stands to 'corrupt' a set of
credentials by incrementing their usage count after their replacement by the
task being accessed.

What happens is that get_task_cred() engages the RCU read lock, accesses the cred


TASK_1                        TASK_2                        RCU_CLEANER
-->get_task_cred(TASK_2)
rcu_read_lock()
__cred = __task_cred(TASK_2)
                              -->commit_creds()
                              old_cred = TASK_2->real_cred
                              TASK_2->real_cred = ...
                              put_cred(old_cred)
                                call_rcu(old_cred)
        [__cred->usage == 0]
get_cred(__cred)
        [__cred->usage == 1]
rcu_read_unlock()
                                                            -->put_cred_rcu()
                                                            [__cred->usage == 1]
                                                            panic()

However, since a task's credentials are generally not changed very often, we can
reasonably make use of a loop involving reading the creds pointer and using
atomic_inc_not_zero() to attempt to increment it if it hasn't already hit zero.

If successful, we can safely return the credentials in the knowledge that, even
if the task we're accessing has released them, they haven't gone to the RCU
cleanup code.
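
As a rough userspace illustration of the difference between the old
unconditional increment and the retry loop described above, here is a minimal
C11 sketch. It is not kernel code: struct fake_cred, struct fake_task,
inc_not_zero(), get_unconditional() and get_task_cred_sketch() are hypothetical
names invented for this example, and the sketch assumes something else (RCU, in
the kernel) keeps the credentials' memory valid while the loop runs.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cred: only the usage count is modelled. */
struct fake_cred {
	atomic_int usage;
};

/* Hypothetical stand-in for a task holding a pointer to its credentials. */
struct fake_task {
	_Atomic(struct fake_cred *) real_cred;
};

/*
 * Userspace analogue of atomic_inc_not_zero(): take a reference only if
 * the count has not already dropped to zero.  Returns true on success.
 */
static bool inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/* What the old macro effectively did: increment no matter what. */
static void get_unconditional(struct fake_cred *cred)
{
	atomic_fetch_add(&cred->usage, 1);
}

/*
 * Sketch of the new loop: re-read the pointer until a reference can be
 * taken on credentials that are still live.  Assumes the pointed-to
 * memory stays valid for the duration of the loop.
 */
static struct fake_cred *get_task_cred_sketch(struct fake_task *task)
{
	struct fake_cred *cred;

	do {
		cred = atomic_load(&task->real_cred);
	} while (!inc_not_zero(&cred->usage));

	return cred;
}
```

With a count that has already reached zero, get_unconditional() bumps it back
to one (the "resurrection" in the diagram), whereas inc_not_zero() refuses and
leaves it at zero, so the caller goes round the loop and picks up the task's
replacement credentials instead.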

Signed-off-by: David Howells <dhowells(a)redhat.com>
---

include/linux/cred.h | 21 +--------------------
kernel/cred.c | 25 +++++++++++++++++++++++++
2 files changed, 26 insertions(+), 20 deletions(-)


diff --git a/include/linux/cred.h b/include/linux/cred.h
index 75c0fa8..ce40cbc 100644
--- a/include/linux/cred.h
+++ b/include/linux/cred.h
@@ -153,6 +153,7 @@ struct cred {
extern void __put_cred(struct cred *);
extern void exit_creds(struct task_struct *);
extern int copy_creds(struct task_struct *, unsigned long);
+extern const struct cred *get_task_cred(struct task_struct *);
extern struct cred *cred_alloc_blank(void);
extern struct cred *prepare_creds(void);
extern struct cred *prepare_exec_creds(void);
@@ -282,26 +283,6 @@ static inline void put_cred(const struct cred *_cred)
((const struct cred *)(rcu_dereference_check((task)->real_cred, rcu_read_lock_held() || lockdep_tasklist_lock_is_held())))

/**
- * get_task_cred - Get another task's objective credentials
- * @task: The task to query
- *
- * Get the objective credentials of a task, pinning them so that they can't go
- * away. Accessing a task's credentials directly is not permitted.
- *
- * The caller must make sure task doesn't go away, either by holding a ref on
- * task or by holding tasklist_lock to prevent it from being unlinked.
- */
-#define get_task_cred(task) \
-({ \
-	struct cred *__cred; \
-	rcu_read_lock(); \
-	__cred = (struct cred *) __task_cred((task)); \
-	get_cred(__cred); \
-	rcu_read_unlock(); \
-	__cred; \
-})
-
-/**
* get_current_cred - Get the current task's subjective credentials
*
* Get the subjective credentials of the current task, pinning them so that
diff --git a/kernel/cred.c b/kernel/cred.c
index a2d5504..60bc8b1 100644
--- a/kernel/cred.c
+++ b/kernel/cred.c
@@ -209,6 +209,31 @@ void exit_creds(struct task_struct *tsk)
}
}

+/**
+ * get_task_cred - Get another task's objective credentials
+ * @task: The task to query
+ *
+ * Get the objective credentials of a task, pinning them so that they can't go
+ * away. Accessing a task's credentials directly is not permitted.
+ *
+ * The caller must also make sure task doesn't get deleted, either by holding a
+ * ref on task or by holding tasklist_lock to prevent it from being unlinked.
+ */
+const struct cred *get_task_cred(struct task_struct *task)
+{
+	const struct cred *cred;
+
+	rcu_read_lock();
+
+	do {
+		cred = __task_cred((task));
+		BUG_ON(!cred);
+	} while (!atomic_inc_not_zero(&((struct cred *)cred)->usage));
+
+	rcu_read_unlock();
+	return cred;
+}
+
/*
* Allocate blank credentials, such that the credentials can be filled in at a
* later date without risk of ENOMEM.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Jiri Olsa on
On Wed, Jul 28, 2010 at 02:17:27PM +0100, David Howells wrote:
>
> The attached patch should suffice to fix get_task_cred(), and should render
> Jiri's patch unnecessary.
>
> David
> ---
> From: David Howells <dhowells(a)redhat.com>
> Subject: [PATCH] CRED: Move get_task_cred() out of line and make it use atomic_inc_not_zero()
>
> It's possible for get_task_cred() as it currently stands to 'corrupt' a set of
> credentials by incrementing their usage count after their replacement by the
> task being accessed.
>
> What happens is that get_task_cred() engages the RCU read lock, accesses the cred
>
>
> TASK_1                        TASK_2                        RCU_CLEANER
> -->get_task_cred(TASK_2)
> rcu_read_lock()
> __cred = __task_cred(TASK_2)
>                               -->commit_creds()
>                               old_cred = TASK_2->real_cred
>                               TASK_2->real_cred = ...
>                               put_cred(old_cred)
>                                 call_rcu(old_cred)
>         [__cred->usage == 0]
> get_cred(__cred)
>         [__cred->usage == 1]
> rcu_read_unlock()
>                                                             -->put_cred_rcu()
>                                                             [__cred->usage == 1]
>                                                             panic()
>
> However, since a task's credentials are generally not changed very often, we can
> reasonably make use of a loop involving reading the creds pointer and using
> atomic_inc_not_zero() to attempt to increment it if it hasn't already hit zero.
>
> If successful, we can safely return the credentials in the knowledge that, even
> if the task we're accessing has released them, they haven't gone to the RCU
> cleanup code.

looks ok, I changed the task_state to use this and I'm running the
bug reproducer... so far so good ;)

wbr,
jirka
--
From: Linus Torvalds on
On Wed, Jul 28, 2010 at 6:17 AM, David Howells <dhowells(a)redhat.com> wrote:
>
> The attached patch should suffice to fix get_task_cred(), and should render
> Jiri's patch unnecessary.

Ok, I like this one. It seems to make things much simpler, and makes
sense to me.

Let's just hope it works ;) but modulo that, ACK.

Linus
--
From: Paul E. McKenney on
On Wed, Jul 28, 2010 at 01:47:06PM +0100, David Howells wrote:
> David Howells <dhowells(a)redhat.com> wrote:
>
> > Yeah. I think there are three alternatives:
>
> There's a fourth alternative too:
>
> (4) I could try and make it so that if the RCU cleanup routine sees it with a
> non-zero usage count, then it just ignores it. This, however, would
> require call_rcu() to be able to cope with requeueing.

It is perfectly legal for an RCU callback to invoke call_rcu(). However,
this should be used -only- to wait for RCU readers. If there are no
RCU readers, the callback might be re-invoked in very short order,
especially on UP systems.

Or am I misunderstanding what you mean by "require call_rcu() to be
able to cope with requeueing"?

Thanx, Paul
--
From: David Howells on
Paul E. McKenney <paulmck(a)linux.vnet.ibm.com> wrote:

> It is perfectly legal for an RCU callback to invoke call_rcu(). However,
> this should be used -only- to wait for RCU readers. If there are no
> RCU readers, the callback might be re-invoked in very short order,
> especially on UP systems.
>
> Or am I misunderstanding what you mean by "require call_rcu() to be
> able to cope with requeueing"?

I mean for call_rcu() to be called on an object that's already been
call_rcu()'d but not yet processed.

For example if struct cred gets its usage count reduced to 0, __put_cred()
will call_rcu() it, but what happens if someone comes along and resurrects it
by increasing its usage count again? And what happens if the usage count is
reduced back to zero and __put_cred() calls call_rcu() again before
put_cred_rcu() has a chance to run?
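
To make the worry concrete, here is a small userspace sketch of what can go
wrong when one embedded callback head is queued twice before it has been
processed. It is only a simplified model of a singly linked callback queue with
a tail pointer, not the kernel's actual call_rcu() implementation; struct
cb_head, struct cb_list, cb_list_init(), cb_enqueue() and cb_count() are
hypothetical names made up for the example.

```c
#include <stddef.h>

/* Hypothetical callback head, embedded once in each object. */
struct cb_head {
	struct cb_head *next;
};

/* Hypothetical queue: head pointer plus a pointer to the last 'next' slot. */
struct cb_list {
	struct cb_head *head;
	struct cb_head **tail;
};

static void cb_list_init(struct cb_list *l)
{
	l->head = NULL;
	l->tail = &l->head;
}

/* Append 'h' to the queue.  Assumes 'h' is not already queued. */
static void cb_enqueue(struct cb_list *l, struct cb_head *h)
{
	h->next = NULL;		/* clobbers any link 'h' already had */
	*l->tail = h;
	l->tail = &h->next;
}

/* Count reachable entries, giving up at 'limit' to survive a cycle. */
static size_t cb_count(const struct cb_list *l, size_t limit)
{
	size_t n = 0;
	const struct cb_head *p;

	for (p = l->head; p != NULL && n < limit; p = p->next)
		n++;
	return n;
}
```

In this model, queueing a, then b, then a again resets a's next pointer and so
cuts b off from the list head, while queueing a twice in a row makes a point at
itself and the list never terminates. Either way the queue is corrupted, which
is why the object must not be handed over a second time while the first request
is still pending.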

David
--