From: Darren Hart on
On 06/29/2010 01:42 AM, Michal Hocko wrote:
> On Mon 28-06-10 18:49:08, Peter Zijlstra wrote:
>> On Mon, 2010-06-28 at 18:39 +0200, Michal Hocko wrote:
>>> Would something like the following be acceptable (just a compile
>>> tested without comments). It simply makes caller of lookup_pi_state to
>>> decide whether credentials should be checked.
>>
>> So it was Ingo, who in c87e2837be8 (pi-futex:
>> futex_lock_pi/futex_unlock_pi support) introduced the euid checks:
>>
>> +futex_find_get_task():
>> + if ((current->euid != p->euid)&& (current->euid != p->uid)) {
>> + p = NULL;
>> + goto out_unlock;
>> + }
>>
>> Ingo, do you remember the rationale behind that? It seems to be causing
>> grief when two different users contend on the same (shared) futex.
>>
>> See the below proposed solution.
>
> Here is the patch with comments and rationale:
> (reference to the original discussion: http://lkml.org/lkml/2010/6/23/52)
>
> --
> From f477a6d989dfde11c5bb5f28d5ce21d0682f4e25 Mon Sep 17 00:00:00 2001
> From: Michal Hocko<mhocko(a)suse.cz>
> Date: Tue, 29 Jun 2010 10:02:58 +0200
> Subject: [PATCH] futex: futex_find_get_task make credentials check conditional
>
> futex_find_get_task is currently used (through lookup_pi_state) from two
> contexts, futex_requeue and futex_lock_pi_atomic. While credentials check
> makes sense in the first code path, the second one is more problematic
> because this check requires that the PI lock holder (pid parameter) has
> the same uid and euid as the process's euid which is trying to lock the
> same futex (current).
>
> This results in glibc assert failure or process hang (if glibc is
> compiled without assert support) for shared robust pthread mutex with
> priority inheritance if a process tries to lock already held lock owned
> by a process with a different euid:
>
> pthread_mutex_lock.c:312: __pthread_mutex_lock_full: Assertion `(-(e)) != 3 || !robust' failed.
>
> The problem is that futex_lock_pi_atomic which is called when we try to
> lock already held lock checks the current holder (tid is stored in the
> futex value) to get the PI state. It uses lookup_pi_state which in turn
> gets task struct from futex_find_get_task. ESRCH is returned either when
> the task is not found or if credentials check fails.
> futex_lock_pi_atomic simply returns if it gets ESRCH. glibc code,
> however, doesn't expect that robust lock returns with ESRCH because it
> should get either success or owner died.
>
> Let's make credentials check conditional (as a new parameter) in
> futex_find_get_task. Then we can prevent from check in the pi lock path
> and still preserve it in the futex_requeue path.
>


Hi Michal,

All of the above is accurate; however, I think it emphasizes glibc's
expectations when the core of the issue is that shared PI futexes don't
work across processes with different uids.

It seems like most users of shared futexes do so from the same uid,
however I can think of situations where it would be useful to use them
from different uids. Since shared futexes key on their physical
address, there shouldn't be any security issues with allowing different
uids.
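
For reference, the kind of lock under discussion can be set up roughly
as below. This is a minimal userspace sketch, not code from the patch:
the helper names shared_pi_mutex_create() and shared_pi_mutex_lock()
are made up for illustration, and the use of an anonymous shared
mapping is an assumption (processes running under different uids would
need a file-backed or otherwise shared mapping instead).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: create a process-shared, robust, priority-inheriting
 * mutex in a shared mapping. Returns 0 and stores the mutex in *out, or
 * returns an errno-style code (for example ENOTSUP without PI futex support).
 */
int shared_pi_mutex_create(pthread_mutex_t **out)
{
	pthread_mutexattr_t attr;
	pthread_mutex_t *m;
	int rc;

	assert(out != NULL);

	/* Assumption: anonymous shared memory, inherited across fork(). */
	m = mmap(NULL, sizeof(*m), PROT_READ | PROT_WRITE,
		 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (m == MAP_FAILED)
		return errno;

	rc = pthread_mutexattr_init(&attr);
	if (rc)
		goto unmap;

	rc = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
	if (!rc)
		rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
	if (!rc)
		rc = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
	if (!rc)
		rc = pthread_mutex_init(m, &attr);
	pthread_mutexattr_destroy(&attr);
	if (rc)
		goto unmap;

	*out = m;
	return 0;

unmap:
	munmap(m, sizeof(*m));
	return rc;
}

/*
 * Hypothetical lock wrapper: a robust mutex is expected to return either
 * success or EOWNERDEAD; in the latter case mark the state consistent.
 * On a return of 0 the caller holds the lock.
 */
int shared_pi_mutex_lock(pthread_mutex_t *m)
{
	int rc = pthread_mutex_lock(m);

	if (rc == EOWNERDEAD)
		rc = pthread_mutex_consistent(m);
	return rc;
}
```

The failure described in the changelog would show up in the
pthread_mutex_lock() call when a second process with a different euid
contends on a mutex like this one while it is held.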


> Signed-off-by: Michal Hocko<mhocko(a)suse.cz>
> ---
> kernel/futex.c | 24 +++++++++++++++---------
> 1 files changed, 15 insertions(+), 9 deletions(-)
>
> diff --git a/kernel/futex.c b/kernel/futex.c
> index e7a35f1..79b69e5 100644
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -425,8 +425,9 @@ static void free_pi_state(struct futex_pi_state *pi_state)
> /*
> * Look up the task based on what TID userspace gave us.
> * We dont trust it.
> + * Check the credentials if required by check_cred

While we're changing comment blocks, please update it to a proper
kerneldoc function descriptor:

/**
 * futex_find_get_task() - Lookup task by TID
 * @pid: TID of the task_struct to find
 * @check_cred: check credentials (1) or not (0)
 *
 * Look up the task based on the TID userspace gave us. We don't trust
 * it. Optionally check the credentials.
 *
 * Returns a valid task_struct pointer or an error code embedded in the
 * pointer value.
 */

The above should probably also include whatever motivation Ingo comes
back with for having done the uid check in the first place - which I
confess I am not seeing.
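
To make the "error code embedded in the pointer value" wording
concrete, here is a small userspace model of that return convention
combined with the optional credentials test. It is only a sketch: the
model_* names, the task table, and the simplified helpers are invented
for illustration and are not the kernel's ERR_PTR()/IS_ERR()/PTR_ERR()
definitions, and no reference counting is modeled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: errors occupy the top 4095 values of the address space. */
#define MODEL_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *model_err_ptr(long error)
{
	assert(error < 0 && error >= -MODEL_MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* Recover the negative errno value from an error pointer. */
static inline long model_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if the pointer value falls in the reserved error range. */
static inline bool model_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MODEL_MAX_ERRNO;
}

/* Hypothetical stand-in for the fields of interest in a task. */
struct model_task {
	int tid;
	unsigned int uid;
	unsigned int euid;
};

static struct model_task model_tasks[] = {
	{ .tid = 100, .uid = 1000, .euid = 1000 },
	{ .tid = 200, .uid = 2000, .euid = 2000 },
};

/*
 * Look up a task by TID. With check_cred set, a caller whose euid matches
 * neither the target's euid nor its uid gets -ESRCH, the same error as for
 * a TID that does not exist.
 */
static struct model_task *model_find_task(int tid, unsigned int caller_euid,
					  bool check_cred)
{
	size_t i;

	for (i = 0; i < sizeof(model_tasks) / sizeof(model_tasks[0]); i++) {
		struct model_task *p = &model_tasks[i];

		if (p->tid != tid)
			continue;
		if (check_cred && caller_euid != p->euid &&
		    caller_euid != p->uid)
			return model_err_ptr(-ESRCH);
		return p;
	}
	return model_err_ptr(-ESRCH);
}
```

In this model, a caller with euid 1000 looking up TID 200 gets an
error pointer carrying -ESRCH when check_cred is true and a valid
pointer when it is false, which is the distinction the new parameter
is meant to express.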

> */
> -static struct task_struct * futex_find_get_task(pid_t pid)
> +static struct task_struct * futex_find_get_task(pid_t pid, bool check_cred)

bool is nice: it isn't used elsewhere in the file, but it clearly
defines the purpose. I may need to update some of the other flags
throughout the file in a follow-on patch.

> {
> struct task_struct *p;
> const struct cred *cred = current_cred(), *pcred;
> @@ -436,10 +437,12 @@ static struct task_struct * futex_find_get_task(pid_t pid)
> if (!p) {
> p = ERR_PTR(-ESRCH);
> } else {
> - pcred = __task_cred(p);
> - if (cred->euid != pcred->euid&&
> - cred->euid != pcred->uid)
> - p = ERR_PTR(-ESRCH);
> + if (check_cred) {
> + pcred = __task_cred(p);
> + if (cred->euid != pcred->euid&&
> + cred->euid != pcred->uid)
> + p = ERR_PTR(-ESRCH);
> + }
> else
> get_task_struct(p);
> }
> @@ -504,9 +507,10 @@ void exit_pi_state_list(struct task_struct *curr)
> raw_spin_unlock_irq(&curr->pi_lock);
> }
>
> +/* check_cred is just passed through to futex_find_get_task */
> static int
> lookup_pi_state(u32 uval, struct futex_hash_bucket *hb,
> - union futex_key *key, struct futex_pi_state **ps)
> + union futex_key *key, struct futex_pi_state **ps, bool check_cred)

Wrap at 80 columns.

> {
> struct futex_pi_state *pi_state = NULL;
> struct futex_q *this, *next;
> @@ -563,7 +567,7 @@ lookup_pi_state(u32 uval, struct futex_hash_bucket *hb,
> */
> if (!pid)
> return -ESRCH;
> - p = futex_find_get_task(pid);
> + p = futex_find_get_task(pid, check_cred);
> if (IS_ERR(p))
> return PTR_ERR(p);
>
> @@ -704,8 +708,10 @@ retry:
> /*
> * We dont have the lock. Look up the PI state (or create it if
> * we are the first waiter):
> + * Do not ask for credentials check because we want to share the
> + * lock between processes with different (e)uids

Please merge the new comments into the old. Keeping the original colon
confuses the comment block. Try:

/*
 * We dont have the lock. Look up the PI state (or create it if
 * we are the first waiter). Don't ask for a credentials check
 * as we need to allow shared locks between processes with
 * different (e)uids.

Thanks,

--
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Darren Hart on
On 06/29/2010 09:41 AM, Linus Torvalds wrote:
> On Tue, Jun 29, 2010 at 1:42 AM, Michal Hocko<mhocko(a)suse.cz> wrote:
>>
>> futex_find_get_task is currently used (through lookup_pi_state) from two
>> contexts, futex_requeue and futex_lock_pi_atomic. While credentials check
>> makes sense in the first code path, the second one is more problematic
>> because this check requires that the PI lock holder (pid parameter) has
>> the same uid and euid as the process's euid which is trying to lock the
>> same futex (current).
>
> So exactly why does it make sense to check the credentials in the
> first code path then? Shouldn't the futex issue in the end depend on
> whether you have a shared page or not - and not on credentials at all?
> Any two processes that share a futex in the same shared page should be
> able to use that without any regard for whether they are the same
> user. That's kind of the point, no?

I agree and haven't been able to come up with a need for the test
either, but I wanted to hear back from Ingo as he authored the
original check.

I was trying to see if futex_lock_pi() could somehow be abused, but if
so, I don't see it:

TaskUserA                 TaskUserB
futex_lock_pi(addrA)
                          *addrB = TID_OF(TaskUserA)
                          futex_lock_pi(addrB)

TaskUserB would look up the pi_state, not find it (as addrB and addrA
don't hash to the same key), create a new pi_state and mark TaskUserA as
the owner, then block.

Once TaskUserA exits, the pi_list will contain the pi_state for the
addrB futex. This is "bad", but the kernel cleans it up, releases the
lock - but doesn't wake TaskUserB. That seems acceptable to me since
TaskUserB is in the wrong here.


> IOW, I personally dislike these kinds of conditional checks,
> especially since the discussion (at least the part I've seen) hasn't
> made it clear why it should be conditional - or exist - in the first
> place.
>
> So I'd like the patch to include an explanation of exactly why the two
> cases are different.

Agreed, waiting on Ingo at the moment.

> The other thing I'd like to see is to move the whole cred checking up
> a level. There's no reason to check the credentials in
> futex_find_get_task() that I can see - why not do it in the caller
> instead? IOW, I think futex_find_get_task() should just look something
> like this instead:


/me beats head on desk, duh. Still, I'm hoping this isn't necessary and
we can lose the credentials checking entirely.
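
To illustrate the general shape of "moving the check up a level", here
is a userspace model. It is not the code from Linus's mail, which is
not reproduced here; the demo_* names and the two caller functions are
invented stand-ins, assuming one caller that wants the euid test and
one that does not.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the fields of interest in a task. */
struct demo_task {
	int tid;
	unsigned int uid;
	unsigned int euid;
};

static struct demo_task demo_tasks[] = {
	{ .tid = 100, .uid = 1000, .euid = 1000 },
	{ .tid = 200, .uid = 2000, .euid = 2000 },
};

/* Pure lookup: find a task by TID and apply no policy at all. */
static struct demo_task *demo_find_task(int tid)
{
	size_t i;

	for (i = 0; i < sizeof(demo_tasks) / sizeof(demo_tasks[0]); i++)
		if (demo_tasks[i].tid == tid)
			return &demo_tasks[i];
	return NULL;
}

/* Policy, kept separate: the euid test quoted earlier in the thread. */
static bool demo_cred_ok(unsigned int caller_euid, const struct demo_task *p)
{
	assert(p != NULL);
	return caller_euid == p->euid || caller_euid == p->uid;
}

/* A caller that wants the check applies it itself after the lookup. */
static int demo_checked_path(int tid, unsigned int caller_euid)
{
	struct demo_task *p = demo_find_task(tid);

	if (!p)
		return -ESRCH;
	if (!demo_cred_ok(caller_euid, p))
		return -ESRCH;
	return 0;
}

/* A caller that does not want the check just uses the lookup result. */
static int demo_unchecked_path(int tid)
{
	return demo_find_task(tid) ? 0 : -ESRCH;
}
```

With this split the lookup needs no flag argument, and dropping the
check entirely later would only mean deleting demo_cred_ok() and its
one call site.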

Thanks,

--
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team