From: Oleg Nesterov on
Hello.

This code is really old, and I do not know whom should I ask. And,
despite the fact it is really trivial, I have no idea how to fix it.
And even more, I am not sure it actually needs fixes. I'd better ask
the questions. Please help ;)



First of all, task_struct->personality is "int", but sys_personality()
takes "long". This means that every comparison or assignment inside of
sys_personality() paths is not right.

Probably we need something like this trivial patch

--- x/kernel/exec_domain.c
+++ x/kernel/exec_domain.c
@@ -192,7 +192,9 @@ SYSCALL_DEFINE1(personality, u_long, per
{
u_long old = current->personality;

- if (personality != 0xffffffff) {
+ if (personality > 0xffffffff)
+ return -EINVAL;
+ else if (personality != 0xffffffff) {
set_personality(personality);
if (current->personality != personality)
return -EINVAL;

?

Or, perhaps we shouldn't allow personality >= int32_max = 0x7ffffff ?
Otherwise, on 32bit machine the value returned to the user-space can
look like -ESOMETHING.

Even on x86_64, in user-space it is declared as

int personality(unsigned long persona);

if the kernel returns the "large" old it looks negative to the user-space,
and the test-case thinks that the syscall failed but errno is not set.

This is the actual reason for the question. I am really surprized I
do not know how to close the redhat-internal bugzilla entry, the
problem is very trivial.

------------------------------------------------------------------------------

But there are other oddities I can't understand. Let's forget about
the sizeof(task_struct->personality), let's suppose it is "long" too.
And note that is was long before 97dc32cdb1b53832801159d5f634b41aad9d0a23
which did s/long/int/ to reduce the sizeof task_struct.

__set_personality(). What is the point to check
ep == current_thread_info()->exec_domain ? This buys nothing afaics.
IOW, it could be simplified:

int __set_personality(u_long personality)
{
struct exec_domain *oep = current_thread_info()->exec_domain;

current_thread_info()->exec_domain = lookup_exec_domain(personality);
current->personality = personality;
module_put(oep->module);

return 0;
}

Now let's look at the caller, sys_personality()

set_personality(personality);
if (current->personality != personality)
return -EINVAL;

but __set_personality() always sets current->personality = personality,
what is the point to check equality?

IOW, when we should return -EINVAL? Perhaps, lookup_exec_domain() should
return NULL instead of default_exec_domain when the search in exec_domains
fails? And probably we shouldn't change task->personality/exec_domain in
this case? It is really strange that sys_personality() can return -EINVAL
but change ->personality.

But this can probably break exec. alpha does set_personality(PER_OSF4)
but I do not see the corresponding register_exec_domain(). On the other
hand, I do not understand why it puts PER_OSF4 into PER_MASK. PER_OSF4
is only used by sys_osf_readv/sys_osf_writev.

Thanks,

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Roland McGrath on
I don't have any special insight either, just poking around as you are.

I suspect that most of the logic and code in the kernel for this just
predates 64-bit stuff, and so was written with long==int and then maybe
tweaked slightly later.

I don't think we're ever going to need or want a 64-bit personality word.
(There are still 10 bits unused in the middle for more flags.)

Though the high bit might be set on 32-bit, there still should not really
be a danger of misinterpreting a value as an error code--as long as we
haven't used up all 10 of those middle bits. The test userland (glibc)
uses is not "long < 0" but "u_long > -4095UL". So as long as at least
one bit in 0xff00 remains clear, it won't match.

For 64-bit you want to avoid sign-extension of the old value just so it
looks valid (even though it won't look like an error code). I think the
most sensible thing is to change the task_struct field to 'unsigned int'.
Then:

u_long old = current->personality;

will do what:

u_long old = (unsigned int) current->personality;

would do now, i.e. zero-extend rather than sign-extend.

I think the 0xffffffff check is intended specifically for personality(-1)
to be a no-op that returns the old value, and nothing more. So I'd just
make that check be "((int) personality != -1)" instead.

OTOH, this:

set_personality(personality);
if (current->personality != personality)
return -EINVAL;

will then both do the job in set_personality() and return -EINVAL, when
some high bits are set. So, perhaps you are right about checking high
bits. Then I'd make it:

if ((int) personality != -1) {
if (unlikely((unsigned int) personality != personality))
return -EINVAL;
set_personality(personality);
if (current->personality != personality)
return -EINVAL;
}

> __set_personality(). What is the point to check
> ep == current_thread_info()->exec_domain ? This buys nothing afaics.
> IOW, it could be simplified:

Agreed.

> Now let's look at the caller, sys_personality()
>
> set_personality(personality);
> if (current->personality != personality)
> return -EINVAL;
>
> but __set_personality() always sets current->personality = personality,
> what is the point to check equality?

Good point. I suspect that at some point in the past set_personality()
would sometimes refuse to make the change.

> IOW, when we should return -EINVAL? Perhaps, lookup_exec_domain() should
> return NULL instead of default_exec_domain when the search in exec_domains
> fails? And probably we shouldn't change task->personality/exec_domain in
> this case? It is really strange that sys_personality() can return -EINVAL
> but change ->personality.

Agreed.

> But this can probably break exec. alpha does set_personality(PER_OSF4)
> but I do not see the corresponding register_exec_domain(). On the other
> hand, I do not understand why it puts PER_OSF4 into PER_MASK. PER_OSF4
> is only used by sys_osf_readv/sys_osf_writev.

No idea about that.


Thanks,
Roland
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Oleg Nesterov on
On 05/25, Roland McGrath wrote:
>
> I don't think we're ever going to need or want a 64-bit personality word.
> (There are still 10 bits unused in the middle for more flags.)

OK,

> Though the high bit might be set on 32-bit, there still should not really
> be a danger of misinterpreting a value as an error code--as long as we
> haven't used up all 10 of those middle bits. The test userland (glibc)
> uses is not "long < 0" but "u_long > -4095UL". So as long as at least
> one bit in 0xff00 remains clear, it won't match.

Yes, libc itself is fine. But from the application's pov, personality()
returns int, not long.

> For 64-bit you want to avoid sign-extension of the old value just so it
> looks valid (even though it won't look like an error code). I think the
> most sensible thing is to change the task_struct field to 'unsigned int'.

it is already 'unsigned int' ;)

> I think the 0xffffffff check is intended specifically for personality(-1)
> to be a no-op that returns the old value, and nothing more.

I think the same.

> OTOH, this:
>
> set_personality(personality);
> if (current->personality != personality)
> return -EINVAL;
>
> will then both do the job in set_personality() and return -EINVAL, when
> some high bits are set.

Yes! and despite the fact it returns -EINVAL, current->personality was
changed. This can't be right.

> So, perhaps you are right about checking high
> bits. Then I'd make it:
>
> if ((int) personality != -1) {
> if (unlikely((unsigned int) personality != personality))
> return -EINVAL;

Well. Think about personality(0xffffffff - 1). It passes both checks
and we change current->personality. Then the application calls
personality() again, we return the old value, and since the user-space
expects "int" it gets -2.

How about

if (personality != 0xffffffff) {
if (personality >= 0x7fffffff)
return -EINVAL;
set_personality(personality);
}

? Now that personality always fits into "insigned int" we don't need
to recheck current->personality == personality, and "< 0x7fffffff"
gurantees that "int old_personality = personality(whatever)" in user
space can be never misinterpeted as error.

As for the other oddities, they need the separate patches. Or we can
just leave this code alone ;)

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Roland McGrath on
> > Though the high bit might be set on 32-bit, there still should not really
> > be a danger of misinterpreting a value as an error code--as long as we
> > haven't used up all 10 of those middle bits. The test userland (glibc)
> > uses is not "long < 0" but "u_long > -4095UL". So as long as at least
> > one bit in 0xff00 remains clear, it won't match.
>
> Yes, libc itself is fine. But from the application's pov, personality()
> returns int, not long.

That doesn't really matter to error/success ambiguity. Since what I said
is true, it won't ever return exactly -1 for a non-error. But even if it
did, the application can use errno=0;personality(x);errno!=0 checking.

> > For 64-bit you want to avoid sign-extension of the old value just so it
> > looks valid (even though it won't look like an error code). I think the
> > most sensible thing is to change the task_struct field to 'unsigned int'.
>
> it is already 'unsigned int' ;)

Ok, then there is no bug right now, is there?

> Yes! and despite the fact it returns -EINVAL, current->personality was
> changed. This can't be right.

Agreed.

> > So, perhaps you are right about checking high
> > bits. Then I'd make it:
> >
> > if ((int) personality != -1) {
> > if (unlikely((unsigned int) personality != personality))
> > return -EINVAL;
>
> Well. Think about personality(0xffffffff - 1). It passes both checks
> and we change current->personality. Then the application calls
> personality() again, we return the old value, and since the user-space
> expects "int" it gets -2.

Yes, it never really made any sense to me that it doesn't validate any of
the flag bits.

> How about
>
> if (personality != 0xffffffff) {
> if (personality >= 0x7fffffff)
> return -EINVAL;
> set_personality(personality);
> }
>
> ? Now that personality always fits into "insigned int" we don't need
> to recheck current->personality == personality, and "< 0x7fffffff"
> gurantees that "int old_personality = personality(whatever)" in user
> space can be never misinterpeted as error.

Sure.

> As for the other oddities, they need the separate patches. Or we can
> just leave this code alone ;)

I can't see any sign that anybody cares.


Thanks,
Roland
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: H. Peter Anvin on
On 05/26/2010 01:31 PM, Roland McGrath wrote:
>> As for the other oddities, they need the separate patches. Or we can
>> just leave this code alone ;)
>
> I can't see any sign that anybody cares.

I certainly care, because we have had some very strange behavior with
weird personality bits in the past. So cleaning this up would be good.

-hpa
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/