From: H. Peter Anvin on
On 03/15/2010 01:35 PM, Benjamin Herrenschmidt wrote:
> On Mon, 2010-03-15 at 12:41 -0700, H. Peter Anvin wrote:
>> I don't see why syscall() can't change the type for its first argument
>> -- it seems to be exactly what symbol versioning is for.
>>
>> Doesn't change the fact that it is fundamentally broken, of course.
>
> No need to change the type of the first arg and go for symbol
> versioning if you do something like I proposed earlier; there will be
> no conflict between syscall() and __syscall() and both variants can
> exist.
>

That is basically symbol versioning done "by hand"; actually using symbol
versioning is better, IMNSHO.

-hpa
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Steven Munroe on
On Tue, 2010-03-16 at 07:35 +1100, Benjamin Herrenschmidt wrote:
> On Mon, 2010-03-15 at 12:41 -0700, H. Peter Anvin wrote:
> > I don't see why syscall() can't change the type for its first argument
> > -- it seems to be exactly what symbol versioning is for.
> >
> > Doesn't change the fact that it is fundamentally broken, of course.
>
> No need to change the type of the first arg and go for symbol
> versioning if you do something like I proposed earlier; there will be
> no conflict between syscall() and __syscall() and both variants can
> exist.
>
One concern is that the new syscall and the kernel have to match, and
mixing them will not work. Your proposal seems to impact all syscalls, not
just the ones called via the syscall() API. These syscalls get generated
inline, which makes static linking very dangerous ...

So I think you do need both symbol versioning and kernel feature stubs
(like xstat). That gets to be a lot of work.

> Cheers,
> Ben.
>
>

From: Benjamin Herrenschmidt on
On Tue, 2010-03-16 at 16:56 -0500, Steven Munroe wrote:
> On Tue, 2010-03-16 at 07:35 +1100, Benjamin Herrenschmidt wrote:
> > On Mon, 2010-03-15 at 12:41 -0700, H. Peter Anvin wrote:
> > > I don't see why syscall() can't change the type for its first argument
> > > -- it seems to be exactly what symbol versioning is for.
> > >
> > > Doesn't change the fact that it is fundamentally broken, of course.
> >
> > No need to change the type of the first arg and go for symbol
> > versioning if you do something like I proposed earlier; there will be
> > no conflict between syscall() and __syscall() and both variants can
> > exist.
> >
> One concern is that the new syscall and the kernel have to match, and
> mixing them will not work. Your proposal seems to impact all syscalls, not
> just the ones called via the syscall() API. These syscalls get generated
> inline, which makes static linking very dangerous ...
>
> So I think you do need both symbol versioning and kernel feature stubs
> (like xstat). That gets to be a lot of work.

What do you mean? My proposal is purely a change to the syscall()
function, nothing else. No kernel change, no ABI change, no change to
the way glibc normally calls syscalls internally, etc... just the
exported syscall() function shifting its arguments in order to avoid
losing register pair alignment.

And the change would still be compatible with existing userland code that
manually splits the 64-bit arguments to avoid the problem on power.
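
For illustration only, the manual split that such userland code does
today can be sketched as below. The helper names hi32(), lo32() and
join64() are hypothetical (nothing in glibc or the kernel is called
that), and uint32_t stands in for the kernel-style u32. The high word
goes first, which is the order assumed here for 32-bit powerpc.

```c
#include <assert.h>
#include <stdint.h>

/* Upper 32 bits of a 64-bit value: the first half of the manual split. */
static uint32_t hi32(uint64_t v)
{
    return (uint32_t)(v >> 32);
}

/* Lower 32 bits of a 64-bit value: the second half of the manual split. */
static uint32_t lo32(uint64_t v)
{
    return (uint32_t)v;
}

/* How the receiving side would reassemble the two halves. */
static uint64_t join64(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Splitting and rejoining must give back the original value. */
static void split_check(void)
{
    uint64_t v = 0x0000000123456789ULL;

    assert(hi32(v) == 0x00000001u);
    assert(lo32(v) == 0x23456789u);
    assert(join64(hi32(v), lo32(v)) == v);
}
```

A caller using the split form passes hi32(v) and lo32(v) as two
separate 32-bit arguments instead of passing v itself.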

I.e., unless I've missed something, this would be a 100% backward
compatible change that simply makes a whole class of syscall() uses work
that didn't before on power (but did on x86), such as the one I hit in
hdparm for example.

Cheers,
Ben.


From: Ulrich Drepper on
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 03/16/2010 05:31 PM, Benjamin Herrenschmidt wrote:
> My proposal is purely a change to the syscall()
> function, nothing else. No kernel change, no ABI change, no change to
> the way glibc normally calls syscalls internally, etc...

How can this be? People are today actively working around the problem
of 64-bit arguments. You have to break something since you cannot
recognize these situations. And since it has meanwhile become clear that
there is no way to "fix" all archs magically, I really don't want to
introduce anything. There are mechanisms in place to abstract out some
of the issues. And for the rest, well, if you're using syscalls
directly you already have to encode low-level knowledge. One more bit
doesn't hurt. It's not as if this happens every day.

- --
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iEYEARECAAYFAkugbhsACgkQ2ijCOnn/RHQzlACeMp0UK2jZuZOgXhJjB8Z9p4kh
rCoAn0zaJqFYV9tQ0Ct49Mprfa0O5iKh
=71la
-----END PGP SIGNATURE-----
From: Benjamin Herrenschmidt on
On Tue, 2010-03-16 at 22:52 -0700, Ulrich Drepper wrote:
> On 03/16/2010 05:31 PM, Benjamin Herrenschmidt wrote:
> > My proposal is purely a change to the syscall()
> > function, nothing else. No kernel change, no ABI change, no change to
> > the way glibc normally calls syscalls internally, etc...
>
> How can this be? People are today actively working around the problem
> of 64-bit arguments. You have to break something since you cannot
> recognize these situations.

Ok, so I -may- be missing something, but I believe this won't break
anything:

- You keep the existing syscall() exported by glibc for binary
compatibility

- You add a new __syscall() (or whatever you want to name it) that adds
a dummy argument at the beginning, and whose implementation shifts
everything by 2 instead of 1 argument before calling into the kernel

- You define in unistd.h or whatever is relevant, a macro that does:

#define syscall(__sysno, __args...) __syscall(0, __sysno, ##__args)

I believe that should cover it, at least for powerpc and possibly for
other archs too, though as I said, I may have missed something there.
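
As a rough sketch of how such a forwarding macro behaves, the snippet
below uses stand-in names so that it does not clash with the real
syscall(): demo_syscall() and demo_syscall2() are hypothetical, and
demo_syscall2() only records its fixed arguments instead of entering
the kernel. It assumes a compiler that accepts the GNU "##" comma
extension (gcc and clang do).

```c
#include <assert.h>

/* Values seen by the most recent call to the stand-in below. */
static long seen_dummy = -1;
static long seen_sysno = -1;

/* Hypothetical stand-in for the proposed __syscall(): it records its two
 * fixed arguments, ignores the variable ones, and never enters the kernel. */
static long demo_syscall2(long dummy, long sysno, ...)
{
    seen_dummy = dummy;
    seen_sysno = sysno;
    return 0;
}

/* Same shape as the macro in the mail: prepend a dummy 0 and forward the
 * rest. "##" drops the trailing comma when no extra arguments are given. */
#define demo_syscall(sysno, ...) demo_syscall2(0, sysno, ##__VA_ARGS__)

/* Both the with-arguments and the no-argument form compile and forward. */
static void demo_check(void)
{
    demo_syscall(42, 7L);
    assert(seen_dummy == 0 && seen_sysno == 42);

    demo_syscall(20);
    assert(seen_dummy == 0 && seen_sysno == 20);
}
```

The caller's source is unchanged; only the expansion gains the extra
leading argument.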

I.e., whether your app writes:

syscall(SYS_foo, my_64bit_arg);

or

syscall(SYS_foo, (u32)(my_64bit_arg >> 32), (u32)(my_64bit_arg));

Both should still work with the new approach and end up doing the right
thing.

Hence, apps that use the first form today because it works on x86 would
end up working on powerpc at least, where they would otherwise have been
broken unless they used some arch-specific #ifdef to do the second form.
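
To make the register arithmetic concrete, here is a small simulation,
not real syscall code. It assumes the 32-bit powerpc convention that
arguments go in r3..r10 and that a 64-bit argument occupies an aligned
pair (r3/r4, r5/r6, ...) with the high word first, and it assumes a
syscall that expects its 64-bit argument in the first pair. All names
(struct call, pass32(), pass64(), kernel_arg64() and the four scenario
functions) are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

enum { NREGS = 8 };      /* simulated argument registers r3..r10 */
#define PAD 0xdeadbeefu  /* marks a register skipped for pair alignment */

struct call {
    uint32_t reg[NREGS]; /* reg[0] is r3, reg[1] is r4, ... */
    int next;            /* index of the next free register */
};

/* Pass a 32-bit argument in the next free register. */
static void pass32(struct call *c, uint32_t v)
{
    c->reg[c->next++] = v;
}

/* Pass a 64-bit argument in an aligned pair, skipping one register if
 * the next free one would start the pair on r4, r6, ... */
static void pass64(struct call *c, uint64_t v)
{
    if (c->next % 2 != 0)
        c->reg[c->next++] = PAD;
    c->reg[c->next++] = (uint32_t)(v >> 32);
    c->reg[c->next++] = (uint32_t)v;
}

/* The 64-bit first argument the kernel would read after the wrapper has
 * shifted every register down by `shift` slots. */
static uint64_t kernel_arg64(const struct call *c, int shift)
{
    return ((uint64_t)c->reg[shift] << 32) | c->reg[shift + 1];
}

/* Today's syscall(nr, v): the wrapper shifts by one register. */
static uint64_t old_direct(uint32_t nr, uint64_t v)
{
    struct call c = {0};
    pass32(&c, nr);
    pass64(&c, v);
    return kernel_arg64(&c, 1);
}

/* Today's syscall(nr, hi, lo): the manual split, shifted by one. */
static uint64_t old_split(uint32_t nr, uint64_t v)
{
    struct call c = {0};
    pass32(&c, nr);
    pass32(&c, (uint32_t)(v >> 32));
    pass32(&c, (uint32_t)v);
    return kernel_arg64(&c, 1);
}

/* Proposed __syscall(0, nr, v): the wrapper shifts by two registers. */
static uint64_t new_direct(uint32_t nr, uint64_t v)
{
    struct call c = {0};
    pass32(&c, 0);
    pass32(&c, nr);
    pass64(&c, v);
    return kernel_arg64(&c, 2);
}

/* Proposed __syscall(0, nr, hi, lo): the manual split, shifted by two. */
static uint64_t new_split(uint32_t nr, uint64_t v)
{
    struct call c = {0};
    pass32(&c, 0);
    pass32(&c, nr);
    pass32(&c, (uint32_t)(v >> 32));
    pass32(&c, (uint32_t)v);
    return kernel_arg64(&c, 2);
}

/* Compare what the kernel would see in each of the four scenarios. */
static void layout_check(void)
{
    uint64_t v = 0x1122334455667788ULL;

    assert(old_direct(1, v) != v); /* the padding register leaks in */
    assert(old_split(1, v) == v);
    assert(new_direct(1, v) == v);
    assert(new_split(1, v) == v);
}
```

Under this model only the first scenario, today's syscall() called with
an unsplit 64-bit argument, hands the kernel a wrong value; the other
three agree.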

> And since it has meanwhile become clear that
> there is no way to "fix" all archs magically, I really don't want to
> introduce anything. There are mechanisms in place to abstract out some
> of the issues. And for the rest, well, if you're using syscalls
> directly you already have to encode low-level knowledge. One more bit
> doesn't hurt. It's not as if this happens every day.

It doesn't happen every day. However, if my proposal ends up fixing a
bunch of cases where it does, without breaking anything, then I suppose
it's worth considering, though as I said, it's possible that I am missing
some subtlety here, in which case I'd be glad to stand corrected :-)

Cheers,
Ben.
