From: Jeremy Fitzhardinge on
On 12/08/09 23:36, Ingo Molnar wrote:
>> The old version that actually passed the stack frame was better. Why
>> pick the inferior version?
>>
> Yeah, agreed. I missed that detail.
>

Which detail is that? The whole patch? ;)

> Jeremy, mind sending a patch that updates this code to use the less
> obfuscated 32-bit version, not the 64-bit version? (a delta patch
> against tip:master would be nice, as there's a fair amount of testing in
> the unification change itself already, which we dont want to discard.)
>

Sure.

But I'm not sure I understand the objection to task_pt_regs(); is it
considered deprecated? This patch received quite a lot of discussion
with no mention of it. Should we consider all its uses as suspect?

Would it be better to have something similar which just returns a
pointer to the saved [re]flags, since that's all we care about? That
should be easier to make robust against

J
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: H. Peter Anvin on
Linus clearly prefers the style with pt_regs passed on the stack as the sole form. Since we *have* to use that form for things like clone(), it makes sense to use it as the only form.

For what it's worth I did look at this when the patch first came up; it does make the individual patch a fair bit uglier, but I can understand Linus' consistency argument.

As far as I know, we don't allow any system calls from inside v86 mode.


-hpa

"Jeremy Fitzhardinge" <jeremy(a)goop.org> wrote:

>On 12/08/09 23:36, Ingo Molnar wrote:
>>> The old version that actually passed the stack frame was better. Why
>>> pick the inferior version?
>>>
>> Yeah, agreed. I missed that detail.
>>
>
>Which detail is that? The whole patch? ;)
>
>> Jeremy, mind sending a patch that updates this code to use the less
>> obfuscated 32-bit version, not the 64-bit version? (a delta patch
>> against tip:master would be nice, as there's a fair amount of testing in
>> the unification change itself already, which we dont want to discard.)
>>
>
>Sure.
>
>But I'm not sure I understand the objection to task_pt_regs(); is it
>considered deprecated? This patch received quite a lot of discussion
>with no mention of it. Should we consider all its uses as suspect?
>
>Would it be better to have something similar which just returns a
>pointer to the saved [re]flags, since that's all we care about? That
>should be easier to make robust against
>
> J

--
Sent from my mobile phone. Please excuse any lack of formatting.
From: Jeremy Fitzhardinge on
On 12/08/09 23:36, Ingo Molnar wrote:
> Jeremy, mind sending a patch that updates this code to use the less
> obfuscated 32-bit version, not the 64-bit version? (a delta patch
> against tip:master would be nice, as there's a fair amount of testing in
> the unification change itself already, which we dont want to discard.)
>

How does this look?

git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git fix-iopl

From: Jeremy Fitzhardinge<jeremy.fitzhardinge(a)citrix.com>
Date: Wed, 9 Dec 2009 10:26:59 -0800
Subject: [PATCH] x86: Make sys_iopl use passed-in pt_regs

Rather than using task_pt_regs, use the pt_regs * passed into the syscall.
The ABI differences are handled in small 32/64-bit specific functions,
and everything else is handled in the common do_iopl().

Signed-off-by: Jeremy Fitzhardinge<jeremy.fitzhardinge(a)citrix.com>

diff --git a/arch/x86/include/asm/syscalls.h b/arch/x86/include/asm/syscalls.h
index 5336ce2..70497f0 100644
--- a/arch/x86/include/asm/syscalls.h
+++ b/arch/x86/include/asm/syscalls.h
@@ -33,11 +33,11 @@ long sys_rt_sigreturn(struct pt_regs *);
asmlinkage int sys_set_thread_area(struct user_desc __user *);
asmlinkage int sys_get_thread_area(struct user_desc __user *);

-/* kernel/ioport.c */
-asmlinkage long sys_iopl(unsigned int);
-
/* X86_32 only */
#ifdef CONFIG_X86_32
+/* kernel/ioport.c */
+asmlinkage long sys_iopl(struct pt_regs *);
+
/* kernel/process_32.c */
int sys_clone(struct pt_regs *);
int sys_execve(struct pt_regs *);
@@ -71,6 +71,9 @@ int sys_vm86(struct pt_regs *);

/* X86_64 only */

+/* kernel/ioport.c */
+asmlinkage long sys_iopl(unsigned int, struct pt_regs *);
+
/* kernel/process_64.c */
asmlinkage long sys_clone(unsigned long, unsigned long,
void __user *, void __user *,
diff --git a/arch/x86/kernel/ioport.c b/arch/x86/kernel/ioport.c
index bad4f22..ac3cf88 100644
--- a/arch/x86/kernel/ioport.c
+++ b/arch/x86/kernel/ioport.c
@@ -105,6 +105,7 @@ asmlinkage long sys_ioperm(unsigned long from, unsigned long num, int turn_on)
*/
static int do_iopl(unsigned int level, struct pt_regs *regs)
{
+ struct thread_struct *t =&current->thread;
unsigned int old = (regs->flags>> 12)& 3;

if (level> 3)
@@ -116,21 +117,20 @@ static int do_iopl(unsigned int level, struct pt_regs *regs)
}
regs->flags = (regs->flags& ~X86_EFLAGS_IOPL) | (level<< 12);

+ t->iopl = level<< 12;
+ set_iopl_mask(t->iopl);
+
return 0;
}

-asmlinkage long sys_iopl(unsigned int level)
+#ifdef CONFIG_X86_64
+asmlinkage long sys_iopl(unsigned int level, struct pt_regs *regs)
{
- struct thread_struct *t =&current->thread;
- struct pt_regs *regs = task_pt_regs(current);
- int rc;
-
- rc = do_iopl(level, regs);
- if (rc< 0)
- goto out;
-
- t->iopl = level<< 12;
- set_iopl_mask(t->iopl);
-out:
- return rc;
+ return do_iopl(level, regs);
+}
+#else
+asmlinkage long sys_iopl(struct pt_regs *regs)
+{
+ return do_iopl(regs->bx, regs);
}
+#endif


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Linus Torvalds on


On Wed, 9 Dec 2009, H. Peter Anvin wrote:
> "Jeremy Fitzhardinge" <jeremy(a)goop.org> wrote:
> >
> >But I'm not sure I understand the objection to task_pt_regs(); is it
> >considered deprecated? This patch received quite a lot of discussion
> >with no mention of it. Should we consider all its uses as suspect?

I personally would be much happier without ever seeing a single
task_pt_regs() call outside of pure ptrace() users (and quite frankly,
even there I'd be happier if it was basically done once, and then 'struct
pt_regs' being passed around as an argument as much as possible).

In the case of 'ptrace' we control the stack of the tracee.

In other cases, we do _not_.

For example, think about us ever implementing 'tasklets' or async system
calls in general. It's _very_ possible that we'd have a special stolen
stack for them, and run the low-level system call function from that
stack. Then something like task_pt_regs() migth very well do the wrong
thing.

Now, I'm not saying that 'sys_iopl()' would ever be a valid target for
async system calls, but I _am_ saying that system calls that depend on
task_pt_regs() are fundamnetally fragile and broken, and have subtle
interactions.

In contrast, if you make it very explicit that the system call gets passed
a pointer to its pt_regs, then it still has all the same issues, but now
it's not subtle any more - now it's explicitly encoded in the call
sequence on both the caller and callee sides, and somebody doing tasklets
would hopefully see that "oh, iopl() has that same special thing that
fork/clone/vfork/execve has, and touches the stack frame register set
explicitly".

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Linus Torvalds on


On Wed, 9 Dec 2009, Jeremy Fitzhardinge wrote:
>
> How does this look?

I would actually prefer it if the calling convention was just made to
match on both x86 and x86-64. Wouldn't it be nice if they both just had

> +/* kernel/ioport.c */
> +asmlinkage long sys_iopl(unsigned int, struct pt_regs *);

as the prototype, and looked the same?

I realize that right now the 32-bit PTREGSCALL() thing doesn't support
that (very different macros for entry.S x86-32 and -64), but isn't that
just another thing we should try to fix too?

IOW, maybe something like this would be good, and would change the x86-32
calling convention to match the x86-64 one?

NOTE NOTE NOTE! Totally untested. Is the second argument even in %edx? I
don't remember, I didn't check, I'm just throwing this out as a "hey,
maybe something _like_ this can work" patch, and will be immediately
removing it from my machine after sending this email.

Linus

---
arch/x86/kernel/entry_32.S | 24 ++++++++++++------------
1 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index 50b9c22..22b4431 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -725,22 +725,22 @@ END(syscall_badsys)
/*
* System calls that need a pt_regs pointer.
*/
-#define PTREGSCALL(name) \
+#define PTREGSCALL(name, reg) \
ALIGN; \
ptregs_##name: \
- leal 4(%esp),%eax; \
+ leal 4(%esp),reg; \
jmp sys_##name;

-PTREGSCALL(iopl)
-PTREGSCALL(fork)
-PTREGSCALL(clone)
-PTREGSCALL(vfork)
-PTREGSCALL(execve)
-PTREGSCALL(sigaltstack)
-PTREGSCALL(sigreturn)
-PTREGSCALL(rt_sigreturn)
-PTREGSCALL(vm86)
-PTREGSCALL(vm86old)
+PTREGSCALL(iopl,%edx)
+PTREGSCALL(fork,%eax)
+PTREGSCALL(clone,%eax)
+PTREGSCALL(vfork,%eax)
+PTREGSCALL(execve,%eax)
+PTREGSCALL(sigaltstack,%eax)
+PTREGSCALL(sigreturn,%eax)
+PTREGSCALL(rt_sigreturn,%eax)
+PTREGSCALL(vm86,%eax)
+PTREGSCALL(vm86old,%eax)

.macro FIXUP_ESPFIX_STACK
/*
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/