From: Serge E. Hallyn on
Quoting Andrew Lutomirski (luto(a)mit.edu):
> On Mon, Apr 19, 2010 at 5:39 PM, Serge E. Hallyn <serge(a)hallyn.com> wrote:
> > Quoting Andrew Lutomirski (luto(a)mit.edu):
> >> >
> >> > ( I did like using new securebits as in [2], but I prefer the
> >> > automatic not-raising-privs of [1] to simply -EPERM on uid/gid
> >> > change and lack of checking for privs raising of [2]. )
> >> >
> >> > Really the trick will be finding a balance to satisfy those wanting
> >> > this as a separate LSM, without traipsing into LSM stacking territory.
> >>
> >> I think that making this an LSM is absurd.  Containers (and anything
> >> else people want to do with namespaces or with other new features that
> >> interact badly with setuid) are features that people should be able to
> >
> > Yes, but that's a reason to aim for targeted caps.  Exec_nopriv or
> > whatever is more a sandbox than a namespace feature.
> >
> >> use easily, and system's choice of LSM shouldn't have anything to do
> >> with them.  Not to mention that we're trying to *add* rights (e.g.
> >> unprivileged unshare), and LSM is about *removing* rights.
>
> Is a targeted cap something like "process A can call setdomainname,
> but only on one particular UTS namespace?"

Right, only to the UTS ns in which you live. See for instance
http://thread.gmane.org/gmane.linux.kernel.containers/15934 . It's
how we express for instance that root in a child user_namespace has
CAP_DAC_OVERRIDE over files in the container, but not over the host.
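
To make the idea of a targeted capability concrete, here is a minimal userspace sketch in C. It is a toy model, not kernel code: struct toy_user_ns, struct toy_cred, toy_ns_capable() and the TOY_CAP_* bits are hypothetical names invented for illustration. The rule it implements (a capability held in a namespace counts for that namespace and its descendants, never for an ancestor such as the host) is an assumption about how such a check might look.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
#define TOY_CAP_DAC_OVERRIDE (1u << 1)
#define TOY_CAP_SYS_ADMIN    (1u << 21)

struct toy_user_ns {
	const struct toy_user_ns *parent;	/* NULL for the initial (host) namespace */
};

struct toy_cred {
	const struct toy_user_ns *user_ns;	/* namespace the task lives in */
	unsigned int caps;			/* capabilities held relative to user_ns */
};

/*
 * Return true if `cred` may exercise `cap` over an object owned by `target`.
 * The capability is "targeted": it only counts when `target` is the task's
 * own namespace or one of its descendants.
 */
static bool toy_ns_capable(const struct toy_cred *cred,
			   const struct toy_user_ns *target,
			   unsigned int cap)
{
	const struct toy_user_ns *ns;

	assert(cred != NULL && target != NULL);

	/* Walk from the target up towards the host, looking for the task's ns. */
	for (ns = target; ns != NULL; ns = ns->parent) {
		if (ns == cred->user_ns)
			return (cred->caps & cap) != 0;
	}
	return false;	/* target lies outside the task's namespace subtree */
}
```

In this model, root in a child namespace passes the check for objects owned by the child, and fails it for objects owned by the host.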

-serge
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Andrew Lutomirski on
On Tue, Apr 20, 2010 at 8:37 AM, Stephen Smalley <sds(a)tycho.nsa.gov> wrote:
> On Mon, 2010-04-19 at 16:39 -0500, Serge E. Hallyn wrote:
>> Quoting Andrew Lutomirski (luto(a)mit.edu):
>> >   1. LSM transitions already scare me enough, and if anyone relies on
>> > them working in concert with setuid, then the mere act of separating
>> > them might break things, even if the "privileged" (by LSM) app in
>> > question is well-written.
>>
>> hmm...
>>
>> A good point.
>
> At least in the case of SELinux, context transitions upon execve are
> already disabled in the nosuid case, and Eric's patch updated the
> SELinux test accordingly.

I don't see that code in current -linus, nor do I see where SELinux
affects dumpability. What's supposed to happen? I'm writing a patch
right now to clean this stuff up.

--Andy
From: Serge E. Hallyn on
Quoting Andrew Lutomirski (luto(a)mit.edu):
> On Tue, Apr 20, 2010 at 8:37 AM, Stephen Smalley <sds(a)tycho.nsa.gov> wrote:
> > On Mon, 2010-04-19 at 16:39 -0500, Serge E. Hallyn wrote:
> >> Quoting Andrew Lutomirski (luto(a)mit.edu):
> >> >   1. LSM transitions already scare me enough, and if anyone relies on
> >> > them working in concert with setuid, then the mere act of separating
> >> > them might break things, even if the "privileged" (by LSM) app in
> >> > question is well-written.
> >>
> >> hmm...
> >>
> >> A good point.
> >
> > At least in the case of SELinux, context transitions upon execve are
> > already disabled in the nosuid case, and Eric's patch updated the
> > SELinux test accordingly.
>
> I don't see that code in current -linus, nor do I see where SELinux
> affects dumpability. What's supposed to happen? I'm writing a patch
> right now to clean this stuff up.

check out security/selinux/hooks.c:selinux_bprm_set_creds()

	if (bprm->file->f_path.mnt->mnt_flags & MNT_NOSUID)
		new_tsec->sid = old_tsec->sid;

I assume that's it?
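
The kernel check above keys off the mount flags of the file being executed. From userspace, the same flag can be read with POSIX statvfs(), which reports ST_NOSUID in f_flag. A minimal sketch follows; the helper name path_is_nosuid() is made up for illustration, and it only reports the mount flag, not what any LSM does with it.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/statvfs.h>

/*
 * Return 1 if `path` is on a filesystem mounted nosuid, 0 if it is not,
 * and -1 if statvfs() fails (errno is left as set by statvfs()).
 * Hypothetical helper name, for illustration only.
 */
static int path_is_nosuid(const char *path)
{
	struct statvfs sv;

	assert(path != NULL);

	if (statvfs(path, &sv) != 0)
		return -1;
	return (sv.f_flag & ST_NOSUID) ? 1 : 0;
}
```

The result depends on how the system is mounted, so it should be treated as a runtime fact and not hard-coded.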

-serge
From: Stephen Smalley on
On Tue, 2010-04-20 at 11:53 -0400, Andrew Lutomirski wrote:
> On Tue, Apr 20, 2010 at 11:34 AM, Stephen Smalley <sds(a)tycho.nsa.gov> wrote:
> >>
> >> True, but I think it's still asking for trouble -- other LSMs could
> >> (and almost certainly will, especially the out-of-tree ones) do
> >> something, and I think that any action at all that an LSM takes in the
> >> bprm_set_creds hook for a nosuid (or whatever it's called) process is
> >> wrong or at best misguided.
> >>
> >> Can you think of anything that an LSM should do (or even should be
> >> able to do) when a nosuid process calls exec, other than denying the
> >> request outright? With my patch, LSMs can still reject the open_exec
> >> call.
> >
> > In the case where the context transition would shed permissions rather
> > than gain permissions, it has been suggested that we shouldn't disable
> > the transition even in the presence of nosuid. But automatically
> > computing that for a domain transition is non-trivial, so we have the
> > present behavior for SELinux.
> >
> > There also can be state updates even in the non-suid exec case, e.g.
> > saved uids, clearing capabilities, etc.
>
> Ah, right.
>
> In my patch, execve_nosecurity is (or will be, anyway) documented to
> skip all of this, and it's a new syscall, so nothing should need to be
> done. It doesn't allow anything that a userland ELF loader couldn't
> already do. (I'm not thrilled with changing the behavior of the
> original execve syscall, but one way or another, any nosuid mechanism
> will probably allow programs to exec other things without losing
> permissions that the admin might have expected. I don't see this as a
> real problem, though.)

The further you deviate from existing execve semantics, the less likely
your solution will work cleanly as a transparent replacement for execve
for userland running in this nosuid state, and the less compelling the
case for implementing execve_nosecurity in the kernel vs. just userspace
emulation of it.

> Is it even possible to purely drop permissions in SELinux? If your
> original type was orig_t and your new type is new_t, and if the rights
> granted to orig_t and new_t overlap nontrivially, then what are you
> supposed to do? Check both types for each hook? (Some annoying admin
> could even *change* the rights for orig_t or new_t after execve
> finishes.)

It has always been possible to configure policy such that one type is
less privileged than its caller, and the typebounds construct introduced
in more recent SELinux provides a kernel-enforced mechanism for ensuring
that one type is strictly bounded by the permissions of another type.
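
As a rough illustration of what "strictly bounded" means, here is a toy model in C that treats each type's permissions as a bitmask. This is not how SELinux stores or evaluates policy (real permissions are per-class access vectors), and toy_is_bounded_by(), toy_apply_bound() and the TOY_PERM_* bits are hypothetical names. Clamping the child to the parent's permissions is an assumption about the intended effect, not a description of the SELinux implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy permission bits, for illustration only. */
#define TOY_PERM_READ    (UINT32_C(1) << 0)
#define TOY_PERM_WRITE   (UINT32_C(1) << 1)
#define TOY_PERM_EXECUTE (UINT32_C(1) << 2)

/* True if every permission granted to `child` is also granted to `parent`. */
static bool toy_is_bounded_by(uint32_t child, uint32_t parent)
{
	return (child & ~parent) == 0;
}

/* Permissions a bounded type can actually use: drop anything the parent lacks. */
static uint32_t toy_apply_bound(uint32_t child, uint32_t parent)
{
	uint32_t result = child & parent;

	/* The clamped set can never exceed the parent's permissions. */
	assert(toy_is_bounded_by(result, parent));
	return result;
}
```

With a bound like this in place, a transition from the parent type to the child type can only shed permissions, which is the case discussed earlier in the thread.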

--
Stephen Smalley
National Security Agency

From: Andrew Lutomirski on
On Tue, Apr 20, 2010 at 10:35 AM, Serge E. Hallyn <serue(a)us.ibm.com> wrote:
> Quoting Andrew Lutomirski (luto(a)mit.edu):
>> On Tue, Apr 20, 2010 at 8:37 AM, Stephen Smalley <sds(a)tycho.nsa.gov> wrote:
>> > On Mon, 2010-04-19 at 16:39 -0500, Serge E. Hallyn wrote:
>> >> Quoting Andrew Lutomirski (luto(a)mit.edu):
>>
>> >> > and LSM transitions.  I
>> >> > think this is a terrible idea for two reasons:
>> >> >   1. LSM transitions already scare me enough, and if anyone relies on
>> >> > them working in concert with setuid, then the mere act of separating
>> >> > them might break things, even if the "privileged" (by LSM) app in
>> >> > question is well-written.
>> >>
>> >> hmm...
>> >>
>> >> A good point.
>> >
>> > At least in the case of SELinux, context transitions upon execve are
>> > already disabled in the nosuid case, and Eric's patch updated the
>> > SELinux test accordingly.
>> >
>>
>> True, but I think it's still asking for trouble -- other LSMs could
>> (and almost certainly will, especially the out-of-tree ones) do
>> something, and I think that any action at all that an LSM takes in the
>> bprm_set_creds hook for a nosuid (or whatever it's called) process is
>> wrong or at best misguided.
>
> I could be wrong, but I think the point is that your reasoning is
> correct, and that the same reasoning must apply if we're just
> executing a file out of an fs which has been mounted with '-o nosuid'.

I think Stephen has just convinced me that MNT_NOSUID will never make
sense -- there's odd legacy behavior in there and we'll probably never
get anyone to change it.

So if we give up on changing nosuid, there are a couple of things we
might want to do:

1. A mode where execve acts like all filesystems are MNT_NOSUID. This
sounds like a bad idea (if nothing else, it will cause apps that use
selinux's exec_sid mechanism (runcon?) to silently malfunction).

2. A mode where execve (or a new syscall?) has no effect on
credentials at all. This is conceptually simple and it would be great
for new userspace code, especially code that wants to do something
sandbox-like. For simplicity, even things like the effective and
inheritable capability sets should probably remain unchanged. In this
mode, we'll have to disallow execing unreadable files. securebits are
(almost) irrelevant. This is what my patch does. Dealing with
AT_SECURE will be awkward at best, so programs that enter this mode
should sanitize their own environments and should be very careful if
they were setuid. (But they should do that anyway.)

There are a couple of annoyances to deal with. First, there are LSM
API issues, like this code in SELinux:

new_tsec->osid = old_tsec->sid;

/* Reset fs, key, and sock SIDs on execve. */
new_tsec->create_sid = 0;
new_tsec->keycreate_sid = 0;
new_tsec->sockcreate_sid = 0;

and this code in commoncap:

new->suid = new->fsuid = new->euid;
new->sgid = new->fsgid = new->egid;

I have no problem keeping these.

The other annoyance is cap_effective. We could clear it on every exec
(what commoncap does for non-legacy executables, I think), but that
would completely break any legacy code running as root. We could set
it to cap_permitted on every exec, which sounds like bad engineering
even though I don't see any specific problem with it. We could also
just leave it alone across exec, which might have odd side effects for
programs which change their effective set and then call exec without
thinking. (We could also emulate current behavior: in SECURE_NOROOT
mode, clear effective, and otherwise set it depending on euid. This
may be the best idea, since securebits already affects setuid(). This
emulation should *not* extend to cap_permitted or cap_inheritable.)

Empirically, my Fedora system is almost completely usable in this mode
(with cap_effective just passing through unchanged).

3. Some intermediate mode meant for userspace code that wants to
create containers or otherwise manipulate dangerous things but that
still wants to execute legacy code. Breaking out of containers on exec
sounds like a really bad idea. Off the top of my head, I can think of
a couple of possibilities:

3a. Treat all executables like they have some standard (safe) label.
This could be: fP = 0, fI = everything, no setuid/setgid, and whatever
LSM label makes sense (file_t or something new for selinux, perhaps).
LSMs might want to add weird rules for what can exec what, but they
*must not* ever increase permission. Decreasing permission (with
selinux typebounds?) could be done, but I'm happy to leave that for
new features that the LSM people could add if they want.

3b. Whatever the final version of Eric's patch was.
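
For 3a, the effect of "fP = 0, fI = everything" can be worked out from the file-capability rule documented in capabilities(7), pP' = (pI & fI) | (fP & bounding set). The sketch below is a userspace toy in C; toy_cap_t, toy_new_permitted() and toy_new_permitted_3a() are hypothetical names, and the effective set and LSM labels are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t toy_cap_t;		/* toy capability bitmask */
#define TOY_CAP_ALL (~(toy_cap_t)0)

/* pP' = (pI & fI) | (fP & bounding set), as documented in capabilities(7). */
static toy_cap_t toy_new_permitted(toy_cap_t p_inheritable,
				   toy_cap_t f_inheritable,
				   toy_cap_t f_permitted,
				   toy_cap_t bounding)
{
	return (p_inheritable & f_inheritable) | (f_permitted & bounding);
}

/* The "standard label" from 3a: fP = 0 and fI = everything. */
static toy_cap_t toy_new_permitted_3a(toy_cap_t p_inheritable,
				      toy_cap_t bounding)
{
	toy_cap_t result = toy_new_permitted(p_inheritable, TOY_CAP_ALL,
					     0, bounding);

	/* With fP = 0 the file adds nothing; the result is exactly pI. */
	assert(result == p_inheritable);
	return result;
}
```

Under this label the new permitted set is just the old inheritable set, so the executable itself never contributes a capability.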


Any thoughts on what we want to do? (2) seems most likely to survive
bashing on LKML.
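
The cap_effective choices listed under (2) can be compared side by side in a small userspace toy model in C. All names here (struct toy_exec_cred, toy_exec_nosecurity(), the TOY_EFF_* policies) are hypothetical and this is not the patch itself. The "emulate" branch encodes one reading of "depending on euid": euid 0 gets effective = permitted, and everyone else gets an empty effective set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t toy_cap_t;	/* toy capability bitmask */

enum toy_eff_policy {
	TOY_EFF_CLEAR,		/* clear the effective set on every exec */
	TOY_EFF_PERMITTED,	/* set effective = permitted on every exec */
	TOY_EFF_KEEP,		/* leave the effective set alone */
	TOY_EFF_EMULATE		/* mimic current behavior via securebits and euid */
};

struct toy_exec_cred {
	unsigned int euid;
	bool secure_noroot;	/* stands in for the SECURE_NOROOT securebit */
	toy_cap_t permitted;
	toy_cap_t inheritable;
	toy_cap_t effective;
};

/* Compute the credentials after an exec that otherwise changes nothing. */
static struct toy_exec_cred toy_exec_nosecurity(struct toy_exec_cred old,
						enum toy_eff_policy policy)
{
	struct toy_exec_cred next = old;	/* permitted, inheritable untouched */

	/* Precondition: the effective set is a subset of the permitted set. */
	assert((old.effective & ~old.permitted) == 0);

	switch (policy) {
	case TOY_EFF_CLEAR:
		next.effective = 0;
		break;
	case TOY_EFF_PERMITTED:
		next.effective = old.permitted;
		break;
	case TOY_EFF_KEEP:
		break;
	case TOY_EFF_EMULATE:
		/* Assumed reading: only euid 0 without SECURE_NOROOT keeps caps. */
		if (!old.secure_noroot && old.euid == 0)
			next.effective = old.permitted;
		else
			next.effective = 0;
		break;
	}
	return next;
}
```

In every branch only the effective set changes; cap_permitted and cap_inheritable pass through untouched, matching the note that the emulation should not extend to them.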


--Andy

P.S. Rather than targeted capabilities, why not have namespaces come
with file descriptors that let you control them? sethostname and
setdomainname could be ioctls on the UTS namespace fd, and a network
namespace could come with two fds: one would be (or function as) a
netlink socket and the other would either let you bind low-numbered
ports just by possessing it or would have ioctls or something that
replace bind. FS namespaces still seem scary.