From: Joshua Maurice on 17 Apr 2010 17:02
On Apr 17, 11:00 am, "Ersek, Laszlo" <la...(a)caesar.elte.hu> wrote:
> On Fri, 16 Apr 2010, Joshua Maurice wrote:
> > On Apr 16, 4:19 pm, sc...(a)slp53.sl.home (Scott Lurndal) wrote:
> >> Joshua Maurice <joshuamaur...(a)gmail.com> writes:
> >>> However, if a process misbehaves, like not setting "close on
> >>> exec" when opening the file descriptor (an option only available in
> >>> recent Linux kernels)
> >> The "Close on Exec" option has been part of _every_ unix and linux kernel
> >> since basically forever. In Unix v7 it was an ioctl (FIOCLEX/FIONCLEX),
> >> in System V it was made an fcntl(2) flag.
> > Race condition. Up until a recent Linux kernel version, you could not
> > set close on exec in open; you could only set it with fcntl. In a
> > multithreaded program, there is a small window between open and fcntl
> > in which fork could be called, resulting in that file descriptor being
> > leaked. This lack of possible correctness was fixed when you could
> > specify O_CLOEXEC to open. See
> > http://udrepper.livejournal.com/20407.html
> > for full details.
> Okay, now I see; I've read your other posting earlier. I would not have
> suggested the redirection of fd 77 from a temporary regular file under
> these circumstances.
> However, I can't help but note the following (perhaps I'm conflating two
> different objectives of yours):
> - first you wish to get rid of the complete inherited process environment,
> - then you complain you can't name a single property that would permeate a
> whole process tree, connected by nothing else than fork() lineage.
> I kind of see a contradiction between these points.
Only if you take it at face value and twist the intent. My goal for
using processes is to guarantee a degree of fault isolation. If one
process behaves badly, preferably it should not affect other
processes. Fault isolation is generally required for long lasting
systems, as some component will eventually misbehave.
As a subtly different point, I also need this guarantee to write a
long lasting system. If rarely a process leaks file descriptors or
other resources to its children, and children keep spawning children
(in a controlled way), then eventually I'll run out of resources. It
seems to me to be a simple request to have a simple way to prevent
such leaks.
The other post I made about being able to kill a process tree is again
about fault isolation and preventing resource leaks. If I have a job,
and I know that job may be faulty (such as tests on code under
development, or more generally any piece of code), I would like a way
to kill that piece of code and reclaim all of its acquired resources.
> What's worse, the wish to close file descriptors for security reasons
> offers a false sense of security. As long as a process can ptrace()
> another one, it is all snake oil. I can attach a gdb instance to any
> process, call fstat() on fd's 0 to 1024, call lseek(fd, 0, SEEK_CUR) to
> find out file offsets, call getpeername() to find out about internet
> peers, call pipe() and dup2() and fork() to embed a sniffer child between
> the original program code and the socket it writes to. I could read byte
> arrays before encryption, all on the process' behalf.
> Unless ptrace() is disabled on a system by default, or it is guaranteed
> that all subprocesses that should not have address-space level access to
> the parent and/or each other, are exec()'d from setuid images with
> pairwise different non-privileged uid's, I think this debate about setting
> FD_CLOEXEC atomically with open() is pointless. (Or, at least,
> insufficient in itself.)
> In the stackoverflow.com example, the parent itself is privileged enough
> to set different uid's for its children between the fork() and exec()
> calls. Unfortunately, if an external library calls fork() + exec()
> anywhere (even in a synchronously called subroutine), it can (and most
> probably will) omit this crucial step, and then again we'll have a child
> process that can ptrace() the parent or do whatever else it wants. The end
> result is that one can't link a library into a binary designed to be run
> as root without knowing that library inside out. But in that case, all
> fork() points are known, and the programmer might as well use manual
> close() instead of FD_CLOEXEC.
Security is another issue. That requires more than I'm asking. I'm not
asking to prevent malicious code from messing with the system (though
something like what I want would probably be required). Instead, I'm
merely trying to create a stable system, one without resource leaks.
From: Kenny McCormack on 17 Apr 2010 18:21
In article <aec191c5-abb5-4179-b93d-1e1963cedc3b(a)e7g2000yqf.googlegroups.com>,
Joshua Maurice <joshuamaurice(a)gmail.com> wrote:
>Security is another issue. That requires more than I'm asking. I'm not
>asking to prevent malicious code from messing with the system (though
>something like what I want would probably be required). Instead, I'm
>merely trying to create a stable system, one without resource leaks.
I understand where you're coming from (and that security from malicious
code is a side issue). But I think in the real world, most people deal
with this problem by the simple expedient of rebooting frequently.
Given that in the real world, most systems are running (cough, cough)
operating systems made in Redmond, this seems a safe bet.
In line with this, I think the suggestion to run your builds in a VM
that boots, runs your build, and shuts down, is the best idea going.
> (This discussion group is about C, ...)
Wrong. It is only OCCASIONALLY a discussion group about C; mostly,
like most "discussion" groups, it is off-topic Rorschach revelations
of the childhood traumas of the participants...
From: William Ahern on 17 Apr 2010 19:38
Kenny McCormack <gazelle(a)shell.xmission.com> wrote:
> In article <aec191c5-abb5-4179-b93d-1e1963cedc3b(a)e7g2000yqf.googlegroups.com>,
> Joshua Maurice <joshuamaurice(a)gmail.com> wrote:
> >Security is another issue. That requires more than I'm asking. I'm not
> >asking to prevent malicious code from messing with the system (though
> >something like what I want would probably be required). Instead, I'm
> >merely trying to create a stable system, one without resource leaks.
> I understand where you're coming from (and that security from malicious
> code is a side issue). But I think in the real world, most people deal
> with this problem by the simple expedient of rebooting frequently.
> Given that in the real world, most systems are running (cough, cough)
> operating systems made in Redmond, this seems a safe bet.
I'm not sure what that has to do w/ the price of tea in China. The type of
products that other people use in the privacy and sanctity of their own
server rooms is their own business. Certainly I'm not going to base my own
expectations around their decisions. (Maybe they derive a certain
satisfaction from rebooting the same way I do compulsively calling sync from
the command-line--a habit I picked up in the early days of Linux when the
kernel was significantly less reliable.)
> In line with this, I think the suggestion to run your builds in a VM
> that boots, runs your build, and shuts down, is the best idea going.
Duplicated descriptors aren't really a resource leak in the common sense of
the term (cf. leak in a security context). The resources usually aren't
lost, so to speak; their lifetimes are just prolonged. (This is comparable
to garbage collected languages where the developer fails to explicitly close
a descriptor before losing the object reference.) They could be a resource
leak, but only in the most contrived scenarios, such as unbounded recursive
forking. (Who does that? A dyed-in-the-wool Lisp programmer hacking shell
scripts?) Disregarding issues with dismounting filesystems, the issue is
Stay away from libraries that keep static or global state, and the problem
vanishes (assuming your own code is smart enough to cleanup after itself).
In my experience, blindly closing all descriptors is usually a step in a
belt+suspenders utilitarian approach to some task.
Descriptors "leaking" into other processes is, of course, entirely intended.
It's fundamental to the Unix process model--and sensibilities--and it's no
surprise that this is the default behavior. Consider how `/bin/sh -c "cat
<&4"' works. The common case is made simple, and the uncommon case--spawning
long-lived (i.e. indefinite) processes--is burdened w/ the extra work. If
this actually caused issues in reality, then Unix would be the platform
everybody rebooted the most. (And there's no reason to presume that Unix
developers possess more expertise than Windows developers.)
From: Ersek, Laszlo on 17 Apr 2010 20:00
(I think the practical utility of my post will be nil, so read on with
that in mind.)
On Sat, 17 Apr 2010, Joshua Maurice wrote:
> [...] My goal for using processes is to guarantee a degree of fault
> isolation. If one process behaves badly, preferably it should not affect
> other processes.
The kernel does support such a separation between processes. The problem
is, as I see it, that when one process *leaks* file descriptors, in your
terminology, the kernel actually sees a *request* to bequeath file
descriptors to a child process. fork() was *designed* to do that, among
other things.
Thus the root of the issue seems to be library code forking without your
knowledge or permission. Unfortunately, within a single process, the
kernel seems not to provide preemptive protection; all parts must
co-operate. If you think about it, the library code can do much worse things
to your state than asking the kernel (on your behalf) to pass on file
descriptors: it can trample all over your data.
In short (and this may be as misguided as it is of little consolation), within
the process, you're exposed to much greater dangers, and between
processes, the kernel only does what your process explicitly asks it to
do. Your idea of where the enemy territory begins differs from the
kernel's.
Let's replace for a second the usual library concept with a separate
process that is co-operating via RPC, via AF_UNIX sockets. Performance
would go down the drain, but the co-operating process could not leak
*your* resources inadvertently, eg. log files soon to be rotated.
I googled around previously when reading your posts. I found this:
Re: Providing an ELF flag for disabling LD_PRELOAD/ptrace()
If Alan Cox calls "the complete lack of a security boundary between
processes of the same user" "the normal Unix model", then we might be
allowed to call the non-separation between different functions in the same
process the normal Unix model too.
Whether this suits modern heterogeneous software development, that's a
different question. I feel your pain.
Looking back at your original post:
1) resources inherited through fork() and exec():
2) "spawn process ala win32": you could write a server program that takes
command lines over some kind of IPC mechanism and starts an according
process from a pristine environment. Now that we're talking about it, I
seem to remember one such server program; it's usually called "sshd".
3) If you want to trap fork() calls in library code, you could write your
own fork() wrapper. You could identify your own calls by relying on a
static variable in the wrapper, or in a multi-threaded process, by saving
and comparing thread identifiers, or by checking thread-specific data.
From: David Schwartz on 17 Apr 2010 20:17
On Apr 17, 7:23 am, Casper H.S. Dik <Casper....(a)Sun.COM> wrote:
> In Solaris it is possible to open a file descriptor, dup it to the
> highest available file descriptor and then lower the limit; Solaris
> also has closefrom(), which is what Solaris applications use.
Wow, I never thought of that. In fact, that's a fairly realistic fear
in this case. An application with a high limit might well drop the
limit on open file descriptors before exec'ing another process.