From: Joshua Maurice on 16 Apr 2010 17:56

I'm somewhat new to POSIX. It seems that the only way to create a new process is fork. However, fork inherits all file descriptors, and exec closes only the file descriptors marked as "close on exec".

I generally spawn a separate process because of the isolation this affords. If a process misbehaves, like if it has a resource leak, I know that when that process dies the resource leak will generally go away. However, if a process misbehaves, like not setting "close on exec" when opening the file descriptor (an option only available in recent Linux kernels), it's possible that I will leak a file descriptor to that child and all direct and indirect grandchildren.

So, how does one generally deal with this? Close all file descriptors from 3 to the max possible file descriptor? "/proc/self/fd" is a good alternative, but it is not portable in practice, and it is not POSIX, so it is not portable in theory either. What do other people do?

Also, what other resources should I be concerned about when doing a fork + exec? What other possible resources can "leak" into the child and all grandchildren?

PS: There really should be a spawn process ala win32. This should not replace fork, but there should be an alternative to fork to bring up a clean process. That, or there should be sane interfaces to accomplish the same: to guarantee that I don't have any random open resources which I will continue to leak into my direct and indirect grandchildren. And no, posix_spawn is not that. It is defined to have the same semantics as fork + exec, with all of the baggage which comes along with it. I'm just trying to program defensively, and POSIX is making it hard for me to do that.
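For illustration, the "/proc/self/fd" alternative mentioned above might look roughly like this. It is only a sketch under stated assumptions: close_via_proc is a hypothetical helper name, the code assumes a Linux-style /proc where each directory entry is named after an open descriptor, and it reports failure rather than falling back when /proc is not available. Note also that opendir() may allocate memory, so calling this between fork and exec in a multithreaded program is not obviously safe.

```c
#include <dirent.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: close every open descriptor >= lowfd by listing
 * /proc/self/fd. Returns the number of descriptors closed, or -1 if the
 * directory cannot be opened (for example, no /proc on this system). */
static int close_via_proc(int lowfd)
{
    DIR *dir = opendir("/proc/self/fd");
    if (dir == NULL)
        return -1;

    int self = dirfd(dir);      /* the descriptor used for the listing itself */
    int closed = 0;
    struct dirent *ent;

    while ((ent = readdir(dir)) != NULL) {
        char *end;
        long fd = strtol(ent->d_name, &end, 10);

        /* Skip "." and ".." and anything that is not a plain number. */
        if (end == ent->d_name || *end != '\0')
            continue;
        if (fd < lowfd || fd == self)
            continue;
        if (close((int)fd) == 0)
            closed++;
    }
    closedir(dir);              /* also releases the listing descriptor */
    return closed;
}
```

A caller would typically invoke close_via_proc(3) in the child and fall back to a brute-force loop if it returns -1.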
From: Chris Friesen on 16 Apr 2010 19:09

On 04/16/2010 03:56 PM, Joshua Maurice wrote:

> So, how does one generally deal with this? Close all file descriptors
> from 3 to the max possible file descriptor?

Yep.

If you want to be really anal, close all possible file descriptors and then reopen 0/1/2 as desired.

There's good information on this at:
http://stackoverflow.com/questions/899038/getting-the-highest-allocated-file-descriptor

> Also, what other resources should I be concerned about when doing a
> fork + exec? What other possible resources can "leak" into the child
> and all grandchildren?

This is all covered in the man pages for fork() and exec(). Generally, open files of various kinds are what you need to worry about. Record locks taken with fcntl() are not inherited across fork() but are preserved across exec().

> PS: There really should be a spawn process ala win32. This should not
> replace fork, but there should be an alternative to fork to bring up a
> clean process.

Arguably, yes.

Chris
From: Scott Lurndal on 16 Apr 2010 19:19

Joshua Maurice <joshuamaurice(a)gmail.com> writes:

> However, if a process misbehaves, like not setting "close on
> exec" when opening the file descriptor (an option only available in
> recent Linux kernels)

The "Close on Exec" option has been part of _every_ unix and linux kernel since basically forever. In Unix v7 it was an ioctl (FIOCLEX/FIONCLEX); in System V it was made an fcntl(2) flag.

> , it's possible that I will leak a file
> descriptor to that child and all direct and indirect grandchildren.

Most applications that use fork/exec to spawn processes will stick a loop between the fork and exec to close all file descriptors except 0, 1 and 2 (and will often redirect those, perhaps to pipes, as well).

Given that a process started by the shell will typically (but not always) have file descriptors 0, 1 and 2 in use and all others closed, the only file descriptors you don't have control over are those used by libraries. The above loop will accommodate applications which use libraries that open files and forget to set CLOEXEC.

> So, how does one generally deal with this? Close all file descriptors
> from 3 to the max possible file descriptor?

Yes, this is the typical solution for applications that don't control all the files that may be opened.

When I was on the X/Open base working group in the 90's, I lobbied for a 'closeall' function that would close all file descriptors above the provided fd, but it was never accepted (primarily since at the time, X/Open didn't invent, but rather attempted to standardize existing practice).

> Also, what other resources should I be concerned about when doing a
> fork + exec? What other possible resources can "leak" into the child
> and all grandchildren?

man exec.

> PS: There really should be a spawn process ala win32. This should not

man posix_spawn

scott
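The close loop between fork and exec described in the post above could be sketched along these lines. This is a minimal illustration, not code from the thread: close_from and spawn_clean are hypothetical names, the upper bound is taken from the RLIMIT_NOFILE soft limit (so a descriptor opened before that limit was lowered could be missed), and the fallback of 1024 is an arbitrary assumption.

```c
#include <limits.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: close every descriptor from lowfd up to the
 * RLIMIT_NOFILE soft limit. close() failing with EBADF on an unused
 * slot is expected and ignored. */
static void close_from(int lowfd)
{
    struct rlimit rl;
    long maxfd = 1024;  /* assumed fallback if the limit is unusable */

    if (getrlimit(RLIMIT_NOFILE, &rl) == 0 &&
        rl.rlim_cur != RLIM_INFINITY && rl.rlim_cur <= (rlim_t)INT_MAX)
        maxfd = (long)rl.rlim_cur;

    for (long fd = lowfd; fd < maxfd; fd++)
        (void)close((int)fd);
}

/* Hypothetical helper: fork, close everything above stderr in the
 * child, exec argv, and wait. Returns the child's exit status, or -1
 * if fork/wait fails or the child did not exit normally. */
static int spawn_clean(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        close_from(3);          /* keep only stdin, stdout, stderr */
        execvp(argv[0], argv);
        _exit(127);             /* reached only if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With a large soft limit the loop issues one close() call per possible descriptor, which is the cost raised later in the thread.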
From: Joshua Maurice on 16 Apr 2010 20:21

On Apr 16, 4:19 pm, sc...(a)slp53.sl.home (Scott Lurndal) wrote:

> Joshua Maurice <joshuamaur...(a)gmail.com> writes:
> > However, if a process misbehaves, like not setting "close on
> > exec" when opening the file descriptor (an option only available in
> > recent Linux kernels)
>
> The "Close on Exec" option has been part of _every_ unix and linux kernel
> since basically forever. In Unix v7 it was an ioctl (FIOCLEX/FIONCLEX),
> in System V it was made an fcntl(2) flag.

Race condition. Up until a recent Linux kernel version, you could not set close on exec in open; you could only set it with fcntl. In a multithreaded program, there is a small window between open and fcntl in which fork could be called, resulting in that file descriptor being leaked. This gap in correctness was closed when it became possible to pass O_CLOEXEC to open. See http://udrepper.livejournal.com/20407.html for full details.

> > , it's possible that I will leak a file
> > descriptor to that child and all direct and indirect grandchildren.
>
> Most applications that use fork/exec to spawn processes will stick
> a loop between the fork and exec to close all file descriptors
> except 0, 1 and 2 (and will often redirect those, perhaps to pipes,
> as well)
>
> Given that a process opened by the shell will typically (but not always)
> have file descriptors 0, 1 and 2 in use and all others closed, the only
> file descriptors you don't have control over are those used by
> libraries. The above loop will accommodate applications which use libraries
> that open files and forget to set CLOEXEC.
>
> > So, how does one generally deal with this? Close all file descriptors
> > from 3 to the max possible file descriptor?
>
> Yes, this is the typical solution for applications that don't
> control all the files that may be opened.
>
> When I was on the X/Open base working group in the 90's, I lobbied
> for a 'closeall' function that would close all file descriptors
> above the provided fd, but it was never accepted (primarily since at
> the time, X/Open didn't invent, but rather attempted to standardize
> existing practice).

Yes, but the potential max can be quite large, and that's just wasted time. I suppose it's not that bad if you're not spawning that many processes, or if the max is suitably small. I just hope I don't run into a system where the max file descriptor is a 64 bit int max.

Then there's still the problem that I want to program defensively, and not have to rely upon a library's guarantee that it creates all file handles "close on exec". And when I'm doing automated testing of my product, I would prefer to isolate these leaks in software which is still under development.

> > Also, what other resources should I be concerned about when doing a
> > fork + exec? What other possible resources can "leak" into the child
> > and all grandchildren?
>
> man exec.

Thanks for the terseness. [Sarcasm]. I was looking for more pearls of wisdom from those more experienced, like common gotchas.

> > PS: There really should be a spawn process ala win32. This should not
>
> man posix_spawn

Did you even read my full post? I specifically said that posix_spawn is not that, in the very next sentences of the post to which you're replying. It carries all of the same semantics as fork + exec, which includes possibly leaking across process boundaries. That extra baggage is exactly what I don't want to deal with most of the time. Most of the time, I just want to be able to create a new process without worrying about leaked file handles, which signal masks get inherited, etc.
From: William Ahern on 16 Apr 2010 20:30
Chris Friesen <cbf123(a)mail.usask.ca> wrote:

> On 04/16/2010 03:56 PM, Joshua Maurice wrote:
> > So, how does one generally deal with this? Close all file descriptors
> > from 3 to the max possible file descriptor?
>
> Yep.
>
> If you want to be really anal, close all possible file descriptors and
> then reopen 0/1/2 as desired.
>
> There's good information on this at:
> http://stackoverflow.com/questions/899038/getting-the-highest-allocated-file-descriptor

A good post, but it's missing the most portable option, getdtablesize(2). It's often considered "non-portable", and yet it's available in Linux, *BSD, AIX, Solaris, and HP-UX (at least according to their online documentation).

Some of the man pages say that it is equivalent to both the RLIMIT_NOFILE soft limit and the descriptor table size. I'm unsure whether setrlimit will successfully lower the soft limit below the highest numbered descriptor already allocated.

In any event, getdtablesize() is the best fallback for when a local API (such as those mentioned in the URI above) isn't available.
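A sketch of the getdtablesize() fallback suggested above, for illustration only. fd_upper_bound and close_above are hypothetical names; the code assumes getdtablesize() is declared in <unistd.h> (on glibc this holds under the default feature-test macros), and the final value of 256 is an arbitrary last-resort assumption. It does not resolve the open question in the post about descriptors numbered above a lowered soft limit.

```c
#include <limits.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical helper: exclusive upper bound on descriptor numbers.
 * Prefers getdtablesize(), then the RLIMIT_NOFILE soft limit. */
static int fd_upper_bound(void)
{
    int n = getdtablesize();
    if (n > 0)
        return n;

    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) == 0 &&
        rl.rlim_cur != RLIM_INFINITY && rl.rlim_cur <= (rlim_t)INT_MAX)
        return (int)rl.rlim_cur;

    return 256;                 /* assumed last-resort value */
}

/* Hypothetical helper: close every descriptor in [lowfd, bound) and
 * return how many were actually open. */
static int close_above(int lowfd)
{
    int closed = 0;
    int bound = fd_upper_bound();

    for (int fd = lowfd; fd < bound; fd++)
        if (close(fd) == 0)
            closed++;
    return closed;
}
```

In a child process this would be called as close_above(3) just before exec, after any platform-specific mechanism has been tried and found unavailable.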