|
From: A on 13 Feb 2007 14:22

I am getting intermittent unexpected result from waitpid on Solaris 9
running Perl 5.8.8.

Here is the scenario (the bare bones code is below).

Program_A, written in Perl, is invoked about a million times every
day. Most of the times, it invokes (using fork-exec) Program_B which
is written in C++. Program_A uses waitpid to get the exit code of
Program_B.
It works fine most of the times, but about a few dozen times every
day, the waitpid apparently fails and when it fails, I get

    $? is -1
    $! is "No child processes"

In all of the cases I have investigated, the child process, Program_B,
started and completed gracefully with "exit(0)" and of course, the pid-s
match from the trace log of both processes.

The output, from the code below, in such case is

    Child pid=5196, exitCode=0xffffffff (No child processes)

Program_A itself is transient and short lived, and, depending on its
input, executes Program_B at most once.

What am I doing wrong? How to detect and correct this?

Thanks for your help.

# ------------------------------------------- begin code -------------------------------------------------
#!/usr/local/bin/perl
# program_A

my $cpid;
my $ec = undef;
my $em = undef;

sub getChildStatus
{
    my $tc = undef;
    my $tm = undef;
    my $r = undef;
    while ( 1 ) {
        $r = waitpid($cpid, 0);
        $tc = $?;
        $em = $!;
        last if ( -1 == $r || $r == $cpid );
        print STDERR "waitpid($cpid, 0) returned $r ( $? )\n";
    }
    if ( $cpid == $r ) {
        $ec = $tc;
        $em = $tm;
    }
}

sub sigCLDhandler
{
    my $sig = shift;
    print STDERR "caught SIG $sig\n";
    getChildStatus;
}

sub runIt
{
    my $oldSigCld = $SIG{CLD};
    local $SIG{CLD} = \&sigChldHandler;
    $cpid = fork;
    if ( ! defined $cpid ) {
        print STDERR "fork failed [ $! ]\n";
        return;
    }
    if ( 0 == $cpid ) {
        print STDERR "child pid $$ starting\n";
        exec program_B, .. .. ..;
        print STDERR "child pid $$: exec failed [$!], exiting with -1\n";
        exit(-1);
    } # 0 == $cpid i.e. the child

    getChildStatus; # only the parent reaches here
    $SIG{CLD} = $oldSigCld ;
} # runIt

#
# main
#
runIt;
if ( $ec ) {
    printf STDERR "Child pid=$cpid exitcode=%#08x msg=(%s)\n", $ec, $em;
}
# ------------------------------------------- end code -------------------------------------------------

From: xhoster on 13 Feb 2007 15:44

"A" <ad_101(a)yahoo.com> wrote:
> I am getting intermittent unexpected result from waitpid on Solaris 9
> running Perl 5.8.8.
>
> Here is the scenario (the bare bones code is below).
>
> Program_A, written in Perl, is invoked about a million times every
> day. Most of the times, it invokes (using fork-exec) Program_B which
> is written in C++. Program_A uses waitpid to get the exit code of
> Program_B.
> It works fine most of the times, but about a few dozen times every
> day, the waitpid apparently fails and when it fails, I get
>
> $? is -1
> $! is "No child processes"
>
> In all of the cases I have investigated, the child process, Program_B,
> started and completed gracefully with "exit(0)" and of course, the pid-s
> match from the trace log of both processes.
>
> The output, from the code below, in such case is
>
> Child pid=5196, exitCode=0xffffffff (No child processes)
>
> Program_A itself is transient and short lived, and, depending on its
> input, executes Program_B at most once.
>
> What am I doing wrong?

You are mucking with $SIG{CLD} when, as far as I can tell, you have
no need to. getChildStatus (and the waitpid in it) can get called twice,
once from the sig handler and once from the runIt. If it does get called
twice, the second time that child no longer exists, as it was already
waited on. Remove the $SIG{CLD} stuff.

Xho

-- 
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service $9.95/Month 30GB

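
The advice above (drop the handler and reap the child in one place only) might look roughly like the following. This is a minimal sketch, not the poster's program: `run_child` is a hypothetical helper name, and the running perl binary (`$^X`) with `-e 'exit 3'` stands in for Program_B, whose real command line is not given in the thread. It assumes a Unix-like system where SIGCHLD is left at its default disposition.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: run a command, wait for it, and return the raw
# wait status, or undef if fork or waitpid fails.
# No SIGCHLD handler is installed, so waitpid runs exactly once.
sub run_child {
    my @cmd = @_;

    my $pid = fork;
    if ( !defined $pid ) {
        warn "fork failed: $!\n";
        return;
    }

    if ( $pid == 0 ) {
        # Child: exec returns only if it failed.
        exec { $cmd[0] } @cmd or warn "exec failed: $!\n";
        POSIX::_exit(127);
    }

    # Parent: reap once and copy $? immediately, before anything
    # else has a chance to change it.
    my $reaped = waitpid( $pid, 0 );
    my $status = $?;
    if ( $reaped != $pid ) {
        warn "waitpid($pid, 0) returned $reaped: $!\n";
        return;
    }
    return $status;
}

# Stand-in for Program_B (an assumption for this example only).
my $status = run_child( $^X, '-e', 'exit 3' );
printf "exit code=%d signal=%d\n", $status >> 8, $status & 127;
```

Because nothing else in the process calls waitpid, there is no second caller that can consume the child or clobber `$?` and `$!` between the wait and the copy.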
From: A on 14 Feb 2007 11:12

On Feb 13, 3:44 pm, xhos...(a)gmail.com wrote:
>
> You are mucking with $SIG{CLD} when, as far as I can tell, you have
> no need to. getChildStatus (and the waitpid in it) can get called twice,
> once from the sig handler and once from the runIt. If it does get called
> twice, the second time that child no longer exists, as it was already
> waited on. Remove the $SIG{CLD} stuff.
>
> Xho

Thanks for your reply.

First, there's a typo in my original message.

The third line after the while(1) in getChildStatus should be
    $tm = $!;
instead of
    $em = $!;

Now, to the point that the waitpid could get called twice.

Please note that the code is designed to guard against this, the
assignments to the globals $ec and $em are done if and only if waitpid
returns the matching pid. So, even if it is called twice, the second
time waitpid returns -1, and then getChildStatus returns without
modifying the globals.

On your advice to remove the $SIG{CLD}, there are 3 statements,

    the first statement saves the handler,
    the second statement installs the current one needed by this
    routine
    and the last one re-installs the saved handler.

which one(s) would you suggest I remove?

Yes, there's a deficiency (bug, if you will) in the code. The
$SIG{CLD} should be re-installed if fork fails, but that I think, is
of no consequence to the problem at hand.

Thanks again.

From: xhoster on 14 Feb 2007 12:31

"A" <ad_101(a)yahoo.com> wrote:
> On Feb 13, 3:44 pm, xhos...(a)gmail.com wrote:
> >
> > You are mucking with $SIG{CLD} when, as far as I can tell, you have
> > no need to. getChildStatus (and the waitpid in it) can get called
> > twice, once from the sig handler and once from the runIt. If it does
> > get called twice, the second time that child no longer exists, as it
> > was already waited on. Remove the $SIG{CLD} stuff.
> >
> > Xho
>
> Thanks for your reply.
>
> First, there's a typo in my original message.
>
> The third line after the while(1) in getChildStatus should be
> $tm = $!;
> instead of
> $em = $!;
>
> Now, to the point that the waitpid could get called twice.
>
> Please note that the code is designed to guard against this, the
> assignments to the globals $ec and $em are done if and only if waitpid
> returns the matching pid.

The waitpid of one getChildStatus returns the expected pid and sets the
global $? and $!. Before it can do anything else, the waitpid of the
other getChildStatus returns -1 and overwrites the global $? and $! with
its own values, but for this one $r does not meet the if, and so it
returns control to the first getChildStatus. The first getChildStatus
has the right pid recorded in $r (as that was a lexical and didn't get
overwritten), but has the wrong $? and $! because they did get
overwritten, and now those get recorded into your $tc and $tm.

> On your advice to remove the $SIG{CLD}, there are 3 statements,
>
> the first statement saves the handler,
> the second statement installs the current one needed by this
> routine
> and the last one re-installs the saved handler.
>
> which one(s) would you suggest I remove?

Probably all of them, but it is not really possible to know from what
you give. We would need to see the code that set the original handler
that is getting saved and then restored. If the handler you inherit is
necessary, then why would it be safe to overwrite it with something else
for even the duration of this routine? On the other hand, if the handler
you inherit is not necessary, then what is the point of saving and
re-installing it?

If there is no other code which installs a handler in the first place,
then I'd remove all three of those things. (And even if not, remove at
least two, see below)

> Yes, there's a deficiency (bug, if you will) in the code. The
> $SIG{CLD} should be re-installed if fork fails, but that I think, is
> of no consequence to the problem at hand.

Since you use local to install the handler, I think the old one will be
reinstalled upon fork failure anyway. Saving the old one explicitly and
reinstalling it explicitly seems to be unnecessary, assuming the local
is doing its job.

Xho

-- 
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service $9.95/Month 30GB

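
The point about local can be checked with a small sketch along these lines. Everything here is hypothetical and for illustration only: `old_handler`, `temp_handler` and `with_temp_handler` are made-up names, the early return merely simulates the "fork failed" path, and `CHLD` is used as the signal name (the thread's `CLD` is an alias for it on systems that provide one).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical handlers; they do nothing and exist only to be compared.
sub old_handler  { }
sub temp_handler { }

# Pretend some earlier code installed a handler.
$SIG{CHLD} = \&old_handler;

# Installs a temporary handler with local and leaves the scope either
# early (like the "fork failed" return) or normally.
sub with_temp_handler {
    my ($fail_early) = @_;
    local $SIG{CHLD} = \&temp_handler;
    return 'early' if $fail_early;
    return 'normal';
}

# Record, for each exit path, whether the old handler is back in place.
my %restored;
for my $fail_early ( 0, 1 ) {
    my $how = with_temp_handler($fail_early);
    $restored{$how} = ( $SIG{CHLD} == \&old_handler ) ? 'yes' : 'no';
    print "$how return: old handler restored? $restored{$how}\n";
}
```

If local behaves as described, both exit paths should report the old handler as restored, which would make the explicit save and re-install in runIt redundant.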
From: Mark on 14 Feb 2007 19:54

On Feb 13, 11:22 am, "A" <ad_...(a)yahoo.com> wrote:
> I am getting intermittent unexpected result from waitpid on Solaris 9
>
> sub runIt
> {
> my $oldSigCld = $SIG{CLD};
> local $SIG{CLD} = \&sigChldHandler;

I think you meant sigCLDhandler here. |