From: Golden California Girls on
cerr wrote:
> On Jan 13, 7:26 pm, Golden California Girls <gldncag...(a)aol.com.mil>
> wrote:
>> Let me guess, the pointer variable hasn't been initialized when you compare with
>> NULL. Just because it is a pointer doesn't mean it gets initialized to NULL.
>> You have to do it explicitly, like any other variable.
>
> Nah, I've been through that already... they get initialized
> properly...

Then you might want to look into malloc's debug options, because you are
calling free on something that gets used later.
From: cerr on
On Jan 14, 6:02 pm, Golden California Girls <gldncag...(a)aol.com.mil>
wrote:
> cerr wrote:
> > On Jan 13, 7:26 pm, Golden California Girls <gldncag...(a)aol.com.mil>
> > wrote:
> >> Let me guess, the pointer variable hasn't been initialized when you compare with
> >> NULL.  Just because it is a pointer doesn't mean it gets initialized to NULL.
> >> You have to do it explicitly, like any other variable.
>
> > Nah, I've been through that already... they get initialized
> > properly...
>
> Then you might want to look into malloc's debug options, because you are
> calling free on something that gets used later.

Exactly, that's where one of the problems was:
free(var);
without
var=NULL;
which leads to a stale pointer that still has a "seemingly" valid
value but is wrong!
Now I'm off to hunting the next seg fault... :o - and, unfortunately,
I don't really know where it's happening, because gdb only
reports the SIGKILL, but with strace I see that a SIGSEGV is leading to
SIGKILLing the other threads... no idea why I can't see the SIGSEGV
in gdb... :(
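
A minimal sketch of the free-then-clear pattern described above. The macro FREE_AND_NULL and the function free_and_null_demo are hypothetical names made up for this example, not something from the application discussed in the thread. Note that this only clears the one variable passed in; any other copies of the pointer are still stale.

```c
#include <stdlib.h>

/* Hypothetical helper macro: free the block and clear the pointer
 * variable in one step, so later NULL checks see it as released. */
#define FREE_AND_NULL(p) \
    do {                 \
        free(p);         \
        (p) = NULL;      \
    } while (0)

/* Hypothetical demo: returns 1 if the pointer ends up cleared. */
static int free_and_null_demo(void)
{
    char *var = malloc(16);
    if (var == NULL)
        return 0;

    FREE_AND_NULL(var);
    /* free(NULL) is defined to do nothing, so an accidental second
     * release through the same variable is harmless. */
    FREE_AND_NULL(var);

    return var == NULL;
}
```

With plain free(var) the variable keeps its old address, so a later "if (var != NULL)" check passes even though the memory is gone.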
From: David Schwartz on
On Jan 15, 8:49 am, cerr <ron.egg...(a)gmail.com> wrote:

> Exactly, that's where one of the problems was:
> free(var);
> without
> var=NULL;
> which leads to a stale pointer that still has a "seemingly" valid
> value but is wrong!
> Now I'm off to hunting the next seg fault... :o - and, unfortunately,
> I don't really know where it's happening, because gdb only
> reports the SIGKILL, but with strace I see that a SIGSEGV is leading to
> SIGKILLing the other threads... no idea why I can't see the SIGSEGV
> in gdb... :(

Catch the SIGSEGV and call 'abort'.

If you're using older versions of Linux that make crappy multi-
threaded core dumps, call 'fork' and then call 'abort' in the child.

Note that you must call the operating system's 'fork' directly. You
cannot call the library's 'fork', as it will try to invoke fork
handlers and acquire locks, which is not safe to do in a signal handler.
But Linux's 'fork' system call (which you can call as __libc_fork) is
safe in a signal handler. I also recommend following it with something
to ensure the parent doesn't dump core on top of the child.

DS
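
A minimal sketch of the fork-then-abort idea above, assuming Linux. The names raw_fork, on_sigsegv and install_sigsegv_handler are made up for illustration. It goes through syscall() rather than __libc_fork, since that symbol is internal to the C library and may not be callable from application code on current glibc. The forked child is a single-threaded copy of the crashing process, so its core dump does not depend on how well the kernel dumps multi-threaded processes. The parent waits for the child and then exits without dumping core itself, so it cannot write over the child's dump.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Raw fork that bypasses the C library's fork handlers.
 * Assumption: Linux. SYS_fork is missing on some architectures, so
 * fall back to clone with only SIGCHLD set, which behaves like fork. */
static pid_t raw_fork(void)
{
#ifdef SYS_fork
    return (pid_t)syscall(SYS_fork);
#else
    return (pid_t)syscall(SYS_clone, SIGCHLD, 0, 0, 0, 0);
#endif
}

/* SIGSEGV handler: dump core from a forked child, not from here. */
static void on_sigsegv(int signo)
{
    pid_t child;

    (void)signo;
    child = raw_fork();
    if (child == 0)
        abort();                  /* child: SIGABRT produces the core */
    if (child < 0)
        abort();                  /* fork failed: dump from here instead */

    waitpid(child, NULL, 0);      /* let the child finish its dump */
    _exit(EXIT_FAILURE);          /* parent ends without a second dump */
}

/* Returns 0 on success, -1 on error (as sigaction does). */
static int install_sigsegv_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = on_sigsegv;
    return sigaction(SIGSEGV, &sa, NULL);
}
```

Calling install_sigsegv_handler() once, early in main, would be enough, since signal dispositions are shared by all threads. Loading the child's core file into gdb should then show where the faulting thread was.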
From: cerr on
On Jan 15, 9:12 am, David Schwartz <dav...(a)webmaster.com> wrote:
> On Jan 15, 8:49 am, cerr <ron.egg...(a)gmail.com> wrote:
>
> > Exactly, that's where one of the problems was:
> > free(var);
> > without
> > var=NULL;
> > which leads to a stale pointer that still has a "seemingly" valid
> > value but is wrong!
> > Now I'm off to hunting the next seg fault... :o - and, unfortunately,
> > I don't really know where it's happening, because gdb only
> > reports the SIGKILL, but with strace I see that a SIGSEGV is leading to
> > SIGKILLing the other threads... no idea why I can't see the SIGSEGV
> > in gdb... :(
>
> Catch the SIGSEGV and call 'abort'.
Huh, I don't understand. Where would I call abort()?

> If you're using older versions of Linux that make crappy multi-
> threaded core dumps, call 'fork' and then call 'abort' in the child.
Using 2.6.15.1 - how does it make sense to call abort in the child? I
don't understand.

> Note that you must call the operating system's 'fork' directly. You
> cannot call the library's 'fork', as it will try to invoke fork
> handlers and acquire locks, which is not safe to do in a signal handler.
> But Linux's 'fork' system call (which you can call as __libc_fork) is
> safe in a signal handler. I also recommend following it with something
> to ensure the parent doesn't dump core on top of the child.

The way the app is currently forking is:

/* already a daemon */
if (getppid() == 1)
    return;

/* lit it up */
i = fork();
if (i < 0)
    exit(1);
if (i > 0)
    exit(0);

setsid();

I'm not quite sure how changing this would help me in finding where
the SIGSEGV is happening... :o
From: Scott Lurndal on
cerr <ron.eggler(a)gmail.com> writes:
>On Jan 15, 9:12 am, David Schwartz <dav...(a)webmaster.com> wrote:
>> On Jan 15, 8:49 am, cerr <ron.egg...(a)gmail.com> wrote:
>>
>> > Exactly, that's where one of the problems was:
>> > free(var);
>> > without
>> > var=NULL;
>> > which leads to a stale pointer that still has a "seemingly" valid
>> > value but is wrong!
>> > Now I'm off to hunting the next seg fault... :o - and, unfortunately,
>> > I don't really know where it's happening, because gdb only
>> > reports the SIGKILL, but with strace I see that a SIGSEGV is leading to
>> > SIGKILLing the other threads... no idea why I can't see the SIGSEGV
>> > in gdb... :(
>>
>> Catch the SIGSEGV and call 'abort'.
>Huh, I don't understand. Where would I call abort()?

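#include <assert.h>
#include <signal.h>
#include <stdlib.h>

/* Three-argument handler form; needs SA_SIGINFO in sa_flags. */
void
catch_sigsegv(int signo, siginfo_t *sip, void *ucp)
{
    abort();    /* raises SIGABRT, which dumps core by default */
}



main(...)

...

    struct sigaction sa;
    stack_t ss;
    int diag;

    /* Run the handler on its own stack, so it can still run if the
       fault came from overflowing the normal stack. */
    ss.ss_sp = malloc(SIGSTKSZ);
    assert(ss.ss_sp != NULL);
    ss.ss_size = SIGSTKSZ;
    ss.ss_flags = 0;
    diag = sigaltstack(&ss, NULL);
    if (diag == -1) {
        // handle error
    }

    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_ONSTACK | SA_SIGINFO;
    sa.sa_sigaction = catch_sigsegv;
    diag = sigaction(SIGSEGV, &sa, NULL);
    if (diag == -1) {
        // handle error
    }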


scott