From: cerr on
On Jan 15, 7:19 pm, cerr <ron.egg...(a)gmail.com> wrote:
> On Jan 15, 5:18 pm, la...(a)ludens.elte.hu (Ersek, Laszlo) wrote:
>
> > In article <x954n.11461$0X.9...(a)news.usenetserver.com>, sc...(a)slp53.sl.home (Scott Lurndal) writes:
> > >   sa.sa_flags = SA_ONSTACK;
> > >   sa.sigaction = catch_sigsegv;
>
> > I'd risk
>
> >   sa.sa_flags = SA_ONSTACK | SA_SIGINFO;
> >   sa.sa_sigaction = catch_sigsegv;
>
> okay, I'll try this asap - maybe even tonight - depending on how my
> evening develops ;)
> However, why are we not doing a
> signal(SIGSEGV, catch_sigsegv);
> to catch the signal? Is that cause I'll be missing the stack which I
> guess I'll get to keep with this:
>  ss.ss_sp = malloc(SIGSTKSZ);
>   assert(ss.ss_sp != NULL);
>   ss.ss_size = SIGSTKSZ;
>   ss.ss_flags = 0;
>   diag = sigaltstack(&ss, NULL);
> ???
> --
> roN

With the best of intentions, I cannot find where SIG_EIP or SIG_RIP would
be defined... :o I grepped the /usr/include directory and didn't get
any results, and I also googled and didn't really find anything... :(
From: cerr on
On Jan 15, 9:23 pm, cerr <ron.egg...(a)gmail.com> wrote:
> On Jan 15, 7:19 pm, cerr <ron.egg...(a)gmail.com> wrote:
>
> > On Jan 15, 5:18 pm, la...(a)ludens.elte.hu (Ersek, Laszlo) wrote:
>
> > > In article <x954n.11461$0X.9...(a)news.usenetserver.com>, sc...(a)slp53.sl.home (Scott Lurndal) writes:
> > > >   sa.sa_flags = SA_ONSTACK;
> > > >   sa.sigaction = catch_sigsegv;
>
> > > I'd risk
>
> > >   sa.sa_flags = SA_ONSTACK | SA_SIGINFO;
> > >   sa.sa_sigaction = catch_sigsegv;
>
> > okay, I'll try this asap - maybe even tonight - depending on how my
> > evening develops ;)
> > However, why are we not doing a
> > signal(SIGSEGV, catch_sigsegv);
> > to catch the signal? Is that cause I'll be missing the stack which I
> > guess I'll get to keep with this:
> >  ss.ss_sp = malloc(SIGSTKSZ);
> >   assert(ss.ss_sp != NULL);
> >   ss.ss_size = SIGSTKSZ;
> >   ss.ss_flags = 0;
> >   diag = sigaltstack(&ss, NULL);
> > ???
> > --
> > roN
>
> With the best of intentions, I cannot find where SIG_EIP or SIG_RIP would
> be defined... :o I grepped the /usr/include directory and didn't get
> any results, and I also googled and didn't really find anything... :(

Ah Okay,

I got this defined to 14...
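
For reference, here is a minimal sketch of the setup being discussed,
assuming Linux with glibc on x86. As far as I can tell, REG_EIP (i386) and
REG_RIP (x86-64) come from <sys/ucontext.h> and are only visible when
_GNU_SOURCE is defined before the first include. It also hints at why
sigaction() is used rather than signal(): signal() has no way to request
SA_ONSTACK or SA_SIGINFO, so the handler would get neither the alternate
stack nor the ucontext. The names fault_pc, install_segv_handler and
segv_exit are made up for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* assumption: glibc; exposes REG_EIP / REG_RIP */
#endif
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <ucontext.h>
#include <unistd.h>

/* Hypothetical helper: read the faulting program counter from the context.
 * Returns 0 when the register index is not visible on this platform. */
unsigned long fault_pc(const ucontext_t *uc)
{
#if defined(__x86_64__) && defined(REG_RIP)
    return (unsigned long)uc->uc_mcontext.gregs[REG_RIP];
#elif defined(__i386__) && defined(REG_EIP)
    return (unsigned long)uc->uc_mcontext.gregs[REG_EIP];
#else
    (void)uc;
    return 0;
#endif
}

/* Placeholder handler: a real one would report fault_pc(ucp) using only
 * async-signal-safe calls before leaving. */
void segv_exit(int signo, siginfo_t *sip, void *ucp)
{
    (void)signo;
    (void)sip;
    (void)ucp;
    _exit(1);
}

/* Hypothetical helper: run `handler` for SIGSEGV on an alternate stack.
 * Returns 0 on success, -1 on failure. */
int install_segv_handler(void (*handler)(int, siginfo_t *, void *))
{
    stack_t ss;
    struct sigaction sa;

    ss.ss_sp = malloc(SIGSTKSZ);
    if (ss.ss_sp == NULL)
        return -1;
    ss.ss_size = SIGSTKSZ;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) != 0) {
        free(ss.ss_sp);
        return -1;
    }

    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    /* SA_ONSTACK: use the stack registered above.
     * SA_SIGINFO: call the three-argument handler via sa_sigaction. */
    sa.sa_flags = SA_ONSTACK | SA_SIGINFO;
    sa.sa_sigaction = handler;
    return sigaction(SIGSEGV, &sa, NULL);
}
```

Calling install_segv_handler(segv_exit) once at startup would be the
intended use. The alternate stack matters mostly when the fault is a stack
overflow, since the handler then cannot run on the exhausted stack.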

Anyway, I got the app compiled and loaded onto the target, and this is
what's happening now: on SIGSEGV, strace goes crazy, continuously
printing "[pid 522] --- SIGSEGV (Segmentation fault) @ 0 (0) ---" without
stopping, and my app stops working properly because, I assume, one of
the main threads segfaulted. In syslog I don't see anything, even though
my handler function looks like this:

void catch_sigsegv(int signo, siginfo_t *sip, void *ucp)
{
  ucontext_t *uconp = (ucontext_t *)ucp;
  syslog(LOG_ERR, "failure at 0x%lx\n", uconp->uc_mcontext.gregs[REG_EIP]);
}

Any clues? :(
From: David Schwartz on
On Jan 15, 9:42 pm, cerr <ron.egg...(a)gmail.com> wrote:

> void catch_sigsegv(int signo, siginfo_t *sip, void *ucp)
> {
>   ucontext_t *uconp = (ucontext_t *)ucp;
>   syslog(LOG_ERR, "failure at 0x%lx\n", uconp->uc_mcontext.gregs[REG_EIP]);
> }
>
> Any clues? :(

You can only call async-signal-safe functions in a signal handler. I
doubt 'syslog' is one of them.

DS
From: David Schwartz on
On Jan 15, 11:47 am, cerr <ron.egg...(a)gmail.com> wrote:

> > Catch the SIGSEGV and call 'abort'.

> Huh, I don't understand. Where would I call abort()?

In the signal handler.

> > If you're using older versions of Linux that make crappy multi-
> > threaded core dumps, call 'fork' and then call 'abort' in the child.

> Using 2.6.15.1 - how does it make sense to call abort in the child? I
> don't understand.

Some versions of Linux/glibc make crappy multi-threaded core dumps. If
you catch the SIGSEGV and call the low-level 'fork', the child will be
a single-threaded process in which only the faulting thread exists. If
you call 'abort' in that process, you'll have a non-multi-threaded
core dump, which you may have an easier time analyzing.
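
To make that concrete, here is a hedged sketch of such a handler,
assuming Linux. It swaps in a raw syscall(SYS_fork) for the low-level
fork, since that is the variant I can write portably; raw_fork and
dump_core_in_child are made-up names. Unlike a library entry point, a raw
system call skips any bookkeeping the C library does in the child, so
abort() there may behave differently on some glibc versions; I have not
verified that on older systems.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* for syscall() */
#endif
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork without running the library's fork handlers.
 * Assumption: architectures without SYS_fork fall back to fork(). */
pid_t raw_fork(void)
{
#ifdef SYS_fork
    return (pid_t)syscall(SYS_fork);
#else
    return fork();
#endif
}

/* Hypothetical SIGSEGV handler: dump core from a single-threaded child. */
void dump_core_in_child(int signo, siginfo_t *sip, void *ucp)
{
    pid_t pid;
    int status;

    (void)signo;
    (void)sip;
    (void)ucp;

    pid = raw_fork();
    if (pid == 0) {
        /* Child: only the faulting thread exists in this process. */
        signal(SIGABRT, SIG_DFL);
        abort();
    }
    if (pid > 0) {
        /* Parent: let the child finish writing its core file first. */
        while (waitpid(pid, &status, 0) < 0 && errno == EINTR)
            ;
    }
    /* Exit without dumping core, so the child's dump is not overwritten. */
    _exit(1);
}
```

The handler would be installed with sigaction() and SA_SIGINFO, the same
way as catch_sigsegv.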

> > Note that you must call the operating system's 'fork' directly. You
> > cannot call the library's 'fork' as it will try to invoke fork
> > handlers and acquire locks, which is not safe to do in a signal handler.
> > But Linux's 'fork' system call (which you can call as __libc_fork) is
> > safe in a signal handler. I also recommend following it with something
> > to ensure the parent doesn't dump core on top of the child.
>
> The way the app is currently forking is:
>
>     /*already a daemon */
>     if (getppid() == 1)
>         return;
>
>     /*lit it up */
>     i = fork();
>     if (i < 0)
>         exit(1);
>     if (i > 0)
>         exit(0);
>
>     setsid();
>
> I'm not quite sure how changing this would help me in finding where
> the SIGSEGV is happening... :o

It won't. I'm talking about what you do when you catch a SIGSEGV, not
what you do already in unrelated code.

DS