From: Earl Chew
I'm looking for some advice to focus my investigation.

I'm using a 2.6.31 kernel on PowerPC with glibc 2.7.

I've been looking into some anomalous behaviour with a program
that uses clone(2).

I've narrowed down the problem to interaction between
the program and the following null clone:

int nullClone(void *arg)
{
    return 0;
}

....

pid_t childPid = clone(nullClone, stackPointer,
                       CLONE_VM | SIGCHLD,
                       0, 0, 0, 0);

waitpid(childPid, &childStatus, 0);


As you can see, the null clone is essentially a nop.
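
For anyone who wants to try this, here is a minimal self-contained sketch of the null clone above. The stack allocation, the STACK_SIZE value and the runNullClone() wrapper are assumptions for illustration and are not taken from the original program, which sets up stackPointer elsewhere.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical stack size; the original program's value is not known. */
#define STACK_SIZE (64 * 1024)

/* The child does nothing and exits with status 0. */
static int nullClone(void *arg)
{
    (void)arg;
    return 0;
}

/* Runs one null clone and returns the child's exit status, or -1 on error. */
int runNullClone(void)
{
    int childStatus = 0;
    char *stack;
    pid_t childPid;
    pid_t reaped;

    /* Keep the top of the stack 16-byte aligned. */
    assert(STACK_SIZE % 16 == 0);

    stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The stack grows downward, so pass the address just past the block. */
    childPid = clone(nullClone, stack + STACK_SIZE,
                     CLONE_VM | SIGCHLD, NULL);
    if (childPid == -1) {
        free(stack);
        return -1;
    }

    /* SIGCHLD is the termination signal, so a plain waitpid() reaps it. */
    reaped = waitpid(childPid, &childStatus, 0);
    free(stack);

    if (reaped != childPid || !WIFEXITED(childStatus))
        return -1;
    return WEXITSTATUS(childStatus);
}
```

Calling runNullClone() from the parent at the point marked "Null clone here" should be enough to exercise the same path.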


Commenting out CLONE_VM, i.e. /* CLONE_VM | */, and leaving only SIGCHLD
(effectively a null fork(2)) makes the following problem go away.


The problem I see is that subsequent to the clone(2):

pthread_mutex_lock(parentMutex);
...
pthread_mutex_unlock(parentMutex);

/* Null clone here */

pthread_mutex_lock(parentMutex);
...
pthread_mutex_unlock(parentMutex); <---- Gets stuck here.


The mutex in question is created with PTHREAD_PRIO_INHERIT.

There are a few more details regarding null threads which
I won't get into just yet. I need to try to distill the
problem into a smaller program.


I'm suspicious because I believe the null clone should not
have any effect on the caller -- but obviously does, and in
a way I don't understand.


Do you have suggestions as to where I should look next to explain
this anomalous behaviour?

What effects might the null clone have on the mutex implementation
that I am not accounting for?


Earl