From: rounak on
I need a little help regarding lock mechanisms in Linux.
I want to participate in Linux Kernel development.

I am facing an issue.
Could you please help me or suggest an approach?

Problem Scenario:

The same file is shared by multiple processes.
Any process can create the file (multiple-owner case).
Whenever a process comes up, it first tries to create the file with
the exclusive flag (O_CREAT | O_EXCL);
if the file already exists, it opens the file in read/write mode.
When the file is created for the first time, a header is initialized.
The header has magic number and check-sum fields to validate the
contents of the header.
If the magic number and check-sum fields are corrupted, the process
reinitializes the header
and dumps core so that the root cause can be found.
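
To make the scenario concrete, here is a minimal C sketch of the
create-or-open step and the header check described above. The names
open_or_create(), write_header(), header_valid(), HDR_MAGIC and the
trivial checksum are hypothetical and only for illustration; the real
header layout and checksum are not shown in this post.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define HDR_MAGIC 0x58595A31u        /* hypothetical magic value */

struct header {                      /* hypothetical layout */
    uint32_t magic;
    uint32_t checksum;
};

/* Placeholder checksum, for illustration only. */
static uint32_t hdr_checksum(uint32_t magic) { return ~magic; }

/* Try an exclusive create first; fall back to a plain read/write open.
 * Sets *created to 1 if this call created the file, 0 if it already
 * existed. Returns a file descriptor, or -1 on error. */
int open_or_create(const char *path, int *created)
{
    assert(path != NULL && created != NULL);

    /* With O_CREAT | O_EXCL exactly one caller can create the file. */
    int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0644);
    if (fd >= 0) {
        *created = 1;
        return fd;
    }
    if (errno != EEXIST)
        return -1;

    /* Another process created it first: open the existing file. */
    *created = 0;
    return open(path, O_RDWR);
}

/* Write the header at offset 0. Returns 0 on success, -1 on failure. */
int write_header(int fd)
{
    struct header h = { HDR_MAGIC, hdr_checksum(HDR_MAGIC) };
    return pwrite(fd, &h, sizeof h, 0) == (ssize_t)sizeof h ? 0 : -1;
}

/* Returns 1 if a complete header with matching magic and checksum is
 * present, 0 otherwise (including a short read on a still-empty file). */
int header_valid(int fd)
{
    struct header h;
    if (pread(fd, &h, sizeof h, 0) != (ssize_t)sizeof h)
        return 0;
    return h.magic == HDR_MAGIC && h.checksum == hdr_checksum(h.magic);
}
```

Between a successful open_or_create() in one process and its
write_header() call, header_valid() in any other process returns 0.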

Suppose process P1 comes up and creates the file "xyz", but a context
switch happens
before it can finish the header initialization.
Process P2 comes up and tries to create the file "xyz" with the exclusive
flag, which fails with EEXIST.
P2 then opens the file "xyz" in read/write mode and reads the magic
number and check-sum fields to
validate the header. Since P1 hasn't finished the header
initialization, these fields will contain garbage values.

How can P2 conclude that header initialization is in progress and that
this is not memory corruption?

Solutions:

1) First Solution: I can add an INITIALIZATION_IN_PROGRESS field to the
header.
But what if P1 gets context-switched just after creating the file
"xyz"?
P1 will not have written anything to the file yet,
so this approach does not solve that corner case.

2) Second Solution: I can use semaphores/mutexes to synchronize
between the two processes.
However, if a process dies while holding a lock, the kernel will not
release the lock.
To overcome this I could use a timer-based approach, but I *think* it's
not a clean approach.

3) I can use flock() to lock a dummy file before creating/opening the
file "xyz".
When P1 finishes the header initialization, it will release the
lock and delete the dummy file.
This approach will solve the problem above.

P1 will create/open the dummy file "abc" (without the exclusive
flag) and use flock() to acquire the lock on file "abc".
It then carries on with the normal behavior (creating/opening the
file "xyz" and initializing the header).
P1 will release the lock on file "abc" when it has finished
initializing the header.

Suppose P2 gets a chance to run before P1 can finish the header
initialization. P2 will first open/create the file
"abc" (without the exclusive flag) and then try to acquire the lock
on file "abc" with a blocking flock() call.
Since P1 has already acquired the lock, P2 will wait.

If P1 dies/restarts during header initialization, the lock on file
"abc" will be released by the kernel and P2 will acquire the
lock on file "abc". P2 will then open the file "xyz" (since it was
already created by P1) and check the magic number and
check-sum fields. Since P1 never finished the header
initialization, these fields will contain garbage values.
P2 will still not be able to identify whether this is memory
corruption or another process that hasn't finished header initialization.
But this is a rare scenario.

4) I can use a pulse-based approach, but in that case there is the
overhead of creating/maintaining
one more thread that will wait for the pulse.
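
For reference, the locking part of solution 3 can be sketched roughly
as below. This is only a sketch under assumptions: the helper names
acquire_init_lock() and release_init_lock() are hypothetical, and
unlike the description above it leaves the dummy file in place, because
unlinking it while another process is blocked on the old file looks
racy to me.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/file.h>
#include <unistd.h>

/* Open (or create, without O_EXCL) the dummy lock file and take an
 * exclusive flock() on it, blocking until the lock is available.
 * Returns the lock file descriptor, or -1 on error. */
int acquire_init_lock(const char *lock_path)
{
    assert(lock_path != NULL);

    int fd = open(lock_path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    /* Blocks while another process holds the lock. */
    if (flock(fd, LOCK_EX) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Drop the lock and close the descriptor. Returns flock()'s result.
 * If the holder dies instead, the kernel closes its descriptors and
 * the lock goes away with them. */
int release_init_lock(int lock_fd)
{
    int rc = flock(lock_fd, LOCK_UN);
    close(lock_fd);
    return rc;
}
```

A process would call acquire_init_lock("abc"), create/open "xyz",
initialize or validate the header, and then call release_init_lock().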

Which of the above solutions is best in terms of performance?
Could you please point me to other alternatives, if there are any?

Thanks in advance,
Rounak Kakkar.