From: Eric Sosman on
On 4/14/2010 5:42 PM, Rachit Agrawal wrote:
> On Apr 15, 12:05 am, Chris Friesen<cbf...(a)mail.usask.ca> wrote:
>> On 04/14/2010 05:07 AM, Rachit Agrawal wrote:
>>
>>> Hi,
>>
>>> I have a scenario where multiple processes want to write to a single
>>> file. I was wondering what would be the best method to achieve this?
>>
>>> 1. Message Queues.
>>> 2. Memory Mapped Files
>>> 3. Or any other method.
>>
>>> Speed is the main concern. Can someone please throw some light on
>>> this?
>>
>> The obvious solution is just having them all open the file and write to
>> it using the normal file operations. You will want to use file locking
>> to ensure that you don't get multiple processes trampling each other's data.
>>
>> Do you have any indications that performance is a problem with the
>> obvious solution?
>>
>> Chris
>
> @Ersek: Sorry I couldn't understand your point much.
>
> @Chris, @Eric: I have read that file locking is generally slow. So,
> instead of implementing all the possible ways, I wanted to know if
> there is a known faster method than traditional file locking.
> Without implementing them all, I don't know how to compare the
> performance.
>
> P.S.: The amount of data I will be writing will be huge, around
> 4-5 GB.

Your first message told us

- there's one file
- it's written by multiple processes

In this follow-up you've told us a little more

- around 4-5GB will be written
- you've read something about file locking

Do you seriously think that these crumbs of information amount
to enough data to support a recommendation? Or even a
semi-intelligent analysis? If you keep us all in ignorance,
you'll get ignorant answers.

Please describe what you're trying to do.

--
Eric Sosman
esosman(a)ieee-dot-org.invalid
From: Chris Friesen on
On 04/14/2010 03:42 PM, Rachit Agrawal wrote:

> @Chris, @Eric: I have read that file locking is generally slow. So,
> instead of implementing all the possible ways, I wanted to know if
> there is a known faster method than traditional file locking.
> Without implementing them all, I don't know how to compare the
> performance.
>
> P.S.: The amount of data I will be writing will be huge, around
> 4-5 GB.

It's a good design principle to implement it simply first and only
optimize if you need to. Do you have any indication that the simple
solution would not be satisfactory?
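
For instance, a minimal sketch of the simple approach (untested,
error handling mostly omitted) could look like this:

#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Append one record to the shared file, serialized with an
 * advisory fcntl() lock.  The fd should be opened with
 * O_WRONLY | O_CREAT | O_APPEND by every cooperating process. */
int append_record(int fd, const void *buf, size_t len)
{
    struct flock lk;

    memset(&lk, 0, sizeof lk);
    lk.l_type = F_WRLCK;        /* exclusive write lock */
    lk.l_whence = SEEK_END;     /* lock the region we append to */
    lk.l_start = 0;
    lk.l_len = 0;               /* 0 means "to end of file" */

    if (fcntl(fd, F_SETLKW, &lk) == -1)   /* block until acquired */
        return -1;

    ssize_t n = write(fd, buf, len);

    lk.l_type = F_UNLCK;
    fcntl(fd, F_SETLK, &lk);
    return (n == (ssize_t)len) ? 0 : -1;
}

Note that fcntl() locks are advisory: they only protect you if
every process follows the same protocol.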

If you're writing massive amounts of data then the disk itself is likely
going to be a bottleneck. It's very possible that anything you do in
the apps is totally overshadowed by limitations of disk speed.

Are the accesses random or sequential? You'll want to advise the OS
appropriately.
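
On Linux, for example, that can be a one-line hint; the kernel is
free to ignore it, so treat it purely as an optimization:

#define _XOPEN_SOURCE 600
#include <fcntl.h>

/* Tell the kernel the file will be accessed sequentially so it
 * can tune readahead/writeback for this descriptor. */
static void advise_sequential(int fd)
{
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
}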

Is this general-purpose software or do you have additional knowledge
about the underlying system? (Do you have the option to select a
filesystem or has that been chosen already? Has the hardware been chosen?)

Chris
From: Rachit Agrawal on
On Apr 15, 3:28 am, Chris Friesen <cbf...(a)mail.usask.ca> wrote:
> On 04/14/2010 03:42 PM, Rachit Agrawal wrote:
>
> > @Chris, @Eric: I have read that file locking is generally slow. So,
> > instead of implementing all the possible ways, I wanted to know if
> > there is a known faster method than traditional file locking.
> > Without implementing them all, I don't know how to compare the
> > performance.
>
> > P.S.: The amount of data I will be writing will be huge, around
> > 4-5 GB.
>
> It's a good design principle to implement it simply first and only
> optimize if you need to.  Do you have any indication that the simple
> solution would not be satisfactory?
>
> If you're writing massive amounts of data then the disk itself is likely
> going to be a bottleneck.  It's very possible that anything you do in
> the apps is totally overshadowed by limitations of disk speed.
>
> Are the accesses random or sequential?  You'll want to advise the OS
> appropriately.
>
> Is this general-purpose software or do you have additional knowledge
> about the underlying system?  (Do you have the option to select a
> filesystem or has that been chosen already?  Has the hardware been chosen?)
>
> Chris

Here is exactly what I need:

Multiple processes will be writing to the file; there will be no
reading. All of them will be appending to the end of the file, and
the writes can happen at random times: each program is a C program
that does some computation and, based on that, writes something to
the file. I was also trying memory-mapped files and message queues.
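
Roughly, each writer looks like this (simplified; the file name and
payload are just placeholders):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Every process opens the shared file in append mode; the
     * kernel then seeks to end-of-file atomically on each write(). */
    int fd = open("results.dat", O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd == -1)
        return 1;

    for (int i = 0; i < 1000; i++) {
        char record[32];
        int len = snprintf(record, sizeof record, "%d\n", i);  /* dummy payload */
        write(fd, record, len);
    }

    close(fd);
    return 0;
}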

Message queues seemed a comparatively better solution, because it
seems synchronization is taken care of automatically (though I am
not very sure of this), and hence I feel it will be faster. But
again, I am not sure.

I have already selected the file system, it is ext3, and the OS is
CentOS. The physical disk won't be a problem, as I am working on a
server with a 750 GB hard disk and 8 GB of RAM.

Thanks
From: David Schwartz on
On Apr 14, 9:21 pm, Rachit Agrawal <rachitsw...(a)gmail.com> wrote:

> Multiple processes will be writing to the file; there will be no
> reading. All of them will be appending to the end of the file, and
> the writes can happen at random times: each program is a C program
> that does some computation and, based on that, writes something to
> the file. I was also trying memory-mapped files and message queues.

How large will each block of data be that must not be interleaved
with part of another block? And how often will the blocks be
written? Will some process be writing most of the time? Or will no
process be writing most of the time?

> Message queues seemed a comparatively better solution, because it
> seems synchronization is taken care of automatically (though I am
> not very sure of this), and hence I feel it will be faster. But
> again, I am not sure.

I can't see how a message queue would help you. If you need a giant
file with all the data in it, how will a message queue help you? And
if you don't, why would you even be considering making one?

DS
From: Rachit Agrawal on
On Apr 15, 10:09 am, David Schwartz <dav...(a)webmaster.com> wrote:
> On Apr 14, 9:21 pm, Rachit Agrawal <rachitsw...(a)gmail.com> wrote:
>
> > Multiple processes will be writing to the file; there will be no
> > reading. All of them will be appending to the end of the file, and
> > the writes can happen at random times: each program is a C program
> > that does some computation and, based on that, writes something to
> > the file. I was also trying memory-mapped files and message queues.
>
> How large will each block of data that cannot be interrupted by part
> of another block be? And how often will they be written? Will some
> process be writing most of the time? Or will no processes be writing
> most of the time?

Each record would be 1 long, 1 short, and 1 char, so around 7 bytes
in total. They will be written very often, say one write every 100
instructions. No process would be writing most of the time.
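
For concreteness, the record is something like this (field names
made up; the 7 bytes assumes a 32-bit long and a packed layout):

#include <stdio.h>

/* 4-byte long + 2-byte short + 1 char = 7 bytes when packed on a
 * 32-bit build.  Without the packed attribute (gcc-specific) the
 * compiler pads the struct, so sizeof would typically be 8 or more. */
struct record {
    long  value;
    short tag;
    char  flag;
} __attribute__((packed));

int main(void)
{
    printf("record size: %zu bytes\n", sizeof(struct record));
    return 0;
}
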
>
> > Message queues seemed a comparatively better solution, because it
> > seems synchronization is taken care of automatically (though I am
> > not very sure of this), and hence I feel it will be faster. But
> > again, I am not sure.
>
> I can't see how a message queue would help you. If you need a giant
> file with all the data in it, how will a message queue help you? And
> if you don't, why would you even be considering making one?
>
This is the kind of model I was thinking about: P0, P1, and P2
write to the message queue, and Pn reads from the message queue and
dumps the data into the file.
P0 ---\
P1 ----- MQ --- Pn ---> file
P2 ---/
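
Concretely, I was imagining something like this with POSIX message
queues (queue and file names are made up; compile with -lrt on
Linux):

#include <fcntl.h>
#include <mqueue.h>
#include <unistd.h>

#define QNAME "/collector"   /* placeholder queue name */

/* Producer side (P0, P1, P2): push one record into the queue.
 * Assumes the collector has already created the queue. */
void produce(const char *record, size_t len)
{
    mqd_t q = mq_open(QNAME, O_WRONLY);
    mq_send(q, record, len, 0);   /* kernel serializes concurrent senders */
    mq_close(q);
}

/* Collector side (Pn): drain the queue and append to the file.
 * Being the only writer, it needs no file locking at all. */
void collect(void)
{
    struct mq_attr attr = { .mq_maxmsg = 10, .mq_msgsize = 64 };
    mqd_t q = mq_open(QNAME, O_RDONLY | O_CREAT, 0644, &attr);
    int fd = open("results.dat", O_WRONLY | O_CREAT | O_APPEND, 0644);
    char buf[64];                 /* must be >= mq_msgsize */

    for (;;) {
        ssize_t n = mq_receive(q, buf, sizeof buf, NULL);
        if (n > 0)
            write(fd, buf, n);
    }
}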

I was trying it out because I read somewhere that it would be a
good way to accomplish my task. Also, since I read that normal file
locking is slow, I was giving message queues a try.

Thanks