From: Moi on
On Thu, 03 Sep 2009 23:53:35 -0700, guidoreina wrote:

> Morning,
>
> Yes, it is a bit confusing to have Windows at the beginning of the file,
> it would have been better to have it in another file, as they don't
> share anything. The POST++ catches SIGSEGV to detect when it is trying
> to access a page which is not loaded. When this program crashes, POST++
> tries to start the debugger... but the program is launched by another
> program, so we never get a core dump, nor can we start the debugger.
>
> POST++ is "Persistent Object Storage for C++". "My" application uses
> POST++ to open and use a database of objects. All the applications which
> need to use the database, map it to the same address... weird. Why don't
> they just let the application map the file wherever mmap thinks is
> best? I don't know.
>
> I have done some other experiments. The application has a configuration
> file where you can define the base address for mmap and the size. I have
> reduced the size but it still doesn't work, because POST++ then uses
> the size of the file. I have just run it with gdb. It opens the
> file as O_RDWR and calls mmap with prot = PROT_READ | PROT_WRITE and
> flags = MAP_PRIVATE | MAP_FIXED. mmap returns the base address =
> 0x62500000. I will try later (now they want me to do something else) to
> change it so it doesn't use a fixed address. I want to see whether it
> works without a fixed address... I don't know whether the application will
> work... if it does, I will make the change temporarily to get rid of the
> more than 250 MB of memory leaked per day ;).

I have been reading some more in POST++'s source.
It appears POST++ needs the signals. It uses them to create a copy of a
"page" (for rollback purposes) once it is accessed.
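
A minimal sketch of that mechanism, not POST++'s actual code: the pool is
mapped read-only, the first write to a page raises SIGSEGV, and the handler
saves a pre-image of the page before making it writable. All names (pool,
shadow, pool_init, demo_write_then_rollback) are hypothetical, and an
anonymous mapping stands in for the database file.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical names for illustration; none of these come from POST++. */
static char *pool;      /* working pages, initially read-only */
static char *shadow;    /* pre-images kept for rollback */
static size_t pool_size;
static long page_size;
static volatile sig_atomic_t faults_seen;

static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    char *addr = (char *)info->si_addr;
    (void)sig; (void)ctx;
    if (addr < pool || addr >= pool + pool_size) {
        /* Not one of our pages: restore the default action and return,
           so the faulting instruction re-runs and crashes normally. */
        signal(SIGSEGV, SIG_DFL);
        return;
    }
    size_t off = (size_t)(addr - pool) & ~((size_t)page_size - 1);
    /* Save the page as it was before the first write, then unprotect it.
       Calling memcpy/mprotect here is common practice for synchronous
       faults, but it is an assumption beyond what POSIX promises. */
    memcpy(shadow + off, pool + off, (size_t)page_size);
    mprotect(pool + off, (size_t)page_size, PROT_READ | PROT_WRITE);
    faults_seen++;
}

int pool_init(size_t pages)
{
    struct sigaction sa;
    page_size = sysconf(_SC_PAGESIZE);
    pool_size = pages * (size_t)page_size;
    pool = mmap(NULL, pool_size, PROT_READ,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    shadow = mmap(NULL, pool_size, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (pool == MAP_FAILED || shadow == MAP_FAILED)
        return -1;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}

/* Writes to the pool, rolls the page back, returns the number of faults. */
int demo_write_then_rollback(void)
{
    volatile char *p;
    if (pool_init(2) != 0)
        return -1;
    p = pool;
    p[0] = 'x';                               /* first write: one fault */
    p[1] = 'y';                               /* same page: no fault */
    memcpy(pool, shadow, (size_t)page_size);  /* rollback to pre-image */
    return (int)faults_seen;
}
```

This is also why a debugger that stops on every SIGSEGV gets in the way:
the signal is part of normal operation here, not a crash.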

The application stores objects in the mmapped memory pool. Since these
objects can contain pointers (to other objects in the pool), the addresses
of the objects need to be stable between runs.
If several applications access the same mmapped file, containing the
objects, they should all mmap the file to the *same* address. The easiest
way to guarantee this is to use a *fixed* address.
In other words: you should probably keep the fixed address.
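
To illustrate, here is a small sketch, under assumptions of my own, of why
the base address matters: a node holding an absolute pointer is written into
a file-backed mapping, the file is unmapped, and then mapped again at the
recorded base. The struct, the function name and the temporary file are
hypothetical. MAP_SHARED is used so that the data reaches the file (the
application in this thread uses MAP_PRIVATE and POST++ does its own
writing).

```c
#define _DEFAULT_SOURCE
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical object layout: an absolute pointer stored inside the pool. */
struct node {
    int value;
    struct node *next;
};

/* Returns the value reached through the stored pointer, or -1 on error. */
int remap_keeps_pointers_valid(void)
{
    char path[] = "/tmp/poolXXXXXX";
    size_t size = (size_t)sysconf(_SC_PAGESIZE);
    int fd = mkstemp(path);
    struct node *a, *b;
    void *base;
    int v;

    if (fd < 0)
        return -1;
    unlink(path);                       /* file lives until fd is closed */
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return -1;
    }

    /* "First run": let the kernel choose the address and remember it. */
    a = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (a == MAP_FAILED) {
        close(fd);
        return -1;
    }
    a[0].value = 1;
    a[0].next = &a[1];                  /* absolute address goes into the file */
    a[1].value = 2;
    a[1].next = NULL;
    base = a;
    munmap(a, size);

    /* "Second run": map at the same base. MAP_FIXED silently replaces
       anything already mapped there; it is safe here only because this
       range was unmapped a moment ago. */
    b = mmap(base, size, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_FIXED, fd, 0);
    if (b == MAP_FAILED) {
        close(fd);
        return -1;
    }
    v = b->next->value;                 /* valid only because b == base */
    munmap(b, size);
    close(fd);
    return v;
}
```

Had the second mapping landed anywhere else, b->next would point into
unmapped memory, which is what recomputing the references would have to
repair.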

In the last part of POST's .html documentation there are some hints for
debugging. Specifically the need to instruct gdb to *not* intercept
SIGSEGV. I presume valgrind does something similar to gdb, so you'll have
to find a way to instruct valgrind to ignore some of the signals, too.

That still does not solve the mystery where the EINVAL comes from, but
maybe the EINVAL is an artifact caused by the program's interaction with
valgrind. IMHO valgrind is the problem (though not the cause of the memory
leak, of course).

HTH,
AvK
From: guidoreina on
Hi Moi,

thank you very much for all the information. I didn't look much into
the POST++ code. What I saw when I changed the base address was that
it had to recompute the references, which I guess is why it always has
to mmap to the same address. You might be right: all the applications
will have to mmap the file to the same address so that POST++ does not
go crazy.

The other day I was debugging another application, which also uses
POST++, and it always crashed when I ran it under gdb. Luckily, I read
by accident that for debugging POST++ I had to use:

handle SIGSEGV nostop noprint pass

in gdb, and then the program didn't crash anymore. Thank you for
pointing out the signals. Though the program doesn't crash, there
might be something which valgrind doesn't like.

Regards,
Guido


On Sep 5, 1:25 pm, Moi <r...(a)invalid.address.org> wrote:
> On Thu, 03 Sep 2009 23:53:35 -0700, guidoreina wrote:
> > Morning,
>
> > Yes, it is a bit confusing to have Windows at the beginning of the file,
> > it would have been better to have it in another file, as they don't
> > share anything. The POST++ catches SIGSEGV to detect when it is trying
> > to access a page which is not loaded. When this program crashes, POST++
> > tries to start the debugger... but the program is launched by another
> > program, so we never get a core dump, nor can we start the debugger.
>
> > POST++ is "Persistent Object Storage for C++". "My" application uses
> > POST++ to open and use a database of objects. All the applications which
> > need to use the database, map it to the same address... weird. Why don't
> > they just let the application map the file wherever mmap thinks is
> > best? I don't know.
>
> > I have done some other experiments. The application has a configuration
> > file where you can define the base address for mmap and the size. I have
> > reduced the size but it still doesn't work, because POST++ then uses
> > the size of the file. I have just run it with gdb. It opens the
> > file as O_RDWR and calls mmap with prot = PROT_READ | PROT_WRITE and
> > flags = MAP_PRIVATE | MAP_FIXED. mmap returns the base address =
> > 0x62500000. I will try later (now they want me to do something else) to
> > change it so it doesn't use a fixed address. I want to see whether it
> > works without a fixed address... I don't know whether the application will
> > work... if it does, I will make the change temporarily to get rid of the
> > more than 250 MB of memory leaked per day ;).
>
> I have been reading some more in POST++'s source.
> It appears POST++ needs the signals. It uses them to create a copy of a
> "page" (for rollback purposes) once it is accessed.
>
> The application stores objects in the mmapped memory pool. Since these
> objects can contain pointers (to other objects in the pool), the addresses
> of the objects need to be stable between runs.
> If several applications access the same mmapped file, containing the
> objects, they should all mmap the file to the *same* address. The easiest
> way to guarantee this is to use a *fixed* address.
> In other words: you should probably keep the fixed address.
>
> In the last part of POST's .html documentation there are some hints for
> debugging. Specifically the need to instruct gdb to *not* intercept
> SIGSEGV. I presume valgrind does something similar to gdb, so you'll have
> to find a way to instruct valgrind to ignore some of the signals, too.
>
> That still does not solve the mystery where the EINVAL comes from, but
> maybe the EINVAL is an artifact caused by the program's interaction with
> valgrind. IMHO valgrind is the problem (though not the cause of the memory
> leak, of course).
>
> HTH,
> AvK

From: Scott Lurndal on
Rainer Weikusat <rweikusat(a)mssgmbh.com> writes:
>scott(a)slp53.sl.home (Scott Lurndal) writes:
>> guidoreina <guidoreina(a)gmail.com> writes:
>>>Morning,
>>
>>>POST++ is "Persistent Object Storage for C++". "My" application uses
>>>POST++ to open and use a database of objects. All the applications
>>>which need to use the database, map it to the same address... weird.
>>>Why don't they just let the application load the file where mmap
>>>thinks is the right way? I don't know.
>>
>> Because they probably use absolute pointers in the mmap region
>> rather than using offsets from the mmap base. Lazy programmers.
>
>Or people with different preferences. Designed in such a way, the
>'object database' on disk can just be mapped into the address space of
>an application and immediately yields a set of useful 'live'
>C++-objects.

You mean like Microsoft did with Office file formats? We all know
how well that worked.

Fact is, that using fixed addresses is neither portable nor future-proof;
the next release of the OS may not support mapping at an address a prior
version did.
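
The offset-based alternative can be sketched roughly as follows. This is an
illustration only, with hypothetical names (pool_off, onode, off_to_ptr,
ptr_to_off); two local arrays stand in for the same pool mapped at two
different addresses.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical offset "pointer": a byte offset from the pool base,
   with 0 reserved as the null value. */
typedef uint32_t pool_off;

struct onode {
    int value;
    pool_off next;
};

static void *off_to_ptr(void *base, pool_off off)
{
    return off ? (char *)base + off : NULL;
}

static pool_off ptr_to_off(void *base, void *p)
{
    return p ? (pool_off)((char *)p - (char *)base) : 0;
}

/* Builds a two-node list in one buffer, copies the raw bytes to another
   buffer (a different "base address"), and walks the list there. */
int offsets_survive_relocation(void)
{
    struct onode first[4], second[4];
    struct onode *head, *n;
    pool_off head_off;

    memset(first, 0, sizeof first);
    first[1].value = 1;                 /* slot 0 unused: offset 0 is null */
    first[1].next = ptr_to_off(first, &first[2]);
    first[2].value = 2;
    first[2].next = 0;
    head_off = ptr_to_off(first, &first[1]);

    memcpy(second, first, sizeof first);    /* same bytes, new location */

    head = off_to_ptr(second, head_off);
    n = off_to_ptr(second, head->next);
    return n->value;
}
```

The price is a conversion on every dereference, and the stored objects are
no longer plain C++ objects with ordinary pointers.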

Not to mention the issues noted by the OP.

scott
From: Rainer Weikusat on
scott(a)slp53.sl.home (Scott Lurndal) writes:
> Rainer Weikusat <rweikusat(a)mssgmbh.com> writes:
>>scott(a)slp53.sl.home (Scott Lurndal) writes:
>>> guidoreina <guidoreina(a)gmail.com> writes:
>>>>Morning,
>>>
>>>>POST++ is "Persistent Object Storage for C++". "My" application uses
>>>>POST++ to open and use a database of objects. All the applications
>>>>which need to use the database, map it to the same address... weird.
>>>>Why don't they just let the application load the file where mmap
>>>>thinks is the right way? I don't know.
>>>
>>> Because they probably use absolute pointers in the mmap region
>>> rather than using offsets from the mmap base. Lazy programmers.
>>
>>Or people with different preferences. Designed in such a way, the
>>'object database' on disk can just be mapped into the address space of
>>an application and immediately yields a set of useful 'live'
>>C++-objects.
>
> You mean like Microsoft did with Office file formats? We all know
> how well that worked.

I have no idea what 'Microsoft did with the Office file formats'. I
recognize this type of 'argument', however, and happen to know a
nice parody coming from some Peanuts-strip, where Linus, during a
discussion of the relative merits of Beethoven, insofar as I remember
this correctly, suddenly cries: "Beethoven never supported Hitler!".

> Fact is, that using fixed addresses is neither portable

For a suitable definition of 'portable', certainly. It is an
XSI-mandated UNIX(*)-extension and chances are that this is already
much more 'portable' than what would be needed. And a file created in
this way is obviously unsuitable for data interchange between
different systems. And it wreaks havoc on the OpenBSD 'security
concept' of 'make programming more difficult in the hope that "they"
will never figure out how to pull our trousers down in public again'.

But all of this is, first and foremost, hypothetical and may or may
not be relevant for any given situation. Files are not necessarily
used for data interchange, they may just provide 'persistent
storage'.

> nor future-proof; the next release of the OS may not support mapping
> at an address a prior version did.

In a strict sense, nothing is future-proof.

> Not to mention the issues noted by the OP.

Another valgrind-quirk? OMG ...
From: David Schwartz on
On Sep 4, 12:30 pm, Rainer Weikusat <rweiku...(a)mssgmbh.com> wrote:

> Or people with different preferences. Designed in such a way, the
> 'object database' on disk can just be mapped into the address space of
> an application and immediately yields a set of useful 'live'
> C++-objects. Some people always believe certain features really
> shouldn't exist. But insofar as they do, there is obviously no consensus.

That something can be done correctly does not excuse doing it
incorrectly. This is the latter case, since the program fails when the
'mmap' fails. This is a case where the 'mmap' is allowed to fail and
the program is not.
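
As a rough sketch of the distinction, assuming a hypothetical wrapper named
map_pool: the address is passed as a hint rather than with MAP_FIXED, the
result is checked, and every failure is reported to the caller instead of
bringing the program down.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper, not part of POST++.
   Returns  0 on success, with *out set to the mapping;
           -1 if mmap itself failed (errno is set by mmap);
           -2 if the file was mapped, but not at the wanted address. */
int map_pool(int fd, void *want, size_t size, void **out)
{
    /* Without MAP_FIXED the address is only a hint, so an existing
       mapping at that address is never silently replaced. */
    void *p = mmap(want, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);

    if (p == MAP_FAILED)
        return -1;
    if (want != NULL && p != want) {
        munmap(p, size);
        return -2;
    }
    *out = p;
    return 0;
}
```

A caller can then log strerror(errno), retry with another address, or
refuse to open the database, rather than crash.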

DS