From: Taras Glek on
On 04/11/2010 09:43 PM, drepper(a)gmail.com wrote:
> On Sun, Apr 11, 2010 at 19:27, Wu Fengguang <fengguang.wu(a)intel.com>
> wrote:
>> Yes, every binary/library starts with this 512b read. It is requested
>> by ld.so/ld-linux.so, and will trigger a 4-page readahead. This is not
>> good readahead. I wonder if ld.so can switch to mmap read for the
>> first read, in order to trigger a larger 128kb readahead.
>
> We first need to know the sizes of the segments and their location in
> the binary. The binaries we use now are somewhat well laid out. The
> read-only segment starts at offset 0 etc. But this doesn't have to be
> the case. The dynamic linker has to be generic. Also, even if we
> start mapping at offset zero, how much to map? The file might contain
> debug info which must not be mapped. Therefore the first read loads
> enough of the headers to make all of the decisions. Yes, we could do
> a mmap of one page instead of the read. But that's more expensive in
> general, isn't it?
Can this not be cached for prelinked files? I think it is reasonable to
tune the GNU dynamic linker for the optimal layout produced by GNU
tools of the same generation.

Taras
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: drepper on
On Sun, Apr 11, 2010 at 19:27, Wu Fengguang <fengguang.wu(a)intel.com> wrote:
> Yes, every binary/library starts with this 512b read.  It is requested
> by ld.so/ld-linux.so, and will trigger a 4-page readahead. This is not
> good readahead. I wonder if ld.so can switch to mmap read for the
> first read, in order to trigger a larger 128kb readahead.

We first need to know the sizes of the segments and their location in
the binary. The binaries we use now are somewhat well laid out. The
read-only segment starts at offset 0 etc. But this doesn't have to be
the case. The dynamic linker has to be generic. Also, even if we
start mapping at offset zero, how much to map? The file might contain
debug info which must not be mapped. Therefore the first read loads
enough of the headers to make all of the decisions. Yes, we could do
a mmap of one page instead of the read. But that's more expensive in
general, isn't it?
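
To illustrate why the first read has to cover the headers, here is a minimal sketch (not taken from ld.so) that extracts the loadable segments from the first bytes of a file. It assumes a little-endian ELF64 binary; the function name `loadable_segments` is hypothetical, and the field offsets follow the standard ELF64 header and program header layout.

```python
import struct

PT_LOAD = 1  # program header type for loadable segments


def loadable_segments(header: bytes):
    """Return (offset, filesz, memsz) for each PT_LOAD entry found in `header`.

    Sketch only: handles little-endian ELF64 and raises ValueError otherwise.
    """
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if header[4] != 2 or header[5] != 1:
        raise ValueError("sketch only handles little-endian ELF64")
    # e_phoff (file offset of the program header table) lives at byte 32.
    (e_phoff,) = struct.unpack_from("<Q", header, 32)
    # e_phentsize and e_phnum live at bytes 54 and 56.
    e_phentsize, e_phnum = struct.unpack_from("<HH", header, 54)
    segments = []
    for i in range(e_phnum):
        start = e_phoff + i * e_phentsize
        if start + 56 > len(header):
            break  # the table runs past what the first read returned
        (p_type, _flags, p_offset, _vaddr, _paddr,
         p_filesz, p_memsz, _align) = struct.unpack_from("<IIQQQQQQ", header, start)
        if p_type == PT_LOAD:
            segments.append((p_offset, p_filesz, p_memsz))
    return segments
```

Calling it on the result of `f.read(512)` for a binary opened in `"rb"` mode lists the segments that would need mapping. Anything after the last segment (debug info, for example) is never touched.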
From: Wu Fengguang on
On Mon, Apr 12, 2010 at 12:43:00PM +0800, drepper(a)gmail.com wrote:
> On Sun, Apr 11, 2010 at 19:27, Wu Fengguang <fengguang.wu(a)intel.com> wrote:
>> Yes, every binary/library starts with this 512b read.  It is requested
>> by ld.so/ld-linux.so, and will trigger a 4-page readahead. This is not
>> good readahead. I wonder if ld.so can switch to mmap read for the
>> first read, in order to trigger a larger 128kb readahead.
>
> We first need to know the sizes of the segments and their location
> in the binary. The binaries we use now are somewhat well laid out.
> The read-only segment starts at offset 0 etc. But this doesn't have
> to be the case. The dynamic linker has to be generic. Also, even
> if we start mapping at offset zero, how much to map? The file might
> contain debug info which must not be mapped. Therefore the first
> read loads enough of the headers to make all of the decisions. Yes,

I once read the ld code; it's more complex than I expected.

> we could do a mmap of one page instead of the read. But that's more
> expensive in general, isn't it?

Right. Without considering IO, a simple read(512) is more efficient than
mmap()+read+munmap().
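
The two paths being compared can be sketched as follows. This is an illustration only, with hypothetical helper names. Both functions return the same bytes, but the mmap variant needs a map, a page fault and an unmap where the plain variant needs a single read call.

```python
import mmap
import os


def header_via_read(path, size=512):
    """Fetch the header with one read() call."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, size)
    finally:
        os.close(fd)


def header_via_mmap(path, size=512):
    """Fetch the same bytes by mapping the first page, copying, and unmapping."""
    fd = os.open(path, os.O_RDONLY)
    try:
        file_size = os.fstat(fd).st_size
        if file_size == 0:
            return b""  # mmap rejects zero-length mappings
        length = min(file_size, mmap.PAGESIZE)
        with mmap.mmap(fd, length, access=mmap.ACCESS_READ) as m:
            return m[:size]
    finally:
        os.close(fd)
```

Timing both with `timeit` on a file that is already in the page cache is one way to check the claim on a given machine. No numbers are asserted here.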

Thanks,
Fengguang
From: Andi Kleen on
Taras Glek <tglek(a)mozilla.com> writes:

> Hello,
> I am working on improving Mozilla startup times. It turns out that
> page faults (caused by lack of cooperation between user/kernelspace)
> are the main cause of slow startup. I need some insights from someone
> who understands linux vm behavior.

I have an older patch to create dynamic bitmaps based on the last
run and only prefetch those pages.

It wasn't entirely a win for everything and didn't work for shared
libraries, but with some additional tuning the approach still has
potential I think, by combining memory saving with prefetching.

ftp://firstfloor.org/pub/ak/pbitmap/INTRO
http://halobates.de/dp2.pdf

For your use case the algorithm would likely need some glibc support.
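
As a rough userspace analogy only (this is not the pbitmap patch, which works inside the kernel), the idea of prefetching just the pages recorded in a bitmap can be sketched as below. The bitmap format, the 4096-byte page size and both function names are assumptions made for illustration. How the bitmap gets recorded on the previous run is left out.

```python
import os

PAGE_SIZE = 4096  # assumed page size for this sketch


def bitmap_to_runs(bitmap):
    """Coalesce a per-page bitmap into (first_page, page_count) runs."""
    runs = []
    start = None
    for page, touched in enumerate(bitmap):
        if touched and start is None:
            start = page
        elif not touched and start is not None:
            runs.append((start, page - start))
            start = None
    if start is not None:
        runs.append((start, len(bitmap) - start))
    return runs


def prefetch(fd, bitmap):
    """Hint the kernel to read only the pages marked in `bitmap`."""
    runs = bitmap_to_runs(bitmap)
    for first_page, count in runs:
        # POSIX_FADV_WILLNEED requests readahead for the given byte range.
        os.posix_fadvise(fd, first_page * PAGE_SIZE, count * PAGE_SIZE,
                         os.POSIX_FADV_WILLNEED)
    return runs
```

Coalescing adjacent pages into runs keeps the number of hints small when the pages touched on the last run cluster together.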

-Andi

--
ak(a)linux.intel.com -- Speaking for myself only.
From: Andrew Morton on
On Mon, 05 Apr 2010 15:43:02 -0700
Taras Glek <tglek(a)mozilla.com> wrote:

> To make matters worse,
> the compile-time linker + gcc lay out code in a manner that does not
> correspond to how the resulting executable will be executed (i.e. the
> layout is basically random).

Yes, the linker scrambles the executable's block ordering.

This just isn't an interesting case. World-wide, the number of people
who compile their own web browser and execute it from the file which ld
produced is, umm, seven.

So I'd suggest that you always copy the executable to a temp file and
mv it back before running any timing tests.
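
The copy-and-rename step can be scripted. The sketch below is one way to do it, with a hypothetical helper name. It assumes the temp file lives in the same directory as the executable, so the final rename stays on one filesystem. The point is presumably that the copy is written out sequentially, unlike the file ld produced.

```python
import os
import shutil
import tempfile


def rewrite_in_place(path):
    """Copy `path` to a temp file in the same directory, then rename it back."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        shutil.copy2(path, tmp_path)  # sequential copy; keeps mode and mtime
        os.replace(tmp_path, path)    # same effect as `mv tmp path`
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Run it on the freshly linked binary before each round of startup measurements.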
