From: Stephan Ceram on
Hi,

in a book I've found this about the compiler optimization
"inline expansion":

>>A function is more likely to be inlined by a static inliner
if it is only called by one procedure, as this increases spatial
locality in the instruction cache without a decrease in temporal
locality.<<

What I don't understand is why inlining does not decrease temporal
locality in the I-cache. When I inline a function, I actually
always have a loss of temporal locality:

ins0
call X
ins1

When X is inlined here, it is very likely that the instructions
ins0 and ins1, which are executed close together in time, will be moved
far away from each other, and so possibly two I-cache accesses are
required to get ins0 and ins1, while in the non-inlined function
ins0, call and ins1 could fit in one cache line.

Do you have any idea why the book might be right in its statement
about the non-decreased temporal locality?

Regards,
Chris
From: Ben Bacarisse on
Stephan Ceram <linuxkaffee_@_gmx.net> writes:

> Hi,
>
> in a book I've found this about the compiler optimization
> "inline expansion":
>
>>>A function is more likely to be inlined by a static inliner
> if it is only called by one procedure, as this increases spatial
> locality in the instruction cache without a decrease in temporal
> locality.<<
>
> What I don't understand is why inlining does not decrease temporal
> locality in the I-cache. When I inline a function, I actually
> always have a loss of temporal locality:
>
> ins0
> call X
> ins1

Spatial locality is about the addressing distance between executed
instructions. Inlining X will increase the spatial locality in the
instruction cache because there will be no need to jump to possibly
distant X and back. The temporal locality is, presumably, the time
gap between the execution of various instructions. Inlining X does
not obviously increase the gap in time between executing any of the
instructions involved. In fact it is likely to reduce it a bit (the
gap between ins0 and ins1 is likely to be slightly shorter, rather
than longer).
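
As a rough illustration, here is a toy Python model of the two layouts.
Everything in it is an assumption made up for the example: 4-byte
instructions, 64-byte cache lines, a 20-instruction X placed at 0x8000,
and the caller at 0x1000. It does not model a real cache, only which
addresses get fetched and in what order.

```python
# Toy model only: all sizes and addresses are made-up assumptions.
INSN_BYTES = 4    # assumed fixed instruction size
LINE_BYTES = 64   # assumed I-cache line size


def trace_with_call(caller_base, callee_base, callee_len):
    """Fetch addresses for: ins0, call X, <body of X>, ret, ins1."""
    trace = [caller_base, caller_base + INSN_BYTES]            # ins0, call X
    trace += [callee_base + i * INSN_BYTES
              for i in range(callee_len)]                      # body of X
    trace.append(callee_base + callee_len * INSN_BYTES)        # ret
    trace.append(caller_base + 2 * INSN_BYTES)                 # ins1
    return trace


def trace_inlined(caller_base, callee_len):
    """Fetch addresses for: ins0, <body of X>, ins1, laid out contiguously."""
    return [caller_base + i * INSN_BYTES for i in range(callee_len + 2)]


def lines_touched(trace):
    """Number of distinct cache lines the trace fetches from."""
    return len({addr // LINE_BYTES for addr in trace})


def largest_jump(trace):
    """Largest address distance between two consecutive fetches."""
    return max(abs(b - a) for a, b in zip(trace, trace[1:]))


def steps_from_ins0_to_ins1(trace):
    """Temporal gap, counted in fetches, between first and last instruction."""
    return len(trace) - 1


if __name__ == "__main__":
    called = trace_with_call(0x1000, 0x8000, 20)
    inlined = trace_inlined(0x1000, 20)
    for name, t in (("called", called), ("inlined", inlined)):
        print(name, lines_touched(t), largest_jump(t),
              steps_from_ins0_to_ins1(t))
```

With these made-up numbers the called version touches 3 lines, needs 23
fetch steps to get from ins0 to ins1, and includes a jump of 28744
bytes. The inlined version touches 2 lines, needs 21 steps, and never
moves more than 4 bytes at a time. So ins0 and ins1 end up further
apart in address, but not in time.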

> When X is inlined here, it is very likely that the instructions
> ins0 and ins1, which are executed close together in time, will be moved
> far away from each other

But only in addressing distance, not temporal distance.

> and so possibly two I-cache accesses are
> required to get ins0 and ins1, while in the non-inlined function
> ins0, call and ins1 could fit in one cache line.

Again, this is about the addressing gap between them. Of course, the
non-local jump implied by the call is likely to need a different cache
line fetched anyway.

--
Ben.