From: glen herrmannsfeldt on
David Hopwood wrote:

(snip regarding interrupt loop termination)

> If the speculative execution circuitry is "overloaded", then the same is
> likely to be true of the interrupt checking under the same circumstances.
> Checking for interrupts (which then need to be delivered precisely) is
> not free; in terms of complexity it is similar to speculative execution.
> That is, you strongly speculate that the interrupt is not taken on each
> instruction, and have to recover if it is. The difference is that you need
> dedicated hardware resources to check the interrupt conditions on *every*
> instruction, rather than using shared resources to do it only when the
> condition is relevant.

Well, how about an imprecise interrupt, where one or more additional
loop iterations may be executed? At interrupt time the program has to
correct for the extra iterations. That might make it faster, but then
it could also be done with a normal loop instruction.
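
As a rough user-level analogue of that correction step (a sketch only,
assuming POSIX; the element count N, the one-page buffer, and names like
on_fault are made up for illustration): the summing loop below has no
exit test at all, the "interrupt" is the SIGSEGV taken when the
induction pointer walks onto a PROT_NONE page placed after the data,
and because that fault arrives some iterations past the logical end,
the recovery code has to subtract the overshoot.

#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define N 100                         /* we only want the first N elements */

static sigjmp_buf out;
static long *volatile cur;            /* induction pointer, read after the fault */

static void on_fault(int sig) { (void)sig; siglongjmp(out, 1); }

int main(void)
{
    long pagesz   = sysconf(_SC_PAGESIZE);
    long per_page = pagesz / (long)sizeof(long);

    /* One page of data (all ones) followed by an unmapped guard page. */
    long *buf = mmap(NULL, 2 * pagesz, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) { perror("mmap"); return 1; }
    mprotect((char *)buf + pagesz, pagesz, PROT_NONE);
    for (long i = 0; i < per_page; i++) buf[i] = 1;

    signal(SIGSEGV, on_fault);

    volatile long sum = 0;            /* volatile so it survives siglongjmp */
    if (sigsetjmp(out, 1) == 0) {
        for (cur = buf; ; cur++)      /* no exit test: runs until the fault */
            sum += *cur;
    } else {
        /* "Imprecise" exit: (cur - buf) - N extra elements were added.
         * Correct by subtracting them back out. */
        for (long *p = buf + N; p < cur; p++)
            sum -= *p;
    }
    printf("sum of first %d elements = %ld (expect %d)\n", N, (long)sum, N);
    return 0;
}

The subtraction loop at the end is the "correct for the additional
cycles" step: the interrupt is allowed to arrive late, and the program
pays for it afterwards.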

-- glen

From: glen herrmannsfeldt on
David Hopwood wrote:


> But to answer the question, some examples of where this approach can be
> effective are using unmapped guard pages to detect stack overflows,
> exhaustion of an allocation space, or null pointer accesses in
> languages that require null pointer checks.

> Note that these are cases in which the relevant operation
> (allocating a stack frame or object, indirecting through a pointer)
> is extremely common, and the interrupt-based optimization is in
> "infrastructure" code so that a single implementation can apply
> to many programs. Even then it may not pay off: hacks
> like this should be reevaluated every so often to see
> whether they are really giving a performance improvement
> that justifies their complexity.

They should, but a protected mode system must check for invalid
addresses, and a virtual memory system must check for a valid
page table entry.
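
A minimal user-space sketch of the guard-page approach quoted above,
assuming POSIX mmap/mprotect and SIGSEGV delivery with SA_SIGINFO (the
arena size and names such as on_exhausted are made up for illustration):
the fill loop performs no explicit limit check, and exhaustion shows up
as a fault whose address lands in the PROT_NONE page at the end of the
arena.

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define ARENA_PAGES 4                 /* usable pages before the guard */

static char *arena, *guard;
static long  pagesz;

static void on_exhausted(int sig, siginfo_t *si, void *ctx)
{
    (void)sig; (void)ctx;
    char *addr = (char *)si->si_addr;
    if (addr >= guard && addr < guard + pagesz) {
        /* The interrupt-based check fired: the arena is exhausted. */
        static const char msg[] = "arena exhausted (guard page hit)\n";
        write(STDERR_FILENO, msg, sizeof msg - 1);
        _exit(1);
    }
    signal(SIGSEGV, SIG_DFL);         /* anything else is a real bug */
    raise(SIGSEGV);
}

int main(void)
{
    pagesz = sysconf(_SC_PAGESIZE);
    arena  = mmap(NULL, (ARENA_PAGES + 1) * pagesz,
                  PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (arena == MAP_FAILED) { perror("mmap"); return 1; }

    guard = arena + ARENA_PAGES * pagesz;
    mprotect(guard, pagesz, PROT_NONE);

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = on_exhausted;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);

    /* No limit test in the loop: the unmapped page is the check. */
    for (char *p = arena; ; p += 64)
        memset(p, 0xAB, 64);
}

A stack guard page works the same way, just with the faulting access
coming from frame allocation rather than from a heap fill.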

-- glen

From: Terje Mathisen on
andrewspencers(a)yahoo.com wrote:

> Terje Mathisen wrote:

>>d) How do you propose to generate loops unless you have some kind of (in
>>your case unconditional) branch at the bottom?
>
> Yes, I proposed using an unconditional branch at the bottom.

This costs _exactly_ as much as a conditional jump, assuming both are
correctly predicted to be taken. (No, it isn't true that unconditional
branches are free.)

The exit comparison logic has to be there anyway: either you run it
after/during every single clock cycle, or you do it just once, when the
CMP reg,exit_value opcode turns up.
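
For concreteness, an illustrative fragment (not from the original post)
of the kind of counted loop being discussed; a compiler typically lowers
the exit test to a single compare-and-branch, e.g. a CMP/JNE pair on
x86, which is predicted taken on every iteration but the last:

/* one correctly predicted exit branch per iteration */
long sum_upto(const long *a, long n)
{
    long s = 0;
    for (long i = 0; i != n; i++)
        s += a[i];
    return s;
}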

> Another case would be a linear search for which the target string is
> generally very long and there are a lot of partial matches within the
> search space.

I have written code like that, and it is indeed possible to get into
situations where you have a _lot_ of exit tests. However, since these
are all different, a "silent/parallel/interrupt-based" version of it
wouldn't work at all, unless you could set up every single one of these
tests in a list of stuff to be monitored.

> Another case, I think (but I'm too tired right now to think clearly
> enough to be sure), would be a relational join algorithm.
>
> Is my analysis accurate?

No.

>
>>Noting my argument (e), the overhead of inserting INTO opcodes at
>>relevant sites in the code can be close to zero, since the cpu can treat
>>it like a "strongly predicted to fall through" branch, i.e. it doesn't
>>even need to take up any branch predictor buffer space.
>
> But I'm not proposing using INTO (a soft interrupt); I'm proposing
> using a hard interrupt, which works the same way that breakpointing
> works.

If you had Nick's perfect world where user processes could register to
handle hw exceptions, then your ideas could make a little more sense,
but still only when you have a lot of different code locations that can
share the same logic.

I.e. stack overflow fixups are a good example; loop processing isn't.

Terje

--
- <Terje.Mathisen(a)hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
From: chl on

<andrewspencers(a)yahoo.com> wrote in message news:1126320161.417289.140570(a)z14g2000cwz.googlegroups.com...

> Thus for a high-repetition loop, an interrupt-triggered exit is better
> than a compare-and-branch exit.

In your scenario, what is going to happen when an "interrupt exited" loop
wants to call another "interrupt exited" loop?



From: andrewspencers on
glen herrmannsfeldt wrote:
> David Hopwood wrote:
>
>
> > But to answer the question, some examples of where this approach can be
> > effective are using unmapped guard pages to detect stack overflows,
> > exhaustion of an allocation space, or null pointer accesses in
> > languages that require null pointer checks.
>
> > Note that these are cases in which the relevant operation
> > (allocating a stack frame or object, indirecting through a pointer)
> > is extremely common, and the interrupt-based optimization is in
> > "infrastructure" code so that a single implementation can apply
> > to many programs. Even then it may not pay off: hacks
> > like this should be reevaluated every so often to see
> > whether they are really giving a performance improvement
> > that justifies their complexity.
>
> They should, but a protected mode system must check for invalid
> addresses, and a virtual memory system must check for a valid
> page table entry.

I assumed that David was referring to a non-protected-mode system, in
which a program is responsible for checking itself.

True, in a protected-mode system, with the OS responsible for checking
the program, interrupt-based checking is necessary to operate the
virtual memory system. But from the program's perspective the virtual
memory system doesn't exist, and the associated interrupts are
invisible.

In contemporary systems programs don't even have arbitrary access to
their own address spaces, but in principle they could: an access to an
unallocated page would simply cause the system to allocate it, so that
the program sees its entire address space as preallocated memory. From
the system's perspective, such a program is incapable of making any
mistakes. In that case David's remarks would apply even to a program
running on a protected-mode system: the program would be responsible
for checking its own null pointers, stack overflows, internally
allocated buffer overruns, etc., either with test-and-branch code or
with some theoretical user-mode interrupt mechanism, since the OS
wouldn't know or care if the program screwed itself up.
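
A hedged user-level sketch of that "entire address space appears
preallocated" idea, assuming POSIX (the region size, the name
demand_map, and the use of mprotect inside a SIGSEGV handler are
illustrative; mprotect is not formally async-signal-safe, though
runtimes use this trick in practice): a large PROT_NONE reservation
stands in for the program's address space, and each page is made
readable/writable on first touch, so the program itself never performs
an allocation check.

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define SPACE_BYTES (16u << 20)          /* 16 MiB demo "address space" */

static char *space;
static long  pagesz;

static void demand_map(int sig, siginfo_t *si, void *ctx)
{
    (void)sig; (void)ctx;
    char *addr = (char *)si->si_addr;
    if (addr >= space && addr < space + SPACE_BYTES) {
        /* First touch: map the page in and let the access retry. */
        char *page = space + ((addr - space) / pagesz) * pagesz;
        if (mprotect(page, pagesz, PROT_READ | PROT_WRITE) == 0)
            return;
    }
    signal(SIGSEGV, SIG_DFL);            /* a genuinely wild access still dies */
    raise(SIGSEGV);
}

int main(void)
{
    pagesz = sysconf(_SC_PAGESIZE);
    space  = mmap(NULL, SPACE_BYTES, PROT_NONE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (space == MAP_FAILED) { perror("mmap"); return 1; }

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = demand_map;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);

    /* The program touches whatever it likes; pages appear on demand,
     * so from its point of view the whole space was always there. */
    space[0] = 1;
    space[SPACE_BYTES - 1] = 2;
    printf("touched first and last byte with no explicit allocation\n");
    return 0;
}

Whether a fault per first touch actually beats an explicit check is
exactly the kind of trade-off David's caveat about reevaluating such
hacks applies to.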