From: David Hopwood on
andrewspencers(a)yahoo.com wrote:
> Terje Mathisen wrote:
>
>>I have written code like that, and it is indeed possible to get into
>>situations where you have a _lot_ of exit tests. However, since these
>>are all different, a "silent/parallel/interrupt-based" version of it
>>wouldn't work at all, unless you could setup every single one of these
>>tests in a list of stuff to be monitored.
>
> [snip]
>
>>If you had Nick's perfect world where user processes could register to
>>handle hw exceptions, then your ideas could make a little more sense,
>>but still only when you have a lot of different code locations that can
>>share the same logic.
>>
>>I.e. stack overflow fixups is a good example, loop processing isn't.
>
> In that case, wouldn't bignum code, with a lot of arithmetic overflow
> tests, be a good example?

Bignum code doesn't have a lot of cases in which arithmetic overflow is
exceptional. It tests the carry flag (or equivalent) a lot, but carries
are not a sufficiently uncommon case that it would make sense to use a
trap to detect them.

A slightly different situation is where you have code that in practice
always handles integers that fit in a single word, but that can't be
statically guaranteed to do so, and the language specification says that
bignum arithmetic must be supported -- the obvious example being Smalltalk.
There were some attempts to support this in hardware (e.g. "Smalltalk on
a RISC"; also something on SPARC that I can't remember the details of),
but it turned out to be easier and faster for implementations of Smalltalk
and similar languages to use other tricks that don't require hardware support.

--
David Hopwood <david.nospam.hopwood(a)blueyonder.co.uk>
From: David Hopwood on
chl wrote:
> <andrewspencers(a)yahoo.com> wrote:
>
>>>In your scenario, what is going to happen when an "interrupt exited" loop
>>>wants to call another "interrupt exited" loop?
>>
>>Isn't this the same as "what happens when a loop using register x wants
>>to call another loop which uses x (in an incompatible way)?"?
>
> Yes, and this eventually means the caller or the callee have to save the current
> interrupt vector. You still believe this is efficient?

If this optimization pays off at all (which I doubt), it only pays off for
innermost loops.

--
David Hopwood <david.nospam.hopwood(a)blueyonder.co.uk>
From: glen herrmannsfeldt on
andrewspencers(a)yahoo.com wrote:

(snip)

> I assumed that David was referring to a non-protected mode system, in
> which a program is responsible for checking itself.
> True, in a protected mode system, with the OS responsible for checking
> the program, interrupt-based checking is necessary in order to operate
> the virtual memory system, but from the program's perspective, the
> virtual memory system doesn't exist, and the associated interrupts are
> invisible. In contemporary systems programs don't have arbitrary access
> even to their own address spaces, but in principle they could (with
> access attempts to unallocated pages simply resulting in the system
> allocating them, so that the program sees its own entire address space
> as preallocated memory),

There are some interesting examples in IBM's VM series of mainframe
operating systems. I am not sure which version it appeared in, but
some have copy on write for modifying system pages. To speed up
loading, CMS (and any other guest, if desired) can be stored as a
saved system, allowing its pages to be shared among different virtual
machines. As a users are allowed to do just about anything to their
own virtual machine, they can also write into shared pages. At that
point VM makes a non-shared copy of the page for that virtual machine,
as you say transparent to the user program (guest OS).

> so that from the system's perspective, those
> programs are incapable of making any mistakes. In that case, David's
> remarks would apply even for a program running in a protected mode
> system; the program would be responsible for checking its own null
> pointers, stack overflows, internally-allocated buffer overruns, etc,
> either by test-branch code or by some theoretical user-mode interrupt
> system, since the OS wouldn't know or care if the program screwed
> itself up.

It is not only stack space. It seems that on many systems now the
space returned by C's malloc() isn't actually allocated until used.
In some cases the page table entries all point to a single zeroed
page until a page is actually modified. I believe it is done by the
OS, transparent to the user program and C library.

-- glen

From: glen herrmannsfeldt on
David Hopwood wrote:

> glen herrmannsfeldt wrote:

(snip on interrupt loop termination)

>> Well, how about an imprecise interrupt, where possibly one or more
>> additional loop cycles will be executed.

> For it to be feasible to recover from the execution of those additional
> loop cycles in software, it's necessary for the loop to have no
> irreversible side effects (including side effects internal to
> the language implementation, even if the code is pure functional).
> In some cases, you could record what side effects would have
> been performed in the loop and then do them afterward,
> or you could attempt to undo side effects. But in general this
> optimization is probably too hard. Just to save one
> predicted-not-taken conditional branch per loop iteration,
> it's not worth it.

It could only be done if the user program asked that it be done.
I was thinking about string searching where the interrupt (or other
match indication) might come a little late. The result, then, is that
a small correction is made to the match pointer. I suppose similar
to the idea of branch delay slots, consider the case where the loop
increment or compare is in the branch delay slot.

-- glen

From: Terje Mathisen on
Frode Vatvedt Fjeld wrote:

> Terje Mathisen <terje.mathisen(a)hda.hydro.com> writes:
>
>
>>I really can't see any way implicit interrupt handling would help
>>with _any_ of this code though!
>
> FWIW, I recently implemented in my compiler that addidion of two
> addends known to be immediates (i.e. through type inference) and where
> the result might overflow into the bignum range, compiles to something
> like ADDL eax ebx, INTO. While I hope there's something to be gained
> by avoiding the conditional branch, perhaps equally important is the
> gain in code-size; I think some 10-15 bytes of machine code are saved
> here, for what is a fairly common operation.

Hmmm ...

At the ADD site:

add eax,[esi]
jo local_handler
....
local_handler:
call global_handler
jmp return_to_normal_code

Yeah, it seems like this would use 9 bytes of code, vs just 1 for INTO.

Terje
--
- <Terje.Mathisen(a)hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"