From: Skybuck Flying on
Hello,

I just had an idea how to protect the return address on the stack.

The call instruction could make that region "write protected".

The return instruction would then remove the "write protection".

This would not prevent buffer overruns per se, but it would at least
prevent the return address from being overwritten, thereby potentially
avoiding attacks.
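
A toy model may make the idea concrete. This is only a sketch under stated assumptions: a word-addressed stack with one protection bit per word, which no existing hardware is claimed to provide. The names `ToyStack` and `ProtectedWordFault` are hypothetical.

```python
class ProtectedWordFault(Exception):
    """Raised when a store targets a word whose protection bit is set."""


class ToyStack:
    """Word-addressed, downward-growing stack with one protection bit per word."""

    def __init__(self, size_words=64):
        self.mem = [0] * size_words
        self.protected = [False] * size_words  # hypothetical per-word tag bits
        self.sp = size_words  # empty stack: sp points one past the top word

    def store(self, addr, value):
        # Ordinary stores (e.g. a runaway buffer copy) fault on protected words.
        if self.protected[addr]:
            raise ProtectedWordFault(f"store to protected word {addr}")
        self.mem[addr] = value

    def call(self, return_addr):
        # CALL pushes the return address, then sets that word's protection bit.
        self.sp -= 1
        self.mem[self.sp] = return_addr
        self.protected[self.sp] = True

    def alloc(self, n_words):
        # Reserve locals below the return address; returns the lowest address.
        self.sp -= n_words
        return self.sp

    def ret(self, n_locals):
        # RET drops the locals, clears the protection bit, and pops the address.
        self.sp += n_locals
        self.protected[self.sp] = False
        addr = self.mem[self.sp]
        self.sp += 1
        return addr


stack = ToyStack()
stack.call(0x1000)          # outer call
stack.call(0x2000)          # nested call gets its own protected word
buf = stack.alloc(4)        # 4-word local buffer in the inner frame

overflow_caught = False
try:
    for i in range(8):      # copy loop runs past the 4-word buffer
        stack.store(buf + i, 0xDEAD)
except ProtectedWordFault:
    overflow_caught = True

inner_ret = stack.ret(4)
outer_ret = stack.ret(0)
```

In this model the buffer itself is still overrun up to the protected word; only the return address survives, which matches the limited claim above.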

The idea is so simple that even this patent says it's simple and obvious...
I haven't bothered reading the whole thing:

http://www.faqs.org/patents/app/20090063801

But what's the deal with this? ;)

Bye,
Skybuck.


From: Skybuck Flying on

"Mike Hore" <mike_horeREM(a)OVE.invalid.aapt.net.au> wrote in message
news:htpsov$2hl$1(a)news.eternal-september.org...
> Skybuck Flying wrote:
>> Hello,
>>
>> I just had an idea how to protect the return address on the stack.
>>
>> The call instruction could make that region "write protected".
>>
>> The return instruction would then remove the "write protection".
>>
>> This would not prevent buffer overruns per se, but it would at least
>> prevent the return address from being overwritten, thereby potentially
>> avoiding attacks.
>>
>> The idea is so simple that even this patent says it's simple and
>> obvious... I haven't bothered reading the whole thing:
>>
>> http://www.faqs.org/patents/app/20090063801
>
> I didn't want to wade through the whole thing either, but I'm wondering,
> what happens when the called routine isn't a leaf, and calls another
> subroutine? What happens to THAT return address? This idea would seem to
> need a granularity of one address for the protection mechanism. Certainly
> not a page, unless you waste a lot of space on the stack.

I would imagine that every bit has a protection bit, and therefore it should
not be a problem...

Bye,
Skybuck.


From: MitchAlsup on
The only way this has a chance of working (on existing hardware) is
for the minimum stack frame to be at least 1 page in size, and for the
applications to run with write access to the memory management tables.

Thus, there is no chance for this ever happening.
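
A back-of-the-envelope sketch of the first requirement, for scale. The figures are assumptions, not from this thread: a 4 KiB page (common on x86), a hypothetical 64-byte typical frame, and a call depth of 1000. `stack_bytes` is an illustrative helper.

```python
PAGE_SIZE = 4096  # assumed: common x86 page size in bytes


def stack_bytes(call_depth, frame_bytes, page_granular):
    """Total stack consumed at a given call depth."""
    if page_granular:
        # Round every frame up to a whole number of pages.
        pages = -(-frame_bytes // PAGE_SIZE)  # ceiling division
        frame_bytes = pages * PAGE_SIZE
    return call_depth * frame_bytes


compact = stack_bytes(1000, 64, page_granular=False)  # 64,000 bytes
padded = stack_bytes(1000, 64, page_granular=True)    # 4,096,000 bytes
```

Under those assumptions the page-sized frames use 64 times the stack of the compact ones, before counting the cost of a page-table update on every call and return.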

Mitch


From: Paul A. Clayton on
On May 29, 5:51 pm, MitchAlsup <MitchAl...(a)aol.com> wrote:
> The only way this has a chance of working (on existing hardware) is
> for the minimum stack frame to be at least 1 page in size, and for the
> applications to run with write access to the memory management tables.
>
> Thus, there is no chance for this ever happening.
>
> Mitch

A (probably crazy) idea that came to me was to use ECC bit inversion
to indicate a special "word". (This assumes that inverting all the
extra bits would not significantly reduce error protection. It also
assumes that the granularity of specialness is equal to the granularity
of ECC at all levels of cache. [Normal reads could also handle
specially encoded words. Non-managed sub-(ECC)-word writes would be a
problem. Flushing to non-ECC main memory (or disk swap!) would also be
a problem--treating the entire content of a clean data cache block as
special {ECC inversion of all words in block} {with a PTE attribute
such could be limited to stack pages though that might have little
benefit}, creating some vulnerability. For a single special word per
cache block, this could perhaps be handled with the addition of one
bit per cache block to indicate the presence of a special word--on a
write back to non-special-able memory if a special block does not
contain any special words an exception is generated.])
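
A minimal sketch of the inversion trick, to show both how a clean special word is recognized and why the error-protection assumption matters. It assumes a toy Hamming-style code (8 data bits, 4 check bits), not the SECDED codes real caches use; `DATA_CODES`, `encode`, and `classify` are hypothetical names.

```python
# Distinct 4-bit column codes for 8 data bits: non-zero and not a power of two,
# as in a Hamming code (powers of two are the syndromes of the check bits).
DATA_CODES = [3, 5, 6, 7, 9, 10, 11, 12]
ALL_ONES = 0b1111


def check_bits(data):
    """XOR the column codes of every set data bit."""
    c = 0
    for i, code in enumerate(DATA_CODES):
        if (data >> i) & 1:
            c ^= code
    return c


def encode(data, special=False):
    """Return (data, stored check bits); special words store inverted bits."""
    c = check_bits(data)
    return data, (c ^ ALL_ONES) if special else c


def classify(data, stored):
    """Interpret the syndrome the way a reader of this toy code might."""
    syndrome = stored ^ check_bits(data)
    if syndrome == 0:
        return "normal"
    if syndrome == ALL_ONES:
        return "special"
    if syndrome in DATA_CODES or syndrome in (1, 2, 4, 8):
        return "single-bit error"
    return "uncorrectable"


data, ecc = encode(0xA5)
normal_clean = classify(data, ecc)            # "normal"
normal_flip = classify(data ^ 0x01, ecc)      # "single-bit error"

data, ecc = encode(0xA5, special=True)
special_clean = classify(data, ecc)           # "special"
special_flip = classify(data ^ 0x01, ecc)     # "single-bit error" (aliased)
```

In this toy code the all-ones syndrome is otherwise unused, so a clean special word is unambiguous. A special word with data bit 0 flipped, however, yields syndrome 15 ^ 3 = 12, which looks like a single-bit error in bit 7 of a normal word: the marking is lost and the wrong bit would be "corrected".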

(This metadata storage technique could have other uses.
[ECC storage of metadata seems a common proposal, though
the version I have been exposed to used protection over a
larger chunk of data to free some bits. The above technique
might be especially appropriate for encoding 'poison' bits.])

BTW, writing to the memory management tables need not be a
problem--x86 supports hardware updating accessed and modified
bits. Having a return address write set a bit and a read clear a bit
would not seem to be a problem. Page granular protection could
be a problem. :-)

(Skimming part of the patent application, the idea seems to be to
hold a collection of protected addresses in a separate hardware
structure. That seems like a lot of overhead for a single issue.)

A saner calling convention--function-scoped [i.e., stack] unchecked
arrays being allocated to a separate FILO buffer--would contain
this particular buffer overrun issue. (There seem to be abundant
proposed solutions for this problem. E.g., use the return address
predictor to validate return addresses or check for a call instruction
immediately preceding the return address--both mentioned on
comp.arch.)
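
A sketch of the validation idea, assuming an idealized, unbounded second copy of the return addresses that ordinary stores cannot reach. A real return address predictor is a small, lossy structure, so a mismatch there would be a hint rather than proof. `ShadowedCalls` and `ReturnMismatch` are hypothetical names.

```python
class ReturnMismatch(Exception):
    """Raised when the in-memory return address disagrees with the shadow copy."""


class ShadowedCalls:
    def __init__(self):
        self.stack = []    # ordinary stack: reachable by buggy stores
        self.shadow = []   # hypothetical hardware-only copy of return addresses

    def call(self, return_addr):
        self.stack.append(return_addr)
        self.shadow.append(return_addr)

    def ret(self):
        addr = self.stack.pop()
        expected = self.shadow.pop()
        if addr != expected:
            raise ReturnMismatch(f"got {addr:#x}, expected {expected:#x}")
        return addr


cpu = ShadowedCalls()
cpu.call(0x1000)
cpu.call(0x2000)
cpu.stack[-1] = 0x41414141      # simulated overrun clobbers the return address

detected = False
try:
    cpu.ret()
except ReturnMismatch:
    detected = True
```

Unlike write protection, this detects the corruption at return time instead of blocking the store.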


From: MitchAlsup on
On May 29, 8:45 pm, "Paul A. Clayton" <paaronclay...(a)embarqmail.com>
wrote:
> On May 29, 5:51 pm, MitchAlsup <MitchAl...(a)aol.com> wrote:
>
> > The only way this has a chance of working (on existing hardware) is
> > for the minimum stack frame to be at least 1 page in size, and for the
> > applications to run with write access to the memory management tables.
>
> > Thus, there is no chance for this ever happening.
>
> > Mitch
>
> A (probably crazy) idea that came to me was to use ECC bit inversion
> to indicate a special "word".  (This assumes that inverting all the
> extra bits would not significantly reduce error protection.

Modern processors utilize background HW scrubbers that periodically
rummage through the caches and fix single bit ECC errors before they
become double bit (or worse) errors. How are you going to inform these
scrubbers that you don't want certain cached lines scrubbed? Start
with multibit errors and trust that machine check software can figure
out that the user has farted around with the page tables?

nadda gonna 'happen.

> BTW, writing to the memory management tables need not be a
> problem--x86 supports hardware updating accessed and modified
> bits.  Having a return address write set a bit and a read clear a bit
> would not seem to be a problem.  Page granular protection could
> be a problem. :-)

Just what kind of operating system is going to let user-level software
write page table entries?
Answer: none that even purports to be safe or robust. User-level
software is not even supposed to know that page tables even exist!

That is the whole concept behind virtual memory; user-level code doesn't
know what is present or absent. Now that machine virtualization is
underway, there is an OS level of the page tables and a hypervisor
level of the page tables. In addition, when a page table entry is
weakened, the rest of the multiprocessor must be informed to shoot down
those TLB entries. So, now you have user-level code sending shootdown
messages around the multiprocessing system.

nadda gonna 'appen.

Mitch