From: D Yuniskis on
[forwarded from a conversation elsewhere]

>>> What manifest constants do folks use to define the limits
>>> on physical memory (of a particular flavor -- e.g., RAM)?
>>>
>>> For example, I do:
>>>
>>> do_something(void *location, size_t size) {
>>> ASSERT( (location >= FIRST_RAM) && (location <= LAST_RAM) );
>>> ASSERT( size <= ((LAST_RAM - FIRST_RAM) + 1) );
>>> ASSERT( location <= ((LAST_RAM + 1) - size) );
>>
>> I would do that like
>>
>> do_something(void *location, size_t size) {
>> ASSERT( (location >= BEGIN_RAM) && (location < END_RAM) );
>> ASSERT( size <= (END_RAM - BEGIN_RAM) );
>> ASSERT( location + size < END_RAM );
>>
>> this saves all those little +/- 1's and makes some of the comparisons
>> clearer, like the last one.
>
> This is a reflection of the way I write "live" code. I used to
> use a BEGIN_ END_ type naming scheme as a throwback to how you
> typically did things in ASM. For example:
>
> FOO: DS 312*4
> END_FOO:
>
> BAR: DW 1
>
> which let you locate the end of "FOO" at END_FOO-1 -- and the
> size of FOO as END_FOO-FOO.
>
> But, END_FOO is not part of FOO in these semantics. Just as
> the n'th element of an n-element array isn't at [n].
>
> Instead, I use the convention that I reference the first and
> last elements of an object (in this case, memory addresses).
> So, FIRST_ and LAST_ are each part of the object in question.
> (a consequence of this is all objects must have at least
> one element!!). Note that I can still determine size as
> LAST_ - FIRST_ + 1. If these are truly manifest constants,
> then there is no cost in doing so. (and, if they were to
> appear in live code, it's just an "off by one" operation)
>
> Where it comes in handy is when you are mucking around in the
> dark corners of a particular implementation. For example,
> if LAST_ happens to indicate the end of physical memory,
> then LAST_+1 -- i.e., *your* definition of "END_" -- may
> result in arithmetic overflow (depends on the preprocessor,
> in this example; in live code, it depends on the width of the
> register!). I.e., your END_ may be LESS THAN your BEGIN_
> (if LAST_+1 wraps to 0x0). This can lead to some nasty
> bugs to ferret out when/if things creep up to those limits.
> Consider, for example, "location + size < END" on a 32b
> machine with a 32b address space (i.e., END = 0x100000000,
> which is 0x0 at runtime) -- so EVERY combination of location
> and size fails this criterion! (A sketch of this appears at
> the end of this post.)
>
> My previous use of END_ and BEGIN_ led me to adopt a different
> set of names -- for fear the END_ practice would lurk behind
> the scenes in my thinking. FIRST_ and LAST_ seem to reinforce
> their intended semantics (i.e., think of the first and last
> persons in a *line*; contrast that with how you would define
> the beginning and end of that same line!)
>
> But, the point of my question was to address how you reference
> memory regions, etc. The days of "RAM + ROM" are long past.
> With some of the memory protection units, I find myself
> needing to treat what would previously have been lumped into
> the category of "ROM" ("CODE", "TEXT", etc.) as *different*
> regions. "REGION1", "REGION2", etc. just seems pretty hokey.
> Even RAM is disjoint in some of my designs -- "RAM1", "RAM2"?
> Perhaps I could develop more meaningful names -- "WCS", "VRAM",
> etc. -- but that is application specific (think code reuse)?
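
A minimal sketch of the LAST_+1 wraparound hazard described above,
in C. The FIRST_RAM/LAST_RAM names come from the discussion; the
addresses, the use of the standard assert() macro, and the uintptr_t
casts are illustrative assumptions, not anyone's actual code:

#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values: a RAM region that ends exactly at the top of
 * a 32-bit physical address space.  In 32-bit arithmetic, LAST_RAM+1
 * (the "END_RAM" style constant) wraps to 0, so any test of the form
 * "location + size < END_RAM" fails for every possible argument.     */
#define FIRST_RAM  ((uintptr_t)0xC0000000u)
#define LAST_RAM   ((uintptr_t)0xFFFFFFFFu)   /* last byte that IS RAM */

void do_something(void *location, size_t size)
{
    uintptr_t loc = (uintptr_t)location;

    /* FIRST_/LAST_ formulation: no intermediate result can exceed
     * LAST_RAM, so nothing wraps even at the edge of the space.      */
    assert(loc >= FIRST_RAM && loc <= LAST_RAM);
    assert(size <= (LAST_RAM - FIRST_RAM) + 1);
    assert(loc - FIRST_RAM <= ((LAST_RAM - FIRST_RAM) + 1) - size);

    /* ... */
}
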
From: Paul Keinanen on
On Mon, 19 Jul 2010 17:05:09 -0700, D Yuniskis
<not.going.to.be(a)seen.com> wrote:

>Hi Tim,
>
>Tim Wescott wrote:
>> On 07/19/2010 10:58 AM, D Yuniskis wrote:
>>
>>> What manifest constants do folks use to define the limits
>>> on physical memory (of a particular flavor -- e.g., RAM)?
>>> For example, I do:
>>>
>>> do_something(void *location, size_t size) {
>>> ASSERT( (location >= FIRST_RAM) && (location <= LAST_RAM) );
>>> ASSERT( size <= ((LAST_RAM - FIRST_RAM) + 1) );
>>> ASSERT( location <= ((LAST_RAM + 1) - size) );
>>>
>>> ...
>>> }
>>>
>>> This gets tedious with discontiguous memory but I don't see
>>> any way around it.
>>
>> What errors are you trying to catch?
>
>Running off the end/start of a memory region. And consider that the
>region may be as big as the largest int supported by the compiler.
>
>> If you over-allocate any static memory the linker will catch the problem
>
>Not worried about static allocations -- for the reasons
>you outline. Nor static assignments (e.g., foo[x] = foo[y] where
>x and y are both >= 0 and < sizeof(foo)).
>
>Not all things fit nicely into static allocations (e.g., any
>product that has a variable memory complement has "limits"
>that the linkage editor is not aware of -- late binding).
>
>Any product that does more than what the resource complement
>appears capable of supporting has to dynamically reallocate
>resources.
>
>And, even a statically allocated region/object needs limit
>checking (e.g., imagine these were FIRST_HEAP and LAST_HEAP
>and you're trying to allocate a stack *in* that heap for a
>newly/dynamically created task).

I really do not understand what you are trying to do.

A standard practice for small systems for decades has been to put the
initial stack pointer at the top of the RAM in the initialization
routine and the stack grows downwards (on most processors). If the RAM
size is not known in advance, at reset, the address space can be
scanned for the last R/W location.
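
A rough illustration of that kind of probe. Hedged: the step size,
the test pattern, and the assumption that an unmapped address simply
fails to read back (rather than faulting) are all simplifications; a
real probe is target-specific and has to worry about caches, write
buffers and bus aborts:

#include <stdint.h>

/* Walk upwards from a known-good RAM address in fixed steps until a
 * location no longer reads back what was written; the last location
 * that did is taken as the top of RAM.                               */
static uintptr_t find_last_ram(uintptr_t first_ram, uintptr_t step,
                               uintptr_t max_addr)
{
    uintptr_t last_good = first_ram;

    for (uintptr_t addr = first_ram; addr <= max_addr - step; addr += step) {
        volatile uint32_t *p = (volatile uint32_t *)addr;
        uint32_t saved = *p;

        *p = 0xA5A5A5A5u;
        if (*p != 0xA5A5A5A5u)
            break;              /* not R/W: we have run off the end of RAM */
        *p = saved;             /* restore the original contents           */
        last_good = addr;
    }
    return last_good;
}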

The linker will allocate the static R/W data at the beginning of RAM
and a special label is inserted at the end of this data area (e.g.
ZZZZZZ if alphabetic program section ordering is used).

The dynamic memory allocator will use this label as the initial
value of its top_of_the_heap pointer. Dynamic memory allocation
grows upwards towards the stack.

When malloc needs to obtain new memory from RAM, it temporarily
adds the size requested (plus overhead) to the current top_of_the_heap
pointer and then compares it to the current hardware stack pointer
(with some free space left for ISR register saves etc.). If no more
free memory is available, the top_of_the_heap is not moved and malloc
returns NULL. This must be done in an assembler routine.
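
A rough C-level sketch of that check. Hedged: the point above is that
the real thing has to read the hardware stack pointer, so it lives in
assembler; read_stack_pointer(), the __heap_start linker label and the
margin constant below are hypothetical stand-ins:

#include <stddef.h>
#include <stdint.h>

extern char __heap_start[];     /* linker label at the end of static R/W data */
static char *top_of_the_heap = __heap_start;

#define STACK_MARGIN 64u        /* headroom for ISR register saves, etc.      */

/* Hypothetical helper: returns the current hardware stack pointer.           */
extern uintptr_t read_stack_pointer(void);

/* Grow the heap toward the (downward-growing) stack, or refuse with NULL.    */
void *heap_extend(size_t request)   /* request already includes any overhead  */
{
    char *new_top = top_of_the_heap + request;

    if ((uintptr_t)new_top + STACK_MARGIN >= read_stack_pointer())
        return NULL;            /* would collide with the stack: don't move   */

    void *block = top_of_the_heap;
    top_of_the_heap = new_top;
    return block;
}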

Each time a new function is entered, it allocates a new stack frame.
Before doing this allocation, it should check that the new tentative
hardware stack pointer is not lower than the top_of_the_heap pointer.
If it is, it should do something sensible, such as resetting the system
to avoid any further damage.

This can be easily implemented: if a separate preamble coroutine is
used (e.g. for register save/restore), inserting this check (in
assembler) is easy. Alternatively, the compiler should generate this
check.

If this is not possible, an assembler routine should be called
immediately at subfunction entry and if overlap is detected, the
system should restart immediately, since the heap is already
corrupted.

When malloc() and free() routines are used, the pool may become
fragmented, causing the top_of_the_heap pointer to creep upwards,
finally hitting the stack.

Thus, you cannot rely on static memory tests alone: if free() calls
are also used, the memory checks in assembler need to be done in
production code as well (and not only in debug code).

From: D Yuniskis on
Hi Paul,

Paul Keinanen wrote:
> On Mon, 19 Jul 2010 17:05:09 -0700, D Yuniskis
> <not.going.to.be(a)seen.com> wrote:
>
>> Tim Wescott wrote:
>>> On 07/19/2010 10:58 AM, D Yuniskis wrote:
>>>
>>>> What manifest constants do folks use to define the limits
>>>> on physical memory (of a particular flavor -- e.g., RAM)?
>>>> For example, I do:
>>>>
>>>> do_something(void *location, size_t size) {
>>>> ASSERT( (location >= FIRST_RAM) && (location <= LAST_RAM) );
>>>> ASSERT( size <= ((LAST_RAM - FIRST_RAM) + 1) );
>>>> ASSERT( location <= ((LAST_RAM + 1) - size) );
>>>>
>>>> ...
>>>> }
>>>>
>>>> This gets tedious with discontiguous memory but I don't see
>>>> any way around it.
>>> What errors are you trying to catch?
>> Running off the end/start of a memory region. And consider that the
>> region may be as big as the largest int supported by the compiler.
>>
>>> If you over-allocate any static memory the linker will catch the problem
>> Not worried about static allocations -- for the reasons
>> you outline. Nor static assignments (e.g., foo[x] = foo[y] where
>> x and y are both >= 0 and < sizeof(foo)).
>>
>> Not all things fit nicely into static allocations (e.g., any
>> product that has a variable memory complement has "limits"
>> that the linkage editor is not aware of -- late binding).
>>
>> Any product that does more than what the resource complement
>> appears capable of supporting has to dynamically reallocate
>> resources.
>>
>> And, even a statically allocated region/object needs limit
>> checking (e.g., imagine these were FIRST_HEAP and LAST_HEAP
>> and you're trying to allocate a stack *in* that heap for a
>> newly/dynamically created task).
>
> I really do not understand what you are trying to do.

Forget what I am trying to do. I have N different types
(technologies) of memory in a design. They are used for
M different (incompatible) purposes. Each has a starting
point and an ending point (based on its size) in the
processor's *physical* address space (this may differ
from their *logical* addresses).

In some configurations, these are hard constants that are
known at compile time. In other configurations, they are
determined dynamically at run time -- though, thereafter,
effectively *treated* as "hard constants".

In traditional/classic architectures, you might have
START_OF_ROM, END_OF_ROM, START_OF_RAM and END_OF_RAM.
Where these map to the processor's address space can
vary. In some architectures, ROM might start at 0x0.
In others, that might be RAM and ROM is constrained to
*end* at, e.g., "the highest memory address".

In classic designs, all ROM and all RAM were contiguous.

Nowadays, (at least in my case) "ROM" can be littered
throughout the address space as can RAM. There can be
other types of "memory" as well: FLASH, DPRAM, etc.
And, memory mapped I/O (START_OF_XIP, etc.)

My question is simple: what do you use as a naming convention
for these regions? E.g., we symbolically reference "ULONG_MAX".
You probably wouldn't adopt "RAM_MAX" as a naming convention for
"highest RAM location" (or, would you?). Some folks like
START_ and END_ naming schemes (I object to this on grounds
explained in another post). I prefer FIRST_ and LAST_.

But, that still leaves the naming convention for the regions
themselves: _ROM, _RAM, _FLASH, _ROM1 ... _ROMn, _RAM1 ... _RAMn,
etc.
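
For concreteness, here is one way such a header might look, using the
FIRST_/LAST_ convention with technology (and, where needed, role)
names. Everything below -- the particular regions, the addresses, the
SIZE_OF() helper -- is illustrative, not a claim about any specific
part:

#include <stdint.h>

/* Each region is named by technology and role; FIRST_/LAST_ are the
 * first and last byte addresses that BELONG to the region (so every
 * region has at least one byte).                                     */
#define FIRST_SRAM    ((uintptr_t)0x20000000u)
#define LAST_SRAM     ((uintptr_t)0x2003FFFFu)

#define FIRST_FLASH   ((uintptr_t)0x08000000u)
#define LAST_FLASH    ((uintptr_t)0x080FFFFFu)

#define FIRST_DPRAM   ((uintptr_t)0x40006000u)
#define LAST_DPRAM    ((uintptr_t)0x400067FFu)

/* Size is always LAST_ - FIRST_ + 1; with manifest constants the +1
 * costs nothing.  E.g., SIZE_OF(SRAM) -> LAST_SRAM - FIRST_SRAM + 1  */
#define SIZE_OF(region)  (LAST_##region - FIRST_##region + 1)

Where the limits are only discovered at run time (as mentioned above),
the same FIRST_/LAST_ names could just as well be const variables
filled in by the POST instead of #defines.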

> A standard practice for small systems for decades has been to put the

--------------------------^^^^^^^^^^^^^^

My current projects have very large address spaces (typically
containing various types of discontiguous memory).

> initial stack pointer at the top of the RAM in the initialization

I don't have *a* stack pointer but several (multitasking/multithreaded).

> routine and the stack grows downwards (on most processors). If the RAM
> size is not known in advance, at reset, the address space can be
> scanned for the last R/W location.

My POST tests and sizes the memory(ies). All (contiguous) "free memory"
is placed in the System Heap. Out of this, each task's stack is
allocated. Tasks can also create heaps of their own. The behaviors
of these Heaps are determined by their clients.

Additionally, other "memories" (FLASH, VRAM, DPRAM, etc.) are sized
and individually managed (e.g., the same routines that create and
slice up the "system heap" are applied to DPRAM to partition it
for each consumer/producer, etc.)

The point is, each of these has boundaries. Their particular
characteristics preclude one type of memory from being used as
another (e.g., RAM is inappropriate for use as FLASH; FLASH is
inappropriate for use as DPRAM; etc.). When you run out of one,
it's gone.

Where in the processor's address space each resides can vary
(depends on how I program the address decoder, characteristics
of the processor, etc.). E.g., one "region" could end coincident
with the end of the processor's address space. Or, *begin* at
the *start* of the processor's address space. Or, touch neither
"edge". Or, *both* edges (e.g., imagine an entirely RAM based
implementation).

> The linker will allocate the static R/W data at the beginning of RAM
> and a special label is inserted at the end of this data area (e.g.
> ZZZZZZ if alphabetic program section ordering is used).
>
> The dynamic memory allocator will use this label as the initial
> value of its top_of_the_heap pointer. Dynamic memory allocation
> grows upwards towards the stack.

Where you put the heap(s) and stack(s) is up to the implementation.
E.g., you can site the heap "above" the stack so there is no
possibility of the stack growing down into the heap. You can allocate
a large pseudo-static buffer pool out of the top end of the heap
to effectively limit stack growth (this, by itself, doesn't *enforce*
any limits on stack growth. But, by placing it in this position,
you have *effectively* limited how deep the stack can grow without
corrupting the contents of the buffer pool, etc.)

> When malloc needs to obtain new memory from RAM, it temporarily
> adds the size requested (plus overhead) to the current top_of_the_heap
> pointer and then compares it to the current hardware stack pointer
> (with some free space left for ISR register saves etc.). If no more
> free memory is available, the top_of_the_heap is not moved and malloc
> returns NULL. This must be done in an assembler routine.

You're thinking of sbrk().

I manage memory in "heaps". A heap has defined borders (though the
size of the heap may not be known at compile time). Memory *within*
a heap (there can be many) can be parceled out in various sized
fragments -- the constraints placed on the sizes offered and how
they are "withdrawn" from the heap are determined by the consumer
(e.g., all requests can be forced to a particular size; requests
can be forced to choose from among existing "free fragments" within
the heap; requests can be fine-grained, etc.).
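
A hedged sketch of what a bounded heap along those lines might look
like; the field names, the policy hook and the containment test are
invented for illustration, not the actual interface:

#include <stddef.h>
#include <stdint.h>

/* A heap is a bounded region (FIRST_/LAST_ semantics carried into a
 * run-time object) plus a policy hook that constrains how fragments
 * are selected and sized.                                            */
typedef struct heap {
    uintptr_t first;                    /* first byte belonging to the heap */
    uintptr_t last;                     /* last byte belonging to the heap  */
    void *(*select)(struct heap *, size_t);  /* fragment-selection policy   */
} heap_t;

/* A candidate fragment must lie wholly within [first, last].          */
static int fragment_in_heap(const heap_t *h, uintptr_t frag, size_t size)
{
    return frag >= h->first
        && size <= (h->last - h->first) + 1
        && frag - h->first <= ((h->last - h->first) + 1) - size;
}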

In terms of your example, the system heap is created at POST
and passed to the RTOS. It uses this to create stacks for each
(dynamically created) task -- as well as any heaps that those
tasks require. Any memory needs that the RTOS itself has are
similarly addressed from heap(s) owned by the RTOS. (e.g.,
message queues are carved out of a heap set aside for that
purpose -- as are other large structures like the interrupt
vector table, etc.)

Tasks are free to create heaps to fit their own needs (as long
as they stay within their total memory allocation -- so the
system doesn't run out of backing store). The policies applied
to those heaps determine how memory within them is allocated
and managed (i.e., where, physically, requests are satisfied;
how they are interpreted; coordination of access among multiple
consumers; etc.).

> Each time a new function is entered, it allocates a new stack frame.
> Before doing this allocation, it should check that the new tentative
> hardware stack pointer is not lower than the top_of_the_heap pointer.
> If it is, it should do something sensible, such as resetting the system
> to avoid any further damage.

This varies with applications. Note that, for the most part,
this is a "/* CAN'T HAPPEN */" scenario -- memory has been
sized and algorithms designed such that you never exhaust
a resource. "That's a bug". I carefully design my algorithms
so I know the extent of stack penetration for each, limits on
how deep recursive algorithms plunge, etc.

At run-time (production code), I omit the overhead of trying to
check the current stack depth at each function invocation.
Instead, I set up a fence at the edge of the stack region
(recall, I don't use the sbrk() notion of partitioning
memory into heap vs. stack -- so the stack has limits just
like a heap!) and check that fence's integrity at each task
switch. This is an option specified when the task is
created -- just like how to treat its use of the FPU, etc.
So, I can turn it on for all tasks during development
and, as each task is debugged, omit this extra testing on
a per-task basis. The testing should *not* be necessary
once the system is debugged/released.
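
A minimal sketch of that kind of fence, assuming the stack occupies a
bounded region whose limit is known when the task is created; the
pattern value, the fence size and the function names are illustrative:

#include <stdint.h>

#define FENCE_WORDS    8u
#define FENCE_PATTERN  0xDEADBEEFu      /* illustrative sentinel value */

/* Paint the fence at the limit of the task's stack region at creation. */
void stack_fence_init(uint32_t *stack_limit)
{
    for (unsigned i = 0; i < FENCE_WORDS; i++)
        stack_limit[i] = FENCE_PATTERN;
}

/* Checked at each task switch (only for tasks created with the option
 * enabled): a disturbed pattern means the stack penetrated its limit.  */
int stack_fence_intact(const uint32_t *stack_limit)
{
    for (unsigned i = 0; i < FENCE_WORDS; i++)
        if (stack_limit[i] != FENCE_PATTERN)
            return 0;
    return 1;
}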

> This can be easily implemented: if a separate preamble coroutine is
> used (e.g. for register save/restore), inserting this check (in
> assembler) is easy. Alternatively, the compiler should generate this
> check.
>
> If this is not possible, an assembler routine should be called
> immediately at subfunction entry and if overlap is detected, the
> system should restart immediately, since the heap is already
> corrupted.
>
> When malloc() and free() routines are used, the pool may become
> fragmented, causing the top_of_the_heap pointer to creep upwards,
> finally hitting the stack.

Again, I don't have that problem as each heap is a bounded
entity. You can exhaust *a* heap without impacting anything else
in the system (what you *do* in that condition is up to you...).

I can create the behavior you describe by treating that
heap+stack region as a single region; initializing the SP
at one end of it; and specifying a fragment-selection
criterion that forces allocations to be satisfied with a
bias towards the *other* end. But, this isn't usually
necessary in a thread (use lots of threads and keep their
roles simple and well-defined; spawn and kill them freely
to suit your application's needs).

> Thus, you cannot rely on static memory tests alone: if free() calls
> are also used, the memory checks in assembler need to be done in
> production code as well (and not only in debug code).

But none of this addresses my "how do you label memory boundaries"
question... ;-) (I didn't think this would be that hard of
a question! :< )