From: David Brown on
John Speth wrote:
>> I'm working on code that requires the use of large number of global
>> variables. To restrict access I'm thinking of making the globals read only
>> in all source files bar the file that writes it.
>>
>> The construct I would like to implement uses two declarations for one
>> variable definition, I know this does not comply with MISRA guidelines,
>> but can anyone see any real problems with the following code
>> where the variable is written in sa.c but requires read access only in
>> sb.c.
>
> I'm no compiler expert but I wonder if implementing your plan might result
> in this problem:
>
> If you declare a label const, some tool chains would place the data in the
> code (RO) section. Whereas, if you declare it with const absent, the same
> tool chains would place the data in the data (RW) section. I would *expect*
> a high quality linker to complain of the duality of the label and fail to
> produce an output if two modules refer to the same label but with different
> section attributes. So I think it's more than just compiler semantics -
> it's also about locating data in memory.
>
> Compiler experts: Am I on track with my answer?
>

The "extern" declarations don't place data anywhere, they merely tell
the compiler to reference it. As long as the actual definition is not
"const", the variable will go in read-write memory.

It is legal to cast a non-const variable to a const (but not vice versa,
obviously). However, there are some compilers (older versions of
ImageCraft's AVR compiler spring to mind - their latest version is
different) which are non-conforming in that they use "const" to indicate
that data is in flash, and use the AVR's flash-access opcodes to read
"const" data. Thus a cast to a const will *not* work there. That's
because of non-standard "const" behaviour in those compilers, and their
later versions have standard "const" behaviour - but it is worth
checking the compilers you use to be sure.
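
As a minimal sketch of the legal direction of that conversion, a
writable object can be read through a pointer-to-const without ever
being defined "const". The names "sampleCount", "sampleCountView",
"recordSample" and "readThroughConstView" are invented for illustration
and do not come from the original poster's code:

```c
#include <assert.h>

/* Hypothetical module state: defined WITHOUT const, so it is placed in
   read-write memory. */
static int sampleCount = 0;

/* Read-only view of the same object. Adding const through a pointer is
   always permitted; removing it is what needs a cast. */
static const int *const sampleCountView = &sampleCount;

/* The owning code writes through the real, non-const name. */
static void recordSample(void)
{
    sampleCount++;
}

/* Reading code goes through the const view. A line such as
   "*sampleCountView = 0;" would be rejected at compile time. */
static int readThroughConstView(void)
{
    recordSample();
    recordSample();
    assert(*sampleCountView == sampleCount);
    return *sampleCountView;
}
```

On a conforming compiler both names refer to the same RAM location. On
a compiler that treats "const" as "lives in flash", as described above,
the read through the view could fetch from the wrong address space, so
this is only a sketch for standard "const" behaviour.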

However, legal or not, such hacks are ugly. They are an indication of
badly structured or badly modularised code, or a failure to define and
follow rules for properly accessing resources across modules. It would
be better to think about how the code is to be structured, whether you
can define and apply access rules by other means, and perhaps whether you
really need all these global variables.

A golden rule about include files is that every time the file is
included within a project, it generates the same code (or declarations).
If this is not the case, you should have an incredibly good reason for
it. A second golden rule is that all #include's should be at the top of
the header or C file they are used in, and there should be nothing
except comments (and the old "#ifndef __header_h\n #define __header_h"
lines at the start of every header) before the #include's. A third rule
is that it should not matter which order #include's are given. This
hack breaks all these rules.

Global variables are not in themselves a bad thing (some people will
tell you otherwise - these are mostly people who live in a world where
processors are big, memory is cheap, and pointer access is small and
fast). But uncontrolled access to globals can lead to disaster - as can
any uncontrolled access to shared resources. You should know what code
accesses shared resources at different times, so that you can avoid
conflicts - with too many globals, it's hard to keep track.

Access functions can be costly in terms of time and space overhead.
Sometimes it is worth it, sometimes not. It is worth noting that it is
often possible to use "static inline" access functions - these typically
cook down to zero overhead. You can even use code such as :

static inline int getGlobalVar(void) {
    extern int globalVar;
    return globalVar;
}

in a header to give you a zero overhead, read-only access to
"globalVar", while the name "globalVar" itself is hidden from other
modules, and thus cannot be accidentally written. It feels somewhat
"wrong" to have "extern" within a function, but at least it is in the
header file where it belongs. You also retain the error-checking power
you get from the defining module #include'ing its own header.
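
To make the split between header and defining module concrete, here is
a single-file sketch of how the pieces might fit together. The file
names in the comments and the writer function "setGlobalVarFromOwner"
are assumptions added for illustration; only "getGlobalVar" and
"globalVar" come from the text above:

```c
#include <assert.h>

/* ---- would live in module.h (hypothetical file name) ---- */

/* Read-only accessor. The extern declaration has block scope, so the
   name "globalVar" is not visible to code that merely includes the
   header. */
static inline int getGlobalVar(void)
{
    extern int globalVar;
    return globalVar;
}

/* Hypothetical writer, implemented by the owning module. */
void setGlobalVarFromOwner(int value);

/* ---- would live in module.c, after #include "module.h" ---- */

/* The one real definition. Because module.c sees the header first, a
   compiler can diagnose a type mismatch with the extern above. */
int globalVar = 0;

void setGlobalVarFromOwner(int value)
{
    globalVar = value;
}
```

A reader module that includes only the header can call getGlobalVar()
but cannot write "globalVar = 1;" without adding its own extern
declaration, which is at least a visible and deliberate act.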
From: Rich Webb on
On Tue, 17 Jun 2008 22:24:59 +0200, David Brown
<david.brown(a)hesbynett.removethisbit.no> wrote:


>A golden rule about include files is that every time the file is
>included within a project, it generates the same code (or declarations).

I do find that the construct

[inside foo.h]
#ifdef MAIN
#define EXTERN
#else
#define EXTERN extern
#endif

EXTERN unsigned char bar;

[inside foo.c]
#define MAIN
#include "foo.h"

[inside others.c]
#include "foo.h"

helps to keep externs under control, though it breaks the rule above.

--
Rich Webb Norfolk, VA
From: David Brown on
Rich Webb wrote:
> On Tue, 17 Jun 2008 22:24:59 +0200, David Brown
> <david.brown(a)hesbynett.removethisbit.no> wrote:
>
>
>> A golden rule about include files is that every time the file is
>> included within a project, it generates the same code (or declarations).
>
> I do find that the construct
>
> [inside foo.h]
> #ifdef MAIN
> #define EXTERN
> #else
> #define EXTERN extern
> #endif
>
> EXTERN unsigned char bar;
>
> [inside foo.c]
> #define MAIN
> #include "foo.h"
>
> [inside others.c]
> #include "foo.h"
>
> helps to keep externs under control, though it breaks the rule above.
>

That's a construct I've seen many places - in my not so humble opinion,
it's horrible!

C is, in many ways, a terrible language. It is far too flexible about
what it accepts - you need to enforce clear and consistent rules if you
are going to write clear and consistent programs.

One of the fundamentals about modularised programming is that you have a
clear separation between interface and implementation. If you have a
module "module", then "module.h" is the interface, and "module.c" is the
implementation.

Ideally, "module.h" contains only declarations (and appropriate
comments, of course) - it tells users of the module what is offered by
the module, and how to access its functions, data, and structures.
Unless you are using "whole program" compilation, it's often unavoidable
to have a certain amount of code in the header (as either static inline
functions, or equivalent macros), but ideally it contains only things
like typedefs and extern declarations.

In "module.c", you have the implementation. Everything that is only
used locally in "module.c" is declared "static" in its definition -
everything that is exported is defined without "static", and has a
matching "extern" in the header. The compiler will check that your
externs match up properly as you include "module.h" at the start of
"module.c". Even better with gcc, "-Wmissing-declarations" will flag
any non-static function that is defined without a prior declaration.
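
A rough single-file sketch of that layout, reusing the "bar" variable
from the quoted example. The guard macro "MODULE_H", the function
"module_step" and the private variable "stepSize" are made up for
illustration:

```c
#include <assert.h>

/* ---- module.h: interface only ---- */
#ifndef MODULE_H
#define MODULE_H

extern unsigned char bar;        /* exported data */
extern void module_step(void);   /* exported function */

#endif

/* ---- module.c: implementation, begins with #include "module.h" ---- */

/* Used only inside this file, so it is static. */
static unsigned char stepSize = 3;

/* Exported: no "static", and it matches the extern in the header. The
   initialiser sits here, next to the rest of the implementation. */
unsigned char bar = 10;

void module_step(void)
{
    bar = (unsigned char)(bar + stepSize);
}
```

The header expands to the same declarations wherever it is included,
and an initialised variable needs no special treatment, which are two
of the points where the "EXTERN" macro approach tends to get awkward.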

The use of the "EXTERN" hack will give the same compiled code in the end
- but at the cost of breaking modularity, breaking consistency with
extern functions (in a function declaration in a header, the "extern" is
technically redundant - but it is important as a statement of the intent
of the programmer), introducing unnecessary "pretend keywords", losing
consistency when using initialised variables, and failing to have the
actual implementation of the variables in an appropriate place in the
module. Of course, it does save you typing "unsigned char bar;" in
"module.c" as well as "module.h" - but if you are having trouble doing
that, you're probably in the wrong profession.

That's my 2 øre, anyway.

mvh.,

David
From: Hans-Bernhard Bröker on
pb1 wrote:
> I'm working on code that requires the use of large number of global
> variables. To restrict access I'm thinking of making the globals read only
> in all source files bar the file that writes it.

Sorry to burst your bubble, but that's not possible in a (remotely)
well-behaved C program. You would be invoking undefined behaviour by
violating C99 6.2.7p2 (or equivalent clauses in whatever language
definition you're based on).

> The construct I would like to implement uses two declarations for one
> variable definition, I know this does not comply with MISRA guidelines,

MISRA set aside, it's plain and simply invalid C. It may work if you're
rather lucky. Then again, maybe it won't. And if it doesn't, you'll
have nobody but yourself to blame.

> but can anyone see any real problems

You'll be toast if you try that on a platform where 'const' data ends up
in an actually separate memory space. C allows such things; your code
assumes it doesn't.


From: Hans-Bernhard Bröker on
Tim Wescott wrote:

> Declaring data const in one place and not const in another is (AFAIK)
> perfectly kosher according to the ANSI C standard,

At least as of C99, it's not. It violates 6.2.7p2, punishable by
undefined behaviour.

> because the C virtual machine is perfectly Von Neumann.

No, it's not. The C virtual machine, if anything, is a Harvard
architecture. Data and function pointers live in entirely separate worlds.

> C compilers for these processors handle the disparity between ANSI C
> and the processor's peculiarities with varying levels of compliance
> to the ANSI C standard.

There is no such disparity to be handled.

The only issue here with actual Harvard architectures, particularly
embedded ones, is with the quality-of-implementation for const variables
on RAM-starved, not-strictly-Harvard machines. That triggers a desire
to leave consts in CODE-addressed-as-data, rather than copying them to
RAM along with the non-const variables. So either generic pointers get
bogged down by extra machinery for CODE and RAM addressing by the same
pointer, or the language has to be extended.