From: Ian Collins on
Timmy wrote:
> Hi!!
>
> First let me say I'm not really a C/C++ programmer. I can write some
> advanced string code and write calculators etc, very basic stuff.
>
There are as many C/C++ programmers as there are unicorn jockeys, both
being mythical creatures.

> My question is about C/C++ optimization and CFLAGS to make said code
> run as fast as possible.
>
> Here is the deal: My friends and I use source based operating systems
> a.la Gentoo, FreeBSD & LFS Linux from scratch. All of the code of
> these operating systems are compiled with GCC/ -g++ C/C++ Since we
> build the O/S's from source we have control of make.conf and the flags
> that GCC uses to build the code. Normally you start with CFLAGS -O2
> -pipe that is the standard and is considered safe optimizations on all
> three O/S mentioned above. Our goal is to build the fastest O/S programs
> without breaking stuff using insane CFLAGS.
>
You'd do better asking on a Linux or gcc list, where those in the know
can be found.

--
Ian Collins.
From: Jerry Coffin on
In article <20080103231301.1cfae0ea(a)suddenlink.net>, Timmy15(a)Spamnot.com
says...

[ ... ]

> My question was. Will x86 run faster on a smaller code base than it
> will on a larger optimised code base? I know what you're thinking
> "Benchmark it & and see" Do you know how long it takes to build and
> install these operating systems from source? That's why I thought I
> would ask master programmers who write this code their opinion before I
> spend all day building an o/s from source. For all I know the woman was
> yanking my chain.

Optimizing for size is still optimizing, so you're only comparing one
type of optimization to another.

In any case, there's really no answer. Even if you benchmark it and see,
your results are only really meaningful for your particular system
running the load programs you test with. Even a seemingly small change
in the system (e.g. changing to a CPU with a different size of cache)
could change your results. As such, to get results that mean a lot, you
need to carry out tests that do a good job of simulating typical loads
on a wide variety of system configurations.

When you're done, you'll probably find that your system ends up
executing kernel code something like 1% of the time, so optimizing the
kernel makes only a trivial difference in overall speed anyway.

--
Later,
Jerry.

The universe is a figment of its own imagination.
From: Daniel T. on
Timmy <Timmy15(a)Spamnot.com> wrote:
> Ian Collins <ian-news(a)hotmail.com> wrote:
>
> > > My question is about C/C++ optimization and CFLAGS to make said
> > > code run as fast as possible.
> > >
> > > Here is the deal: My friends and I use source based operating
> > > systems a.la Gentoo, FreeBSD & LFS Linux from scratch. All of
> > > the code of these operating systems are compiled with GCC/ -g++
> > > C/C++ Since we build the O/S's from source we have control of
> > > make.conf and the flags that GCC uses to build the code.
> > > Normally you start with CFLAGS -O2 -pipe that is the standard
> > > and is considered safe optimizations on all three O/S mentioned
> > > above. Our goal is to build the fastest O/S programs without
> > > breaking stuff using insane CFLAGS.
> >
> > You'd do better asking on a Linux or gcc list, where those in the
> > know can be found.
>
> There in lies the problem; Those "In The Know" Say you shouldn't
> step outside the norm... That's why I thought I would ask master
> programmers who write this code their opinion...

So because the people who know the answer didn't give you the answer you
wanted, you decided to ask the question in a group devoted to C/C++
beginners? Don't you see a bit of a problem with that?

You are not asking the master programmers who wrote that code by posting
in this group. If you asked in the FreeBSD & LFS Linux groups and they
all said "don't try it" then I suggest you take their advice, unless you
are a master programmer who knows the code (which if you were, you
wouldn't be asking the question.)
From: Jim Langston on
Daniel T. wrote:
> Timmy <Timmy15(a)Spamnot.com> wrote:
>> Ian Collins <ian-news(a)hotmail.com> wrote:
>>
>>>> My question is about C/C++ optimization and CFLAGS to make said
>>>> code run as fast as possible.
>>>>
>>>> Here is the deal: My friends and I use source based operating
>>>> systems a.la Gentoo, FreeBSD & LFS Linux from scratch. All of
>>>> the code of these operating systems are compiled with GCC/ -g++
>>>> C/C++ Since we build the O/S's from source we have control of
>>>> make.conf and the flags that GCC uses to build the code.
>>>> Normally you start with CFLAGS -O2 -pipe that is the standard
>>>> and is considered safe optimizations on all three O/S mentioned
>>>> above. Our goal is to build the fastest O/S programs without
>>>> breaking stuff using insane CFLAGS.
>>>
>>> You'd do better asking on a Linux or gcc list, where those in the
>>> know can be found.
>>
>> There in lies the problem; Those "In The Know" Say you shouldn't
>> step outside the norm... That's why I thought I would ask master
>> programmers who write this code their opinion...
>
> So because the people who know the answer didn't give you the answer
> you wanted, you decided to ask the question in a group devoted to
> C/C++ beginners? Don't you see a bit of a problem with that?
>
> You are not asking the master programmers who wrote that code by
> posting in this group. If you asked in the FreeBSD & LFS Linux groups
> and they all said "don't try it" then I suggest you take their
> advice, unless you are a master programmer who knows the code (which
> if you were, you wouldn't be asking the question.)

The answer is, it depends. Sometimes a smaller executable (optimized for
size) will run faster than a larger executable (optimized for speed), but
there are many variables involved. One of them is physical memory: the more
of it there is, the more of the executable fits in memory at a time without
being paged out, which makes a larger executable affordable.

The only answer is, test. If optimization for speed increases the file size
dramatically and the program no longer fits in physical memory, then the OS
will need to keep swapping in sections of the program as it runs, slowing it
down, since disk IO is one of the main bottlenecks of computers. What size
makes a difference? Like I said, it depends; you'll have to test.

This is why there are so many optimization switches: one size does not fit
all. If there were a set of switches that always produced a faster
executable, the only need for other switches would be extreme cases (for
example, a machine with no hard drive at all, where the program must fit in
memory completely and so would be optimized for size).


--
Jim Langston
tazmaster(a)rocketmail.com


From: Ulrich Eckhardt on
Timmy wrote:
> First let me say I'm not really a C/C++ programmer.

Those two are different languages, the term C/C++ doesn't make much sense.

> My question is about C/C++ optimization and CFLAGS to make said code
> run as fast as possible.

Profile the code.

> Normally you start with CFLAGS -O2 -pipe that is the standard and is
> considered safe optimizations on all three O/S mentioned above. Our
> goal is to build the fastest O/S programs without breaking stuff
> using insane CFLAGS.

There is another one, -march=<cpu> (and its milder relative -mtune=<cpu>),
which tells the compiler to specifically target said CPU. Further, on 32-bit
x86 there is -mregparm=<n>, which controls how many arguments are passed in
registers when calling a function. This can produce faster code, but
binaries built with different settings are not compatible with each other
then.

> She said that the more optimisations used the larger the code base, duh
> like we don't know that. According to her, on x86 boxes CFLAGS -Os Not
> adding any optimisations will run faster than a larger optimized
> code base. She also said that 64bits/larger programing pointers will
> take advantage of the larger optimised code base where large code base
> would slow down x86 boxes.

(Note up front: I'll simplify things a bit here!)

You have various resources in a computer system that affect how fast it will
run. Different tasks use these to different extents. E.g. there is memory
bandwidth, i.e. the access speed of various levels of cache, RAM and even
hard disk. Then there is the CPU pipeline itself, which typically handles
code without branches better than code with branches, because if you don't
know what branch is taken, you first need to finish all opcodes up to the
branch before you can start the next opcode after the branch. IOW, if two
opcodes are independent, the CPU can execute them in parallel but if one
depends on the result of the other it can't[1].

Now, what does that mean for the speed? The speed is limited by both the
pipeline and the memory bandwidth. If one is used completely, the other
might still be partially idle. You can now try to tweak that behaviour by
reducing the amount of stress on one of the two. E.g. unrolling loops will
reduce the number of branches (easier work for the pipeline) but increase
the number of instructions (more work for the memory bus). Using
size-optimised code will reduce the load for the memory bus but increase
the work for the pipeline.

So, whether something is faster or not actually depends both on the CPU and
on the actual task.

Uli

[1] Note: this is not only true for branches but for any computation.
Further, branches are typically the easiest to handle, some CPUs actually
execute both ways of the branch and discard the results of the one that is
eventually not taken.