From: Terje Mathisen
Andrew Reilly wrote:
> [2] I've used simple post-increment addressing modes for the two loads,
> here, but most DSPs can do post decrement, non-unit stride, circular
> constraint and bit-reverse update modes as well. Describing bit-reverse
> addressing in C is something involving a loop all on its own.

The only (?) fast way to handle bit-reversed addressing in C is to avoid
it. :-)

Either you unroll the code sufficiently that you can hardcode the
varying offsets (does FFTW do this?), or you fall back on a separate
fixup stage, which can be pretty fast.
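
To make the "separate fixup stage" concrete, here is a minimal sketch of one way it might look: a single in-place reorder pass run before (or after) the butterflies. The names reverse_bits and bitrev_fixup, and the choice of float data, are my own assumptions for illustration; a real FFT would normally take the reversed index from a precomputed table instead of recomputing it per element.

```c
#include <stddef.h>

/* Hypothetical helper: reverse the low `bits` bits of x. */
static size_t reverse_bits(size_t x, unsigned bits)
{
    size_t r = 0;
    for (unsigned b = 0; b < bits; b++) {
        r = (r << 1) | (x & 1);   /* shift the lowest bit of x into r */
        x >>= 1;
    }
    return r;
}

/* Reorder data[0 .. (1 << bits) - 1] into bit-reversed order, in place.
 * Assumes bits is smaller than the width of size_t. */
static void bitrev_fixup(float *data, unsigned bits)
{
    size_t n = (size_t)1 << bits;
    for (size_t i = 0; i < n; i++) {
        size_t j = reverse_bits(i, bits);
        if (i < j) {              /* swap each pair exactly once */
            float tmp = data[i];
            data[i] = data[j];
            data[j] = tmp;
        }
    }
}
```

The i < j test is what keeps the pass in-place: bit reversal is its own inverse, so every element is either fixed or part of a two-element swap.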

Generating an n-bit bit-reversed lookup table on the fly can be very
efficient, i.e. you'll easily saturate memory write bandwidth, and you
only need to do it once for each size n.
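
As a rough sketch of what I mean by generating the table on the fly, assuming 32-bit entries and a caller-supplied buffer (build_bitrev_table is just an illustrative name): each entry is derived from an earlier one with a shift and an OR, so the loop is essentially one sequential store per index.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill table[0 .. (1 << bits) - 1] so that table[i] is i with its low
 * `bits` bits reversed. Assumes bits <= 31 and that the caller has
 * allocated (1 << bits) entries. */
static void build_bitrev_table(uint32_t *table, unsigned bits)
{
    size_t n = (size_t)1 << bits;
    table[0] = 0;
    for (size_t i = 1; i < n; i++) {
        /* The reversal of i is the reversal of i >> 1 shifted right by
         * one, with the low bit of i moved up to the top position. */
        table[i] = (table[i >> 1] >> 1)
                 | ((uint32_t)(i & 1) << (bits - 1));
    }
}
```

For bits = 3 this fills the table with 0, 4, 2, 6, 1, 5, 3, 7. Since table[i >> 1] was written only a short while earlier, the reads tend to hit cache and the writes are what you end up waiting for.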

Terje
--
- <Terje.Mathisen(a)hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"