From: Terje Mathisen on 5 Feb 2007 13:56

Andrew Reilly wrote:
> [2] I've used simple post-increment addressing modes for the two loads,
> here, but most DSPs can do post decrement, non-unit stride, circular
> constraint and bit-reverse update modes as well. Describing bit-reverse
> addressing in C is a something involving a loop all on its own.

The only (?) fast way to handle bit-reversed addressing in C is to avoid it. :-)

Either you unroll the code sufficiently that you can hardcode the varying offsets (does FFTW do this?), or you fall back on a separate fixup stage, which can be pretty fast.

Generating an n-bit bit-reversed lookup table on the fly can be very efficient, i.e. you'll easily saturate memory write bandwidth, and you only need to do it once for each size n.

Terje

--
- <Terje.Mathisen(a)hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
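
For readers who want something concrete, here is a minimal sketch of the table-plus-fixup approach mentioned in the post. It is not taken from the post or from FFTW; the function names make_bitrev_table and bitrev_permute are hypothetical, and the sketch assumes 1 <= bits <= 31, a table with 1 << bits entries, and float data. The table is built with the recurrence rev(i) = (rev(i >> 1) >> 1) | ((i & 1) << (bits - 1)), so each entry costs one read of an earlier entry and one write.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fill table[0 .. (1 << bits) - 1] so that table[i]
   is i with its low `bits` bits reversed.
   Assumes 1 <= bits <= 31 and that table has room for 1 << bits entries. */
void make_bitrev_table(uint32_t *table, unsigned bits)
{
    size_t n = (size_t)1 << bits;

    table[0] = 0;
    for (size_t i = 1; i < n; i++) {
        /* Reversal of i = reversal of (i >> 1) shifted down one place,
           with the low bit of i moved up to the top position. */
        table[i] = (table[i >> 1] >> 1)
                 | ((uint32_t)(i & 1) << (bits - 1));
    }
}

/* Hypothetical separate fixup stage: reorder data[] in place into
   bit-reversed order. Each pair (i, table[i]) is swapped exactly once. */
void bitrev_permute(float *data, const uint32_t *table, unsigned bits)
{
    size_t n = (size_t)1 << bits;

    for (size_t i = 0; i < n; i++) {
        size_t j = table[i];
        if (i < j) {
            float tmp = data[i];
            data[i] = data[j];
            data[j] = tmp;
        }
    }
}
```

The table only depends on bits, so it can be built once per transform size and reused for every call to the fixup stage.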