Radix-4 Radix-2 ordering [DSP]

Prev: Narrowband Rician fading
Next: phase of FFT

From: Jerry Avins on 19 Jan 2010 17:36

septer012 wrote:
> I am receiving 32bit data packets with 24bit data inside. I will convert
> the 24bit data to 32 bit floating point.
>
> The radix-2/4 methods are what I am trying to learn I have no other reason
> to use them other then the fact that I am using the DSK compiler that came
> with this old development board. And I really dont know how to use it
> (environment), let alone if there is an FFT library function and how it
> works. I barely got it to run and compile and run some code on the board
> (its fairly old and bad).

It's fairly old, and if you don't happen to like the assembly language,
you're not alone. I found it pretty easy to deal with. The bit-reversed
addressing is handy, but you need to so locate the buffer that the base
of the array has as many zero bits in the physical memory as are needed
for the array size. When memory is tight, that can be a problem.

The FFT examples in TMS320C3X User's Guide are in-place algorithms that
I've not studied in detail. Example 11-38 is a real, radix-2 FFT. Do you
have that book?

Jerry
--
Engineering is the art of making what you want from things you can get.
��

From: septer012 on 19 Jan 2010 19:18

>It's fairly old, and if you don't happen to like the assembly language,
>you're not alone. I found it pretty easy to deal with. The bit-reversed
>addressing is handy, but you need to so locate the buffer that the base
>of the array has as many zero bits in the physical memory as are needed
>for the array size. When memory is tight, that can be a problem.
>
>The FFT examples in TMS320C3X User's Guide are in-place algorithms that
>I've not studied in detail. Example 11-38 is a real, radix-2 FFT. Do you

>have that book?
>
>Jerry

I do not have a problem with Assembly. I am not stupid enough to say I
prefer it, but I can write assembly no problem if I sit down and learn some
of the instructions. I however have the ability to concentrate and read
someone elses assembly. I do have the TMS320C3x user guide but dont have
those examples. I found some assembly examples in the application notes
guide.

I guess the first thing I am going to try is to make a N=16 Radix-2 FFT in
C and feed it a premade vector and expand onto Radix-4.

How did you get that value earlier of 262k... Are you saying I wont have
enough memory? I should only need to at most take out 4 data points, do
the Radix-4 on them, and overwrite the old values. So that's 128K of Data
and I can use that same 128K of data to store the calculations as I process
the FFT?

From: Jerry Avins on 19 Jan 2010 20:43

septer012 wrote:
>> It's fairly old, and if you don't happen to like the assembly language,
>> you're not alone. I found it pretty easy to deal with. The bit-reversed
>> addressing is handy, but you need to so locate the buffer that the base
>> of the array has as many zero bits in the physical memory as are needed
>> for the array size. When memory is tight, that can be a problem.
>>
>> The FFT examples in TMS320C3X User's Guide are in-place algorithms that
>> I've not studied in detail. Example 11-38 is a real, radix-2 FFT. Do you
>
>> have that book?
>>
>> Jerry
>
>
> I do not have a problem with Assembly. I am not stupid enough to say I
> prefer it, but I can write assembly no problem if I sit down and learn some
> of the instructions. I however have the ability to concentrate and read
> someone elses assembly. I do have the TMS320C3x user guide but dont have
> those examples. I found some assembly examples in the application notes
> guide.
>
> I guess the first thing I am going to try is to make a N=16 Radix-2 FFT in
> C and feed it a premade vector and expand onto Radix-4.
>
> How did you get that value earlier of 262k... Are you saying I wont have
> enough memory? I should only need to at most take out 4 data points, do
> the Radix-4 on them, and overwrite the old values. So that's 128K of Data
> and I can use that same 128K of data to store the calculations as I process
> the FFT?

You have plenty of memory with an in-place algorithm. The data book I
looked in was SPRU031D, dated 1994. If your TI rep can't get you a copy,
I'd be willing to scan some pages and send them yo you. You can probably
find it on line... I just did:
http://www.ise.pw.edu.pl/dydaktyka/psap/320C3x.pdf

Jerry
--
Engineering is the art of making what you want from things you can get.
��

From: kevin on 20 Jan 2010 00:08

septer012 wrote:

> I am doing 2 FFT's.
> FFT1 is N=1024
> FFT2 is N=131072
> I am using a TMS320VC33.

A critical piece of information is missing: are the FFTs for real
inputs only or complex inputs? For N real inputs only, you can do an
N/2 size complex FFT and manipulate results with very little extra
overhead.

> The SRAM I will have available to me will likely be 256kx32.

If the input is complex valued, 262144 (256x1024) of 32 bit SRAM is
just enough to store the 131072 real results and 131072 imaginary
results of a 131072 FFT, with absolutely no room for storing anything
else (and if youre using that same SRAM to store your program, or any
look-up tables, good luck with that). So perhaps your 131072 FFT must
be for real valued inputs? Otherwise, someone is giving you a nearly
impossible requirement.

> I am not doing the sampling but the data being sent to me is sampled at
> 10khz. What I am looking for is near 5kHz

If what youre looking for exists over a relatively small frequency
range compared to the 0 to Fs/2 FFT range, might you use a few DFTs to
cover the range of interest instead? Depending on the number of DFTs
used, it might be faster than computing all FFT points. And DFTs are
easy to code.

> for the first FFT I believe I have to do a single Radix-4 computation.

I have no idea where this comes from. Suffice to say that are
probably several hundred different ways of doing a 1024 point FFT.
You could do it as a 8x8x8x2 (radix 8 is quite efficient, as it only
requires multiplications by +/-.707.., and you can often do that with
a few adds/subtracts). Or you could do a 1024 as a 32x32 (e.g.: write
an efficient 32 point routine, then use 2 stages of 32 point
butterflies to do the 1024).

> for the second FFT I believe I have to do (8) Radix-4's and (1) Radix-2 to
> cover the number of data points. Is there a specific order I have to
> handle the cascade?

I do hope you realize that when you combine smaller FFTs to make a
bigger FFT, youve got twiddle factor multiplications between all the
stages. Depending on how the algorithm is developed, they can be
either easy or difficult to figure out.

In your first post, you also show a reorder outputs. When doing
mixed radix graphs, the reordering can be difficult if you want to do
it in-place. For instance, a mixed radix 32 point FFT composed of a
radix 8 stage (followed by interstage twiddles) followed by a radix 4
stage will give you an output where only that first and last points
are in the correct place, and everything else is in the wrong place.
Whats more, you cant swap locations like you would normally do to
get everything into the proper sequence.

Jerry made a great find in that TI manual. I scanned through it, and
I believe they actually have a general purpose radix 2 algorithm, and
one for radix 4 (see p.11-73 +). So you might want to start with
that.

Kevin McGee

From: Jerry Avins on 20 Jan 2010 09:07

kevin wrote:
> septer012 wrote:
>
>> I am doing 2 FFT's.
>> FFT1 is N=1024
>> FFT2 is N=131072
>> I am using a TMS320VC33.
>
> A critical piece of information is missing: are the FFTs for real
> inputs only or complex inputs? For N real inputs only, you can do an
> N/2 size complex FFT and manipulate results with very little extra
> overhead.
>
>> The SRAM I will have available to me will likely be 256kx32.
>
> If the input is complex valued, 262144 (256x1024) of 32 bit SRAM is
> just enough to store the 131072 real results and 131072 imaginary
> results of a 131072 FFT, with absolutely no room for storing anything
> else (and if you�re using that same SRAM to store your program, or any
> look-up tables, good luck with that). So perhaps your 131072 FFT must
> be for real valued inputs? Otherwise, someone is giving you a nearly
> impossible requirement.

From what I gather, the inputs are real, but the outputs may be complex.

...

> Jerry made a great find in that TI manual. I scanned through it, and
> I believe they actually have a general purpose radix 2 algorithm, and
> one for radix 4 (see p.11-73 +). So you might want to start with
> that.

Thanks. I have that manual here. Did you see that the FFT assembly
routines have C calling wrappers? Years ago, I bought a 'C32 DSK to see
what DSP was all about. After I sent TI an improvement in a piece of
their published code, they gave me a 'C33 DSK as a thank you.

Jerry
--
Engineering is the art of making what you want from things you can get.
��

First | Prev | Next | Last
Pages: 1 2 3 4
Prev: Narrowband Rician fading
Next: phase of FFT