From: onkars on
Hi, I am a student working with the Xilinx C model of the FFT. Following
are my settings:

Pipelined arch. (this restricts the scaling to be applied only after every
pair of radix-2 butterflies)

Precision of 8 bits.

Twiddle precision 8 bits.

FFT size 1024.


If I use a conservative scaling schedule of [2, 2, 2, 2, 2] (i.e. divide by
4 after every pair of radix-2 butterflies), most of the outputs are 0. Is
this possible -- i.e. are the outputs being scaled down too far?

If I use block floating point -- I get much better results (close to the
floating-point golden outputs).

A loose scaling schedule of [1, 1, 1, 1, 1] (i.e. divide by 2 after every
pair of radix-2 butterflies) causes overflow.
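
(To see the trade-off numerically, a rough sketch like the following can
help -- this is NOT the Xilinx C model and is not bit-exact with it; it is
just a radix-2 DIT FFT that applies one scheduled right shift per pair of
stages and then saturates to the word width. The twiddles stay in floating
point and growth inside a pair is not width-checked, so treat the numbers
as illustrative only.)

import numpy as np

def fft_with_scaling(x, word_bits, pair_shifts):
    # Radix-2 decimation-in-time FFT with one scheduled right shift applied
    # after every pair of stages, followed by saturation to 'word_bits'.
    n = len(x)
    stages = int(np.log2(n))                     # 10 radix-2 stages for n = 1024
    half_limit = 2 ** (word_bits - 1)            # signed range is [-half_limit, half_limit - 1]
    # bit-reversal permutation so the output comes out in natural order
    idx = np.arange(n)
    rev = np.zeros(n, dtype=int)
    for b in range(stages):
        rev |= ((idx >> b) & 1) << (stages - 1 - b)
    a = x[rev].astype(np.complex128)
    overflow = False
    for s in range(stages):
        half = 1 << s
        tw = np.exp(-2j * np.pi * np.arange(half) / (2 * half))
        for start in range(0, n, 2 * half):
            top = a[start:start + half].copy()
            bot = a[start + half:start + 2 * half] * tw
            a[start:start + half] = top + bot
            a[start + half:start + 2 * half] = top - bot
        if s % 2 == 1:                           # one schedule entry per pair of radix-2 stages
            shift = pair_shifts[s // 2]
            re = np.round(a.real / 2 ** shift)
            im = np.round(a.imag / 2 ** shift)
            if re.max() >= half_limit or re.min() < -half_limit or \
               im.max() >= half_limit or im.min() < -half_limit:
                overflow = True
            # saturate instead of wrapping, purely to keep the sketch short
            a = (np.clip(re, -half_limit, half_limit - 1)
                 + 1j * np.clip(im, -half_limit, half_limit - 1))
    return a, overflow

rng = np.random.default_rng(0)
x = rng.integers(-128, 128, 1024) + 1j * rng.integers(-128, 128, 1024)
ref = np.fft.fft(x)
for sched in ([2, 2, 2, 2, 2], [1, 1, 1, 1, 1]):
    y, ovf = fft_with_scaling(x, 8, sched)
    err = ref - y * 2 ** sum(sched)              # undo the scheduled scaling before comparing
    snr = 10 * np.log10(np.sum(np.abs(ref) ** 2) / np.sum(np.abs(err) ** 2))
    zeros = np.count_nonzero((y.real == 0) & (y.imag == 0))
    print(sched, "overflow:", ovf, " zero bins:", zeros, " SNR: %.1f dB" % snr)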


Other than using block floating point OR increasing my precision -- is there
any other way of achieving better results (not so many 0s)? I also want my
design to be protected from overflow under any input conditions (assume
this is general purpose).

Thank you.
From: Tim Wescott on
onkars wrote:
> Hi, I am a student working with the Xilinx C model of the FFT. Following
> are my settings:
>
> Pipelined arch. (this restricts the scaling to be applied only after every
> pair of radix-2 butterflies)
>
> Precision of 8 bits.
>
> Twiddle precision 8 bits.
>
> FFT size 1024.
>
>
> If I use a conservative scaling schedule of [2, 2, 2, 2, 2] (i.e. divide by
> 4 after every pair of radix-2 butterflies), most of the outputs are 0. Is
> this possible -- i.e. are the outputs being scaled down too far?
>
> If I use block floating point -- I get much better results (close to the
> floating-point golden outputs).
>
> A loose scaling schedule of [1, 1, 1, 1, 1] (i.e. divide by 2 after every
> pair of radix-2 butterflies) causes overflow.
>
>
> Other than using block floating point OR increasing my precision -- is there
> any other way of achieving better results (not so many 0s)? I also want my
> design to be protected from overflow under any input conditions (assume
> this is general purpose).
>
> Thank you.

I simply don't see, for all but some predefined signal with known
characteristics, how a 1024 bin FFT is going to benefit you in any way
with such low precision. As a rule of thumb, to catch everything that's
going on you should have an output precision that's 10 bits deeper than
your input -- so 8 bits out is just insufficient, any way you slice it.

--
Tim Wescott
Control system and signal processing consulting
www.wescottdesign.com
From: onkars on
>onkars wrote:
>> Hi, I am a student working with the Xilinx C model of the FFT. Following
>> are my settings:
>>
>> Pipelined arch. (this restricts the scaling to be applied only after every
>> pair of radix-2 butterflies)
>>
>> Precision of 8 bits.
>>
>> Twiddle precision 8 bits.
>>
>> FFT size 1024.
>>
>>
>> If I use a conservative scaling schedule of [2, 2, 2, 2, 2] (i.e. divide by
>> 4 after every pair of radix-2 butterflies), most of the outputs are 0. Is
>> this possible -- i.e. are the outputs being scaled down too far?
>>
>> If I use block floating point -- I get much better results (close to the
>> floating-point golden outputs).
>>
>> A loose scaling schedule of [1, 1, 1, 1, 1] (i.e. divide by 2 after every
>> pair of radix-2 butterflies) causes overflow.
>>
>>
>> Other than using block floating point OR increasing my precision -- is
>> there any other way of achieving better results (not so many 0s)? I also
>> want my design to be protected from overflow under any input conditions
>> (assume this is general purpose).
>>
>> Thank you.
>
>I simply don't see, for all but some predefined signal with known
>characteristics, how a 1024 bin FFT is going to benefit you in any way
>with such low precision. As a rule of thumb, to catch everything that's
>going on you should have an output precision that's 10 bits deeper than
>your input -- so 8 bits out is just insufficient, any way you slice it.
>
>--
>Tim Wescott
>Control system and signal processing consulting
>www.wescottdesign.com



Actually I use a randomly generated (uniformly distributed) input that
gives me pretty good outputs (an SNR of 39 dB) when I use block floating
point with a precision of 8 bits.
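
(The SNR number is measured against the floating-point golden output in the
usual way; assuming numpy, and with "golden" and "fixed" standing in for the
two output vectors, something like:)

import numpy as np

def snr_db(golden, fixed):
    # ratio of golden-signal power to error power, in dB
    err = golden - fixed
    return 10 * np.log10(np.sum(np.abs(golden) ** 2) / np.sum(np.abs(err) ** 2))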

From: Steve Pope on
onkars <onkar.sarode(a)n_o_s_p_a_m.gmail.com> wrote:

> (Tim wrote:)

>>onkars wrote:

>>> Hi, I am a student working with the Xilinx C model of the FFT.

[snip precision discussion]

>>I simply don't see, for all but some predefined signal with known
>>characteristics, how a 1024 bin FFT is going to benefit you in any way
>>with such low precision. As a rule of thumb, to catch everything that's
>>going on you should have an output precision that's 10 bits deeper than
>>your input -- so 8 bits out is just insufficient, any way you slice it.

>Actually I use a randomly generated (uniformly distributed) input that
>gives me pretty good outputs (an SNR of 39 dB) when I use block floating
>point with a precision of 8 bits.

That's a good way to do it.

In more detail, I would follow a procedure along the lines of
the following:

(1) Implement the FFT in high precision, such as double-precision
floating point.

(2) Run through this a series of randomly-generated test cases, at
different RMS levels over the dynamic range of interest. Save
the resulting inputs and outputs as test vectors.

(3) Implement the FFT at the designed target precisions --
input, internal, and output.

(4) Run the same vectors through this fixed-point version, and
compare its output to that of the full-precision version.
From this, generate a plot of RMS error vs. input level, and
determine if it meets your requirements.

If it does not, revise the precision and go back to step (3)
and try again.
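
A minimal sketch of that loop, using numpy's double-precision FFT as the
golden model and a placeholder quantizer standing in for whatever fixed-point
FFT is actually under test, might look like this:

import numpy as np

def fixed_point_fft_under_test(x):
    # placeholder for step (3): here we just quantize the double-precision
    # result to 8 fractional bits to stand in for a real fixed-point model
    return np.round(np.fft.fft(x) * 256.0) / 256.0

n = 1024
rng = np.random.default_rng(0)
for level_db in range(-60, 1, 6):                # RMS input levels over the range of interest
    rms = 10.0 ** (level_db / 20.0)
    x = rms * (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
    golden = np.fft.fft(x)                       # steps (1)-(2): high-precision reference
    # (in a real flow you would save x and golden to files as test vectors)
    fixed = fixed_point_fft_under_test(x)        # steps (3)-(4): quantized version
    rms_err = np.sqrt(np.mean(np.abs(golden - fixed) ** 2))
    print("input %6d dBFS   RMS error %.3e" % (level_db, rms_err))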

(What you will probably find is that you need to keep around 4 to
6 underflow bits at each internal stage, and that your output
needs to have at least two bits more precision than your input.
But it depends on the details of both your requirements and your
implementation.)

Steve
From: onkars on
>onkars <onkar.sarode(a)n_o_s_p_a_m.gmail.com> wrote:
>
>> (Tim wrote:)
>
>>>onkars wrote:
>
>>>> Hi, I am a student working with the Xilinx C model of the FFT.
>
>[snip precision discussion]
>
>>>I simply don't see, for all but some predefined signal with known
>>>characteristics, how a 1024 bin FFT is going to benefit you in any way
>>>with such low precision. As a rule of thumb, to catch everything that's
>>>going on you should have an output precision that's 10 bits deeper than
>>>your input -- so 8 bits out is just insufficient, any way you slice it.
>
>>Actually I use a randomly generated (uniformly distributed) input that
>>gives me pretty good outputs (an SNR of 39 dB) when I use block floating
>>point with a precision of 8 bits.
>
>That's a good way to do it.
>
>In more detail, I would follow a procedure along the lines of
>the following:
>
>(1) Implement the FFT in high precision, such as double-precision
>floating point.
>
>(2) Run through this a series of randomly-generated test cases, at
>different RMS levels over the dynamic range of interest. Save
>the resulting inputs and outputs as test vectors.
>
>(3) Implement the FFT at the designed target precisions --
>input, internal, and output.
>
>(4) Run the same vectors through this fixed-point version, and
>compare its output to that of the full-precision version.
>From this, generate a plot of RMS error vs. input level, and
>determine if it meets your requirements.
>
>If it does not, revise the precision and go back to step (3)
>and try again.
>
>(What you will probably find is that you need to keep around 4 to
>6 underflow bits at each internal stage, and that your output
>needs to have at least two bits more precision than your input.
>But it depends on the details of both your requirements and your
>implementation.)
>
>Steve


@Steve .. thank you for the response.

I am sorry, but I don't understand what you mean by "keep around 4 to 6
underflow bits at each internal stage".