From: Vladimir Vassilevsky on


_tomislav wrote:

> Fastfloat16 can't fit into 16 bits - format provides one 16-bit word for
> exponent and one 16-bit word for fraction.

BTW, there is a 16-bit short float format (natively supported by the
SHARC DSP, for example), which should not be confused with what you
described.

> Fastfloat16 gives a good precision and range.

Who in the hell would need 16 bits for an exponent?

> Let's assume that well
> optimized FFT algorithm has a complexity of O(Nlog10).

The log10 of what?

There is no benefit in using true floating point. You will get the same
precision with a block floating point integer FFT, and it is going to be
faster. This is a standard thing to do.
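
For what it's worth, a minimal sketch of the scaling step that makes it
"block" floating point (plain C99, int16_t data; the function name and
the headroom choice are illustrative, not from any particular library):
before each butterfly pass, check the block's peak magnitude, shift the
whole block right if the next pass could overflow, and accumulate the
shift count in one exponent shared by the block.

#include <stdint.h>
#include <stddef.h>

/* One block-floating-point scaling pass, as it might run between FFT
   stages on a 16-bit fixed-point machine. */
static void bfp_scale_stage(int16_t *re, int16_t *im, size_t n,
                            int *block_exp)
{
    /* Find the largest magnitude in the block. */
    int32_t max = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t ar = re[i] < 0 ? -(int32_t)re[i] : re[i];
        int32_t ai = im[i] < 0 ? -(int32_t)im[i] : im[i];
        if (ar > max) max = ar;
        if (ai > max) max = ai;
    }

    /* A radix-2 butterfly can grow the real/imaginary parts by a bit
       more than 2x, so keep two bits of headroom for the next stage. */
    int shift = 0;
    while ((max >> shift) > (INT16_MAX / 4) && shift < 15)
        shift++;

    if (shift > 0) {
        for (size_t i = 0; i < n; i++) {
            re[i] >>= shift;   /* assumes arithmetic right shift,  */
            im[i] >>= shift;   /* which DSP compilers provide      */
        }
        *block_exp += shift;   /* one exponent shared by the block */
    }
}

After the last stage the spectrum is the int16_t output times
2^block_exp in whatever Q format the input was, so you keep the full
16-bit mantissa while the shared exponent absorbs the dynamic range.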

> On a fixed point DSP

Yes, on a fixed point DSP.

> like BF532 16-bit multiplication/addition is completed in one cycle. So, N
> is equal to cycle number.

Dream on. There is a lot of other overhead.

> If fastfloat16 multiplication needs, lets say, 30
> cycles in average for multiplication/addition I think this outperforms
> floating point format on fixed point machine.

What exactly are you trying to accomplish?



Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
From: _tomislav on
Tim, Vladimir

thank you for your replies; they have been very helpful. This is my
first step into the DSP world, so I have a lot to learn.

I was looking for a fast implementation of a 16-bit precision FFT
algorithm. I have read an article describing fastfloat16 on the Blackfin
processor. The proposed format seemed to be OK for my project, but now I
think that block floating point is a much better option.

I'll use an ADI BF532 for the signal processing. I found an article
describing a block floating point FFT implementation on the TMS320C55x.
I'll study the article to see how fast a block floating point FFT really
is. My only concern is the precision of block floating point when the
numbers have a wide dynamic range.


From: Steven G. Johnson on
On Oct 20, 11:09 am, Tim Wescott <t...(a)seemywebsite.com> wrote:
> I would assume that for a well-scaled fixed point algorithm you'd want at
> least your input precision, plus log2 of your number of bins, plus two or
> three bits for good measure -- that'll add up to 16 pretty darn quick,
> much less whatever is available for a mantissa in fastfloat16.

I don't think this is correct. One fact which is often overlooked is
that a properly implemented floating-point FFT accumulates roundoff
error far, far more slowly than a fixed-point FFT.

In fixed point, the roundoff error growth for a Cooley-Tukey FFT is
O(sqrt(N)), which is why you need to add ~ log2(N)/2 bits of precision
as N grows. In floating point, the rms roundoff error growth is
O(sqrt(log N)), which is so slow that in practice it can usually be
treated as independent of N.

For example, a single-precision floating-point FFT can achieve nearly
machine-precision accuracy (~7 decimal places) even for N=2^24 (see
http://fftw.org/accuracy/CoreDuo-3.0GHz-icc64/).
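
If you want to see this for yourself, a rough way to measure it is to
compare a single-precision FFT against a double-precision FFT of the
same input and take the rms relative error. The sketch below does that
with FFTW (both precisions); it is not the exact methodology of the page
above, which uses an arbitrary-precision reference, but it shows the
same trend.

/* Build with something like: cc err.c -lfftw3 -lfftw3f -lm
   (assumes both the double- and single-precision FFTW libraries). */
#include <complex.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <fftw3.h>

int main(void)
{
    const int n = 1 << 16;
    fftw_complex  *xd = fftw_malloc(sizeof(fftw_complex)  * n);
    fftw_complex  *yd = fftw_malloc(sizeof(fftw_complex)  * n);
    fftwf_complex *xs = fftwf_malloc(sizeof(fftwf_complex) * n);
    fftwf_complex *ys = fftwf_malloc(sizeof(fftwf_complex) * n);

    fftw_plan  pd = fftw_plan_dft_1d(n, xd, yd, FFTW_FORWARD, FFTW_ESTIMATE);
    fftwf_plan ps = fftwf_plan_dft_1d(n, xs, ys, FFTW_FORWARD, FFTW_ESTIMATE);

    for (int i = 0; i < n; i++) {   /* same random input, two precisions */
        double re = 2.0 * rand() / RAND_MAX - 1.0;
        double im = 2.0 * rand() / RAND_MAX - 1.0;
        xd[i] = re + I * im;
        xs[i] = (float)re + I * (float)im;
    }

    fftw_execute(pd);
    fftwf_execute(ps);

    double err2 = 0.0, ref2 = 0.0;
    for (int i = 0; i < n; i++) {
        double dre = creal(yd[i]) - crealf(ys[i]);
        double dim = cimag(yd[i]) - cimagf(ys[i]);
        err2 += dre * dre + dim * dim;
        ref2 += creal(yd[i]) * creal(yd[i]) + cimag(yd[i]) * cimag(yd[i]);
    }
    printf("rms relative error for N=%d: %g\n", n, sqrt(err2 / ref2));

    fftw_destroy_plan(pd);  fftwf_destroy_plan(ps);
    fftw_free(xd);  fftw_free(yd);  fftwf_free(xs);  fftwf_free(ys);
    return 0;
}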

Whether this is worth it in the original poster's case I cannot say,
but the key point is that a floating-point FFT can get away with many
fewer bits of precision than a fixed-point FFT of the same size, so
your intuition here seems wrong to me. Although I've never tried 16-
bit floating-point, I would guess that a well-implemented 16-bit FFT
is likely to achieve around 3 decimal digits of root-mean-square
relative accuracy (i.e. nearly machine precision, assuming a 12-bit
significand) for any N that is feasible on a DSP chip.
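
(As a rough check on that guess: a 12-bit significand spaces neighboring
values by 2^-12 ~ 2.4e-4, i.e. about 3.6 decimal digits, so ~3 digits of
rms relative accuracy is roughly what near-machine precision looks like
at that word size.)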

Regards,
Steven G. Johnson