|
Prev: Is 44.1 KHz sample-rate enough to cover the entire human hearing range?
Next: Is 44.1 KHz sample-rate enough to cover the entire human hearing range?
From: Steven G. Johnson on 6 May 2008 20:11 On May 6, 3:12 am, glen herrmannsfeldt <g...(a)ugcs.caltech.edu> wrote: > > Cooley-Tukey. It is O(N log N) for all N. > > An interesting statement. O() notation gives the order as > N goes to infinity, and it may or may not be of any use > for small N. Now, is infinity prime or composite? If you look at the definition of asymptotic (O) notation, you'll see that there's no need to say whether "infinity" is prime or composite. In practice, the O(N log N) prime-size algorithms already have clear advantage for N > 100. In principle, you usually have enough freedom to choose your transform to be a composite size, but many people apparently find it useful to not have to worry about resizing their data, without being afraid that if they are unlucky the FFT will be thousands of times slower (as opposed to just a factor of 10). Steven
From: Steven G. Johnson on 6 May 2008 20:23 On May 5, 12:02 pm, "markt" <tak...(a)pericle.com> wrote: > Yeah, as you note, even "optimal" in terms of the least amount of FLOPS is > not necessarily "optimal" in terms of speed. And even optimality in terms of flop counts is not proven; no one knows tight lower bounds. There isn't even any proof that you can't do better than O(N log N) for an FFT (although no counterexamples are known). > without reference to some criteria. Set your minimum FLOP count algorithm > up in such a way that your next instruction is always waiting for > completion of the current instruction (a pipeline stall) and you'll find it > may not work very well compared to another algorithm with more required > FLOPS, but a better overall flow. Yes, especially for large N I'm not aware of any implementation of the minimal-known-flop FFT algorithm(s) that is competitive in practice with highly optimized implementations. You just have too many other things to worry about that are more important than arithmetic these days. On the other hand, to be fair, I doubt that the highly optimized implementations sacrifice more than 10-15% in the flop count compared to the lowest known flop counts. On the other hand, for small hardcoded transforms of sizes < 100 or so, we have usually found that we do best with the algorithms that come close to the minimum known flop counts. (This relies on the ability to find a good schedule for the code, of course, but we have a generic algorithm for this that seems to work well.) (Sometimes people talk about a "crossover point" at which the O(N log N) algorithms beat the naive O(N^2) algorithm, but in my experience this is mostly about overheads of implementations. For small N you really want to hard-code everything to avoid the overhead, and in that case, we've usually found that a hard-coded O(N log N) algorithm wins over the O(N^2) algorithm for all composite N starting with N=4.) Steven
From: Philip Martel on 6 May 2008 21:32 "glen herrmannsfeldt" <gah(a)ugcs.caltech.edu> wrote in message news:8JCdncXYrN0Anr3VnZ2dnUVZ_oimnZ2d(a)comcast.com... > Steven G. Johnson wrote: > (snip) > >> Short answer: FFTW uses the prime-factor algorithm for small sizes (<= >> 16) within its codelets at the leaves of its recursion. Larger >> composite sizes are broken down into smaller sizes using mixed-radix >> Cooley-Tukey. It is O(N log N) for all N. > > An interesting statement. O() notation gives the order as > N goes to infinity, and it may or may not be of any use > for small N. Now, is infinity prime or composite? Yes. > > -- glen >
From: markt on 5 May 2008 12:02 >(although there is no proof that any of these >algorithms are optimal). Yeah, as you note, even "optimal" in terms of the least amount of FLOPS is not necessarily "optimal" in terms of speed. I had a professor once that used to get quite ticked every time he heard somebody mention "optimal" without reference to some criteria. Set your minimum FLOP count algorithm up in such a way that your next instruction is always waiting for completion of the current instruction (a pipeline stall) and you'll find it may not work very well compared to another algorithm with more required FLOPS, but a better overall flow. Mark
From: Jerry Avins on 6 May 2008 16:15
glen herrmannsfeldt wrote: > ... Now, is infinity prime or composite? :-) ! Jerry -- Engineering is the art of making what you want from things you can get. ����������������������������������������������������������������������� |