|
From: glowkeeper on 23 Apr 2008 10:17 This is heavily related to this post: http://www.dsprelated.com/showmessage/89637/1.php, but since that's a little old now, I thought I'd start a new thread ;) I am trying to use the Harmonic Product Spectrum method to guess the fundamental frequency from an input to a microphone. It's working 50% of the time, but the other 50% it's picking out harmonics other than the fundamental. I wonder if someone could help explain as to why, and perhaps suggest where I might be going wrong or could improve the algorithm outlined below. I'm using fftw to give me a fourier transform full of reals from the microphone input. From that I calculate a power spectrum by taking the absolute value of each bin of the fft. I then 2, 3 and 4x downsample the power spectrum into different arrays by taking the 2nd, 3rd and 4th output of the power spectrum respectively. Next, I multiply each bin of the power, 2x downsampled, 3x downsampled and 4x downsampled spectrums to produce a Harmonic Product Spectrum. Finally, I search for the bin of the greatest magnitude and calculate it's frequency by doing bin * ( sampleRate / fftsize ). fyi, the sample rate of my mic' is 11khz, and I'm using an fft size of 2k. Therefore each bin of my fft represents ~5Hz. Any help and suggestions greatly appreciated! Steve
From: Ron N. on 23 Apr 2008 13:48 On Apr 23, 6:17 am, "glowkeeper" <glowkee...(a)yahoo.co.uk> wrote: > This is heavily related to this post:http://www.dsprelated.com/showmessage/89637/1.php, but since that's a > little old now, I thought I'd start a new thread ;) > > I am trying to use the Harmonic Product Spectrum method to guess the > fundamental frequency from an input to a microphone. It's working 50% of > the time, but the other 50% it's picking out harmonics other than the > fundamental. I wonder if someone could help explain as to why, and perhaps > suggest where I might be going wrong or could improve the algorithm > outlined below. Have you examined the spectrum of those fft frames where you think your method is picking out harmonics other than the fundamental? > I'm using fftw to give me a fourier transform full of reals from the > microphone input. From that I calculate a power spectrum by taking the > absolute value of each bin of the fft. How are you taking the absolute value of each complex fft bin? > I then 2, 3 and 4x downsample the power spectrum into different arrays by > taking the 2nd, 3rd and 4th output of the power spectrum respectively. > > Next, I multiply each bin of the power, 2x downsampled, 3x downsampled and > 4x downsampled spectrums to produce a Harmonic Product Spectrum. > > Finally, I search for the bin of the greatest magnitude and calculate it's > frequency by doing bin * ( sampleRate / fftsize ). > > fyi, the sample rate of my mic' is 11khz, and I'm using an fft size of 2k. > Therefore each bin of my fft represents ~5Hz. If you downsample your fft, shouldn't that change the bin width? Or are you throwing away spectral energy? That loss of information could certainly effect the results. IMHO. YMMV. -- rhn A.T nicholson d.0.t C-o-M http://www.nicholson.com/rhn/dsp.html
From: glowkeeper on 24 Apr 2008 03:33 >On Apr 23, 6:17 am, "glowkeeper" <glowkee...(a)yahoo.co.uk> wrote: >> This is heavily related to this post:http://www.dsprelated.com/showmessage/89637/1.php, but since that's a >> little old now, I thought I'd start a new thread ;) >> >> I am trying to use the Harmonic Product Spectrum method to guess the >> fundamental frequency from an input to a microphone. It's working 50% of >> the time, but the other 50% it's picking out harmonics other than the >> fundamental. I wonder if someone could help explain as to why, and perhaps >> suggest where I might be going wrong or could improve the algorithm >> outlined below. > >Have you examined the spectrum of those fft frames where >you think your method is picking out harmonics other than >the fundamental? Not exactly. You're right, I will examine the data a little closer. >> I'm using fftw to give me a fourier transform full of reals from the >> microphone input. From that I calculate a power spectrum by taking the >> absolute value of each bin of the fft. > >How are you taking the absolute value of each complex fft >bin? The app' is c++, and I'm using the standard c api call: fabs( fft[binNumber ] ); I suppose I could also do something like taking the square root of the square, but that seems expensive (although granted, I don't know what fabs is doing under the hood). >> I then 2, 3 and 4x downsample the power spectrum into different arrays by >> taking the 2nd, 3rd and 4th output of the power spectrum respectively. >> >> Next, I multiply each bin of the power, 2x downsampled, 3x downsampled and >> 4x downsampled spectrums to produce a Harmonic Product Spectrum. >> >> Finally, I search for the bin of the greatest magnitude and calculate it's >> frequency by doing bin * ( sampleRate / fftsize ). >> >> fyi, the sample rate of my mic' is 11khz, and I'm using an fft size of 2k. >> Therefore each bin of my fft represents ~5Hz. > >If you downsample your fft, shouldn't that change the bin >width? Or are you throwing away spectral energy? That >loss of information could certainly effect the results. Ah, now this was something I wasn't sure of to be honest. Maybe? I output my product to an array the same size as the fft, and therefore I thought the bin width was the same. But I think you're suggestingt that my 2x bin width is ~5Hz/2, 3x ~5Hz/3, 4x ~5Hz/4, and therefore when I find the bin of greatest magnitude in the harmonic product spectrum, I need to factor this into my equation and not just bin * ( sampleRate / fftsize ), right? Thanks for your help.
From: Steven G. Johnson on 24 Apr 2008 11:45 On Apr 24, 3:33 am, "glowkeeper" <glowkee...(a)yahoo.co.uk> wrote: > The app' is c++, and I'm using the standard c api call: > > fabs( fft[binNumber ] ); > > I suppose I could also do something like taking the square root of the > square, but that seems expensive (although granted, I don't know what fabs > is doing under the hood). fabs is the absolute value of a real number. You have to use sqrt(re*re + im*im) instead, or hypot(re, im) on Unix (which is more accurate), or abs(std::complex<double>(re, im)) in C++, or .... where re and im are the real and imaginary parts, respectively. Regards, Steven G. Johnson
From: markt on 24 Apr 2008 11:56 >I suppose I could also do something like taking the square root of the >square, but that seems expensive (although granted, I don't know what fabs >is doing under the hood). First of all, your Fourier coefficients should be complex, even if the input data are real. From FFTW's website, the planning function is: fftw_plan fftw_plan_dft_r2c_1d(int n, double *in, fftw_complex *out, unsigned flags); where *in is a pointer to your real input array, and *out is a pointer to your complex output array. Next, fabs() performs an absolute value of a real data point. For power, you actually want the absolute value squared. I.e., given a complex coefficient of (a+jb), the power is (a+jb)(a-jb), or abs(a+jb)^2. Since you're using FFTW, the output array is defined as type fftw_complex (or fftwf_complex for single-precision), unless you've included the complex.h header from the C99 math library. If so, then it will be of type double/float complex at the output. I've never done it this way but maybe Steven Johnson can chime in on how it works better (I prefer their convention since the complex math doesn't work as well on the processors I've used). Either way, the appropriate function at that point would be cabs(x)*cabs(x) (well, y = cabs(x), then y2 = y*y, if you want to out-think the compiler), which is the complex magnitude squared. Mark
|
Next
|
Last
Pages: 1 2 Prev: Zeroforcing precoding/beamforming Next: Retry: Java Audio question - PLEASE HELP |