From: glowkeeper on
This is heavily related to this post:
http://www.dsprelated.com/showmessage/89637/1.php, but since that's a
little old now, I thought I'd start a new thread ;)

I am trying to use the Harmonic Product Spectrum method to guess the
fundamental frequency from an input to a microphone. It's working 50% of
the time, but the other 50% it's picking out harmonics other than the
fundamental. I wonder if someone could help explain as to why, and perhaps
suggest where I might be going wrong or could improve the algorithm
outlined below.

I'm using fftw to give me a fourier transform full of reals from the
microphone input. From that I calculate a power spectrum by taking the
absolute value of each bin of the fft.

I then 2, 3 and 4x downsample the power spectrum into different arrays by
taking the 2nd, 3rd and 4th output of the power spectrum respectively.

Next, I multiply each bin of the power, 2x downsampled, 3x downsampled and
4x downsampled spectrums to produce a Harmonic Product Spectrum.

Finally, I search for the bin of the greatest magnitude and calculate it's
frequency by doing bin * ( sampleRate / fftsize ).

fyi, the sample rate of my mic' is 11khz, and I'm using an fft size of 2k.
Therefore each bin of my fft represents ~5Hz.

Any help and suggestions greatly appreciated!

Steve





From: Ron N. on
On Apr 23, 6:17 am, "glowkeeper" <glowkee...(a)yahoo.co.uk> wrote:
> This is heavily related to this post:http://www.dsprelated.com/showmessage/89637/1.php, but since that's a
> little old now, I thought I'd start a new thread ;)
>
> I am trying to use the Harmonic Product Spectrum method to guess the
> fundamental frequency from an input to a microphone. It's working 50% of
> the time, but the other 50% it's picking out harmonics other than the
> fundamental. I wonder if someone could help explain as to why, and perhaps
> suggest where I might be going wrong or could improve the algorithm
> outlined below.

Have you examined the spectrum of those fft frames where
you think your method is picking out harmonics other than
the fundamental?

> I'm using fftw to give me a fourier transform full of reals from the
> microphone input. From that I calculate a power spectrum by taking the
> absolute value of each bin of the fft.

How are you taking the absolute value of each complex fft
bin?

> I then 2, 3 and 4x downsample the power spectrum into different arrays by
> taking the 2nd, 3rd and 4th output of the power spectrum respectively.
>
> Next, I multiply each bin of the power, 2x downsampled, 3x downsampled and
> 4x downsampled spectrums to produce a Harmonic Product Spectrum.
>
> Finally, I search for the bin of the greatest magnitude and calculate it's
> frequency by doing bin * ( sampleRate / fftsize ).
>
> fyi, the sample rate of my mic' is 11khz, and I'm using an fft size of 2k.
> Therefore each bin of my fft represents ~5Hz.

If you downsample your fft, shouldn't that change the bin
width? Or are you throwing away spectral energy? That
loss of information could certainly effect the results.



IMHO. YMMV.
--
rhn A.T nicholson d.0.t C-o-M
http://www.nicholson.com/rhn/dsp.html
From: glowkeeper on
>On Apr 23, 6:17 am, "glowkeeper" <glowkee...(a)yahoo.co.uk> wrote:
>> This is heavily related to this
post:http://www.dsprelated.com/showmessage/89637/1.php, but since that's a
>> little old now, I thought I'd start a new thread ;)
>>
>> I am trying to use the Harmonic Product Spectrum method to guess the
>> fundamental frequency from an input to a microphone. It's working 50%
of
>> the time, but the other 50% it's picking out harmonics other than the
>> fundamental. I wonder if someone could help explain as to why, and
perhaps
>> suggest where I might be going wrong or could improve the algorithm
>> outlined below.
>
>Have you examined the spectrum of those fft frames where
>you think your method is picking out harmonics other than
>the fundamental?

Not exactly. You're right, I will examine the data a little closer.

>> I'm using fftw to give me a fourier transform full of reals from the
>> microphone input. From that I calculate a power spectrum by taking the
>> absolute value of each bin of the fft.
>
>How are you taking the absolute value of each complex fft
>bin?

The app' is c++, and I'm using the standard c api call:

fabs( fft[binNumber ] );

I suppose I could also do something like taking the square root of the
square, but that seems expensive (although granted, I don't know what fabs
is doing under the hood).

>> I then 2, 3 and 4x downsample the power spectrum into different arrays
by
>> taking the 2nd, 3rd and 4th output of the power spectrum respectively.
>>
>> Next, I multiply each bin of the power, 2x downsampled, 3x downsampled
and
>> 4x downsampled spectrums to produce a Harmonic Product Spectrum.
>>
>> Finally, I search for the bin of the greatest magnitude and calculate
it's
>> frequency by doing bin * ( sampleRate / fftsize ).
>>
>> fyi, the sample rate of my mic' is 11khz, and I'm using an fft size of
2k.
>> Therefore each bin of my fft represents ~5Hz.
>
>If you downsample your fft, shouldn't that change the bin
>width? Or are you throwing away spectral energy? That
>loss of information could certainly effect the results.

Ah, now this was something I wasn't sure of to be honest. Maybe? I output
my product to an array the same size as the fft, and therefore I thought
the bin width was the same.

But I think you're suggestingt that my 2x bin width is ~5Hz/2, 3x ~5Hz/3,
4x ~5Hz/4, and therefore when I find the bin of greatest magnitude in the
harmonic product spectrum, I need to factor this into my equation and not
just bin * ( sampleRate / fftsize ), right?

Thanks for your help.


From: Steven G. Johnson on
On Apr 24, 3:33 am, "glowkeeper" <glowkee...(a)yahoo.co.uk> wrote:
> The app' is c++, and I'm using the standard c api call:
>
> fabs( fft[binNumber ] );
>
> I suppose I could also do something like taking the square root of the
> square, but that seems expensive (although granted, I don't know what fabs
> is doing under the hood).

fabs is the absolute value of a real number. You have to use
sqrt(re*re + im*im) instead, or hypot(re, im) on Unix (which is more
accurate), or abs(std::complex<double>(re, im)) in C++, or .... where
re and im are the real and imaginary parts, respectively.

Regards,
Steven G. Johnson
From: markt on
>I suppose I could also do something like taking the square root of the
>square, but that seems expensive (although granted, I don't know what
fabs
>is doing under the hood).

First of all, your Fourier coefficients should be complex, even if the
input data are real. From FFTW's website, the planning function is:

fftw_plan fftw_plan_dft_r2c_1d(int n, double *in, fftw_complex *out,
unsigned flags);

where *in is a pointer to your real input array, and *out is a pointer to
your complex output array.

Next, fabs() performs an absolute value of a real data point. For power,
you actually want the absolute value squared. I.e., given a complex
coefficient of (a+jb), the power is (a+jb)(a-jb), or abs(a+jb)^2. Since
you're using FFTW, the output array is defined as type fftw_complex (or
fftwf_complex for single-precision), unless you've included the complex.h
header from the C99 math library. If so, then it will be of type
double/float complex at the output. I've never done it this way but maybe
Steven Johnson can chime in on how it works better (I prefer their
convention since the complex math doesn't work as well on the processors
I've used). Either way, the appropriate function at that point would be
cabs(x)*cabs(x) (well, y = cabs(x), then y2 = y*y, if you want to out-think
the compiler), which is the complex magnitude squared.

Mark