From: jungledmnc on
Hi there. I'm going to implement circular convolution with zero latency,
however here I'll simplify it to say 1024 points latency, because I need to
understand where is the problem.

So let's say we have an extremely long kernel, like 1000 * 1024 points. Now
this is the input:
i1 i2 i3 .. i1000 (each 1024 points long).

And this is the kernel:
k1 k2 k3 .. k1000 (each 1024 points long).

To perform convolution we can zero pad each input and kernel block to 2048
samples and:
out1 = idft( dft(i1 zero padded) * dft(k1 zero padded) )

And this is the result of convolution for input i1000:
finalout1000 = out1 + out2 + .. + out1000 + leakagefromfinalout999

We can obviously precalculate dft's for k1 .. k1000. But we can also
calculate input dft's just once, when processing realtime,
because what is now i1000, will soon be i999, and then i998 etc., right?

And here's my dilemma - I think we can also calculate output idft once. Why
summing out1+out2.., when we can sum it in frequency domain? So, is this
true: ?

finalout1000 = idft(
dft(i1 zero padded) * dft(k1 zero padded) +
dft(i2 zero padded) * dft(k2 zero padded) +
..
dft(i1000 zero padded) * dft(k1000 zero padded)) +
leakagefromfinalout999

If that would be true, the convolution could be done too quickly I believe,
since FFT is the most time-consuming task here. But why it is wrong?

Thank in advance.
From: Andor on
On 17 Mai, 15:58, "jungledmnc" <jungledmnc(a)n_o_s_p_a_m.gmail.com>
wrote:
> Hi there. I'm going to implement circular convolution with zero latency,
> however here I'll simplify it to say 1024 points latency, because I need to
> understand where is the problem.
>
> So let's say we have an extremely long kernel, like 1000 * 1024 points. Now
> this is the input:
> i1 i2 i3 .. i1000 (each 1024 points long).
>
> And this is the kernel:
> k1 k2 k3 .. k1000 (each 1024 points long).
>
> To perform convolution we can zero pad each input and kernel block to 2048
> samples and:
> out1 = idft( dft(i1 zero padded) * dft(k1 zero padded) )
>
> And this is the result of convolution for input i1000:
> finalout1000 = out1 + out2 + .. + out1000 + leakagefromfinalout999
>
> We can obviously precalculate dft's for k1 .. k1000. But we can also
> calculate input dft's just once, when processing realtime,
> because what is now i1000, will soon be i999, and then i998 etc., right?
>
> And here's my dilemma - I think we can also calculate output idft once. Why
> summing out1+out2.., when we can sum it in frequency domain? So, is this
> true: ?
>
> finalout1000 = idft(
>   dft(i1 zero padded) * dft(k1 zero padded) +
>   dft(i2 zero padded) * dft(k2 zero padded) +
>   ..
>   dft(i1000 zero padded) * dft(k1000 zero padded)) +
>   leakagefromfinalout999
>
> If that would be true, the convolution could be done too quickly I believe,
> since FFT is the most time-consuming task here. But why it is wrong?
>
> Thank in advance.

I'm not quite sure what your question is, but you might find answers
here:

http://pcfarina.eng.unipr.it/Public/AES-113/

Regards,
Andor
From: jungledmnc on
Aaaa, thanks a lot! It's in the Microsoft patent!