From: markt on
>>C) "Most ICA methods are not able to extract the actual number of
>> source signals, the order of the source signals, nor the signs
>> or the scales of the sources."
>>
>>Contemplate that one for a bit, particularly in view of
>>quote A) at the top: "Most" ICA methods can't do the job
>>ICA is supposed to do! Mind you, in line with the great
>>academic tradition, the word 'most' is here used in one
>>of two functions:
>
>The first part of this is a bit nebulous since there is a point at which
>some sources are simply lost in the background. You have to have some a
>priori guess with any blind detection method as to how large your spread
>is, e.g., you sort of need to know how large your delay spread in a comm
>channel is before deciding how many multipaths there actually are.

I should add, the "nebulous" part of the statement is that you have to run
ICA repeatedly until you reach a point where you think you have resolved all
of the individual sources. This requires either a priori information (as I
mentioned) about the number of sources, e.g., you know there are 10 sources
so you find 10 independent results and quit, or some criterion to test each
output. For speech, the test could be something like checking for an
intelligible output, or perhaps that the output signals actually contain
formants that resemble speech. Both would require a pattern matching
algorithm of sorts, as I see it, on the output.

Since PCA is variance-based, you can keep just enough eigenvectors to
account for some predetermined fraction of the total signal variance, say
95%, and assume the rest is either unrecoverable or noise.
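
A minimal sketch of that variance cutoff, assuming NumPy and a data matrix
with one column per sensor. The function name, the 3-source/8-sensor setup,
and the noise level are all made up for illustration:

```python
import numpy as np

def components_for_variance(X, threshold=0.95):
    """Smallest number of principal components whose eigenvalues sum to
    at least `threshold` of the total variance.

    X has shape (n_samples, n_channels).
    """
    Xc = X - X.mean(axis=0)                   # remove each channel's mean
    cov = np.cov(Xc, rowvar=False)            # channel covariance matrix
    eigvals = np.linalg.eigvalsh(cov)[::-1]   # eigenvalues, largest first
    eigvals = np.clip(eigvals, 0.0, None)     # guard against tiny negatives
    explained = np.cumsum(eigvals) / eigvals.sum()
    # First index where the cumulative fraction reaches the threshold
    return int(np.searchsorted(explained, threshold) + 1)

rng = np.random.default_rng(0)
sources = rng.laplace(size=(5000, 3))         # 3 hypothetical sources
mixing = rng.normal(size=(3, 8))              # spread over 8 hypothetical sensors
observed = sources @ mixing + 0.01 * rng.normal(size=(5000, 8))

k = components_for_variance(observed, 0.95)
print("components kept:", k)
```

The count can come out lower than the true number of sources when one
source dominates the variance, which is exactly the "lost in the
background" case.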

Mark
From: markt on
>I don't buy that, unless you come up with an anal definition
>of the problem ICA is supposed to solve. My experience is
>with Direction of Arrival problems, which aim at separating
>multiple sources in a spatial domain. MUSIC is a very popular
>choice as a basis for investigations, and in that case it is
>a trivial exercise to show that the maximum number of sources
>is strongly related to the number of sensors in the array.

I made it pretty clear that with a single receiver you lose spatial
resolution, i.e., you no longer have DOA information. I'm not talking
about exploiting DOA, but independence. An MMSE receiver for a comm
system, for example, exploits the fact that users/multipaths are
uncorrelated. You start with an input signal, r, and your symbol length,
M, as well as delay spread, P. You collect a vector r_vec(n) = [r(n+M+P-1)
r(n+M+P-2) ... r(n)]^T, which gives you M+P "observations" used to estimate
the channel impulse response. The MMSE result is given by the solution to
the Wiener-Hopf equation w = inv(R)*p where R is E{r_vec*r_vec^H} and p is
E{r*conj(r_vec)}.
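
As a rough sketch of solving w = inv(R)*p numerically, assuming NumPy. Note
the simplification: the cross-correlation p here is estimated against a
known reference sequence d, which is an assumption of this example and not
the blind variant described above. The channel taps, noise level, and the
values of M and P are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
M, P = 8, 3                       # hypothetical symbol length and delay spread
L = M + P                         # number of "observations" per vector
N = 20000

d = rng.choice([-1.0, 1.0], size=N)        # assumed known reference symbols
h = np.array([1.0, 0.5, -0.2])             # hypothetical multipath channel
r = np.convolve(d, h)[:N] + 0.05 * rng.normal(size=N)

# Row n is r_vec(n)^T = [r(n+L-1), r(n+L-2), ..., r(n)]
X = np.lib.stride_tricks.sliding_window_view(r, L)[:, ::-1]
target = d[L - 1:]                # newest symbol in each window
K = X.shape[0]

R = X.T @ X.conj() / K            # sample estimate of E{r_vec r_vec^H}
p = X.T @ target.conj() / K       # sample estimate of E{r_vec conj(d)}
w = np.linalg.solve(R, p)         # Wiener-Hopf: solve R w = p

estimate = X @ w.conj()           # w^H r_vec for every n
mse = np.mean(np.abs(estimate - target) ** 2)
print("mean squared error:", mse)
```

Solving the linear system is used in place of forming inv(R) explicitly,
which is the usual numerically safer choice.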

>The reason is that MUSIC and similar methods are based on
>examining certain eigenvectors of the signal covariance
>matrix. The dimension of that covariance matrix is given
>by the number of sensors, which in turn governs the whole
>analysis. I would be very surprised if the analysis of
>ICA is fundamentally different.

Depends upon how you implement it. All ICA does (essentially) is use
higher-order statistics. Nothing magical, just an extension of the fact
that if you have a linear combination of independent vectors, there is only
one solution that results in independent vector outputs. If you have
multiple receivers, you also have DOA information which can only help.
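
A toy illustration of the higher-order-statistics point, assuming NumPy and
only two channels. This is not any particular published ICA algorithm, just
a brute-force sketch: whitening uses second-order statistics, and the one
remaining rotation angle is picked by maximizing kurtosis (a fourth-order
statistic). The sources and mixing matrix are invented for the example:

```python
import numpy as np

def kurtosis(y):
    """Excess kurtosis of a zero-mean, unit-variance signal."""
    return np.mean(y ** 4) - 3.0

def separate_two(X, n_angles=360):
    """Toy two-channel separation. X has shape (2, n_samples)."""
    Xc = X - X.mean(axis=1, keepdims=True)
    # Whitening: decorrelate the channels and normalize their variance.
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc))
    Z = (eigvecs / np.sqrt(eigvals)).T @ Xc
    # Only a rotation is left undetermined; search for the angle that
    # makes both outputs as non-Gaussian as possible.
    best_score, best_Y = -np.inf, Z
    for theta in np.linspace(0.0, np.pi / 2, n_angles, endpoint=False):
        c, s = np.cos(theta), np.sin(theta)
        Y = np.array([[c, s], [-s, c]]) @ Z
        score = abs(kurtosis(Y[0])) + abs(kurtosis(Y[1]))
        if score > best_score:
            best_score, best_Y = score, Y
    return best_Y

rng = np.random.default_rng(2)
n = 20000
S = np.vstack([rng.uniform(-1, 1, n), rng.laplace(size=n)])  # independent sources
A = np.array([[1.0, 0.6], [0.4, 1.0]])                       # hypothetical mixing
Y = separate_two(A @ S)

# Correlation of each output with each source
C = np.corrcoef(np.vstack([Y, S]))[:2, 2:]
print(np.round(C, 2))
```

Each output should line up with one source, but which output matches which
source, and with what sign and scale, is not determined. That is the
order/sign/scale ambiguity from quote C.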

>I have experience with multipath in sonar environments.
>There is nothing available which comes even remotely
>close to working 'quite well' in multipath scenarios.

I'm sorry to hear that. Comm systems (DS-CDMA) employ a RAKE on the
forward link which is typically implemented with a bank of correlator
receivers to estimate each of the multipaths. Of course, you have the
pilot sequence, which is known, to base your estimation on. The correlator
receiver is not optimal (w.r.t. MSE) in the presence of multipath or
anything other than an AWGN channel. The MMSE receiver above is optimal in
that sense, however, and knowledge of the pilot is unnecessary.
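
To make the correlator bank concrete, here is a simplified sketch assuming
NumPy: one correlator per candidate delay, each correlating the received
signal against the known pilot. The pilot, path delays, gains, and noise
level are hypothetical, and a real RAKE also does combining and tracking,
which is left out:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 4096
pilot = rng.choice([-1.0, 1.0], size=N)    # known pilot chips (hypothetical)
taps = {0: 1.0, 3: 0.6, 7: -0.3}           # hypothetical delay -> path gain
max_delay = 10

# Received signal: sum of delayed, scaled pilot copies plus noise
r = np.zeros(N + max_delay)
for delay, gain in taps.items():
    r[delay:delay + N] += gain * pilot
r += 0.1 * rng.normal(size=r.size)

# One correlator ("finger") per candidate delay
estimates = np.array(
    [r[tau:tau + N] @ pilot / N for tau in range(max_delay + 1)]
)
for tau, est in enumerate(estimates):
    print(tau, round(float(est), 3))
```

The estimates at delays without a path are not exactly zero because the
other paths leak in through the pilot's autocorrelation sidelobes, which is
one way to see why the correlator is not MSE-optimal in multipath.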

I suspect, perhaps, that the delay spread in sonar is very large? That
might make a solution in that realm very difficult.

>I don't trust the details supplied by a method which misses
>a main parameter like the number of sources present.

Well, it is "blind," so this is hardly a surprise. That's why there's
research in the area, because applications in which a priori information is
not available are more common than not. I also pointed out that PCA
provides a means for determining the number of sources, which is typically
applied on the front-end of an ICA system. For example, besides
pre-whitening, you could examine the autocorrelation matrix and come to
conclusions about the number of sources present.

There is no one method that's going to universally solve all of these (and
other) problems.

>So you don't learn anything new from that sort of analysis,
>but rather get your prejudice confirmed? If you need to
>'guess' or 'assume' major parameters of the scenario you
>have already lost. If you use a deterministic parametric
>model you state your assumptions clearly up front and are
>able to analyze them, and also have a fair chance of
>spotting flaws when such occur. "Blind" methods hide
>such factors from the analysis, obfuscating the whole
>picture.

I'm not sure what you're getting at here. ICA will separate the sources
if they are independent (or nearly so). Couple the ICA with other methods
(PCA) and you have good estimates of the parameters you request. The
real world isn't so kind as to provide all the answers automatically.

The whole purpose of blind methods is exactly to deal with scenarios in
which you do not have complete information. The reverse link of a comm
system has to guess how many users are present before they are acquired.
There is no deterministic way to make this assumption a priori, yet my cell
phone works quite well (most of the time). How is this any different?

Mark