From: Wolfgang Schwanghart on
Dear all,

I am not an expert in optimization and hence I'd really
appreciate your help with following problem.

Consider n random sampling locations in 2-D. E.g.

n = 100;
x = rand(n,2);

The pairwise euclidean distance between all sampling points
is (in case you have the Stats toolbox)

d = pdist(x);

You may want to plot a histogram of d

hist(d,20);

and realize, that the distances have a non-uniform
distribution. But a uniform distribution of the sampling
points is what I want. Hence, I'd like to shift the sampling
locations so that their pairwise distance approximates a
continuous uniform distribution.

I have no idea how to formulate such an optimization
problem. If you have an idea, hint, ... whatever, that'll be
great. If the question is too general to be answered here,
please let me know. This is no homework :-)

Best regards,
Wolfgang

From: Wolfgang Schwanghart on
I forgot to mention... The optimization should follow the
constraint that the sampling locations remain within the
rectangular sampling domain set by the initial random
configuration.

x(:,1) >= min(x(:,1)) & <= max(x(:,1))
x(:,2) >= min(x(:,2)) & <= max(x(:,2))
From: John D'Errico on
"Wolfgang Schwanghart" <schwanghart(a)googlemail.com> wrote in message
<fvp5bg$s0t$1(a)fred.mathworks.com>...
> Dear all,
>
> I am not an expert in optimization and hence I'd really
> appreciate your help with following problem.
>
> Consider n random sampling locations in 2-D. E.g.
>
> n = 100;
> x = rand(n,2);
>
> The pairwise euclidean distance between all sampling points
> is (in case you have the Stats toolbox)
>
> d = pdist(x);
>
> You may want to plot a histogram of d
>
> hist(d,20);
>
> and realize, that the distances have a non-uniform
> distribution. But a uniform distribution of the sampling
> points is what I want. Hence, I'd like to shift the sampling
> locations so that their pairwise distance approximates a
> continuous uniform distribution.
>
> I have no idea how to formulate such an optimization
> problem. If you have an idea, hint, ... whatever, that'll be
> great. If the question is too general to be answered here,
> please let me know. This is no homework :-)

Well, its obviously NOT a homework problem.
Why not? Because it has no solution, nor is it
a well formulated problem mathematically.

Consider any given point. If the set of interpoint
distances has as many points close to that point
as there are at any given distance, then in the
(x,y) plane, the density of the points must fall off
as the inverse of the distance from the chosen
location.

Conversely, if the points were perfectly uniformly
distributed over the entire plane, then if we were
to plot a histogram of the points as a function of
euclidean distance, then that histogram would be
linearly increasing. But, we cannot have a set of
points which falls off in density as you are asking
from any location in space. Perhaps you are wishing
for a perfectly uniform sampling of points over the
domain of interest.

Of course, it makes no sense to talk about a
uniform distribution over an infinite plane. When
you sampled randomly over the unit square, the
points near the edges of the square had fewer
near neighbors than did the points in the center
of the square. And of course, the set of distances
that are large enough to be near sqrt(2) will be
small when you sample in the unit square.

So, due to these edge effects, we cannot ever
have a uniform set of distances over the unit
square (or any closed region) come out of pdist.

As I said, this problem has no solution as you
have posed it. Perhaps you are really interested
in having a set that has a roughly CONSTANT
distance to the nearest neighbor for any point.
(Even this wording is probably poor on my part
as I think about it.)

HTH,
John
From: Wolfgang Schwanghart on
Thanks very much John for your answer. One of the main
problems of non-mathematicians is probably to pose well
formulated problems. Hence, let me explain where I want to go.

I want to set up a sampling scheme inside a rectangle from
which I get the best idea of small to large scale spatial
variability of some geochemical parameter (e.g. N in soils).
In order to assess spatial variability I'll use the
empirical variogram.

http://en.wikipedia.org/wiki/Variogram

In the variogram all distances should be sufficiently
represented for a given number of sampling locations.

Well... colleagues calling for lunchtime ... I think I need
to rethink my problem.

Thanks again,
Wolfgang




"John D'Errico" <woodchips(a)rochester.rr.com> wrote in
message <fvpagh$75i$1(a)fred.mathworks.com>...
> "Wolfgang Schwanghart" <schwanghart(a)googlemail.com> wrote
in message
> <fvp5bg$s0t$1(a)fred.mathworks.com>...
> > Dear all,
> >
> > I am not an expert in optimization and hence I'd really
> > appreciate your help with following problem.
> >
> > Consider n random sampling locations in 2-D. E.g.
> >
> > n = 100;
> > x = rand(n,2);
> >
> > The pairwise euclidean distance between all sampling points
> > is (in case you have the Stats toolbox)
> >
> > d = pdist(x);
> >
> > You may want to plot a histogram of d
> >
> > hist(d,20);
> >
> > and realize, that the distances have a non-uniform
> > distribution. But a uniform distribution of the sampling
> > points is what I want. Hence, I'd like to shift the sampling
> > locations so that their pairwise distance approximates a
> > continuous uniform distribution.
> >
> > I have no idea how to formulate such an optimization
> > problem. If you have an idea, hint, ... whatever, that'll be
> > great. If the question is too general to be answered here,
> > please let me know. This is no homework :-)
>
> Well, its obviously NOT a homework problem.
> Why not? Because it has no solution, nor is it
> a well formulated problem mathematically.
>
> Consider any given point. If the set of interpoint
> distances has as many points close to that point
> as there are at any given distance, then in the
> (x,y) plane, the density of the points must fall off
> as the inverse of the distance from the chosen
> location.
>
> Conversely, if the points were perfectly uniformly
> distributed over the entire plane, then if we were
> to plot a histogram of the points as a function of
> euclidean distance, then that histogram would be
> linearly increasing. But, we cannot have a set of
> points which falls off in density as you are asking
> from any location in space. Perhaps you are wishing
> for a perfectly uniform sampling of points over the
> domain of interest.
>
> Of course, it makes no sense to talk about a
> uniform distribution over an infinite plane. When
> you sampled randomly over the unit square, the
> points near the edges of the square had fewer
> near neighbors than did the points in the center
> of the square. And of course, the set of distances
> that are large enough to be near sqrt(2) will be
> small when you sample in the unit square.
>
> So, due to these edge effects, we cannot ever
> have a uniform set of distances over the unit
> square (or any closed region) come out of pdist.
>
> As I said, this problem has no solution as you
> have posed it. Perhaps you are really interested
> in having a set that has a roughly CONSTANT
> distance to the nearest neighbor for any point.
> (Even this wording is probably poor on my part
> as I think about it.)
>
> HTH,
> John

From: John D'Errico on
"Wolfgang Schwanghart" <schwanghart(a)googlemail.com> wrote in message
<fvpcai$mfm$1(a)fred.mathworks.com>...
> Thanks very much John for your answer. One of the main
> problems of non-mathematicians is probably to pose well
> formulated problems. Hence, let me explain where I want to go.
>
> I want to set up a sampling scheme inside a rectangle from
> which I get the best idea of small to large scale spatial
> variability of some geochemical parameter (e.g. N in soils).
> In order to assess spatial variability I'll use the
> empirical variogram.
>
> http://en.wikipedia.org/wiki/Variogram
>
> In the variogram all distances should be sufficiently
> represented for a given number of sampling locations.
>
> Well... colleagues calling for lunchtime ... I think I need
> to rethink my problem.
>
> Thanks again,
> Wolfgang

Ah. I was thinking towards a different goal when
I read your first post.

A problem is that the distance distribution of
any set of points inside a finite 2-d domain like
the unit square will be strongly influenced by the
edge effects. So that distribution must drop off
strongly as you begin to look at distances that
span the diameter of the region.

Next, any roughly uniform sampling in 2-d
will exhibit an increase in the number of
neighbors at a distance of d as d increases
from zero. We can see this by understanding
that the area of an annular ring of thickness
dr is pi*r*dr. So we should expect the distance
distribution to rise roughly linearly with radius
for small r, at least for any "locally uniform"
sampling. Again, there are edge effects that
will probably screw this up, making the increase
nonlinear even for small radius.

So I'm not at all sure that a goal of a maximally
uniform distance distribution is achievable.

John