|
Prev: a question about angle in solve
Next: Bloated files when saving large variables to .mat files with -append
From: Wolfgang Schwanghart on 6 May 2008 04:37 Dear all, I am not an expert in optimization and hence I'd really appreciate your help with following problem. Consider n random sampling locations in 2-D. E.g. n = 100; x = rand(n,2); The pairwise euclidean distance between all sampling points is (in case you have the Stats toolbox) d = pdist(x); You may want to plot a histogram of d hist(d,20); and realize, that the distances have a non-uniform distribution. But a uniform distribution of the sampling points is what I want. Hence, I'd like to shift the sampling locations so that their pairwise distance approximates a continuous uniform distribution. I have no idea how to formulate such an optimization problem. If you have an idea, hint, ... whatever, that'll be great. If the question is too general to be answered here, please let me know. This is no homework :-) Best regards, Wolfgang
From: Wolfgang Schwanghart on 6 May 2008 05:45 I forgot to mention... The optimization should follow the constraint that the sampling locations remain within the rectangular sampling domain set by the initial random configuration. x(:,1) >= min(x(:,1)) & <= max(x(:,1)) x(:,2) >= min(x(:,2)) & <= max(x(:,2))
From: John D'Errico on 6 May 2008 06:05 "Wolfgang Schwanghart" <schwanghart(a)googlemail.com> wrote in message <fvp5bg$s0t$1(a)fred.mathworks.com>... > Dear all, > > I am not an expert in optimization and hence I'd really > appreciate your help with following problem. > > Consider n random sampling locations in 2-D. E.g. > > n = 100; > x = rand(n,2); > > The pairwise euclidean distance between all sampling points > is (in case you have the Stats toolbox) > > d = pdist(x); > > You may want to plot a histogram of d > > hist(d,20); > > and realize, that the distances have a non-uniform > distribution. But a uniform distribution of the sampling > points is what I want. Hence, I'd like to shift the sampling > locations so that their pairwise distance approximates a > continuous uniform distribution. > > I have no idea how to formulate such an optimization > problem. If you have an idea, hint, ... whatever, that'll be > great. If the question is too general to be answered here, > please let me know. This is no homework :-) Well, its obviously NOT a homework problem. Why not? Because it has no solution, nor is it a well formulated problem mathematically. Consider any given point. If the set of interpoint distances has as many points close to that point as there are at any given distance, then in the (x,y) plane, the density of the points must fall off as the inverse of the distance from the chosen location. Conversely, if the points were perfectly uniformly distributed over the entire plane, then if we were to plot a histogram of the points as a function of euclidean distance, then that histogram would be linearly increasing. But, we cannot have a set of points which falls off in density as you are asking from any location in space. Perhaps you are wishing for a perfectly uniform sampling of points over the domain of interest. Of course, it makes no sense to talk about a uniform distribution over an infinite plane. When you sampled randomly over the unit square, the points near the edges of the square had fewer near neighbors than did the points in the center of the square. And of course, the set of distances that are large enough to be near sqrt(2) will be small when you sample in the unit square. So, due to these edge effects, we cannot ever have a uniform set of distances over the unit square (or any closed region) come out of pdist. As I said, this problem has no solution as you have posed it. Perhaps you are really interested in having a set that has a roughly CONSTANT distance to the nearest neighbor for any point. (Even this wording is probably poor on my part as I think about it.) HTH, John
From: Wolfgang Schwanghart on 6 May 2008 06:36 Thanks very much John for your answer. One of the main problems of non-mathematicians is probably to pose well formulated problems. Hence, let me explain where I want to go. I want to set up a sampling scheme inside a rectangle from which I get the best idea of small to large scale spatial variability of some geochemical parameter (e.g. N in soils). In order to assess spatial variability I'll use the empirical variogram. http://en.wikipedia.org/wiki/Variogram In the variogram all distances should be sufficiently represented for a given number of sampling locations. Well... colleagues calling for lunchtime ... I think I need to rethink my problem. Thanks again, Wolfgang "John D'Errico" <woodchips(a)rochester.rr.com> wrote in message <fvpagh$75i$1(a)fred.mathworks.com>... > "Wolfgang Schwanghart" <schwanghart(a)googlemail.com> wrote in message > <fvp5bg$s0t$1(a)fred.mathworks.com>... > > Dear all, > > > > I am not an expert in optimization and hence I'd really > > appreciate your help with following problem. > > > > Consider n random sampling locations in 2-D. E.g. > > > > n = 100; > > x = rand(n,2); > > > > The pairwise euclidean distance between all sampling points > > is (in case you have the Stats toolbox) > > > > d = pdist(x); > > > > You may want to plot a histogram of d > > > > hist(d,20); > > > > and realize, that the distances have a non-uniform > > distribution. But a uniform distribution of the sampling > > points is what I want. Hence, I'd like to shift the sampling > > locations so that their pairwise distance approximates a > > continuous uniform distribution. > > > > I have no idea how to formulate such an optimization > > problem. If you have an idea, hint, ... whatever, that'll be > > great. If the question is too general to be answered here, > > please let me know. This is no homework :-) > > Well, its obviously NOT a homework problem. > Why not? Because it has no solution, nor is it > a well formulated problem mathematically. > > Consider any given point. If the set of interpoint > distances has as many points close to that point > as there are at any given distance, then in the > (x,y) plane, the density of the points must fall off > as the inverse of the distance from the chosen > location. > > Conversely, if the points were perfectly uniformly > distributed over the entire plane, then if we were > to plot a histogram of the points as a function of > euclidean distance, then that histogram would be > linearly increasing. But, we cannot have a set of > points which falls off in density as you are asking > from any location in space. Perhaps you are wishing > for a perfectly uniform sampling of points over the > domain of interest. > > Of course, it makes no sense to talk about a > uniform distribution over an infinite plane. When > you sampled randomly over the unit square, the > points near the edges of the square had fewer > near neighbors than did the points in the center > of the square. And of course, the set of distances > that are large enough to be near sqrt(2) will be > small when you sample in the unit square. > > So, due to these edge effects, we cannot ever > have a uniform set of distances over the unit > square (or any closed region) come out of pdist. > > As I said, this problem has no solution as you > have posed it. Perhaps you are really interested > in having a set that has a roughly CONSTANT > distance to the nearest neighbor for any point. > (Even this wording is probably poor on my part > as I think about it.) > > HTH, > John
From: John D'Errico on 6 May 2008 08:20 "Wolfgang Schwanghart" <schwanghart(a)googlemail.com> wrote in message <fvpcai$mfm$1(a)fred.mathworks.com>... > Thanks very much John for your answer. One of the main > problems of non-mathematicians is probably to pose well > formulated problems. Hence, let me explain where I want to go. > > I want to set up a sampling scheme inside a rectangle from > which I get the best idea of small to large scale spatial > variability of some geochemical parameter (e.g. N in soils). > In order to assess spatial variability I'll use the > empirical variogram. > > http://en.wikipedia.org/wiki/Variogram > > In the variogram all distances should be sufficiently > represented for a given number of sampling locations. > > Well... colleagues calling for lunchtime ... I think I need > to rethink my problem. > > Thanks again, > Wolfgang Ah. I was thinking towards a different goal when I read your first post. A problem is that the distance distribution of any set of points inside a finite 2-d domain like the unit square will be strongly influenced by the edge effects. So that distribution must drop off strongly as you begin to look at distances that span the diameter of the region. Next, any roughly uniform sampling in 2-d will exhibit an increase in the number of neighbors at a distance of d as d increases from zero. We can see this by understanding that the area of an annular ring of thickness dr is pi*r*dr. So we should expect the distance distribution to rise roughly linearly with radius for small r, at least for any "locally uniform" sampling. Again, there are edge effects that will probably screw this up, making the increase nonlinear even for small radius. So I'm not at all sure that a goal of a maximally uniform distance distribution is achievable. John
|
Next
|
Last
Pages: 1 2 3 Prev: a question about angle in solve Next: Bloated files when saving large variables to .mat files with -append |