From: Dustin on
I have a scatter plot of several thousand points. Needless to say, I don't need to plot every one of them. What is the best way to avoid plotting points in high-density areas? Is there anything in the Statistics Toolbox, maybe? Some kind of data smoothing?

Thanks for the ideas/help
From: Walter Roberson on
Dustin wrote:
> i have a scatter plot with of several thousand points. Needless to say I
> don't need to plot every one of these. What is the best way to avoid
> plotting points in high density areas? Anything in the statistics
> toolbox maybe? some kind of data smoothing?

If you look in the MATLAB File Exchange, you can find a contribution for
n-dimensional histograms. If you were to histogram the 2D coordinates and get
the bin index for each point, then you could run through the bins: for those
that are sparse enough, plot all of the points, and for those above some
density, take a random sub-selection of the points in that bin (the points
whose bin index matches the current bin number) and plot those.

This could be done as a series of scatter() calls, one per (occupied) bin, or
you could put all of the chosen points together and scatter() on the result.
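
A minimal sketch of the binning idea described above, using only base MATLAB rather than the File Exchange histogram contribution. The variable names and the values of nbins and maxPerBin are assumptions chosen for illustration; they are not from the thread, and the test data simply reuses the randn example that appears elsewhere in this discussion.

```matlab
% Sketch: thin out dense regions of a 2-D scatter plot.
% nbins and maxPerBin are illustrative values (assumptions).
x = randn(10000,1);
y = randn(10000,1);

nbins     = 30;   % bins per axis
maxPerBin = 5;    % keep at most this many points in any one bin

% Map each point to a bin number 1..nbins along each axis.
ix = min(nbins, 1 + floor(nbins * (x - min(x)) / (max(x) - min(x))));
iy = min(nbins, 1 + floor(nbins * (y - min(y)) / (max(y) - min(y))));
binIdx = sub2ind([nbins nbins], ix, iy);   % one linear bin index per point

keep = false(size(x));
occupied = unique(binIdx);                 % visit only non-empty bins
for k = 1:numel(occupied)
    members = find(binIdx == occupied(k)); % points that fall in this bin
    n = numel(members);
    if n > maxPerBin
        % Dense bin: keep a random subset of its points.
        p = randperm(n);
        members = members(p(1:maxPerBin));
    end
    keep(members) = true;                  % sparse bins keep every point
end

% All chosen points in a single scatter() call.
scatter(x(keep), y(keep), 4, 'filled')
```

Collecting the chosen points into one logical mask and calling scatter() once corresponds to the second of the two options mentioned; the per-bin variant would move the scatter() call inside the loop with hold on.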
From: John D'Errico on
"Dustin " <dbrisset(a)gmail.com> wrote in message <i3v1us$lda$1(a)fred.mathworks.com>...
> i have a scatter plot with of several thousand points. Needless to say I don't need to plot every one of these. What is the best way to avoid plotting points in high density areas? Anything in the statistics toolbox maybe? some kind of data smoothing?
>
> Thanks for the ideas/help

I usually just use a small plot symbol.

x = randn(10000,1);
y = randn(10000,1);

% In this plot it is hard to see what is happening
plot(x,y,'o')

% This one is a wee bit better
plot(x,y,'.')

% And this one is quite tolerable to look at
plot(x,y,'.','markersize',1)

John
From: Dustin on
Walter Roberson <roberson(a)hushmail.com> wrote in message <i3v3j4$fac$1(a)canopus.cc.umanitoba.ca>...
> Dustin wrote:
> > i have a scatter plot with of several thousand points. Needless to say I
> > don't need to plot every one of these. What is the best way to avoid
> > plotting points in high density areas? Anything in the statistics
> > toolbox maybe? some kind of data smoothing?
>
> If you look in the Matlab File Exchange, you can find a contribution for
> n-dimensional histograms. If you were to histogram the 2D coordinates and get
> the bin indices for each bin, then you could run through the bins and for
> those that are sparse enough, plot all of the points, and for those above some
> density, take a random sub-selection of the original points (their bin index
> will match the current bin number) and plot those.
>
> This could be done as a series of scatter() calls, one per (occupied) bin, or
> you could put all of the chosen points together and scatter() on the result.

Thanks, I will look into this. How lightweight do you think the routine would end up being, i.e., would it still be faster than plotting a few thousand points? Either way, I'll investigate it as a viable option.
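
The cost question is probably easiest to settle by measurement. The sketch below times a loop-free variant of the thinning step with tic/toc and compares it with the time to draw all of the points. Everything here is an assumption for illustration: the data, nbins, maxPerBin, and the sort-based way of picking a random subset per bin. The timings depend on the machine and MATLAB version, so no numbers are claimed.

```matlab
% Sketch: time a vectorized thinning step against plotting every point.
N = 10000;
x = randn(N,1);
y = randn(N,1);
nbins = 30;  maxPerBin = 5;               % illustrative values (assumptions)

tic
ix = min(nbins, 1 + floor(nbins * (x - min(x)) / (max(x) - min(x))));
iy = min(nbins, 1 + floor(nbins * (y - min(y)) / (max(y) - min(y))));
binIdx = sub2ind([nbins nbins], ix, iy);  % linear bin index per point

p = randperm(N)';                         % visit the points in random order
[sortedBins, order] = sort(binIdx(p));    % group the shuffled points by bin
firstOfRun = [true; diff(sortedBins) ~= 0];
runStart   = find(firstOfRun);            % where each bin's run begins
rankInBin  = (1:N)' - runStart(cumsum(firstOfRun)) + 1;
keepIdx    = p(order(rankInBin <= maxPerBin));  % first few of each bin
tThin = toc;

figure; tic; plot(x, y, '.'); drawnow; tAll = toc;
figure; tic; plot(x(keepIdx), y(keepIdx), '.'); drawnow; tKept = toc;

fprintf('thinning: %.4f s, plot all: %.4f s, plot thinned: %.4f s\n', ...
        tThin, tAll, tKept);
```

Because the points are shuffled before sorting by bin, taking the first maxPerBin entries of each bin amounts to a random sub-selection within that bin.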
From: Peter Perkins on
On 8/11/2010 4:42 PM, Dustin wrote:
> i have a scatter plot with of several thousand points. Needless to say I
> don't need to plot every one of these. What is the best way to avoid
> plotting points in high density areas? Anything in the statistics
> toolbox maybe? some kind of data smoothing?

Assuming this is 2-D, you could try the HIST3 function in the Statistics
Toolbox, but that will not plot _any_ of the points; it shows a bivariate
histogram of the bin counts instead. There is also

<http://www.mathworks.com/matlabcentral/fileexchange/13352-smoothhist2d>

which would seem to be more like what you're looking for.
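
For reference, a minimal sketch of the HIST3 route, assuming the Statistics Toolbox is installed. The data and the bin counts are made up for illustration, and showing the counts with imagesc is just one possible display choice. It does not use smoothhist2D, whose interface is not described in this thread.

```matlab
% Sketch: show point density as an image instead of individual markers.
% Requires hist3 from the Statistics Toolbox.
x = randn(10000,1);
y = randn(10000,1);

nbins = [30 30];                  % bins along x and y (assumed values)
[N, C] = hist3([x y], nbins);     % N(i,j): count in x-bin i, y-bin j
                                  % C{1}, C{2}: bin centers along x and y

imagesc(C{1}, C{2}, N');          % transpose so x runs along the horizontal axis
axis xy                           % put the low y values at the bottom
colorbar
xlabel('x'); ylabel('y');
```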