From: Dustin on
I think this is a fairly straightforward question, but we'll see.

I have an arbitrarily sized data set, x and y, stored as two column vectors. I want to draw box plots of y for evenly spaced intervals of x, all on the same figure.

How can I do this without breaking up my data set?

Thanks.
From: Kelly Kearney on
Use histc to group your data by interval, and use boxplot with the group input:

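x = rand(100,1)*100;        % sample x values in [0, 100]
y = rand(size(x));          % sample y values
edge = 0:10:100;            % edges of the x intervals
[n,bin] = histc(x,edge);    % bin(i) is the index of the interval containing x(i)
boxplot(y, bin)             % one box per interval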

You may want to play around with the positions, widths, and x-axis labels of the boxes so that they better reflect the actual intervals being displayed.
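
As a rough sketch of the label adjustment mentioned above, the boxes can be labeled with the interval each one covers instead of the bare bin number. The names keep, used, and lbl are made up for this illustration, and the sketch assumes boxplot orders numeric groups in ascending order, which matches the order returned by unique:

```matlab
x = rand(100,1)*100;
y = rand(size(x));
edge = 0:10:100;
[n,bin] = histc(x,edge);

% Keep only points that fall inside a real interval (bin 0 is out of range,
% and the last bin only holds values exactly equal to the final edge).
keep = bin > 0 & bin < numel(edge);
used = unique(bin(keep));            % intervals that actually contain data

% Build one label per populated interval, e.g. '0-10', '10-20', ...
lbl = arrayfun(@(k) sprintf('%g-%g', edge(k), edge(k+1)), used, ...
               'UniformOutput', false);

boxplot(y(keep), bin(keep), 'labels', lbl)
```

The labels are built only for populated intervals because boxplot expects one label per group it actually draws.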

-Kelly
From: Dustin on
This is super close to what I want and a great help. Let me add a little more detail about what I was going for. I want to superimpose this on top of the scatter plot of (x,y), with each box placed where its data lies. I don't think this is quite grabbing the bins appropriately.

I'll try to elaborate a little better. Looking at my code below, this appears to be breaking the x-range of my data into 50 bins, whereas I want a specified x-range (0-25) broken into 50 bins. There's something I'm not quite understanding, and the help isn't very helpful to me.

It also keeps hijacking my x-tick marks, but I can probably handle this.

scatter(data.x,data.y)
hold on
edge = 0:.5:25;
[~,bin] = histc(data.x,edge);
boxplot(data.y, bin)
From: Tom Lane on
> I'll try to elaborate a little better. Looking at my code below, this
> appears to be breaking the x-range of my data into 50 bins, whereas I want a
> specified x-range (0-25) broken into 50 bins. There's something I'm not
> quite understanding, and the help isn't very helpful to me.
> It also keeps hijacking my x-tick marks, but I can probably handle this.
>
> scatter(data.x,data.y)
> hold on
> edge = 0:.5:25;
> [~,bin] = histc(data.x,edge);
> boxplot(data.y, bin)

Type "help histc" and you'll see that the bin number is 0 for out-of-range
values. The boxplot function doesn't treat the value 0 as anything special,
so those points get a box of their own. If you want to exclude bin 0, you
might try

xmeans = grpstats(data.x,bin)
boxplot(data.y(bin>0), bin(bin>0),'positions',xmeans(2:end))

Here I've also positioned the boxes at the means of the x values in each
bin.

You may need to play with this some. For example, I know I have values in
bin 0 so I omit the first xmeans value. You might need to do that only
conditionally.
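
One way to avoid the conditional step, sketched here under assumptions, is to drop the out-of-range points before computing the means, so that bin 0 never shows up in xmeans. The sample data and the name inrange are made up for this illustration, and the sketch relies on grpstats and boxplot both ordering numeric groups in ascending order:

```matlab
% Assumed sample data: some x values lie above 25, so they land in bin 0.
data.x = rand(500,1)*30;
data.y = sin(data.x/4) + randn(size(data.x));

edge = 0:.5:25;
[~,bin] = histc(data.x, edge);

inrange = bin > 0;                                  % points inside the edges
xmeans = grpstats(data.x(inrange), bin(inrange));   % one mean per populated bin

% xmeans has exactly one entry per group that boxplot will draw,
% so no entry needs to be skipped.
boxplot(data.y(inrange), bin(inrange), 'positions', xmeans)
```

Because the same logical mask is applied to x, y, and bin, the number of positions always matches the number of boxes, whether or not any values fall out of range.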

-- Tom

From: Kelly Kearney on
I think it's doing what you want it to do, but you need to give it x positions and widths to place the boxes in the correct area. The code below should do that, as well as correct the annoying hijacking of the x-ticks (I hate that boxplot does that). Also, unless you plan to use varying colors and sizes in your scattered data, plot is a lot faster than scatter.

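x = rand(1000,1) .* 25;            % sample x values in [0, 25]
y = sin(x/4) + randn(size(x));     % noisy sample y values

edge = 0:.5:25;                    % 51 edges, i.e. 50 intervals of width 0.5
[n,bin] = histc(x, edge);          % bin(i) is the interval index of x(i)

xmid = (edge(1:end-1) + edge(2:end))./2;   % center of each interval
dx = diff(edge);                           % width of each interval

figure;
plot(x,y,'k.');
hold on;
% One box per interval, centered on the interval and as wide as it.
% This relies on every one of the 50 intervals containing at least one point.
boxplot(y, bin, 'positions', xmid, 'widths', dx);
% Restore numeric x-ticks, which boxplot replaces with group labels.
set(gca, 'xticklabelmode', 'auto', 'xtick', 0:5:25);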
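
The code above passes one position and one width per interval, which works as long as every interval contains data. For sparser data, a hedged variant might pass positions and widths only for the intervals that are actually populated. The names keep and used are made up for this sketch, and it assumes boxplot orders numeric groups in ascending order, as unique does:

```matlab
x = rand(30,1) .* 25;              % deliberately sparse, so some intervals stay empty
y = sin(x/4) + randn(size(x));

edge = 0:.5:25;
[~,bin] = histc(x, edge);

xmid = (edge(1:end-1) + edge(2:end))./2;
dx = diff(edge);

% Drop out-of-range points (bin 0) and any x exactly equal to the last edge.
keep = bin >= 1 & bin <= numel(xmid);
used = unique(bin(keep));          % intervals that actually contain data

figure;
plot(x,y,'k.');
hold on;
boxplot(y(keep), bin(keep), 'positions', xmid(used), 'widths', dx(used));
set(gca, 'xticklabelmode', 'auto', 'xtick', 0:5:25);
```

Indexing xmid and dx with used keeps the number of positions and widths equal to the number of boxes drawn.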