From: S on
Ok, I am really confused. I have a file with 30000 data points that vary between 0 and 1. I can create a pdf plot of this data in Excel using the FREQUENCY function with, say, a hundred bins between 0.01 and 1. If I try to do the same in Matlab, using either hist or histc, the values I get in the bins are different from Excel's. Why is this? And how can I generate a pdf plot in Matlab that matches the one in Excel? I would appreciate any help you could give me. I've spent many hours on this. Many thanks.
From: Roger Stafford on
My recommendation would be to undertake your own investigation of why the bin counts are different. For example, do a histogram with just two bins with both histc and Excel. If there is a difference in their respective counts, do a sort on your original data and this will allow you to identify the particular data value or values that for matlab went into one bin and in Excel into the other. If you study these values carefully you may discover just why matlab and Excel treated them differently.

Don't forget to read the histc documentation carefully to find out just how it makes such bin decisions.
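
The check described above can be sketched in a few lines. This is only an illustration under stated assumptions: x is hypothetical stand-in data (not the 30000-point file), the two-bin edges are made up for the example, and the comparison with Excel still has to be done by hand. It uses exact equality, so it only catches values that sit exactly on an edge.

```matlab
% Hypothetical stand-in data; replace with the real data vector.
x = [0.10 0.25 0.50 0.50 0.75 1.00];
edges = [0 0.5 1];                 % two bins, chosen for illustration

% histc: counts(k) counts edges(k) <= x < edges(k+1);
% the last element counts values equal to edges(end).
counts = histc(x, edges);

% Values lying exactly on an edge are the ones whose bin depends on
% which side of the interval is closed, so list them after sorting.
xs = sort(x);
onEdge = xs(ismember(xs, edges));

disp(counts)
disp(onEdge)
```

Any value listed in onEdge is a candidate for landing in a different bin in Excel than in histc.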

Roger Stafford
From: Safa on
Thanks for your suggestion. I have read histc documentation several times. It gives the following equation:
n(k) counts the value x(i) if edges(k) <= x(i) < edges(k+1).
What I would like to do is tweak the histc command so that it gives the same frequency distribution as Excel. Is this possible? At the moment, Excel and Matlab are counting the numbers differently, and I am at a loss as to why. I would appreciate any further advice you could give on this matter. Thanks in advance.
From: Roger Stafford on
I repeat! This is something you are entirely capable of finding out for yourself with the use of the sort function. Pin down the individual data value or values where Excel made one decision and histc a different one, and then you are well on your way to solving your own problem.

Roger Stafford
From: Safa on
I must say I wasn't happy with the tone of your second message, as this is a serious query about the operation of Matlab. I will respond to it by submitting the following example.

Y=[0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 0.55 0.6 0.65 0.7 0.75 0.8 0.85 0.9 0.95 1 1.2]
Bins=[0.2 0.4 0.6 0.8 1]
histc(Y,Bins) gives 4,4,4,4,1

Frequency command in Excel gives 3,4,4,4,4

I am more of an Excel user, and I already know that the frequency command counts the values that are greater than the previous bin limit and less than or equal to the upper limit of each bin. Obviously Matlab is using the formula: n(k) counts the value x(i) if edges(k) <= x(i) < edges(k+1). It is unclear, however, what Matlab does for the last bin. Does it just count instances of 1 exactly?

As I mentioned in my previous message, I already looked up histc in the Matlab help, and I requested a way to CHANGE histc so that it matches Excel. Histc is a built-in command in Matlab and I don't know how to change the above built-in equation.
Is my query clearer now?
Thank you!
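
One possible way to get Excel-style counts without touching histc is to count each bin with explicit comparisons. The sketch below is a minimal illustration, assuming the Excel rule is as described in this message (each bin counts values above the previous limit and up to and including its own limit). The names lowerLimits, excelCounts and overflow are hypothetical, introduced only for this example.

```matlab
Y = [0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 0.55 ...
     0.6 0.65 0.7 0.75 0.8 0.85 0.9 0.95 1 1.2];
Bins = [0.2 0.4 0.6 0.8 1];

% Assumed Excel rule: bin k counts lowerLimits(k) < y <= Bins(k),
% with no lower limit on the first bin.
lowerLimits = [-Inf Bins(1:end-1)];
excelCounts = zeros(size(Bins));
for k = 1:numel(Bins)
    excelCounts(k) = sum(Y > lowerLimits(k) & Y <= Bins(k));
end

% Values above the last limit (1.2 in this example) are counted separately.
overflow = sum(Y > Bins(end));

disp(excelCounts)   % 3 4 4 4 4 for the data above
disp(overflow)      % 1
```

The loop makes the interval convention explicit, which is the only difference between the two results in the example: histc closes each bin on the left, while this sketch closes it on the right.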