|
From: Roger Stafford on 6 May 2008 16:32 "Simon Preston" <preston.simon+mathsworks(a)gmail.com> wrote in message <fvpde9$rm2$1(a)fred.mathworks.com>... > Having looked, it uses histc. (I promise I hadn't known > about randsample before my original reply!) > > In fact, the second output of histc is the length-n vector > which indexes the bin into which each observation falls, > i.e. what Shonkhor wanted. Hence > [x,y] = histc(rand(1,n),[0 cumsum(p)]); > returns as y Shonkhor's desired sample. > > As for the order of this approach, maybe Roger could > explain: are the other methods (theoretically) faster? I > suppose, though, i) the order calculations are asymptotic > results, and ii) it's hard to imagine an application where > this calculation, whichever method used, would be > rate-limiting. Still it's interesting to see there are > various nice ways to solve the same problem. > > Best wishes, S --------- Hello Simon, In any discussion of algorithmic order it needs to be understood that, as you say, this has only to do with asymptotic times of execution. I initially favored the binary search approach with its order (n*log(m)) to provide for cases of large values of both m and n. It would be better than (n+m)*log(n+m) for very large n and better than n*m for very large m. (I am guessing 'histc' is order ((n+m)*log(n+m)) because they may do a initial sort of combined edges and points. However, perhaps they have a simpler order (n*m) algorithm instead, as my own 'hist' does.) Whether it would be faster for numbers likely to be used is quite another question and requires experimentation. I did have the distinct impression from Shonkhor's mention of distributions that a very large value of n might be needed and the value m = 360 was evidently large. Whatever is true here, you did certainly provide Shonkhor with a very fine one-liner that doesn't require the Statistics Toolbox, Simon. When I saw it, I said to myself, "now why didn't I think of that?" Roger Stafford
From: Peter Perkins on 5 May 2008 16:30 Roger Stafford wrote: > Peter Perkins <Peter.PerkinsRemoveThis(a)mathworks.com> wrote in message >> randsample(1:360,p,n,true) >> >> to generate a sample of size n, drawing values from 1:360 independently >> with replacement. > ----------- > That will teach me not to go diving into a problem without checking to see if > someone else has already done it. I have a weakness in that direction! > Anyway, Shonkhor got three versions that don't require the Statistics Toolbox. > > According to the documentation for 'randsample', the call should be made > as: > > randsample(1:360,n,true,p); > > Does your argument sequence really work also? No, and that will teach _me_ to not check the help before typing example code. The correct argument order is what you described. Thanks.
From: Simon Preston on 6 May 2008 06:55 > I do have a question about 'randsample' if it doesn't violate Mathworks' > proprietary interests. The four-argument case is markedly more difficult to > implement than the others. Does it use something like the binary search I > used to effect the inverse cumulative distribution of p, Simon's 'histc' method, > or perhaps something akin to Alessandro's acceptance-rejection method > directly on the p values? In other words, if m = 360, is it an order(n*log(m)) > algorithm, order ((n+m)*log(n+m)) algorithm, or something altogether > different? Users would probably like to know this, but the documentation > doesn't say. > > Roger Stafford > Having looked, it uses histc. (I promise I hadn't known about randsample before my original reply!) In fact, the second output of histc is the length-n vector which indexes the bin into which each observation falls, i.e. what Shonkhor wanted. Hence [x,y] = histc(rand(1,n),[0 cumsum(p)]); returns as y Shonkhor's desired sample. As for the order of this approach, maybe Roger could explain: are the other methods (theoretically) faster? I suppose, though, i) the order calculations are asymptotic results, and ii) it's hard to imagine an application where this calculation, whichever method used, would be rate-limiting. Still it's interesting to see there are various nice ways to solve the same problem. Best wishes, S
From: alessandro mura on 7 May 2008 13:45 > However, I think you would have trouble doing polynomial fits to inverse > cdf's > for distributions with unbounded ranges. > > Roger Stafford Exactly. Usually what I need to simulate are random distribution of velocity or energy for a test particle, and in fact they have (almost) never upper bound. What I sometimes do (when it is not a maxwellian, of course), is to split the problem in two parts: I use a Neumann rejection for the main part of the distribution, for example E<E0when this method has a reasonable fraction of "good" guesses (to avoid wasting machine time), and I add a "high energy tail" (E>E0) with a polynomial fit of the inverse CDF. Ciao -- Alessandro Mura Istituto Nazionale di Astrofisica - IFSI http://pptt4.ifsi-roma.inaf.it/~mura/index.html http://www.alessandromura.it
First
|
Prev
|
Pages: 1 2 Prev: Terminate loop with user input? Next: plot smooth curves from cadence data set (csv file) |