From: Stefan Behnel on
southof40, 20.06.2010 12:19:
> I have a list of N Vehicle objects - the only possible vehicles are
> cars, bikes, trucks.
>
> I want to select an object from the list with a probability of : cars
> 0.7, bikes 0.3, trucks 0.1.
>
> I've currently implemented this by creating another list in which each
> car object from the original list appears 7 times, each bike 3 times
> and each truck once. I then pick at random from that list.
>
> This works but seems very clunky to me.

Why? It's a very simple, generic, easy to understand and fast solution to
the problem.
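[For concreteness, the replication approach the OP describes might look like the following sketch; the string names stand in for the Vehicle objects, and the integer weights 7/3/1 follow the counts given above.]

```python
import random

# The replication approach from the original post: each object appears
# in the expanded list weight-many times (strings stand in for Vehicles).
weights = {"car": 7, "bike": 3, "truck": 1}

# Build the expanded list: "car" 7 times, "bike" 3 times, "truck" once.
pool = [name for name, w in weights.items() for _ in range(w)]

# A uniform pick from the expanded list is then a weighted pick.
chosen = random.choice(pool)
```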

Stefan

From: Cameron Simpson on
On 20Jun2010 12:44, Stefan Behnel <stefan_ml(a)behnel.de> wrote:
| southof40, 20.06.2010 12:19:
| >I have a list of N Vehicle objects - the only possible vehicles are
| >cars, bikes, trucks.
| >
| >I want to select an object from the list with a probability of : cars
| >0.7, bikes 0.3, trucks 0.1.
| >
| >I've currently implemented this by creating another list in which each
| >car object from the original list appears 7 times, each bike 3 times
| >and each truck once. I then pick at random from that list.
| >
| >This works but seems very clunky to me.
|
| Why? It's a very simple, generic, easy to understand and fast
| solution to the problem.

Only three of those four, if you want to be precise in your selections.
Suppose he wants probabilities 0.7432, 0.3765, 0.1087654?
The required list needs to be Very Long to achieve an accurate
representation, and thus Very Slow to construct/populate.

A faster approach is to make a list of the running sums of the
proportions as one counts along the choices, thus 0.7, 1.0, 1.1 in the
example given (0.7, 0.7+0.3, 0.7+0.3+0.1). Then choose a value in the
range 0.0 to the total (1.1) using the pseudo-random function of your
choice, such as those in the random module, and binary search the list
for the matching item.
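[A minimal sketch of the cumulative-sum-plus-binary-search approach described above, using the thread's example proportions 0.7, 0.3, 0.1 (which, as noted, sum to 1.1); the names are illustrative, not from the thread.]

```python
import bisect
import random

items = ["car", "bike", "truck"]      # stand-ins for the Vehicle objects
proportions = [0.7, 0.3, 0.1]

# Running sums: 0.7, 0.7+0.3, 0.7+0.3+0.1  ->  [0.7, 1.0, 1.1]
cumulative = []
total = 0.0
for p in proportions:
    total += p
    cumulative.append(total)

def weighted_choice():
    # Draw uniformly in [0, total], then binary search for the first
    # running sum at or above the draw.
    x = random.uniform(0.0, total)
    return items[bisect.bisect_left(cumulative, x)]
```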

The list scales linearly with the number of choices, not exponentially
with the precision of the proportions. The search is logarithmic in the
number of choices. Beyond a very small number of choices the former
will dominate.

Cheers,
--
Cameron Simpson <cs(a)zip.com.au> DoD#743
http://www.cskk.ezoshosting.com/cs/

.... you could spend *all day* customizing the title bar. Believe me. I
speak from experience. - Matt Welsh
From: duncan smith on
southof40 wrote:
> I have a list of N Vehicle objects - the only possible vehicles are
> cars, bikes, trucks.
>
> I want to select an object from the list with a probability of : cars
> 0.7, bikes 0.3, trucks 0.1.
>
> I've currently implemented this by creating another list in which each
> car object from the original list appears 7 times, each bike 3 times
> and each truck once. I then pick at random from that list.
>
> This works but seems very clunky to me. Can anyone suggest a better
> data structure which would support the 'weighted randomness' I'm
> after ?
>
> I'm not fixed on the idea of using a list - could be a dictionary,
> tree whatever .
>
> Thanks in advance.
>

Try googling for "alias table". Efficient if you're selecting many
random objects from the same mass function. Better than binary search
on the cumulative mass function in big-O terms (but maybe not in
practical terms for reasonable sized problems). Neither approach is as
efficient as the one you outlined, but the required list size might be
an issue for some sets of probabilities.

Duncan
From: duncan smith on
duncan smith wrote:
> southof40 wrote:
>> I have a list of N Vehicle objects - the only possible vehicles are
>> cars, bikes, trucks.
>>
>> I want to select an object from the list with a probability of : cars
>> 0.7, bikes 0.3, trucks 0.1.
>>
>> I've currently implemented this by creating another list in which each
>> car object from the original list appears 7 times, each bike 3 times
>> and each truck once. I then pick at random from that list.
>>
>> This works but seems very clunky to me. Can anyone suggest a better
>> data structure which would support the 'weighted randomness' I'm
>> after ?
>>
>> I'm not fixed on the idea of using a list - could be a dictionary,
>> tree whatever .
>>
>> Thanks in advance.
>>
>
> Try googling for "alias table". Efficient if you're selecting many
> random objects from the same mass function. Better than binary search
> on the cumulative mass function in big-O terms (but maybe not in
> practical terms for reasonable sized problems). Neither approach is as
> efficient as the one you outlined, but the required list size might be
> an issue for some sets of probabilities.
>
> Duncan

BTW, the alias table approach is basically a means of getting round the
problem of needing large lists. Assuming your probabilities should be
0.7, 0.2 and 0.1 you could construct a list of 3 objects. The first
would be 100% car, the second would be 60% bike and 40% car, the third
would be 30% truck and 70% car. Choose an object at random, then the
vehicle type according to the mass function associated with the object.
The alias table approach only requires the generation of a single
uniform random variate and a single comparison (once you've constructed it).
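[A sketch of the construction and draw steps described above, in the style of Vose's formulation of the alias method, using the corrected weights 0.7, 0.2, 0.1; the function names and table-building details are illustrative, not from the thread.]

```python
import random

def build_alias(probs):
    # Rescale so the average cell mass is 1, then pair each light cell
    # with a heavy "alias" that donates the missing mass.
    n = len(probs)
    scaled = [p * n for p in probs]
    prob = [0.0] * n
    alias = [0] * n
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]               # light cell keeps its own mass
        alias[s] = l                      # the rest comes from the alias
        scaled[l] -= 1.0 - scaled[s]      # heavy cell donates that mass
        (small if scaled[l] < 1.0 else large).append(l)
    for i in small + large:               # leftovers are full cells
        prob[i] = 1.0
    return prob, alias

def alias_draw(items, prob, alias):
    # One uniform variate: the integer part picks the cell, the
    # fractional part decides between the cell and its alias.
    u = random.random() * len(items)
    i = int(u)
    return items[i] if (u - i) < prob[i] else items[alias[i]]
```

For the weights 0.7, 0.2, 0.1 this builds exactly the three cells described above: cell 0 is 100% car, cell 1 is 60% bike with car as its alias, and cell 2 is 30% truck with car as its alias.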

Duncan