From: Andrew Morton on
On Wed, 28 Apr 2010 10:04:32 -0500
Jack Steiner <steiner(a)sgi.com> wrote:

> Some workloads that create a large number of small files tend to assign
> too many pages to node 0 (multi-node systems). Part of the reason is that
> the rotor (in cpuset_mem_spread_node()) used to assign nodes starts
> at node 0 for newly created tasks.

And, presumably, your secret testcase forks lots of subprocesses which
do the file creation?

> This patch changes the rotor to be initialized to a random node number
> of the cpuset.

Why random as opposed to, say, inherit-rotor-from-parent?

> Index: linux/arch/x86/mm/numa.c
> ===================================================================
> --- linux.orig/arch/x86/mm/numa.c 2010-04-28 09:44:52.422898844 -0500
> +++ linux/arch/x86/mm/numa.c 2010-04-28 09:49:39.282899779 -0500
> @@ -2,6 +2,7 @@
> #include <linux/topology.h>
> #include <linux/module.h>
> #include <linux/bootmem.h>
> +#include <linux/random.h>
>
> #ifdef CONFIG_DEBUG_PER_CPU_MAPS
> # define DBG(x...) printk(KERN_DEBUG x)
> @@ -65,3 +66,19 @@ const struct cpumask *cpumask_of_node(in
> }
> EXPORT_SYMBOL(cpumask_of_node);
> #endif
> +
> +/*
> + * Return the bit number of a random bit set in the nodemask.
> + * (returns -1 if nodemask is empty)
> + */
> +int __node_random(const nodemask_t *maskp)
> +{
> + int w, bit = -1;
> +
> + w = nodes_weight(*maskp);
> + if (w)
> + bit = bitmap_find_nth_bit(maskp->bits,
> + get_random_int() % w, MAX_NUMNODES);
> + return bit;
> +}
> +EXPORT_SYMBOL(__node_random);

I suspect random32() would suffice here. It avoids depleting the
entropy pool altogether.

> +
> +/**
> + * bitmap_find_nth_bit - find the n-th set bit in a bitmap
> + * @bitmap: pointer to bitmap
> + * @n: ordinal bit position (n-th set bit, n >= 0)
> + * @bits: number of bits in the bitmap
> + *
> + * Find the n-th bit that is set in the bitmap.
> + * Value of @n should be in range 0 <= @n < weight(bitmap), else
> + * results are undefined.
> + *
> + * The bit positions 0 through @bits - 1 are valid positions in @bitmap.
> + */
> +int bitmap_find_nth_bit(const unsigned long *bitmap, int n, int bits)
> +{
> + return bitmap_ord_to_pos(bitmap, n, bits);
> +}
> +EXPORT_SYMBOL(bitmap_find_nth_bit);

This does nothing apart from consume more stack? Better to rename
bitmap_ord_to_pos() and export it.
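
For readers following along outside the kernel tree, the selection step in
__node_random() can be sketched in plain user-space C. This is only an
illustration under stated assumptions: the *_sketch names are hypothetical,
rand() stands in for whichever kernel RNG ends up being used, and the
bit-by-bit loops are a simple stand-in, not the kernel's bitmap code.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Test whether bit i is set in the bitmap. */
static int test_bit_sketch(const unsigned long *map, int i)
{
    return (map[i / BITS_PER_WORD] >> (i % BITS_PER_WORD)) & 1UL;
}

/* Count the set bits among the first nbits bits of the bitmap. */
static int bitmap_weight_sketch(const unsigned long *map, int nbits)
{
    int w = 0;

    for (int i = 0; i < nbits; i++)
        w += test_bit_sketch(map, i);
    return w;
}

/*
 * Return the position of the n-th set bit (n counts from 0),
 * or -1 if the bitmap has fewer than n + 1 bits set.
 */
static int nth_set_bit_sketch(const unsigned long *map, int n, int nbits)
{
    assert(n >= 0);
    for (int i = 0; i < nbits; i++) {
        if (test_bit_sketch(map, i) && n-- == 0)
            return i;
    }
    return -1;
}

/*
 * Pick one of the set bits at random; returns -1 for an empty bitmap.
 * rand() is an assumption standing in for the kernel RNG.
 */
static int random_set_bit_sketch(const unsigned long *map, int nbits)
{
    int w = bitmap_weight_sketch(map, nbits);

    if (w == 0)
        return -1;
    return nth_set_bit_sketch(map, rand() % w, nbits);
}
```

With a mask whose bits 1, 3 and 5 are set, random_set_bit_sketch() returns
one of those three positions. The `% w` reduction carries a small modulo
bias, which should not matter for spreading pages across nodes.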

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Matt Mackall on
On Wed, 2010-04-28 at 15:40 -0700, Andrew Morton wrote:
> On Wed, 28 Apr 2010 10:04:32 -0500
> Jack Steiner <steiner(a)sgi.com> wrote:
>
> > Some workloads that create a large number of small files tend to assign
> > too many pages to node 0 (multi-node systems). Part of the reason is that
> > the rotor (in cpuset_mem_spread_node()) used to assign nodes starts
> > at node 0 for newly created tasks.
>
> And, presumably, your secret testcase forks lots of subprocesses which
> do the file creation?
>
> > This patch changes the rotor to be initialized to a random node number
> > of the cpuset.
>
> Why random as opposed to, say, inherit-rotor-from-parent?

That'd be fine, I bet.

> > Index: linux/arch/x86/mm/numa.c
> > ===================================================================
> > --- linux.orig/arch/x86/mm/numa.c 2010-04-28 09:44:52.422898844 -0500
> > +++ linux/arch/x86/mm/numa.c 2010-04-28 09:49:39.282899779 -0500
> > @@ -2,6 +2,7 @@
> > #include <linux/topology.h>
> > #include <linux/module.h>
> > #include <linux/bootmem.h>
> > +#include <linux/random.h>
> >
> > #ifdef CONFIG_DEBUG_PER_CPU_MAPS
> > # define DBG(x...) printk(KERN_DEBUG x)
> > @@ -65,3 +66,19 @@ const struct cpumask *cpumask_of_node(in
> > }
> > EXPORT_SYMBOL(cpumask_of_node);
> > #endif
> > +
> > +/*
> > + * Return the bit number of a random bit set in the nodemask.
> > + * (returns -1 if nodemask is empty)
> > + */
> > +int __node_random(const nodemask_t *maskp)
> > +{
> > + int w, bit = -1;
> > +
> > + w = nodes_weight(*maskp);
> > + if (w)
> > + bit = bitmap_find_nth_bit(maskp->bits,
> > + get_random_int() % w, MAX_NUMNODES);
> > + return bit;
> > +}
> > +EXPORT_SYMBOL(__node_random);
>
> I suspect random32() would suffice here. It avoids depleting the
> entropy pool altogether.

I wouldn't worry about that. get_random_int() touches the urandom pool,
which will always leave entropy around. Also, Ted and I decided over a
year ago that we should drop the whole entropy accounting framework,
which I'll get around to some rainy weekend.

--
http://selenic.com : development and support for Mercurial and Linux
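
The rotor question earlier in this message (start at zero, start at a random
node, or inherit from the parent) can be explored with a toy model. The
sketch below is a deliberate simplification and not the kernel's
cpuset_mem_spread_node(): it assumes four nodes that are all allowed, a
rotor that hands out the current node and then advances, and, for the
inherit policy, a parent whose own rotor moves by one between forks. All
names in it are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

#define NNODES 4    /* assumed node count for the toy model */

enum rotor_policy { ROTOR_ZERO, ROTOR_RANDOM, ROTOR_INHERIT };

/* Hand out the node under the rotor, then advance the rotor round-robin. */
static int spread_node(int *rotor)
{
    int node = *rotor;

    *rotor = (*rotor + 1) % NNODES;
    return node;
}

/*
 * Fork ntasks short-lived tasks that each allocate pages_per_task pages,
 * and record how many pages land on each node under the given policy.
 */
static void simulate(enum rotor_policy policy, int ntasks,
                     int pages_per_task, long pages_on_node[NNODES])
{
    int parent_rotor = 0;

    memset(pages_on_node, 0, NNODES * sizeof(pages_on_node[0]));
    for (int t = 0; t < ntasks; t++) {
        int rotor;

        switch (policy) {
        case ROTOR_RANDOM:
            rotor = rand() % NNODES;    /* random starting node */
            break;
        case ROTOR_INHERIT:
            /* assumption: the parent's rotor advances between forks */
            rotor = parent_rotor;
            parent_rotor = (parent_rotor + 1) % NNODES;
            break;
        default:
            rotor = 0;                  /* every new task starts at node 0 */
            break;
        }
        for (int p = 0; p < pages_per_task; p++)
            pages_on_node[spread_node(&rotor)]++;
    }
}
```

In this model, tasks that allocate fewer pages than there are nodes pile
everything onto the low-numbered nodes under ROTOR_ZERO, while the other
two policies spread the load. Note that ROTOR_INHERIT only helps here
because the parent's rotor is assumed to move; if it never did, every child
would inherit the same starting node.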


From: Andrew Morton on
On Wed, 28 Apr 2010 18:04:06 -0500
Matt Mackall <mpm(a)selenic.com> wrote:

> > I suspect random32() would suffice here. It avoids depleting the
> > entropy pool altogether.
>
> I wouldn't worry about that. get_random_int() touches the urandom pool,
> which will always leave entropy around. Also, Ted and I decided over a
> year ago that we should drop the whole entropy accounting framework,
> which I'll get around to some rainy weekend.

hm, so why does random32() exist? Speed?
From: Stephen Hemminger on
On Wed, 28 Apr 2010 16:12:44 -0700
Andrew Morton <akpm(a)linux-foundation.org> wrote:

> On Wed, 28 Apr 2010 18:04:06 -0500
> Matt Mackall <mpm(a)selenic.com> wrote:
>
> > > I suspect random32() would suffice here. It avoids depleting the
> > > entropy pool altogether.
> >
> > I wouldn't worry about that. get_random_int() touches the urandom pool,
> > which will always leave entropy around. Also, Ted and I decided over a
> > year ago that we should drop the whole entropy accounting framework,
> > which I'll get around to some rainy weekend.
>
> hm, so why does random32() exist? Speed?

Because I needed a cheap, fast pseudo-random source for emulation,
and it got used for more and more non-cryptographic purposes.
And, as with most random generators, people keep forgetting that
it was not intended for security use.

From: Matt Mackall on
On Wed, 2010-04-28 at 16:12 -0700, Andrew Morton wrote:
> On Wed, 28 Apr 2010 18:04:06 -0500
> Matt Mackall <mpm(a)selenic.com> wrote:
>
> > > I suspect random32() would suffice here. It avoids depleting the
> > > entropy pool altogether.
> >
> > I wouldn't worry about that. get_random_int() touches the urandom pool,
> > which will always leave entropy around. Also, Ted and I decided over a
> > year ago that we should drop the whole entropy accounting framework,
> > which I'll get around to some rainy weekend.
>
> hm, so why does random32() exist? Speed?

Yep. There are lots of RNG uses that aren't security sensitive and this
is one: the kernel won't be DoSed by an attacker that gets all pages
preferentially allocated on one node. Performance will suffer, but it's
reasonably bounded.

One of my goals is to call these sorts of trade-offs out in the API, i.e.:

get_fast_random_u32()
get_fast_random_bytes()
get_secure_random_u32()
get_secure_random_bytes()

--
http://selenic.com : development and support for Mercurial and Linux
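
To make the fast/secure split proposed above concrete, here is a user-space
sketch that borrows the four names from the message. Nothing here is an
existing kernel interface: the signatures are guesses, the fast path is a
plain xorshift32 generator chosen for illustration, and the secure path
simply reads /dev/urandom.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* State for the fast generator; any non-zero seed works for xorshift32. */
static uint32_t fast_state = 2463534242u;

/* Fast, predictable, not for security: one xorshift32 step. */
static uint32_t get_fast_random_u32(void)
{
    uint32_t x = fast_state;

    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    fast_state = x;
    return x;
}

/* Fill buf with len bytes from the fast generator. */
static void get_fast_random_bytes(void *buf, size_t len)
{
    unsigned char *p = buf;

    while (len > 0) {
        uint32_t r = get_fast_random_u32();
        size_t n = len < sizeof(r) ? len : sizeof(r);

        memcpy(p, &r, n);
        p += n;
        len -= n;
    }
}

/* Slower, suitable for secrets: returns 0 on success, -1 on failure. */
static int get_secure_random_bytes(void *buf, size_t len)
{
    FILE *f = fopen("/dev/urandom", "rb");
    size_t got;

    if (!f)
        return -1;
    got = fread(buf, 1, len, f);
    fclose(f);
    return got == len ? 0 : -1;
}

/* Convenience wrapper: one secure 32-bit value, same return convention. */
static int get_secure_random_u32(uint32_t *out)
{
    return get_secure_random_bytes(out, sizeof(*out));
}
```

Under this split, a caller such as the node rotor would use
get_fast_random_u32(), since a predictable choice of node only costs
performance, while anything an attacker must not guess would go through the
secure pair and check the return value.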

