From: Pekka Enberg on
David Rientjes wrote:
> Slab lacks any memory hotplug support for nodes that are hotplugged
> without cpus being hotplugged. This is possible at least on x86
> CONFIG_MEMORY_HOTPLUG_SPARSE kernels where SRAT entries are marked
> ACPI_SRAT_MEM_HOT_PLUGGABLE and the regions of RAM represent a separate
> node. It can also be done manually by writing the start address to
> /sys/devices/system/memory/probe for kernels that have
> CONFIG_ARCH_MEMORY_PROBE set, which is how this patch was tested, and
> then onlining the new memory region.
>
> When a node is hotadded, a nodelist for that node is allocated and
> initialized for each slab cache. If this isn't completed due to a lack
> of memory, the hotadd is aborted: we have a reasonable expectation that
> kmalloc_node(nid) will work for all caches if nid is online and memory is
> available.
>
> Since nodelists must be allocated and initialized prior to the new node's
> memory actually being online, the struct kmem_list3 is allocated off-node
> due to kmalloc_node()'s fallback.
>
> When an entire node is offlined (or an online is aborted), these
> nodelists are subsequently drained and freed. If objects still exist
> either on the partial or full lists for those nodes, the offline is
> aborted. This scenario will not occur for an aborted online, however,
> since objects can never be allocated from those nodelists until the
> online has completed.
>
> Signed-off-by: David Rientjes <rientjes(a)google.com>

Andi, does this fix the oops you were seeing?

Pekka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Andi Kleen on
On Mon, Mar 01, 2010 at 02:24:43AM -0800, David Rientjes wrote:
> Slab lacks any memory hotplug support for nodes that are hotplugged
> without cpus being hotplugged. This is possible at least on x86
> CONFIG_MEMORY_HOTPLUG_SPARSE kernels where SRAT entries are marked
> ACPI_SRAT_MEM_HOT_PLUGGABLE and the regions of RAM represent a separate
> node. It can also be done manually by writing the start address to
> /sys/devices/system/memory/probe for kernels that have
> CONFIG_ARCH_MEMORY_PROBE set, which is how this patch was tested, and
> then onlining the new memory region.

The patch looks far more complicated than my simple fix.

Is more complicated now better?

-Andi

--
ak(a)linux.intel.com -- Speaking for myself only.
From: Pekka Enberg on
Hi Andi,

On Tue, Mar 2, 2010 at 2:53 PM, Andi Kleen <andi(a)firstfloor.org> wrote:
> On Mon, Mar 01, 2010 at 02:24:43AM -0800, David Rientjes wrote:
>> Slab lacks any memory hotplug support for nodes that are hotplugged
>> without cpus being hotplugged. This is possible at least on x86
>> CONFIG_MEMORY_HOTPLUG_SPARSE kernels where SRAT entries are marked
>> ACPI_SRAT_MEM_HOT_PLUGGABLE and the regions of RAM represent a separate
>> node. It can also be done manually by writing the start address to
>> /sys/devices/system/memory/probe for kernels that have
>> CONFIG_ARCH_MEMORY_PROBE set, which is how this patch was tested, and
>> then onlining the new memory region.
>
> The patch looks far more complicated than my simple fix.

I wouldn't exactly call the fallback_alloc() games "simple".

> Is more complicated now better?

Heh, heh. You can't post the oops, you don't want to rework your
patches as per review comments, and now you complain about David's
patch without one bit of technical content. I'm sorry but I must
conclude that someone is playing a prank on me because there's no way
a seasoned kernel hacker such as yourself could possibly think that
this is the way to get patches merged.

But anyway, if you have real technical concerns over the patch, please
make them known; otherwise I'd much appreciate a Tested-by tag from
you for David's patch.

Thanks,

Pekka
From: Christoph Lameter on

Not sure how this would sync with slab use during node bootstrap and
shutdown. Kame-san?

Otherwise

Acked-by: Christoph Lameter <cl(a)linux-foundation.org>


From: David Rientjes on
On Tue, 2 Mar 2010, Christoph Lameter wrote:

>
> Not sure how this would sync with slab use during node bootstrap and
> shutdown. Kame-san?
>

All the nodelist allocation and initialization is done during
MEM_GOING_ONLINE, so there should be no use of them until that
notification cycle is done and it has graduated to MEM_ONLINE: if there
is, there are even bigger problems because the zonelists haven't even
been built for that pgdat yet. I can only speculate, but since Andi's
patchset did all this during MEM_ONLINE, where the bit is already set in
node_states[N_HIGH_MEMORY] and is passable to kmalloc_node(), this is
probably why additional hacks had to be added elsewhere.

Other than that, concurrent kmem_cache_create() is protected by
cache_chain_mutex.
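
The ordering described above can be illustrated with a small userspace
model. This is only a sketch, not the patch's code: struct nodelist
stands in for struct kmem_list3, and every name in it (memory_callback,
init_nodelists, fail_after, the three example caches) is hypothetical.
It assumes the caller sends CANCEL_ONLINE after a vetoed GOING_ONLINE,
as the hotplug core is described as doing for an aborted online.

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_NODES  4
#define NUM_CACHES 3

/* Event names loosely mirror the notifications mentioned in the thread. */
enum mem_event { GOING_ONLINE, CANCEL_ONLINE, ONLINE, GOING_OFFLINE };

/* Hypothetical stand-in for struct kmem_list3: only counts live objects. */
struct nodelist {
	int live_objects;
};

struct cache {
	const char *name;
	struct nodelist *nodelists[MAX_NODES];
};

static struct cache caches[NUM_CACHES] = {
	{ .name = "cache-a" }, { .name = "cache-b" }, { .name = "cache-c" },
};

/* Test hook: number of allocations that succeed before failing; -1 = never fail. */
static int fail_after = -1;

static struct nodelist *alloc_nodelist(void)
{
	if (fail_after == 0)
		return NULL;
	if (fail_after > 0)
		fail_after--;
	return calloc(1, sizeof(struct nodelist));
}

/* Free every cache's nodelist for nid; safe on partially set-up nodes. */
static void free_nodelists(int nid)
{
	for (int i = 0; i < NUM_CACHES; i++) {
		free(caches[i].nodelists[nid]);
		caches[i].nodelists[nid] = NULL;
	}
}

/* Allocate a nodelist for nid in every cache; -1 if any allocation fails. */
static int init_nodelists(int nid)
{
	for (int i = 0; i < NUM_CACHES; i++) {
		if (caches[i].nodelists[nid])
			continue;
		caches[i].nodelists[nid] = alloc_nodelist();
		if (!caches[i].nodelists[nid])
			return -1;
	}
	return 0;
}

/* Nonzero if any cache still has objects on this node's lists. */
static int node_in_use(int nid)
{
	for (int i = 0; i < NUM_CACHES; i++) {
		struct nodelist *nl = caches[i].nodelists[nid];

		if (nl && nl->live_objects > 0)
			return 1;
	}
	return 0;
}

/* Returns 0 to let the transition proceed, -1 to veto (abort) it. */
static int memory_callback(enum mem_event ev, int nid)
{
	switch (ev) {
	case GOING_ONLINE:
		/* Set up before the node becomes visible to allocators. */
		return init_nodelists(nid);
	case CANCEL_ONLINE:
		/* Nothing can have been allocated yet, so freeing is safe. */
		free_nodelists(nid);
		return 0;
	case GOING_OFFLINE:
		if (node_in_use(nid))
			return -1;
		free_nodelists(nid);
		return 0;
	case ONLINE:
		return 0;
	}
	return 0;
}
```

The point of the model is that all setup happens in the GOING_ONLINE
step, so by the time a node reaches ONLINE every cache already has a
nodelist for it, and a failed setup can be undone without checking for
live objects.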