From: Christoph Lameter on
On Thu, 25 Feb 2010, David Rientjes wrote:

> I don't see how memory hotadd with a new node being onlined could have
> worked fine before since slab lacked any memory hotplug notifier until
> Andi just added it.

AFAICR The cpu notifier took on that role in the past.

If what you say is true then memory hotplug has never worked before.
Kame-san?

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Pekka Enberg on
Hi Christoph,

Christoph Lameter wrote:
>> OK, can we get this issue resolved? The merge window is open and Christoph
>> seems to be unhappy with the whole patch queue. I'd hate this bug fix to miss
>> .34...
>
> Merge window? These are bugs that have to be fixed independently from a
> merge window. The question is if this is the right approach or if there is
> other stuff still lurking because we are not yet seeing the full picture.

The first set of patches from Andi are almost one month old. If this
issue progresses as swiftly as it has to this day, I foresee a rocky
road for any of them getting merged to .34 through slab.git, that's all.

Pekka
--
From: Christoph Lameter on
On Thu, 25 Feb 2010, Pekka Enberg wrote:

> The first set of patches from Andi are almost one month old. If this issue
> progresses as swiftly as it has to this day, I foresee a rocky road for any of
> them getting merged to .34 through slab.git, that's all.

Onlining and offlining memory is not that frequently used.

--
From: David Rientjes on
On Thu, 25 Feb 2010, Christoph Lameter wrote:

> > I don't see how memory hotadd with a new node being onlined could have
> > worked fine before since slab lacked any memory hotplug notifier until
> > Andi just added it.
>
> AFAICR The cpu notifier took on that role in the past.
>

The cpu notifier isn't involved if the firmware notifies the kernel that a
new ACPI memory device has been added or you write a start address to
/sys/devices/system/memory/probe. Hot-added memory devices can include
ACPI_SRAT_MEM_HOT_PLUGGABLE entries in the SRAT for x86 that assign them
non-online node ids (although all such entries get their bits set in
node_possible_map at boot), so a new pgdat may be allocated for the node's
registered range.

Slab isn't concerned about that until the memory is onlined by doing
echo online > /sys/devices/system/memory/memoryX/state for the new memory
section. This is where all the new pages are onlined, kswapd is started
on the new node, and the zonelists are built. It's also where the new
node gets set in N_HIGH_MEMORY and, thus, it's possible to call
kmalloc_node() in generic kernel code. All that is done under
MEM_GOING_ONLINE and not MEM_ONLINE, which is why I suggest the first and
fourth patch in this series may not be necessary if we prevent setting the
bit in the nodemask or building the zonelists until the slab nodelists are
ready.
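
The ordering suggested above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: every name in it (toy_cache, toy_slab_notifier, toy_online_node, toy_node_alloc_ok, MAX_NODES) is hypothetical, the enum merely stands in for MEM_GOING_ONLINE and MEM_ONLINE, and the boolean array stands in for the N_HIGH_MEMORY nodemask. It shows the per-node slab lists being allocated in the "going online" step, so that the node is published to allocators only once those lists exist.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_NODES 4	/* hypothetical node limit for this model */

/* Hypothetical stand-ins for MEM_GOING_ONLINE and MEM_ONLINE. */
enum mem_event { GOING_ONLINE, ONLINE };

/* Toy model of one slab cache: one list structure per node. */
struct toy_cache {
	void *nodelists[MAX_NODES];
};

static struct toy_cache cache;
static bool node_has_memory[MAX_NODES];	/* models N_HIGH_MEMORY */

/* Models a slab hotplug callback: allocate the node's lists early. */
static int toy_slab_notifier(enum mem_event ev, int node)
{
	if (ev != GOING_ONLINE || cache.nodelists[node])
		return 0;
	cache.nodelists[node] = calloc(1, 64);
	return cache.nodelists[node] ? 0 : -1;
}

/* Models onlining: notify slab first, publish the node afterwards. */
static int toy_online_node(int node)
{
	if (node < 0 || node >= MAX_NODES)
		return -1;
	if (toy_slab_notifier(GOING_ONLINE, node))
		return -1;	/* node stays unpublished on failure */
	node_has_memory[node] = true;
	toy_slab_notifier(ONLINE, node);
	return 0;
}

/* Models the kmalloc_node() precondition: node published, lists present. */
static bool toy_node_alloc_ok(int node)
{
	if (node < 0 || node >= MAX_NODES || !node_has_memory[node])
		return false;
	/* The ordering in toy_online_node() guarantees this holds. */
	assert(cache.nodelists[node] != NULL);
	return true;
}
```

In this model, toy_node_alloc_ok(1) is false until toy_online_node(1) has returned 0, and by then the node's lists are already in place, so no caller can see a published node without them.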
--
From: Christoph Lameter on
On Thu, 25 Feb 2010, David Rientjes wrote:

> On Thu, 25 Feb 2010, Christoph Lameter wrote:
>
> > > I don't see how memory hotadd with a new node being onlined could have
> > > worked fine before since slab lacked any memory hotplug notifier until
> > > Andi just added it.
> >
> > AFAICR The cpu notifier took on that role in the past.
> >
>
> The cpu notifier isn't involved if the firmware notifies the kernel that a
> new ACPI memory device has been added or you write a start address to
> /sys/devices/system/memory/probe. Hot-added memory devices can include
> ACPI_SRAT_MEM_HOT_PLUGGABLE entries in the SRAT for x86 that assign them
> non-online node ids (although all such entries get their bits set in
> node_possible_map at boot), so a new pgdat may be allocated for the node's
> registered range.

Yes, Andi's work makes it explicit, but there is already code in the cpu
notifier (see cpuup_prepare) that seems to have been intended to
initialize the node structures. I wonder why the hotplug people never
addressed that issue. Kame?


	list_for_each_entry(cachep, &cache_chain, next) {
		/*
		 * Set up the size64 kmemlist for cpu before we can
		 * begin anything. Make sure some other cpu on this
		 * node has not already allocated this
		 */
		if (!cachep->nodelists[node]) {
			l3 = kmalloc_node(memsize, GFP_KERNEL, node);
			if (!l3)
				goto bad;
			kmem_list3_init(l3);
			l3->next_reap = jiffies + REAPTIMEOUT_LIST3 +
			    ((unsigned long)cachep) % REAPTIMEOUT_LIST3;

			/*
			 * The l3s don't come and go as CPUs come and
			 * go. cache_chain_mutex is sufficient
			 * protection here.
			 */
			cachep->nodelists[node] = l3;
		}

		spin_lock_irq(&cachep->nodelists[node]->list_lock);
		cachep->nodelists[node]->free_limit =
			(1 + nr_cpus_node(node)) *
			cachep->batchcount + cachep->num;
		spin_unlock_irq(&cachep->nodelists[node]->list_lock);
	}


> kmalloc_node() in generic kernel code. All that is done under
> MEM_GOING_ONLINE and not MEM_ONLINE, which is why I suggest the first and
> fourth patch in this series may not be necessary if we prevent setting the
> bit in the nodemask or building the zonelists until the slab nodelists are
> ready.

That sounds good.


--