From: David Rientjes
With NUMA emulation, it's possible for a single cpu to be bound to
multiple nodes, since more than one emulated node may have affinity to
the cpu if those nodes are allocated on a physical node that is local
to it.

APIC ids must therefore be mapped to the lowest node ids to maintain
generic kernel use of functions such as cpu_to_node() that determine
device affinity. For example, if a device has proximity to physical node
1 and a cpu happens to be mapped to a higher emulated node id 8, the
proximity may not be correctly determined by comparison in generic code
even though the cpu may be truly local and allocated on physical node 1.
When this happens, the true topology of the machine isn't accurately
represented in the emulated environment; although this isn't critical
to the system's uptime, any generic code that is NUMA aware benefits
from the physical topology being accurately represented.

This can affect any system that maps multiple APIC ids to a single node
and is booted with numa=fake=N where N is greater than the number of
physical nodes.

Signed-off-by: David Rientjes <rientjes(a)google.com>
---
arch/x86/mm/srat_64.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
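
As an illustration of the description above, here is a minimal userspace
sketch of the "lowest fake node id wins" rule. It is not the kernel
code: struct fake_node, map_apicids(), the 8-entry table size and the
-1 value used for NUMA_NO_NODE are all assumptions made for this
example, and the lookup of the physical node for each fake node is
reduced to a plain field.

```c
#define MAX_LOCAL_APIC 8     /* tiny table, for illustration only */
#define NUMA_NO_NODE   (-1)  /* stand-in "unassigned" value (assumed) */

/* Hypothetical fake node: only the physical node it sits on matters here. */
struct fake_node {
	int phys_nid;
};

/*
 * Model of the remapping loop: fake nodes are visited in ascending
 * order, and an APIC id is only assigned while it is still unassigned,
 * so it keeps the lowest fake node id carved from its physical node.
 */
static void map_apicids(const int *apicid_to_node, int *fake_apicid_to_node,
			const struct fake_node *nodes, int num_nodes)
{
	int i, j;

	for (j = 0; j < MAX_LOCAL_APIC; j++)
		fake_apicid_to_node[j] = NUMA_NO_NODE;

	for (i = 0; i < num_nodes; i++)
		for (j = 0; j < MAX_LOCAL_APIC; j++)
			if (apicid_to_node[j] == nodes[i].phys_nid &&
			    fake_apicid_to_node[j] == NUMA_NO_NODE)
				fake_apicid_to_node[j] = i;
}
```

With two physical nodes holding APIC ids 0-3 and 4-7, and four fake
nodes split as {0, 0, 1, 1} across them, this maps APIC ids 0-3 to fake
node 0 and APIC ids 4-7 to fake node 2. Without the NUMA_NO_NODE check,
the later fake nodes would overwrite the earlier ones and the result
would be fake nodes 1 and 3 instead.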

diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
--- a/arch/x86/mm/srat_64.c
+++ b/arch/x86/mm/srat_64.c
@@ -461,7 +461,8 @@ void __init acpi_fake_nodes(const struct bootnode *fake_nodes, int num_nodes)
 		 * node, it must now point to the fake node ID.
 		 */
 		for (j = 0; j < MAX_LOCAL_APIC; j++)
-			if (apicid_to_node[j] == nid)
+			if (apicid_to_node[j] == nid &&
+			    fake_apicid_to_node[j] == NUMA_NO_NODE)
 				fake_apicid_to_node[j] = i;
 	}
 	for (i = 0; i < num_nodes; i++)