From: Suresh Siddha
This is an updated version of the patchset posted earlier at
http://lkml.org/lkml/2009/12/10/470

Description:
The existing nohz idle load balancing logic uses a pull model: one
idle load balancer CPU is nominated on any partially idle system, and that
balancer CPU does not go into nohz mode. Using its periodic tick, the
balancer does the idle balancing on behalf of all the CPUs in nohz mode.

This is not optimal and has a few issues:
* The balancer continues to take periodic ticks and wakes up
frequently (at the HZ rate), even though it may not have any rebalancing
to do on behalf of any of the idle CPUs.
* On x86 and on CPUs whose APIC timer stops when the CPU is idle, this
periodic wakeup can result in an additional interrupt on the CPU doing the
timer broadcast.

The alternative is to have a push model, where all idle CPUs can enter nohz
mode and any busy CPU kicks one of the idle CPUs to take care of idle
balancing on behalf of a group of idle CPUs.
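
As a rough illustration of the push model described above, the following
userspace C sketch tracks idle CPUs in a bitmask and lets a busy CPU pick
one of them to kick. It is not code from the patches: NR_CPUS, NO_CPU,
nohz_idle_mask and all three function names are hypothetical, and picking
the lowest-numbered idle CPU is an assumption made only to keep the sketch
simple.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8      /* hypothetical CPU count for this sketch */
#define NO_CPU  (-1)   /* returned when no CPU is in nohz idle */

/* Bit n set means CPU n is idle with its tick stopped (nohz mode). */
static uint32_t nohz_idle_mask;

/* An idle CPU stops its tick and records itself in the mask. */
static void enter_nohz_idle(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    nohz_idle_mask |= (uint32_t)1 << cpu;
}

/* A CPU that becomes busy again leaves the mask. */
static void exit_nohz_idle(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    nohz_idle_mask &= ~((uint32_t)1 << cpu);
}

/*
 * Called from a busy CPU: choose the idle CPU to kick.
 * Assumption: the lowest-numbered idle CPU is chosen.
 */
static int pick_idle_balancer(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (nohz_idle_mask & ((uint32_t)1 << cpu))
            return cpu;
    return NO_CPU;
}
```

In this model every idle CPU can enter nohz mode; nothing wakes up until a
busy CPU calls pick_idle_balancer() and kicks the CPU it returns.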

The following patches switch the idle load balancer to this push approach.

Updates from the previous version:

* A busy CPU uses send_remote_softirq() to invoke SCHED_SOFTIRQ on the
idle load balancing CPU, which then does the load balancing on behalf of
all the idle CPUs.

* Dropped the per-NUMA-node nohz load balancing, as it doesn't detect
certain imbalance scenarios. This will be addressed later.
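
The remote kick in the first update item can be modelled in userspace
roughly as follows. This sketch does not call send_remote_softirq() or any
other kernel API; model_kick(), model_run_softirq(), MODEL_SCHED_SOFTIRQ,
pending[], idle_mask and balance_count[] are hypothetical stand-ins. It only
shows a busy CPU marking a softirq pending on the chosen idle CPU, which
then balances once on behalf of every idle CPU.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8               /* hypothetical CPU count for this sketch */
#define MODEL_SCHED_SOFTIRQ 0   /* hypothetical softirq number for the model */

static uint32_t pending[NR_CPUS];   /* per-CPU pending softirq bits */
static uint32_t idle_mask;          /* bit n set: CPU n is idle */
static int balance_count[NR_CPUS];  /* balancing passes done for each CPU */

/* Busy CPU side: mark softirq nr pending on the target CPU. */
static void model_kick(int target, int nr)
{
    assert(target >= 0 && target < NR_CPUS);
    pending[target] |= (uint32_t)1 << nr;
}

/*
 * Target CPU side: if the scheduler softirq is pending, clear it and
 * balance on behalf of all idle CPUs. Returns the number of idle CPUs
 * handled, or 0 if nothing was pending.
 */
static int model_run_softirq(int cpu)
{
    int done = 0;

    assert(cpu >= 0 && cpu < NR_CPUS);
    if (!(pending[cpu] & ((uint32_t)1 << MODEL_SCHED_SOFTIRQ)))
        return 0;
    pending[cpu] &= ~((uint32_t)1 << MODEL_SCHED_SOFTIRQ);

    for (int i = 0; i < NR_CPUS; i++) {
        if (idle_mask & ((uint32_t)1 << i)) {
            balance_count[i]++;   /* stand-in for the real rebalance work */
            done++;
        }
    }
    return done;
}
```

A kicked CPU that finds nothing pending does no work, which is the point of
the push approach: balancing runs only when a busy CPU asks for it.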

Signed-off-by: Suresh Siddha <suresh.b.siddha(a)intel.com>
Signed-off-by: Venkatesh Pallipadi <venki(a)google.com>
