From: Ingo Molnar

* Ting Yang <tingy(a)cs.umass.edu> wrote:

> My name is Ting Yang, a graduate student at UMASS. I am currently
> studying the Linux scheduler and virtual memory manager to solve some
> page-swapping problems. I am very excited about the new scheduler,
> CFS. After reading through your code, I think that you might be
> interested in reading this paper:

thanks for your detailed analysis - it was very interesting!

> Based on my understanding, adopting something like EEVDF in CFS
> should not be very difficult given their similarities, although I do
> not have any idea how this impacts load balancing on SMP. Is this
> worth a try?

It would definitely be interesting to try! I don't think it should
negatively impact load balancing on SMP. The current fork-time behavior
of CFS is really just a first-approximation thing, and what you propose
seems to make more sense to me too because it preserves the fluidity of
fairness. (I'd probably apply your patch even if there was no directly
measurable impact on workloads, because the more natural approaches tend
to be more maintainable in the long run.)

So by all means, please feel free to do a patch for this.

> Sorry for such a long email :-)

it made a lot of sense and was very useful :-)

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Ingo Molnar

* Mike Galbraith <efault(a)gmx.de> wrote:

> On Tue, 2007-05-01 at 23:22 +0200, Ingo Molnar wrote:
> > - interactivity: precise load calculation and load smoothing
>
> This seems to help quite a bit.

great :)

> (5 second top sample)
>
> 2636 root 15 -5 19148 15m 5324 R 73 1.5 1:42.29 0 amarok_libvisua
> 5440 root 20 0 320m 36m 8388 S 18 3.6 3:28.55 1 Xorg
> 4621 root 20 0 22776 18m 4168 R 12 1.8 0:00.63 1 cc1
> 4616 root 20 0 19456 13m 2200 R 9 1.3 0:00.43 0 cc1
>
> I no longer have to renice both X and Gforce to achieve a perfect
> display when they are sharing my box with a make -j2. X is displaying
> everything it's being fed beautifully with no help. I have to renice
> Gforce (amarok_libvisual), but given it's very heavy CPU usage, that
> seems perfectly fine.

ah! Besides OpenGL behavior and app startup performance I didn't
originally have Xorg in mind with this change, but thinking about it,
precise load calculation and load smoothing do have a positive effect
on 'coupled' workloads where, under the previous variant of CFS's load
calculation, one task component of the workload could become 'invisible'
to another task and hence cause macro-scale scheduling artifacts not
expected by humans. With smoothing these are dealt with more
consistently. Xorg can be quite a strongly coupled workload.

> No regressions noticed so far. Box is _very_ responsive under load,
> seemingly even more so than with previous releases. That is purely
> subjective, but the first impression was very distinct.

yeah, make -jN workloads (and any mixture of non-identical scheduling
patterns, which most real workloads consist of) should be handled more
consistently too by -v8.

so your workload (and Gene's workload) were in fact the ones I had
hoped not to _hurt_ with -v8 (neither of you being the hardcore gamer
type ;), and in reality -v8 ended up helping them too. These sorts of
side-effects are always welcome ;-)

Ingo
From: Balbir Singh
Ingo Molnar wrote:
> * Balbir Singh <balbir(a)linux.vnet.ibm.com> wrote:
>
>> With -v7 I would run the n/n+1 test. Basically, on a system with n
>> CPUs, I would run n+1 tasks and see how their load is distributed. I
>> usually find that the last two tasks get stuck on one CPU and each
>> gets half the CPU time of their peers. I think this issue has been
>> around for a long time, even before CFS. But while I was
>> investigating that, I found that with -v8, all n+1 tasks are stuck
>> on the same CPU.
>
> I believe this problem is specific to PowerPC - load is distributed
> fine on i686/x86_64, and your sched_debug shows a cpu_load[0] == 0 on
> CPU#2, which is 'impossible'. (I sent a few suggestions off-Cc about
> how to debug this.)
>
> Ingo

Hi Ingo,

The suggestions helped, here is a fix tested on PowerPC only.

Patch and Description
=====================


Load balancing on PowerPC is broken. Running 5 tasks on a 4 cpu system
results in all 5 tasks running on the same CPU. Based on Ingo's feedback,
I instrumented and debugged update_load_fair().

The problem is comparing s64 values with (s64)ULONG_MAX, which
evaluates to -1 (with a 64-bit unsigned long, the cast wraps around).
We then check whether exec_delta64 and fair_delta64 are greater than
(s64)ULONG_MAX (-1); if so, we assign (s64)ULONG_MAX to the respective
values - so any positive delta gets clamped to -1.

The fix is to compare these values against (s64)LONG_MAX and assign
(s64)LONG_MAX to exec_delta64 and fair_delta64 if they are greater than
(s64)LONG_MAX.

Tested on PowerPC: the regression is gone, and tasks are load balanced
as they were in -v7.

Output of top:

 5614 root      20   0  4912  784  252 R   52  0.0   3:27.49 3 bash
 5620 root      20   0  4912  784  252 R   47  0.0   3:07.38 2 bash
 5617 root      20   0  4912  784  252 R   47  0.0   3:08.18 0 bash
 5624 root      20   0  4912  784  252 R   26  0.0   1:42.97 1 bash
 5621 root      20   0  4912  784  252 R   26  0.0   1:43.14 1 bash

Tasks 5624 and 5621 getting half the CPU time of their peers is a
separate issue altogether.

Signed-off-by: Balbir Singh <balbir(a)linux.vnet.ibm.com>
---

kernel/sched.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

diff -puN kernel/sched.c~cfs-fix-load-balancing-arith kernel/sched.c
--- linux-2.6.21/kernel/sched.c~cfs-fix-load-balancing-arith	2007-05-02 16:16:20.000000000 +0530
+++ linux-2.6.21-balbir/kernel/sched.c	2007-05-02 16:16:47.000000000 +0530
@@ -1533,19 +1533,19 @@ static void update_load_fair(struct rq *
 	this_rq->prev_exec_clock = this_rq->exec_clock;
 	WARN_ON_ONCE(exec_delta64 <= 0);
 
-	if (fair_delta64 > (s64)ULONG_MAX)
-		fair_delta64 = (s64)ULONG_MAX;
+	if (fair_delta64 > (s64)LONG_MAX)
+		fair_delta64 = (s64)LONG_MAX;
 	fair_delta = (unsigned long)fair_delta64;
 
-	if (exec_delta64 > (s64)ULONG_MAX)
-		exec_delta64 = (s64)ULONG_MAX;
+	if (exec_delta64 > (s64)LONG_MAX)
+		exec_delta64 = (s64)LONG_MAX;
 	exec_delta = (unsigned long)exec_delta64;
 	if (exec_delta > TICK_NSEC)
 		exec_delta = TICK_NSEC;
 
 	idle_delta = TICK_NSEC - exec_delta;
 
-	tmp = SCHED_LOAD_SCALE * exec_delta / fair_delta;
+	tmp = (SCHED_LOAD_SCALE * exec_delta) / fair_delta;
 	tmp64 = (u64)tmp * (u64)exec_delta;
 	do_div(tmp64, TICK_NSEC);
 	this_load = (unsigned long)tmp64;
_


--
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL
From: Ingo Molnar

* Balbir Singh <balbir(a)linux.vnet.ibm.com> wrote:

> The problem is comparing s64 values with (s64)ULONG_MAX, which
> evaluates to -1. We then check whether exec_delta64 and fair_delta64
> are greater than (s64)ULONG_MAX (-1); if so, we assign (s64)ULONG_MAX
> to the respective values.

ah, indeed ...

> The fix is to compare these values against (s64)LONG_MAX and assign
> (s64)LONG_MAX to exec_delta64 and fair_delta64 if they are greater
> than (s64)LONG_MAX.
>
> Tested on PowerPC, the regression is gone, tasks are load balanced as
> they were in v7.

thanks, applied!

Ingo
From: Mark Lord
Ingo Molnar wrote:
> i'm pleased to announce release -v8 of the CFS scheduler patchset. (The
> main goal of CFS is to implement "desktop scheduling" with as high
> quality as technically possible.)
>
> The CFS patch against v2.6.21.1 (or against v2.6.20.10) can be
> downloaded from the usual place:
>
> http://people.redhat.com/mingo/cfs-scheduler/

Excellent. I've switched machines here, so my "old" single-core 2GHz notebook
is no longer being used as much. The new machine is a more or less identical
notebook, but with a 2.1GHz Core2Duo CPU.

And.. man, the mainline scheduler really sucks eggs on the dual-core!
On a single core it was nearly always "nice" enough for me,
but wow.. what a degradation with the extra CPU.

So, CFS is now a permanent add-on for kernels I use here on either machine,
and I also can say that it works GGGGGGGRRRRRRRRRRREEEEEEEEEAAAAAAAAAATTTTTTTT!!!
(please excuse the North American excuse for a "cultural" reference). ;)