From: Ingo Molnar on

* Felix Fietkau <nbd(a)openwrt.org> wrote:

> Ingo Molnar wrote:
> > * Michael Buesch <mb(a)bu3sch.de> wrote:
> >
> >> On Tuesday 08 September 2009 09:48:25 Ingo Molnar wrote:
> >> > Mind poking on this one to figure out whether it's all repeatable
> >> > and why that slowdown happens?
> >>
> >> I repeated the test several times, because I couldn't really believe
> >> that there's such a big difference for me, but the results were the
> >> same. I don't really know what's going on nor how to find out what's
> >> going on.
> >
> > Well that's a really memory constrained MIPS device with like 16 MB of
> > RAM or so? So having effects from small things like changing details in
> > a kernel image is entirely plausible.
>
> Normally changing small details doesn't have much of an effect. While
> 16 MB is indeed not that much, we do usually have around 8 MB free
> with a full user space running. Changes to other subsystems normally
> produce consistent and repeatable differences that seem entirely
> unrelated to memory use, so any measurable difference related to
> scheduler changes is unlikely to be related to the low amount of RAM.
> By the way, we do frequently also test the same software with devices
> that have more RAM, e.g. 32 or 64 MB and it usually behaves in a very
> similar way.

Well, Michael Buesch posted vmstat results, and they show what i have
found with my x86 simulated reproducer as well (these are Michael's
numbers):

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
1 0 0 15892 1684 5868 0 0 0 0 268 6 31 69 0 0
1 0 0 15892 1684 5868 0 0 0 0 266 2 34 66 0 0
1 0 0 15892 1684 5868 0 0 0 0 266 6 33 67 0 0
1 0 0 15892 1684 5868 0 0 0 0 267 4 37 63 0 0
1 0 0 15892 1684 5868 0 0 0 0 267 6 34 66 0 0

on average 4 context switches _per second_. The scheduler is not a
factor on this box.
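
To double-check that figure, here is a minimal sketch that averages the "cs" (context switches) and "in" (interrupts) columns of the vmstat samples quoted above. The helper name average_column is hypothetical and not part of any tool; the sample text is copied from Michael's numbers.

```python
# Hypothetical helper (not part of vmstat): average one column of
# whitespace-separated vmstat output, using the header row to find it.
VMSTAT_SAMPLE = """\
r b swpd free buff cache si so bi bo in cs us sy id wa
1 0 0 15892 1684 5868 0 0 0 0 268 6 31 69 0 0
1 0 0 15892 1684 5868 0 0 0 0 266 2 34 66 0 0
1 0 0 15892 1684 5868 0 0 0 0 266 6 33 67 0 0
1 0 0 15892 1684 5868 0 0 0 0 267 4 37 63 0 0
1 0 0 15892 1684 5868 0 0 0 0 267 6 34 66 0 0
"""


def average_column(text, column):
    """Return the mean of the named column over all sample rows."""
    lines = [line.split() for line in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    idx = header.index(column)
    values = [int(row[idx]) for row in rows]
    return sum(values) / len(values)


avg_cs = average_column(VMSTAT_SAMPLE, "cs")
avg_in = average_column(VMSTAT_SAMPLE, "in")
print(f"context switches/sec: {avg_cs:.1f}, interrupts/sec: {avg_in:.1f}")
```

For these five samples the context-switch average is a little under 5 per second, against roughly 267 interrupts per second.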

Furthermore:

| I'm currently unable to test BFS, because the device throws strange
| flash errors. Maybe the flash is broken :(

So maybe those flash errors somehow impacted the measurements as well?

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Felix Fietkau on

Ingo Molnar wrote:
> * Felix Fietkau <nbd(a)openwrt.org> wrote:
>
>> Ingo Molnar wrote:
>> > Well that's a really memory constrained MIPS device with like 16 MB of
>> > RAM or so? So having effects from small things like changing details in
>> > a kernel image is entirely plausible.
>>
>> Normally changing small details doesn't have much of an effect. While
>> 16 MB is indeed not that much, we do usually have around 8 MB free
>> with a full user space running. Changes to other subsystems normally
>> produce consistent and repeatable differences that seem entirely
>> unrelated to memory use, so any measurable difference related to
>> scheduler changes is unlikely to be related to the low amount of RAM.
>> By the way, we do frequently also test the same software with devices
>> that have more RAM, e.g. 32 or 64 MB and it usually behaves in a very
>> similar way.
>
> Well, Michael Buesch posted vmstat results, and they show what i have
> found with my x86 simulated reproducer as well (these are Michael's
> numbers):
>
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
> r b swpd free buff cache si so bi bo in cs us sy id wa
> 1 0 0 15892 1684 5868 0 0 0 0 268 6 31 69 0 0
> 1 0 0 15892 1684 5868 0 0 0 0 266 2 34 66 0 0
> 1 0 0 15892 1684 5868 0 0 0 0 266 6 33 67 0 0
> 1 0 0 15892 1684 5868 0 0 0 0 267 4 37 63 0 0
> 1 0 0 15892 1684 5868 0 0 0 0 267 6 34 66 0 0
>
> on average 4 context switches _per second_. The scheduler is not a
> factor on this box.
>
> Furthermore:
>
> | I'm currently unable to test BFS, because the device throws strange
> | flash errors. Maybe the flash is broken :(
>
> So maybe those flash errors somehow impacted the measurements as well?

I did some tests with BFS v230 vs CFS on Linux 2.6.30 on a different
MIPS device (Atheros AR2317) with 180 MHz and 16 MB RAM. When running
iperf tests, I consistently get the following results when running the
transfer from the device to my laptop:

CFS: [ 5] 0.0-60.0 sec 107 MBytes 15.0 Mbits/sec
BFS: [ 5] 0.0-60.0 sec 119 MBytes 16.6 Mbits/sec

The transfer speeds from my laptop to the device are the same with BFS
and CFS. I repeated the tests a few times just to be sure, and I will
check vmstat later.
The difference here cannot be flash related, as I ran a kernel image
with the whole userland contained in initramfs. No on-flash filesystem
was mounted or accessed.
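
As a rough sketch of how the two iperf lines compare, the snippet below recomputes the rate from the transferred volume and derives the relative difference. It assumes iperf counts MBytes as 2^20 bytes and Mbits as 10^6 bits, which is consistent with the figures above; the function names are hypothetical.

```python
# Assumption: iperf reports MBytes as 2**20 bytes and Mbits as 10**6 bits.
# Function names are hypothetical, for illustration only.

def mbits_per_sec(mbytes, seconds):
    """Convert a transferred volume in MBytes over `seconds` to Mbits/sec."""
    return mbytes * 2**20 * 8 / seconds / 1e6


def relative_gain(baseline, candidate):
    """Fractional difference of candidate over baseline."""
    return (candidate - baseline) / baseline


cfs = mbits_per_sec(107, 60.0)  # CFS: 107 MBytes in 60 s
bfs = mbits_per_sec(119, 60.0)  # BFS: 119 MBytes in 60 s
gain = relative_gain(cfs, bfs)

print(f"CFS {cfs:.1f} Mbits/sec, BFS {bfs:.1f} Mbits/sec, gain {gain:.1%}")
```

The recomputed rates round to the reported 15.0 and 16.6 Mbits/sec, and the device-to-laptop difference works out to roughly 11% in favour of BFS for this single pair of runs.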

- Felix
From: Ingo Molnar on

* Felix Fietkau <nbd(a)openwrt.org> wrote:

> Ingo Molnar wrote:
> > * Felix Fietkau <nbd(a)openwrt.org> wrote:
> >
> >> Ingo Molnar wrote:
> >> > Well that's a really memory constrained MIPS device with like 16 MB of
> >> > RAM or so? So having effects from small things like changing details in
> >> > a kernel image is entirely plausible.
> >>
> >> Normally changing small details doesn't have much of an effect. While
> >> 16 MB is indeed not that much, we do usually have around 8 MB free
> >> with a full user space running. Changes to other subsystems normally
> >> produce consistent and repeatable differences that seem entirely
> >> unrelated to memory use, so any measurable difference related to
> >> scheduler changes is unlikely to be related to the low amount of RAM.
> >> By the way, we do frequently also test the same software with devices
> >> that have more RAM, e.g. 32 or 64 MB and it usually behaves in a very
> >> similar way.
> >
> > Well, Michael Buesch posted vmstat results, and they show what i have
> > found with my x86 simulated reproducer as well (these are Michael's
> > numbers):
> >
> > procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
> > r b swpd free buff cache si so bi bo in cs us sy id wa
> > 1 0 0 15892 1684 5868 0 0 0 0 268 6 31 69 0 0
> > 1 0 0 15892 1684 5868 0 0 0 0 266 2 34 66 0 0
> > 1 0 0 15892 1684 5868 0 0 0 0 266 6 33 67 0 0
> > 1 0 0 15892 1684 5868 0 0 0 0 267 4 37 63 0 0
> > 1 0 0 15892 1684 5868 0 0 0 0 267 6 34 66 0 0
> >
> > on average 4 context switches _per second_. The scheduler is not a
> > factor on this box.
> >
> > Furthermore:
> >
> > | I'm currently unable to test BFS, because the device throws strange
> > | flash errors. Maybe the flash is broken :(
> >
> > So maybe those flash errors somehow impacted the measurements as well?
> I did some tests with BFS v230 vs CFS on Linux 2.6.30 on a different
> MIPS device (Atheros AR2317) with 180 MHz and 16 MB RAM. When running
> iperf tests, I consistently get the following results when running the
> transfer from the device to my laptop:
>
> CFS: [ 5] 0.0-60.0 sec 107 MBytes 15.0 Mbits/sec
> BFS: [ 5] 0.0-60.0 sec 119 MBytes 16.6 Mbits/sec
>
> The transfer speed from my laptop to the device are the same with BFS
> and CFS. I repeated the tests a few times just to be sure, and I will
> check vmstat later.

Which exact mainline kernel have you tried? For anything performance
related running latest upstream -git (currently at 202c467) would be
recommended.

Ingo
From: Felix Fietkau on

Ingo Molnar wrote:
> * Felix Fietkau <nbd(a)openwrt.org> wrote:
>> I did some tests with BFS v230 vs CFS on Linux 2.6.30 on a different
>> MIPS device (Atheros AR2317) with 180 MHz and 16 MB RAM. When running
>> iperf tests, I consistently get the following results when running the
>> transfer from the device to my laptop:
>>
>> CFS: [ 5] 0.0-60.0 sec 107 MBytes 15.0 Mbits/sec
>> BFS: [ 5] 0.0-60.0 sec 119 MBytes 16.6 Mbits/sec
>>
>> The transfer speed from my laptop to the device are the same with BFS
>> and CFS. I repeated the tests a few times just to be sure, and I will
>> check vmstat later.
>
> Which exact mainline kernel have you tried? For anything performance
> related running latest upstream -git (currently at 202c467) would be
> recommended.

I used the OpenWrt-patched 2.6.30. Support for the hardware that I
tested with hasn't been merged upstream yet. Do you think that the
scheduler related changes after 2.6.30 are relevant for non-SMP
performance as well? If so, I'll work on a test with latest upstream
-git with the necessary patches when I have time for it.

- Felix
From: Ingo Molnar on

* Felix Fietkau <nbd(a)openwrt.org> wrote:

> Ingo Molnar wrote:
> > * Felix Fietkau <nbd(a)openwrt.org> wrote:
> >> I did some tests with BFS v230 vs CFS on Linux 2.6.30 on a different
> >> MIPS device (Atheros AR2317) with 180 MHz and 16 MB RAM. When running
> >> iperf tests, I consistently get the following results when running the
> >> transfer from the device to my laptop:
> >>
> >> CFS: [ 5] 0.0-60.0 sec 107 MBytes 15.0 Mbits/sec
> >> BFS: [ 5] 0.0-60.0 sec 119 MBytes 16.6 Mbits/sec
> >>
> >> The transfer speed from my laptop to the device are the same with BFS
> >> and CFS. I repeated the tests a few times just to be sure, and I will
> >> check vmstat later.
> >
> > Which exact mainline kernel have you tried? For anything performance
> > related running latest upstream -git (currently at 202c467) would be
> > recommended.
>
> I used the OpenWrt-patched 2.6.30. Support for the hardware that I
> tested with hasn't been merged upstream yet. Do you think that the
> scheduler related changes after 2.6.30 are relevant for non-SMP
> performance as well? If so, I'll work on a test with latest upstream
> -git with the necessary patches when I have time for it.

Don't know - it's hard to tell what happens without basic analysis tools.
Is there _any_ way to profile what happens on that system? (Do hrtimers
work on it that could be used to profile it?)

Ingo