From: Andrew Morton on
On Wed, 07 May 2008 11:41:52 +0800 "Zhang, Yanmin" <yanmin_zhang(a)linux.intel.com> wrote:

> As system idle is more than 50%, the callers of schedule/schedule_timeout are
> important information.
> 1) lock_kernel causes most of the schedule/schedule_timeout calls;
> 2) When lock_kernel calls down, and down calls __down, __down calls schedule_timeout
> many times in a loop;

Really? Are you sure? That would imply that we keep on waking up tasks
which then fail to acquire the lock. But the code pretty plainly doesn't
do that.

Odd.

> 3) Callers of lock_kernel are sys_fcntl/vfs_ioctl/tty_release/chrdev_open.

Still :(
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Zhang, Yanmin on

On Tue, 2008-05-06 at 20:59 -0700, Andrew Morton wrote:
> On Wed, 07 May 2008 11:41:52 +0800 "Zhang, Yanmin" <yanmin_zhang(a)linux.intel.com> wrote:
>
> > As system idle is more than 50%, the callers of schedule/schedule_timeout are
> > important information.
> > 1) lock_kernel causes most of the schedule/schedule_timeout calls;
> > 2) When lock_kernel calls down, and down calls __down, __down calls schedule_timeout
> > many times in a loop;
>
> Really? Are you sure? That would imply that we keep on waking up tasks
> which then fail to acquire the lock. But the code pretty plainly doesn't
> do that.
Yes, this is based entirely on the data.
The data are call counts between functions. Initially, I collected only the callers
of schedule and schedule_timeout. I then found that most schedule/schedule_timeout calls
come from __down, which is called by down, so I changed the kernel to collect call
information for more functions.

Comparing the call counts of down, __down and schedule_timeout, we can see that
schedule_timeout is called by __down 222330308 times, but __down is called only 153190 times.
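
For scale, the two counts above work out to roughly 1450 schedule_timeout calls per __down call. A small sketch of that arithmetic in user-space C follows; the function names `sleeps_per_down` and `report_ratio` are illustrative only and are not part of any patch in this thread.

```c
#include <stdio.h>

/* Average number of schedule_timeout() calls per __down() call. */
double sleeps_per_down(unsigned long long sleep_calls,
                       unsigned long long down_calls)
{
    if (down_calls == 0)
        return 0.0;   /* nothing was counted, avoid dividing by zero */
    return (double)sleep_calls / (double)down_calls;
}

/* Print the ratio for the two counts quoted in this mail. */
void report_ratio(void)
{
    printf("%.0f schedule_timeout calls per __down call\n",
           sleeps_per_down(222330308ULL, 153190ULL));
}
```

A ratio this far above 1 is what prompts the question of whether waiters are being woken repeatedly without getting the lock.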

>
> Odd.
>
> > 3) Callers of lock_kernel are sys_fcntl/vfs_ioctl/tty_release/chrdev_open.
> Still :(
Yes. The data contain some error, but the error is small. My patch doesn't use a lock
to protect the counters, because locking might introduce too much overhead.
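
To illustrate the trade-off described here, below is a minimal user-space sketch, assuming C11 atomics, of an unlocked counter next to an atomic one. The instrumentation patch itself is not shown in this thread, so `call_stats`, `NR_CALL_SITES`, and all the helper functions are hypothetical names, not its actual code.

```c
#include <stdatomic.h>
#include <stddef.h>

#define NR_CALL_SITES 4   /* hypothetical number of instrumented callers */

struct call_stats {
    unsigned long racy[NR_CALL_SITES];   /* plain counters, no protection */
    atomic_ulong  exact[NR_CALL_SITES];  /* atomic counters, never lose updates */
};

/* Reset every counter to zero. */
void call_stats_init(struct call_stats *s)
{
    for (size_t i = 0; i < NR_CALL_SITES; i++) {
        s->racy[i] = 0;
        atomic_init(&s->exact[i], 0);
    }
}

/* Cheap: a plain increment that concurrent callers can overwrite. */
void count_racy(struct call_stats *s, size_t site)
{
    s->racy[site]++;
}

/* Exact: an atomic read-modify-write; relaxed ordering is enough for a tally. */
void count_exact(struct call_stats *s, size_t site)
{
    atomic_fetch_add_explicit(&s->exact[site], 1, memory_order_relaxed);
}

/* Read back one of the atomic counters. */
unsigned long read_exact(struct call_stats *s, size_t site)
{
    return atomic_load_explicit(&s->exact[site], memory_order_relaxed);
}
```

With several threads, `count_racy` can lose increments when two CPUs update the same slot at once, which is the kind of small error mentioned above; `count_exact` does not lose updates but pays for an atomic operation on every call.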


From: Ingo Molnar on

* Ingo Molnar <mingo(a)elte.hu> wrote:

> > 3) Callers of lock_kernel are sys_fcntl/vfs_ioctl/tty_release/chrdev_open.
>
> that's one often-forgotten BKL site: about 1000 ioctls are still
> running under the BKL. The TTY one is hurting the most. [...]

although it's an unlocked_ioctl() now in 2.6.26, so all the BKL locking
has been nicely pushed down to deep inside the tty code.

> [...] To make sure it's only that BKL acquire/release that hurts,
> could you try the hack patch below, does it make any difference to
> performance?

if you use a serial console you will need the updated patch below.

Ingo

---------------------->
Subject: no: tty bkl
From: Ingo Molnar <mingo(a)elte.hu>
Date: Wed May 07 08:21:22 CEST 2008

Signed-off-by: Ingo Molnar <mingo(a)elte.hu>
---
drivers/char/tty_io.c | 5 +++--
drivers/serial/serial_core.c | 2 +-
2 files changed, 4 insertions(+), 3 deletions(-)

Index: linux/drivers/char/tty_io.c
===================================================================
--- linux.orig/drivers/char/tty_io.c
+++ linux/drivers/char/tty_io.c
@@ -2844,9 +2844,10 @@ out:

 static int tty_release(struct inode *inode, struct file *filp)
 {
-	lock_kernel();
+	/* DANGEROUS - can crash your kernel! */
+//	lock_kernel();
 	release_dev(filp);
-	unlock_kernel();
+//	unlock_kernel();
 	return 0;
 }

Index: linux/drivers/serial/serial_core.c
===================================================================
--- linux.orig/drivers/serial/serial_core.c
+++ linux/drivers/serial/serial_core.c
@@ -1241,7 +1241,7 @@ static void uart_close(struct tty_struct
 	struct uart_state *state = tty->driver_data;
 	struct uart_port *port;

-	BUG_ON(!kernel_locked());
+//	BUG_ON(!kernel_locked());

 	if (!state || !state->port)
 		return;

From: Ingo Molnar on

* Zhang, Yanmin <yanmin_zhang(a)linux.intel.com> wrote:

> 3) Callers of lock_kernel are sys_fcntl/vfs_ioctl/tty_release/chrdev_open.

that's one often-forgotten BKL site: about 1000 ioctls are still running
under the BKL. The TTY one is hurting the most. To make sure it's only
that BKL acquire/release that hurts, could you try the hack patch below,
does it make any difference to performance?

but even if taking the BKL does hurt, it's quite unexpected to cause a
40% drop. Perhaps AIM7 has tons of threads that exit at once and all try
to release their controlling terminal or something like that?

Ingo

------------------------>
Subject: DANGEROUS tty hack: no BKL
From: Ingo Molnar <mingo(a)elte.hu>
Date: Wed May 07 08:21:22 CEST 2008

NOT-Signed-off-by: Ingo Molnar <mingo(a)elte.hu>
---
drivers/char/tty_io.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

Index: linux/drivers/char/tty_io.c
===================================================================
--- linux.orig/drivers/char/tty_io.c
+++ linux/drivers/char/tty_io.c
@@ -2844,9 +2844,10 @@ out:

 static int tty_release(struct inode *inode, struct file *filp)
 {
-	lock_kernel();
+	/* DANGEROUS - can crash your kernel! */
+//	lock_kernel();
 	release_dev(filp);
-	unlock_kernel();
+//	unlock_kernel();
 	return 0;
 }

From: Zhang, Yanmin on

On Tue, 2008-05-06 at 19:39 +0200, Ingo Molnar wrote:
> * Andrew Morton <akpm(a)linux-foundation.org> wrote:
>
> > Finally: how come we regressed by swapping the semaphore
> > implementation anyway? We went from one sleeping lock implementation
> > to another - I'd have expected performance to be pretty much the same.
> i.e. we'll always keep yet another task in flight. This can mask wakeup
> latencies especially when it takes time.
>
> The patch (hack) below tries to emulate this weirdness - it 'kicks'
> another task as well and keeps it busy. Most of the time this just
> causes extra scheduling, but if AIM7 is _just_ saturating the number of
> CPUs, it might make a difference. Yanmin, does the patch below make any
> difference to the AIM7 results?
I tested it on my 8-core Stoakley machine, and the result is 12% worse than that of
pure 2.6.26-rc1.

-yanmin

>
> ( it would be useful data to get a meaningful context switch trace from
> the whole regressed workload, and compare it to a context switch trace
> with the revert added. )
>
> Ingo
>
> ---
> kernel/semaphore.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> Index: linux/kernel/semaphore.c
> ===================================================================
> --- linux.orig/kernel/semaphore.c
> +++ linux/kernel/semaphore.c
> @@ -261,4 +261,14 @@ static noinline void __sched __up(struct
>  	list_del(&waiter->list);
>  	waiter->up = 1;
>  	wake_up_process(waiter->task);
> +
> +	if (likely(list_empty(&sem->wait_list)))
> +		return;
> +	/*
> +	 * Opportunistically wake up another task as well but do not
> +	 * remove it from the list:
> +	 */
> +	waiter = list_first_entry(&sem->wait_list,
> +				  struct semaphore_waiter, list);
> +	wake_up_process(waiter->task);
>  }
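
One way to read the hack quoted above: a waiter that is woken without having been handed the semaphore (its `up` flag is still 0) has to go back to sleep, and every such round trip costs one more sleep call. The toy user-space model below is not kernel code; it only counts those round trips. The names `toy_waiter` and `toy_down_sleeps`, and the `wakeups` and `handoff_at` parameters, are made up for illustration; only the `up` flag mirrors the `waiter->up` field visible in the patch.

```c
#include <stddef.h>

/* Toy stand-in for a semaphore waiter; only the 'up' flag mirrors the patch. */
struct toy_waiter {
    int up;   /* set to 1 once the semaphore is handed to this waiter */
};

/*
 * Simulate one blocked down(): the waiter is woken up to 'wakeups' times,
 * and the wakeup numbered 'handoff_at' (1-based) is the one that also
 * sets w->up. Returns how many times the waiter slept, i.e. how many
 * schedule_timeout()-like calls this single blocked down() produced.
 */
size_t toy_down_sleeps(struct toy_waiter *w, size_t wakeups, size_t handoff_at)
{
    size_t sleeps = 0;

    w->up = 0;
    for (size_t i = 1; i <= wakeups; i++) {
        sleeps++;              /* models one sleep followed by a wakeup */
        if (i == handoff_at)
            w->up = 1;         /* this wakeup came with the semaphore */
        if (w->up)
            break;             /* woken as the new owner: stop waiting */
        /* otherwise it was an opportunistic wakeup: loop and sleep again */
    }
    return sleeps;
}
```

In this model, `toy_down_sleeps(&w, 10, 1)` sleeps once, while a waiter that receives three opportunistic wakeups before the real handoff sleeps four times, so the sleep count can exceed the number of blocked `down()` calls.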
