From: Andrew Morton on
On Tue, 29 Jun 2010 14:22:26 +0530
"Shilimkar, Santosh" <santosh.shilimkar(a)ti.com> wrote:

> Hi,
>
> I have faced similar issue as what is being described in below with
> latest kernel.
>
> ------------------------------------------------
> https://patchwork.kernel.org/patch/103347/
>
> When a secondary CPU is being brought up, it is not uncommon for
> printk() to be invoked when cpu_online(smp_processor_id()) == 0. The
> case that I witnessed personally was on MIPS:
>
> http://lkml.org/lkml/2010/5/30/4
>
> If (can_use_console() == 0), printk() will spool its output to log_buf
> and it will be visible in "dmesg", but that output will NOT be echoed to
> the console until somebody calls release_console_sem() from a CPU that
> is online. Therefore, the boot time messages from the new CPU can get
> stuck in "limbo" for a long time, and might suddenly appear on the
> screen when a completely unrelated event (e.g. "eth0: link is down")
> occurs.
>
> This patch modifies the console code so that any pending messages are
> automatically flushed out to the console whenever a CPU hotplug
> operation completes successfully or aborts.
>
> -----------------------------------------------
>
> Above patch fixes only half of the problem. I mean the cpu online
> path prints are coming on the console.
>
> But similar problem also exist if there are prints in the cpu offline
> path. I got that fixed by adding below patch on top of you patch.
>
> diff --git a/kernel/printk.c b/kernel/printk.c
> index d370b74..f4d7352 100644
> --- a/kernel/printk.c
> +++ b/kernel/printk.c
> @@ -982,6 +982,9 @@ static int __cpuinit console_cpu_notify(struct notifier_bloc
> switch (action) {
> case CPU_ONLINE:
> case CPU_UP_CANCELED:
> + case CPU_DEAD:
> + case CPU_DYING:
> + case CPU_DOWN_FAILED:
> if (try_acquire_console_sem() == 0)
> release_console_sem();
> }

The patch lacked a suitable title. I called it "console: flush log
messages for more cpu-hotplug events".

The patch lacks a Signed-off-by:. Please send one.

The patch has its tabs replaced with spaces. I fixed that. Please
reconfigure your email client for next time.

The code which is being patch has changed. It now does

acquire_console_sem();
release_console_sem();

so the code may no longer work - perhaps it now deadlocks (unlikely).
Please retest?

Finally, I don't understand the patch :( Who is sending out CPU_DEAD,
CPU_DYING or CPU_DOWN_FAILED events during kernel boot? I'd have
thought that those events simply aren't occurring, and that the patch
has no effect. Confused - please explain further.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Shilimkar, Santosh on
Thanks Andrew for looking into this.
> -----Original Message-----
> From: Andrew Morton [mailto:akpm(a)linux-foundation.org]
> Sent: Tuesday, August 03, 2010 4:15 AM
> To: Shilimkar, Santosh
> Cc: Kevin Cernekee; linux-kernel(a)vger.kernel.org; Russell King - ARM Linux
> Subject: Re: Additional fix : (was [v2]printk: fix delayed messages from
> CPU hotplug events)
>
> On Tue, 29 Jun 2010 14:22:26 +0530
> "Shilimkar, Santosh" <santosh.shilimkar(a)ti.com> wrote:
>
> > Hi,
> >
> > I have faced similar issue as what is being described in below with
> > latest kernel.
> >
> > ------------------------------------------------
> > https://patchwork.kernel.org/patch/103347/
> >
> > When a secondary CPU is being brought up, it is not uncommon for
> > printk() to be invoked when cpu_online(smp_processor_id()) == 0. The
> > case that I witnessed personally was on MIPS:
> >
> > http://lkml.org/lkml/2010/5/30/4
> >
> > If (can_use_console() == 0), printk() will spool its output to log_buf
> > and it will be visible in "dmesg", but that output will NOT be echoed to
> > the console until somebody calls release_console_sem() from a CPU that
> > is online. Therefore, the boot time messages from the new CPU can get
> > stuck in "limbo" for a long time, and might suddenly appear on the
> > screen when a completely unrelated event (e.g. "eth0: link is down")
> > occurs.
> >
> > This patch modifies the console code so that any pending messages are
> > automatically flushed out to the console whenever a CPU hotplug
> > operation completes successfully or aborts.
> >
> > -----------------------------------------------
> >
> > Above patch fixes only half of the problem. I mean the cpu online
> > path prints are coming on the console.
> >
> > But similar problem also exist if there are prints in the cpu offline
> > path. I got that fixed by adding below patch on top of you patch.
> >
> > diff --git a/kernel/printk.c b/kernel/printk.c
> > index d370b74..f4d7352 100644
> > --- a/kernel/printk.c
> > +++ b/kernel/printk.c
> > @@ -982,6 +982,9 @@ static int __cpuinit console_cpu_notify(struct
> notifier_bloc
> > switch (action) {
> > case CPU_ONLINE:
> > case CPU_UP_CANCELED:
> > + case CPU_DEAD:
> > + case CPU_DYING:
> > + case CPU_DOWN_FAILED:
> > if (try_acquire_console_sem() == 0)
> > release_console_sem();
> > }
>
> The patch lacked a suitable title. I called it "console: flush log
> messages for more cpu-hotplug events".
>
This diff was on top of already posted RFC patch. I will combine them

> The patch lacks a Signed-off-by:. Please send one.
>
> The patch has its tabs replaced with spaces. I fixed that. Please
> reconfigure your email client for next time.
>
> The code which is being patch has changed. It now does
>
> acquire_console_sem();
> release_console_sem();
>
> so the code may no longer work - perhaps it now deadlocks (unlikely).
> Please retest?
Retested. No deadlock observed
>
> Finally, I don't understand the patch :( Who is sending out CPU_DEAD,
> CPU_DYING or CPU_DOWN_FAILED events during kernel boot? I'd have
> thought that those events simply aren't occurring, and that the patch
> has no effect. Confused - please explain further.
These events can come during the CPU hotplug(offline). Below is the
complete patch. Also attaching it in case some email format screw
up.

-----------------------------------------------
From b99271ce43cc82cda28447444004933d0f218ee3 Mon Sep 17 00:00:00 2001
From: Santosh Shilimkar <santosh.shilimkar(a)ti.com>
Date: Tue, 3 Aug 2010 12:58:22 +0530
Subject: [PATCH] console: flush delayed log messages from cpu-hotplug events

When a secondary CPU is being brought up, it is not uncommon for
printk() to be invoked when cpu_online(smp_processor_id()) == 0. The
case that I witnessed personally was on MIPS:

http://lkml.org/lkml/2010/5/30/4

If (can_use_console() == 0), printk() will spool its output to log_buf
and it will be visible in "dmesg", but that output will NOT be echoed to
the console until somebody calls release_console_sem() from a CPU that
is online. Therefore, the boot time messages from the new CPU can get
stuck in "limbo" for a long time, and might suddenly appear on the
screen when a completely unrelated event (e.g. "eth0: link is down")
occurs.

This patch modifies the console code so that any pending messages are
automatically flushed out to the console whenever a CPU hotplug
operation completes successfully or aborts.
This is true even when CPU is getting hot-plugged out(offline) so
need to add additional hotplug events.

The issue was seen on 2.6.34.

Signed-off-by: Santosh Shilimkar <santosh.shilimkar(a)ti.com>
Signed-off-by: Kevin Cernekee <cernekee(a)gmail.com>
---
kernel/printk.c | 36 ++++++++++++++++++++++++++++++++++++
1 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/kernel/printk.c b/kernel/printk.c
index 444b770..a884d81 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -37,6 +37,8 @@
#include <linux/ratelimit.h>
#include <linux/kmsg_dump.h>
#include <linux/syslog.h>
+#include <linux/cpu.h>
+#include <linux/notifier.h>

#include <asm/uaccess.h>

@@ -985,6 +987,40 @@ void resume_console(void)
}

/**
+ * console_cpu_notify - print deferred console messages after CPU hotplug
+ *
+ * If printk() is called from a CPU that is not online yet, the messages
+ * will be spooled but will not show up on the console. This function is
+ * called when a new CPU comes online and ensures that any such output
+ * gets printed.
+ */
+static int __cpuinit console_cpu_notify(struct notifier_block *self,
+ unsigned long action, void *hcpu)
+{
+ switch (action) {
+ case CPU_ONLINE:
+ case CPU_DEAD:
+ case CPU_DYING:
+ case CPU_DOWN_FAILED:
+ case CPU_UP_CANCELED:
+ if (try_acquire_console_sem() == 0)
+ release_console_sem();
+ }
+ return NOTIFY_OK;
+}
+
+static struct notifier_block __cpuinitdata console_nb = {
+ .notifier_call = console_cpu_notify,
+};
+
+static int __init console_notifier_init(void)
+{
+ register_cpu_notifier(&console_nb);
+ return 0;
+}
+late_initcall(console_notifier_init);
+
+/**
* acquire_console_sem - lock the console system for exclusive use.
*
* Acquires a semaphore which guarantees that the caller has
--
1.6.0.4
-----------------------------------------------


From: Randy Dunlap on
On Tue, 3 Aug 2010 13:03:25 +0530 Shilimkar, Santosh wrote:

> -----------------------------------------------
> From b99271ce43cc82cda28447444004933d0f218ee3 Mon Sep 17 00:00:00 2001
> From: Santosh Shilimkar <santosh.shilimkar(a)ti.com>
> Date: Tue, 3 Aug 2010 12:58:22 +0530
> Subject: [PATCH] console: flush delayed log messages from cpu-hotplug events
>
> When a secondary CPU is being brought up, it is not uncommon for
> printk() to be invoked when cpu_online(smp_processor_id()) == 0. The
> case that I witnessed personally was on MIPS:
>
> http://lkml.org/lkml/2010/5/30/4
>
> If (can_use_console() == 0), printk() will spool its output to log_buf
> and it will be visible in "dmesg", but that output will NOT be echoed to
> the console until somebody calls release_console_sem() from a CPU that
> is online. Therefore, the boot time messages from the new CPU can get
> stuck in "limbo" for a long time, and might suddenly appear on the
> screen when a completely unrelated event (e.g. "eth0: link is down")
> occurs.
>
> This patch modifies the console code so that any pending messages are
> automatically flushed out to the console whenever a CPU hotplug
> operation completes successfully or aborts.
> This is true even when CPU is getting hot-plugged out(offline) so
> need to add additional hotplug events.
>
> The issue was seen on 2.6.34.
>
> Signed-off-by: Santosh Shilimkar <santosh.shilimkar(a)ti.com>
> Signed-off-by: Kevin Cernekee <cernekee(a)gmail.com>
> ---
> kernel/printk.c | 36 ++++++++++++++++++++++++++++++++++++
> 1 files changed, 36 insertions(+), 0 deletions(-)
>
> diff --git a/kernel/printk.c b/kernel/printk.c
> index 444b770..a884d81 100644
> --- a/kernel/printk.c
> +++ b/kernel/printk.c
> @@ -37,6 +37,8 @@
> #include <linux/ratelimit.h>
> #include <linux/kmsg_dump.h>
> #include <linux/syslog.h>
> +#include <linux/cpu.h>
> +#include <linux/notifier.h>
>
> #include <asm/uaccess.h>
>
> @@ -985,6 +987,40 @@ void resume_console(void)
> }
>
> /**
> + * console_cpu_notify - print deferred console messages after CPU hotplug
> + *
> + * If printk() is called from a CPU that is not online yet, the messages
> + * will be spooled but will not show up on the console. This function is
> + * called when a new CPU comes online and ensures that any such output
> + * gets printed.
> + */
> +static int __cpuinit console_cpu_notify(struct notifier_block *self,
> + unsigned long action, void *hcpu)
> +{


kernel-doc notation should include function parameters also, please.

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Shilimkar, Santosh on
> -----Original Message-----
> From: Randy Dunlap [mailto:rdunlap(a)xenotime.net]
> Sent: Tuesday, August 03, 2010 8:55 PM
> To: Shilimkar, Santosh
> Cc: Andrew Morton; Kevin Cernekee; linux-kernel(a)vger.kernel.org; Russell
> King - ARM Linux
> Subject: Re: Additional fix : (was [v2]printk: fix delayed messages from
> CPU hotplug events)
>
> On Tue, 3 Aug 2010 13:03:25 +0530 Shilimkar, Santosh wrote:
>

<snip>

> > /**
> > + * console_cpu_notify - print deferred console messages after CPU
> hotplug
> > + *
> > + * If printk() is called from a CPU that is not online yet, the
> messages
> > + * will be spooled but will not show up on the console. This function
> is
> > + * called when a new CPU comes online and ensures that any such output
> > + * gets printed.
> > + */
> > +static int __cpuinit console_cpu_notify(struct notifier_block *self,
> > + unsigned long action, void *hcpu)
> > +{
>
>
> kernel-doc notation should include function parameters also, please.
>
Sure. Here is an updated version.
--------------------------------------------
From 127c0ea58982762ad194f22d1281baaf529c9cbc Mon Sep 17 00:00:00 2001
From: Santosh Shilimkar <santosh.shilimkar(a)ti.com>
Date: Tue, 3 Aug 2010 12:58:22 +0530
Subject: [PATCH v2] console: flush delayed log messages from cpu-hotplug events

When a secondary CPU is being brought up, it is not uncommon for
printk() to be invoked when cpu_online(smp_processor_id()) == 0. The
case that I witnessed personally was on MIPS:

http://lkml.org/lkml/2010/5/30/4

If (can_use_console() == 0), printk() will spool its output to log_buf
and it will be visible in "dmesg", but that output will NOT be echoed to
the console until somebody calls release_console_sem() from a CPU that
is online. Therefore, the boot time messages from the new CPU can get
stuck in "limbo" for a long time, and might suddenly appear on the
screen when a completely unrelated event (e.g. "eth0: link is down")
occurs.

This patch modifies the console code so that any pending messages are
automatically flushed out to the console whenever a CPU hotplug
operation completes successfully or aborts.
This is true even when CPU is getting hot-plugged out(offline) so
need to add additional hotplug events.

The issue was seen on 2.6.34.

Signed-off-by: Santosh Shilimkar <santosh.shilimkar(a)ti.com>
Signed-off-by: Kevin Cernekee <cernekee(a)gmail.com>
---
kernel/printk.c | 39 +++++++++++++++++++++++++++++++++++++++
1 files changed, 39 insertions(+), 0 deletions(-)

diff --git a/kernel/printk.c b/kernel/printk.c
index 444b770..c145cef 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -37,6 +37,8 @@
#include <linux/ratelimit.h>
#include <linux/kmsg_dump.h>
#include <linux/syslog.h>
+#include <linux/cpu.h>
+#include <linux/notifier.h>

#include <asm/uaccess.h>

@@ -985,6 +987,43 @@ void resume_console(void)
}

/**
+ * console_cpu_notify - print deferred console messages after CPU hotplug
+ * @self: pointer to notfier block
+ * @action: cpu-hotplug events
+ * @hcpu: void poniter to pass any data
+ *
+ * If printk() is called from a CPU that is not online yet, the messages
+ * will be spooled but will not show up on the console. This function is
+ * called when a new CPU comes online or goes offline and ensures that
+ * any such output gets printed.
+ */
+static int __cpuinit console_cpu_notify(struct notifier_block *self,
+ unsigned long action, void *hcpu)
+{
+ switch (action) {
+ case CPU_ONLINE:
+ case CPU_DEAD:
+ case CPU_DYING:
+ case CPU_DOWN_FAILED:
+ case CPU_UP_CANCELED:
+ if (try_acquire_console_sem() == 0)
+ release_console_sem();
+ }
+ return NOTIFY_OK;
+}
+
+static struct notifier_block __cpuinitdata console_nb = {
+ .notifier_call = console_cpu_notify,
+};
+
+static int __init console_notifier_init(void)
+{
+ register_cpu_notifier(&console_nb);
+ return 0;
+}
+late_initcall(console_notifier_init);
+
+/**
* acquire_console_sem - lock the console system for exclusive use.
*
* Acquires a semaphore which guarantees that the caller has
--
1.6.0.4

From: Andrew Morton on
On Tue, 3 Aug 2010 13:03:25 +0530
"Shilimkar, Santosh" <santosh.shilimkar(a)ti.com> wrote:

> > thought that those events simply aren't occurring, and that the patch
> > has no effect. Confused - please explain further.
> These events can come during the CPU hotplug(offline). Below is the
> complete patch. Also attaching it in case some email format screw
> up.
>
> -----------------------------------------------
> >From b99271ce43cc82cda28447444004933d0f218ee3 Mon Sep 17 00:00:00 2001
> From: Santosh Shilimkar <santosh.shilimkar(a)ti.com>
> Date: Tue, 3 Aug 2010 12:58:22 +0530
> Subject: [PATCH] console: flush delayed log messages from cpu-hotplug events
>
> When a secondary CPU is being brought up, it is not uncommon for
> printk() to be invoked when cpu_online(smp_processor_id()) == 0. The
> case that I witnessed personally was on MIPS:
>
> http://lkml.org/lkml/2010/5/30/4
>
> If (can_use_console() == 0), printk() will spool its output to log_buf
> and it will be visible in "dmesg", but that output will NOT be echoed to
> the console until somebody calls release_console_sem() from a CPU that
> is online. Therefore, the boot time messages from the new CPU can get
> stuck in "limbo" for a long time, and might suddenly appear on the
> screen when a completely unrelated event (e.g. "eth0: link is down")
> occurs.
>
> This patch modifies the console code so that any pending messages are
> automatically flushed out to the console whenever a CPU hotplug
> operation completes successfully or aborts.
> This is true even when CPU is getting hot-plugged out(offline) so
> need to add additional hotplug events.
>
> The issue was seen on 2.6.34.
>
> Signed-off-by: Santosh Shilimkar <santosh.shilimkar(a)ti.com>
> Signed-off-by: Kevin Cernekee <cernekee(a)gmail.com>

An older version of this patch is present in linux-next, committed by
Paul Mundt, who wasn't cc'ed here(!).

Paul, please update. Be aware that the version of the patch to which I
replied is also not the latest.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/