From: Guenter Roeck on
[ For some reason my replies don't make it to the list. Resending. ]

On Wed, 2010-03-31 at 14:31 -0400, H. Peter Anvin wrote:
> On 03/31/2010 07:41 AM, Guenter Roeck wrote:
> > Current early_printk code writes into VGA memory space even
> > if CONFIG_VGA_CONSOLE is undefined. This can cause problems
> > if there is no VGA device in the system, especially if the memory
> > is used by another device.
> >
> > Fix problem by redirecting output to early_serial_console
> > if CONFIG_VGA_CONSOLE is undefined.
> >
> > Signed-off-by: Guenter Roeck <guenter.roeck(a)ericsson.com>
> >
> > asmlinkage void early_printk(const char *fmt, ...)
> > @@ -216,7 +224,7 @@ static int __init setup_early_printk(char *buf)
> > early_serial_init(buf + 4);
> > early_console_register(&early_serial_console, keep);
> > }
> > - if (!strncmp(buf, "vga", 3) &&
> > + if (have_vga_console && !strncmp(buf, "vga", 3) &&
> > boot_params.screen_info.orig_video_isVGA == 1) {
> > max_xpos = boot_params.screen_info.orig_video_cols;
> > max_ypos = boot_params.screen_info.orig_video_lines;
>
> I'm confused in a big way about how you could end up with a system where:
>
> a) there is no VGA;
> b) VGA memory is used by another device(!!!);
> c) boot_params.screen_info.orig_video_isVGA == 1?
>
> -hpa

Look for
early_printk("Kernel alive");

That function is called prior to early_console_register(). Even though
the call is now conditional, it can still happen if the log level is
high enough. There are a couple of other early_printk() calls which can
be executed before early_console_register() as well. The value of isVGA
is thus irrelevant.

Regarding a) and b), we have hardware which does not have VGA and does
use the same memory space for another device. This was actually how the
problem was found.

Guenter


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Guenter Roeck on
On Wed, 2010-03-31 at 11:32 -0400, Pekka Enberg wrote:
> On Wed, Mar 31, 2010 at 4:41 PM, Guenter Roeck
> <guenter.roeck(a)ericsson.com> wrote:
> > Current early_printk code writes into VGA memory space even
> > if CONFIG_VGA_CONSOLE is undefined. This can cause problems
> > if there is no VGA device in the system, especially if the memory
> > is used by another device.
> >
> > Fix problem by redirecting output to early_serial_console
> > if CONFIG_VGA_CONSOLE is undefined.
> >
> > Signed-off-by: Guenter Roeck <guenter.roeck(a)ericsson.com>
>
> Reviewed-by: Pekka Enberg <penberg(a)cs.helsinki.fi>
>
What will it take to get this patch into the tree?

If there are coding style issues or some other unresolved concerns,
please let me know.

Guenter


From: Guenter Roeck on
On Mon, 2010-04-05 at 14:46 -0400, H. Peter Anvin wrote:
> On 04/05/2010 11:10 AM, Guenter Roeck wrote:
> > On Wed, 2010-03-31 at 11:32 -0400, Pekka Enberg wrote:
> >> On Wed, Mar 31, 2010 at 4:41 PM, Guenter Roeck
> >> <guenter.roeck(a)ericsson.com> wrote:
> >>> Current early_printk code writes into VGA memory space even
> >>> if CONFIG_VGA_CONSOLE is undefined. This can cause problems
> >>> if there is no VGA device in the system, especially if the memory
> >>> is used by another device.
> >>>
> >>> Fix problem by redirecting output to early_serial_console
> >>> if CONFIG_VGA_CONSOLE is undefined.
> >>>
> >>> Signed-off-by: Guenter Roeck <guenter.roeck(a)ericsson.com>
> >>
> >> Reviewed-by: Pekka Enberg <penberg(a)cs.helsinki.fi>
> >>
> > What will it take to get this patch into the tree ?
> >
> > If there are coding style issues or some other unresolved concerns,
> > please let me know.
> >
>
> You didn't answer my question (c).
>
> I want to know how you ended up with
> boot_params.screen_info.orig_video_isVGA == 1 on a system with no VGA,
> which seems like it would have resolved this.
>
> I am *not* inclined to add a compile-time test for what should have been
> handled with a runtime test already.
>
Sorry, I thought I did answer it.

The problem is that early_printk() can be called prior to the call to
setup_early_printk(). Since early_console is currently pre-initialized
with early_vga_console, output can be written to VGA memory space even
if there is no VGA controller in the system (and even if
boot_params.screen_info.orig_video_isVGA == 0). This happens for all
early_printk() calls executed prior to the call to setup_early_printk().

I don't mind taking out have_vga_console, if that is the issue. It is
just an optimization that lets the compiler drop the entire VGA code
when CONFIG_VGA_CONSOLE is not defined. The important part of the
patch is to not pre-initialize early_console with early_vga_console if
CONFIG_VGA_CONSOLE is not defined.

An alternative might be to not pre-initialize early_console at all, and
to modify early_printk() to do nothing if early_console is NULL.
However, output such as "Kernel alive" would then not be displayed at
all, which I assumed would be undesirable.
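
For illustration, here is a minimal user-space sketch of the ordering
problem and of the NULL-check alternative. It is not kernel code: every
model_* name, the struct layout and the log buffers are hypothetical
stand-ins, and only the dispatch through a pre-initialized pointer is
meant to resemble early_printk().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct console; only the write hook is modeled. */
struct model_console {
    void (*write)(struct model_console *con, const char *s, size_t n);
};

static char vga_log[128];     /* stands in for VGA text memory */
static char serial_log[128];  /* stands in for the serial port */

/* Append at most n bytes of s to a NUL-terminated log of capacity cap. */
static void append(char *log, size_t cap, const char *s, size_t n)
{
    size_t used = strlen(log);

    if (n > cap - 1 - used)
        n = cap - 1 - used;
    memcpy(log + used, s, n);
    log[used + n] = '\0';
}

static void model_vga_write(struct model_console *con, const char *s, size_t n)
{
    (void)con;
    append(vga_log, sizeof(vga_log), s, n);
}

static void model_serial_write(struct model_console *con, const char *s, size_t n)
{
    (void)con;
    append(serial_log, sizeof(serial_log), s, n);
}

static struct model_console model_vga_console = { model_vga_write };
static struct model_console model_serial_console = { model_serial_write };

/* Models the current state: the pointer starts out at the VGA console. */
static struct model_console *model_early_console = &model_vga_console;

static void model_early_printk(const char *s)
{
    /* The alternative: drop the message if no console is set. */
    if (model_early_console == NULL)
        return;
    model_early_console->write(model_early_console, s, strlen(s));
}

/* Models setup_early_printk() selecting a console later during boot. */
static void model_setup_early_printk(struct model_console *con)
{
    assert(con != NULL);
    model_early_console = con;
}
```

Calling model_early_printk() before model_setup_early_printk() writes to
vga_log regardless of what is selected afterwards. Starting the pointer
at NULL instead would make that first call a silent no-op.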

Guenter


From: Guenter Roeck on
On Mon, 2010-04-05 at 16:25 -0400, H. Peter Anvin wrote:
> On 04/05/2010 01:02 PM, Guenter Roeck wrote:
> >>
> >> You didn't answer my question (c).
> >>
> >> I want to know how you ended up with
> >> boot_params.screen_info.orig_video_isVGA == 1 on a system with no VGA,
> >> which seems like it would have resolved this.
> >>
> >> I am *not* inclined to add a compile-time test for what should have been
> >> handled with a runtime test already.
> >>
> > Sorry, I thought I did answer it.
> >
>
> You didn't. You still haven't!
>
c) boot_params.screen_info.orig_video_isVGA == 1?

boot_params.screen_info.orig_video_isVGA == 0 in the problem case. As I
tried to explain below, the problem happens before setup_early_printk()
is called, and thus the value of orig_video_isVGA is irrelevant for the
problem case. Not sure how else I can explain it.

> > The problem is that early_printk() can be called prior to the call to
> > setup_early_printk(). Since early_console is currently pre-initialized
> > with early_vga_console, output can be written to VGA memory space even
> > if there is no VGA controller in the system (and even if
> > boot_params.screen_info.orig_video_isVGA == 0). This happens for all
> > early_printk() calls executed prior to the call to setup_early_printk().
>
> If boot_params.screen_info.orig_video_isVGA == 0, at least this bit of
> your patch has no effect:
>
> > > - if (!strncmp(buf, "vga", 3) &&
> > > + if (have_vga_console && !strncmp(buf, "vga", 3) &&
> > > boot_params.screen_info.orig_video_isVGA == 1) {
>
It does; with this part of the patch, the compiler can optimize all
VGA-related code away when CONFIG_VGA_CONSOLE is not defined. As I said,
this is just an optimization resulting in less code. It is not relevant
from a functional point of view, and I don't mind taking it out.
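
To make the optimization argument concrete, here is a small user-space
sketch. The definition of have_vga_console is not shown in this thread,
so the macro below is an assumption about one plausible form; the
model_* names and the counter are hypothetical as well.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: a compile-time constant derived from the config option. */
#ifdef CONFIG_VGA_CONSOLE
#define have_vga_console 1
#else
#define have_vga_console 0
#endif

static int vga_setup_calls;  /* counts how often the VGA branch ran */

/* Stands in for the VGA setup code inside setup_early_printk(). */
static void model_vga_setup(void)
{
    vga_setup_calls++;
}

/* Returns 1 if the VGA branch was taken, 0 otherwise. */
static int model_parse_vga(const char *buf, int orig_video_isVGA)
{
    assert(buf != NULL);
    /*
     * When have_vga_console is the constant 0, the whole condition is
     * known to be false at compile time, so the compiler is free to
     * drop the branch and anything reachable only through it.
     */
    if (have_vga_console && !strncmp(buf, "vga", 3) &&
        orig_video_isVGA == 1) {
        model_vga_setup();
        return 1;
    }
    return 0;
}
```

Built without -DCONFIG_VGA_CONSOLE, model_parse_vga("vga", 1) returns 0
and model_vga_setup() is never called. The runtime result is the same as
with orig_video_isVGA == 0; the difference is only in the generated code.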

> Now, we have at least two ways to report a non-VGA console at runtime:
>
> boot_params.screen_info.orig_video_isVGA != 1
> boot_params.screen_info.orig_video_lines == 0
>
> The former is zero for CGA/MDA/EGA, but early_vga_write() doesn't work
> right for MDA at least, so keying on isVGA is probably right.
>
> early_printk() being called before setup_early_printk() is a problem,
> and it's not immediately obvious to me how to fix it. We can of course
> make early_vga_write() simply return if
> boot_params.screen_info.orig_video_isVGA == 0, but it really is a bigger
> problem than that in many ways.
>
As far as I can see, boot_params.screen_info.orig_video_isVGA is set
early enough during boot, so that should at least solve the immediate
problem. However, it would result in early messages being ignored, which
might not be desirable.
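
A user-space sketch of such a runtime guard follows. The struct, the
buffer and the function are hypothetical models rather than the kernel's
definitions, and the guard is written as orig_video_isVGA != 1, which is
one reading of the runtime checks quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for boot_params.screen_info; one field only. */
struct model_screen_info {
    unsigned char orig_video_isVGA;
};

static struct model_screen_info model_screen_info;
static char model_vga_mem[128];  /* stands in for VGA text memory */

/* Returns the number of bytes written, or 0 if the write was skipped. */
static size_t model_early_vga_write(const char *s, size_t n)
{
    size_t used;

    assert(s != NULL);
    /* Runtime guard: leave the memory alone if no VGA was reported. */
    if (model_screen_info.orig_video_isVGA != 1)
        return 0;

    used = strlen(model_vga_mem);
    if (n > sizeof(model_vga_mem) - 1 - used)
        n = sizeof(model_vga_mem) - 1 - used;
    memcpy(model_vga_mem + used, s, n);
    model_vga_mem[used + n] = '\0';
    return n;
}
```

With orig_video_isVGA == 0 the call returns 0 and the buffer stays
untouched, which is exactly the "early messages being ignored" trade-off
mentioned above.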

Would you accept a minimized patch like this ?

/* Direct interface for emergencies */
+#ifdef CONFIG_VGA_CONSOLE
static struct console *early_console = &early_vga_console;
+#else
+static struct console *early_console = &early_serial_console;
+#endif
static int __initdata early_console_initialized;

This would prevent the problem while minimizing changes, and at the same
time permit early messages to be written to the serial console.
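
The effect of the #ifdef can be tried outside the kernel with a sketch
like the following. The model_* names and the log buffers are
hypothetical; only the compile-time choice of the default pointer
mirrors the patch fragment above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct console: it just captures output. */
struct model_console {
    char log[128];
};

static struct model_console model_vga_console;
static struct model_console model_serial_console;

/* Same shape as the patch: the default depends on the config option. */
#ifdef CONFIG_VGA_CONSOLE
static struct model_console *model_early_console = &model_vga_console;
#else
static struct model_console *model_early_console = &model_serial_console;
#endif

/* Appends s to whichever console is currently the default. */
static void model_early_printk(const char *s)
{
    struct model_console *con = model_early_console;
    size_t used, n;

    assert(s != NULL);
    used = strlen(con->log);
    n = strlen(s);
    if (n > sizeof(con->log) - 1 - used)
        n = sizeof(con->log) - 1 - used;
    memcpy(con->log + used, s, n);
    con->log[used + n] = '\0';
}
```

Compiled without -DCONFIG_VGA_CONSOLE, a call made before any setup code
has run lands in model_serial_console.log and leaves
model_vga_console.log empty; compiled with it, the roles are swapped.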

Guenter


From: Guenter Roeck on
On Mon, 2010-04-05 at 17:11 -0400, H. Peter Anvin wrote:
> On 04/05/2010 02:04 PM, Guenter Roeck wrote:
> >
> > Would you accept a minimized patch like this ?
> >
> > /* Direct interface for emergencies */
> > +#ifdef CONFIG_VGA_CONSOLE
> > static struct console *early_console = &early_vga_console;
> > +#else
> > +static struct console *early_console = &early_serial_console;
> > +#endif
> > static int __initdata early_console_initialized;
> >
> > This would prevent the problem while minimizing changes, and at the same
> > time permit early messages to be written to the serial console.
> >
>
> I'm unhappy about it, because *those early messages shouldn't exist in
> the first place*. It seems to be an indication that we're invoking
> setup_early_printk() too late. The whole playing around with max_xpos
> and max_ypos instead of using boot_params.screen_info directly is
> particularly bleacherous.
>
> I would at least like to see if the improper invocation of
> early_printk() can be avoided.
>
There are several such invocations.

1) arch/x86/kernel/head_64.S:
ENTRY(early_idt_handler)
....
leaq early_idt_msg(%rip),%rdi
call early_printk

This displays
"PANIC: early exception %02lx rip %lx:%lx error %lx cr2 %lx\n"
and subsequently calls dump_stack. The handler is initialized from
x86_64_start_kernel().

2) arch/x86/kernel/head64.c:x86_64_start_kernel():
if (console_loglevel == 10)
early_printk("Kernel alive\n");

3) init/main.c: start_kernel()
printk(KERN_NOTICE "%s", linux_banner);
and
printk(KERN_NOTICE "Kernel command line: %s\n", boot_command_line);

4) arch/x86/kernel/setup.c:setup_arch()
Several.

After that I gave up looking.

Not sure if or how those can be avoided.

Moving setup_early_printk() into x86_64_start_kernel() might be an
option, but that would require much more significant changes.

Guenter

