From: Prarit Bhargava on
Upstream PV guests fail to boot because of a NULL pointer. It is possible that
xen guests have irq_desc->chip_data = NULL.

Test for NULL chip_data pointer before attempting to complete an irq move.

Signed-off-by: Prarit Bhargava <prarit(a)redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha(a)intel.com>

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 127b871..eb2789c 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2545,6 +2545,9 @@ void irq_force_complete_move(int irq)
 	struct irq_desc *desc = irq_to_desc(irq);
 	struct irq_cfg *cfg = desc->chip_data;
 
+	if (!cfg)
+		return;
+
 	__irq_complete_move(&desc, cfg->vector);
 }
#else
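
For readers without a kernel tree at hand, the guard added by the hunk
above can be sketched in user space. This is only an illustration, not
the kernel code: struct irq_cfg_sketch, struct irq_desc_sketch,
complete_move_stub() and last_completed_vector are hypothetical
stand-ins, and the sketch returns an int so the skipped case can be
observed, whereas irq_force_complete_move() itself returns void.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's struct irq_cfg and struct irq_desc. */
struct irq_cfg_sketch {
	int vector;
};

struct irq_desc_sketch {
	void *chip_data;	/* may be NULL when no irq_cfg is attached */
};

/* Records the last vector passed to the stub; -1 means it was never called. */
static int last_completed_vector = -1;

/* Stub standing in for __irq_complete_move(); it only records the vector. */
static void complete_move_stub(struct irq_desc_sketch *desc, int vector)
{
	(void)desc;
	last_completed_vector = vector;
}

/* Returns 1 if the move was completed, 0 if it was skipped (NULL chip_data). */
static int force_complete_move_sketch(struct irq_desc_sketch *desc)
{
	struct irq_cfg_sketch *cfg = desc->chip_data;

	/* Same test as the patch: bail out before touching cfg->vector. */
	if (!cfg)
		return 0;

	complete_move_stub(desc, cfg->vector);
	return 1;
}
```

Calling force_complete_move_sketch() on a descriptor whose chip_data is
NULL returns 0 and leaves last_completed_vector untouched. Without the
test, the same call would dereference a NULL cfg.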
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Prarit Bhargava on


On 04/27/2010 12:58 PM, Konrad Rzeszutek Wilk wrote:
> On Tue, Apr 27, 2010 at 11:24:42AM -0400, Prarit Bhargava wrote:
>
>> Upstream PV guests fail to boot because of a NULL pointer. It is possible that
>> xen guests have irq_desc->chip_data = NULL.
>>
> Can you provide a short example of test scenario? As in what I should do
> to reproduce this problem?
>

Take the latest upstream (well ... to be honest, a bit older than that
because of some other bugs) -- take 2.6.33 and try to boot it as a PV
guest. I'm using a RHEL5 Xen HV fwiw ...

P.

From: Konrad Rzeszutek Wilk on
On Tue, Apr 27, 2010 at 11:24:42AM -0400, Prarit Bhargava wrote:
> Upstream PV guests fail to boot because of a NULL pointer. It is possible that
> xen guests have irq_desc->chip_data = NULL.

Can you provide a short example of test scenario? As in what I should do
to reproduce this problem?
From: Andrew Jones on
On 04/27/2010 07:09 PM, Prarit Bhargava wrote:
>
>
> On 04/27/2010 12:58 PM, Konrad Rzeszutek Wilk wrote:
>> On Tue, Apr 27, 2010 at 11:24:42AM -0400, Prarit Bhargava wrote:
>>
>>> Upstream PV guests fail to boot because of a NULL pointer. It is
>>> possible that
>>> xen guests have irq_desc->chip_data = NULL.
>>>
>> Can you provide a short example of test scenario? As in what I should do
>> to reproduce this problem?
>>
>
> Take the latest upstream (well ... to be honest, a bit older than that
> because of some other bugs) -- take 2.6.33 and try to boot it as a PV
> guest. I'm using a RHEL5 Xen HV fwiw ...
>
> P.

Another ingredient is to boot the guest with a configuration where its
maxvcpus is greater than its vcpus. If you have RHEL 5.5 userspace then
you can create a config with lines like this

maxvcpus = 4
vcpus = 2

with that you'll crash on boot. Then you can check that
irq_force_complete_move is on the stack if you have "preserve" for
on_crash and use xenctx to look at the state of the vcpus.

If the Xen you're using doesn't support the maxvcpus var, then I believe
you can apply the same principle in a different way, using the
vcpus_avail var. Or you can boot with > 1 vcpus and then attempt to
remove one with 'xm vcpu-set'.

Andrew

From: Konrad Rzeszutek Wilk on
>> Can you provide a short example of test scenario? As in what I should do
>> to reproduce this problem?
>>
>
> Take the latest upstream (well ... to be honest, a bit older than that
> because of some other bugs) -- take 2.6.33 and try to boot it as a PV

2.6.34-rc5 PV boots under Xen for me (and has pretty much since 2.6.33 +
Suresh's fix for the CONFIG_RODATA_MARK).

Perhaps I am missing some of the .config options you have set that make it not work?

The irqbalance daemon looks to be running - but I think you are hitting
this during bootup? How long do you have to wait for this to trigger?

How many CPUs did you assign to your guest?

What are the "other bugs" you speak of?

> guest. I'm using a RHEL5 Xen HV fwiw ...

OK, so your control domain is RHEL5. Mine is Jeremy's xen/next one
(2.6.32). Let me try to compile RHEL5 under FC11. Are any tricks
necessary to do that?