From: Ben Greear on
On 07/13/2010 08:29 PM, Robert Hancock wrote:
> On Tue, Jul 13, 2010 at 8:22 PM, Ben Greear <greearb(a)candelatech.com> wrote:
>>> Can you print out bus->number and devfn and look that up in lspci to
>>> find out which device it's hitting? It looks like there's a device with
>>> a PCI Express extended capability header that has an extended capability
>>> ID of 0000h and a next capability offset of 100h, which points to
>>> itself, causing the infinite loop. I'm guessing that if pcie_cap >> 20
>>> <= pos then it should give up and break out of the loop, since it means
>>> that the next capability pointer is invalidly pointing to the same or a
>>> previous entry.
>>
>> Bailing out like that does let it boot.
>>
>> As for the bus and devfn: bus: 0 devfn: 129 (decimal)
>>
>> I'm not sure what to look for in lspci, but here is the output with -n:
>
> That will be device 0x10 function 1, this one:
>
> 00:10.1 0600: 8086:25f0 (rev b1)
>
> Intel 5000 Series Chipset FSB Registers, apparently. What does lspci
> -vv show for that device?

00:10.1 Host bridge: Intel Corporation 5000 Series Chipset FSB Registers (rev b1)
Subsystem: Super Micro Computer Inc Unknown device 9780
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Kernel modules: i5000_edac, i5k_amb
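
For readers following along, the step from devfn 129 to 00:10.1 is just bit
unpacking: the upper five bits are the device number and the lower three are
the function. A minimal sketch in plain C, using the same encoding as the
kernel's PCI_SLOT() and PCI_FUNC() macros (the helper names devfn_slot and
devfn_func are made up for this example):

```c
#include <stdint.h>

/* devfn packs a 5-bit device (slot) number and a 3-bit function number.
 * Hypothetical helpers; the kernel spells these PCI_SLOT() and PCI_FUNC(). */
unsigned int devfn_slot(uint8_t devfn)
{
    return (devfn >> 3) & 0x1f;   /* bits 7:3 */
}

unsigned int devfn_func(uint8_t devfn)
{
    return devfn & 0x07;          /* bits 2:0 */
}
```

With devfn = 129 (0x81) this gives device 0x10, function 1, which is the
00:10.1 entry above.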

Thanks,
Ben


--
Ben Greear <greearb(a)candelatech.com>
Candela Technologies Inc http://www.candelatech.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Ben Greear on
On 07/14/2010 08:36 AM, Pan, Jacob jun wrote:
> what is the config size of 10.1?
> ls -l /sys/bus/pci/devices/0000:00:10.1/config
>
> if that is 256, it might be related to this patch.

[root(a)ice-si-dmz ~]# ls -l /sys/bus/pci/devices/0000:00:10.1/config
-rw-r--r-- 1 root root 4096 2010-07-13 19:14 /sys/bus/pci/devices/0000:00:10.1/config
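
The 4096 here is the size of the sysfs config file, which tracks the size of
the device's config space: 256 bytes for conventional PCI, 4096 when extended
config space is available. A rough sketch of what the guard in the mrst patch
tests, assuming PCIE_CAP_OFFSET is 0x100 (the start of extended config space);
has_ext_config_space is a hypothetical name, not a kernel function:

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: first extended capability header sits at offset 0x100. */
#define PCIE_CAP_OFFSET 0x100

/* An extended capability header is one 4-byte dword at PCIE_CAP_OFFSET,
 * so the config space must be large enough to contain it. */
bool has_ext_config_space(size_t cfg_size)
{
    return cfg_size >= PCIE_CAP_OFFSET + 4;
}
```

With a size of 4096 the early return in that patch would not be taken for
00:10.1; with 256 the device would have been skipped.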

Thanks,
Ben

--
Ben Greear <greearb(a)candelatech.com>
Candela Technologies Inc http://www.candelatech.com
From: Ben Greear on
On 07/14/2010 08:36 AM, Pan, Jacob jun wrote:
> what is the config size of 10.1?
> ls -l /sys/bus/pci/devices/0000:00:10.1/config
>
> if that is 256, it might be related to this patch.
>
>> From e9b1d5d0ff4d3ae86050dc4c91b3147361c7af9e Mon Sep 17 00:00:00 2001
> From: H. Peter Anvin <hpa(a)linux.intel.com>
> Date: Fri, 14 May 2010 13:55:57 -0700
> Subject: [PATCH] x86, mrst: Don't blindly access extended config space
>
> Do not blindly access extended configuration space unless we actively
> know we're on a Moorestown platform. The fixed-size BAR capability
> lives in the extended configuration space, and thus is not applicable
> if the configuration space isn't appropriately sized.
>
> This fixes booting certain VMware configurations with CONFIG_MRST=y.
>
> Moorestown will add a fake PCI-X 266 capability to advertise the
> presence of extended configuration space.

I'll try this in a bit, but shouldn't we also check for no progress in
that while loop and bail out in that case? There is no reason to hang at
boot just because the BIOS or something else is broken.
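
The no-progress check could look roughly like the following. This is a
user-space sketch, not the code in arch/x86/pci/mrst.c: config space is
modelled as an array of 1024 dwords, and find_ext_cap_checked is a
hypothetical name. It relies on the extended capability header layout
(capability ID in bits 15:0, next offset in bits 31:20) and gives up as soon
as the next pointer refers to the same or an earlier offset, as Robert
suggested:

```c
#include <stdint.h>

#define EXT_CAP_START 0x100u   /* extended capabilities begin here */

/* Walk the extended capability list in a simulated 4096-byte config space
 * (cfg must hold 1024 dwords). Returns the offset of cap_id, 0 if it is
 * not present, or -1 if the list stops making forward progress. */
int find_ext_cap_checked(const uint32_t *cfg, unsigned int cap_id)
{
    unsigned int pos = EXT_CAP_START;

    while (pos) {
        uint32_t hdr = cfg[pos / 4];
        unsigned int next = hdr >> 20;       /* bits 31:20: next offset */

        if (hdr == 0xffffffffu)
            return 0;                        /* nothing readable here */
        if ((hdr & 0xffffu) == cap_id)       /* bits 15:0: capability ID */
            return (int)pos;
        if (next != 0 && next <= pos)
            return -1;                       /* same or earlier entry: bail */
        pos = next;
    }
    return 0;
}
```

A header of 0x10000000 at offset 100h (ID 0000h, next offset 100h) makes this
return -1 instead of spinning.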

Thanks,
Ben

>
> Reported-and-tested-by: Petr Vandrovec <petr(a)vandrovec.name>
> Signed-off-by: H. Peter Anvin <hpa(a)linux.intel.com>
> Acked-by: Jacob Pan <jacob.jun.pan(a)intel.com>
> Acked-by: Jesse Barnes <jbarnes(a)virtuousgeek.org>
> LKML-Reference: <AANLkTiltKUa3TrKR1M51eGw8FLNoQJSLT0k0_K5X3-OJ(a)mail.gmail.com>
> ---
> arch/x86/pci/mrst.c | 4 ++++
> 1 files changed, 4 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/pci/mrst.c b/arch/x86/pci/mrst.c
> index 8bf2fcb..1cdc02c 100644
> --- a/arch/x86/pci/mrst.c
> +++ b/arch/x86/pci/mrst.c
> @@ -247,6 +247,10 @@ static void __devinit pci_fixed_bar_fixup(struct pci_dev *dev)
>  	u32 size;
>  	int i;
>
> +	/* Must have extended configuration space */
> +	if (dev->cfg_size< PCIE_CAP_OFFSET + 4)
> +		return;
> +
>  	/* Fixup the BAR sizes for fixed BAR devices and make them unmoveable */
>  	offset = fixed_bar_cap(dev->bus, dev->devfn);
>  	if (!offset || PCI_DEVFN(2, 0) == dev->devfn ||


--
Ben Greear <greearb(a)candelatech.com>
Candela Technologies Inc http://www.candelatech.com
From: Ben Greear on
On 07/14/2010 08:36 AM, Pan, Jacob jun wrote:
> what is the config size of 10.1?
> ls -l /sys/bus/pci/devices/0000:00:10.1/config
>
> if that is 256, it might be related to this patch.

That patch is already in 2.6.34.y (with slight white-space
change it seems: space before <).

I just posted a patch to lkml that fixes the problem for me,
based on a suggestion by Robert Hancock.

I think this or something similar should go to 2.6.34.y stable
as well.

Thanks,
Ben

--
Ben Greear <greearb(a)candelatech.com>
Candela Technologies Inc http://www.candelatech.com

From: Ben Greear on
On 07/14/2010 11:19 AM, Pan, Jacob jun wrote:
>> -----Original Message-----
>> From: Ben Greear [mailto:greearb(a)candelatech.com]
>> Sent: Wednesday, July 14, 2010 10:07 AM
>> To: Pan, Jacob jun
>> Cc: Robert Hancock; linux-kernel; jbarnes(a)virtuousgeek.org
>> Subject: Re: Regression: 2.6.34 boot fails on E5405 system, bisected:
>> de08e2c26
>>
>> On 07/14/2010 08:36 AM, Pan, Jacob jun wrote:
>>> what is the config size of 10.1?
>>> ls -l /sys/bus/pci/devices/0000:00:10.1/config
>>>
>>> if that is 256, it might be related to this patch.
>>
>> That patch is already in 2.6.34.y (with slight white-space
>> change it seems: space before <).
>>
>> I just posted a patch to lkml that fixes the problem for me,
>> based on a suggestion by Robert Hancock.
>>
>> I think this or something similar should go to 2.6.34.y stable
>> as well.
>>
>
>
> I have not seen the patch yet, but there is no guarantee that
> capabilities are always laid out in ascending address order. So I think
> we cannot bail out when
> pcie_cap >> 20 <= pos
>
> If that is some bug in the config space, can we fix it with a quirk?

No idea, but if it's on this one motherboard/device, I imagine it's somewhere
else as well.

Is there at least a maximum number of capabilities that can exist so that
you can limit the loop by that?
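
One possible bound: extended config space is 4096 bytes, the extended
capability list starts at offset 100h, and each capability takes at least
8 bytes, so no valid list has more than (4096 - 256) / 8 = 480 entries. A
sketch of a walk limited that way, again over a simulated config space and
with a hypothetical function name; unlike a next <= pos test it tolerates
lists that are not in ascending order:

```c
#include <stdint.h>

#define PCI_CFG_SPACE_SIZE      256u
#define PCI_CFG_SPACE_EXP_SIZE  4096u
/* Each extended capability needs at least 8 bytes, so at most 480 fit. */
#define EXT_CAP_MAX ((PCI_CFG_SPACE_EXP_SIZE - PCI_CFG_SPACE_SIZE) / 8)

/* cfg must hold 1024 dwords. Returns the offset of cap_id, or 0 if it is
 * not found within EXT_CAP_MAX steps. */
int find_ext_cap_bounded(const uint32_t *cfg, unsigned int cap_id)
{
    unsigned int pos = PCI_CFG_SPACE_SIZE;
    unsigned int ttl = EXT_CAP_MAX;

    while (ttl-- > 0) {
        uint32_t hdr = cfg[pos / 4];

        if ((hdr & 0xffffu) == cap_id)   /* bits 15:0: capability ID */
            return (int)pos;
        pos = (hdr >> 20) & 0xffcu;      /* bits 31:20, dword aligned */
        if (pos < PCI_CFG_SPACE_SIZE)
            break;                       /* 0 ends the list; below 100h is invalid */
    }
    return 0;
}
```

A self-pointing header then costs at most 480 iterations before the search
reports "not found". pci_find_ext_capability() in drivers/pci/pci.c uses the
same kind of ttl counter.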

Thanks,
Ben

>
> Thanks,
>
> Jacob
>


--
Ben Greear <greearb(a)candelatech.com>
Candela Technologies Inc http://www.candelatech.com
