From: Yinghai Lu on
On 06/30/2010 02:15 PM, Ram Pai wrote:
> PCI: skip release and reallocation of io port resources
>
> git commit 977d17bb1749517b353874ccdc9b85abc7a58c2a
> released and reallocated all resources, ioport and memory, when
> allocation of any resource of any type failed. This caused
> failure to reallocate fragile io port resources, as reported in
> https://bugzilla.kernel.org/show_bug.cgi?id=15960
>
> The problem was solved by reverting the commit, through
> git commit 769d9968e42c995eaaf61ac5583d998f32e0769a.
>
> However reverting the original patch fails MMIO resource allocation
> for SRIOV PCI-Bars on some platforms. Especially on platforms
> where the BIOS is unaware of SRIOV resource BARs.
>
> The following code, an idea proposed by Yinghai Lu, skips release
> and re-allocation of io port resources if its allocation has
> not failed in the first place.
>
> This patch applies on top of patch corresponding to
> git commit 977d17bb1749517b353874ccdc9b85abc7a58c2a
>

To be safe, I would suggest:

1. put back
977d17bb1749517b353874ccdc9b85abc7a58c2a
2. and apply following patch.

[PATCH] pci: disable pci trying to reallocate pci bridge by default.

It broke Nouveau on Linus's system.

Bisected to commit 977d17bb17
| PCI: update bridge resources to get more big ranges in PCI assign unssigned

So disable it by default.

Signed-off-by: Yinghai Lu <yinghai(a)kernel.org>

---
Documentation/kernel-parameters.txt | 6 ++++++
drivers/pci/pci.c | 4 ++++
drivers/pci/pci.h | 2 ++
drivers/pci/setup-bus.c | 14 +++++++++-----
4 files changed, 21 insertions(+), 5 deletions(-)

Index: linux-2.6/Documentation/kernel-parameters.txt
===================================================================
--- linux-2.6.orig/Documentation/kernel-parameters.txt
+++ linux-2.6/Documentation/kernel-parameters.txt
@@ -2009,6 +2009,12 @@ and is between 256 and 4096 characters.
for broken drivers that don't call it.
skip_isa_align [X86] do not align io start addr, so can
handle more pci cards
+ try=n set the pci_try_num to reallocate the pci bridge resource
+ 1: default
+ 2: will set the num to at least max_depth + 1, and try to reallocate res
+ to get big range for the bridge. assume the pci peer root
+ resource is right from _CRS or from hostbridge pci reg
+ reading out.
firmware [ARM] Do not re-enumerate the bus but instead
just use the configuration from the
bootloader. This is currently used on
Index: linux-2.6/drivers/pci/pci.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci.c
+++ linux-2.6/drivers/pci/pci.c
@@ -2983,6 +2983,10 @@ static int __init pci_setup(char *str)
pci_no_aer();
} else if (!strcmp(str, "nodomains")) {
pci_no_domains();
+ } else if (!strncmp(str, "try=", 4)) {
+ int try_num = memparse(str + 4, &str);
+ if (try_num > 0)
+ pci_try_num = try_num;
} else if (!strncmp(str, "cbiosize=", 9)) {
pci_cardbus_io_size = memparse(str + 9, &str);
} else if (!strncmp(str, "cbmemsize=", 10)) {
Index: linux-2.6/drivers/pci/pci.h
===================================================================
--- linux-2.6.orig/drivers/pci/pci.h
+++ linux-2.6/drivers/pci/pci.h
@@ -212,6 +212,8 @@ static inline int pci_ari_enabled(struct
return bus->self && bus->self->ari_enabled;
}

+extern int pci_try_num;
+
#ifdef CONFIG_PCI_QUIRKS
extern int pci_is_reassigndev(struct pci_dev *dev);
resource_size_t pci_specified_resource_alignment(struct pci_dev *dev);
Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -869,6 +869,7 @@ static int __init pci_get_max_depth(void
* second and later try will clear small leaf bridge res
* will stop till to the max deepth if can not find good one
*/
+int pci_try_num = 1;
void __init
pci_assign_unassigned_resources(void)
{
@@ -879,14 +880,17 @@ pci_assign_unassigned_resources(void)
unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
IORESOURCE_PREFETCH;
unsigned long failed_type;
- int max_depth = pci_get_max_depth();
- int pci_try_num;

head.next = NULL;

- pci_try_num = max_depth + 1;
- printk(KERN_DEBUG "PCI: max bus depth: %d pci_try_num: %d\n",
- max_depth, pci_try_num);
+ if (pci_try_num > 1) {
+ int max_depth = pci_get_max_depth();
+
+ if (max_depth + 1 > pci_try_num)
+ pci_try_num = max_depth + 1;
+ printk(KERN_DEBUG "PCI: max bus depth: %d pci_try_num: %d\n",
+ max_depth, pci_try_num);
+ }

again:
/* Depth first, calculate sizes and alignments of all
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
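
The retry-count logic in the patch above can be sketched in user space as follows. This is only an illustration under stated assumptions: parse_try_option() and effective_try_num() are hypothetical helper names, strtol() stands in for the kernel's memparse(), and the global here is a local stand-in for the kernel's pci_try_num.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Local stand-in for the kernel global added by the patch; 1 is the default. */
static int pci_try_num = 1;

/*
 * Hypothetical helper: handle one "pci=" sub-option.
 * Returns 1 if it was a "try=N" option with N > 0, else 0.
 * strtol() is used here in place of the kernel's memparse().
 */
static int parse_try_option(const char *opt)
{
    char *end;
    long n;

    if (strncmp(opt, "try=", 4) != 0)
        return 0;

    n = strtol(opt + 4, &end, 0);
    if (end == opt + 4 || n <= 0 || n > INT_MAX)
        return 0;               /* no digits, or not a usable count */

    pci_try_num = (int)n;
    return 1;
}

/*
 * Hypothetical helper mirroring the setup-bus.c hunk: with the default
 * of 1 the count is left alone; with anything larger it is raised to
 * max_depth + 1 when that is bigger than the requested value.
 */
static int effective_try_num(int requested, int max_depth)
{
    if (requested > 1 && max_depth + 1 > requested)
        return max_depth + 1;
    return requested;
}
```

With a bus depth of 5, a requested value of 1 stays at 1, while a requested value of 2 becomes 6, which matches the condition `max_depth + 1 > pci_try_num` in the patch.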
From: Jesse Barnes on
On Wed, 30 Jun 2010 16:59:49 -0700
Ram Pai <linuxram(a)us.ibm.com> wrote:

> On Wed, Jun 30, 2010 at 04:10:26PM -0700, Linus Torvalds wrote:
> > On Wed, Jun 30, 2010 at 2:15 PM, Ram Pai <linuxram(a)us.ibm.com> wrote:
> > >       PCI: skip release and reallocation of io port resources
> >
> > Gaah. This still looks like just total ad-hoc hackery. The logic for
> > it all seems very fragile, just a random case made up from the one
> > failing issue. There's no underlying logic or design to it.
> >
> > I still think that we should just make people explicitly ask for a
> > blank slate if the bios allocations don't work out.
>
> and interactively allocate resource?

No I don't think we want to add any prompts to the kernel boot
process. :)

> > Rather than trying
> > to fix it up automatically, which has been a total rats nest of random
> > crud.
>
> Can Yinghai Lu's patch 'pci=try=' be some temporary middle ground till
> a more elaborate patch is found?
>
> His suggestion partly meets your suggestion. It does not automatically
> reassign unless the user explicitly asks for it. Hence should not
> break any working systems, at the same time can handle system like
> mine.

pci=try just doesn't communicate much, it should be something like
pci=override_bios and do as Linus suggests.

But we should continue to shoot for not ever having to use that option
on normal systems.

--
Jesse Barnes, Intel Open Source Technology Center
From: Yinghai Lu on
On 07/02/2010 02:35 PM, Jesse Barnes wrote:
> On Wed, 30 Jun 2010 16:59:49 -0700
> Ram Pai <linuxram(a)us.ibm.com> wrote:
>
>> On Wed, Jun 30, 2010 at 04:10:26PM -0700, Linus Torvalds wrote:
>>> On Wed, Jun 30, 2010 at 2:15 PM, Ram Pai <linuxram(a)us.ibm.com> wrote:
>>>> PCI: skip release and reallocation of io port resources
>>>
>>> Gaah. This still looks like just total ad-hoc hackery. The logic for
>>> it all seems very fragile, just a random case made up from the one
>>> failing issue. There's no underlying logic or design to it.
>>>
>>> I still think that we should just make people explicitly ask for a
>>> blank slate if the bios allocations don't work out.
>>
>> and interactively allocate resource?
>
> No I don't think we want to add any prompts to the kernel boot
> process. :)
>
>>> Rather than trying
>>> to fix it up automatically, which has been a total rats nest of random
>>> crud.
>>
>> Can Yinghai Lu's patch 'pci=try=' be some temporary middle ground till
>> a more elaborate patch is found?
>>
>> His suggestion partly meets your suggestion. It does not automatically
>> reassign unless the user explicitly asks for it. Hence should not
>> break any working systems, at the same time can handle system like
>> mine.
>
> pci=try just doesn't communicate much, it should be something like
> pci=override_bios and do as Linus suggests.

So you want pci=override_bios to reallocate all BIOS-assigned resources, including
peer root bus resources, PCI bridge resources, and PCI device BARs?

In that case, we may need to update:
1. IO-APIC related BARs, to stay consistent with the IO-APIC address from the MADT.
2. other ACPI-related tables, such as the values returned by _CRS...

Or should I just change pci=try to pci=override_bios in my patch?

>
> But we should continue to shoot for not ever having to use that option
> on normal systems.
>

Replacing the legacy BIOS with LinuxBIOS is the cleanest way.

Yinghai
From: Jesse Barnes on
On Fri, 9 Jul 2010 09:50:45 -0600
Bjorn Helgaas <bjorn.helgaas(a)hp.com> wrote:

> On Tuesday, July 06, 2010 06:49:32 pm Yinghai Lu wrote:
> > On 07/06/2010 04:58 PM, Linus Torvalds wrote:
> > > On Tue, Jul 6, 2010 at 4:13 PM, Yinghai Lu <yinghai(a)kernel.org> wrote:
> > >>
> > >> So you want pci=override_bios to reallocate all BIOS-assigned resources, including
> > >> peer root bus resources, PCI bridge resources, and PCI device BARs?
> > >
> > > In a perfect world, we'd never need this at all, but since that's not
> > > an option, the second-best alternative might be something like the
> > > following:
> > >
> > > pci=override=off # default
> > > pci=override=conflict # override only on conflicts
> > > pci=override=<device> # clear BIOS allocations for <device> (and any
> > > children, if it's a bus)
> >
> > Current behavior:
> > if there is a conflict, e.g. PCI bridge or PCI device resources fall outside the peer root bus resource range,
> > or PCI device resources fall outside the PCI bridge resource range,
> > the kernel rejects the resource and tries to get a new range from the parent resource for the children.
> >
> > So the current default already overrides on conflicts.
>
> One conflict we don't handle correctly is when we find a device that
> doesn't fit inside the root bus resources. We currently disable the
> device, but Windows just leaves it where BIOS put it.
>
> This causes this bug: https://bugzilla.kernel.org/show_bug.cgi?id=16263
> It should be fairly simple to make Linux handle this conflict the same
> way, without requiring any special kernel arguments.

Sounds reasonable. I'm open to suggestions on alternate approaches for
this issue as well.

Thanks,
--
Jesse Barnes, Intel Open Source Technology Center
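
The conflict Bjorn describes (a device BAR that does not fit inside any root bus resource window) can be illustrated with a small user-space sketch. Everything here is an assumption made for illustration: struct range, the two policy names, and classify_bar() are hypothetical and are not kernel interfaces. The sketch only models the two outcomes mentioned in the thread: disabling the device, or leaving it where the BIOS put it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inclusive address range, loosely modeled on a resource. */
struct range {
    uint64_t start;
    uint64_t end;
};

/* The two behaviors contrasted in the thread (names are illustrative). */
enum conflict_policy {
    POLICY_DISABLE_DEVICE,      /* reject the BAR and disable the device */
    POLICY_KEEP_BIOS_ASSIGNMENT /* leave the BAR where the BIOS put it */
};

enum bar_action {
    BAR_OK,                  /* BAR fits inside a root bus window */
    BAR_DISABLED,            /* conflict, device disabled */
    BAR_KEPT_OUTSIDE_WINDOW  /* conflict, BIOS assignment kept */
};

/* True if inner lies entirely within outer. */
static bool range_contains(const struct range *outer,
                           const struct range *inner)
{
    return inner->start >= outer->start && inner->end <= outer->end;
}

/* Decide what to do with one BAR given the root bus windows. */
static enum bar_action classify_bar(const struct range *windows, size_t n,
                                    const struct range *bar,
                                    enum conflict_policy policy)
{
    for (size_t i = 0; i < n; i++) {
        if (range_contains(&windows[i], bar))
            return BAR_OK;
    }
    return policy == POLICY_DISABLE_DEVICE ? BAR_DISABLED
                                           : BAR_KEPT_OUTSIDE_WINDOW;
}
```

A BAR that straddles the end of a window counts as a conflict in this sketch, since it is not fully contained in any window.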