From: Greg KH on
2.6.32-stable review patch. If anyone has any objections, please let us know.

------------------

From: Borislav Petkov <borislav.petkov(a)amd.com>

commit 048a8774ca43488d78605031f11cc206d7a2682a upstream.

We need to know the valid L3 indices interval when disabling them over
/sysfs. Do that when the core is brought online and add boundary checks
to the sysfs .store attribute.

Signed-off-by: Borislav Petkov <borislav.petkov(a)amd.com>
LKML-Reference: <1264172467-25155-6-git-send-email-bp(a)amd64.org>
Signed-off-by: H. Peter Anvin <hpa(a)zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)suse.de>

---
arch/x86/kernel/cpu/intel_cacheinfo.c | 35 ++++++++++++++++++++++++++++++----
1 file changed, 31 insertions(+), 4 deletions(-)

--- a/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ b/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -151,7 +151,8 @@ struct _cpuid4_info {
union _cpuid4_leaf_ebx ebx;
union _cpuid4_leaf_ecx ecx;
unsigned long size;
- unsigned long can_disable;
+ bool can_disable;
+ unsigned int l3_indices;
DECLARE_BITMAP(shared_cpu_map, NR_CPUS);
};

@@ -161,7 +162,8 @@ struct _cpuid4_info_regs {
union _cpuid4_leaf_ebx ebx;
union _cpuid4_leaf_ecx ecx;
unsigned long size;
- unsigned long can_disable;
+ bool can_disable;
+ unsigned int l3_indices;
};

unsigned short num_cache_leaves;
@@ -291,6 +293,29 @@ amd_cpuid4(int leaf, union _cpuid4_leaf_
(ebx->split.ways_of_associativity + 1) - 1;
}

+static unsigned int __cpuinit amd_calc_l3_indices(void)
+{
+ /*
+ * We're called over smp_call_function_single() and therefore
+ * are on the correct cpu.
+ */
+ int cpu = smp_processor_id();
+ int node = cpu_to_node(cpu);
+ struct pci_dev *dev = node_to_k8_nb_misc(node);
+ unsigned int sc0, sc1, sc2, sc3;
+ u32 val;
+
+ pci_read_config_dword(dev, 0x1C4, &val);
+
+ /* calculate subcache sizes */
+ sc0 = !(val & BIT(0));
+ sc1 = !(val & BIT(4));
+ sc2 = !(val & BIT(8)) + !(val & BIT(9));
+ sc3 = !(val & BIT(12)) + !(val & BIT(13));
+
+ return (max(max(max(sc0, sc1), sc2), sc3) << 10) - 1;
+}
+
static void __cpuinit
amd_check_l3_disable(int index, struct _cpuid4_info_regs *this_leaf)
{
@@ -306,7 +331,8 @@ amd_check_l3_disable(int index, struct _
(boot_cpu_data.x86_mask < 0x1)))
return;

- this_leaf->can_disable = 1;
+ this_leaf->can_disable = true;
+ this_leaf->l3_indices = amd_calc_l3_indices();
}

static int
@@ -780,7 +806,8 @@ static ssize_t store_cache_disable(struc
return -EINVAL;

/* do not allow writes outside of allowed bits */
- if (val & ~(SUBCACHE_MASK | SUBCACHE_INDEX))
+ if ((val & ~(SUBCACHE_MASK | SUBCACHE_INDEX)) ||
+ ((val & SUBCACHE_INDEX) > this_leaf->l3_indices))
return -EINVAL;

val |= BIT(30);


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Greg KH on
On Thu, May 06, 2010 at 04:13:58PM +0200, Borislav Petkov wrote:
> From: Jiri Kosina <jkosina(a)suse.cz>
> Date: Thu, May 06, 2010 at 09:46:21AM -0400
>
> > Well, we have
> >
> > #ifdef CONFIG_K8_NB
> > static inline struct pci_dev *node_to_k8_nb_misc(int node)
> > {
> > return (node < num_k8_northbridges) ? k8_northbridges[node] : NULL;
> > }
> > #else
> > static inline struct pci_dev *node_to_k8_nb_misc(int node)
> > {
> > return NULL;
> > }
> > #endif
> >
> > So it legitimately returns NULL in two cases:
> >
> > 1) if someone passes too large node
> > 2) if CONFIG_K8_NB is unset
> >
> > 1) Either we assume that node will always be in the range (i.e.
> > amd_get_nb_id() will never go crazy return anything bogus), and then we
> > could just drop the test completely. Or we want to check for such the
> > possibility, and then node_to_k8_nb_misc() is going to return NULL in such
> > cases, and so we want to check for it.
>
> No, we want to check it since K8_NB has many users - not only L3.
>
> > 2) is now moot, as all three in-tree callers are now under proper ifdefs
> > (CONFIG_CPU_SUP_AMD, which depends on CONFIG_K8_NB after your patch has
> > been applied). So I believe we could remove it, right?
>
> I don't think I understand "remove it" here. Which "it" you're referring to?
>
> > Either way, current state seems inconsistent. So either we should add the
> > return value check to amd_calc_l3_indices() as well, or remove all the
> > NULL magic altogether, i.e. the (untested) patch below.
> >
> > What do you think?
>
> Take a look at
>
> http://git.kernel.org/?p=linux/kernel/git/x86/linux-2.6-tip.git;a=commitdiff;h=f2b20e41407fccfcfacf927ff91ec888832a37af
>
> This does
>
> + if (num_k8_northbridges == 0)
> + return;
>
> in amd_check_l3_disable().
>
> So if K8_NB has failed initializing for some reason, we never go near
> the pci devs and the node_to_k8_nb_misc() calls since we effectively
> disable the L3 functionality.
>
> Thus the NULL pointer checks you remove in the patch below are
> superfluous, I agree, and I have already removed those in my tree along
> with the other improvements/fixes I'm working on right now.

So, was there ever a patch applied that fixed the bug that Jiri found in
the stable kernels with this original patch that I could apply?

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: H. Peter Anvin on
On 05/12/2010 04:19 PM, Greg KH wrote:
>>
>> So if K8_NB has failed initializing for some reason, we never go near
>> the pci devs and the node_to_k8_nb_misc() calls since we effectively
>> disable the L3 functionality.
>>
>> Thus the NULL pointer checks you remove in the patch below are
>> superfluous, I agree, and I have already removed those in my tree along
>> with the other improvements/fixes I'm working on right now.
>
> So, was there ever a patch applied that fixed the bug that Jiri found in
> the stable kernels with this original patch that I could apply?
>

Sounds like this patch in -tip should be promoted to mainline/stable...
could the AMD people please confirm?

-hpa

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: H. Peter Anvin on
On 05/12/2010 04:32 PM, H. Peter Anvin wrote:
> On 05/12/2010 04:19 PM, Greg KH wrote:
>>>
>>> So if K8_NB has failed initializing for some reason, we never go near
>>> the pci devs and the node_to_k8_nb_misc() calls since we effectively
>>> disable the L3 functionality.
>>>
>>> Thus the NULL pointer checks you remove in the patch below are
>>> superfluous, I agree, and I have already removed those in my tree along
>>> with the other improvements/fixes I'm working on right now.
>>
>> So, was there ever a patch applied that fixed the bug that Jiri found in
>> the stable kernels with this original patch that I could apply?
>>
>
> Sounds like this patch in -tip should be promoted to mainline/stable...
> could the AMD people please confirm?
>

Specifically, "waiting to hit mainline" is not an option since they are
currently queued for .35, but we need a minimal fix in .34 -- this is
the "stable backport". If this "stable backport" is simply a subset of
the patch series then let me know which ones should be moved to urgent
and sent to Linus immediately. I obviously don't have access to a setup
which can replicate this particular problem.

-hpa
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Jiri Kosina on
On Wed, 12 May 2010, H. Peter Anvin wrote:

> >>> So if K8_NB has failed initializing for some reason, we never go near
> >>> the pci devs and the node_to_k8_nb_misc() calls since we effectively
> >>> disable the L3 functionality.
> >>>
> >>> Thus the NULL pointer checks you remove in the patch below are
> >>> superfluous, I agree, and I have already removed those in my tree along
> >>> with the other improvements/fixes I'm working on right now.
> >>
> >> So, was there ever a patch applied that fixed the bug that Jiri found in
> >> the stable kernels with this original patch that I could apply?
> >>
> >
> > Sounds like this patch in -tip should be promoted to mainline/stable...
> > could the AMD people please confirm?
> >
>
> Specifically, "waiting to hit mainline" is not an option since they are
> currently queued for .35, but we need a minimal fix in .34 -- this is
> the "stable backport". If this "stable backport" is simply a subset of
> the patch series then let me know which ones should be moved to urgent
> and sent to Linus immediately. I obviously don't have access to a setup
> which can replicate this particular problem.

One is already there -- 0e152cd7c16832bd5cadee0c2e41d9959bc9b6f9. This
particular one solely was enough to fix the problem I have been hitting.

Plus there is f2b20e41407fccfcf ("x86, cacheinfo: Turn off L3 cache
index disable feature in virtualized environments") in -tip tree, which
fixes a crash of similar nature I have been seeing on bare metal, but for
virtualized environments.
This one might be considered for .34/stable still.

--
Jiri Kosina
SUSE Labs, Novell Inc.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/