From: Rusty Russell on
On Fri, 2005-06-10 at 09:03 -0500, Stephen Lord wrote:
> Hi,
>
> I am having troubles getting any recent kernel to boot successfully
> on one of my machines, a generic 2.6GHz P4 box with HT enabled
> running an updated Fedora Core 3 distro. This is present in
> 2.6.12-rc6. It does not manifest itself with the Fedora Core
> kernels which have identical initrd contents as far as the
> init script and the set of modules included goes.
>
> The problem manifests itself as various undefined symbols from
> module loads. Here is the relevant section from the init script

Module loading is synchronous. All I can think of is that a module is
pulling in another module which requires it asynchronously (you need to
do this because your own module symbols are not available until *after*
init succeeds), or a hotplug interaction (hotplug is async, too).

Rusty.
--
A bad analogy is like a leaky screwdriver -- Richard Braakman

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Steve Lord on
Andrew Morton wrote:
> Stephen Lord <lord(a)xfs.org> wrote:
>
>>Pozsár Balázs wrote:
>> > On Sat, Jun 11, 2005 at 08:23:20AM -0500, Steve Lord wrote:
>> >
>> >>I think this is not actually module loading itself, but a problem
>> >>between the fork/exec/wait code in nash and the kernel.
>> >
>> >
>> > I do not use nash, only bash, so this is not a nash-specific issue.
>> >
>> >
>>
>> I disabled hyperthreading and things started working, so are there any
>> HT related scheduling bugs right now?
>
>
> There haven't been any scheduler changes for some time. There have been a
> few low-level SMT changes I think.
>
> Are you able to identify which kernel version broke it?
>

Still have not narrowed this down too far, disabling SMT made no
difference, disabling SMP did, which I was expecting.

Steve

From: K.R. Foley on
Steve Lord wrote:
> Andrew Morton wrote:
>
>> Stephen Lord <lord(a)xfs.org> wrote:
>>
>>> Pozsár Balázs wrote:
>>> > On Sat, Jun 11, 2005 at 08:23:20AM -0500, Steve Lord wrote:
>>> >
>>> >>I think this is not actually module loading itself, but a problem
>>> >>between the fork/exec/wait code in nash and the kernel.
>>> >
>>> > I do not use nash, only bash, so this is not a nash-specific issue.
>>> >
>>>
>>> I disabled hyperthreading and things started working, so are there any
>>> HT related scheduling bugs right now?
>>
>>
>>
>> There haven't been any scheduler changes for some time. There have been a
>> few low-level SMT changes I think.
>>
>> Are you able to identify which kernel version broke it?
>>
>
> Still have not narrowed this down too far, disabling SMT made no
> difference, disabling SMP did, which I was expecting.
>
> Steve
>

I initially saw this with 2.6.12-rc1 and every version up through rc3. I
haven't tried with later versions. :-/ I initially reported here:
http://marc.theaimsgroup.com/?l=linux-kernel&m=111235814529008&w=2

The way that I got around it was to compile in my aic7xxx driver instead
of making it a module. I have also recently received an email from
someone saying that disabling module unloading would also solve it. That
very well may be true since I did run into another booting problem
(2.6.12-rc5) that disabling module unloading fixed :-/ I haven't had a
chance to go back and check this out though.

So to summarize: I have a dual 933 with aic7xxx compiled in to get
past the problem described above. I have a dual 2.6 w/HT on which I have
disabled module unloading to get past another boot condition.


--
kr

From: Steve Lord on
K.R. Foley wrote:
> Steve Lord wrote:
>
>> Andrew Morton wrote:
>>
>>> Stephen Lord <lord(a)xfs.org> wrote:
>>>
>>>> Pozsár Balázs wrote:
>>>> > On Sat, Jun 11, 2005 at 08:23:20AM -0500, Steve Lord wrote:
>>>> >
>>>> >>I think this is not actually module loading itself, but a problem
>>>> >>between the fork/exec/wait code in nash and the kernel.
>>>> >
>>>> > I do not use nash, only bash, so this is not a nash-specific issue.
>>>> >
>>>>
>>>> I disabled hyperthreading and things started working, so are there any
>>>> HT related scheduling bugs right now?
>>>
>>>
>>>
>>>
>>> There haven't been any scheduler changes for some time. There have been a
>>> few low-level SMT changes I think.
>>>
>>> Are you able to identify which kernel version broke it?
>>>
>>
>> Still have not narrowed this down too far, disabling SMT made no
>> difference, disabling SMP did, which I was expecting.
>>
>> Steve
>>
>
> I initially saw this with 2.6.12-rc1 and every version up through rc3. I
> haven't tried with later versions. :-/ I initially reported here:
> http://marc.theaimsgroup.com/?l=linux-kernel&m=111235814529008&w=2
>
> The way that I got around it was to compile in my aic7xxx driver instead
> of making it a module. I have also recently received an email from
> someone saying that disabling module unloading would also solve it. That
> very well may be true since I did run into another booting problem
> (2.6.12-rc5) that disabling module unloading fixed :-/ I haven't had a
> chance to go back and check this out though.
>
> So to summarize: I have a dual 933 with aic7xxx compiled in to get
> past the problem described above. I have a dual 2.6 w/HT on which I have
> disabled module unloading to get past another boot condition.
>
>

I found another system which exhibits the problem, a dual Xeon
with HT support.

Here is one of the cpus from /proc/cpuinfo

processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 1
model name : Intel(R) Xeon(TM) CPU 1.40GHz
stepping : 1
cpu MHz : 1393.851
cache size : 256 KB
physical id : 0
siblings : 2
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
bogomips : 2752.51

I discovered that if I disable P4 support on this host and run with
P3 Xeon support instead, things start working. The host type in the
boot up is identified as a P4/Xeon:

Jun 14 11:25:19 k4 kernel: Booting processor 2/2 eip 3000
Jun 14 11:25:19 k4 kernel: CPU 2 irqstacks, hard=c03e7000 soft=c03df000
Jun 14 11:25:19 k4 kernel: Initializing CPU#2
Jun 14 11:25:19 k4 kernel: CPU: Trace cache: 12K uops, L1 D cache: 8K
Jun 14 11:25:19 k4 kernel: CPU: L2 cache: 256K
Jun 14 11:25:19 k4 kernel: CPU: L3 cache: 512K
Jun 14 11:25:19 k4 kernel: CPU: Physical Processor ID: 1
Jun 14 11:25:19 k4 kernel: Intel machine check architecture supported.
Jun 14 11:25:19 k4 kernel: Intel machine check reporting enabled on CPU#2.
Jun 14 11:25:19 k4 kernel: CPU2: Intel P4/Xeon Extended MCE MSRs (12) available
Jun 14 11:25:19 k4 kernel: CPU2: Intel(R) Xeon(TM) CPU 1.40GHz stepping 01

So is this some P4 specific optimization which is not working as
intended?

Steve
