From: Arno on
In comp.sys.ibm.pc.hardware.storage Vlad_Inhaler <andrew.williams(a)t-online.de> wrote:
> On Mar 9, 12:52 am, Arno <m...(a)privacy.net> wrote:
>> In comp.sys.ibm.pc.hardware.storage Ant <a...(a)zimage.comant> wrote:
>>
>> > On 3/7/2010 9:20 AM PT, Arno typed:
>>
>> >> The reason no disk access is possible is simple: A kernel
>> >> panic only happens when the kernel internal state is regarded
>> >> as seriously corrupted. A disk access could then cause serious
>> >> filesystem corruption (at least writing) and is therefore
>> >> not done.
>> > So Windows' blue screens with memory dumps are different?
>>
>> You can get a memory dump under Linux as well by using
>> the magic SysRq key (if compiled in), but you cannot write
>> to disk after a panic. It is a safety measure.
>>
>> Arno
>> --
>> Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email: a...(a)wagner.name
>> GnuPG: ID: 1E25338F FP: 0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
>> ----
>> Cuddly UI's are the manifestation of wishful thinking. -- Dylan Evans


> I would have no hesitation in creating a special partition for panic
> dumps, hell - if standard Linux filesystems are that sensitive I'd
> even make it VFAT or whatever else is necessary.
> I have reproducible kernel hangs under a certain kind of load, they
> are *not* temperature related and I have no way of working out what
> the hell is going on. Oh, the machine is dual-boot and I don't have
> these problems under XP.

> Going further into that here would be hijacking this thread, and I
> have tried that before now anyway without success.

> Having some sensible way of taking dumps for further analysis would be
> a really *good thing* - hell, I'd even put an additional old IDE drive
> in there as a destination device if that was what it took. Sorry, but
> that is a 'safety feature' I am not that happy with. Windows can do
> it, mainframe OSs can do it . . .

And Linux can do it. It just dumps to the console instead of disk, and
this choice is reasonable because of data safety, albeit sometimes
inconvenient in cheap setups. (Nothing against cheap setups, but
they are a bit limited on the hardware side and that sometimes is
inconvenient.)

You are supposed to have more than one of these boxes in one place
and then there is no issue. You can also use a number of
serial-over-internet devices to record logs. Or a laptop with
serial interface placed next to the offending machine. Or a modem.
Or a serial data recorder, for example the Logomatic v2 Serial
SD Datalogger (-> Google), which costs about 50 EUR.

The cheapest solution is usually just a serial crossover cable to
the next box in the rack that is under your control. Remember
that this is a server OS we are talking about here, not an
MS single-user-no-network OS that has over the course of time
been heavily extended.

Side note: With server PC hardware you get an IPMI console that
also gives you the output, so the comparison with big iron is not
fair. The serial console is the low-low-cost solution.
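
For the laptop-next-to-the-box variant, the capturing side can be very
small. The following is only a sketch under assumptions: the function name
log_console and the log format are made up for illustration, the panicking
machine is assumed to have been booted with something like
console=ttyS0,115200n8, and the receiving port is assumed to be already
set to the same speed (for example with stty).

```python
import time


def log_console(stream, log, clock=time.time):
    """Copy console lines from `stream` to `log`, one timestamp per line.

    `stream` is any binary file-like object (for example an opened serial
    device); `log` is a text file-like object. Returns the number of lines
    written.
    """
    count = 0
    for raw in stream:
        # Panic output may contain garbage bytes; never let decoding fail.
        line = raw.decode("ascii", errors="replace").rstrip("\r\n")
        log.write(f"{clock():.3f} {line}\n")
        log.flush()  # keep what has arrived even if the capture is interrupted
        count += 1
    return count
```

On the capturing machine you would pass it the serial device opened in
binary mode (for example /dev/ttyS0, the exact path depends on the
hardware) and a log file opened for appending.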

I should also add that a "soft panic" (which is closest to a blue
screen) typically dumps to /var/log/messages. It is only a hard panic
that is limited to the console. A hard panic corresponds to a lockup
without a blue screen on Windows.

Arno

--
Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email: arno(a)wagner.name
GnuPG: ID: 1E25338F FP: 0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
----
Cuddly UI's are the manifestation of wishful thinking. -- Dylan Evans
From: David Brown on
Rod Speed wrote:
> Ant wrote
>> Arno wrote
>
>>> On the other hand, the serial interface is simple, so console
>>> output, including error messages, will still be written to it.
>>> If you need that output, connect a different computer to
>>> the serial port, activate the serial console and capture
>>> its output. I have done this a number of times, mostly to
>>> try out experimental kernels on a cluster, but also to debug
>>> kernel panics.
>
>> Can I use my old serial external dial-up modem for this?
>

It should be possible if the connection was up and running in advance -
I doubt if you'd be able to get a new connection after a disaster.

> Nope, you need a serial cable between the PCs.
>

That's the best idea.

> It would be a lot better if Linux allowed a dump to a USB stick if
> you are happy to risk the contents of the USB stick on a kernel panic.
>

It's the price you pay for flexibility - most of Linux doesn't know that
you have a USB stick attached. It's all just files.

From: Jerry Peters on
In comp.sys.ibm.pc.hardware.storage Vlad_Inhaler <andrew.williams(a)t-online.de> wrote:
> On Mar 9, 12:52 am, Arno <m...(a)privacy.net> wrote:
>> In comp.sys.ibm.pc.hardware.storage Ant <a...(a)zimage.comant> wrote:
>>
>> > On 3/7/2010 9:20 AM PT, Arno typed:
>>
>> >> The reason no disk access is possible is simple: A kernel
>> >> panic only happens when the kernel internal state is regarded
>> >> as seriously corrupted. A disk access could then cause serious
>> >> filesystem corruption (at least writing) and is therefore
>> >> not done.
>> > So Windows' blue screens with memory dumps are different?
>>
>> You can get a memory dump under Linux as well by using
>> the magic SysRq key (if compiled in), but you cannot write
>> to disk after a panic. It is a safety measure.
>>
>> Arno
>> --
>> Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email: a...(a)wagner.name
>> GnuPG: ID: 1E25338F FP: 0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
>> ----
>> Cuddly UI's are the manifestation of wishful thinking. -- Dylan Evans
>
>
> I would have no hesitation in creating a special partition for panic
> dumps, hell - if standard Linux filesystems are that sensitive I'd
> even make it VFAT or whatever else is necessary.
> I have reproducible kernel hangs under a certain kind of load, they
> are *not* temperature related and I have no way of working out what
> the hell is going on. Oh, the machine is dual-boot and I don't have
> these problems under XP.
>
> Going further into that here would be hijacking this thread, and I
> have tried that before now anyway without success.
>
> Having some sensible way of taking dumps for further analysis would be
> a really *good thing* - hell, I'd even put an additional old IDE drive
> in there as a destination device if that was what it took. Sorry, but
> that is a 'safety feature' I am not that happy with. Windows can do
> it, mainframe OSs can do it . . .

Have you looked at kexec? I believe it's designed to allow you to boot
another kernel (already loaded in RAM, IIRC) and dump the failed
environment.
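
Before relying on that, it is worth checking whether the running system
is set up for it at all. As far as I know, the capture kernel needs memory
reserved with a crashkernel= boot parameter, and the kernel reports in
/sys/kernel/kexec_crash_loaded whether a capture kernel is loaded. A rough
sketch of such a check follows; the function names are made up for
illustration and this is not taken from any distribution's tooling.

```python
import re
from pathlib import Path


def crashkernel_reservation(cmdline):
    """Return the value of crashkernel= from a kernel command line, or None."""
    match = re.search(r"(?:^|\s)crashkernel=(\S+)", cmdline)
    return match.group(1) if match else None


def crash_kernel_loaded(path="/sys/kernel/kexec_crash_loaded"):
    """True if the kernel reports that a capture kernel is loaded."""
    try:
        return Path(path).read_text().strip() == "1"
    except OSError:
        # File missing or unreadable: assume no crash-dump support here.
        return False


def report():
    """Print both checks for the running system (Linux only)."""
    cmdline = Path("/proc/cmdline").read_text()
    print("crashkernel reservation:", crashkernel_reservation(cmdline))
    print("capture kernel loaded:  ", crash_kernel_loaded())
```

If the first line prints None or the second prints False, a panic will
not boot into a capture kernel and you are back to console output only.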

The mainframe OS I'm familiar with, MVS aka z/OS, does it by booting a
special program which dumps RAM and optionally virtual memory to tape
or disk.

Jerry
From: Rod Speed on
David Brown wrote:
> Rod Speed wrote:
>> Ant wrote
>>> Arno wrote
>>
>>>> On the other hand, the serial interface is simple, so console
>>>> output, including error messages, will still be written to it.
>>>> If you need that output, connect a different computer to
>>>> the serial port, activate the serial console and capture
>>>> its output. I have done this a number of times, mostly to
>>>> try out experimental kernels on a cluster, but also to debug
>>>> kernel panics.
>>
>>> Can I use my old serial external dial-up modem for this?
>>
>
> It should be possible if the connection was up and running in advance
> - I doubt if you'd be able to get a new connection after a disaster.
>
>> Nope, you need a serial cable between the PCs.
>>
>
> That's the best idea.
>
>> It would be a lot better if Linux allowed a dump to a USB stick if
>> you are happy to risk the contents of the USB stick on a kernel
>> panic.
>
> It's the price you pay for flexibility - most of Linux doesn't know
> that you have a USB stick attached. It's all just files.

It isn't most of Linux that matters; it's just whatever does the dump that needs to know about it.


From: Darren Salt on
I demand that Arno may or may not have written...

> In comp.sys.ibm.pc.hardware.storage Ant <ant(a)zimage.comant> wrote:
>> On 3/7/2010 8:56 AM PT, Yousuf Khan typed:
>>>>> HOWTO enable core-dumps - LinuxReviews
>>>>> http://en.linuxreviews.org/HOWTO_enable_core-dumps
>>>> Thanks. Isn't this for program crashes, not kernel panics? I wonder
>>>> why it was removed because I used to see those core files from crashes.
>>> You may want to ask in a Linux newsgroup for more details.
>> I already am. ;)

> You don't need to,

What – ask in a Linux newsgroup? ;-)

(No, I'm not going to not post this to c.o.l.h.)

> no disk access is possible after a kernel panic, hence no logging. The only
> thing you can do, is to look at the screen or to enable the serial console
> output and log that on another machine.

I normally use netconsole for that.

http://www.mjmwired.net/kernel/Documentation/networking/netconsole.txt
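
netconsole sends the kernel messages as plain UDP datagrams, so the
receiving machine only needs something listening on the target port (6666
unless configured otherwise). netcat is the usual choice; the sketch below
does the same in a few lines and tags each message with the sender. The
function names are made up for illustration and the 4096-byte receive
size is an arbitrary assumption.

```python
import socket


def open_listener(host="0.0.0.0", port=6666):
    """Bind a UDP socket for incoming netconsole messages."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_messages(sock, log, max_datagrams=None):
    """Append each received datagram to `log`, prefixed with the sender IP.

    Runs forever unless `max_datagrams` is given. Returns the number of
    datagrams handled.
    """
    seen = 0
    while max_datagrams is None or seen < max_datagrams:
        data, (sender, _port) = sock.recvfrom(4096)
        text = data.decode("utf-8", errors="replace")
        log.write(f"[{sender}] {text}")
        if not text.endswith("\n"):
            log.write("\n")
        log.flush()  # a panic may be the last thing ever received
        seen += 1
    return seen
```

Since this is UDP, nothing is retransmitted: the listener has to be
running before the panic happens.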

[snip]
--
| Darren Salt | linux at youmustbejoking | nr. Ashington, | Doon
| using Debian GNU/Linux | or ds ,demon,co,uk | Northumberland | Army
| + http://www.youmustbejoking.demon.co.uk/ & http://tlasd.wordpress.com/

I've given up reading books; I find that it takes my mind off myself.