From: Eric W. Biederman on
Vitaly Mayatskikh <v.mayatskih(a)gmail.com> writes:

> Patch applies to 2.6.34-rc5
>
> On x86, even if the hardware is 64-bit capable, the kernel starts
> execution in 32-bit mode. When a system is kdump-enabled, the crashed
> kernel switches to 32-bit mode and jumps into the new kernel. This
> limits the location of the dump-capture kernel image and its initrd to
> the first 4 GB of memory. The switch to 32-bit mode is performed by the
> purgatory code, which has relocations of type R_X86_64_32S (32-bit
> signed), and this cuts the "good" address space for the crash kernel
> down to 2 GB. I/O regions may reduce this space further.
>
> When a system has a lot of memory (hundreds of gigabytes), the
> dump-capture kernel also needs a relatively large amount of memory to
> account for the old kernel's pages. It may be impossible to reserve
> enough memory below 2 GB or even 4 GB. The simplest solution is to break
> the dump-capture kernel's reserved memory region into two pieces: a
> first (small) region for the kernel and initrd images, which is easily
> placed in the "good" address space at the beginning of physical memory,
> and a second region, which may be located anywhere.
>
> This series of patches implements this approach. It also requires
> changes in the kexec utility to make the feature work, but it is
> backward-compatible: old versions of kexec will work with the new
> kernel. I will post the kexec-tools patch upstream separately.
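
The 2 GB figure quoted above follows from how an R_X86_64_32S relocation
is checked: the 64-bit value must survive truncation to 32 bits followed
by sign extension. A minimal sketch of that check, for illustration only
(the function name is hypothetical and is not taken from purgatory or
kexec-tools):

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def fits_r_x86_64_32s(value: int) -> bool:
    """Return True if a 64-bit value can be stored in a signed 32-bit field.

    Hypothetical helper: it mirrors the rule that the truncated value,
    sign-extended back to 64 bits, must equal the original value.
    """
    value &= MASK64
    low = value & 0xFFFFFFFF
    if low & 0x80000000:
        # Bit 31 set: sign extension fills the upper 32 bits with ones.
        sign_extended = low | 0xFFFFFFFF00000000
    else:
        sign_extended = low
    return sign_extended == value


# The last physical address below 2 GB still fits.
print(fits_r_x86_64_32s(0x7FFFFFFF))
# The first address at 2 GB does not, and neither does anything at 4 GB.
print(fits_r_x86_64_32s(0x80000000))
print(fits_r_x86_64_32s(0x100000000))
```

Physical addresses are non-negative, so only the range below 0x80000000
passes, which is where the 2 GB limit comes from.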

Have you tried loading a 64bit vmlinux directly into a higher address
range? There may be a bit or two missing but you should be able to
load a linux kernel above 4GB. I tested the basics of that mechanism
when I made the 64bit relocatable kernel.

I don't buy the argument that there is a direct connection between
the amount of memory you have and how much memory it takes to dump it.
Even an indirect connection seems suspicious.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: H. Peter Anvin on
On 04/22/2010 03:07 PM, Eric W. Biederman wrote:
>
> Have you tried loading a 64bit vmlinux directly into a higher address
> range? There may be a bit or two missing but you should be able to
> load a linux kernel above 4GB. I tested the basics of that mechanism
> when I made the 64bit relocatable kernel.
>
> I don't buy the argument that there is a direct connection between
> the amount of memory you have and how much memory it takes to dump it.
> Even an indirect connection seems suspicious.
>

We actually have a 64-bit entry point even in bzImage; it is at offset
+0x200 from the 32-bit entry point. Right now that offset is not
exported anywhere, but it has been stable for a very long time... at
least for as far back as the decompressor has been 64 bits.

The interface to the 64-bit code is by necessity wider, since there is
no such thing as paging off in 64-bit mode, but it probably isn't *too*
hard to figure out how page tables need to be set up in order to work
properly. At that point, it would be good to document it.

-hpa
From: Vivek Goyal on
On Thu, Apr 22, 2010 at 03:07:11PM -0700, Eric W. Biederman wrote:
> Vitaly Mayatskikh <v.mayatskih(a)gmail.com> writes:
>
> > Patch applies to 2.6.34-rc5
> >
> > On x86, even if the hardware is 64-bit capable, the kernel starts
> > execution in 32-bit mode. When a system is kdump-enabled, the crashed
> > kernel switches to 32-bit mode and jumps into the new kernel. This
> > limits the location of the dump-capture kernel image and its initrd to
> > the first 4 GB of memory. The switch to 32-bit mode is performed by the
> > purgatory code, which has relocations of type R_X86_64_32S (32-bit
> > signed), and this cuts the "good" address space for the crash kernel
> > down to 2 GB. I/O regions may reduce this space further.
> >
> > When a system has a lot of memory (hundreds of gigabytes), the
> > dump-capture kernel also needs a relatively large amount of memory to
> > account for the old kernel's pages. It may be impossible to reserve
> > enough memory below 2 GB or even 4 GB. The simplest solution is to break
> > the dump-capture kernel's reserved memory region into two pieces: a
> > first (small) region for the kernel and initrd images, which is easily
> > placed in the "good" address space at the beginning of physical memory,
> > and a second region, which may be located anywhere.
> >
> > This series of patches implements this approach. It also requires
> > changes in the kexec utility to make the feature work, but it is
> > backward-compatible: old versions of kexec will work with the new
> > kernel. I will post the kexec-tools patch upstream separately.
>
> Have you tried loading a 64bit vmlinux directly into a higher address
> range? There may be a bit or two missing but you should be able to
> load a linux kernel above 4GB. I tested the basics of that mechanism
> when I made the 64bit relocatable kernel.

I guess even if it works, carrying vmlinux (instead of the relocatable
bzImage) will become an additional liability for distributions. So we will
have to find a way to make bzImage work.

>
> I don't buy the argument that there is a direct connection between
> the amount of memory you have and how much memory it takes to dump it.
> Even an indirect connection seems suspicious.

User-space memory requirements might be of interest, though, for example
those of dump filtering tools. I vaguely remember that the filtering tool
used to first traverse all the memory pages, create some internal data
structures and then start dumping.

So the memory required by the filtering tool might be directly proportional
to the amount of memory present in the system.

Vitaly, have you really run into cases where the 2G upper limit is a
concern? What is the configuration you have, how much memory does it have
and how much memory are you planning to reserve for the kdump kernel?

Thanks
Vivek
From: Vivek Goyal on
On Thu, Apr 22, 2010 at 05:48:53PM -0700, Eric W. Biederman wrote:
> Vivek Goyal <vgoyal(a)redhat.com> writes:
>
> > On Thu, Apr 22, 2010 at 03:07:11PM -0700, Eric W. Biederman wrote:
> >> Vitaly Mayatskikh <v.mayatskih(a)gmail.com> writes:
> >> >
> >> > This series of patches implements this approach. It also requires
> >> > changes in the kexec utility to make the feature work, but it is
> >> > backward-compatible: old versions of kexec will work with the new
> >> > kernel. I will post the kexec-tools patch upstream separately.
> >>
> >> Have you tried loading a 64bit vmlinux directly into a higher address
> >> range? There may be a bit or two missing but you should be able to
> >> load a linux kernel above 4GB. I tested the basics of that mechanism
> >> when I made the 64bit relocatable kernel.
> >
> > I guess even if it works, carrying vmlinux (instead of the relocatable
> > bzImage) will become an additional liability for distributions. So we
> > will have to find a way to make bzImage work.
>
> As Peter pointed out, we actually have everything we need except
> a bit of documentation and the flag that says this is a 64bit kernel.
>
> From a testing perspective a 64bit vmlinux should work today without
> changes. Once it is confirmed there is a solution with the 64bit
> kernel we just need a small patch to boot.txt and a few tweaks to
> /sbin/kexec to handle a 64bit bzImage.
>

Agreed. Doing a little more testing, fixing some issues if need be, and
making the 64-bit bzImage work is a better way than splitting the reserved
memory.

Thanks
Vivek
From: Eric W. Biederman on
Vivek Goyal <vgoyal(a)redhat.com> writes:

> On Thu, Apr 22, 2010 at 03:07:11PM -0700, Eric W. Biederman wrote:
>> Vitaly Mayatskikh <v.mayatskih(a)gmail.com> writes:
>> >
>> > This series of patches implements this approach. It also requires
>> > changes in the kexec utility to make the feature work, but it is
>> > backward-compatible: old versions of kexec will work with the new
>> > kernel. I will post the kexec-tools patch upstream separately.
>>
>> Have you tried loading a 64bit vmlinux directly into a higher address
>> range? There may be a bit or two missing but you should be able to
>> load a linux kernel above 4GB. I tested the basics of that mechanism
>> when I made the 64bit relocatable kernel.
>
> I guess even if it works, carrying vmlinux (instead of the relocatable
> bzImage) will become an additional liability for distributions. So we
> will have to find a way to make bzImage work.

As Peter pointed out, we actually have everything we need except
a bit of documentation and the flag that says this is a 64bit kernel.

From a testing perspective a 64bit vmlinux should work today without
changes. Once it is confirmed there is a solution with the 64bit
kernel we just need a small patch to boot.txt and a few tweaks to
/sbin/kexec to handle a 64bit bzImage.

>> I don't buy the argument that there is a direct connection between
>> the amount of memory you have and how much memory it takes to dump it.
>> Even an indirect connection seems suspicious.
>
> User-space memory requirements might be of interest, though, for example
> those of dump filtering tools. I vaguely remember that the filtering tool
> used to first traverse all the memory pages, create some internal data
> structures and then start dumping.
>
> So the memory required by the filtering tool might be directly
> proportional to the amount of memory present in the system.

Assuming your dump filtering tool creates a bitmap of pages to be dumped
(one bit per 4KB page), you get a ratio of 32K to 1, or 3MB for 100GB
and 32MB for 1TB. That is noticeable in the worst case but definitely
not enough to push us past 2GB.
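
The arithmetic behind those figures can be checked with a short sketch.
It assumes 4 KiB pages and exactly one bitmap bit per page; the function
name and constants are illustrative only and do not describe how any
particular dump filtering tool is implemented:

```python
PAGE_SIZE = 4096      # assumption: 4 KiB pages
BITS_PER_BYTE = 8
GIB = 1 << 30
MIB = 1 << 20


def bitmap_bytes(ram_bytes: int, page_size: int = PAGE_SIZE) -> int:
    """Bytes needed for a one-bit-per-page bitmap covering ram_bytes."""
    pages = -(-ram_bytes // page_size)      # ceiling division
    return -(-pages // BITS_PER_BYTE)       # ceiling division


# RAM bytes covered by one bitmap byte: 4096 * 8 = 32768, the "32K to 1".
print(PAGE_SIZE * BITS_PER_BYTE)

for label, ram in (("100 GiB", 100 * GIB), ("1 TiB", 1024 * GIB)):
    print(f"{label}: {bitmap_bytes(ram) / MIB:.3f} MiB")
```

Under these assumptions the bitmap costs about 3 MiB for 100 GiB of RAM
and 32 MiB for 1 TiB.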

> Vitaly, have you really run into cases where the 2G upper limit is a
> concern? What is the configuration you have, how much memory does it have
> and how much memory are you planning to reserve for the kdump kernel?

A good question.

Eric