From: Jeremy Fitzhardinge on
On 06/17/2010 10:35 AM, Jeremy Fitzhardinge wrote:
> I guess it would be possible to special-case ioremap to allow the
> creation of such mappings, but I don't know what kind of system-wide
> fallout would happen as a result. The consequences of something trying
> to extract a pfn from one of those ptes would be
>

....very bad, as it would result in truncated pfns and likely cause some
kind of corruption.
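
To make the truncation concrete, here is a minimal userspace sketch, not kernel code. It assumes a 64-bit PAE-style pte with the physical address in bits 12..51, and models the 32-bit kernel's "unsigned long" pfn as a uint32_t. All model_* names and MODEL_PTE_PFN_MASK are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12

/* Hypothetical model of a PAE pte: 64 bits, address in bits 12..51. */
typedef uint64_t model_pte_t;

/* Hypothetical model of "unsigned long" on a 32-bit kernel. */
typedef uint32_t model_pfn32_t;

/* Bits 12..51 of the pte hold the page frame address. */
#define MODEL_PTE_PFN_MASK 0x000ffffffffff000ULL

/* Extract the pfn without losing any bits. */
static uint64_t model_pte_pfn_full(model_pte_t pte)
{
    return (pte & MODEL_PTE_PFN_MASK) >> PAGE_SHIFT;
}

/* Extract the pfn into a 32-bit variable; bits 32 and up are dropped. */
static model_pfn32_t model_pte_pfn_32(model_pte_t pte)
{
    return (model_pfn32_t)model_pte_pfn_full(pte);
}
```

With this model, a pte pointing at physical address 0xfc000000000 (inside the 44-bit range) round-trips through the 32-bit pfn, while one pointing at 1 << 44 comes back as pfn 0, i.e. it silently aliases physical page 0.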

(oops, sent too early)

J
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Jeremy Fitzhardinge on
On 06/17/2010 07:03 AM, H. Peter Anvin wrote:
> On 06/16/2010 09:55 PM, Kenji Kaneshige wrote:
>
>>> I think they might be. Kenji?
>>>
>> No. My addresses are in the 44-bits range (around fc000000000). So it is
>> not required for my problem. This change assumes that phys_addr can be
>> above 44-bits (up to 52-bits (and higher in the future?)).
>>
>> By the way, is there linux kernel limit regarding above 44-bits physical
>> address in x86_32 PAE? For example, pfn above 32-bits is not supported?
>>
>>

That's an awkward situation. I would tend to suggest that you not
support this type of machine with a 32-bit kernel. Is it a sparse
memory system, or is there a device mapped in that range?

I guess it would be possible to special-case ioremap to allow the
creation of such mappings, but I don't know what kind of system-wide
fallout would happen as a result. The consequences of something trying
to extract a pfn from one of those ptes would be

> There are probably places at which PFNs are held in 32-bit numbers,
> although it would be good to track them down if it isn't too expensive
> to fix them (i.e. doesn't affect generic code.)
>

There are many places which hold pfns in 32 bit variables on 32 bit
systems; the standard type for pfns is "unsigned long", pretty much
everywhere in the kernel. It might be worth defining a pfn_t and
converting usage over to that, but it would be a pervasive change.

> This also affects paravirt systems, i.e. right now Xen has to locate all
> 32-bit guests below 64 GB, which limits its usefulness.
>

I don't think the limit is 64GB. A 32-bit PFN limits us to 2^44, which
is 16TB. (32-bit PV Xen guests have another unrelated limit of around
160GB physical memory because that is as much m2p table as will fit into
the Xen hole in the kernel mapping.)
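
The arithmetic behind those two figures can be checked with a small sketch, assuming 4KB pages (PAGE_SHIFT of 12). The helper name phys_limit_bytes is made up for illustration. Reading 64GB as the 36-bit physical address limit of original PAE (a 24-bit pfn) is my assumption about where that number comes from.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12

/*
 * Size of the physical address space reachable when a page frame
 * number has pfn_bits usable bits: 2^(pfn_bits + PAGE_SHIFT) bytes.
 * Hypothetical helper, valid for pfn_bits + PAGE_SHIFT < 64.
 */
static uint64_t phys_limit_bytes(unsigned pfn_bits)
{
    return 1ULL << (pfn_bits + PAGE_SHIFT);
}
```

A 32-bit pfn gives 2^(32+12) = 2^44 bytes, which is 16TB; a 24-bit pfn gives 2^36 bytes, which is 64GB.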

>> #ifdef CONFIG_X86_PAE
>> /* 44=32+12, the limit we can fit into an unsigned long pfn */
>> #define __PHYSICAL_MASK_SHIFT 44
>> #define __VIRTUAL_MASK_SHIFT 32
>>
>> If there is 44-bits physical address limit, I think it's better to use
>> PHYSICAL_PAGE_MASK for masking physical address, instead of
>> "(phys_addr >> PAGE_SHIFT) << PAGE_SHIFT)". The PHYSICAL_PAGE_MASK would
>> become greater value when 44-bits physical address limit is eliminated.
>> And maybe we need to change phys_addr_valid() returns error if physical
>> address is above (1 << __PHYSICAL_MASK_SHIFT)?
>>
> The real question is how much we can fix without an unreasonable cost.
>

I think it would be a pretty large change. From Xen's perspective,
any machine even approximately approaching the 2^44 limit will be
capable of running Xen guests in hvm mode, so PV isn't really a concern.

J
From: H. Peter Anvin on
On 06/17/2010 02:35 AM, Jeremy Fitzhardinge wrote:
>>>
>>> By the way, is there linux kernel limit regarding above 44-bits physical
>>> address in x86_32 PAE? For example, pfn above 32-bits is not supported?
>
> That's an awkward situation. I would tend to suggest that you not
> support this type of machine with a 32-bit kernel. Is it a sparse
> memory system, or is there a device mapped in that range?
>
> I guess it would be possible to special-case ioremap to allow the
> creation of such mappings, but I don't know what kind of system-wide
> fallout would happen as a result. The consequences of something trying
> to extract a pfn from one of those ptes would be
>
>> There are probably places at which PFNs are held in 32-bit numbers,
>> although it would be good to track them down if it isn't too expensive
>> to fix them (i.e. doesn't affect generic code.)
>>
>
> There are many places which hold pfns in 32 bit variables on 32 bit
> systems; the standard type for pfns is "unsigned long", pretty much
> everywhere in the kernel. It might be worth defining a pfn_t and
> converting usage over to that, but it would be a pervasive change.
>

I think you're right, and just making 2^44 work correctly would be good
enough. Doing special forwarding of all 52 bits of the real physical
address in the paravirt case (where it is self-contained and doesn't
spill into the rest of the kernel) would probably be a good thing, though.

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

From: Kenji Kaneshige on
(2010/06/17 18:35), Jeremy Fitzhardinge wrote:
> On 06/17/2010 07:03 AM, H. Peter Anvin wrote:
>> On 06/16/2010 09:55 PM, Kenji Kaneshige wrote:
>>
>>>> I think they might be. Kenji?
>>>>
>>> No. My addresses are in the 44-bits range (around fc000000000). So it is
>>> not required for my problem. This change assumes that phys_addr can be
>>> above 44-bits (up to 52-bits (and higher in the future?)).
>>>
>>> By the way, is there linux kernel limit regarding above 44-bits physical
>>> address in x86_32 PAE? For example, pfn above 32-bits is not supported?
>>>
>>>
>
> That's an awkward situation. I would tend to suggest that you not
> support this type of machine with a 32-bit kernel. Is it a sparse
> memory system, or is there a device mapped in that range?
>

It is a device-mapped range in my case.
Fortunately, the address is within the 44-bit range. I'd like to focus
on making 2^44 work correctly this time.

Thanks,
Kenji Kaneshige






From: Kenji Kaneshige on
(2010/06/17 22:46), H. Peter Anvin wrote:
> On 06/17/2010 02:35 AM, Jeremy Fitzhardinge wrote:
>>>>
>>>> By the way, is there linux kernel limit regarding above 44-bits physical
>>>> address in x86_32 PAE? For example, pfn above 32-bits is not supported?
>>
>> That's an awkward situation. I would tend to suggest that you not
>> support this type of machine with a 32-bit kernel. Is it a sparse
>> memory system, or is there a device mapped in that range?
>>
>> I guess it would be possible to special-case ioremap to allow the
>> creation of such mappings, but I don't know what kind of system-wide
>> fallout would happen as a result. The consequences of something trying
>> to extract a pfn from one of those ptes would be
>>
>>> There are probably places at which PFNs are held in 32-bit numbers,
>>> although it would be good to track them down if it isn't too expensive
>>> to fix them (i.e. doesn't affect generic code.)
>>>
>>
>> There are many places which hold pfns in 32 bit variables on 32 bit
>> systems; the standard type for pfns is "unsigned long", pretty much
>> everywhere in the kernel. It might be worth defining a pfn_t and
>> converting usage over to that, but it would be a pervasive change.
>>
>
> I think you're right, and just making 2^44 work correctly would be good
> enough. Doing special forwarding of all 52 bits of the real physical
> address in the paravirt case (where it is self-contained and doesn't
> spill into the rest of the kernel) would probably be a good thing, though.
>
> -hpa
>

I'll focus on making 2^44 work correctly. I'll make the following
changes in the next version of my patch.

- The v.2 patch uses resource_size_t for pfn. I'll keep using
  resource_size_t for pfn in v.3 as well, because there is no reason to
  leave it as "unsigned long".

- Use PHYSICAL_PAGE_MASK for masking the physical address, as the v.1
  patch did. I think changing the definition of PAGE_MASK is a little
  risky.
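
For reference, the difference between the masking options can be shown with a userspace sketch. It is a model under stated assumptions, not the kernel's definitions: the 32-bit "unsigned long" page mask is modeled as a uint32_t, the physical mask is fixed at 44 bits, and every MODEL_* macro and function name below is made up for illustration. The validity check rejects addresses at or above 1 << 44, which is one possible reading of the phys_addr_valid() idea discussed earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define MODEL_PHYSICAL_MASK_SHIFT 44

/* Page mask as a 32-bit quantity, as "unsigned long" would be on x86_32. */
#define MODEL_PAGE_MASK32 ((uint32_t)~((1u << PAGE_SHIFT) - 1))

/* Low 44 bits set: the modeled physical address range. */
#define MODEL_PHYSICAL_MASK ((1ULL << MODEL_PHYSICAL_MASK_SHIFT) - 1)

/* Page mask sign-extended to 64 bits, then limited to the physical range. */
#define MODEL_PHYSICAL_PAGE_MASK \
    ((uint64_t)(int64_t)(int32_t)MODEL_PAGE_MASK32 & MODEL_PHYSICAL_MASK)

/* The 32-bit mask is zero-extended, so bits 32 and up are cleared. */
static uint64_t mask_with_page_mask32(uint64_t phys)
{
    return phys & MODEL_PAGE_MASK32;
}

/* Shifting a 64-bit value down and up keeps every bit above the offset. */
static uint64_t mask_with_shifts(uint64_t phys)
{
    return (phys >> PAGE_SHIFT) << PAGE_SHIFT;
}

/* Keeps bits 12..43 and clears both the offset and anything above 2^44. */
static uint64_t mask_with_physical_page_mask(uint64_t phys)
{
    return phys & MODEL_PHYSICAL_PAGE_MASK;
}

/* Hypothetical check: nonzero only if phys fits below 1 << 44. */
static int model_phys_addr_valid(uint64_t phys)
{
    return (phys >> MODEL_PHYSICAL_MASK_SHIFT) == 0;
}
```

For an address such as 0xfc000000123, the 32-bit mask yields 0, while both the shift form and the physical page mask yield 0xfc000000000. The two differ only above 2^44, where the physical page mask clears the extra bits and the shift form keeps them.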

Thanks,
Kenji Kaneshige

