From: Michael Ellerman on
On Sun, 2010-03-28 at 19:43 -0700, Yinghai Lu wrote:
> They will check if the region array is big enough.
>
> __check_and_double_region_array will try to double the region if that array spare
> slots if not big enough.
> find_lmb_area() is used to find good postion for new region array.
> Old array will be copied to new array.
>
> Arch code should provide to get_max_mapped, so the new array have accessiable
> address
...
> diff --git a/mm/lmb.c b/mm/lmb.c
> index d5d5dc4..9798458 100644
> --- a/mm/lmb.c
> +++ b/mm/lmb.c
> @@ -551,6 +551,95 @@ int lmb_find(struct lmb_property *res)
> return -1;
> }
>
> +u64 __weak __init get_max_mapped(void)
> +{
> + u64 end = max_low_pfn;
> +
> + end <<= PAGE_SHIFT;
> +
> + return end;
> +}

^ This is (sort of) what lmb.rmo_size represents. So maybe instead of
adding this function, we could just say that the arch code needs to set
rmo_size up with an appropriate value, and then use that below. Though
maybe that's conflating things.

....
> +
> +void __init add_lmb_memory(u64 start, u64 end)
> +{
> + __check_and_double_region_array(&lmb.memory, &lmb_memory_region[0], start, end);
> + lmb_add(start, end - start);
> +}
> +
> +void __init reserve_lmb(u64 start, u64 end, char *name)
> +{
> + if (start == end)
> + return;
> +
> + if (WARN_ONCE(start > end, "reserve_lmb: wrong range [%#llx, %#llx]\n", start, end))
> + return;
> +
> + __check_and_double_region_array(&lmb.reserved, &lmb_reserved_region[0], start, end);
> + lmb_reserve(start, end - start);
> +}
> +
> +void __init free_lmb(u64 start, u64 end)
> +{
> + if (start == end)
> + return;
> +
> + if (WARN_ONCE(start > end, "free_lmb: wrong range [%#llx, %#llx]\n", start, end))
> + return;
> +
> + /* keep punching hole, could run out of slots too */
> + __check_and_double_region_array(&lmb.reserved, &lmb_reserved_region[0], start, end);
> + lmb_free(start, end - start);
> +}

Doesn't this mean that if I call lmb_alloc() or lmb_free() too many
times then I'll potentially run out of space? So doesn't that
essentially break the existing API?

It seems to me that rather than adding these "special" routines that
check for enough space on the way in, instead you should be checking in
lmb_add_region() - which is where AFAICS all allocs/frees/reserves
eventually end up if they need to insert a new region.

cheers


From: Yinghai Lu on
On 03/29/2010 05:22 AM, Michael Ellerman wrote:
> On Sun, 2010-03-28 at 19:43 -0700, Yinghai Lu wrote:
>> They will check if the region array is big enough.
>>
>> __check_and_double_region_array will try to double the region if that array spare
>> slots if not big enough.
>> find_lmb_area() is used to find good postion for new region array.
>> Old array will be copied to new array.
>>
>> Arch code should provide to get_max_mapped, so the new array have accessiable
>> address
> ..
>> diff --git a/mm/lmb.c b/mm/lmb.c
>> index d5d5dc4..9798458 100644
>> --- a/mm/lmb.c
>> +++ b/mm/lmb.c
>> @@ -551,6 +551,95 @@ int lmb_find(struct lmb_property *res)
>> return -1;
>> }
>>
>> +u64 __weak __init get_max_mapped(void)
>> +{
>> + u64 end = max_low_pfn;
>> +
>> + end <<= PAGE_SHIFT;
>> +
>> + return end;
>> +}
>
> ^ This is (sort of) what lmb.rmo_size represents. So maybe instead of
> adding this function, we could just say that the arch code needs to set
> rmo_size up with an appropriate value, and then use that below. Though
> maybe that's conflating things.

ok

will have another patch following this patchset. to use rmo_size replace get_max_mapped()

long __init_lmb lmb_add(u64 base, u64 size)
{
struct lmb_region *_rgn = &lmb.memory;

/* On pSeries LPAR systems, the first LMB is our RMO region. */
if (base == 0)
lmb.rmo_size = size;

return lmb_add_region(_rgn, base, size);

}

looks scary.
maybe later powerpc could used lmb_find and set_lmb_rmo_size in their arch code.


>
> ...
>> +
>> +void __init add_lmb_memory(u64 start, u64 end)
>> +{
>> + __check_and_double_region_array(&lmb.memory, &lmb_memory_region[0], start, end);
>> + lmb_add(start, end - start);
>> +}
>> +
>> +void __init reserve_lmb(u64 start, u64 end, char *name)
>> +{
>> + if (start == end)
>> + return;
>> +
>> + if (WARN_ONCE(start > end, "reserve_lmb: wrong range [%#llx, %#llx]\n", start, end))
>> + return;
>> +
>> + __check_and_double_region_array(&lmb.reserved, &lmb_reserved_region[0], start, end);
>> + lmb_reserve(start, end - start);
>> +}
>> +
>> +void __init free_lmb(u64 start, u64 end)
>> +{
>> + if (start == end)
>> + return;
>> +
>> + if (WARN_ONCE(start > end, "free_lmb: wrong range [%#llx, %#llx]\n", start, end))
>> + return;
>> +
>> + /* keep punching hole, could run out of slots too */
>> + __check_and_double_region_array(&lmb.reserved, &lmb_reserved_region[0], start, end);
>> + lmb_free(start, end - start);
>> +}
>
> Doesn't this mean that if I call lmb_alloc() or lmb_free() too many
> times then I'll potentially run out of space? So doesn't that
> essentially break the existing API?

No, I didn't touch existing API, arches other than x86 should have little change about
lmb.memory.region
lmb.reserved.region
become pointer from array.

>
> It seems to me that rather than adding these "special" routines that
> check for enough space on the way in, instead you should be checking in
> lmb_add_region() - which is where AFAICS all allocs/frees/reserves
> eventually end up if they need to insert a new region.

later i prefer to replace lmb_alloc with find_lmb_area + reserve_lmb.

Thanks

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Benjamin Herrenschmidt on
On Mon, 2010-03-29 at 23:22 +1100, Michael Ellerman wrote:
> ^ This is (sort of) what lmb.rmo_size represents. So maybe instead of
> adding this function, we could just say that the arch code needs to set
> rmo_size up with an appropriate value, and then use that below. Though
> maybe that's conflating things.

Well... not quite.

The RMO (which really is the RMA, historical misnaming) is the region of
memory we can access very early during boot (in real mode on ppc64 but
I plan to use it to represent the boot-time fixed mapping on ppc32 at
some stage). It's not strictly equivalent to max lowmem. However, on
ppc64, it happens to be the size of the first added LMB entry/

In any case, the LMBs should really not care. They allocate where you
tell them to.

IE. That stuff is arch specific enough that I suspect we should just
move it out, while the concept of max_lowmem is common enough (at least
for 32-bit archs) that I'm happy to have some provisions for it inside
the LMB core.

Maybe what we need is an arch call to set the allocation "limit". It
could be set in stages during boot. To the initial mapped memory (bolted
TLB) on ppc32 very early, and then pushed up to max_low_pfn as soon as
the full MMU setup is done for example. IE. All we need is an
lmb_set_alloc_limit() called by the arch in the right spots, that
defines the default allocation limit for lmb_alloc() though of course
lmb_alloc_base() can be used by callers to enforce explicit limits.

Cheers,
Ben.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Michael Ellerman on
On Mon, 2010-03-29 at 09:45 -0700, Yinghai Lu wrote:
> On 03/29/2010 05:22 AM, Michael Ellerman wrote:
> > On Sun, 2010-03-28 at 19:43 -0700, Yinghai Lu wrote:
> >> They will check if the region array is big enough.
> >>
> >> __check_and_double_region_array will try to double the region if that array spare
> >> slots if not big enough.
> >> find_lmb_area() is used to find good postion for new region array.
> >> Old array will be copied to new array.
> >>
> >> Arch code should provide to get_max_mapped, so the new array have accessiable
> >> address
> > ..
> >> diff --git a/mm/lmb.c b/mm/lmb.c
> >> index d5d5dc4..9798458 100644
> >> --- a/mm/lmb.c
> >> +++ b/mm/lmb.c
> >> @@ -551,6 +551,95 @@ int lmb_find(struct lmb_property *res)
> >> return -1;
> >> }
> >>
> >> +u64 __weak __init get_max_mapped(void)
> >> +{
> >> + u64 end = max_low_pfn;
> >> +
> >> + end <<= PAGE_SHIFT;
> >> +
> >> + return end;
> >> +}
> >
> > ^ This is (sort of) what lmb.rmo_size represents. So maybe instead of
> > adding this function, we could just say that the arch code needs to set
> > rmo_size up with an appropriate value, and then use that below. Though
> > maybe that's conflating things.
>
> ok
>
> will have another patch following this patchset. to use rmo_size replace get_max_mapped()

No don't, Benh's idea was better. Leave rmo_size for now, we can clean
that up later.

We just need a lmb.alloc_limit and a lmb_set_alloc_limit() which arch
code calls when it knows what the alloc limit is (and can call multiple
times during boot). Or maybe it should be called "default_alloc_limit",
but that's getting a bit long winded.

>
> long __init_lmb lmb_add(u64 base, u64 size)
> {
> struct lmb_region *_rgn = &lmb.memory;
>
> /* On pSeries LPAR systems, the first LMB is our RMO region. */
> if (base == 0)
> lmb.rmo_size = size;
>
> return lmb_add_region(_rgn, base, size);
>
> }
>
> looks scary.
> maybe later powerpc could used lmb_find and set_lmb_rmo_size in their arch code.

It's not really scary, and it gives you a hint where the code came from
originally :)

We can remove that later though, with some powerpc code to detect the
first memory region before we put it into lmb.

> > ...
> >> +
> >> +void __init add_lmb_memory(u64 start, u64 end)
> >> +{
> >> + __check_and_double_region_array(&lmb.memory, &lmb_memory_region[0], start, end);
> >> + lmb_add(start, end - start);
> >> +}
> >> +
> >> +void __init reserve_lmb(u64 start, u64 end, char *name)
> >> +{
> >> + if (start == end)
> >> + return;
> >> +
> >> + if (WARN_ONCE(start > end, "reserve_lmb: wrong range [%#llx, %#llx]\n", start, end))
> >> + return;
> >> +
> >> + __check_and_double_region_array(&lmb.reserved, &lmb_reserved_region[0], start, end);
> >> + lmb_reserve(start, end - start);
> >> +}
> >> +
> >> +void __init free_lmb(u64 start, u64 end)
> >> +{
> >> + if (start == end)
> >> + return;
> >> +
> >> + if (WARN_ONCE(start > end, "free_lmb: wrong range [%#llx, %#llx]\n", start, end))
> >> + return;
> >> +
> >> + /* keep punching hole, could run out of slots too */
> >> + __check_and_double_region_array(&lmb.reserved, &lmb_reserved_region[0], start, end);
> >> + lmb_free(start, end - start);
> >> +}
> >
> > Doesn't this mean that if I call lmb_alloc() or lmb_free() too many
> > times then I'll potentially run out of space? So doesn't that
> > essentially break the existing API?
>
> No, I didn't touch existing API, arches other than x86 should have little change about
> lmb.memory.region
> lmb.reserved.region
> become pointer from array.

But that's my point. You shouldn't need to touch the existing API, and
you shouldn't need to add a new parallel API. You should just be able to
add the logic for doubling the array in the lmb core, and then everyone
gets dynamically expandable lmb. I don't see any reason why we want to
have two APIs.

> > It seems to me that rather than adding these "special" routines that
> > check for enough space on the way in, instead you should be checking in
> > lmb_add_region() - which is where AFAICS all allocs/frees/reserves
> > eventually end up if they need to insert a new region.
>
> later i prefer to replace lmb_alloc with find_lmb_area + reserve_lmb.

Why? The existing code has been working for years and is well tested?

cheers


From: Yinghai Lu on
On 03/29/2010 03:20 PM, Michael Ellerman wrote:
> On Mon, 2010-03-29 at 09:45 -0700, Yinghai Lu wrote:
>> On 03/29/2010 05:22 AM, Michael Ellerman wrote:
>>> On Sun, 2010-03-28 at 19:43 -0700, Yinghai Lu wrote:
>>>> They will check if the region array is big enough.
>>>>
>>>> __check_and_double_region_array will try to double the region if that array spare
>>>> slots if not big enough.
>>>> find_lmb_area() is used to find good postion for new region array.
>>>> Old array will be copied to new array.
>>>>
>>>> Arch code should provide to get_max_mapped, so the new array have accessiable
>>>> address
>>> ..
>>>> diff --git a/mm/lmb.c b/mm/lmb.c
>>>> index d5d5dc4..9798458 100644
>>>> --- a/mm/lmb.c
>>>> +++ b/mm/lmb.c
>>>> @@ -551,6 +551,95 @@ int lmb_find(struct lmb_property *res)
>>>> return -1;
>>>> }
>>>>
>>>> +u64 __weak __init get_max_mapped(void)
>>>> +{
>>>> + u64 end = max_low_pfn;
>>>> +
>>>> + end <<= PAGE_SHIFT;
>>>> +
>>>> + return end;
>>>> +}
>>>
>>> ^ This is (sort of) what lmb.rmo_size represents. So maybe instead of
>>> adding this function, we could just say that the arch code needs to set
>>> rmo_size up with an appropriate value, and then use that below. Though
>>> maybe that's conflating things.
>>
>> ok
>>
>> will have another patch following this patchset. to use rmo_size replace get_max_mapped()
>
> No don't, Benh's idea was better. Leave rmo_size for now, we can clean
> that up later.
>
> We just need a lmb.alloc_limit and a lmb_set_alloc_limit() which arch
> code calls when it knows what the alloc limit is (and can call multiple
> times during boot). Or maybe it should be called "default_alloc_limit",
> but that's getting a bit long winded.

ok, I will get_max_mapped() for now, an will change to new field...later

>
>>
>> long __init_lmb lmb_add(u64 base, u64 size)
>> {
>> struct lmb_region *_rgn = &lmb.memory;
>>
>> /* On pSeries LPAR systems, the first LMB is our RMO region. */
>> if (base == 0)
>> lmb.rmo_size = size;
>>
>> return lmb_add_region(_rgn, base, size);
>>
>> }
>>
>> looks scary.
>> maybe later powerpc could used lmb_find and set_lmb_rmo_size in their arch code.
>
> It's not really scary, and it gives you a hint where the code came from
> originally :)
>
> We can remove that later though, with some powerpc code to detect the
> first memory region before we put it into lmb.

good.

>
>>> ...
>>>> +
>>>> +void __init add_lmb_memory(u64 start, u64 end)
>>>> +{
>>>> + __check_and_double_region_array(&lmb.memory, &lmb_memory_region[0], start, end);
>>>> + lmb_add(start, end - start);
>>>> +}
>>>> +
>>>> +void __init reserve_lmb(u64 start, u64 end, char *name)
>>>> +{
>>>> + if (start == end)
>>>> + return;
>>>> +
>>>> + if (WARN_ONCE(start > end, "reserve_lmb: wrong range [%#llx, %#llx]\n", start, end))
>>>> + return;
>>>> +
>>>> + __check_and_double_region_array(&lmb.reserved, &lmb_reserved_region[0], start, end);
>>>> + lmb_reserve(start, end - start);
>>>> +}
>>>> +
>>>> +void __init free_lmb(u64 start, u64 end)
>>>> +{
>>>> + if (start == end)
>>>> + return;
>>>> +
>>>> + if (WARN_ONCE(start > end, "free_lmb: wrong range [%#llx, %#llx]\n", start, end))
>>>> + return;
>>>> +
>>>> + /* keep punching hole, could run out of slots too */
>>>> + __check_and_double_region_array(&lmb.reserved, &lmb_reserved_region[0], start, end);
>>>> + lmb_free(start, end - start);
>>>> +}
>>>
>>> Doesn't this mean that if I call lmb_alloc() or lmb_free() too many
>>> times then I'll potentially run out of space? So doesn't that
>>> essentially break the existing API?
>>
>> No, I didn't touch existing API, arches other than x86 should have little change about
>> lmb.memory.region
>> lmb.reserved.region
>> become pointer from array.
>
> But that's my point. You shouldn't need to touch the existing API, and
> you shouldn't need to add a new parallel API. You should just be able to
> add the logic for doubling the array in the lmb core, and then everyone
> gets dynamically expandable lmb. I don't see any reason why we want to
> have two APIs.

that will have too much change for all x86 caller.

that set of API is __init, and will be freed later.

and these two are just wrapper for old one to make x86 transition more easily.

>
>>> It seems to me that rather than adding these "special" routines that
>>> check for enough space on the way in, instead you should be checking in
>>> lmb_add_region() - which is where AFAICS all allocs/frees/reserves
>>> eventually end up if they need to insert a new region.
>>
>> later i prefer to replace lmb_alloc with find_lmb_area + reserve_lmb.
>
> Why? The existing code has been working for years and is well tested?

We need to find_lmb_area to find good position for new reserved region array.

static void __init __reserve_lmb(u64 start, u64 end, char *name)
{
__check_and_double_region_array(&lmb.reserved, &lmb_reserved_region[0], start, end);
lmb_reserve(start, end - start);
}

with the execlude area in __check_and_double_region_lmb() will not put the new array overlap with area that we are going to reserve.

and find_lmb_area + reserve_lmb should be identical to lmb_alloc...

Thanks

Yinghai Lu
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/