From: Thomas Gleixner on
On Mon, 22 Mar 2010, Yinghai Lu wrote:
> On 03/22/2010 03:09 PM, Thomas Gleixner wrote:
> > On Mon, 22 Mar 2010, Yinghai Lu wrote:
> >> On 03/22/2010 12:37 PM, Ingo Molnar wrote:
>
> >> 1. need to keep e820
> >
> > That's neither an argument for using lmb nor an argument not to use
> > lmb. e820 is x86 specific BIOS wreckage and it's whole purpose is
> > just to feed information into a (hopefully) generic early resource
> > management facility.
> >
> > e820 _CANNOT_ be generalized. Period.

I still want to know, what "need to keep e820" means for you.

> >> 2. use e820 range with RAM to fill lmb.memory when finizing_e820
> >
> > What's finizing_e820 ???
> finish_e820_parsing();

Yinghai, come on. Are you really expecting that everyone involved in
this discussion goes to look up what the heck finish_e820_parsing()
is doing ?

You want to explain why your solution is better or why lmb is not
sufficient, so you better go and explain what finish_e820_parsing()
is, why finish_e820_parsing() is important and why lmb cannot cope
with it.

> >> 3. use lmb.reserved to replace early_res.
> >
> > What's the implication of doing that ?
>
> early_res array is only corresponding to lmb.reserved, aka reserved
> region from kernel.

Is it only corresponding (somehow) or is it a full equivivalent ?

> >> current lmb is merging the region, we can not use name tag any more.
> >
> > What's wrong with merging of regions ? Are you arguing about a
> > specific region ("the region") ?

Care to answer my question ?

> >
> > Which name tag ? And why is that name tag important ?
>
> struct early_res {
> u64 start, end;
> char name[15];
> char overlap_ok;
> };

I'm starting to get annoyed, really. What is that name field for and
why is that "name" field important ?

> >
> >> may need to dump early_memtest, and use early_res for bootmem at
> >> first.
> >
> > Why exactly might early_memtest not longer be possible ?
>
> early_memtest need to call find_e820_area_size
> current lmb doesn't have that kind of find util.
> the one memory subtract reserved memory by kernel.

What subtracts what ? And why is it that hard to fix that ?

> >
> > What means "early_res for bootmem" ?
>
> use early_res to replace bootmem, the CONFIG_NO_BOOTMEM.
> that need early_res can be double or increase the slots automatically.

-ENOPARSE

Yinghai, I asked you to take your time and explain things in detail
instead of shooting unparseable answers within a minute.

Everyone else in this discussion tries to be as explanatory as
possible, just you expect that everyone else is going to dig out the
crystal ball to understand the deeper meanings of your patches.

Again, please take your time to explain what needs to be done or what
is impossible to solve in your opinion, so we can get that resolved in
a way which is satisfactory and useful for all parties involved.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Yinghai Lu on
On 03/22/2010 03:53 PM, Thomas Gleixner wrote:
> On Mon, 22 Mar 2010, Yinghai Lu wrote:
>> On 03/22/2010 03:09 PM, Thomas Gleixner wrote:
>>> On Mon, 22 Mar 2010, Yinghai Lu wrote:
>>>> On 03/22/2010 12:37 PM, Ingo Molnar wrote:
>>
>>>> 1. need to keep e820
>>>
>>> That's neither an argument for using lmb nor an argument not to use
>>> lmb. e820 is x86 specific BIOS wreckage and it's whole purpose is
>>> just to feed information into a (hopefully) generic early resource
>>> management facility.
>>>
>>> e820 _CANNOT_ be generalized. Period.
>
> I still want to know, what "need to keep e820" means for you.

keep the most arch/x86/kernel/e820.c, and later when finish_e820_parsing() is called,
fill lmb.memory according to e820 entries with E820_RAM type.



>
>>>> 2. use e820 range with RAM to fill lmb.memory when finizing_e820
>>>
>>> What's finizing_e820 ???
>> finish_e820_parsing();
>
> Yinghai, come on. Are you really expecting that everyone involved in
> this discussion goes to look up what the heck finish_e820_parsing()
> is doing ?
>
> You want to explain why your solution is better or why lmb is not
> sufficient, so you better go and explain what finish_e820_parsing()
> is, why finish_e820_parsing() is important and why lmb cannot cope
> with it.

current x86:
a. setup e820 array.
b. early_parm mem= and memmap= related code will adjust the e820.

we don't need to call lmb_enforce_memory_limit().

>
>>>> 3. use lmb.reserved to replace early_res.
>>>
>>> What's the implication of doing that ?
>>
>> early_res array is only corresponding to lmb.reserved, aka reserved
>> region from kernel.
>
> Is it only corresponding (somehow) or is it a full equivivalent ?

early_res is not sorted and merged.

>
>>>> current lmb is merging the region, we can not use name tag any more.
>>>
>>> What's wrong with merging of regions ? Are you arguing about a
>>> specific region ("the region") ?
>
> Care to answer my question ?
if range get merged, you can not use name with them.
>
>>>
>>> Which name tag ? And why is that name tag important ?
>>
>> struct early_res {
>> u64 start, end;
>> char name[15];
>> char overlap_ok;
>> };
>
> I'm starting to get annoyed, really. What is that name field for and
> why is that "name" field important ?

at least later when some code free a wrong range, we can figure who cause the problem.

>
>>>
>>>> may need to dump early_memtest, and use early_res for bootmem at
>>>> first.
>>>
>>> Why exactly might early_memtest not longer be possible ?
>>
>> early_memtest need to call find_e820_area_size
>> current lmb doesn't have that kind of find util.
>> the one memory subtract reserved memory by kernel.
>
> What subtracts what ? And why is it that hard to fix that ?

lmb.memory - lmb.reserved

or e820 E820_RAM entries - early_res

move some code from early_res to lmb.c?

>
>>>
>>> What means "early_res for bootmem" ?
>>
>> use early_res to replace bootmem, the CONFIG_NO_BOOTMEM.
>> that need early_res can be double or increase the slots automatically.
>
> -ENOPARSE
>
> Yinghai, I asked you to take your time and explain things in detail
> instead of shooting unparseable answers within a minute.
>
> Everyone else in this discussion tries to be as explanatory as
> possible, just you expect that everyone else is going to dig out the
> crystal ball to understand the deeper meanings of your patches.
>
> Again, please take your time to explain what needs to be done or what
> is impossible to solve in your opinion, so we can get that resolved in
> a way which is satisfactory and useful for all parties involved.

to make x86 to use lmb, we need to extend lmb to have find_early_area.

static int __init find_overlapped_early(u64 start, u64 end)
{
int i;
struct lmb_properties *r;

for (i = 0; i < lmb.reserved_cnt && lmb.reserved.region[i].size; i++) {
r = &lmb.reserved.region[i];
if (end > r->base && start < (r->base + r->size))
break;
}

return i;
}


/* Check for already reserved areas */
static inline int __init bad_addr(u64 *addrp, u64 size, u64 align)
{
int i;
u64 addr = *addrp;
int changed = 0;
struct lmb_properties *r;
again:
i = find_overlapped_early(addr, addr + size);
r = &lmb.reserved.region[i];
if (i < lmb.reserved.cnt && r->size) {
*addrp = addr = round_up(r->base + r->size, align);
changed = 1;
goto again;
}
return changed;
}

u64 __init find_early_area(u64 ei_start, u64 ei_last, u64 start, u64 end,
u64 size, u64 align)
{
u64 addr, last;

addr = round_up(ei_start, align);
if (addr < start)
addr = round_up(start, align);
if (addr >= ei_last)
goto out;
while (bad_addr(&addr, size, align) && addr+size <= ei_last)
;
last = addr + size;
if (last > ei_last)
goto out;
if (last > end)
goto out;

return addr;

out:
return -1ULL;
}

find_early_area_size()...

and use them we can have find_lmb_free_area

/*
* Find a free area with specified alignment in a specific range.
*/
u64 __init find_lmb_area(u64 start, u64 end, u64 size, u64 align)
{
int i;

for (i = 0; i < lmb.memory.cnt; i++) {
u64 ei_start = lmb.memory.region[i].base;
u64 ei_end = ei_start + lmb.memory.region[i].size;

addr = find_early_area(ei_start, ei_last, start, end,
size, align);

if (addr != -1ULL)
return addr;
}
return -1ULL;
}


also later we can use with active_range for bootmem replacement.

u64 __init find_memory_core_early(int nid, u64 size, u64 align,
u64 goal, u64 limit)
{
int i;

/* need to go over early_node_map to find out good range for node */
for_each_active_range_index_in_nid(i, nid) {
u64 addr;
u64 ei_start, ei_last;

ei_last = early_node_map[i].end_pfn;
ei_last <<= PAGE_SHIFT;
ei_start = early_node_map[i].start_pfn;
ei_start <<= PAGE_SHIFT;
addr = find_early_area(ei_start, ei_last,
goal, limit, size, align);

if (addr == -1ULL)
continue;

return addr;
}

return -1ULL;
}


Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Thomas Gleixner on
B1;2005;0cYinghai,

On Mon, 22 Mar 2010, Yinghai Lu wrote:
> On 03/22/2010 03:53 PM, Thomas Gleixner wrote:
> > On Mon, 22 Mar 2010, Yinghai Lu wrote:
> >> On 03/22/2010 03:09 PM, Thomas Gleixner wrote:
> >>> On Mon, 22 Mar 2010, Yinghai Lu wrote:
> >>>> On 03/22/2010 12:37 PM, Ingo Molnar wrote:
> >>
> >>>> 1. need to keep e820
> >>>
> >>> That's neither an argument for using lmb nor an argument not to use
> >>> lmb. e820 is x86 specific BIOS wreckage and it's whole purpose is
> >>> just to feed information into a (hopefully) generic early resource
> >>> management facility.
> >>>
> >>> e820 _CANNOT_ be generalized. Period.
> >
> > I still want to know, what "need to keep e820" means for you.
>
> keep the most arch/x86/kernel/e820.c, and later when
> finish_e820_parsing() is called, fill lmb.memory according to e820
> entries with E820_RAM type.

Right, and we never get rid of e820.c at all. Simply because e820 is a
x86 specific clusterfuck. No way to find anything which is remotely
insane like that in any other architecture.

I really do not understand why you ever thought that moving that code
to a generic place is something useful and acceptable.

The point is, that some of the algorithms which e820 needs to sanitize
the maps might be of general use, but definitely not the whole e820
crappola. And if you look close, lmb has most of them already.

> >
> >>>> 2. use e820 range with RAM to fill lmb.memory when finizing_e820
> >>>
> >>> What's finizing_e820 ???
> >> finish_e820_parsing();
> >
> > Yinghai, come on. Are you really expecting that everyone involved in
> > this discussion goes to look up what the heck finish_e820_parsing()
> > is doing ?
> >
> > You want to explain why your solution is better or why lmb is not
> > sufficient, so you better go and explain what finish_e820_parsing()
> > is, why finish_e820_parsing() is important and why lmb cannot cope
> > with it.
>
> current x86:
> a. setup e820 array.
> b. early_parm mem= and memmap= related code will adjust the e820.

Dammit. I asked for an explanation not for some headword
listing. These bullet points do _NOT_ explain at all why e820 is
superior.

> we don't need to call lmb_enforce_memory_limit().

Of course you do not need to call lmb_enforce_memory_limit() simply
because it is not relevant to the existing e820 code at all.

What's the point ?

> >
> >>>> 3. use lmb.reserved to replace early_res.
> >>>
> >>> What's the implication of doing that ?
> >>
> >> early_res array is only corresponding to lmb.reserved, aka reserved
> >> region from kernel.
> >
> > Is it only corresponding (somehow) or is it a full equivivalent ?
>
> early_res is not sorted and merged.

So what's the implication for x86 vs. the early_res stuff ? Any down
sides, up sides other than not sorted and merged?

> >>>> current lmb is merging the region, we can not use name tag any more.
> >>>
> >>> What's wrong with merging of regions ? Are you arguing about a
> >>> specific region ("the region") ?
> >
> > Care to answer my question ?
> if range get merged, you can not use name with them.

Why does that matter ?

> >>>
> >>> Which name tag ? And why is that name tag important ?
> >>
> >> struct early_res {
> >> u64 start, end;
> >> char name[15];
> >> char overlap_ok;
> >> };
> >
> > I'm starting to get annoyed, really. What is that name field for and
> > why is that "name" field important ?
>
> at least later when some code free a wrong range, we can figure who cause the problem.

That does not explain the value of the name field at all. If some code
frees a wrong range a backtrace is always more helpful than some
arbitrary name field. Am I missing something ?

> >>>> may need to dump early_memtest, and use early_res for bootmem at
> >>>> first.
> >>>
> >>> Why exactly might early_memtest not longer be possible ?
> >>
> >> early_memtest need to call find_e820_area_size
> >> current lmb doesn't have that kind of find util.
> >> the one memory subtract reserved memory by kernel.
> >
> > What subtracts what ? And why is it that hard to fix that ?
>
> lmb.memory - lmb.reserved
>
> or e820 E820_RAM entries - early_res
>
> move some code from early_res to lmb.c?

Care to explain in clear wording what you need to solve ? "move some
code from early_res to lmb.c?" is definitely not an useful
explanation.

> >>>
> >>> What means "early_res for bootmem" ?
> >>
> >> use early_res to replace bootmem, the CONFIG_NO_BOOTMEM.
> >> that need early_res can be double or increase the slots automatically.
> >
> > -ENOPARSE
> >
> > Yinghai, I asked you to take your time and explain things in detail
> > instead of shooting unparseable answers within a minute.
> >
> > Everyone else in this discussion tries to be as explanatory as
> > possible, just you expect that everyone else is going to dig out the
> > crystal ball to understand the deeper meanings of your patches.
> >
> > Again, please take your time to explain what needs to be done or what
> > is impossible to solve in your opinion, so we can get that resolved in
> > a way which is satisfactory and useful for all parties involved.
>
> to make x86 to use lmb, we need to extend lmb to have find_early_area.

Why ?

> static int __init find_overlapped_early(u64 start, u64 end)
> {

No, posting arbitrary code snippets which you think are necessary to
solve it is not the way to go.

There is _ZERO_ explanation _WHY_ you think that this is a
prerequisite.

Those largely uncommented commented code snippets (uncommented as the
corresponding code in x86) are _NOT_ an explanation at all.

You just state that you need that whole bunch just w/o telling _WHY_.

The more I look into this I doubt that there is an actual reason for
this complexity. It just looks like it has grown that way by fixing
corner cases all over the place and not out of a real design
requirement.

Either that or it's just the lack of understanding how to map lmb
functionality to the problem at hand as certainly LMB does not map 1:1
to the current x86 way of solving that problem.

Please give a proper explanation for this, really !

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Yinghai Lu on
On 03/22/2010 05:45 PM, Thomas Gleixner wrote:
> B1;2005;0cYinghai,
>>
>> to make x86 to use lmb, we need to extend lmb to have find_early_area.
>
> Why ?

unless you want to dump two users of find_e820_area_size()

YH

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Thomas Gleixner on
Yinghai,

On Mon, 22 Mar 2010, Yinghai Lu wrote:

> On 03/22/2010 05:45 PM, Thomas Gleixner wrote:
> > B1;2005;0cYinghai,
> >>
> >> to make x86 to use lmb, we need to extend lmb to have find_early_area.
> >
> > Why ?
>
> unless you want to dump two users of find_e820_area_size()

I don't care about the two users of find_e820_area_size() as long as
you are not willing to spend more than a split second to explain
things.

I'm really fed up to pull answers from your nose bit by bit. Me and
others wrote lengthy explanations and asked precise questions.

All we get are some meager bones thrown our way.

If that's your understanding of community work, fine. It's just not
our way of working together. After spending valuable time on that I
completely agree with Dave on:

"Now I know that you _REALLY_ aren't listening to us."

It's not only that you are not listening, you are simply ignoring our
concerns. Go ahead with that, but do not wonder about us ignoring you
as well.

Thanks,

tglx

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/