From: Minchan Kim on
On Thu, Jul 29, 2010 at 11:47:26AM -0500, Christoph Lameter wrote:
> On Fri, 30 Jul 2010, Minchan Kim wrote:
>
> > The thing is that a valid section can also have an invalid memmap.
>
> Oww... A valid section points to a valid memmap memory block (the page
> structs) but the underlying memory pages may not be present. So you can check
> the (useless) page structs for the page state of the not present pages in
> the memory map. If the granularity of the sparsemem mapping is not
> sufficient for your purpose then you can change the sparsemem config
> (configuration is in arch/<arch>/include/asm/sparsemem.h but does not
> exist for arm).
>
> > It means section 0 is an incompletely filled section.
> > Nonetheless, sparsemem's current pfn_valid checks the pfn only loosely.
> > It checks only that the mem_section is valid, but ARM can free the mem_map
> > over a hole to save memory. So in the above case, the pfn at 0x25000000
> > can pass pfn_valid's check. That's not what we want.
>
> IMHO ARM should not poke holes in the memmap sections. The guarantee of
> the full presence of the section is intentional to avoid having to do
> these checks that you are proposing. The page allocator typically expects
> to be able to check all page structs in one basic allocation unit.
>
> Also pfn_valid then does not have to touch the page struct to perform its
> function as long as we guarantee the presence of the memmap section.

Absolutely right. Many mm people wanted to do that.
But Russell doesn't want it.
Please look at the discussion:

http://www.spinics.net/lists/arm-kernel/msg93026.html

In fact, we didn't settle on an approach at that time.
But I think we can't give up on ARM's use case, even though the sparse
model wasn't designed for such granularity. I also think this approach
can solve the pfn_valid problem of ARM's FLATMEM, which does a binary search.
So I just tried to solve this problem, but Russell has stayed quiet.

Okay. I will wait for others' opinions.
First of all, let's settle on the approach.
Russell, could you give your opinion on this approach, or your own suggestion?

--
Kind regards,
Minchan Kim
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Christoph Lameter on
On Fri, 30 Jul 2010, Minchan Kim wrote:

> But Russell doesn't want it.
> Please, look at the discussion.
>
> http://www.spinics.net/lists/arm-kernel/msg93026.html
>
> In fact, we didn't settle on an approach at that time.
> But I think we can't give up on ARM's use case, even though the sparse
> model wasn't designed for such granularity. I also think this approach

The sparse model goes down to page size memmap granularity. The problem
that you may have is with aligning the maximum allocation unit of the
page allocator with the section size of sparsemem. If you reduce your
maximum allocation units then you can get more granularity.

> can solve the pfn_valid problem of ARM's FLATMEM, which does a binary search.

OMG.

--
From: Russell King - ARM Linux on
On Thu, Jul 29, 2010 at 12:30:23PM -0500, Christoph Lameter wrote:
> On Fri, 30 Jul 2010, Minchan Kim wrote:
>
> > But Russell doesn't want it.
> > Please, look at the discussion.
> >
> > http://www.spinics.net/lists/arm-kernel/msg93026.html
> >
> > In fact, we didn't settle on an approach at that time.
> > But I think we can't give up on ARM's use case, even though the sparse
> > model wasn't designed for such granularity. I also think this approach
>
> The sparse model goes down to page size memmap granularity. The problem
> that you may have is with aligning the maximum allocation unit of the
> page allocator with the section size of sparsemem. If you reduce your
> maximum allocation units then you can get more granularity.

Then why is there no advantage to adding 512kB memory modules to a machine
with memory spaced 64MB apart under sparsemem? The mem_map array for
each sparsemem section is 512kB in size. So the additional 512kB memory
modules give you nothing because they're completely full of mem_map array.
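
To make the 512kB figure concrete, here is a minimal sketch of the
arithmetic. It assumes 4kB pages and a 32-byte struct page; neither value
is stated in this thread, and both constants and the function name are
illustrative only.

```c
#include <assert.h>

/* Assumed values, not taken from the thread. */
#define ASSUMED_PAGE_SIZE        4096UL  /* 4kB pages */
#define ASSUMED_STRUCT_PAGE_SIZE 32UL    /* bytes per struct page */

/* Bytes of mem_map needed to describe one sparsemem section. */
static unsigned long memmap_bytes_per_section(unsigned long section_bytes)
{
    /* A section is a whole number of pages. */
    assert(section_bytes % ASSUMED_PAGE_SIZE == 0);
    return (section_bytes / ASSUMED_PAGE_SIZE) * ASSUMED_STRUCT_PAGE_SIZE;
}
```

Under those assumptions a 64MB section holds 16384 pages, so its mem_map
takes 16384 * 32 bytes = 512kB, which is the whole of a 512kB module.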

_That's_ the kind of problem that makes sparsemem unsuitable for... sparse
memory layouts found in the embedded world.

And that also makes flatmem unsuitable for use on ARM when you have such
memory layouts - four banks of discrete memory spaced at 64MB over a 256MB
range, which can have a size down to 512kB each.

And no, setting the sparse section size to 512kB doesn't work - memory is
offset by 256MB already, so you need a sparsemem section array of 1024
entries just to cover that - with the full 256MB populated, that's 512
unused entries followed by 512 used entries. That too is going to waste
memory like nobody's business.
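
The entry counts above can be checked with a short sketch. It assumes
sections are indexed from physical address 0 and that RAM occupies
0x10000000 to 0x20000000 (a 256MB offset followed by 256MB of memory);
the helper names are hypothetical.

```c
#include <assert.h>

#define SKETCH_SECTION_SHIFT 19UL  /* 2^19 bytes = 512kB per section */

/* Section number of a physical address, counting sections from address 0. */
static unsigned long sketch_section_nr(unsigned long phys_addr)
{
    return phys_addr >> SKETCH_SECTION_SHIFT;
}

/* Entries the section array needs to reach the end of RAM (exclusive). */
static unsigned long sketch_total_entries(unsigned long ram_end)
{
    assert(ram_end > 0);
    return sketch_section_nr(ram_end - 1) + 1;
}

/* Entries below the start of RAM, which describe no memory at all. */
static unsigned long sketch_unused_entries(unsigned long ram_start)
{
    return sketch_section_nr(ram_start);
}
```

With ram_start = 0x10000000 and ram_end = 0x20000000 this gives 1024
entries in total, of which the first 512 are unused.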

Basically, what's come out of this discussion is that the kernel really
_sucks_ when it comes to handling sparse memory layouts found on ARM.

> > can solve the pfn_valid problem of ARM's FLATMEM, which does a binary search.
>
> OMG.

No, it is NOT that expensive. Most people go "omg, binary search on
a cached architecture, that's insane". That statement is so far from
reality that the statement itself is insane.

The binary search operates on a very small amount of data, and results
in two or possibly three cache lines at the most being loaded, assuming
a full 8 banks of memory information passed. Most systems pass one or
maybe two banks - so the _entire_ thing fits within one cache line - a
cache line which will have already been loaded.

So no, this binary search is not as expensive as you think it is.
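
For readers who have not seen this kind of lookup, here is a minimal
sketch of a binary search over a small sorted bank table. It is not the
actual ARM code: struct mem_bank, bank_pfn_valid() and the example table
are hypothetical, and the table assumes 4kB pages with four 512kB banks
spaced 64MB apart starting at 256MB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bank descriptor: half-open pfn range [start_pfn, end_pfn). */
struct mem_bank {
    unsigned long start_pfn;
    unsigned long end_pfn;
};

/* Illustrative layout only: four 512kB banks, 64MB apart, from 256MB. */
static const struct mem_bank example_banks[4] = {
    { 0x10000UL, 0x10080UL },
    { 0x14000UL, 0x14080UL },
    { 0x18000UL, 0x18080UL },
    { 0x1c000UL, 0x1c080UL },
};

/*
 * Return true if pfn lies inside one of the banks.
 * banks[] must be sorted by start_pfn and must not overlap.
 */
static bool bank_pfn_valid(const struct mem_bank *banks, size_t nr_banks,
                           unsigned long pfn)
{
    size_t lo = 0, hi = nr_banks;

    assert(banks != NULL || nr_banks == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (pfn < banks[mid].start_pfn)
            hi = mid;           /* pfn is below this bank */
        else if (pfn >= banks[mid].end_pfn)
            lo = mid + 1;       /* pfn is above this bank */
        else
            return true;        /* pfn falls inside this bank */
    }
    return false;               /* pfn falls in a hole */
}
```

With four banks the loop runs at most three times, and the whole table
here is only eight unsigned longs.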
--
From: Christoph Lameter on
On Thu, 29 Jul 2010, Russell King - ARM Linux wrote:

> And no, setting the sparse section size to 512kB doesn't work - memory is
> offset by 256MB already, so you need a sparsemem section array of 1024
> entries just to cover that - with the full 256MB populated, that's 512
> unused entries followed by 512 used entries. That too is going to waste
> memory like nobody's business.

Doesn't SPARSEMEM_EXTREME handle that?

Some ARMs seem to have MMUs. If so, then use SPARSEMEM_VMEMMAP. You can map
4k pages for the memmap through a page table. Redirect unused 4k blocks to
the NULL page.

--
From: Dave Hansen on
On Thu, 2010-07-29 at 19:33 +0100, Russell King - ARM Linux wrote:
> And no, setting the sparse section size to 512kB doesn't work - memory is
> offset by 256MB already, so you need a sparsemem section array of 1024
> entries just to cover that - with the full 256MB populated, that's 512
> unused entries followed by 512 used entries. That too is going to waste
> memory like nobody's business.

Sparsemem could use some work in the case where memory doesn't start at
0x0. But, it doesn't seem like it would be _too_ oppressive to add.
It's literally just adding an offset to all of the places where a
physical address is stuck into the system. It'll make a few of the
calculations longer, of course, but it should be manageable.
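
A rough sketch of what such an offset could look like, assuming 512kB
sections and RAM starting at 0x10000000 as in the quoted example. This is
not existing sparsemem code; the function name and the ram_base parameter
are hypothetical.

```c
#include <assert.h>

#define OFFSET_SECTION_SHIFT 19UL  /* 2^19 bytes = 512kB per section */

/* Hypothetical: section number counted from the start of RAM, not from 0. */
static unsigned long section_nr_with_offset(unsigned long phys_addr,
                                            unsigned long ram_base)
{
    /* Addresses below the start of RAM have no section in this scheme. */
    assert(phys_addr >= ram_base);
    return (phys_addr - ram_base) >> OFFSET_SECTION_SHIFT;
}
```

With ram_base = 0x10000000 the first byte of RAM lands in section 0 and
the last byte of a fully populated 256MB lands in section 511, so the
array needs 512 entries rather than 1024.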

Could you give some full examples of how the memory is laid out on these
systems? I'm having a bit of a hard time visualizing it.

As Christoph mentioned, SPARSEMEM_EXTREME might be viable here, too.

If you free up parts of the mem_map[] array, how does the buddy
allocator still work? I thought we required 'struct page's to be
contiguous and present for at least 2^(MAX_ORDER-1) pages in one go.
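
To illustrate the concern, here is a minimal sketch of how a buddy
allocator locates a block's buddy by flipping one bit of the pfn. The
function name is hypothetical and the value of MAX_ORDER (11) is an
assumption, not something stated in this thread.

```c
#include <assert.h>

#define ASSUMED_MAX_ORDER 11U  /* assumed; largest block is 2^10 pages */

/* pfn of the buddy of the 2^order-page block that starts at pfn. */
static unsigned long sketch_buddy_pfn(unsigned long pfn, unsigned int order)
{
    assert(order < ASSUMED_MAX_ORDER);
    /* The block must be aligned to its own size. */
    assert((pfn & ((1UL << order) - 1UL)) == 0);
    return pfn ^ (1UL << order);
}
```

When a block is freed, the allocator inspects the struct page of the
buddy computed this way to decide whether the two can merge. A buddy can
sit anywhere inside the same maximum-order aligned range, so a hole in
mem_map within that range would mean reading a struct page that is not
there.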

-- Dave

--