From: Kukjin Kim on
Minchan Kim wrote:
>
Hi :-)

> Kukjin, Could you test below patch?

Sure.

> I don't have any sparsemem system. Sorry.

No problem...
And in the same test, there was no problem ;-)

It means there is no kernel panic with your patch.

If you need any other test on a sparsemem system, please let me know.

Thanks.

Best regards,
Kgene.
--
Kukjin Kim <kgene.kim(a)samsung.com>, Senior Engineer,
SW Solution Development Team, Samsung Electronics Co., Ltd.
>
> -- CUT DOWN HERE --
>
> Kukjin reported that an oops happens when he changes min_free_kbytes:
> http://www.spinics.net/lists/arm-kernel/msg92894.html
> It is caused by the memory map on sparsemem.
>
> The system has the following memory map.
>      section 0             section 1              section 2
> 0x20000000-0x25000000, 0x40000000-0x50000000, 0x50000000-0x58000000
> SECTION_SIZE_BITS 28(256M)
>
> It means section 0 is an incompletely filled section.
> Nonetheless, the current pfn_valid of sparsemem checks the pfn loosely.
>
> It checks only whether the mem_section is valid.
> So in the above case, the pfn at 0x25000000 can pass pfn_valid's validation check.
> It's not what we want.
>
> The following patch adds a valid pfn range check to pfn_valid of sparsemem.
>
> Signed-off-by: Minchan Kim <minchan.kim(a)gmail.com>
> Reported-by: Kukjin Kim <kgene.kim(a)samsung.com>
>
> P.S)
> It is just an RFC. If we agree on this, I will make the patch against mmotm.
>
> --
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index b4d109e..6c2147a 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -979,6 +979,8 @@ struct mem_section {
> struct page_cgroup *page_cgroup;
> unsigned long pad;
> #endif
> + unsigned long start_pfn;
> + unsigned long end_pfn;
> };
>
> #ifdef CONFIG_SPARSEMEM_EXTREME
> @@ -1039,6 +1041,12 @@ static inline int valid_section(struct mem_section *section)
> return (section && (section->section_mem_map & SECTION_HAS_MEM_MAP));
> }
>
> +static inline int valid_section_pfn(struct mem_section *section, unsigned long pfn)
> +{
> + return ((section && (section->section_mem_map & SECTION_HAS_MEM_MAP)) &&
> + (section->start_pfn <= pfn && pfn < section->end_pfn));
> +}
> +
> static inline int valid_section_nr(unsigned long nr)
> {
> return valid_section(__nr_to_section(nr));
> @@ -1053,7 +1061,7 @@ static inline int pfn_valid(unsigned long pfn)
> {
> if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> return 0;
> - return valid_section(__nr_to_section(pfn_to_section_nr(pfn)));
> + return valid_section_pfn(__nr_to_section(pfn_to_section_nr(pfn)), pfn);
> }
>
> diff --git a/mm/sparse.c b/mm/sparse.c
> index 95ac219..bde9090 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -195,6 +195,8 @@ void __init memory_present(int nid, unsigned long start, unsigned long end)
> if (!ms->section_mem_map)
> ms->section_mem_map = sparse_encode_early_nid(nid) |
> SECTION_MARKED_PRESENT;
> + ms->start_pfn = start;
> + ms->end_pfn = end;
> }
> }
>
>
>
> --
> Kind regards,
> Minchan Kim

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: KAMEZAWA Hiroyuki on
On Tue, 13 Jul 2010 00:53:48 +0900
Minchan Kim <minchan.kim(a)gmail.com> wrote:

> Kukjin, Could you test below patch?
> I don't have any sparsemem system. Sorry.
>
> -- CUT DOWN HERE --
>
> Kukjin reported that an oops happens when he changes min_free_kbytes:
> http://www.spinics.net/lists/arm-kernel/msg92894.html
> It is caused by the memory map on sparsemem.
>
> The system has the following memory map.
>      section 0             section 1              section 2
> 0x20000000-0x25000000, 0x40000000-0x50000000, 0x50000000-0x58000000
> SECTION_SIZE_BITS 28(256M)
>
> It means section 0 is an incompletely filled section.
> Nonetheless, the current pfn_valid of sparsemem checks the pfn loosely.
>
> It checks only whether the mem_section is valid.
> So in the above case, the pfn at 0x25000000 can pass pfn_valid's validation check.
> It's not what we want.
>
> The following patch adds a valid pfn range check to pfn_valid of sparsemem.
>
> Signed-off-by: Minchan Kim <minchan.kim(a)gmail.com>
> Reported-by: Kukjin Kim <kgene.kim(a)samsung.com>
>
> P.S)
> It is just an RFC. If we agree on this, I will make the patch against mmotm.
>
> --
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index b4d109e..6c2147a 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -979,6 +979,8 @@ struct mem_section {
> struct page_cgroup *page_cgroup;
> unsigned long pad;
> #endif
> + unsigned long start_pfn;
> + unsigned long end_pfn;
> };
>

I have 2 concerns.
1. This doubles the size of mem_section. It wastes too much memory and is not good for the cache.
But yes, you can put this under some CONFIG which has a small number of mem_section[].

2. This can't help in a case where a section has multiple small holes.


Then, my proposal for HOLES_IN_MEMMAP sparsemem is below.
==
Some architectures unmap memmap[] for memory holes even with SPARSEMEM.
To handle that, pfn_valid() should check whether the memmap is really there or not.
For that purpose, __get_user() can be used.
This idea is from ia64_pfn_valid().

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu(a)jp.fujitsu.com>
---
include/linux/mmzone.h | 12 ++++++++++++
mm/sparse.c | 17 +++++++++++++++++
2 files changed, 29 insertions(+)

Index: mmotm-2.6.35-0701/include/linux/mmzone.h
===================================================================
--- mmotm-2.6.35-0701.orig/include/linux/mmzone.h
+++ mmotm-2.6.35-0701/include/linux/mmzone.h
@@ -1047,12 +1047,24 @@ static inline struct mem_section *__pfn_
return __nr_to_section(pfn_to_section_nr(pfn));
}

+#ifndef CONFIG_ARCH_HAS_HOLES_IN_MEMMAP
static inline int pfn_valid(unsigned long pfn)
{
if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
return 0;
return valid_section(__nr_to_section(pfn_to_section_nr(pfn)));
}
+#else
+extern int pfn_valid_mapped(unsigned long pfn);
+static inline int pfn_valid(unsigned long pfn)
+{
+ if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
+ return 0;
+ if (!valid_section(__nr_to_section(pfn_to_section_nr(pfn))))
+ return 0;
+ return pfn_valid_mapped(pfn);
+}
+#endif

static inline int pfn_present(unsigned long pfn)
{
Index: mmotm-2.6.35-0701/mm/sparse.c
===================================================================
--- mmotm-2.6.35-0701.orig/mm/sparse.c
+++ mmotm-2.6.35-0701/mm/sparse.c
@@ -799,3 +799,20 @@ void sparse_remove_one_section(struct zo
free_section_usemap(memmap, usemap);
}
#endif
+
+#ifdef CONFIG_ARCH_HAS_HOLES_IN_MEMMAP
+int pfn_valid_mapped(unsigned long pfn)
+{
+ struct page *page = pfn_to_page(pfn);
+ char *lastbyte = (char *)(page+1)-1;
+ char byte;
+
+ if(__get_user(byte, page) != 0)
+ return 0;
+
+ if ((((unsigned long)page) & PAGE_MASK) ==
+ (((unsigned long)lastbyte) & PAGE_MASK))
+ return 1;
+ return (__get_user(byte,lastbyte) == 0);
+}
+#endif





From: Minchan Kim on
On Tue, Jul 13, 2010 at 12:19 PM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu(a)jp.fujitsu.com> wrote:
> I have 2 concerns.
>  1. This doubles the size of mem_section. It wastes too much memory and is not good for the cache.
>    But yes, you can put this under some CONFIG which has a small number of mem_section[].
>

I think memory usage isn't a big deal. But for the cache, we can move the
fields to just after section_mem_map.

>  2. This can't help in a case where a section has multiple small holes.

I agree. But this (not a punched hole, but a section that is not fully filled)
isn't such a case. Still, it would be better to handle both together. :)

>
> Then, my proposal for HOLES_IN_MEMMAP sparsemem is below.
> ==
> Some architectures unmap memmap[] for memory holes even with SPARSEMEM.
> To handle that, pfn_valid() should check whether the memmap is really there or not.
> For that purpose, __get_user() can be used.

Look at free_unused_memmap. We don't unmap the pte of the hole's memmap.
Is __get_user still effective?




--
Kind regards,
Minchan Kim
From: KAMEZAWA Hiroyuki on
On Tue, 13 Jul 2010 13:11:14 +0900
Minchan Kim <minchan.kim(a)gmail.com> wrote:

> On Tue, Jul 13, 2010 at 12:19 PM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu(a)jp.fujitsu.com> wrote:
> > I have 2 concerns.
> >  1. This doubles the size of mem_section. It wastes too much memory and is not good for the cache.
> >    But yes, you can put this under some CONFIG which has a small number of mem_section[].
> >
>
> I think memory usage isn't a big deal. But for the cache, we can move the
> fields to just after section_mem_map.
>
I don't think so. This additional field can eat up the amount of memory you saved
by the unmap.

> >  2. This can't help in a case where a section has multiple small holes.
>
> I agree. But this (not a punched hole, but a section that is not fully filled)
> isn't such a case. Still, it would be better to handle both together. :)
>
> >
> > Then, my proposal for HOLES_IN_MEMMAP sparsemem is below.
> > ==
> > Some architectures unmap memmap[] for memory holes even with SPARSEMEM.
> > To handle that, pfn_valid() should check whether the memmap is really there or not.
> > For that purpose, __get_user() can be used.
>
> Look at free_unused_memmap. We don't unmap the pte of the hole's memmap.
> Is __get_user still effective?
>
__get_user() works with the TLB and the page table, so it tells whether the vaddr is really mapped or not.
If the access would fault (SEGV), __get_user() returns -EFAULT. It works at page granularity.

Thanks,
-Kame

From: Minchan Kim on
On Tue, Jul 13, 2010 at 1:23 PM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu(a)jp.fujitsu.com> wrote:
> On Tue, 13 Jul 2010 13:11:14 +0900
> Minchan Kim <minchan.kim(a)gmail.com> wrote:
>
>> On Tue, Jul 13, 2010 at 12:19 PM, KAMEZAWA Hiroyuki
>> <kamezawa.hiroyu(a)jp.fujitsu.com> wrote:
>> > I have 2 concerns.
>> >  1. This doubles the size of mem_section. It wastes too much memory and is not good for the cache.
>> >    But yes, you can put this under some CONFIG which has a small number of mem_section[].
>> >
>>
>> I think memory usage isn't a big deal. But for the cache, we can move the
>> fields to just after section_mem_map.
>>
> I don't think so. This additional field can eat up the amount of memory you saved
> by the unmap.

Agree.

>
>> >  2. This can't help in a case where a section has multiple small holes.
>>
>> I agree. But this (not a punched hole, but a section that is not fully filled)
>> isn't such a case. Still, it would be better to handle both together. :)
>>
>> >
>> > Then, my proposal for HOLES_IN_MEMMAP sparsemem is below.
>> > ==
>> > Some architectures unmap memmap[] for memory holes even with SPARSEMEM.
>> > To handle that, pfn_valid() should check whether the memmap is really there or not.
>> > For that purpose, __get_user() can be used.
>>
>> Look at free_unused_memmap. We don't unmap the pte of the hole's memmap.
>> Is __get_user still effective?
>>
> __get_user() works with the TLB and the page table, so it tells whether the vaddr is really mapped or not.
> If the access would fault (SEGV), __get_user() returns -EFAULT. It works at page granularity.

I mean the following.
For example, suppose there is a struct page at 0x20000000.

int pfn_valid_mapped(unsigned long pfn)
{
        struct page *page = pfn_to_page(pfn); /* hole page is 0x20000000 */
        char *lastbyte = (char *)(page+1)-1;  /* lastbyte is 0x2000001f */
        char byte;

        /* We pass this test since free_unused_memmap doesn't unmap the pte */
        if (__get_user(byte, page) != 0)
                return 0;
        /*
         * (0x20000000 & PAGE_MASK) == (0x2000001f & PAGE_MASK)
         * So we return 1, which is the wrong result.
         */
        if ((((unsigned long)page) & PAGE_MASK) ==
            (((unsigned long)lastbyte) & PAGE_MASK))
                return 1;
        return (__get_user(byte, lastbyte) == 0);
}

Am I missing something?


--
Kind regards,
Minchan Kim