From: Michael Kerrisk on
Hi Andi,

On Tue, Dec 8, 2009 at 11:16 PM, Andi Kleen <andi(a)firstfloor.org> wrote:
>
> Process based injection is much easier to handle for test programs,
> who can first bring a page into a specific state and then test.
> So add a new MADV_SOFT_OFFLINE to soft offline a page, similar
> to the existing hard offline injector.

I see that this made its way into 2.6.33. Could you write a short
piece on it for the madvise.2 man page?

Thanks,

Michael


> Signed-off-by: Andi Kleen <ak(a)linux.intel.com>
>
> ---
>  include/asm-generic/mman-common.h |    1 +
>  mm/madvise.c                      |   15 ++++++++++++---
>  2 files changed, 13 insertions(+), 3 deletions(-)
>
> Index: linux/include/asm-generic/mman-common.h
> ===================================================================
> --- linux.orig/include/asm-generic/mman-common.h
> +++ linux/include/asm-generic/mman-common.h
> @@ -35,6 +35,7 @@
>  #define MADV_DONTFORK  10              /* don't inherit across fork */
>  #define MADV_DOFORK    11              /* do inherit across fork */
>  #define MADV_HWPOISON  100             /* poison a page for testing */
> +#define MADV_SOFT_OFFLINE 101          /* soft offline page for testing */
>
>  #define MADV_MERGEABLE   12            /* KSM may merge identical pages */
>  #define MADV_UNMERGEABLE 13            /* KSM may not merge identical pages */
> Index: linux/mm/madvise.c
> ===================================================================
> --- linux.orig/mm/madvise.c
> +++ linux/mm/madvise.c
> @@ -9,6 +9,7 @@
>  #include <linux/pagemap.h>
>  #include <linux/syscalls.h>
>  #include <linux/mempolicy.h>
> +#include <linux/page-isolation.h>
>  #include <linux/hugetlb.h>
>  #include <linux/sched.h>
>  #include <linux/ksm.h>
> @@ -222,7 +223,7 @@ static long madvise_remove(struct vm_are
>  /*
>   * Error injection support for memory error handling.
>   */
> -static int madvise_hwpoison(unsigned long start, unsigned long end)
> +static int madvise_hwpoison(int bhv, unsigned long start, unsigned long end)
>  {
>         int ret = 0;
>
> @@ -233,6 +234,14 @@ static int madvise_hwpoison(unsigned lon
>                 int ret = get_user_pages_fast(start, 1, 0, &p);
>                 if (ret != 1)
>                         return ret;
> +               if (bhv == MADV_SOFT_OFFLINE) {
> +                       printk(KERN_INFO "Soft offlining page %lx at %lx\n",
> +                               page_to_pfn(p), start);
> +                       ret = soft_offline_page(p, MF_COUNT_INCREASED);
> +                       if (ret)
> +                               break;
> +                       continue;
> +               }
>                 printk(KERN_INFO "Injecting memory failure for page %lx at %lx\n",
>                        page_to_pfn(p), start);
>                 /* Ignore return value for now */
> @@ -333,8 +342,8 @@ SYSCALL_DEFINE3(madvise, unsigned long,
>         size_t len;
>
>  #ifdef CONFIG_MEMORY_FAILURE
> -       if (behavior == MADV_HWPOISON)
> -               return madvise_hwpoison(start, start+len_in);
> +       if (behavior == MADV_HWPOISON || behavior == MADV_SOFT_OFFLINE)
> +               return madvise_hwpoison(behavior, start, start+len_in);
>  #endif
>         if (!madvise_behavior_valid(behavior))
>                 return error;
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo(a)kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont(a)kvack.org"> email(a)kvack.org </a>
>



--
Michael Kerrisk Linux man-pages maintainer;
http://www.kernel.org/doc/man-pages/
Author of "The Linux Programming Interface", http://blog.man7.org/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Michael Kerrisk on
Hi Andi,

Thanks for this. Some comments below.

On Sat, Jun 19, 2010 at 3:20 PM, Andi Kleen <andi(a)firstfloor.org> wrote:
> On Sat, Jun 19, 2010 at 02:36:28PM +0200, Michael Kerrisk wrote:
>> Hi Andi,
>>
>> On Tue, Dec 8, 2009 at 11:16 PM, Andi Kleen <andi(a)firstfloor.org> wrote:
>> >
>> > Process based injection is much easier to handle for test programs,
>> > who can first bring a page into a specific state and then test.
>> > So add a new MADV_SOFT_OFFLINE to soft offline a page, similar
>> > to the existing hard offline injector.
>>
>> I see that this made its way into 2.6.33. Could you write a short
>> piece on it for the madvise.2 man page?
>
> Also fixed the previous snippet slightly.

(thanks)

> commit edb43354f0ffc04bf4f23f01261f9ea9f43e0d3d
> Author: Andi Kleen <ak(a)linux.intel.com>
> Date:   Sat Jun 19 15:19:28 2010 +0200
>
>    MADV_SOFT_OFFLINE
>
>    Signed-off-by: Andi Kleen <ak(a)linux.intel.com>
>
> diff --git a/man2/madvise.2 b/man2/madvise.2
> index db29feb..9dccd97 100644
> --- a/man2/madvise.2
> +++ b/man2/madvise.2
> @@ -154,7 +154,15 @@ processes.
>  This operation may result in the calling process receiving a
>  .B SIGBUS
>  and the page being unmapped.
> -This feature is intended for memory testing.
> +This feature is intended for testing of memory error handling code.
> +This feature is only available if the kernel was configured with
> +.BR CONFIG_MEMORY_FAILURE .
> +.TP
> +.BR MADV_SOFT_OFFLINE " (Since Linux 2.6.33)
> +Soft offline a page. This will result in the memory of the page
> +being copied to a new page and original page be offlined. The operation

Can you explain the term "offlined" please.

> +should be transparent to the calling process.

Does "should be transparent" mean "is normally invisible"?

Thanks,

Michael

> +This feature is intended for testing of memory error handling code.
>  This feature is only available if the kernel was configured with
>  .BR CONFIG_MEMORY_FAILURE .
>  .TP
>
>



--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Author of "The Linux Programming Interface" http://blog.man7.org/
From: Michael Kerrisk on
Hi Andi,

On Sat, Jun 19, 2010 at 3:30 PM, Andi Kleen <andi(a)firstfloor.org> wrote:
> On Sat, Jun 19, 2010 at 03:25:16PM +0200, Michael Kerrisk wrote:
>> Hi Andi,
>>
>> Thanks for this. Some comments below.
>>
>> On Sat, Jun 19, 2010 at 3:20 PM, Andi Kleen <andi(a)firstfloor.org> wrote:
>> > On Sat, Jun 19, 2010 at 02:36:28PM +0200, Michael Kerrisk wrote:
>> >> Hi Andi,
>> >>
>> >> On Tue, Dec 8, 2009 at 11:16 PM, Andi Kleen <andi(a)firstfloor.org> wrote:
>> >> >
>> >> > Process based injection is much easier to handle for test programs,
>> >> > who can first bring a page into a specific state and then test.
>> >> > So add a new MADV_SOFT_OFFLINE to soft offline a page, similar
>> >> > to the existing hard offline injector.
>> >>
>> >> I see that this made its way into 2.6.33. Could you write a short
>> >> piece on it for the madvise.2 man page?
>> >
>> > Also fixed the previous snippet slightly.
>>
>> (thanks)
>>
>> > commit edb43354f0ffc04bf4f23f01261f9ea9f43e0d3d
>> > Author: Andi Kleen <ak(a)linux.intel.com>
>> > Date:   Sat Jun 19 15:19:28 2010 +0200
>> >
>> >    MADV_SOFT_OFFLINE
>> >
>> >    Signed-off-by: Andi Kleen <ak(a)linux.intel.com>
>> >
>> > diff --git a/man2/madvise.2 b/man2/madvise.2
>> > index db29feb..9dccd97 100644
>> > --- a/man2/madvise.2
>> > +++ b/man2/madvise.2
>> > @@ -154,7 +154,15 @@ processes.
>> >  This operation may result in the calling process receiving a
>> >  .B SIGBUS
>> >  and the page being unmapped.
>> > -This feature is intended for memory testing.
>> > +This feature is intended for testing of memory error handling code.
>> > +This feature is only available if the kernel was configured with
>> > +.BR CONFIG_MEMORY_FAILURE .
>> > +.TP
>> > +.BR MADV_SOFT_OFFLINE " (Since Linux 2.6.33)
>> > +Soft offline a page. This will result in the memory of the page
>> > +being copied to a new page and original page be offlined. The operation
>>
>> Can you explain the term "offlined" please.
>
> The memory is not used anymore and taken out of normal
> memory management (until unpoisoned)

Is there a userspace operation to unpoison (i.e., reverse MADV_SOFT_OFFLINE)?

I ask because I wondered if there is something additional to be documented.

> and the "HardwareCorrupted:" counter in /proc/meminfo increases
>
> (don't put the latter in, I'm thinking about changing that)

Okay.

>>
>> > +should be transparent to the calling process.
>>
>> Does "should be transparent" mean "is normally invisible"?
>
> Yes. It's similar to being swapped out and swapped in again.

Okay.

Thanks,

Michael


--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Author of "The Linux Programming Interface" http://blog.man7.org/
From: Michael Kerrisk on
Hi Andi,

On Sat, Jun 19, 2010 at 4:09 PM, Andi Kleen <andi(a)firstfloor.org> wrote:
> On Sat, Jun 19, 2010 at 03:43:28PM +0200, Michael Kerrisk wrote:
>> Is there a userspace operation to unpoison (i.e., reverse MADV_SOFT_OFFLINE)?
>
> Yes, but it's only a debugfs interface currently.

Okay -- thanks.

>> I ask because I wondered if there is something additional to be documented.
>
> I don't think debugfs needs manpages atm.

Okay.

I edited your text somewhat. Could you please review the below.

Cheers,

Michael

.TP
.BR MADV_SOFT_OFFLINE " (Since Linux 2.6.33)
Soft offline the pages in the range specified by
.I addr
and
.IR length .
The memory of each page in the specified range is copied to a new page,
and the original page is offlined
(i.e., no longer used, and taken out of normal memory management).
The effect of the
.B MADV_SOFT_OFFLINE
operation is normally invisible to (i.e., does not change the semantics of)
the calling process.
This feature is intended for testing of memory error-handling code;
it is only available if the kernel was configured with
.BR CONFIG_MEMORY_FAILURE .
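
For illustration, a minimal sketch of how a test program might exercise
MADV_SOFT_OFFLINE and check that the page contents survive. The helper name
soft_offline_and_check() is hypothetical. The fallback value 101 is the one
from the asm-generic header in the patch earlier in this thread. The sketch
assumes the call needs CAP_SYS_ADMIN and a kernel built with
CONFIG_MEMORY_FAILURE, so an unprivileged run is expected to report an error
rather than offline anything.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Older libc headers may lack the constant; 101 is the asm-generic value
 * from the patch in this thread (an assumption on other architectures). */
#ifndef MADV_SOFT_OFFLINE
#define MADV_SOFT_OFFLINE 101
#endif

/*
 * Hypothetical helper: map one anonymous page, fill it with a pattern,
 * request a soft offline, then verify the pattern is still readable.
 *
 * Returns 0 if madvise() succeeded and the contents are intact,
 * -errno if mmap() or madvise() failed (contents still intact),
 * and 1 if the contents changed.
 *
 * Caution: when run with sufficient privilege, a successful call takes
 * the underlying physical page out of use.
 */
int soft_offline_and_check(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);
    size_t len = (size_t)page_size;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -errno;

    /* Touch the page so that it is actually backed by physical memory. */
    memset(p, 0xA5, len);

    int result = 0;
    if (madvise(p, len, MADV_SOFT_OFFLINE) != 0)
        result = -errno;    /* e.g. EPERM or EINVAL, see assumptions above */

    /* The operation should be invisible: the data must be unchanged. */
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0xA5) {
            result = 1;
            break;
        }
    }

    munmap(p, len);
    return result;
}
```

A caller would treat 0 as "soft offlined, data preserved", a negative value
as "not attempted or not permitted", and 1 as a failure of the transparency
described in the text above.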
From: Michael Kerrisk on
Hi Andi,
On Sat, Jun 19, 2010 at 9:52 PM, Andi Kleen <andi(a)firstfloor.org> wrote:
>> .TP
>> .BR MADV_SOFT_OFFLINE " (Since Linux 2.6.33)
>> Soft offline the pages in the range specified by
>> .I addr
>> and
>> .IR length .
>> The memory of each page in the specified range is copied to a new page,
>
> Actually there are some cases where it's also dropped if it's a cached page.
>
> Perhaps better would be something more fuzzy like
>
> "the contents are preserved"

The problem to me is that this gets so fuzzy that it's hard to
understand the meaning (I imagine many readers will ask: "What does it
mean that the contents are preserved"?). Would you be able to come up
with a wording that is a little more detailed?

>> and the original page is offlined
>> (i.e., no longer used, and taken out of normal memory management).

Thanks,

Michael


--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Author of "The Linux Programming Interface" http://blog.man7.org/