From: Linus Torvalds on


On Fri, 9 Apr 2010, Johannes Weiner wrote:
> + /*
> + * 1. case: vma and next are split parts of one root vma.
> + * Their anon_vma_chain is equal and we can drop that of next.
> + *
> + * 2. case: one vma was instantiated as mergeable with the
> + * other one and inherited the other one's primary anon_vma as
> + * the singleton in its chain.
> + *
> + * If next came after vma, vma's chain is already an unstrict
> + * superset of next's and we can treat it like case 1.
> + *
> + * If vma has the singleton chain, we have to copy next's
> + * unique anon_vmas over.
> + */

This comment makes my head hurt. In fact, the whole anon_vma thing hurts
my head.

Can we have some better high-level documentation on what happens for all
the cases.

- split (mprotect, or munmap in the middle):

anon_vma_clone: the two vma's will have the same anon_vma, and the
anon_vma chains will be equivalent.

- merge (mprotect that creates a mergeable state):

anon_vma_merge: we're supposed to have a anon_vma_chain that is
a superset of the two chains of the merged entries.

- fork:

anon_vma_fork: each new vma will have a _new_ anon_vma as it's
primary one, and will link to the old primary trough the
anon_vma_chain. It's doing this with a anon_vma_clone() followed
by adding an entra entry to the new anon_vma, and setting
vma->anon_vma to the new one.

- create/mmap:

anon_vma_prepare: find a mergeable anon_vma and use that as a
singleton, because the other entries on the anon_vma chain won't
matter, since they cannot be associated with any pages associated
with the newly created vma..

Correct?

Quite frankly, just looking at that, I can't see how we get to your rules.
At least not trivially. Especially with multiple merges, I don't see
how "singleton" is such a special case.

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Rik van Riel on
On 04/09/2010 07:22 PM, Linus Torvalds wrote:
>
>
> On Fri, 9 Apr 2010, Johannes Weiner wrote:
>> + /*
>> + * 1. case: vma and next are split parts of one root vma.
>> + * Their anon_vma_chain is equal and we can drop that of next.
>> + *
>> + * 2. case: one vma was instantiated as mergeable with the
>> + * other one and inherited the other one's primary anon_vma as
>> + * the singleton in its chain.
>> + *
>> + * If next came after vma, vma's chain is already an unstrict
>> + * superset of next's and we can treat it like case 1.
>> + *
>> + * If vma has the singleton chain, we have to copy next's
>> + * unique anon_vmas over.
>> + */
>
> This comment makes my head hurt. In fact, the whole anon_vma thing hurts
> my head.
>
> Can we have some better high-level documentation on what happens for all
> the cases.
>
> - split (mprotect, or munmap in the middle):
>
> anon_vma_clone: the two vma's will have the same anon_vma, and the
> anon_vma chains will be equivalent.
>
> - merge (mprotect that creates a mergeable state):
>
> anon_vma_merge: we're supposed to have a anon_vma_chain that is
> a superset of the two chains of the merged entries.
>
> - fork:
>
> anon_vma_fork: each new vma will have a _new_ anon_vma as it's
> primary one, and will link to the old primary trough the
> anon_vma_chain. It's doing this with a anon_vma_clone() followed
> by adding an entra entry to the new anon_vma, and setting
> vma->anon_vma to the new one.
>
> - create/mmap:
>
> anon_vma_prepare: find a mergeable anon_vma and use that as a
> singleton, because the other entries on the anon_vma chain won't
> matter, since they cannot be associated with any pages associated
> with the newly created vma..
>
> Correct?

This is indeed correct.

> Quite frankly, just looking at that, I can't see how we get to your rules.
> At least not trivially. Especially with multiple merges, I don't see
> how "singleton" is such a special case.

The trick is in the fact that anon_vma_merge is only called
when vma->anon_vma == vma1->anon_vma.

If the top anon_vmas are different, then anon_vma_merge will
not be called.

This means that VMAs which have recently passed through fork
will not be passed to anon_vma_merge, because their top
anon_vmas are different.

That leaves just the split & create cases, which will be
passed to anon_vma_merge when they are merged.

In case of split, they will have identical anon_vma chains.

In case of create + merge, one of the two VMAs will have
the whole anon_vma chain, while the other one has just
the top anon_vma.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Linus Torvalds on


On Fri, 9 Apr 2010, Linus Torvalds wrote:
>
> Can we have some better high-level documentation on what happens for all
> the cases.
>
> - split (mprotect, or munmap in the middle):
>
> anon_vma_clone: the two vma's will have the same anon_vma, and the
> anon_vma chains will be equivalent.
>
> - merge (mprotect that creates a mergeable state):
>
> anon_vma_merge: we're supposed to have a anon_vma_chain that is
> a superset of the two chains of the merged entries.
>
> - fork:
>
> anon_vma_fork: each new vma will have a _new_ anon_vma as it's
> primary one, and will link to the old primary trough the
> anon_vma_chain. It's doing this with a anon_vma_clone() followed
> by adding an entra entry to the new anon_vma, and setting
> vma->anon_vma to the new one.
>
> - create/mmap:
>
> anon_vma_prepare: find a mergeable anon_vma and use that as a
> singleton, because the other entries on the anon_vma chain won't
> matter, since they cannot be associated with any pages associated
> with the newly created vma..
>
> Correct?

Ok, so I don't know if the above is correct, but if it is, let's ignore
the "merge" case as being complex, and look at the other cases.

With fork, the main anon_vma becomes different, so let's ignore that. That
always means that the resulting list is not comparable or compatible, and
we'll never mix them up.

If we make one very _simple_ rule for the create/mmap case, namely that we
only re-use another _singleton_ anon_vma, then split and create case will
look exactly the same. And in particular, we get a very simple and
powerful rule: if the anon_vma matches, then the _list_ will also always
match.

And that, in turn, would make 'merge' trivial too: you really can always
drop the side that goes away. There's never any question about how to
merge the lists, or which to pick, because every single operation that
leaves the anon_vma the same will guarantee that the list will be
identical too.

So now the simple rule is that if the anon_vma is the same, then the list
of associated anon_vma's will always be the same - across all of merge,
split and create.

Isn't that a _much_ simpler model to think about?

So _instead_ of all the patches that have floated about, I would suggest
this simple change to "find_mergeable_anon_vma()" instead..

Oh, and maybe it's the meds talking again. I'm feeling better than
yesterday, but am still a bit lightheaded.

Linus

---
mm/mmap.c | 6 ++++--
1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 75557c6..462a8ca 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -850,7 +850,8 @@ struct anon_vma *find_mergeable_anon_vma(struct vm_area_struct *vma)
vm_flags = vma->vm_flags & ~(VM_READ|VM_WRITE|VM_EXEC);
vm_flags |= near->vm_flags & (VM_READ|VM_WRITE|VM_EXEC);

- if (near->anon_vma && vma->vm_end == near->vm_start &&
+ if (near->anon_vma && list_is_singular(&near->anon_vma_chain) &&
+ vma->vm_end == near->vm_start &&
mpol_equal(vma_policy(vma), vma_policy(near)) &&
can_vma_merge_before(near, vm_flags,
NULL, vma->vm_file, vma->vm_pgoff +
@@ -871,7 +872,8 @@ try_prev:
vm_flags = vma->vm_flags & ~(VM_READ|VM_WRITE|VM_EXEC);
vm_flags |= near->vm_flags & (VM_READ|VM_WRITE|VM_EXEC);

- if (near->anon_vma && near->vm_end == vma->vm_start &&
+ if (near->anon_vma && list_is_singular(&near->anon_vma_chain) &&
+ near->vm_end == vma->vm_start &&
mpol_equal(vma_policy(near), vma_policy(vma)) &&
can_vma_merge_after(near, vm_flags,
NULL, vma->vm_file, vma->vm_pgoff))
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Rik van Riel on
On 04/09/2010 08:03 PM, Linus Torvalds wrote:

> This is why I suggest that we limit the "re-use an existing vma for a new
> case" to the singleton case, which means that now you _never_ have
> differences at all. There's no spreading on splitting. Merging is trivial.

That looks like it should work.

> Now, admittedly, I'm really hopped up on cough medication, so the feeling
> of this solving all the problems in the universe may not be entirely
> accurate. But it feels so _right_.
>
> I hope if feels right when I'm off my meds too.

I am not on any cough meds, and your patch looks right.
OTOH, maybe I should be on some kind of cold meds, because
I haven't been feeling right all week...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Rik van Riel on
On 04/09/2010 07:56 PM, Linus Torvalds wrote:

> So _instead_ of all the patches that have floated about, I would suggest
> this simple change to "find_mergeable_anon_vma()" instead..

Boris, this is your chance to really ruin our week :)

If the bug persists with Linus's patch, we've been fixing
the wrong bug all week long, and you are experiencing
something else...

I'm getting really curious now.

> ---
> mm/mmap.c | 6 ++++--
> 1 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 75557c6..462a8ca 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -850,7 +850,8 @@ struct anon_vma *find_mergeable_anon_vma(struct vm_area_struct *vma)
> vm_flags = vma->vm_flags& ~(VM_READ|VM_WRITE|VM_EXEC);
> vm_flags |= near->vm_flags& (VM_READ|VM_WRITE|VM_EXEC);
>
> - if (near->anon_vma&& vma->vm_end == near->vm_start&&
> + if (near->anon_vma&& list_is_singular(&near->anon_vma_chain)&&
> + vma->vm_end == near->vm_start&&
> mpol_equal(vma_policy(vma), vma_policy(near))&&
> can_vma_merge_before(near, vm_flags,
> NULL, vma->vm_file, vma->vm_pgoff +
> @@ -871,7 +872,8 @@ try_prev:
> vm_flags = vma->vm_flags& ~(VM_READ|VM_WRITE|VM_EXEC);
> vm_flags |= near->vm_flags& (VM_READ|VM_WRITE|VM_EXEC);
>
> - if (near->anon_vma&& near->vm_end == vma->vm_start&&
> + if (near->anon_vma&& list_is_singular(&near->anon_vma_chain)&&
> + near->vm_end == vma->vm_start&&
> mpol_equal(vma_policy(near), vma_policy(vma))&&
> can_vma_merge_after(near, vm_flags,
> NULL, vma->vm_file, vma->vm_pgoff))

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/