From: Rik van Riel on
On 04/06/2010 06:09 AM, KOSAKI Motohiro wrote:
>> (b) is also impossible. SLAB_DESTROY_BY_RCU delay the page for anon_vma
>> freeing until next rcu period. It mean rcu_read_lock()+page_mapped()
>> can see kfree()ed page. but it is safe. noone corrupt it.
>
> by the way: I haven't understand why rik's per process anon_vma concept
> works correctly with ksm. ksm increase anon_vma->ksm_refcount. but it seems
> not guranteed vma->anon_vma and page->anon_vma are the same.

KSM removes the page from its original anon_vma.

If the page gets reinstantiated (copy on write), it will be
created in the vma->anon_vma.

Am I overlooking something?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Rik van Riel on
On 04/06/2010 04:53 AM, KOSAKI Motohiro wrote:

> Today, I've reviewed this patch carefully. but I haven't found any bug.

> Also, I've runned stress workload with shrink_all_memory() today. but
> I couldn't reproduce the issue. hmm.. (perhaps I'm no lucky guy.
> I'm frequently fail to reproduce)
>
> I'll continue to work.

My status with this bug is the same - I have gone through
the code from all angles, but have not found any other bugs
yet (except for that leak - which could leave invalid
pointers behind).

This makes me wonder if perhaps the bug is a side effect
of something Borislav (and the other reproducers) have
in their kernel configuration, which we do not have.

Another (unlikely) thing is that the fix for the leak
makes the bug go away. Yes, very unlikely.

Borislav, could you please send us your .config ?

Also, if you have the time, could you try out the
patch (-v2) I mailed in a little up this thread
that fixes the memory leak in anon_vma_fork?

I suspect it should not change anything, but it
could be useful to rule out anyway.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Rik van Riel on
On 04/06/2010 11:34 AM, Minchan Kim wrote:

> Let's see the unlink_anon_vmas.
>
> 1. list_for_each_entry_safe(avc,next, vma->anon_vma_chain, same_vma)
> 2. anon_vma_unlink
> 3. spin_lock(anon_vma->lock)<-- HERE LOCK.
> 4. list_del(anon_vma_chain->same_anon_vma);
>
> What if anon_vma is destroyed and reuse by SLAB_XXX_RCU for another
> anon_vma object between 2 and 3?
> I mean how to make sure 3) does lock valid anon_vma?
>
> I hope it is culprit.

How can the anon_vma get destroyed and reused, when this
anon_vma_chain still has a reference to it (and the
anon_vma has not been freed yet)?

What combination of circumstances is necessary for
your bug hypothetical to happen?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Rik van Riel on
On 04/06/2010 12:53 PM, Linus Torvalds wrote:
> On Wed, 7 Apr 2010, Minchan Kim wrote:
>>
>> unmap_and_move
>> remove_migration_ptes
>> rmap_walk
>> rmap_walk_anon
>>
>> We always has rcu_read_lock about anon page in unmap_and_move.
>> So I think it's not buggy. What am I missing?
>
> Ok, in that case it's fine.
>
> However, it does bring back my comment about all those anonvma changes:
> the locking is totally undocumented.
>
> Why isn't there a thing _saying_ that it's ok because of this?
>
> Why is there no comment about the locking of that 'same_vma' /
> 'vma->anon_vma_chain' except for the totally nonsensical one about
> page_table_lock (which doesn't protect _any_ of the other cases)?

Which other cases? When do we ever walk the "same_vma" list
not from the context of the process owning the vma?

This bug in page_referenced is walking the "same_anon_vma" list,
which is locked with the anon_vma->lock.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Rik van Riel on
On 04/06/2010 05:27 PM, Linus Torvalds wrote:

> I still don't see _how_ it happens, though. That 'struct anon_vma' is very
> simple, and contains literally just the lock and that list_head.

It gets more fun. It looks like the anon_vma is only
allocated through anon_vma_alloc() and only handled
by the functions in rmap.c

By themselves, all of those functions look alright.

However, I think I may have found a possible bug in
the interplay between anon_vma_prepare() and vma_adjust(),
across several mprotect invocations.

Let me explain what I think may be going on in small
steps, since it is quite subtle (assuming I am right).

1) a process forks, creating a second "layer" of
anon_vma objects for the VMAs that have anon pages

2) a new VMA is created adjacant to an existing one,
with different permissions

3) anon_vma_prepare is called on the new VMA, this
only links the "top" anon_vma to the new VMA, since
that is the anon_vma where all new pages get
instantiated anyway (this would be part of the bug)

4) mprotect changes the permission of one of the VMAs,
causing the old and the new VMAs to get merged

5) vma_adjust calls anon_vma_merge, causing the anon_vma
chain of one of the VMAs to get nuked - with bad luck,
this is the original one, leaving just the new anon_vma
attached to the VMA

6) if the parent process quits, the old anon_vma structs
get freed

7) meanwhile, we may still have some anonymous pages
stick around in memory that have their page->mapping
point to a freed anon_vma struct

Does this look like it could happen?

If so, I'll cook up a patch to change anon_vma_prepare
and find_mergeable_anon_vma to attach the whole chain
of anon_vmas to the new VMA, using anon_vma_clone().
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/