From: Xiao Guangrong on


Marcelo Tosatti wrote:
> On Tue, Jul 13, 2010 at 05:45:27PM +0800, Xiao Guangrong wrote:
>> In speculative path, the page is not real write-access, no need mark it
>> dirty, so clear dirty bit in this path and later examine this bit when
>> we release the page
>>
>> Signed-off-by: Xiao Guangrong <xiaoguangrong(a)cn.fujitsu.com>
>> ---
>> arch/x86/kvm/mmu.c | 24 +++++++++++-------------
>> 1 files changed, 11 insertions(+), 13 deletions(-)
>
> Unfortunately all pages that kvm creates translations for are marked
> dirty due to get_user_pages(w=1), except KSM which makes them read-only
> later.

Marcelo, i have looked into get_user_pages() function, but not catch where
to make page dirty, could you point it out for me? :-)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Xiao Guangrong on


Xiao Guangrong wrote:

>
> spte = rmap_next(kvm, rmapp, NULL);
> @@ -1879,9 +1877,9 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
> * whether the guest actually used the pte (in order to detect
> * demand paging).
> */
> - spte = shadow_base_present_pte | shadow_dirty_mask;
> + spte = shadow_base_present_pte;
> if (!speculative)
> - spte |= shadow_accessed_mask;
> + spte |= shadow_accessed_mask | shadow_dirty_mask;

It breaks read-only shadow page, i'll update it... please ignore this version...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Xiao Guangrong on


Marcelo Tosatti wrote:
> On Wed, Jul 14, 2010 at 09:24:22AM +0800, Xiao Guangrong wrote:
>>
>> Marcelo Tosatti wrote:
>>> On Tue, Jul 13, 2010 at 05:45:27PM +0800, Xiao Guangrong wrote:
>>>> In speculative path, the page is not real write-access, no need mark it
>>>> dirty, so clear dirty bit in this path and later examine this bit when
>>>> we release the page
>>>>
>>>> Signed-off-by: Xiao Guangrong <xiaoguangrong(a)cn.fujitsu.com>
>>>> ---
>>>> arch/x86/kvm/mmu.c | 24 +++++++++++-------------
>>>> 1 files changed, 11 insertions(+), 13 deletions(-)
>>> Unfortunately all pages that kvm creates translations for are marked
>>> dirty due to get_user_pages(w=1), except KSM which makes them read-only
>>> later.
>> Marcelo, i have looked into get_user_pages() function, but not catch where
>> to make page dirty, could you point it out for me? :-)
>
> See set_page_dirty call in mm/memory.c::follow_page.

Yeah, you are right, and i want to use another way to do it since track dirty bit
is too heavy, also it's dangerous if we miss to set page dirty.

How about just track access bit for speculative path, we set page both accessed and
dirty(if it's writable) only if the access bit is set?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Avi Kivity on
On 07/14/2010 05:06 PM, Marcelo Tosatti wrote:
>
>> Yeah, you are right, and i want to use another way to do it since track dirty bit
>> is too heavy, also it's dangerous if we miss to set page dirty.
>>
>> How about just track access bit for speculative path, we set page both accessed and
>> dirty(if it's writable) only if the access bit is set?
>>
> A useful thing to do would be to allow read-only mappings, in the fault
> path (Lai sent a few patches in that direction sometime ago but there
> was no follow up).
>
> So in the case of a read-only fault from the guest, you'd inform
> get_user_pages() that read-only access is acceptable (so swapcache pages
> can be mapped, or qemu can mprotect(PROT_READ) guest memory).
>

I'd like get_user_pages_ptes_fast(), that returns the pte along with the
page. It can be used for a couple of purposes:

- on read faults or speculative mappings, ask for read access, but allow
write if the pte is writeable
- stick the page size into free bits (need 2 to support 1G pages) to
speed up host_mapping_level()

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Xiao Guangrong on


Marcelo Tosatti wrote:

>> How about just track access bit for speculative path, we set page both accessed and
>> dirty(if it's writable) only if the access bit is set?
>
> A useful thing to do would be to allow read-only mappings, in the fault
> path (Lai sent a few patches in that direction sometime ago but there
> was no follow up).
>
> So in the case of a read-only fault from the guest, you'd inform
> get_user_pages() that read-only access is acceptable (so swapcache pages
> can be mapped, or qemu can mprotect(PROT_READ) guest memory).
>

Yeah, it's a great work, i guess Lai will post the new version soon.

And, even we do this, i think the page dirty track is still needed, right?
Then, how about my new idea to track page dirty for speculative path, just
as below draft patch does:

@@ -687,10 +687,11 @@ static void drop_spte(struct kvm *kvm, u64 *sptep, u64 new_spte)
if (!is_rmap_spte(old_spte))
return;
pfn = spte_to_pfn(old_spte);
- if (old_spte & shadow_accessed_mask)
+ if (old_spte & shadow_accessed_mask) {
kvm_set_pfn_accessed(pfn);
- if (is_writable_pte(old_spte))
- kvm_set_pfn_dirty(pfn);
+ if (is_writable_pte(old_spte))
+ kvm_set_pfn_dirty(pfn);
+ }
rmap_remove(kvm, sptep);
}

@@ -1920,8 +1921,11 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
* demand paging).
*/
spte = shadow_base_present_pte | shadow_dirty_mask;
- if (!speculative)
+ if (!speculative) {
spte |= shadow_accessed_mask;
+ if (is_writable_pte(*sptep))
+ kvm_set_pfn_dirty(pfn);
+ }
if (!dirty)
pte_access &= ~ACC_WRITE_MASK;
if (pte_access & ACC_EXEC_MASK)

It uses access bit to track both page accessed and page dirty, and it's rather cheap...



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/