From: Avi Kivity on
On 07/18/2010 06:23 PM, Gleb Natapov wrote:
> On Sun, Jul 18, 2010 at 06:14:11PM +0300, Avi Kivity wrote:
>
>> On 07/17/2010 07:31 AM, Gleb Natapov wrote:
>>
>>>>> Currently pages allocated for guest memory are required to be RW, so after your series
>>>>> behaviour will remain exactly the same as before.
>>>>>
>>>> Except KSM pages.
>>>>
>>>>
>>> KSM page will be COWed by __get_user_pages_fast(addr, 1, 1, page) in
>>> get_user_page_and_protection() just like it COWed now, no?
>>>
>> Well, we don't want to COW it on write faults.
>>

I meant read faults here.

>> The optimal behaviour is:
>>
>> - write faults: COW and instantiate a writeable spte
>>
> So do we or don't we want to COW on write faults?
>

We do (no choice).
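
To make the policy above concrete, here is a minimal userspace sketch of the decision being discussed. It is not code from the patch: `struct fault_action` and `decide_fault_action()` are hypothetical names, and the read-fault branch assumes the spte is made writable only when the host page is already writable.

```c
#include <stdbool.h>

/* Hypothetical result type, for illustration only; not part of the patch. */
struct fault_action {
	bool break_cow;     /* ask the host for a private, writable copy */
	bool spte_writable; /* instantiate the spte as writable */
};

/*
 * Hypothetical helper modelling the policy in this thread:
 * - write fault: COW if the host mapping is still read-only, then map writable
 * - read fault:  never COW; map writable only if the host page already is
 *                (this second half is an assumption of the sketch)
 */
static struct fault_action decide_fault_action(bool write_fault,
					       bool host_writable)
{
	struct fault_action act;

	if (write_fault) {
		act.break_cow = !host_writable;
		act.spte_writable = true;
	} else {
		act.break_cow = false;
		act.spte_writable = host_writable;
	}
	return act;
}
```

With a KSM page (host mapping read-only), a read fault under this model leaves the shared page in place and yields a read-only spte; only a later write fault pays for the copy.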

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Lai Jiangshan on
On 07/16/2010 03:19 PM, Gleb Natapov wrote:

>> +/* get a current mapped page fast, and test whether the page is writable. */
>> +static struct page *get_user_page_and_protection(unsigned long addr,
>> +	int *writable)
>> +{
>> +	struct page *page[1];
>> +
>> +	if (__get_user_pages_fast(addr, 1, 1, page) == 1) {
>> +		*writable = 1;
>> +		return page[0];
>> +	}
>> +	if (__get_user_pages_fast(addr, 1, 0, page) == 1) {
>> +		*writable = 0;
>> +		return page[0];
>> +	}
>> +	return NULL;
>> +}
>> +
>> +static pfn_t kvm_get_pfn_for_page_fault(struct kvm *kvm, gfn_t gfn,
>> +	int write_fault, int *host_writable)
>> +{
>> +	unsigned long addr;
>> +	struct page *page;
>> +
>> +	if (!write_fault) {
>> +		addr = gfn_to_hva(kvm, gfn);
>> +		if (kvm_is_error_hva(addr)) {
>> +			get_page(bad_page);
>> +			return page_to_pfn(bad_page);
>> +		}
>> +
>> +		page = get_user_page_and_protection(addr, host_writable);
>> +		if (page)
>> +			return page_to_pfn(page);
>> +	}
>> +
>> +	*host_writable = 1;
>> +	return kvm_get_pfn_for_gfn(kvm, gfn);
>> +}
>> +
>> +
> kvm_get_pfn_for_gfn() returns fault_page if page is mapped RO, so caller
> of kvm_get_pfn_for_page_fault() and kvm_get_pfn_for_gfn() will get
> different results when called on the same page. Not good.
> kvm_get_pfn_for_page_fault() logic should be folded into
> kvm_get_pfn_for_gfn().
>


The different results are exactly what we need.
We don't want to copy and write to a page that is mapped RO when
there is only a read fault.
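
The two-step probe in get_user_page_and_protection() can be modelled in userspace as follows. This is only a sketch under stated assumptions: `struct mock_mapping`, `mock_fast_lookup()` and `mock_page_and_protection()` are hypothetical stand-ins, and the mock lookup assumes the fast path never faults and fails a write request on a read-only mapping.

```c
#include <stddef.h>

/* Hypothetical stand-in for a host mapping; not a kernel type. */
struct mock_mapping {
	int present;  /* page is currently mapped in the host */
	int writable; /* host mapping allows writes */
	int page;     /* dummy payload standing in for struct page */
};

/*
 * Models a non-faulting fast lookup: it succeeds only if the page is
 * already mapped and the mapping satisfies the requested access.
 */
static int *mock_fast_lookup(struct mock_mapping *m, int write)
{
	if (!m->present)
		return NULL;
	if (write && !m->writable)
		return NULL;
	return &m->page;
}

/* Try a writable lookup first, then fall back to a read-only one. */
static int *mock_page_and_protection(struct mock_mapping *m, int *writable)
{
	int *page = mock_fast_lookup(m, 1);

	if (page) {
		*writable = 1;
		return page;
	}
	page = mock_fast_lookup(m, 0);
	if (page) {
		*writable = 0;
		return page;
	}
	return NULL; /* not mapped at all; *writable is left untouched */
}
```

In this model a read-only mapping is returned as it is, with `*writable` set to 0 and no copy made, which is the read-fault behaviour argued for above.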
From: Gleb Natapov on
On Thu, Jul 29, 2010 at 10:15:22AM +0800, Lai Jiangshan wrote:
> On 07/16/2010 03:19 PM, Gleb Natapov wrote:
>
> >> +/* get a current mapped page fast, and test whether the page is writable. */
> >> +static struct page *get_user_page_and_protection(unsigned long addr,
> >> +	int *writable)
> >> +{
> >> +	struct page *page[1];
> >> +
> >> +	if (__get_user_pages_fast(addr, 1, 1, page) == 1) {
> >> +		*writable = 1;
> >> +		return page[0];
> >> +	}
> >> +	if (__get_user_pages_fast(addr, 1, 0, page) == 1) {
> >> +		*writable = 0;
> >> +		return page[0];
> >> +	}
> >> +	return NULL;
> >> +}
> >> +
> >> +static pfn_t kvm_get_pfn_for_page_fault(struct kvm *kvm, gfn_t gfn,
> >> +	int write_fault, int *host_writable)
> >> +{
> >> +	unsigned long addr;
> >> +	struct page *page;
> >> +
> >> +	if (!write_fault) {
> >> +		addr = gfn_to_hva(kvm, gfn);
> >> +		if (kvm_is_error_hva(addr)) {
> >> +			get_page(bad_page);
> >> +			return page_to_pfn(bad_page);
> >> +		}
> >> +
> >> +		page = get_user_page_and_protection(addr, host_writable);
> >> +		if (page)
> >> +			return page_to_pfn(page);
> >> +	}
> >> +
> >> +	*host_writable = 1;
> >> +	return kvm_get_pfn_for_gfn(kvm, gfn);
> >> +}
> >> +
> > kvm_get_pfn_for_gfn() returns fault_page if page is mapped RO, so caller
> > of kvm_get_pfn_for_page_fault() and kvm_get_pfn_for_gfn() will get
> > different results when called on the same page. Not good.
> > kvm_get_pfn_for_page_fault() logic should be folded into
> > kvm_get_pfn_for_gfn().
> >
>
>
> The different results are the things we just need.
How so? Users of kvm_get_pfn_for_gfn() will think that the page is invalid
and may report a misconfiguration to userspace, while users of
kvm_get_pfn_for_page_fault() will think that access to the page is OK.
There are not many users of kvm_get_pfn_for_gfn(), and maybe your patch
replaces all of them with kvm_get_pfn_for_page_fault(), but that just
strengthens the point that the two should be merged.

> We don't want to copy and write a page which is mapped RO when
> only read fault.
I don't see how returning inconsistent results helps us achieve that.

BTW, since kvm_get_pfn_for_gfn() will never map an RO page,
get_user_page_and_protection() will never find any RO pages. It looks like
kvm_get_pfn_for_page_fault() is equivalent to kvm_get_pfn_for_gfn(),
since the !write_fault section will at best find a page already mapped RW.
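
One possible shape for the merged function suggested above is a single lookup that takes the fault type and an optional out-parameter. The following is a userspace sketch only, not the proposed kernel code: `struct mock_host_page`, `mock_slow_path_write()`, `mock_lookup_pfn()` and `MOCK_BAD_PFN` are hypothetical, and the model keeps the same pfn after breaking COW, which a real copy would not.

```c
#include <stddef.h>

#define MOCK_BAD_PFN (-1L) /* hypothetical error value for this sketch */

/* Hypothetical host-side page state; not a kernel type. */
struct mock_host_page {
	int present;  /* currently mapped in the host */
	int writable; /* host mapping allows writes */
	long pfn;     /* frame number backing the page */
};

/*
 * Hypothetical slow path: fault the page in writable. A real COW would
 * hand back a different pfn; this model keeps the same one for brevity.
 */
static long mock_slow_path_write(struct mock_host_page *p)
{
	p->present = 1;
	p->writable = 1;
	return p->pfn;
}

/*
 * Single entry point for all callers. Callers that cannot handle a
 * read-only page pass writable == NULL and always get a writable mapping.
 */
static long mock_lookup_pfn(struct mock_host_page *p, int write_fault,
			    int *writable)
{
	long pfn;

	if (!p)
		return MOCK_BAD_PFN;

	/* Read fault, and the caller can cope with RO: use the mapping as is. */
	if (!write_fault && writable && p->present) {
		*writable = p->writable;
		return p->pfn;
	}

	pfn = mock_slow_path_write(p);
	if (writable)
		*writable = 1;
	return pfn;
}
```

Under this model every caller goes through the same function, so two callers can only see different results for the same page if they ask for different things.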

--
Gleb.