KVM: MMU: introduce pte_prefetch_topup_memory_cache()

From: Avi Kivity on 11 Jul 2010 09:10

On 07/06/2010 01:49 PM, Xiao Guangrong wrote:
> Introduce this function to topup prefetch cache
>
>
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 3dcd55d..cda4587 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -89,6 +89,8 @@ module_param(oos_shadow, bool, 0644);
> }
> #endif
>
> +#define PTE_PREFETCH_NUM 16
>

Let's make it 8 to start with... It's frightening enough.

(8 = one cache line in both guest and host)
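
The arithmetic behind that remark can be checked with a short sketch. It assumes 64-byte cache lines and 8-byte page-table entries (as with PAE or 64-bit paging); neither figure is stated in the thread. The names demo_pte_t, ptes_per_cache_line() and prefetch_window_start() are made up for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, common on x86 but not guaranteed. */
#define CACHE_LINE_BYTES 64

/* Assumption: 8-byte entries, as with PAE or 64-bit paging. */
typedef uint64_t demo_pte_t;

/* Number of page-table entries that share one cache line. */
static inline size_t ptes_per_cache_line(void)
{
    return CACHE_LINE_BYTES / sizeof(demo_pte_t);
}

/*
 * Hypothetical helper: index of the first entry in the aligned window
 * that contains idx. The window size must be a power of two.
 */
static inline size_t prefetch_window_start(size_t idx, size_t window)
{
    assert(window != 0 && (window & (window - 1)) == 0);
    return idx & ~(window - 1);
}
```

With these assumptions a window of 8 entries covers exactly one cache line, while a window of 16 covers two.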

> @@ -316,15 +318,16 @@ static void update_spte(u64 *sptep, u64 new_spte)
>          }
>  }
>
> -static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
> -                                  struct kmem_cache *base_cache, int min)
> +static int __mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
> +                                    struct kmem_cache *base_cache, int min,
> +                                    int max, gfp_t flags)
>  {
>          void *obj;
>
>          if (cache->nobjs >= min)
>                  return 0;
> -        while (cache->nobjs < ARRAY_SIZE(cache->objects)) {
> -                obj = kmem_cache_zalloc(base_cache, GFP_KERNEL);
> +        while (cache->nobjs < max) {
> +                obj = kmem_cache_zalloc(base_cache, flags);
>                  if (!obj)
>                          return -ENOMEM;
>                  cache->objects[cache->nobjs++] = obj;
> @@ -332,6 +335,20 @@ static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
>          return 0;
>  }
>
> +static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
> +                                  struct kmem_cache *base_cache, int min)
> +{
> +        return __mmu_topup_memory_cache(cache, base_cache, min,
> +                                        ARRAY_SIZE(cache->objects), GFP_KERNEL);
> +}
> +
> +static int pte_prefetch_topup_memory_cache(struct kvm_vcpu *vcpu)
> +{
> +        return __mmu_topup_memory_cache(&vcpu->arch.mmu_rmap_desc_cache,
> +                                        rmap_desc_cache, PTE_PREFETCH_NUM,
> +                                        PTE_PREFETCH_NUM, GFP_ATOMIC);
> +}
> +
>

Just make the ordinary topup sufficient for prefetch. If we allocate
too much, we don't lose anything, the memory remains for the next time
around.

Note for shadow pages or pte chains you don't need extra pages, since
the prefetch fits in just one shadow page. You only need extra for rmap.
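
A minimal userspace model of the point about over-allocation, assuming a fixed-capacity object cache similar in shape to the one in the quoted diff. It uses calloc() in place of kmem_cache_zalloc(), and struct obj_cache, CACHE_CAPACITY and the cache_*() helpers are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define CACHE_CAPACITY 40   /* hypothetical capacity for this sketch */

struct obj_cache {
    int nobjs;
    void *objects[CACHE_CAPACITY];
};

/* Refill to capacity, but only when fewer than min objects are left. */
static int cache_topup(struct obj_cache *cache, size_t obj_size, int min)
{
    if (cache->nobjs >= min)
        return 0;
    while (cache->nobjs < CACHE_CAPACITY) {
        void *obj = calloc(1, obj_size);

        if (!obj)
            return -ENOMEM;
        cache->objects[cache->nobjs++] = obj;
    }
    return 0;
}

/* Hand out one preallocated object; the caller must have topped up first. */
static void *cache_alloc(struct obj_cache *cache)
{
    assert(cache->nobjs > 0);
    return cache->objects[--cache->nobjs];
}

/* Release whatever is still cached. */
static void cache_free_all(struct obj_cache *cache)
{
    while (cache->nobjs > 0)
        free(cache->objects[--cache->nobjs]);
}
```

In this model, objects left unused after one fault stay in the cache, so a later topup returns immediately for as long as at least min objects remain.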

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Xiao Guangrong on 11 Jul 2010 23:20

Avi Kivity wrote:
> On 07/06/2010 01:49 PM, Xiao Guangrong wrote:
>> Introduce this function to topup prefetch cache
>>
>>
>>
>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>> index 3dcd55d..cda4587 100644
>> --- a/arch/x86/kvm/mmu.c
>> +++ b/arch/x86/kvm/mmu.c
>> @@ -89,6 +89,8 @@ module_param(oos_shadow, bool, 0644);
>> }
>> #endif
>>
>> +#define PTE_PREFETCH_NUM 16
>>
>
> Let's make it 8 to start with... It's frightening enough.
>
> (8 = one cache line in both guest and host)

Umm, before posting this patchset, I ran a draft performance test with
different prefetch distances, and it showed that 16 is the distance that
gives the highest performance.

>
>> @@ -316,15 +318,16 @@ static void update_spte(u64 *sptep, u64 new_spte)
>>          }
>>  }
>>
>> -static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
>> -                                  struct kmem_cache *base_cache, int min)
>> +static int __mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
>> +                                    struct kmem_cache *base_cache, int min,
>> +                                    int max, gfp_t flags)
>>  {
>>          void *obj;
>>
>>          if (cache->nobjs >= min)
>>                  return 0;
>> -        while (cache->nobjs < ARRAY_SIZE(cache->objects)) {
>> -                obj = kmem_cache_zalloc(base_cache, GFP_KERNEL);
>> +        while (cache->nobjs < max) {
>> +                obj = kmem_cache_zalloc(base_cache, flags);
>>                  if (!obj)
>>                          return -ENOMEM;
>>                  cache->objects[cache->nobjs++] = obj;
>> @@ -332,6 +335,20 @@ static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
>>          return 0;
>>  }
>>
>> +static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
>> +                                  struct kmem_cache *base_cache, int min)
>> +{
>> +        return __mmu_topup_memory_cache(cache, base_cache, min,
>> +                                        ARRAY_SIZE(cache->objects), GFP_KERNEL);
>> +}
>> +
>> +static int pte_prefetch_topup_memory_cache(struct kvm_vcpu *vcpu)
>> +{
>> +        return __mmu_topup_memory_cache(&vcpu->arch.mmu_rmap_desc_cache,
>> +                                        rmap_desc_cache, PTE_PREFETCH_NUM,
>> +                                        PTE_PREFETCH_NUM, GFP_ATOMIC);
>> +}
>> +
>>
>
> Just make the ordinary topup sufficient for prefetch. If we allocate
> too much, we don't lose anything, the memory remains for the next time
> around.
>

Umm, but in the worst case we would have to allocate 40 items for rmap, which
is heavy for a GFP_ATOMIC allocation made while holding mmu_lock.

From: Avi Kivity on 12 Jul 2010 08:30

On 07/12/2010 06:05 AM, Xiao Guangrong wrote:
>
> Avi Kivity wrote:
>
>> On 07/06/2010 01:49 PM, Xiao Guangrong wrote:
>>
>>> Introduce this function to topup prefetch cache
>>>
>>>
>>>
>>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>>> index 3dcd55d..cda4587 100644
>>> --- a/arch/x86/kvm/mmu.c
>>> +++ b/arch/x86/kvm/mmu.c
>>> @@ -89,6 +89,8 @@ module_param(oos_shadow, bool, 0644);
>>> }
>>> #endif
>>>
>>> +#define PTE_PREFETCH_NUM 16
>>>
>>>
>> Let's make it 8 to start with... It's frightening enough.
>>
>> (8 = one cache line in both guest and host)
>>
> Umm, before posting this patchset, I ran a draft performance test with
> different prefetch distances, and it showed that 16 is the distance that
> gives the highest performance.
>

What's the difference between 8 and 16?

I'm worried that there are workloads that don't benefit from prefetch,
and we may regress there. So I'd like to limit it, at least at first.

btw, what about dirty logging? will prefetch cause pages to be marked dirty?

We may need to instantiate prefetched pages with spte.d=0 and examine it
when tearing down the spte.
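
One way to picture that suggestion is the userspace sketch below. It is only a model under stated assumptions: the bit positions follow the x86 page-table layout (present = bit 0, writable = bit 1, dirty = bit 6), the dirty log is a single 64-bit bitmap, and make_spte(), guest_write() and drop_spte() are hypothetical names rather than KVM functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SPTE_PRESENT  (1ULL << 0)
#define SPTE_WRITABLE (1ULL << 1)
#define SPTE_DIRTY    (1ULL << 6)   /* D bit position in x86 page-table entries */

#define NR_GFNS 64                  /* hypothetical: one bitmap word of guest frames */

static uint64_t dirty_bitmap;       /* stand-in for the dirty log, one bit per gfn */

/*
 * A demand write fault knows the page is about to be written, so it may
 * set D up front. A speculative (prefetched) mapping starts with D clear.
 */
static uint64_t make_spte(bool speculative)
{
    uint64_t spte = SPTE_PRESENT | SPTE_WRITABLE;

    if (!speculative)
        spte |= SPTE_DIRTY;
    return spte;
}

/* Models the hardware setting D on the first write through the mapping. */
static void guest_write(uint64_t *sptep)
{
    *sptep |= SPTE_DIRTY;
}

/* On teardown, fold the D bit into the dirty log before clearing the spte. */
static void drop_spte(uint64_t *sptep, unsigned int gfn)
{
    assert(gfn < NR_GFNS);
    if (*sptep & SPTE_DIRTY)
        dirty_bitmap |= 1ULL << gfn;
    *sptep = 0;
}
```

In this model a prefetched page that the guest never writes is never reported as dirty, while one that is written shows up in the log when its spte is torn down.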

>>> +static int pte_prefetch_topup_memory_cache(struct kvm_vcpu *vcpu)
>>> +{
>>> +        return __mmu_topup_memory_cache(&vcpu->arch.mmu_rmap_desc_cache,
>>> +                                        rmap_desc_cache, PTE_PREFETCH_NUM,
>>> +                                        PTE_PREFETCH_NUM, GFP_ATOMIC);
>>> +}
>>> +
>>>
>>>
>> Just make the ordinary topup sufficient for prefetch. If we allocate
>> too much, we don't lose anything, the memory remains for the next time
>> around.
>>
>>
> Umm, but in the worst case we would have to allocate 40 items for rmap, which
> is heavy for a GFP_ATOMIC allocation made while holding mmu_lock.
>
>

Why use GFP_ATOMIC at all? Make mmu_topup_memory_caches() always assume
we'll be prefetching.

Why 40? I think all we need is PTE_PREFETCH_NUM rmap entries.
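
The pattern being proposed can be sketched in userspace as follows: reserve enough rmap entries for the fault plus the prefetch before taking the lock, where a sleeping allocation is allowed, and only consume preallocated entries while the lock is held. Everything here is an assumption for illustration: lock_held is a plain flag rather than a real lock, BASE_RMAP_MIN and RMAP_CACHE_SIZE are made-up numbers, and PTE_PREFETCH_NUM uses the value 8 suggested earlier in the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

#define PTE_PREFETCH_NUM 8   /* value suggested in the thread */
#define BASE_RMAP_MIN    1   /* hypothetical: entries needed without prefetch */
#define RMAP_CACHE_SIZE  16  /* hypothetical cache capacity */

struct rmap_cache {
    int nobjs;
    void *objects[RMAP_CACHE_SIZE];
};

static bool lock_held;       /* stand-in for mmu_lock; not a real lock */

/* May block in the kernel analogue, so it must run before the lock is taken. */
static int rmap_topup(struct rmap_cache *c)
{
    const int min = BASE_RMAP_MIN + PTE_PREFETCH_NUM;

    assert(!lock_held);
    if (c->nobjs >= min)
        return 0;
    while (c->nobjs < RMAP_CACHE_SIZE) {
        void *obj = calloc(1, 64);

        if (!obj)
            return -ENOMEM;
        c->objects[c->nobjs++] = obj;
    }
    return 0;
}

/* Under the lock we only pop preallocated entries and never allocate. */
static void *rmap_pop(struct rmap_cache *c)
{
    assert(lock_held && c->nobjs > 0);
    return c->objects[--c->nobjs];
}

/* Returns the number of entries consumed, or a negative errno. */
static int handle_fault(struct rmap_cache *c)
{
    int used = 0;
    int err = rmap_topup(c);

    if (err)
        return err;

    lock_held = true;
    for (int i = 0; i < BASE_RMAP_MIN + PTE_PREFETCH_NUM; i++) {
        free(rmap_pop(c));   /* stands in for linking the entry into an rmap chain */
        used++;
    }
    lock_held = false;
    return used;
}

/* Release whatever is still cached. */
static void rmap_drain(struct rmap_cache *c)
{
    while (c->nobjs > 0)
        free(c->objects[--c->nobjs]);
}
```

Because the reservation covers the worst case for a single fault, nothing in the locked section can fail for lack of memory, and no atomic allocation is needed in this model.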

--
error compiling committee.c: too many arguments to function

From: Xiao Guangrong on 12 Jul 2010 21:30

Avi Kivity wrote:
> On 07/12/2010 06:05 AM, Xiao Guangrong wrote:
>>
>> Avi Kivity wrote:
>>
>>> On 07/06/2010 01:49 PM, Xiao Guangrong wrote:
>>>
>>>> Introduce this function to topup prefetch cache
>>>>
>>>>
>>>>
>>>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>>>> index 3dcd55d..cda4587 100644
>>>> --- a/arch/x86/kvm/mmu.c
>>>> +++ b/arch/x86/kvm/mmu.c
>>>> @@ -89,6 +89,8 @@ module_param(oos_shadow, bool, 0644);
>>>> }
>>>> #endif
>>>>
>>>> +#define PTE_PREFETCH_NUM 16
>>>>
>>>>
>>> Let's make it 8 to start with... It's frightening enough.
>>>
>>> (8 = one cache line in both guest and host)
>>>
>> Umm, before posting this patchset, I ran a draft performance test with
>> different prefetch distances, and it showed that 16 is the distance that
>> gives the highest performance.
>>
>
> What's the difference between 8 and 16?
>
> I'm worried that there are workloads that don't benefit from prefetch,
> and we may regress there. So I'd like to limit it, at least at first.
>

OK

> btw, what about dirty logging? will prefetch cause pages to be marked
> dirty?
>
> We may need to instantiate prefetched pages with spte.d=0 and examine it
> when tearing down the spte.
>

Yeah, all the speculative paths have broken dirty bit tracking, and I guess
that needs more review, so I plan to fix it in a separate patch. I'll post it
after this patchset is merged; would that be acceptable?

>>>> +static int pte_prefetch_topup_memory_cache(struct kvm_vcpu *vcpu)
>>>> +{
>>>> +        return __mmu_topup_memory_cache(&vcpu->arch.mmu_rmap_desc_cache,
>>>> +                                        rmap_desc_cache, PTE_PREFETCH_NUM,
>>>> +                                        PTE_PREFETCH_NUM, GFP_ATOMIC);
>>>> +}
>>>> +
>>>>
>>>>
>>> Just make the ordinary topup sufficient for prefetch. If we allocate
>>> too much, we don't lose anything, the memory remains for the next time
>>> around.
>>>
>>>
>> Umm, but in the worst case we would have to allocate 40 items for rmap, which
>> is heavy for a GFP_ATOMIC allocation made while holding mmu_lock.
>>
>>
>
> Why use GFP_ATOMIC at all? Make mmu_topup_memory_caches() always assume
> we'll be prefetching.
>
> Why 40? I think all we need is PTE_PREFETCH_NUM rmap entries.
>

Oh, I see what you mean now. I'll increase the number of rmap entries in
mmu_topup_memory_caches().

From: Avi Kivity on 13 Jul 2010 00:30

On 07/13/2010 04:16 AM, Xiao Guangrong wrote:
>
>
>> btw, what about dirty logging? will prefetch cause pages to be marked
>> dirty?
>>
>> We may need to instantiate prefetched pages with spte.d=0 and examine it
>> when tearing down the spte.
>>
>>
> Yeah, all the speculative paths have broken dirty bit tracking, and I guess
> that needs more review, so I plan to fix it in a separate patch. I'll post it
> after this patchset is merged; would that be acceptable?
>
>

Regressions? No. Or do you mean the problem already exists? Where?

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
