zcache: page cache compression support [Kernel]


From: Nitin Gupta on 20 Jul 2010 10:00

On 07/20/2010 01:27 AM, Dan Magenheimer wrote:
>> We only keep pages that compress to PAGE_SIZE/2 or less. Compressed
>> chunks are stored using xvmalloc memory allocator which is already
>> being used by zram driver for the same purpose. Zero-filled pages are
>> checked and no memory is allocated for them.
>
> I'm curious about this policy choice. I can see why one
> would want to ensure that the average page is compressed
> to less than PAGE_SIZE/2, and preferably PAGE_SIZE/2
> minus the overhead of the data structures necessary to
> track the page. And I see that this makes no difference
> when the reclamation algorithm is random (as it is for
> now). But once there is some better reclamation logic,
> I'd hope that this compression factor restriction would
> be lifted and replaced with something much higher. IIRC,
> compression is much more expensive than decompression
> so there's no CPU-overhead argument here either,
> correct?
>
>

It's true that we waste CPU cycles on every incompressible page
we encounter, but we still can't keep such pages in RAM: this is
memory the host wanted to reclaim, and we can't help when compression
fails. Compressed caching makes sense only when we keep highly
compressible pages in RAM, regardless of reclaim scheme.

Keeping (nearly) incompressible pages in RAM probably makes sense
for Xen's case where cleancache provider runs *inside* a VM, sending
pages to host. So, if VM is limited to say 512M and host has 64G RAM,
caching guest pages, with or without compression, will help.
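
The per-page policy quoted at the top (keep only pages that compress to
PAGE_SIZE/2 or less, and allocate nothing for zero-filled pages) could be
sketched roughly as below. This is only an illustration, not the code from
the patches: the 4 KiB PAGE_SIZE, the enum names and classify_put() are
assumptions, and the compressor is left out, so the compressed length is
simply passed in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/* Hypothetical outcome names, not taken from the zcache patches. */
enum put_result {
    PUT_ZERO_FILLED,  /* remembered without allocating any memory */
    PUT_STORED,       /* compressed chunk is kept in RAM */
    PUT_REJECTED      /* compressed too poorly, page is dropped */
};

/* Return true if every byte of the page is zero. */
static bool page_is_zero_filled(const unsigned char *page)
{
    for (size_t i = 0; i < PAGE_SIZE; i++)
        if (page[i] != 0)
            return false;
    return true;
}

/*
 * Classify a page given the length its compressed form would have.
 * clen is ignored for zero-filled pages, which need no compression.
 */
static enum put_result classify_put(const unsigned char *page, size_t clen)
{
    assert(page != NULL);

    if (page_is_zero_filled(page))
        return PUT_ZERO_FILLED;
    if (clen <= PAGE_SIZE / 2)
        return PUT_STORED;
    return PUT_REJECTED;
}
```

With these assumptions a page that compresses to 2048 bytes is stored and
one that compresses to 2049 bytes is rejected, which is the hard cut-off
being questioned in this thread.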

Thanks,
Nitin
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Dan Magenheimer on 20 Jul 2010 10:40

> On 07/20/2010 01:27 AM, Dan Magenheimer wrote:
> >> We only keep pages that compress to PAGE_SIZE/2 or less. Compressed
> >> chunks are stored using xvmalloc memory allocator which is already
> >> being used by zram driver for the same purpose. Zero-filled pages are
> >> checked and no memory is allocated for them.
> >
> > I'm curious about this policy choice. I can see why one
> > would want to ensure that the average page is compressed
> > to less than PAGE_SIZE/2, and preferably PAGE_SIZE/2
> > minus the overhead of the data structures necessary to
> > track the page. And I see that this makes no difference
> > when the reclamation algorithm is random (as it is for
> > now). But once there is some better reclamation logic,
> > I'd hope that this compression factor restriction would
> > be lifted and replaced with something much higher. IIRC,
> > compression is much more expensive than decompression
> > so there's no CPU-overhead argument here either,
> > correct?
>
> It's true that we waste CPU cycles on every incompressible page
> we encounter, but we still can't keep such pages in RAM: this is
> memory the host wanted to reclaim, and we can't help when compression
> fails. Compressed caching makes sense only when we keep highly
> compressible pages in RAM, regardless of reclaim scheme.
>
> Keeping (nearly) incompressible pages in RAM probably makes sense
> for Xen's case where cleancache provider runs *inside* a VM, sending
> pages to host. So, if VM is limited to say 512M and host has 64G RAM,
> caching guest pages, with or without compression, will help.

I agree that the use model is a bit different, but PAGE_SIZE/2
still seems like an unnecessarily strict threshold. For
example, saving 3000 clean pages in 2000*PAGE_SIZE of RAM
still seems like a considerable space savings. And as
long as the _average_ is less than some threshold, saving
a few slightly-less-than-ideally-compressible pages doesn't
seem like it would be a problem. For example, IMHO, saving two
pages when one compresses to 2047 bytes and the other compresses
to 2049 bytes seems just as reasonable as saving two pages that
both compress to 2048 bytes.

Maybe the best solution is to make the threshold a sysfs
settable? Or maybe BOTH the single-page threshold and
the average threshold as two different sysfs settables?
E.g. throw away a put page if either it compresses poorly
or adding it to the pool would push the average over.
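
A minimal sketch of that two-threshold check, under stated assumptions:
the struct and function names are hypothetical, the two fields of struct
thresholds merely stand in for the proposed sysfs settables, and PAGE_SIZE
is taken to be 4 KiB. A page is thrown away if it compresses poorly on its
own or if adding it would push the pool average over the limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/* Hypothetical pool bookkeeping; field names are illustrative only. */
struct pool_stats {
    size_t nr_pages;    /* pages currently stored */
    size_t total_clen;  /* sum of their compressed lengths, in bytes */
};

/* Hypothetical tunables standing in for the two proposed sysfs settings. */
struct thresholds {
    size_t max_single;   /* largest compressed size accepted for one page */
    size_t max_average;  /* largest allowed mean compressed size in the pool */
};

/* Return true and account for the page if both thresholds allow it. */
static bool accept_page(struct pool_stats *pool, const struct thresholds *t,
                        size_t clen)
{
    assert(t->max_single <= PAGE_SIZE && t->max_average <= PAGE_SIZE);

    /* Single-page check: the page itself compressed too poorly. */
    if (clen > t->max_single)
        return false;

    /*
     * Average check: (total + clen) / (n + 1) > max_average,
     * rewritten as a multiplication to avoid integer division.
     */
    if (pool->total_clen + clen > t->max_average * (pool->nr_pages + 1))
        return false;

    pool->nr_pages++;
    pool->total_clen += clen;
    return true;
}
```

With max_single = 2560 and max_average = 2048, a 2047-byte page followed
by a 2049-byte page are both kept, since their average is exactly 2048.
Note that in this sketch the result depends on arrival order: the
2049-byte page would be rejected if it were put into an empty pool first.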

From: Nitin Gupta on 21 Jul 2010 00:30

On 07/20/2010 07:58 PM, Dan Magenheimer wrote:
>> On 07/20/2010 01:27 AM, Dan Magenheimer wrote:
>>>> We only keep pages that compress to PAGE_SIZE/2 or less. Compressed
>>>> chunks are stored using xvmalloc memory allocator which is already
>>>> being used by zram driver for the same purpose. Zero-filled pages are
>>>> checked and no memory is allocated for them.
>>>
>>> I'm curious about this policy choice. I can see why one
>>> would want to ensure that the average page is compressed
>>> to less than PAGE_SIZE/2, and preferably PAGE_SIZE/2
>>> minus the overhead of the data structures necessary to
>>> track the page. And I see that this makes no difference
>>> when the reclamation algorithm is random (as it is for
>>> now). But once there is some better reclamation logic,
>>> I'd hope that this compression factor restriction would
>>> be lifted and replaced with something much higher. IIRC,
>>> compression is much more expensive than decompression
>>> so there's no CPU-overhead argument here either,
>>> correct?
>>
>> It's true that we waste CPU cycles on every incompressible page
>> we encounter, but we still can't keep such pages in RAM: this is
>> memory the host wanted to reclaim, and we can't help when compression
>> fails. Compressed caching makes sense only when we keep highly
>> compressible pages in RAM, regardless of reclaim scheme.
>>
>> Keeping (nearly) incompressible pages in RAM probably makes sense
>> for Xen's case where cleancache provider runs *inside* a VM, sending
>> pages to host. So, if VM is limited to say 512M and host has 64G RAM,
>> caching guest pages, with or without compression, will help.
>
> I agree that the use model is a bit different, but PAGE_SIZE/2
> still seems like an unnecessarily strict threshold. For
> example, saving 3000 clean pages in 2000*PAGE_SIZE of RAM
> still seems like a considerable space savings. And as
> long as the _average_ is less than some threshold, saving
> a few slightly-less-than-ideally-compressible pages doesn't
> seem like it would be a problem. For example, IMHO, saving two
> pages when one compresses to 2047 bytes and the other compresses
> to 2049 bytes seems just as reasonable as saving two pages that
> both compress to 2048 bytes.
>
> Maybe the best solution is to make the threshold a sysfs
> settable? Or maybe BOTH the single-page threshold and
> the average threshold as two different sysfs settables?
> E.g. throw away a put page if either it compresses poorly
> or adding it to the pool would push the average over.
>

Considering overall compression average instead of bothering about
individual page compressibility seems like a good point. Still, I think
storing completely incompressible pages isn't desirable.

So, I agree with the idea of separate sysfs tunables for average and single-page
compression thresholds with defaults conservatively set to 50% and PAGE_SIZE/2
respectively. I will include these in "v2" patches.

Thanks,
Nitin


From: Dan Magenheimer on 21 Jul 2010 13:40

> > Maybe the best solution is to make the threshold a sysfs
> > settable? Or maybe BOTH the single-page threshold and
> > the average threshold as two different sysfs settables?
> > E.g. throw away a put page if either it compresses poorly
> > or adding it to the pool would push the average over.
>
> Considering overall compression average instead of bothering about
> individual page compressibility seems like a good point. Still, I think
> storing completely incompressible pages isn't desirable.
>
> So, I agree with the idea of separate sysfs tunables for average and
> single-page compression thresholds with defaults conservatively set to
> 50% and PAGE_SIZE/2 respectively. I will include these in "v2" patches.

Unless the single-page compression threshold is higher than the
average, the average is useless. IMHO I'd suggest at least
5*PAGE_SIZE/8 as the single-page threshold, possibly higher.
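
To spell out the reasoning: if every stored page already passed the
single-page check, the pool average can never exceed the single-page
threshold, so an average threshold at or above it never rejects anything.
A small sketch of that observation, assuming 4 KiB pages and reading the
proposed "50%" average default as PAGE_SIZE/2 (the function names are
made up for illustration):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/*
 * Worst-case mean compressed size of the pool: every stored page
 * sits exactly at the single-page threshold.
 */
static size_t worst_case_average(size_t max_single)
{
    return max_single;
}

/* The average check can only ever reject a page if this returns true. */
static bool average_check_can_reject(size_t max_single, size_t max_average)
{
    assert(max_single <= PAGE_SIZE && max_average <= PAGE_SIZE);
    return worst_case_average(max_single) > max_average;
}
```

With both thresholds at PAGE_SIZE/2 (2048 bytes) the function returns
false, i.e. the average tunable has no effect. With the single-page
threshold at 5*PAGE_SIZE/8 (2560 bytes) it returns true.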

From: Greg KH on 22 Jul 2010 15:20

On Fri, Jul 16, 2010 at 06:07:42PM +0530, Nitin Gupta wrote:
> Frequently accessed filesystem data is stored in memory to reduce access to
> (much) slower backing disks. Under memory pressure, these pages are freed and
> when needed again, they have to be read from disks again. When the combined
> working set of all running applications exceeds the amount of physical RAM,
> we get extreme slowdown, as reading a page from disk can take time on the
> order of milliseconds.

<snip>

Given that there were a lot of comments and changes for this series, can
you resend them with your updates so I can then apply them if they are
acceptable to everyone?

thanks,

greg k-h
