File caching [UK Mac]

From: Richard Tobin on 2 Jul 2010 03:42

In article <timstreater-D1D354.21272601072010(a)news.individual.net>,
Tim Streater <timstreater(a)waitrose.com> wrote:

>Am I right in thinking that the OS keeps a cache of files that have been
>read recently?

It's not specifically a cache of files, but of blocks. If part of
a file is in use, that part may be cached when other parts aren't.
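
To make the block-versus-file point concrete, here is a toy sketch, not how any real kernel does it. The 4096-byte block size, the `BlockCache` class, and the `demo` helper are all assumptions made up for illustration. The cache is keyed by (path, block number), so reading a few bytes from the middle of a file caches only the block that holds them.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size, for illustration only


class BlockCache:
    """Toy cache keyed by (path, block number) rather than by whole file."""

    def __init__(self):
        self.blocks = {}

    def read(self, path, offset, length):
        """Return `length` bytes at `offset`, caching only the blocks touched."""
        if length <= 0:
            return b""
        first = offset // BLOCK_SIZE
        last = (offset + length - 1) // BLOCK_SIZE
        chunks = []
        for n in range(first, last + 1):
            key = (path, n)
            if key not in self.blocks:
                # Cache miss: fetch just this one block from the file.
                with open(path, "rb") as f:
                    f.seek(n * BLOCK_SIZE)
                    self.blocks[key] = f.read(BLOCK_SIZE)
            chunks.append(self.blocks[key])
        data = b"".join(chunks)
        start = offset - first * BLOCK_SIZE
        return data[start:start + length]

    def cached_blocks(self, path):
        """List the block numbers of `path` currently held in the cache."""
        return sorted(n for (p, n) in self.blocks if p == path)


def demo():
    # Build a 10-block file, then read 20 bytes from inside block 3.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(bytes(range(256)) * (10 * BLOCK_SIZE // 256))
        path = f.name
    try:
        cache = BlockCache()
        data = cache.read(path, 3 * BLOCK_SIZE + 10, 20)
        return data, cache.cached_blocks(path)
    finally:
        os.remove(path)
```

After `demo()` only block 3 of the ten-block file is in the cache; the other nine were never read.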

>If this is the case, how is this cache managed? I'd like to think that
>if memory gets short it doesn't just get paged out, rather that files
>get dumped from the cache.

Modern operating systems usually have a unified buffer cache where
file blocks and blocks of process memory are managed in a single
cache. I don't know the details of the Mac OS X buffer cache, but
there is no reason for it to absolutely prefer pages corresponding
to data in files over "ordinary memory" pages or vice versa. If
a running program is accessing a file, the OS should keep both
the program and the file in memory, in preference to idle programs
and file blocks not recently accessed.

Older versions of Unix had a separate buffer cache for file blocks,
whose size was determined at boot time as a proportion of the amount of
physical memory available.

-- Richard

From: Richard Tobin on 2 Jul 2010 05:18

In article <timstreater-6748C5.09034502072010(a)news.individual.net>,
Tim Streater <timstreater(a)waitrose.com> wrote:

>So are you saying that it's possible that disk blocks in the cache might
>get paged out (not, according to Jamie)? That would seem perverse to me
>- why not just dump them.

No, they aren't paged to swap space like memory blocks.

Remember that files in the cache include ones being written: if a
block has been modified it has to be written to the file rather than
just dumped. Conversely, program memory blocks can be dumped (rather
than written to swap) if they haven't been modified since they were
read in.

So in fact you can treat file blocks and program memory blocks quite
uniformly. You treat program memory as a cache for the version in
swap space just as the file blocks are a cache for the version in the
file. When you need some memory you find a page that hasn't recently
been accessed; if it's modified you have to write it out - to swap if
it's program memory, to the file if it's a file block. The distinction
between the two is further blurred by the fact that programs can
map files into memory, so a page in program memory may be backed
by a file rather than swap.
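
As a small illustration of a file-backed page, the sketch below uses Python's standard `mmap` module. The helper name `mmap_roundtrip` and the sample contents are made up for the example. A plain memory write into the mapping ends up in the file, because the mapped pages are backed by the file rather than by swap.

```python
import mmap
import os
import tempfile


def mmap_roundtrip():
    """Write through a memory mapping, then read the file back normally."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"hello, page cache\n")
        # Length 0 maps the whole (non-empty) file; these pages are backed
        # by the file itself.
        with mmap.mmap(fd, 0) as m:
            m[0:5] = b"HELLO"  # an ordinary memory write dirties the page
            m.flush()          # ask the OS to write dirty pages to the file
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, 64)
    finally:
        os.close(fd)
        os.remove(path)
```

The explicit `flush()` is only there to make the write-back visible at a known point; left alone, the OS would write the dirty page out in its own time.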

Of course, in practice you don't normally find a modified page and
write it out because you need the memory; you have a background
process writing out modified pages and marking them as clean so they
can just be discarded if needed.
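
The scheme described above can be sketched as a toy model. This is an illustration under assumptions, not the Mac OS X implementation: the `UnifiedCache` and `Page` classes, the strict least-recently-used policy, and the page names are all invented here. Program pages and file blocks share one list, a dirty page is written to its own backing store (swap or the file) before reuse, and a background `flush` pass cleans pages so that later eviction needs no I/O.

```python
from collections import OrderedDict


class Page:
    def __init__(self, backing):
        self.backing = backing  # "swap" or a file name: where a dirty copy goes
        self.dirty = False


class UnifiedCache:
    """Toy unified cache: program pages and file blocks share one LRU list."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # key -> Page, least recently used first
        self.writes = []            # log of (key, backing) write-outs

    def touch(self, key, backing, write=False):
        """Access a page, bringing it in (and evicting another) if needed."""
        if key in self.pages:
            self.pages.move_to_end(key)
        else:
            if len(self.pages) >= self.capacity:
                self._evict()
            self.pages[key] = Page(backing)
        if write:
            self.pages[key].dirty = True

    def _write_out(self, key, page):
        self.writes.append((key, page.backing))
        page.dirty = False

    def _evict(self):
        key, page = self.pages.popitem(last=False)
        if page.dirty:
            self._write_out(key, page)  # slow path: write before reuse

    def flush(self):
        """Background pass: clean dirty pages so later eviction is free."""
        for key, page in self.pages.items():
            if page.dirty:
                self._write_out(key, page)


def demo():
    cache = UnifiedCache(capacity=2)
    cache.touch("heap:0", "swap", write=True)
    cache.touch("notes.txt:0", "notes.txt", write=True)
    cache.flush()                  # both pages written out, now clean
    flushed = list(cache.writes)
    cache.touch("heap:1", "swap")  # evicts heap:0 without another write
    return flushed, list(cache.writes), list(cache.pages)
```

In `demo()` the two dirty pages go to different places, one to swap and one to its file, yet the eviction code never needs to know which kind of page it is handling.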

Furthermore, most modern systems can use files for swap rather than
the traditional dedicated partition. And the (read-only) program code
itself has long been paged in from the executable file rather than a
copy in swap. So paging a program memory page in or out *is* reading
or writing it to a file.

-- Richard

From: Richard Kettlewell on 2 Jul 2010 06:20

Jaimie Vandenbergh <jaimie(a)sometimes.sessile.org> writes:
> Tim Streater <timstreater(a)waitrose.com> wrote:

>> Richard, thanks. It's 25 years since I was looking at any of this stuff,
>> under VMS, and only then from the PoV of wanting to set the memory
>> parameters for an app that was going to run continuously (i.e., months
>> at a time) so it didn't page-fault at all once loaded.
>>
>> Interesting to see what's happened in the meantime.
>
> Not an awful lot... I think the last new bit of cunning in the field
> was memory-mapped files, described above and detailed at eg
> http://en.wikipedia.org/wiki/Memory_mapped_file , but that was
> probably in place last time you checked.

AFAIK they go back at least to Multics.

--
http://www.greenend.org.uk/rjk/
