From: Chad.Public on
I have a data collection system running on Solaris 9 which is
constantly writing data into numerous files, hundreds of thousands of
files. Each of these files contains a day of data. The current "day
file" has data appended to it throughout the day (something like a
512-byte write every 15 seconds might be average), and other, older
files are normally, but not certainly, idle. There is a fixed
structure but no fixed file names; new data streams can arrive at any
time, creating new directories and files. There are a handful of
different software packages doing the writing of the data and the
overall design is not easily changed.

To export this data with as little latency as possible I have written
software to recursively scan a specified directory tree and monitor the
modification dates and sizes of each file found. Given how the data is
written, the scanning software has the concept of three file types:
active, idle and quiet. Active files have been modified recently and
are scanned every pass; idle files were last modified hours ago and
are scanned every X passes; quiet files were last modified days ago
and are never scanned after the initial discovery.
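
To make the scheme concrete, here is a rough sketch of the tiering in
Python. It is not my actual scanner. The thresholds (IDLE_AFTER,
QUIET_AFTER), the value of X (IDLE_EVERY), the function names and the
choice to forget files that can no longer be stat'ed are all made up
for illustration.

```python
import os
import time

ACTIVE, IDLE, QUIET = "active", "idle", "quiet"

# Assumed values: the description above only says "hours" and "days".
IDLE_AFTER = 2 * 3600     # seconds without modification before a file is idle
QUIET_AFTER = 2 * 86400   # seconds without modification before a file is quiet
IDLE_EVERY = 10           # the "X" in "scanned every X passes"


def classify(mtime, now):
    """Pick a tier from the age of the last modification."""
    age = now - mtime
    if age >= QUIET_AFTER:
        return QUIET
    if age >= IDLE_AFTER:
        return IDLE
    return ACTIVE


def due(tier, pass_no):
    """Decide whether a file in this tier is stat'ed on this pass."""
    if tier == ACTIVE:
        return True
    if tier == IDLE:
        return pass_no % IDLE_EVERY == 0
    return False  # quiet files are never rescanned


def scan_pass(files, pass_no, now=None, stat=os.stat):
    """Run one pass over files, a dict of path -> (tier, mtime, size).

    Returns the paths whose mtime or size changed, and updates the
    stored tier, mtime and size of every file that was stat'ed.
    """
    now = time.time() if now is None else now
    changed = []
    for path, (tier, mtime, size) in list(files.items()):
        if not due(tier, pass_no):
            continue
        try:
            st = stat(path)
        except OSError:
            # Assumption: a file that can no longer be stat'ed is dropped.
            del files[path]
            continue
        if st.st_mtime != mtime or st.st_size != size:
            changed.append(path)
        files[path] = (classify(st.st_mtime, now), st.st_mtime, st.st_size)
    return changed
```

Only files for which due() is true cost a stat(2) call on a given
pass, which is where the saving comes from. The catch is that a quiet
file that does get written again is never noticed.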

The problem is that this scanning can be pretty resource intensive and
is not all that fast with (nearing) a million files. I've optimized
about as far as I am able via the different file types (active, idle,
quiet), keeping the file list in a binary tree, etc. The calls to
stat(2) are what now take all the time.

What I'd like, of course, is to use some service that will tell me when
any of those files change so I don't have to actively stat all the
files. Something like what inotify does, but Solaris doesn't seem to
have it and I don't think Linux's inotify was really designed for
monitoring so many files. There is some effort to add file changes to
the event framework for OpenSolaris, but that's probably a ways off
and, like inotify, probably will not be designed for what I need.

I've thought of other ideas like creating an "opaque" file system that
I could mount over any file system I want and have it notify me of
changes via a device or something. That would be ideal.

I would appreciate any thoughts on the feasibility of such a
development and how to get started, or any other hare-brained ideas
that might lead me in a new direction.

cheers,
Chad