From: Ferenc Wagner on
Phillip Lougher <phillip.lougher(a)gmail.com> writes:

> On Thu, Mar 18, 2010 at 4:38 PM, Ferenc Wagner <wferi(a)niif.hu> wrote:
>
>> I could only compare apples to oranges before porting the patch to the
>> LZMA variant.  So I refrain from that for a couple of days yet.  But
>> meanwhile I started adding a pluggable backend framework to SquashFS,
>> and would much appreciate some comments about the applicability of this
>> idea.  The patch is (intended to be) a no-op, applies on top of current
>> git (a3d3203e4bb40f253b1541e310dc0f9305be7c84).
>
> This looks promising, making the backend pluggable (like the new
> compressor framework) is far better and cleaner than scattering the
> code full of #ifdef's. Far better than the previous patch :-)

Yeah, the previous patch was only a little bit more than a proof that I
can make SquashFS work on an MTD device. The MTD access part is
probably the only thing to criticize there: maybe it would be better
done in blocks of some particular size, via a different interface.

> +static void *bdev_init(struct squashfs_sb_info *msblk, u64 index,
> size_t length)
> +{
> + struct squashfs_bdev *bdev = msblk->backend_data;
> + struct buffer_head *bh;
> +
> + bh = kcalloc((msblk->block_size >> bdev->devblksize_log2) + 1,
> + sizeof(*bh), GFP_KERNEL);
>
> You should alloc against the larger of msblk->block_size and
> METADATA_SIZE (8 Kbytes). Block_size could be 4 Kbytes only.

Hmm, okay. Though this code is a verbatim copy of that in block.c.
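
To make the sizing suggestion concrete, here is a minimal userspace sketch of the arithmetic. The helper name bh_slots() is hypothetical, the 8 KiB METADATA_SIZE is taken from the comment above, and the reading of the trailing +1 (a read that starts in the middle of a device block) is my assumption, not something confirmed in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Value taken from the review comment: metadata blocks are 8 Kbytes. */
#define METADATA_SIZE 8192u

/*
 * Number of buffer_head slots to allocate for one read, sized against the
 * larger of the data block size and the metadata block size. The extra
 * slot is assumed to cover a read that does not start on a device block
 * boundary.
 */
static size_t bh_slots(size_t block_size, unsigned devblksize_log2)
{
    size_t span = block_size > METADATA_SIZE ? block_size : METADATA_SIZE;

    assert(devblksize_log2 < 8 * sizeof(size_t));
    return (span >> devblksize_log2) + 1;
}
```

With a 4 Kbyte block_size and 1 Kbyte device blocks this yields 9 slots, where the unmodified expression would have given only 5.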

> +static int fill_bdev_super(struct super_block *sb, void *data, int silent)
> +{
> + struct squashfs_sb_info *msblk;
> + struct squashfs_bdev *bdev;
> + int err = squashfs_fill_super2(sb, data, silent, &squashfs_bdev_ops);
> + if (err)
> + return err;
> +
> + bdev = kzalloc(sizeof(*bdev), GFP_KERNEL);
> + if (!bdev)
> + return -ENOMEM;
> +
> + bdev->devblksize = sb_min_blocksize(sb, BLOCK_SIZE);
> + bdev->devblksize_log2 = ffz(~bdev->devblksize);
> +
> + msblk = sb->s_fs_info;
> + msblk->backend_data = bdev;
> + return 0;
> +}
>
> This function looks rather 'back-to-front' to me. I'm assuming that
> squashfs_fill_super2() will be the current fill superblock function?

Yes, with the extra parameter added.

> This function wants to read data off the filesystem through the
> backend, and yet the backend (bdev, mblk->backend_data) hasn't been
> initialised when it's called...

It can't be, because msblk = sb->s_fs_info is allocated by
squashfs_fill_super(). Now it will be passed the ops, so after
allocating msblk it can also fill out the ops. After that it can read,
and squashfs_read_data() will call the init, read and free operations of
the backend. The backend itself has no persistent state between calls
to squashfs_read_data(). Btw. struct super_block has fields named
s_blocksize and s_blocksize_bits, aren't those the same as devblksize
and devblksize_log2 in squashfs_sb_info? (They are being moved into
backend_data by the above.) If yes, shouldn't they be used instead?
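
For reference, a small userspace sketch of what the ffz(~devblksize) expression in the patch computes. first_zero_bit() is a hypothetical stand-in for the kernel's ffz(), and blksize_log2() is a name made up for this illustration. For a power-of-two size the result is simply its base-2 logarithm, i.e. the kind of value a "bits" field would hold; whether the super_block fields are interchangeable with the SquashFS ones remains the open question above.

```c
#include <assert.h>

/* Userspace stand-in for the kernel's ffz(): index of the first zero bit. */
static unsigned first_zero_bit(unsigned long word)
{
    unsigned bit = 0;

    assert(word != ~0UL);   /* no zero bit at all: result undefined */
    while (word & 1UL) {
        word >>= 1;
        bit++;
    }
    return bit;
}

/* log2 of a power-of-two block size, computed the way the patch does it. */
static unsigned blksize_log2(unsigned long blksize)
{
    assert(blksize != 0 && (blksize & (blksize - 1)) == 0);
    return first_zero_bit(~blksize);
}
```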

While we're at it: is it really worth submitting all the buffer heads
at the beginning, instead of submitting them one at a time as needed by
the decompression process and letting the IO scheduler do readahead and
request coalescing as it sees fit? At the very least, that would
require less memory, while possibly not hurting performance too much.

On the other hand, would it be possible to avoid the memory copy of
uncompressed blocks by doing a straight (DMA) transfer from the device
into the page cache?

LZMA support is not in mainline yet, but I saw that unlzma is done in a
single step, which requires block-sized input and output buffers. Is
there any particular reason it's done this way, not chunk-by-chunk as
inflate? This easily costs hundreds of kilobytes of virtual memory,
which isn't negligible on embedded systems.
--
Thanks for your comments,
Feri.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Ferenc Wagner on
Ferenc Wagner <wferi(a)niif.hu> writes:

> Phillip Lougher <phillip.lougher(a)gmail.com> writes:
>
>> On Thu, Mar 18, 2010 at 4:38 PM, Ferenc Wagner <wferi(a)niif.hu> wrote:
>>
>> +static int fill_bdev_super(struct super_block *sb, void *data, int silent)
>> +{
>> + struct squashfs_sb_info *msblk;
>> + struct squashfs_bdev *bdev;
>> + int err = squashfs_fill_super2(sb, data, silent, &squashfs_bdev_ops);
>> + if (err)
>> + return err;
>> +
>> + bdev = kzalloc(sizeof(*bdev), GFP_KERNEL);
>> + if (!bdev)
>> + return -ENOMEM;
>> +
>> + bdev->devblksize = sb_min_blocksize(sb, BLOCK_SIZE);
>> + bdev->devblksize_log2 = ffz(~bdev->devblksize);
>> +
>> + msblk = sb->s_fs_info;
>> + msblk->backend_data = bdev;
>> + return 0;
>> +}
>>
>> This function looks rather 'back-to-front' to me. I'm assuming that
>> squashfs_fill_super2() will be the current fill superblock function?
>
> Yes, with the extra parameter added.
>
>> This function wants to read data off the filesystem through the
>> backend, and yet the backend (bdev, mblk->backend_data) hasn't been
>> initialised when it's called...
>
> It can't be, because msblk = sb->s_fs_info is allocated by
> squashfs_fill_super(). Now it will be passed the ops, so after
> allocating msblk it can also fill out the ops. After that it can read,
> and squashfs_read_data() will call the init, read and free operations of
> the backend.

And here we indeed have a rather fundamental problem. This isn't
specific to the discussed plugin system at all. Even in the current
code, to set msblk->block_size squashfs_fill_super() calls
squashfs_read_table() to read the superblock, which in turn calls
squashfs_read_data(), which uses msblk->block_size to allocate enough
buffer heads, but msblk->block_size just can't be set at this point.
msblk->bytes_used is preset with a dummy value to make the read
possible, but msblk->block_size is not. Fortunately, one buffer head is
allocated each time nevertheless. I wonder what a correct solution
would look like.
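
The arithmetic behind "one buffer head is allocated nevertheless" can be sketched in userspace as follows. struct sb_info_sketch and bh_count() are hypothetical names for illustration; only the sizing expression is copied from the quoted kcalloc() call, and the assumption is that block_size is still zero while the superblock is being read.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct squashfs_sb_info. */
struct sb_info_sketch {
    size_t block_size;          /* assumed 0 until the superblock is parsed */
    unsigned devblksize_log2;
};

/* Same sizing expression as the quoted kcalloc() call. */
static size_t bh_count(const struct sb_info_sketch *msblk)
{
    assert(msblk != NULL);
    return (msblk->block_size >> msblk->devblksize_log2) + 1;
}
```

With block_size still zero the shift contributes nothing and only the trailing +1 remains, so the very first read works by accident rather than by design.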
--
Regards,
Feri.
From: Ferenc Wagner on
Phillip Lougher <phillip.lougher(a)gmail.com> writes:

> A couple of specific comments...
>
> +/* A backend is initialized for each SquashFS block read operation,
> + * making further sequential reads possible from the block.
> + */
> +static void *bdev_init(struct squashfs_sb_info *msblk, u64 index,
> size_t length)
> +{
> + struct squashfs_bdev *bdev = msblk->backend_data;
> + struct buffer_head *bh;
> +
> + bh = kcalloc((msblk->block_size >> bdev->devblksize_log2) + 1,
> + sizeof(*bh), GFP_KERNEL);
>
> You should alloc against the larger of msblk->block_size and
> METADATA_SIZE (8 Kbytes). Block_size could be 4 Kbytes only.

I plugged in a max(). Couldn't that trailing +1 be converted into a +2
like this?

bh = kcalloc((max(msblk->block_size, METADATA_SIZE) + 2) >> bdev->devblksize_log2

> +static int fill_bdev_super(struct super_block *sb, void *data, int silent)
>
> This function looks rather 'back-to-front' to me. I'm assuming that
> squashfs_fill_super2() will be the current fill superblock function?
> This function wants to read data off the filesystem through the
> backend, and yet the backend (bdev, mblk->backend_data) hasn't been
> initialised when it's called...

I solved it by introducing a callback function for adding the backend.
That may be overkill, but it seems to give the most shared code.
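
Roughly, the ordering idea is the following. This is only a userspace sketch under my own naming (sb_sketch, add_backend_fn, fill_super_sketch and add_bdev_sketch are all hypothetical and do not appear in the patch series): the shared fill-super path sets up its state, calls back into the backend to attach its data, and only then reads through it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct squashfs_sb_info. */
struct sb_sketch {
    void *backend_data;     /* must be set before the first read */
    int superblock_read;    /* set once the superblock has been read */
};

/* Hypothetical callback type: attaches backend state to a fresh sb. */
typedef int (*add_backend_fn)(struct sb_sketch *msblk);

/* Stand-in for a read through the backend; fails if none is attached. */
static int read_superblock(struct sb_sketch *msblk)
{
    if (msblk->backend_data == NULL)
        return -1;
    msblk->superblock_read = 1;
    return 0;
}

/* Shared path: reset state, attach the backend, and only then read. */
static int fill_super_sketch(struct sb_sketch *msblk, add_backend_fn add_backend)
{
    int err;

    assert(msblk != NULL && add_backend != NULL);
    msblk->backend_data = NULL;
    msblk->superblock_read = 0;

    err = add_backend(msblk);
    if (err)
        return err;
    return read_superblock(msblk);
}

/* Example hook, analogous in spirit to a block device backend. */
static int add_bdev_sketch(struct sb_sketch *msblk)
{
    static int bdev_state;  /* placeholder for per-backend data */

    msblk->backend_data = &bdev_state;
    return 0;
}
```

A backend-specific entry point would then reduce to calling fill_super_sketch(&sb, add_bdev_sketch), with everything else shared between backends.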

The attached patch series survived some testing here. My only doubt:
the current backend interface necessitates a memory copy from the buffer
heads. This is no problem for mtd and lzma which copy the data anyway,
but makes this code less efficient in the bdev+zlib case.

I've got one more patch, which I forgot to export, to pull out the
common logic from the backend init functions back into squashfs_read_data().
With the bdev backend, that entails reading the first block twice in a
row most of the time. This again could be worked around by extending
the backend interface, but I'm not sure if it's worth it.

How does this look now?
--
Regards,
Feri.

From: Ferenc Wagner on
Ferenc Wagner <wferi(a)niif.hu> writes:

> I've got one more patch, which I forgot to export, to pull out the
> common logic from the backend init functions back into squashfs_read_data().
> With the bdev backend, that entails reading the first block twice in a
> row most of the time. This again could be worked around by extending
> the backend interface, but I'm not sure if it's worth it.

Here it is. I also corrected the name of SQUASHFS_METADATA_SIZE, so it
may as well compile now.
--
Regards,
Feri.
From: Ferenc Wagner on
Now with the patch series, sorry.

--
Regards,
Feri.