ocfs2: When zero extending, do it by page. [Kernel]


From: Tao Ma on 7 Jul 2010 11:30

Hi Joel,
Joel Becker wrote:
> ocfs2_zero_extend() does its zeroing block by block, but it calls a
> function named ocfs2_write_zero_page(). Let's have
> ocfs2_write_zero_page() handle the page level. From
> ocfs2_zero_extend()'s perspective, it is now page-at-a-time.
>
> Signed-off-by: Joel Becker <joel.becker(a)oracle.com>
> ---
> fs/ocfs2/aops.c | 30 --------------
> fs/ocfs2/file.c | 119 +++++++++++++++++++++++++++++++++++++++----------------
> 2 files changed, 85 insertions(+), 64 deletions(-)
>
> diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
> index 3623ca2..9a5c931 100644
> --- a/fs/ocfs2/aops.c
> +++ b/fs/ocfs2/aops.c
> @@ -459,36 +459,6 @@ int walk_page_buffers( handle_t *handle,
> return ret;
> }
>
> -handle_t *ocfs2_start_walk_page_trans(struct inode *inode,
> - struct page *page,
> - unsigned from,
> - unsigned to)
> -{
> - struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
> - handle_t *handle;
> - int ret = 0;
> -
> - handle = ocfs2_start_trans(osb, OCFS2_INODE_UPDATE_CREDITS);
> - if (IS_ERR(handle)) {
> - ret = -ENOMEM;
> - mlog_errno(ret);
> - goto out;
> - }
> -
> - if (ocfs2_should_order_data(inode)) {
> - ret = ocfs2_jbd2_file_inode(handle, inode);
> - if (ret < 0)
> - mlog_errno(ret);
> - }
> -out:
> - if (ret) {
> - if (!IS_ERR(handle))
> - ocfs2_commit_trans(osb, handle);
> - handle = ERR_PTR(ret);
> - }
> - return handle;
> -}
> -
> static sector_t ocfs2_bmap(struct address_space *mapping, sector_t block)
> {
> sector_t status;
> diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
> index 6a13ea6..a6e0eb6 100644
> --- a/fs/ocfs2/file.c
> +++ b/fs/ocfs2/file.c
> @@ -724,28 +724,55 @@ leave:
> return status;
> }
>
> +/*
> + * While a write will already be ordering the data, a truncate will not.
> + * Thus, we need to explicitly order the zeroed pages.
> + */
> +static handle_t *ocfs2_zero_start_ordered_transaction(struct inode *inode)
> +{
> + struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
> + handle_t *handle = NULL;
> + int ret = 0;
> +
> + if (ocfs2_should_order_data(inode))
>
This should be if (!ocfs2_should_order_data(inode)) I guess? ;)
> + goto out;
> +
> + handle = ocfs2_start_trans(osb, OCFS2_INODE_UPDATE_CREDITS);
> + if (IS_ERR(handle)) {
> + ret = -ENOMEM;
> + mlog_errno(ret);
> + goto out;
> + }
> +
>
Regards,
Tao
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Joel Becker on 7 Jul 2010 16:10

On Wed, Jul 07, 2010 at 11:19:27PM +0800, Tao Ma wrote:
> >+static handle_t *ocfs2_zero_start_ordered_transaction(struct inode *inode)
> >+{
> >+ struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
> >+ handle_t *handle = NULL;
> >+ int ret = 0;
> >+
> >+ if (ocfs2_should_order_data(inode))
> This should be if (!ocfs2_should_order_data(inode)) I guess? ;)

Of course it should. Fixed ;-)
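
For anyone following along, here is a minimal user-space sketch of the
corrected guard. Only the polarity of the test comes from the thread.
struct handle and fake_handle are hypothetical stand-ins for handle_t and
the result of ocfs2_start_trans(), and the sketch assumes the out: label
simply returns the still-NULL handle, since the rest of the function is
not quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the journal's handle_t. */
struct handle { int unused; };

static struct handle fake_handle;

/*
 * Models the corrected guard: only start a transaction when the inode
 * wants ordered data. should_order_data stands in for the result of
 * ocfs2_should_order_data(inode).
 */
static struct handle *zero_start_ordered_transaction(bool should_order_data)
{
	struct handle *handle = NULL;

	if (!should_order_data)
		goto out;	/* corrected test: nothing to order, no handle */

	handle = &fake_handle;	/* stand-in for ocfs2_start_trans() */
out:
	/* A handle exists exactly when data ordering was requested. */
	assert((handle != NULL) == should_order_data);
	return handle;
}
```

With the test inverted, as in the original patch, the ordered case would
skip the transaction entirely, which is the opposite of what the comment
above the function asks for.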

Joel

--

"Too much walking shoes worn thin.
Too much trippin' and my soul's worn thin.
Time to catch a ride it leaves today
Her name is what it means.
Too much walking shoes worn thin."

Joel Becker
Consulting Software Developer
Oracle
E-mail: joel.becker(a)oracle.com
Phone: (650) 506-8127

From: Tao Ma on 7 Jul 2010 23:50

Hi Joel,

On 07/07/2010 07:16 PM, Joel Becker wrote:
> ocfs2_zero_extend() does its zeroing block by block, but it calls a
> function named ocfs2_write_zero_page(). Let's have
> ocfs2_write_zero_page() handle the page level. From
> ocfs2_zero_extend()'s perspective, it is now page-at-a-time.
>
> Signed-off-by: Joel Becker <joel.becker(a)oracle.com>
> ---
> fs/ocfs2/aops.c | 30 --------------
> fs/ocfs2/file.c | 119 +++++++++++++++++++++++++++++++++++++++----------------
> 2 files changed, 85 insertions(+), 64 deletions(-)
>
<snip>
> -static int ocfs2_write_zero_page(struct inode *inode,
> - u64 size)
> +static int ocfs2_write_zero_page(struct inode *inode, u64 abs_from,
> + u64 abs_to)
> {
> struct address_space *mapping = inode->i_mapping;
> struct page *page;
> - unsigned long index;
> - unsigned int offset;
> + unsigned long index = abs_from >> PAGE_CACHE_SHIFT;
> handle_t *handle = NULL;
> int ret;
> + unsigned zero_from, zero_to, block_start, block_end;
>
> - offset = (size & (PAGE_CACHE_SIZE-1)); /* Within page */
> - /* ugh. in prepare/commit_write, if from==to==start of block, we
> - ** skip the prepare. make sure we never send an offset for the start
> - ** of a block
> - */
> - if ((offset & (inode->i_sb->s_blocksize - 1)) == 0) {
> - offset++;
> - }
> - index = size >> PAGE_CACHE_SHIFT;
> + BUG_ON(abs_from >= abs_to);
> + BUG_ON(abs_to > ((index + 1) << PAGE_CACHE_SHIFT));
Sorry for not noticing this last night. This can't work: it will
overflow and trigger the BUG_ON. I hit a similar bug in the reflink test.
See commit d622b89.
> + BUG_ON(abs_from & (inode->i_blkbits - 1));
>
> page = grab_cache_page(mapping, index);
> if (!page) {
> @@ -754,31 +781,52 @@ static int ocfs2_write_zero_page(struct inode *inode,
> goto out;
> }
>
> - ret = ocfs2_prepare_write_nolock(inode, page, offset, offset);
> - if (ret < 0) {
> - mlog_errno(ret);
> - goto out_unlock;
> - }
> + /* Get the offsets within the page that we want to zero */
> + zero_from = abs_from & (PAGE_CACHE_SIZE - 1);
> + zero_to = abs_to & (PAGE_CACHE_SIZE - 1);
> + if (!zero_to)
> + zero_to = PAGE_CACHE_SIZE;
>
> - if (ocfs2_should_order_data(inode)) {
> - handle = ocfs2_start_walk_page_trans(inode, page, offset,
> - offset);
> - if (IS_ERR(handle)) {
> - ret = PTR_ERR(handle);
> - handle = NULL;
> + /* We know that zero_from is block aligned */
> + for (block_start = zero_from;
> + (block_start < PAGE_CACHE_SIZE) && (block_start < zero_to);
> + block_start = block_end) {
Do we really need to check block_start < PAGE_CACHE_SIZE? I think
checking block_start < zero_to is enough, since you have already limited
zero_to to PAGE_CACHE_SIZE. It also looks more natural (see below),
doesn't it?

for (block_start = zero_from; block_start < zero_to; block_start = block_end) {

Regards,
Tao

From: Joel Becker on 8 Jul 2010 06:00

On Thu, Jul 08, 2010 at 11:44:59AM +0800, Tao Ma wrote:
> On 07/07/2010 07:16 PM, Joel Becker wrote:
> >+ BUG_ON(abs_to > ((index + 1) << PAGE_CACHE_SHIFT));
> Sorry for not noticing this yesterday night. This can't work and
> will overflow and bug out. I met with a similar bug in reflink test.
> See commit d622b89.

Good catch. It's obvious, now that you mention it.
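
A minimal user-space sketch of the overflow, for reference. It assumes
the concern is that index is an unsigned long, which is only 32 bits wide
on 32-bit builds, so (index + 1) << PAGE_CACHE_SHIFT wraps for offsets at
or beyond 4 GiB. index32_t, page_end_narrow(), page_end_wide() and
check_page_range() are hypothetical names used only here, and the
4 KiB page size is an assumption. Widening to 64 bits before the shift
is one possible fix, not necessarily the one that was applied.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_CACHE_SHIFT 12	/* assumption: 4 KiB pages */

/* Models a 32-bit "unsigned long" page index, as on a 32-bit build. */
typedef uint32_t index32_t;

/* End of the page in bytes, computed in 32 bits as the patch does. */
static uint64_t page_end_narrow(index32_t index)
{
	return (index32_t)((index32_t)(index + 1) << PAGE_CACHE_SHIFT);
}

/* Same value, but widened to 64 bits before shifting. */
static uint64_t page_end_wide(index32_t index)
{
	return ((uint64_t)index + 1) << PAGE_CACHE_SHIFT;
}

/* The range checks from the patch, using the widened arithmetic. */
static void check_page_range(uint64_t abs_from, uint64_t abs_to)
{
	index32_t index = (index32_t)(abs_from >> PAGE_CACHE_SHIFT);

	assert(abs_from < abs_to);
	assert(abs_to <= page_end_wide(index));
}
```

For a page starting at exactly 4 GiB, page_end_narrow() wraps around to
4096, so any valid abs_to looks far too large and the check fires
spuriously, while page_end_wide() gives the real end of the page.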

> >+ /* We know that zero_from is block aligned */
> >+ for (block_start = zero_from;
> >+ (block_start < PAGE_CACHE_SIZE) && (block_start < zero_to);
> >+ block_start = block_end) {
> Do we really need to check block_start < PAGE_CACHE_SIZE? I think
> just check block_start < zero_to is enough since you have limit
> zero_to with PAGE_CACHE_SIZE. What's more, it looks more natural(see
> below), does it?
>
> for (block_start = zero_from; block_start < zero_to; block_start =
> block_end) {

Yup. The code looked different halfway through, so I didn't
realize I was checking the same thing twice.
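
A small user-space sketch of why the single bound is enough. The offset
computation mirrors the quoted patch. The 4 KiB page size, the
blocksize parameter, the block_end update and the function name
count_zeroed_blocks() are assumptions made for illustration, since the
loop body is not quoted in this thread.

```c
#include <assert.h>

#define PAGE_CACHE_SIZE 4096u	/* assumption: 4 KiB pages */

/*
 * Count the blocks the zeroing loop would visit within one page.
 * blocksize is a hypothetical parameter; the real code would derive
 * it from the inode.
 */
static unsigned count_zeroed_blocks(unsigned long long abs_from,
				    unsigned long long abs_to,
				    unsigned blocksize)
{
	/* Offsets within the page that we want to zero */
	unsigned zero_from = (unsigned)(abs_from & (PAGE_CACHE_SIZE - 1));
	unsigned zero_to = (unsigned)(abs_to & (PAGE_CACHE_SIZE - 1));
	unsigned block_start, block_end, n = 0;

	if (!zero_to)
		zero_to = PAGE_CACHE_SIZE;

	/* zero_to never exceeds the page size, so one bound suffices. */
	assert(zero_to <= PAGE_CACHE_SIZE);

	for (block_start = zero_from; block_start < zero_to;
	     block_start = block_end) {
		block_end = block_start + blocksize;	/* assumed step */
		n++;
	}
	return n;
}
```

Because zero_to is clamped to PAGE_CACHE_SIZE before the loop,
block_start < zero_to already implies block_start < PAGE_CACHE_SIZE.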

Joel

--

"Depend on the rabbit's foot if you will, but remember, it didn't
help the rabbit."
- R. E. Shay

Joel Becker
Consulting Software Developer
Oracle
E-mail: joel.becker(a)oracle.com
Phone: (650) 506-8127