From: Vivek Goyal on
On Tue, Jul 27, 2010 at 04:56:29PM +0900, KAMEZAWA Hiroyuki wrote:
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu(a)jp.fujitsu.com>
>
> Now, the address of a memory cgroup can be calculated from its ID without complexity.
> This patch replaces pc->mem_cgroup, changing it from a pointer to an unsigned short.
> On 64-bit architectures, this gives us 6 more bytes of room per page_cgroup.
> Use 2 bytes for blkio-cgroup's page tracking. The other 4 bytes will be used for
> some lightweight concurrent access.
>
> We may be able to move this ID into the flags field, but ... let's go step by step.
>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu(a)jp.fujitsu.com>
> ---
> include/linux/page_cgroup.h | 3 ++-
> mm/memcontrol.c | 40 +++++++++++++++++++++++++---------------
> mm/page_cgroup.c | 2 +-
> 3 files changed, 28 insertions(+), 17 deletions(-)
>
> Index: mmotm-0719/include/linux/page_cgroup.h
> ===================================================================
> --- mmotm-0719.orig/include/linux/page_cgroup.h
> +++ mmotm-0719/include/linux/page_cgroup.h
> @@ -12,7 +12,8 @@
> */
> struct page_cgroup {
> unsigned long flags;
> - struct mem_cgroup *mem_cgroup;
> + unsigned short mem_cgroup; /* ID of assigned memory cgroup */
> + unsigned short blk_cgroup; /* Not Used..but will be. */
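
For illustration only, the size arithmetic in the patch description can be checked with a small userspace sketch. It models just the two fields visible in the hunk above (the real struct page_cgroup has more members), and the names page_cgroup_old, page_cgroup_new, and spare_bytes are hypothetical, not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Opaque here; only the size of a pointer to it matters. */
struct mem_cgroup;

/* Layout before the patch: flags plus a full pointer to the owner. */
struct page_cgroup_old {
	unsigned long flags;
	struct mem_cgroup *mem_cgroup;
};

/* Layout after the patch: two short IDs sit where the pointer was. */
struct page_cgroup_new {
	unsigned long flags;
	unsigned short mem_cgroup;	/* ID of assigned memory cgroup */
	unsigned short blk_cgroup;	/* reserved for the blkio ID */
};

/* Bytes of the old pointer slot not yet claimed by the two IDs. */
static size_t spare_bytes(void)
{
	return sizeof(struct mem_cgroup *) - 2 * sizeof(unsigned short);
}
```

On a typical LP64 build a pointer is 8 bytes and an unsigned short is 2, so the two IDs take 4 bytes and spare_bytes() returns 4, which corresponds to the 4 bytes the description sets aside for later use.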

So later I shall have to use virtually indexed arrays in the blkio controller?
Or are you just using virtually indexed arrays for lookup speed, so that
I can continue to use css_lookup() and not worry about virtually
indexed arrays?
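
To make the question concrete, here is a minimal userspace sketch of what resolving a 16-bit ID through a directly indexed table could look like. It is not the memcg or blkio implementation: struct group, group_table, group_register, and group_lookup are hypothetical names, and reserving ID 0 for "no owner" is an assumption of this sketch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for a cgroup; not a kernel type. */
struct group {
	const char *name;
};

/* Flat table: slot i holds the group whose ID is i, or NULL. */
static struct group *group_table[USHRT_MAX + 1];

/* Register a group under an ID. ID 0 is reserved here to mean "no owner". */
static int group_register(unsigned short id, struct group *g)
{
	if (id == 0 || group_table[id] != NULL)
		return -1;
	group_table[id] = g;
	return 0;
}

/* Resolve an ID to its group with one array access; NULL if unassigned. */
static struct group *group_lookup(unsigned short id)
{
	return group_table[id];
}
```

A slower lookup routine could sit behind the same group_lookup() interface without changing what is stored per page, since only the ID is recorded there.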

So the idea is that when a page is allocated, we also store the blk_group
id, and once that page is submitted for writeback, we should be able
to associate it with the right blkio group?

Vivek
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: KAMEZAWA Hiroyuki on
On Tue, 27 Jul 2010 22:39:04 -0400
Vivek Goyal <vgoyal(a)redhat.com> wrote:

> On Tue, Jul 27, 2010 at 04:56:29PM +0900, KAMEZAWA Hiroyuki wrote:
> > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu(a)jp.fujitsu.com>
> >
> > Now, the address of a memory cgroup can be calculated from its ID without complexity.
> > This patch replaces pc->mem_cgroup, changing it from a pointer to an unsigned short.
> > On 64-bit architectures, this gives us 6 more bytes of room per page_cgroup.
> > Use 2 bytes for blkio-cgroup's page tracking. The other 4 bytes will be used for
> > some lightweight concurrent access.
> >
> > We may be able to move this ID into the flags field, but ... let's go step by step.
> >
> > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu(a)jp.fujitsu.com>
> > ---
> > include/linux/page_cgroup.h | 3 ++-
> > mm/memcontrol.c | 40 +++++++++++++++++++++++++---------------
> > mm/page_cgroup.c | 2 +-
> > 3 files changed, 28 insertions(+), 17 deletions(-)
> >
> > Index: mmotm-0719/include/linux/page_cgroup.h
> > ===================================================================
> > --- mmotm-0719.orig/include/linux/page_cgroup.h
> > +++ mmotm-0719/include/linux/page_cgroup.h
> > @@ -12,7 +12,8 @@
> > */
> > struct page_cgroup {
> > unsigned long flags;
> > - struct mem_cgroup *mem_cgroup;
> > + unsigned short mem_cgroup; /* ID of assigned memory cgroup */
> > + unsigned short blk_cgroup; /* Not Used..but will be. */
>
> So later I shall have to use virtually indexed arrays in the blkio controller?
> Or are you just using virtually indexed arrays for lookup speed, so that
> I can continue to use css_lookup() and not worry about virtually
> indexed arrays?
>
Yes, you can use css_lookup(), even if it's slow.

> So the idea is that when a page is allocated, we also store the blk_group
> id, and once that page is submitted for writeback, we should be able
> to associate it with the right blkio group?
>
The blk_cgroup ID can be attached whenever you want. Please overwrite
page_cgroup->blk_cgroup when necessary.
Did you read Ikeda's patch? I don't have patches myself at this point.
This just makes room for recording the blkio ID, which has been requested
for a year.

Hmm, but page-allocation time doesn't sound very good to me.

Thanks.
-Kame

From: Vivek Goyal on
On Wed, Jul 28, 2010 at 11:44:02AM +0900, KAMEZAWA Hiroyuki wrote:
> On Tue, 27 Jul 2010 22:39:04 -0400
> Vivek Goyal <vgoyal(a)redhat.com> wrote:
>
> > On Tue, Jul 27, 2010 at 04:56:29PM +0900, KAMEZAWA Hiroyuki wrote:
> > > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu(a)jp.fujitsu.com>
> > >
> > > Now, the address of a memory cgroup can be calculated from its ID without complexity.
> > > This patch replaces pc->mem_cgroup, changing it from a pointer to an unsigned short.
> > > On 64-bit architectures, this gives us 6 more bytes of room per page_cgroup.
> > > Use 2 bytes for blkio-cgroup's page tracking. The other 4 bytes will be used for
> > > some lightweight concurrent access.
> > >
> > > We may be able to move this ID into the flags field, but ... let's go step by step.
> > >
> > > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu(a)jp.fujitsu.com>
> > > ---
> > > include/linux/page_cgroup.h | 3 ++-
> > > mm/memcontrol.c | 40 +++++++++++++++++++++++++---------------
> > > mm/page_cgroup.c | 2 +-
> > > 3 files changed, 28 insertions(+), 17 deletions(-)
> > >
> > > Index: mmotm-0719/include/linux/page_cgroup.h
> > > ===================================================================
> > > --- mmotm-0719.orig/include/linux/page_cgroup.h
> > > +++ mmotm-0719/include/linux/page_cgroup.h
> > > @@ -12,7 +12,8 @@
> > > */
> > > struct page_cgroup {
> > > unsigned long flags;
> > > - struct mem_cgroup *mem_cgroup;
> > > + unsigned short mem_cgroup; /* ID of assigned memory cgroup */
> > > + unsigned short blk_cgroup; /* Not Used..but will be. */
> >
> > So later I shall have to use virtually indexed arrays in the blkio controller?
> > Or are you just using virtually indexed arrays for lookup speed, so that
> > I can continue to use css_lookup() and not worry about virtually
> > indexed arrays?
> >
> Yes, you can use css_lookup(), even if it's slow.
>

Ok.

> > So the idea is that when a page is allocated, we also store the blk_group
> > id, and once that page is submitted for writeback, we should be able
> > to associate it with the right blkio group?
> >
> The blk_cgroup ID can be attached whenever you want. Please overwrite
> page_cgroup->blk_cgroup when necessary.

> Did you read Ikeda's patch? I don't have patches myself at this point.
> This just makes room for recording the blkio ID, which has been requested
> for a year.

I have not read his patches yet. IIRC, previously there were issues
regarding which group should be charged for the page: the one that
allocated it, or the thread that last wrote to it, etc. I guess
we can sort that out later.

>
> Hmm, but page-allocation time doesn't sound very good to me.
>

Why?

Vivek
From: KAMEZAWA Hiroyuki on
On Tue, 27 Jul 2010 23:13:58 -0400
Vivek Goyal <vgoyal(a)redhat.com> wrote:

> > > So the idea is that when a page is allocated, we also store the blk_group
> > > id, and once that page is submitted for writeback, we should be able
> > > to associate it with the right blkio group?
> > >
> > The blk_cgroup ID can be attached whenever you want. Please overwrite
> > page_cgroup->blk_cgroup when necessary.
>
> > Did you read Ikeda's patch? I don't have patches myself at this point.
> > This just makes room for recording the blkio ID, which has been requested
> > for a year.
>
> I have not read his patches yet. IIRC, previously there were issues
> regarding which group should be charged for the page: the one that
> allocated it, or the thread that last wrote to it, etc. I guess
> we can sort that out later.
>
> >
> > Hmm, but page-allocation time doesn't sound very good to me.
> >
>
> Why?
>

As you wrote, if the ID is attached when a page is added to the page cache,
there will be many chances for free riders until it's paged out. So, adding
some point at which the owner is reset may be good.

But considering real-world usage, I may be wrong.
There will not be many free riders in the real world, especially at write().
In that case, page-allocation time may be good.

(Because databases don't use the page cache, there will be no big random-write
applications.)
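
The trade-off described above can be sketched in a few lines of userspace C. This is only an illustration under stated assumptions: tag_policy, page_tag, page_added, page_dirtied, and writeback_owner are hypothetical names, and "reset the owner on write" is one possible reset point, not something the patch implements.

```c
#include <assert.h>

/* Hypothetical policies for when the blkio owner ID is recorded. */
enum tag_policy {
	TAG_AT_ALLOC,	/* record once, when the page enters the page cache */
	TAG_AT_WRITE	/* additionally reset the owner on every write */
};

/* Only the field relevant here; a stand-in for page_cgroup->blk_cgroup. */
struct page_tag {
	unsigned short blk_cgroup;
};

/* The page enters the page cache on behalf of group `id`. */
static void page_added(struct page_tag *pt, unsigned short id)
{
	pt->blk_cgroup = id;
}

/* Group `writer` dirties the page; the owner changes only under TAG_AT_WRITE. */
static void page_dirtied(struct page_tag *pt, unsigned short writer,
			 enum tag_policy policy)
{
	if (policy == TAG_AT_WRITE)
		pt->blk_cgroup = writer;
}

/* Group that writeback I/O for this page would be attributed to. */
static unsigned short writeback_owner(const struct page_tag *pt)
{
	return pt->blk_cgroup;
}
```

If group 1 adds the page and group 2 later dirties it, TAG_AT_ALLOC still attributes the writeback to group 1 (group 2 is the free rider), while TAG_AT_WRITE attributes it to group 2.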

Thanks,
-Kame

From: Vivek Goyal on
On Wed, Jul 28, 2010 at 12:21:28PM +0900, KAMEZAWA Hiroyuki wrote:
> On Wed, 28 Jul 2010 12:18:20 +0900
> KAMEZAWA Hiroyuki <kamezawa.hiroyu(a)jp.fujitsu.com> wrote:
>
> > > > Hmm, but page-allocation time doesn't sound very good to me.
> > > >
> > >
> > > Why?
> > >
> >
> > As you wrote, if the ID is attached when a page is added to the page cache,
> > there will be many chances for free riders until it's paged out. So, adding
> > some point at which the owner is reset may be good.
> >
> > But considering real-world usage, I may be wrong.
> > There will not be many free riders in the real world, especially at write().
> > In that case, page-allocation time may be good.
> >
> > (Because databases don't use the page cache, there will be no big random-write
> > applications.)
> >
>
> Sorry, one more reason. The memory cgroup has very complex code for supporting
> move_account, which re-attaches the memory cgroup of each page.
> So, if you take care of tasks moving between groups, the blkio ID may have
> some problems if you only support allocation-time accounting.

I think initially we can just keep it simple for the blkio controller and
not move page charges across blkio cgroups when a process moves.

Vivek