From: Bharata B Rao on
On Fri, Feb 12, 2010 at 06:54:52PM -0800, Paul Turner wrote:
>
> The skeleton of our approach is as follows:
> - As above we maintain a global, per-tg pool of unassigned quota. On it
> we track the bandwidth period, quota per period, and runtime remaining in
> the current period. As bandwidth is used within a period it is decremented
> from runtime. Runtime is currently synchronized using a spinlock. In the
> current implementation there's no reason this couldn't be done using
> atomic ops instead; however, the spinlock allows a little more flexibility
> in experimenting with other schemes.
> - When a cfs_rq participating in a bandwidth-constrained task_group executes,
> it acquires time in sysctl_sched_cfs_bandwidth_slice (default currently
> 10ms) sized chunks from the global pool. This synchronizes under rq->lock
> and is part of the update_curr path.
> - Throttled entities are dequeued immediately (as opposed to delaying this
> operation to the put path). This avoids some potentially poor load-balancer
> interactions and preserves the 'verbiage' of the put_task semantic.
> Throttled entities are gated from participating in the tree at the
> {enqueue, dequeue}_entity level. They are also skipped for load
> balance in the same manner as Bharata's patch-series employs.
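
The quota pool and slice acquisition described in the quoted text can be
sketched roughly as follows. This is a minimal user-space illustration only:
the names bw_pool, bw_pool_init, bw_pool_acquire and bw_pool_refresh are
hypothetical and not taken from the patch, and the locking (a spinlock in the
posted implementation) is left out and only marked in comments.

```c
#include <stdint.h>

/* Hypothetical names; this is not the patch's actual data structure. */
#define BW_SLICE_NS (10ULL * 1000 * 1000)   /* 10ms slice, the quoted default */

struct bw_pool {
    uint64_t period_ns;    /* length of one bandwidth period */
    uint64_t quota_ns;     /* runtime granted per period */
    uint64_t runtime_ns;   /* runtime still unassigned in this period */
};

void bw_pool_init(struct bw_pool *p, uint64_t period_ns, uint64_t quota_ns)
{
    p->period_ns = period_ns;
    p->quota_ns = quota_ns;
    p->runtime_ns = quota_ns;
}

/*
 * Hand out at most one slice of runtime to a local runqueue.
 * A return value of 0 means the pool is exhausted and the caller
 * would have to throttle. The real code would hold a lock here.
 */
uint64_t bw_pool_acquire(struct bw_pool *p)
{
    uint64_t grant = p->runtime_ns < BW_SLICE_NS ? p->runtime_ns : BW_SLICE_NS;

    p->runtime_ns -= grant;
    return grant;
}

/* At a period boundary the unassigned runtime is reset to the quota. */
void bw_pool_refresh(struct bw_pool *p)
{
    p->runtime_ns = p->quota_ns;
}
```

With a 25ms quota, successive calls to bw_pool_acquire() would return 10ms,
10ms, 5ms and then 0 until bw_pool_refresh() runs at the next period boundary.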

I deferred the dequeue until the next put because walking the se hierarchy
multiple times (from update_curr -> dequeue_entity -> update_curr) appeared
too complex when I started on it.

>
> Interface:
> ----------
> Two new cgroupfs files are added to the cpu subsystem:
> - cpu.cfs_period_us : period over which bandwidth is to be regulated
> - cpu.cfs_quota_us : bandwidth available for consumption per period
>
> One important interface change that this introduces (versus the rate limits
> proposal) is that the defined bandwidth becomes an absolute quantifier.
>
> e.g. a bandwidth of 5 seconds (cpu.cfs_quota_us=5000000) on a period of 1 second
> (cpu.cfs_period_us=1000000) would result in 5 wall seconds of cpu time being
> consumable every 1 wall second.
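
To make the arithmetic in the quoted example concrete, here is a small sketch
that converts a quota/period pair into the number of CPUs' worth of time
consumable per wall second. The function name cfs_cpus_worth is hypothetical
and used only for illustration; it is not part of the proposed interface.

```c
#include <stdint.h>

/*
 * Hypothetical helper: how many CPUs' worth of time a group may consume,
 * given the values written to cpu.cfs_quota_us and cpu.cfs_period_us.
 * Returns 0.0 for a zero period rather than dividing by zero.
 */
double cfs_cpus_worth(uint64_t quota_us, uint64_t period_us)
{
    if (period_us == 0)
        return 0.0;
    return (double)quota_us / (double)period_us;
}
```

For the values quoted above, cfs_cpus_worth(5000000, 1000000) evaluates to 5.0,
i.e. 5 wall seconds of cpu time per 1 wall second, while a quota of 500000 on
the same period gives 0.5, i.e. half a CPU.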

As I have said earlier, I would like to hear what others say about this
interface, especially the Linux-VServer project, since it is already
using the cfs hard limit patches in its test release. Herbert?

Thanks for your work. More later when I review the individual patches
in detail.

Regards,
Bharata.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Bharata B Rao on
On Fri, Feb 12, 2010 at 06:54:52PM -0800, Paul Turner wrote:
> Todo:
> -----
> - hierarchical nr_tasks_running accounting:
> This is a deficiency currently shared with SCHED_RT rate limiting. When
> an entity is throttled the running tasks it owns are not subtracted from
> rq->nr_running. This then results in us missing idle_balance() due to
> phantom tasks and load balancer weight per task calculations being
> incorrect.
>
> This code adds complexity that both complicated the initial review of this
> patchset and is probably best reviewed independently of this feature's
> scope. To that end we'll post a separate patch for this issue against the
> current RT rate-limiting code and merge any converged-on approach here as
> appropriate.
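
The "phantom tasks" problem in the quoted Todo item can be illustrated with a
toy model. Everything below (toy_rq, toy_group, toy_throttle,
toy_rq_looks_idle) is hypothetical and far simpler than the scheduler's real
structures. It only shows why leaving nr_running untouched on throttle makes a
runqueue with nothing runnable still look busy.

```c
/* Toy model with hypothetical names; not the scheduler's real structures. */
struct toy_rq {
    unsigned int nr_running;   /* tasks the runqueue believes are runnable */
};

struct toy_group {
    unsigned int nr_tasks;     /* runnable tasks owned by this group */
    int throttled;
};

/*
 * Throttle a group. With fix_accounting == 0 the group's tasks stay
 * counted in rq->nr_running (the deficiency described above); with
 * fix_accounting != 0 they are subtracted.
 */
void toy_throttle(struct toy_rq *rq, struct toy_group *tg, int fix_accounting)
{
    tg->throttled = 1;
    if (fix_accounting && rq->nr_running >= tg->nr_tasks)
        rq->nr_running -= tg->nr_tasks;
}

/* Idle balancing would only be attempted when the runqueue looks empty. */
int toy_rq_looks_idle(const struct toy_rq *rq)
{
    return rq->nr_running == 0;
}
```

If a group owning all 3 runnable tasks is throttled without the fix,
toy_rq_looks_idle() still returns 0 although nothing can run. With the fix it
returns 1.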

I had tried updating rq->nr_running in my v2 patchset
(http://lkml.org/lkml/2009/9/30/117, http://lkml.org/lkml/2009/9/30/119),
but since I felt that it added a lot of complexity, I removed it
subsequently in v3 (http://lkml.org/lkml/2009/11/9/65) and kept it similar
to RT.

>
> - throttle statistics:
> Some statistics regarding the frequency and duration of throttling are
> definitely in order.

Please take a look at some of the throttling related stats I am collecting
in my patchset.
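
As a rough illustration of what frequency and duration statistics could look
like, here is a minimal sketch. The struct and function names are hypothetical
and are not the counters from either patchset. Timestamps are passed in by the
caller as nanoseconds.

```c
#include <stdint.h>

/* Hypothetical counters; not the fields used by either patchset. */
struct throttle_stats {
    uint64_t nr_throttled;       /* how many times throttling began */
    uint64_t throttled_ns;       /* total time spent throttled */
    uint64_t throttle_start_ns;  /* timestamp of the current throttle */
    int throttled;               /* non-zero while throttled */
};

/* Record the start of a throttle; repeated calls while throttled are ignored. */
void stats_throttle(struct throttle_stats *s, uint64_t now_ns)
{
    if (s->throttled)
        return;
    s->throttled = 1;
    s->nr_throttled++;
    s->throttle_start_ns = now_ns;
}

/* Record the end of a throttle and accumulate its duration. */
void stats_unthrottle(struct throttle_stats *s, uint64_t now_ns)
{
    if (!s->throttled)
        return;
    s->throttled = 0;
    s->throttled_ns += now_ns - s->throttle_start_ns;
}
```

Frequency is then nr_throttled over the observation window, and the average
throttle duration is throttled_ns divided by nr_throttled.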

Regards,
Bharata.
