From: David Rientjes on
On Tue, 23 Feb 2010, Miao Xie wrote:

> > Cpu hotplug sets top_cpuset's cpus_allowed to cpu_active_mask by default,
> > regardless of what was onlined or offlined. cpus_attach in the context of
> > your patch (in cpuset_attach()) passes cpu_possible_mask to
> > set_cpus_allowed_ptr() if the task is being attached to top_cpuset. My
> > question was: why don't we pass cpu_active_mask instead? In other words, I
> > think we should do
> >
> > cpumask_copy(cpus_attach, cpu_active_mask);
> >
> > when attached to top_cpuset like my patch did.
>
> If we pass cpu_active_mask to set_cpus_allowed_ptr(), task->cpus_allowed contains only
> the online cpus. In that case, whenever we do cpu hotplug (e.g. online a cpu), we must
> update cpus_allowed of all tasks in the top cpuset.
>
> But if we pass cpu_possible_mask, we needn't update cpus_allowed of all tasks in the
> top cpuset. And when the kernel looks for a cpu for a task to run on, it will use
> cpu_active_mask to filter out offline cpus in task->cpus_allowed. Thus, it is safe.
>

That is terribly inconsistent between top_cpuset and all descendants; all
other cpusets require that task->cpus_allowed be a subset of
cpu_online_mask, including those descendants that allow all cpus (and all
mems).
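
To make the two positions concrete, here is a toy userspace sketch (not
kernel code). It assumes a mask of at most 64 cpus held in one integer; the
names toy_cpumask_t and toy_pick_cpu() are hypothetical and only model the
"filter cpus_allowed by the active mask at selection time" idea from the
quoted text.

```c
#include <stdint.h>

/* Toy stand-in for a cpumask: one bit per cpu, at most 64 cpus (assumption). */
typedef uint64_t toy_cpumask_t;

/*
 * Hypothetical helper: return the lowest-numbered cpu that is set in both
 * the task's allowed mask and the active mask, or -1 if there is none.
 */
static int toy_pick_cpu(toy_cpumask_t allowed, toy_cpumask_t active)
{
	toy_cpumask_t usable = allowed & active;

	for (int cpu = 0; cpu < 64; cpu++)
		if (usable & ((toy_cpumask_t)1 << cpu))
			return cpu;
	return -1;
}
```

In this model, a task whose allowed mask was copied from the active mask at
attach time does not see a cpu that is onlined later until its mask is
rewritten, while a task holding the possible mask picks it up through the
filter alone. It says nothing about whether top_cpuset should differ from
its descendants, which is the objection above.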
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: David Rientjes on
On Wed, 24 Feb 2010, Miao Xie wrote:

> >> Sorry, could you explain what you advised?
> >> I think it is hard to fix this problem by adding a variant, because it is
> >> hard to avoid loading a word of the mask before
> >>
> >> nodes_or(tsk->mems_allowed, tsk->mems_allowed, *newmems);
> >>
> >> and then loading another word of the mask after
> >>
> >> tsk->mems_allowed = *newmems;
> >>
> >> unless we use lock.
> >>
> >> Maybe we need a rw-lock to protect task->mems_allowed.
> >>
> >
> > I meant that we need to define synchronization only for configurations
> > that do not do atomic nodemask_t stores; it's otherwise unnecessary.
> > We'll need to load and store tsk->mems_allowed via a helper function that
> > is defined to take the rwlock for such configs and only read/write the
> > nodemask for others.
> >
>
> After investigating, we found that it is hard to guarantee consistency between
> mempolicy and mems_allowed because mempolicy was designed to be self-updating:
> it can only be changed by the task itself. Maybe we must change the implementation
> of mempolicy.
>

Before your change, cpuset nodemask changes were serialized on
manage_mutex which would, in turn, serialize the rebinding of each
attached task's mempolicy. update_nodemask() is now serialized on
cgroup_lock(), which also protects scan_for_empty_cpusets(), so the cpuset
code protects it adequately. If a concurrent mempolicy change from a
user's set_mempolicy() happens, however, it could introduce an
inconsistency between them.

If we protect current->mems_allowed with a rwlock or seqlock for configs
where MAX_NUMNODES > BITS_PER_LONG, then we can always guarantee that we
get the entire nodemask. The same problem is present for
current->cpus_allowed, however, with NR_CPUS > BITS_PER_LONG. We must be
able to safely read both masks without any chance of seeing a transient
value for which nodes_empty() or cpus_empty() is true.
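
As a rough illustration of the seqlock option, here is a userspace sketch
using C11 atomics. It is not the kernel's seqlock API; struct toy_mems,
TOY_NODE_WORDS and the toy_mems_*() helpers are hypothetical names, and it
assumes writers are already serialized by some outer lock (as nodemask
updates are by cgroup_lock() above). The case it guards against is a writer
moving the mask from {node 0} to {node 64}: a reader that loads word 0 after
the update and word 1 before it would otherwise see an empty mask.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_NODE_WORDS 2	/* assumption: the mask spans more than one word */

struct toy_mems {
	atomic_uint seq;			/* even: stable, odd: write in progress */
	atomic_ulong bits[TOY_NODE_WORDS];
};

static void toy_mems_init(struct toy_mems *m)
{
	atomic_init(&m->seq, 0);
	for (int i = 0; i < TOY_NODE_WORDS; i++)
		atomic_init(&m->bits[i], 0);
}

/* Single writer at a time (assumed to be serialized by the caller). */
static void toy_mems_write(struct toy_mems *m, const unsigned long *newbits)
{
	unsigned int s = atomic_load(&m->seq);

	atomic_store(&m->seq, s + 1);		/* mark write in progress */
	for (int i = 0; i < TOY_NODE_WORDS; i++)
		atomic_store(&m->bits[i], newbits[i]);
	atomic_store(&m->seq, s + 2);		/* publish the new mask */
}

/* Retry until a whole mask was read with no write in between. */
static void toy_mems_read(struct toy_mems *m, unsigned long *out)
{
	unsigned int s1, s2;

	do {
		s1 = atomic_load(&m->seq);
		for (int i = 0; i < TOY_NODE_WORDS; i++)
			out[i] = atomic_load(&m->bits[i]);
		s2 = atomic_load(&m->seq);
	} while ((s1 & 1) || s1 != s2);
}

/* True if no bit is set in a copy obtained from toy_mems_read(). */
static bool toy_mems_empty(const unsigned long *bits)
{
	for (int i = 0; i < TOY_NODE_WORDS; i++)
		if (bits[i])
			return false;
	return true;
}
```

A reader in this sketch always gets either the old mask or the new one,
never a mix of words from both. A config where the whole mask fits in one
word could skip the sequence counter and use a plain load and store.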
From: David Rientjes on
On Wed, 24 Feb 2010, Miao Xie wrote:

> I think it is not a big deal because it is safe and doesn't cause any problems.
> Besides that, task->cpus_allowed is initialized to cpu_possible_mask on a kernel without
> cpusets, so using cpu_possible_mask to initialize task->cpus_allowed is reasonable.
> (top cpuset is a special cpuset, isn't it?)
>

I'm surprised that I can create a descendant cpuset of top_cpuset that
cannot include all of its parent's cpus and that the root cpuset's cpus
mask doesn't change when cpus are onlined/offlined.