From: Steffen Klassert
Hi Dan.

Your e-mail client corrupted all your patches by wrapping long lines,
so they do not apply. Please fix your e-mail client and resend.
Documentation/email-clients.txt might help here.

On Thu, Jul 01, 2010 at 05:57:57PM +0400, Dan Kruchinin wrote:
> Hello, Steffen.
>
> Thank you for your comments about the errors and coding style in my previous
> patchset. I hope I have now fixed all of them.

The Signed-off-by line and everything above it goes into the git commit
message of the patch, so you should describe your changes here.
It is good to read Documentation/SubmittingPatches and
Documentation/SubmitChecklist.

>
> Signed-off-by: Dan Kruchinin <dkruchinin(a)acm.org>
> ---
> crypto/pcrypt.c | 53 +++----
> include/linux/padata.h | 92 ++++++++----
> kernel/padata.c | 396 +++++++++++++++++++++++++++++++++---------------
> 3 files changed, 361 insertions(+), 180 deletions(-)
>
> diff --git a/crypto/pcrypt.c b/crypto/pcrypt.c
> index 247178c..f9b8b0b 100644
> --- a/crypto/pcrypt.c
> +++ b/crypto/pcrypt.c
> @@ -38,28 +38,30 @@ struct pcrypt_instance_ctx {
>
> struct pcrypt_aead_ctx {
> struct crypto_aead *child;
> - unsigned int cb_cpu;
> + int cb_cpu;
> + unsigned int count;
> };
>
> -static int pcrypt_do_parallel(struct padata_priv *padata, unsigned int *cb_cpu,
> +static int pcrypt_do_parallel(struct padata_priv *padata,
> + struct pcrypt_aead_ctx *ctx,
> struct padata_instance *pinst)
> {
> - unsigned int cpu_index, cpu, i;
> + int cpu;
>
> - cpu = *cb_cpu;
> + cpu = ctx->cb_cpu;
> + if (cpu < 0) {
> + struct cpumask mask;
> + int cpu_index, i;
>
> - if (cpumask_test_cpu(cpu, cpu_active_mask))
> - goto out;
> + padata_get_cb_cpumask(pinst, &mask);
> + cpu_index = ctx->count % cpumask_weight(&mask);
> + cpu = cpumask_first(&mask);
> + for (i = 0; i < cpu_index; i++)
> + cpu = cpumask_next(cpu, &mask);
>
> - cpu_index = cpu % cpumask_weight(cpu_active_mask);
> -
> - cpu = cpumask_first(cpu_active_mask);
> - for (i = 0; i < cpu_index; i++)
> - cpu = cpumask_next(cpu, cpu_active_mask);
> -
> - *cb_cpu = cpu;
> + ctx->cb_cpu = cpu;
> + }

The above is not safe against cpu hotplug. You initialize ctx->cb_cpu
to -1 when the transformation is initialized. So you choose your
callback cpu with the first call to pcrypt_do_parallel() and then never
again. What if the chosen callback cpu goes offline?

Also I don't understand why you changed the prototype of pcrypt_do_parallel
and added a 'count' to pcrypt_aead_ctx. The only change you have to do is to
check whether the chosen callback cpu is still in the cpu_active_mask _and_
in your new padata callback cpumask, and update the callback cpu accordingly
if not. Checking whether the callback cpu is in the callback cpumask needs
some special care in your other patch of course, because you can change this
cpumask from userspace then.

Putting the cpumask on the stack is also not a good idea. This can lead to
stack corruption on systems with many cpus.

>
> -out:
> return padata_do_parallel(pinst, padata, cpu);
> }
>
> @@ -142,7 +144,7 @@ static int pcrypt_aead_encrypt(struct aead_request *req)
> req->cryptlen, req->iv);
> aead_request_set_assoc(creq, req->assoc, req->assoclen);
>
> - err = pcrypt_do_parallel(padata, &ctx->cb_cpu, pcrypt_enc_padata);
> + err = pcrypt_do_parallel(padata, ctx, pcrypt_enc_padata);
> if (err)
> return err;
> else
> @@ -186,7 +188,7 @@ static int pcrypt_aead_decrypt(struct aead_request *req)
> req->cryptlen, req->iv);
> aead_request_set_assoc(creq, req->assoc, req->assoclen);
>
> - err = pcrypt_do_parallel(padata, &ctx->cb_cpu, pcrypt_dec_padata);
> + err = pcrypt_do_parallel(padata, ctx, pcrypt_dec_padata);
> if (err)
> return err;
> else
> @@ -232,7 +234,7 @@ static int pcrypt_aead_givencrypt(struct aead_givcrypt_request *req)
> aead_givcrypt_set_assoc(creq, areq->assoc, areq->assoclen);
> aead_givcrypt_set_giv(creq, req->giv, req->seq);
>
> - err = pcrypt_do_parallel(padata, &ctx->cb_cpu, pcrypt_enc_padata);
> + err = pcrypt_do_parallel(padata, ctx, pcrypt_enc_padata);
> if (err)
> return err;
> else
> @@ -243,20 +245,13 @@ static int pcrypt_aead_givencrypt(struct aead_givcrypt_request *req)
>
> static int pcrypt_aead_init_tfm(struct crypto_tfm *tfm)
> {
> - int cpu, cpu_index;
> struct crypto_instance *inst = crypto_tfm_alg_instance(tfm);
> - struct pcrypt_instance_ctx *ictx = crypto_instance_ctx(inst);
> struct pcrypt_aead_ctx *ctx = crypto_tfm_ctx(tfm);
> + struct pcrypt_instance_ctx *ictx = crypto_instance_ctx(inst);
> struct crypto_aead *cipher;
>
> - ictx->tfm_count++;
> -
> - cpu_index = ictx->tfm_count % cpumask_weight(cpu_active_mask);
> -
> - ctx->cb_cpu = cpumask_first(cpu_active_mask);
> - for (cpu = 0; cpu < cpu_index; cpu++)
> - ctx->cb_cpu = cpumask_next(ctx->cb_cpu, cpu_active_mask);
> -
> + ctx->count = ++ictx->tfm_count;
> + ctx->cb_cpu = -1;
> cipher = crypto_spawn_aead(crypto_instance_ctx(inst));

Same as above, just choose a cpu that is active and in your cpumask.

[snip]

> -static int padata_cpu_hash(struct padata_priv *padata)
> +/**
> + * padata_cpu_hash: Select cpu from cpumask @mask using simple
> + * hash function by integer @seq_nr.
> + *
> + * @mask: A pointer to cpumask that will be used for cpu selection.
> + * @seq_nr: An integer to hash.
> + */
> +static int padata_cpu_hash(const struct cpumask *mask, int seq_nr)
> {
> int cpu_index;
> - struct parallel_data *pd;
>
> - pd = padata->pd;
>
> /*
> * Hash the sequence numbers to the cpus by taking
> * seq_nr mod. number of cpus in use.
> */
> - cpu_index = padata->seq_nr % cpumask_weight(pd->cpumask);
> + cpu_index = seq_nr % cpumask_weight(mask);
>
> - return padata_index_to_cpu(pd, cpu_index);
> + return padata_index_to_cpu(mask, cpu_index);
> }

Why did you change the prototype of padata_cpu_hash?
I think it's better to keep it as it was: we pass in the padata_priv
of the object that wants parallel processing and get back the cpu
on which the thing must run.

>
> -static void padata_parallel_worker(struct work_struct *work)
> +/**
> + * padata_get_cb_cpumask: Fetch cpumask containing cpus that may be used for
> + * serialization callback from parallel data and copy it into the @mask.
> + *
> + * @pinst: A pointer to padata instance
> + * @mask: A pointer to cpumask structure where
> + * cpumask for callbacks will be copied.
> + */
> +void padata_get_cb_cpumask(struct padata_instance *pinst, struct cpumask *mask)
> {
> - struct padata_queue *queue;
> + struct parallel_data *pd;
> +
> + rcu_read_lock_bh();
> + pd = rcu_dereference(pinst->pd);
> + cpumask_copy(mask, pd->cpumask.cbcpu);
> + rcu_read_unlock_bh();
> +}
> +EXPORT_SYMBOL(padata_get_cb_cpumask);
> +
> +static void padata_parallel_worker(struct work_struct *parallel_work)
> +{
> + struct padata_parallel_queue *pqueue;
> struct parallel_data *pd;
> struct padata_instance *pinst;
> LIST_HEAD(local_list);
>
> local_bh_disable();
> - queue = container_of(work, struct padata_queue, pwork);
> - pd = queue->pd;
> + pqueue = container_of(parallel_work,
> + struct padata_parallel_queue,
> + work);
> +
> + pd = pqueue->pd;
> pinst = pd->pinst;
>
> - spin_lock(&queue->parallel.lock);
> - list_replace_init(&queue->parallel.list, &local_list);
> - spin_unlock(&queue->parallel.lock);
> + spin_lock(&pqueue->parallel.lock);
> + list_replace_init(&pqueue->parallel.list, &local_list);
> + spin_unlock(&pqueue->parallel.lock);
>
> while (!list_empty(&local_list)) {
> struct padata_priv *padata;
> @@ -93,8 +121,8 @@ static void padata_parallel_worker(struct work_struct *work)
> *
> * @pinst: padata instance
> * @padata: object to be parallelized
> - * @cb_cpu: cpu the serialization callback function will run on,
> - * must be in the cpumask of padata.
> + * @cb_cpu: cpu the serialization callback function will run on.
> + * NOTE: @cb_cpu *must* be in serial cpumask (i.e. pinst->cpumask.cbcpu)
> *
> * The parallelization callback function will run with BHs off.
> * Note: Every object which is parallelized by padata_do_parallel
> @@ -104,7 +132,7 @@ int padata_do_parallel(struct padata_instance *pinst,
> struct padata_priv *padata, int cb_cpu)
> {
> int target_cpu, err;
> - struct padata_queue *queue;
> + struct padata_parallel_queue *pqueue;
> struct parallel_data *pd;
>
> rcu_read_lock_bh();
> @@ -123,27 +151,27 @@ int padata_do_parallel(struct padata_instance *pinst,
> goto out;
>
> err = -EINVAL;
> - if (!cpumask_test_cpu(cb_cpu, pd->cpumask))
> + if (!cpumask_test_cpu(cb_cpu, pd->cpumask.cbcpu))
> goto out;
>
> err = -EINPROGRESS;
> atomic_inc(&pd->refcnt);
> padata->pd = pd;
> - padata->cb_cpu = cb_cpu;
>
> if (unlikely(atomic_read(&pd->seq_nr) == pd->max_seq_nr))
> atomic_set(&pd->seq_nr, -1);
>
> + padata->cb_cpu = cb_cpu;

Why did you move this line below the sequence number check?
It's a minor issue, but there are several places where you shuffle code
around for no real reason. It is good to read through your patches,
so you see what you actually changed. I usually do so; this helps
to find unneeded and accidentally introduced changes.

I know, I'm picky. But I hope this helps to get your patches
on the right track :)

Thanks,

Steffen
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Steffen Klassert
On Fri, Jul 02, 2010 at 01:07:44PM +0400, Dan Kruchinin wrote:
> >
> > The above is not safe against cpu hotplug. You initialize ctx->cb_cpu
> > to -1 when the transformation is initialized. So you choose your
> > callback cpu with the first call to pcrypt_do_parallel() and then never
> > again. What if the chosen callback cpu goes offline?
> >
> > Also I don't understand why you changed the prototype of pcrypt_do_parallel
> > and added a 'count' to pcrypt_aead_ctx. The only change you have to do is to
> > check whether the chosen callback cpu is still in the cpu_active_mask _and_
> > in your new padata callback cpumask, and update the callback cpu accordingly
> > if not. Checking whether the callback cpu is in the callback cpumask needs
> > some special care in your other patch of course, because you can change this
> > cpumask from userspace then.
> >
>
> Well, my point was to reduce the code and select the callback CPU only
> once, according to the serial cpumask of the given padata instance. In
> case I modify the serial cpumask of a given padata instance, I have to
> do the callback cpu calculation twice in the worst case: the first time
> in pcrypt_aead_init_tfm and the second time in pcrypt_do_parallel if
> cb_cpu is not set in my serial cpumask. So I decided to do it only once,
> in pcrypt_do_parallel.
>

But the active cpumask, and now also your serial cpumask, might change.
We need to catch these changes somehow; that's why I checked the active
cpumask against the callback cpu.

Steffen

From: Dan Kruchinin
On Fri, Jul 2, 2010 at 1:32 PM, Dan Kruchinin <dan.kruchinin(a)gmail.com> wrote:
> On Fri, Jul 2, 2010 at 1:17 PM, Steffen Klassert
> <steffen.klassert(a)secunet.com> wrote:
>> On Fri, Jul 02, 2010 at 01:07:44PM +0400, Dan Kruchinin wrote:
>>> >
>>> ...
>>
>> But the active cpumask, and now also your serial cpumask, might change.
>> We need to catch these changes somehow; that's why I checked the active
>> cpumask against the callback cpu.
>
> You're right, now I get it. Hence the right solution is to check if
> callback CPU is set in serial cpumask every time we do
> padata_do_serial and if it's not, recalculate its value.

padata_do_parallel of course.

> The only thing that bothers me in this scheme is that we have to
> allocate a cpumask_var_t in pcrypt_do_parallel every time we call it,
> then copy the serial cpumask into the allocated one and then check the
> cb_cpu.
> I think it would be better if we could somehow avoid the dynamic
> cpumask allocation. I see the following solutions:
>
> 1) Do the check and the cb_cpu recalculation in padata_do_parallel.
> It may check if cb_cpu is in the serial cpumask and recalculate its value
> if it isn't. The drawback of this scheme is that padata_do_parallel
> then doesn't guarantee it will forward the serialization job to the same
> callback CPU we passed to it. If the passed CPU is not in the serial
> cpumask, it will forward the serialization to another CPU and we won't
> know its number. The only thing we'll know is that this CPU is in the
> serial cpumask.
> 2) Create a new structure describing a pcrypt instance in pcrypt.c, which
> will include the waitqueue, the padata instance and a preallocated cpumask
> used for fetching the padata instance's serial cpumask. This helps to
> avoid dynamic cpumask allocation, but it looks a bit awkward.
>
>>
>> Steffen
>>
>>
>
>
>
> --
> W.B.R.
> Dan Kruchinin
>



--
W.B.R.
Dan Kruchinin
From: Dan Kruchinin
On Fri, Jul 2, 2010 at 1:17 PM, Steffen Klassert
<steffen.klassert(a)secunet.com> wrote:
> On Fri, Jul 02, 2010 at 01:07:44PM +0400, Dan Kruchinin wrote:
>> >
>> > The above is not safe against cpu hotplug. You initialize ctx->cb_cpu
>> > to -1 when the transformation is initialized. So you choose your
>> > callback cpu with the first call to pcrypt_do_parallel() and then never
>> > again. What if the chosen callback cpu goes offline?
>> >
>> > Also I don't understand why you changed the prototype of pcrypt_do_parallel
>> > and added a 'count' to pcrypt_aead_ctx. The only change you have to do is to
>> > check whether the chosen callback cpu is still in the cpu_active_mask _and_
>> > in your new padata callback cpumask, and update the callback cpu accordingly
>> > if not. Checking whether the callback cpu is in the callback cpumask needs
>> > some special care in your other patch of course, because you can change this
>> > cpumask from userspace then.
>> >
>>
>> Well, my point was to reduce the code and select the callback CPU only
>> once, according to the serial cpumask of the given padata instance. In
>> case I modify the serial cpumask of a given padata instance, I have to
>> do the callback cpu calculation twice in the worst case: the first time
>> in pcrypt_aead_init_tfm and the second time in pcrypt_do_parallel if
>> cb_cpu is not set in my serial cpumask. So I decided to do it only once,
>> in pcrypt_do_parallel.
>>
>
> But the active cpumask, and now also your serial cpumask, might change.
> We need to catch these changes somehow; that's why I checked the active
> cpumask against the callback cpu.

You're right, now I get it. Hence the right solution is to check if
callback CPU is set in serial cpumask every time we do
padata_do_serial and if it's not, recalculate its value.
The only thing that bothers me in this scheme is that we have to
allocate a cpumask_var_t in pcrypt_do_parallel every time we call it,
then copy the serial cpumask into the allocated one and then check the
cb_cpu.
I think it would be better if we could somehow avoid the dynamic
cpumask allocation. I see the following solutions:

1) Do the check and the cb_cpu recalculation in padata_do_parallel.
It may check if cb_cpu is in the serial cpumask and recalculate its value
if it isn't. The drawback of this scheme is that padata_do_parallel
then doesn't guarantee it will forward the serialization job to the same
callback CPU we passed to it. If the passed CPU is not in the serial
cpumask, it will forward the serialization to another CPU and we won't
know its number. The only thing we'll know is that this CPU is in the
serial cpumask.
2) Create a new structure describing a pcrypt instance in pcrypt.c, which
will include the waitqueue, the padata instance and a preallocated cpumask
used for fetching the padata instance's serial cpumask. This helps to
avoid dynamic cpumask allocation, but it looks a bit awkward.

>
> Steffen
>
>



--
W.B.R.
Dan Kruchinin
From: Steffen Klassert
On Fri, Jul 02, 2010 at 01:32:29PM +0400, Dan Kruchinin wrote:
> >
> > But the active cpumask, and now also your serial cpumask, might change.
> > We need to catch these changes somehow; that's why I checked the active
> > cpumask against the callback cpu.
>
> You're right, now I get it. Hence the right solution is to check if
> callback CPU is set in serial cpumask every time we do
> padata_do_serial and if it's not, recalculate its value.
> The only thing that bothers me in this scheme is that we have to
> allocate a cpumask_var_t in pcrypt_do_parallel every time we call it,
> then copy the serial cpumask into the allocated one and then check the
> cb_cpu.
> I think it would be better if we could somehow avoid the dynamic
> cpumask allocation. I see the following solutions:
>
> 1) Do the check and the cb_cpu recalculation in padata_do_parallel.
> It may check if cb_cpu is in the serial cpumask and recalculate its value
> if it isn't. The drawback of this scheme is that padata_do_parallel
> then doesn't guarantee it will forward the serialization job to the same
> callback CPU we passed to it. If the passed CPU is not in the serial
> cpumask, it will forward the serialization to another CPU and we won't
> know its number. The only thing we'll know is that this CPU is in the
> serial cpumask.
> 2) Create a new structure describing a pcrypt instance in pcrypt.c, which
> will include the waitqueue, the padata instance and a preallocated cpumask
> used for fetching the padata instance's serial cpumask. This helps to
> avoid dynamic cpumask allocation, but it looks a bit awkward.
>

I think the cleanest way to do it is to maintain notifier chains
for parallel/serial cpumask changes in padata. Users can register with
these notifier chains if they are interested in these events.
pcrypt is probably only interested in changes of the serial cpumask,
so you could allocate and initialize such a cpumask in pcrypt_aead_init_tfm
and add a pointer to it to pcrypt_aead_ctx.
Then you could update the cpumask from the notifier callback function.
cpumask changes are rare and slow anyway, so copying the cpumask there does
not matter that much. Since cpumask changes are rare, you can protect
pcrypt_do_parallel against cpumask changes with RCU.

Steffen