From: Steffen Klassert on
On Thu, Jul 01, 2010 at 06:28:34PM +0400, Dan Kruchinin wrote:
> >
> > These statistic counters add a lot of atomic operations to the fast-path.
> > Wouldn't it be better to have these statistics in a percpu manner?
> > This would avoid the atomic operations and we would get some additional
> > information on the distribution of the queued objects.
> >
>
> If I understood you correctly the resulting sysfs hierarchy would look like
> this one:
> pcrypt/
> |- serial_cpumask
> |- parallel_cpumask
> |- w0/
> +--- parallel_objects
> +--- serial_objects
> +--- reorder_objects
> |- w1/
> ...
> |- wN/
>
> right? If so, I think it won't be very convenient to monitor the total number
> of parallel, serial and reorder objects.

Yes, I thought about something like this. You can still take the sum
over the percpu objects when you output the statistics.
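
To make the idea concrete, here is a minimal user-space sketch of summing on
output. It is not padata code: NR_WORKERS, struct worker_stats,
count_parallel() and sum_parallel() are names made up for illustration, and a
plain array stands in for real percpu data.

```c
#include <stddef.h>

#define NR_WORKERS 4  /* hypothetical number of per-CPU workers (w0..wN) */

/* Hypothetical per-worker statistics, one slot per CPU. */
struct worker_stats {
    long parallel_objects;
    long serial_objects;
    long reorder_objects;
};

static struct worker_stats stats[NR_WORKERS];

/* Fast path: touch only the local worker's slot, no atomic operation. */
static void count_parallel(size_t cpu, long delta)
{
    stats[cpu].parallel_objects += delta;
}

/* Output path: the summary is computed only when somebody reads it. */
static long sum_parallel(void)
{
    long total = 0;

    for (size_t cpu = 0; cpu < NR_WORKERS; cpu++)
        total += stats[cpu].parallel_objects;
    return total;
}
```

The individual slots would still be available for a per-worker view, so the
summary and the distribution could both be exported. The sketch ignores
concurrent updates while summing.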


> Anyway, I think these atomic operations take very little time in comparison
> with other operations in padata. So little that it can be ignored.

I have a patch in queue that simplifies the serialization mechanism and
reduces the accesses of foreign and global memory as much as possible
in the parallel codepath. Adding atomic operations to global memory
(just to collect statistics) to the parallel codepath would go in the
opposite direction.
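
A rough user-space illustration of the two access patterns, using C11 atomics.
Everything here is an assumption made for the sketch: the names, the number of
CPUs, and the 64-byte cache line size are not taken from padata.

```c
#include <stdatomic.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 4  /* hypothetical CPU count */

/* Variant 1: one global counter. Every CPU performs an atomic
 * read-modify-write on the same shared memory location. */
static atomic_long global_parallel_objects;

static void count_global(void)
{
    atomic_fetch_add_explicit(&global_parallel_objects, 1,
                              memory_order_relaxed);
}

/* Variant 2: one slot per CPU, aligned to an assumed 64-byte cache
 * line so that neighbouring slots do not share a line. */
struct local_counter {
    _Alignas(64) long value;
};

static struct local_counter local_parallel_objects[NR_CPUS_SKETCH];

/* Plain increment of local memory; the caller is assumed to be the
 * only writer of its own slot. */
static void count_local(size_t cpu)
{
    local_parallel_objects[cpu].value++;
}
```

In the first variant every increment touches global memory; in the second the
parallel codepath stays on memory owned by the local CPU.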

Steffen
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Dan Kruchinin on
On Fri, Jul 2, 2010 at 1:08 PM, Steffen Klassert
<steffen.klassert(a)secunet.com> wrote:
> On Thu, Jul 01, 2010 at 06:28:34PM +0400, Dan Kruchinin wrote:
>> >
>> > These statistic counters add a lot of atomic operations to the fast-path.
>> > Wouldn't it be better to have these statistics in a percpu manner?
>> > This would avoid the atomic operations and we would get some additional
>> > information on the distribution of the queued objects.
>> >
>>
>> If I understood you correctly the resulting sysfs hierarchy would look like
>> this one:
>> pcrypt/
>> |- serial_cpumask
>> |- parallel_cpumask
>> |- w0/
>> +--- parallel_objects
>> +--- serial_objects
>> +--- reorder_objects
>> |- w1/
>> ...
>> |- wN/
>>
>> right? If so, I think it won't be very convenient to monitor the total number
>> of parallel, serial and reorder objects.
>
> Yes, I thought about something like this. You can still take the sum
> over the percpu objects when you output the statistics.

But the summation cannot be consistent without some kind of lock, because
while we're summing, another CPU can increase or decrease its percpu statistic
counters. So each percpu statistic counter must be modified under a lock, right?

>
>
>> Anyway, I think these atomic operations take very little time in comparison
>> with other operations in padata. So little that it can be ignored.
>
> I have a patch in queue that simplifies the serialization mechanism and
> reduces the accesses of foreign and global memory as much as possible
> in the parallel codepath. Adding atomic operations to global memory
> (just to collect statistics) to the parallel codepath would go in the
> opposite direction.
>
> Steffen
>



--
W.B.R.
Dan Kruchinin
From: Steffen Klassert on
On Fri, Jul 02, 2010 at 02:20:15PM +0400, Dan Kruchinin wrote:
> >
> > Yes, I thought about something like this. You can still take the sum
> > over the percpu objects when you output the statistics.
>
> But the summation cannot be consistent without some kind of lock, because
> while we're summing, another CPU can increase or decrease its percpu statistic
> counters. So each percpu statistic counter must be modified under a lock, right?
>

Yes, the counters must be accessed under a lock. In the fastpath functions you
hold the appropriate lock anyway. Modifying a local percpu value should
not be too painful there.

The expensive part is reading the percpu statistics, since that touches every
CPU's counters, but this happens on demand and is probably a rare event.
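
A minimal user-space sketch of that locking scheme, with pthread mutexes
standing in for the kernel locks. NR_QUEUES, struct cpu_queue and the function
names are hypothetical and only illustrate the pattern.

```c
#include <pthread.h>
#include <stddef.h>

#define NR_QUEUES 4  /* hypothetical number of per-CPU queues */

/* Hypothetical per-CPU queue: the lock guards the queue itself,
 * the counter simply lives under the same lock. */
struct cpu_queue {
    pthread_mutex_t lock;
    long reorder_objects;
};

static struct cpu_queue queues[NR_QUEUES];

static void queues_init(void)
{
    for (size_t i = 0; i < NR_QUEUES; i++) {
        pthread_mutex_init(&queues[i].lock, NULL);
        queues[i].reorder_objects = 0;
    }
}

/* Fast path: the lock is taken for queueing anyway, so the counter
 * update needs no extra synchronization. */
static void enqueue_object(size_t cpu)
{
    pthread_mutex_lock(&queues[cpu].lock);
    /* ... real code would add the object to the queue here ... */
    queues[cpu].reorder_objects++;
    pthread_mutex_unlock(&queues[cpu].lock);
}

/* Rare path: a reader takes every queue lock in turn. */
static long read_reorder_total(void)
{
    long total = 0;

    for (size_t i = 0; i < NR_QUEUES; i++) {
        pthread_mutex_lock(&queues[i].lock);
        total += queues[i].reorder_objects;
        pthread_mutex_unlock(&queues[i].lock);
    }
    return total;
}
```

Because the reader takes the locks one at a time, each value it reads is
consistent, but the total is not a snapshot of a single instant. For
statistics that is probably acceptable.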

Steffen
From: Steffen Klassert on
On Fri, Jul 02, 2010 at 02:20:15PM +0400, Dan Kruchinin wrote:
> On Fri, Jul 2, 2010 at 1:08 PM, Steffen Klassert
> <steffen.klassert(a)secunet.com> wrote:
> > On Thu, Jul 01, 2010 at 06:28:34PM +0400, Dan Kruchinin wrote:
> >> >
> >> > These statistic counters add a lot of atomic operations to the fast-path.
> >> > Wouldn't it be better to have these statistics in a percpu manner?
> >> > This would avoid the atomic operations and we would get some additional
> >> > information on the distribution of the queued objects.
> >> >
> >>
> >> If I understood you correctly the resulting sysfs hierarchy would look like
> >> this one:
> >> pcrypt/
> >> |- serial_cpumask
> >> |- parallel_cpumask
> >> |- w0/
> >> +--- parallel_objects
> >> +--- serial_objects
> >> +--- reorder_objects
> >> |- w1/
> >> ...
> >> |- wN/
> >>
> >> right? If so, I think it won't be very convenient to monitor the total number
> >> of parallel, serial and reorder objects.
> >
> > Yes, I thought about something like this. You can still take the sum
> > over the percpu objects when you output the statistics.
>
> But the summation cannot be consistent without some kind of lock, because
> while we're summing, another CPU can increase or decrease its percpu statistic
> counters. So each percpu statistic counter must be modified under a lock, right?
>

Thinking a bit longer about these statistics, I think the statistics work
should be a separate patch. We should focus on the cpumask separation now and
think about the statistics later.

Steffen