From: Steven Rostedt on
Hi Kosaki,

FYI, could you send emails to my goodmis account. I can easily miss
emails sent to my RH account since it is usually flooded with RH
Bugzilla reports.

(more below)

On Wed, 2010-06-30 at 12:06 +0900, KOSAKI Motohiro wrote:
> Documentation/trace/ftrace.txt says
>
> buffer_size_kb:
>
> This sets or displays the number of kilobytes each CPU
> buffer can hold. The tracer buffers are the same size
> for each CPU. The displayed number is the size of the
> CPU buffer and not total size of all buffers. The
> trace buffers are allocated in pages (blocks of memory
> that the kernel uses for allocation, usually 4 KB in size).
> If the last page allocated has room for more bytes
> than requested, the rest of the page will be used,
> making the actual allocation bigger than requested.
> ( Note, the size may not be a multiple of the page size
> due to buffer management overhead. )
>
> This can only be updated when the current_tracer
> is set to "nop".
>
> But that is incorrect. Currently, total memory consumption is
> 'buffer_size_kb x CPUs x 2'.
>
> Why the factor of two? Because ftrace implicitly allocates
> a buffer for max latency too.
>
> This has a sad result when an admin wants to use a large buffer (for
> example, for full logging and detailed analysis). If an admin
> has a 24-CPU machine and writes 200MB to buffer_size_kb, the system
> consumes ~10GB of memory (200MB x 24 x 2). Wasting 5GB of memory is
> usually unacceptable.
>
> Fortunately, almost all users don't use the max latency feature,
> and the max latency buffer can be disabled easily.
>
> This patch shrinks the max latency buffer when it is
> unnecessary.
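
For reference, the arithmetic in the quoted description can be checked
with a small user-space sketch. The helper name and the with_max_tr
parameter are hypothetical; they only model the
'buffer_size_kb x CPUs x 2' formula and are not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Total trace buffer memory in KB for a given per-CPU buffer_size_kb.
 * with_max_tr is nonzero when the max latency buffer is also kept at
 * full size, which doubles the total. Hypothetical helper.
 */
uint64_t trace_total_kb(uint64_t buffer_size_kb, unsigned int ncpus,
			int with_max_tr)
{
	assert(ncpus > 0);
	return buffer_size_kb * ncpus * (with_max_tr ? 2u : 1u);
}
```

With buffer_size_kb = 204800 (200MB) and 24 CPUs this gives 9830400 KB,
roughly 9.4GB, versus 4915200 KB when the max latency buffer is not
kept at full size.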

Actually, what would be better is to add a "use_max_tr" field to the
struct tracer in trace.h. Then the latency tracers (irqsoff,
preemptoff, preemptirqsoff, wakeup, and wakeup_rt) can have this field
set.

Then, we can resize or even remove the max ring buffer when the
"use_max_tr" is not set (and on bootup). On enabling a latency tracer,
we can allocate the buffer. When we enable another tracer (or nop) if
the use_max_tr is not set, then we can remove the buffer.
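
A minimal user-space sketch of that idea, assuming a simplified struct
tracer and a malloc-backed stand-in for the max ring buffer. The names
struct max_buffer and switch_tracer() are hypothetical and only
illustrate the allocate-on-enable, free-on-disable logic; the real code
would use the ring buffer API instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for struct tracer; only the proposed flag is modeled. */
struct tracer {
	const char *name;
	int use_max_tr;		/* nonzero for latency tracers */
};

/* Hypothetical stand-in for the max latency ring buffer. */
struct max_buffer {
	void *data;		/* NULL while the buffer is not allocated */
	size_t size;
};

/*
 * Switch from *current to next. The max buffer is allocated when next
 * needs it and freed when next does not. Returns 0 on success, or -1 if
 * the allocation fails, in which case *current is left unchanged.
 */
int switch_tracer(struct max_buffer *max, const struct tracer **current,
		  const struct tracer *next, size_t size)
{
	assert(max && current && next);

	if (next->use_max_tr && !max->data) {
		assert(size > 0);
		max->data = malloc(size);
		if (!max->data)
			return -1;
		max->size = size;
	} else if (!next->use_max_tr && max->data) {
		free(max->data);
		max->data = NULL;
		max->size = 0;
	}

	*current = next;
	return 0;
}
```

In this sketch, switching from "nop" to a tracer with use_max_tr set
allocates the buffer, and switching back to "nop" frees it again.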

Would you be able to do something like that?

Thanks,

-- Steve


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Lai Jiangshan on
KOSAKI Motohiro wrote:
>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro(a)jp.fujitsu.com>
> ---
> kernel/trace/trace.c | 39 ++++++++++++++++++++++++++++++------
> kernel/trace/trace.h | 1 +
> kernel/trace/trace_irqsoff.c | 3 ++
> kernel/trace/trace_sched_wakeup.c | 2 +
> 4 files changed, 38 insertions(+), 7 deletions(-)
>

Reviewed-by: Lai Jiangshan <laijs(a)cn.fujitsu.com>


> -
> +	if (current_trace && current_trace->use_max_tr) {
> +		/*
> +		 * We don't free the ring buffer; instead, we resize it because
> +		 * the max_tr ring buffer has some state (e.g. ring->clock) and
> +		 * we want to preserve it.
> +		 */
> +		ring_buffer_resize(max_tr.buffer, 1);
> +		max_tr.entries = 1;
> +	}
> 	destroy_trace_option_files(topts);
>
> 	current_trace = t;
>
> 	topts = create_trace_option_files(current_trace);

I think we can skip both resizes when current_trace->use_max_tr == 1 && t->use_max_tr == 1.
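
As a rough sketch of that check (the enum, the struct sketch_tracer
type and the helper name are hypothetical and not part of the patch):
the max_tr buffer only needs to be touched when the old and new tracers
disagree about use_max_tr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced view of a tracer: only the use_max_tr flag matters. */
struct sketch_tracer {
	const char *name;
	bool use_max_tr;
};

enum max_tr_action {
	MAX_TR_KEEP,	/* leave the max_tr buffer as it is */
	MAX_TR_SHRINK,	/* old tracer used it, new one does not */
	MAX_TR_EXPAND,	/* new tracer needs it, old one did not */
};

/*
 * Decide what to do with the max_tr buffer when switching tracers.
 * old may be NULL (no tracer set yet); next must not be NULL.
 */
enum max_tr_action max_tr_action_for(const struct sketch_tracer *old,
				     const struct sketch_tracer *next)
{
	bool old_uses = old && old->use_max_tr;

	assert(next != NULL);

	if (old_uses == next->use_max_tr)
		return MAX_TR_KEEP;	/* both resizes can be skipped */

	return next->use_max_tr ? MAX_TR_EXPAND : MAX_TR_SHRINK;
}
```

Switching between two latency tracers then yields MAX_TR_KEEP, so
neither the shrink to one entry nor the expand back to
global_trace.entries would be needed.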

> +	if (current_trace->use_max_tr) {
> +		ret = ring_buffer_resize(max_tr.buffer, global_trace.entries);
> +		if (ret < 0)
> +			goto out;
> +		max_tr.entries = global_trace.entries;
> +	}
>
> 	if (t->init) {
> 		ret = tracer_init(t, tr);

Do we need to shrink it when tracer_init() fails?
That said, tracer_init() hardly ever fails, and there is no bad effect even if we don't shrink it.
From: KOSAKI Motohiro on
2010/7/1 Lai Jiangshan <laijs(a)cn.fujitsu.com>:
> KOSAKI Motohiro wrote:
>>
>> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro(a)jp.fujitsu.com>
>> ---
>> kernel/trace/trace.c | 39 ++++++++++++++++++++++++++++++------
>> kernel/trace/trace.h | 1 +
>> kernel/trace/trace_irqsoff.c | 3 ++
>> kernel/trace/trace_sched_wakeup.c | 2 +
>> 4 files changed, 38 insertions(+), 7 deletions(-)
>>
>
> Reviewed-by: Lai Jiangshan <laijs(a)cn.fujitsu.com>
>
>
>> -
>> +	if (current_trace && current_trace->use_max_tr) {
>> +		/*
>> +		 * We don't free the ring buffer; instead, we resize it because
>> +		 * the max_tr ring buffer has some state (e.g. ring->clock) and
>> +		 * we want to preserve it.
>> +		 */
>> +		ring_buffer_resize(max_tr.buffer, 1);
>> +		max_tr.entries = 1;
>> +	}
>> 	destroy_trace_option_files(topts);
>>
>> 	current_trace = t;
>>
>> 	topts = create_trace_option_files(current_trace);
>
> I think we can skip both resizes when current_trace->use_max_tr == 1 && t->use_max_tr == 1.

Yup, but I don't think it's worthwhile because this is a rare operation.


>
>> +	if (current_trace->use_max_tr) {
>> +		ret = ring_buffer_resize(max_tr.buffer, global_trace.entries);
>> +		if (ret < 0)
>> +			goto out;
>> +		max_tr.entries = global_trace.entries;
>> +	}
>>
>> 	if (t->init) {
>> 		ret = tracer_init(t, tr);
>
> Do we need to shrink it when tracer_init() fails?
> That said, tracer_init() hardly ever fails, and there is no bad effect even if we don't shrink it.

Nope. Here is the core of tracing_set_tracer():

========================================
	if (current_trace && current_trace->reset)
		current_trace->reset(tr);

	destroy_trace_option_files(topts);

	current_trace = t;

	topts = create_trace_option_files(current_trace);

	if (t->init) {
		ret = tracer_init(t, tr);
		if (ret)
			goto out;
	}
========================================

That means that if t->init fails, we can't roll back to the old tracer
anyway. So I don't think your suggested micro-optimization would make
an observable improvement.

Thanks.