From: Steven Rostedt on
On Thu, 2010-05-13 at 18:10 -0400, Steven Rostedt wrote:
> plain text document attachment
> (0012-ring-buffer-Add-cached-pages-when-freeing-reader-pag.patch)
> From: Steven Rostedt <srostedt(a)redhat.com>
>
> When the pages are removed from the ring buffer for things like
> splice they are freed with ring_buffer_free_read_page().
> They are also allocated with ring_buffer_alloc_read_page().
>
> Currently the ring buffer does not take advantage of this situation.
> Every time the page is freed, the ring buffer simply frees it.
> When a new page is needed, it allocates it. This means that reading
> several pages with splice will cause a page to be freed and allocated
> several times. This is simply a waste.
>
> This patch adds a cache of the pages freed (16 max). This allows
> the pages to be reused quickly without need to go back to the memory
> pool.

Ah,

Silly me forgot to add locking. I was thinking that these were called
under a lock. They are in the above layers, but we can't rely on that
here.
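
For illustration only, here is a minimal userspace sketch of what a locked
version of such a page cache could look like. It is not the kernel fix (the
patch was dropped from the branch instead), and it makes several assumptions:
a pthread mutex stands in for whatever lock the ring buffer would actually
use, malloc()/free() stand in for __get_free_page()/free_page(), and the
names page_cache, cached_page, CACHE_MAX_FREE_PAGES and MODEL_PAGE_SIZE are
hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define CACHE_MAX_FREE_PAGES 16   /* mirrors RB_MAX_FREE_PAGES in the patch */
#define MODEL_PAGE_SIZE 4096      /* assumed page size for this model */

/* While a page sits in the cache, its first bytes hold the list link. */
struct cached_page {
	struct cached_page *next;
};

/* Hypothetical container; the patch keeps these fields in struct ring_buffer. */
struct page_cache {
	pthread_mutex_t lock;          /* protects free_pages and nr_free_pages */
	struct cached_page *free_pages;
	int nr_free_pages;
};

static void page_cache_init(struct page_cache *c)
{
	pthread_mutex_init(&c->lock, NULL);
	c->free_pages = NULL;
	c->nr_free_pages = 0;
}

/* Pop a cached page if one exists, otherwise fall back to the allocator. */
static void *page_cache_alloc(struct page_cache *c)
{
	struct cached_page *page;

	pthread_mutex_lock(&c->lock);
	page = c->free_pages;
	if (page) {
		c->free_pages = page->next;
		c->nr_free_pages--;
	}
	pthread_mutex_unlock(&c->lock);

	if (!page)
		page = malloc(MODEL_PAGE_SIZE);
	return page;
}

/* Push the page onto the cache, or really free it once the cache is full. */
static void page_cache_free(struct page_cache *c, void *data)
{
	struct cached_page *page = data;
	int cached = 0;

	assert(page != NULL);

	pthread_mutex_lock(&c->lock);
	if (c->nr_free_pages < CACHE_MAX_FREE_PAGES) {
		page->next = c->free_pages;
		c->free_pages = page;
		c->nr_free_pages++;
		cached = 1;
	}
	pthread_mutex_unlock(&c->lock);

	if (!cached)
		free(page);
}

/* Release every cached page; the caller guarantees no concurrent users. */
static void page_cache_destroy(struct page_cache *c)
{
	while (c->free_pages) {
		struct cached_page *page = c->free_pages;

		c->free_pages = page->next;
		free(page);
	}
	c->nr_free_pages = 0;
	pthread_mutex_destroy(&c->lock);
}
```

The point of the sketch is that the list head and the counter are only ever
touched with the lock held, while the slow allocator calls stay outside the
critical section.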

Since this is the last patch, I'll just remove it from the branch. So
you can pull this branch still, and it will not include this patch.

Thanks!

-- Steve

> Reported-by: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
> Signed-off-by: Steven Rostedt <rostedt(a)goodmis.org>
> ---
> kernel/trace/ring_buffer.c | 41 +++++++++++++++++++++++++++++++++++------
> 1 files changed, 35 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index 7f6059c..40667b2 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -157,6 +157,8 @@ static unsigned long ring_buffer_flags __read_mostly = RB_BUFFERS_ON;
>
> #define BUF_PAGE_HDR_SIZE offsetof(struct buffer_data_page, data)
>
> +#define RB_MAX_FREE_PAGES 16
> +
> /**
> * tracing_on - enable all tracing buffers
> *
> @@ -325,7 +327,10 @@ EXPORT_SYMBOL_GPL(ring_buffer_event_data);
> #define RB_MISSED_STORED (1 << 30)
>
> struct buffer_data_page {
> - u64 time_stamp; /* page time stamp */
> + union {
> + struct buffer_data_page *next; /* for free pages */
> + u64 time_stamp; /* page time stamp */
> + };
> local_t commit; /* write committed index */
> unsigned char data[]; /* data of buffer page */
> };
> @@ -472,6 +477,9 @@ struct ring_buffer {
> atomic_t record_disabled;
> cpumask_var_t cpumask;
>
> + struct buffer_data_page *free_pages;
> + int nr_free_pages;
> +
> struct lock_class_key *reader_lock_key;
>
> struct mutex mutex;
> @@ -1184,6 +1192,7 @@ EXPORT_SYMBOL_GPL(__ring_buffer_alloc);
> void
> ring_buffer_free(struct ring_buffer *buffer)
> {
> + struct buffer_data_page *bpage;
> int cpu;
>
> get_online_cpus();
> @@ -1200,6 +1209,11 @@ ring_buffer_free(struct ring_buffer *buffer)
> kfree(buffer->buffers);
> free_cpumask_var(buffer->cpumask);
>
> + while (buffer->free_pages) {
> + bpage = buffer->free_pages;
> + buffer->free_pages = bpage->next;
> + free_page((unsigned long)bpage);
> + };
> kfree(buffer);
> }
> EXPORT_SYMBOL_GPL(ring_buffer_free);
> @@ -3717,11 +3731,17 @@ void *ring_buffer_alloc_read_page(struct ring_buffer *buffer)
> struct buffer_data_page *bpage;
> unsigned long addr;
>
> - addr = __get_free_page(GFP_KERNEL);
> - if (!addr)
> - return NULL;
> + if (!buffer->free_pages) {
> + addr = __get_free_page(GFP_KERNEL);
> + if (!addr)
> + return NULL;
>
> - bpage = (void *)addr;
> + bpage = (void *)addr;
> + } else {
> + bpage = buffer->free_pages;
> + buffer->free_pages = bpage->next;
> + buffer->nr_free_pages--;
> + }
>
> rb_init_page(bpage);
>
> @@ -3738,7 +3758,16 @@ EXPORT_SYMBOL_GPL(ring_buffer_alloc_read_page);
> */
> void ring_buffer_free_read_page(struct ring_buffer *buffer, void *data)
> {
> - free_page((unsigned long)data);
> + struct buffer_data_page *bpage = data;
> +
> + if (buffer->nr_free_pages >= RB_MAX_FREE_PAGES) {
> + free_page((unsigned long)data);
> + return;
> + }
> +
> + bpage->next = buffer->free_pages;
> + buffer->free_pages = bpage;
> + buffer->nr_free_pages++;
> }
> EXPORT_SYMBOL_GPL(ring_buffer_free_read_page);
>


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Steven Rostedt on
[ Added Peter Zijlstra since he had an interest in this topic ]

On Fri, 2010-05-14 at 18:26 -0400, Mathieu Desnoyers wrote:
> * Steven Rostedt (rostedt(a)goodmis.org) wrote:

> Trying to understand the effect of splice() putting the page in the page
> cache and how it affects this patch.
>
> Basically, how do you know you can call free_page() from splice() in the
> first place? Answering this question will probably help us see if this
> page reuse is OK.

Actually, looking at the code, I'm not sure this is safe. I was thinking
that free_page() would always free the page. But I forgot that pages
have a ref count, and a get_page() will prevent a page from being freed.

So I need to revert this patch, or rather, make a core-5 that removes
this patch and fixes the spelling in the comment.

I can later add something that checks the page ref count, but that can
wait till .36.
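
To make the ref count concern concrete, here is a small userspace model of
what such a check might look like. It is only a sketch under stated
assumptions: struct model_page, struct model_cache and all model_* functions
are hypothetical stand-ins (they are not kernel APIs), the cache is
single-threaded, and references are assumed to be handed out only by someone
who already holds one. The idea is that a page goes back into the cache only
when the reader holds the sole reference; otherwise the reader just drops its
reference and the last holder frees the page.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kernel page and its reference count. */
struct model_page {
	atomic_int refcount;
	struct model_page *next_free;   /* link used only while cached */
};

/* Hypothetical cache; single-threaded in this sketch, so no lock. */
struct model_cache {
	struct model_page *free_pages;
	int nr_free_pages;
};

/* A new page starts with one reference, owned by the caller. */
static struct model_page *model_page_new(void)
{
	struct model_page *p = malloc(sizeof(*p));

	if (p) {
		atomic_init(&p->refcount, 1);
		p->next_free = NULL;
	}
	return p;
}

/* Models get_page(): another user (e.g. a pipe buffer) pins the page. */
static void model_get_page(struct model_page *p)
{
	atomic_fetch_add(&p->refcount, 1);
}

/*
 * Models dropping a reference: the page is freed only when the last
 * reference goes away. Returns true if the page was actually freed.
 */
static bool model_put_page(struct model_page *p)
{
	if (atomic_fetch_sub(&p->refcount, 1) == 1) {
		free(p);
		return true;
	}
	return false;
}

/*
 * The reader is done with the page. Cache it only if the reader holds
 * the sole reference; otherwise drop the reader's reference and leave
 * the page to its remaining users. Returns true if the page was cached.
 */
static bool model_release_read_page(struct model_cache *c,
				    struct model_page *p)
{
	assert(p != NULL);

	if (atomic_load(&p->refcount) == 1) {
		p->next_free = c->free_pages;
		c->free_pages = p;
		c->nr_free_pages++;
		return true;
	}

	model_put_page(p);
	return false;
}
```

Without the refcount test, a page still pinned by another user could be
pushed into the cache and handed out again while that user is still reading
it.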

-- Steve



