From: Andi Kleen on
Miklos Szeredi <miklos(a)szeredi.hu> writes:
> +
> +	/*
> +	 * We can shrink the pipe, if arg >= pipe->nrbufs. Since we don't
> +	 * expect a lot of shrink+grow operations, just free and allocate
> +	 * again like we would do for growing. If the pipe currently
> +	 * contains more buffers than arg, then return busy.
> +	 */
> +	if (arg < pipe->nrbufs)
> +		return -EBUSY;
> +
> +	bufs = kcalloc(arg, sizeof(struct pipe_buffer), GFP_KERNEL);

While this is conceptually like socket buffers, socket buffers
have sophisticated mechanisms to throttle their memory use
under low-memory conditions. Is there nothing comparable here?

It means every user could pin a lot of memory.

-Andi

--
ak(a)linux.intel.com -- Speaking for myself only.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Jens Axboe on
On Thu, May 20 2010, Linus Torvalds wrote:
>
>
> On Thu, 20 May 2010, Jens Axboe wrote:
> >
> > The main reason I didn't push it before is that I wasn't completely sure
> > that we wanted to use fcntl() for this. But Linus doesn't seem to object
> > on that side at least.
>
> Well, I don't object as long as there is some real use-case. If it's a
> "feature just for the sake of the feature" kind of theoretical "shouldn't
> we be able to do this" thing, I don't think we should do it.
>
> But it sounded like Miklos actually had a reason for the request.

Yeah, I'd say his 4x results pretty much speak for themselves. I've
observed splice being slower as well here for high-bandwidth testing;
the 64KB max buffer size ends up hurting there. So I'm very convinced
that this is a good idea. I have been for a while, was just waiting for
someone else to pop up with an independent need and results to help
shove it in :-)

--
Jens Axboe

From: Linus Torvalds on


On Thu, 20 May 2010, Jens Axboe wrote:
>
> The main reason I didn't push it before is that I wasn't completely sure
> that we wanted to use fcntl() for this. But Linus doesn't seem to object
> on that side at least.

Well, I don't object as long as there is some real use-case. If it's a
"feature just for the sake of the feature" kind of theoretical "shouldn't
we be able to do this" thing, I don't think we should do it.

But it sounded like Miklos actually had a reason for the request.

Linus
From: Michael Kerrisk on
Hi all,

I see that this patch has hit Linus's git, so some questions

On Wed, May 19, 2010 at 6:49 PM, Linus Torvalds
<torvalds(a)linux-foundation.org> wrote:
>
>
> On Wed, 19 May 2010, Miklos Szeredi wrote:
>>
>> One issue I see is that it's possible to grow pipes indefinitely.
>> Should this be restricted to privileged users?
>
> Yes. But perhaps only if it grows past the default (or perhaps "default*2"
> or similar). That way a normal user could shrink the pipe buffers, and
> then grow them again if he wants to.
>
> Oh, and I think you need to also require that there be at least two
> buffers. Otherwise we can't guarantee POSIX behavior, I think.
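
The policy suggested in the quoted text can be sketched roughly as below.
This is only an illustration, not the code in the patch: the function name
pipe_resize_allowed() and the PIPE_MIN_BUFFERS constant are hypothetical,
and the default of 16 buffers is an assumption based on the 64KB default
pipe size mentioned elsewhere in this thread.

```c
#include <stdbool.h>

/* Assumed default number of pipe buffers (16 pages, 64KB with 4KB pages). */
#define PIPE_DEF_BUFFERS 16
/* Assumed lower bound, following the "at least two buffers" suggestion. */
#define PIPE_MIN_BUFFERS 2

/* Hypothetical check: may a caller resize a pipe to nr_bufs buffers? */
static bool pipe_resize_allowed(unsigned int nr_bufs, bool privileged)
{
	/* Never go below the minimum, regardless of privilege. */
	if (nr_bufs < PIPE_MIN_BUFFERS)
		return false;

	/* Growing past the default is reserved for privileged callers. */
	if (nr_bufs > PIPE_DEF_BUFFERS && !privileged)
		return false;

	/* Shrinking, or growing back up to the default, is open to anyone. */
	return true;
}
```

With this shape an unprivileged user can shrink a pipe to 4 buffers and
later grow it back to 16, but cannot ask for 32.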

Is there any documentation (e.g., a man-pages patch) for these changes?

The argument of the fcntl() operations is expressed in pages. I take
it that this means that the semantics of the argument will vary
depending on the system page size? So for example, 2 on x86 will mean
8192 bytes, but will mean 32768 on ia64? That seems very weird. (And
what about architectures where the page size is switchable?) Such
changes in semantics should not be silent for the user, IMO.
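
To make the arithmetic concrete, here is a minimal sketch. The helper
names are hypothetical; the 4096-byte and 16384-byte page sizes are simply
the values implied by the x86 and ia64 figures above.

```c
#include <unistd.h>

/* Hypothetical helper: pipe capacity in bytes for a size given in pages. */
static long pipe_pages_to_bytes(long pages, long page_size)
{
	return pages * page_size;
}

/* The same page count, interpreted on whatever system runs this code. */
static long pipe_pages_to_bytes_here(long pages)
{
	return pipe_pages_to_bytes(pages, sysconf(_SC_PAGESIZE));
}
```

An argument of 2 gives pipe_pages_to_bytes(2, 4096) = 8192 with 4KB pages
and pipe_pages_to_bytes(2, 16384) = 32768 with 16KB pages, so identical
source code asks for a fourfold different capacity.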

I'm not so sure about Linus's assertion above about needing at least
two buffers (pages?) to guarantee POSIX behavior. Back in 2.6.10 and
earlier, the buffer size was a page (4096 bytes) on x86-32, and we had
POSIX-compliant behavior.

Cheers,

Michael


--
Michael Kerrisk Linux man-pages maintainer;
http://www.kernel.org/doc/man-pages/
Author of "The Linux Programming Interface", http://blog.man7.org/
From: Andrew Morton on
On Sun, 23 May 2010 07:30:01 +0200 Michael Kerrisk <mtk.manpages(a)gmail.com> wrote:

> Hi all,
>
> I see that this patch has hit Linus's git, so some questions
>
> On Wed, May 19, 2010 at 6:49 PM, Linus Torvalds
> <torvalds(a)linux-foundation.org> wrote:
> >
> >
> > On Wed, 19 May 2010, Miklos Szeredi wrote:
> >>
> >> One issue I see is that it's possible to grow pipes indefinitely.
> >> Should this be restricted to privileged users?
> >
> > Yes. But perhaps only if it grows past the default (or perhaps "default*2"
> > or similar). That way a normal user could shrink the pipe buffers, and
> > then grow them again if he wants to.
> >
> > Oh, and I think you need to also require that there be at least two
> > buffers. Otherwise we can't guarantee POSIX behavior, I think.
>
> Is there any documentation (e.g., a man-pages patch) for these changes?
>
> The argument of the fcntl() operations is expressed in pages. I take
> it that this means that the semantics of the argument will vary
> depending on the system page size? So for example, 2 on x86 will mean
> 8192 bytes, but will mean 32768 on ia64? That seems very weird. (And
> what about architectures where the page size is switchable?) Such
> changes in semantics should not be silent for the user, IMO.

Well, there is getpagesize(). But I agree - this interface is just
asking (x86) people to write non-portable code.

otoh, if the arg was in bytes, they'd just hard-code "8192". They're
clever like that.

But we have gone to some lengths to avoid exposing things like
PAGE_SIZE and HZ in procfs, so it makes sense to take the same approach
to syscalls.
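
For what it's worth, a caller who wants to stay portable with a
page-based argument would have to derive the page count at run time
instead of hard-coding it. A minimal sketch, assuming hypothetical helper
names and using sysconf(_SC_PAGESIZE) as the POSIX equivalent of
getpagesize():

```c
#include <unistd.h>

/* Hypothetical helper: smallest number of pages that can hold `bytes`. */
static long pipe_bytes_to_pages(long bytes, long page_size)
{
	/* Round up so the pipe is never smaller than requested. */
	return (bytes + page_size - 1) / page_size;
}

/* Page count for a desired capacity on the running system. */
static long pipe_pages_for(long bytes)
{
	return pipe_bytes_to_pages(bytes, sysconf(_SC_PAGESIZE));
}
```

A request for 8192 bytes then becomes 2 pages with 4KB pages and 1 page
with 16KB pages, which is the conversion a byte-based interface would do
on the caller's behalf.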
