From: Peter Zijlstra on
On Mon, 2010-05-24 at 13:31 -0700, Michel Lespinasse wrote:
> This is version 4 of my rwsem changes.
>
> Changes since V3:
>
> - Split first x86 rwsem change in two smaller parts
> - Added David's rwsem_waiter_type enum suggestion into
> 'rwsem: down_read_critical infrastructure support'
> - Lots of minor style fixes and comments clarified
>
> I would hope the entire series can be considered for inclusion;
> however if we can not agree on the down_read_critical() bits I would
> still like patches 1-7 to be independently considered as I think they
> still have merit on their own.
>
> Michel Lespinasse (11):
> x86 rwsem: stay on fast path when count>0 in __up_write()
> x86 rwsem: minor cleanups
> rwsem: fully separate code pathes to wake writers vs readers
> rwsem: lighter active count checks when waking up readers
> rwsem: let RWSEM_WAITING_BIAS represent any number of waiting threads
> rwsem: wake queued readers when writer blocks on active read lock
> rwsem: smaller wrappers around rwsem_down_failed_common
> generic rwsem: implement down_read_critical() / up_read_critical()
> rwsem: down_read_critical infrastructure support
> x86 rwsem: down_read_critical implementation
> Use down_read_critical() for /proc/<pid>/exe and /proc/<pid>/maps files


So what happened to those patches that dropped mmap_sem during I/O?

I really don't like people tinkering with the lock implementations like
this. Nor do I like the naming, stats are in no way _critical_.

I really think adding something like this utterly defeats the purpose of
having a fair lock.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Michel Lespinasse on
On Tue, May 25, 2010 at 1:47 AM, Peter Zijlstra <peterz(a)infradead.org> wrote:
> So what happened to those patches that dropped mmap_sem during I/O?

Yes, we do have patches trying to release the mmap_sem when a page
fault for a file backed VMA blocks on accessing the corresponding
file. We have not given up on these, and we intend to try submitting
them again. However, these patches do *not* address the case of a page
fault blocking while trying to get a free page (i.e. when you get
under high memory pressure).

> I really don't like people tinkering with the lock implementations like
> this. Nor do I like the naming, stats are in no way _critical_.

Critical here refers to the fact that you're not allowed to block
while holding the unfairly acquired rwsem.

--
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Peter Zijlstra on
On Tue, 2010-05-25 at 02:12 -0700, Michel Lespinasse wrote:
> On Tue, May 25, 2010 at 1:47 AM, Peter Zijlstra <peterz(a)infradead.org> wrote:
> > So what happened to those patches that dropped mmap_sem during I/O?
>
> Yes, we do have patches trying to release the mmap_sem when a page
> fault for a file backed VMA blocks on accessing the corresponding
> file. We have not given up on these, and we intend to try submitting
> them again. However, these patches do *not* address the case of a page
> fault blocking while trying to get a free page (i.e. when you get
> under high memory pressure).

But I guess they could, right? Simply make the allocation under mmap_sem
be __GFP_HARDWALL|__GFP_HIGHMEM|__GFP_MOVABLE__GFP_NOWARN or
(GFP_HUGHUSER_MOVABLE & ~(__GFP_WAIT|__GFP_IO|__GFP_FS))|__GFP_NOWARN

and drop the mmap_sem when that fails.

> > I really don't like people tinkering with the lock implementations like
> > this. Nor do I like the naming, stats are in no way _critical_.
>
> Critical here refers to the fact that you're not allowed to block
> while holding the unfairly acquired rwsem.

We usually call that atomic, your 0/n patch didn't explain any of that.

Also, do you really think doing something like:

/*
* Check the vma index is within the range and do
* sequential scan until m_index.
*/
vma = NULL;
if ((unsigned long)l < mm->map_count) {
vma = mm->mmap;
while (l-- && vma)
vma = vma->vm_next;
goto out;
}

with preemption disabled is a _good_ thing?

People were talking about raising our vma limit of 64k...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: KAMEZAWA Hiroyuki on
On Tue, 25 May 2010 11:27:55 +0200
Peter Zijlstra <peterz(a)infradead.org> wrote:

> On Tue, 2010-05-25 at 02:12 -0700, Michel Lespinasse wrote:
> > On Tue, May 25, 2010 at 1:47 AM, Peter Zijlstra <peterz(a)infradead.org> wrote:
> > > So what happened to those patches that dropped mmap_sem during I/O?
> >
> > Yes, we do have patches trying to release the mmap_sem when a page
> > fault for a file backed VMA blocks on accessing the corresponding
> > file. We have not given up on these, and we intend to try submitting
> > them again. However, these patches do *not* address the case of a page
> > fault blocking while trying to get a free page (i.e. when you get
> > under high memory pressure).
>
> But I guess they could, right? Simply make the allocation under mmap_sem
> be __GFP_HARDWALL|__GFP_HIGHMEM|__GFP_MOVABLE__GFP_NOWARN or
> (GFP_HUGHUSER_MOVABLE & ~(__GFP_WAIT|__GFP_IO|__GFP_FS))|__GFP_NOWARN
>
> and drop the mmap_sem when that fails.
>
> > > I really don't like people tinkering with the lock implementations like
> > > this. Nor do I like the naming, stats are in no way _critical_.
> >
> > Critical here refers to the fact that you're not allowed to block
> > while holding the unfairly acquired rwsem.
>
> We usually call that atomic, your 0/n patch didn't explain any of that.
>
> Also, do you really think doing something like:
>
> /*
> * Check the vma index is within the range and do
> * sequential scan until m_index.
> */
> vma = NULL;
> if ((unsigned long)l < mm->map_count) {
> vma = mm->mmap;
> while (l-- && vma)
> vma = vma->vm_next;
> goto out;
> }
>
> with preemption disabled is a _good_ thing?
>
> People were talking about raising our vma limit of 64k...

Yes, at least, my customers want over 64k vmas ;)

Hmm..can't we use something speculative lookup for reading maps ?
(as we played in several months ago...)

Thanks,
-Kame


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Michel Lespinasse on
On Tue, May 25, 2010 at 11:27:55AM +0200, Peter Zijlstra wrote:
> On Tue, 2010-05-25 at 02:12 -0700, Michel Lespinasse wrote:
> > Yes, we do have patches trying to release the mmap_sem when a page
> > fault for a file backed VMA blocks on accessing the corresponding
> > file. We have not given up on these, and we intend to try submitting
> > them again. However, these patches do *not* address the case of a page
> > fault blocking while trying to get a free page (i.e. when you get
> > under high memory pressure).
>
> But I guess they could, right? Simply make the allocation under mmap_sem
> be __GFP_HARDWALL|__GFP_HIGHMEM|__GFP_MOVABLE__GFP_NOWARN or
> (GFP_HUGHUSER_MOVABLE & ~(__GFP_WAIT|__GFP_IO|__GFP_FS))|__GFP_NOWARN
>
> and drop the mmap_sem when that fails.

It's not clear to me if this can lead to a clean uncontroversial solution.
Doing this for file backed VMAs does not sound any harder in principle,
but we could not get it past linus's NACK last time. I think it's worth
exploring again, but I don't expect it to be so easy :)

> > > I really don't like people tinkering with the lock implementations like
> > > this. Nor do I like the naming, stats are in no way _critical_.
> >
> > Critical here refers to the fact that you're not allowed to block
> > while holding the unfairly acquired rwsem.
>
> We usually call that atomic, your 0/n patch didn't explain any of that.

Would replacing the 'critical' name with 'atomic' address your concern
though, or would you remain fundamentally opposed to anything that involves
an unfair acquire path ?

What about patches 1-7 which don't deal with the critical/atomic API;
can we agree to get these in before we we figure out what to do with
the the last 4 ?

--
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/