From: Chris Wilson on
On Tue, 11 May 2010 10:48:18 -0400, Andrew Morton <akpm(a)linux-foundation.org> wrote:
>
> On Tue, 11 May 2010 17:10:53 +0100 Chris Wilson <chris(a)chris-wilson.co.uk> wrote:
>
> > On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput <jaswinderlinux(a)gmail.com> wrote:
> > > Hello,
> > >
> > > With the latest git kernel, I am getting the following DRM error and
> > > X Windows does not start:
> >
> > [snip]
> >
> > Hmm, there are still patches for capturing error state that haven't gone
> > upstream, shame on me.
> >
> > That error is a secondary issue to the GPU hang that is being reported. If
> > it is a regression caused by a kernel update it would be very useful if
> > you could bisect to the erroneous commit.
>
> It helps if one reads the code and the trace...
>
> i915_error_object_create() is using KM_USER0 from softirq context.
> That's a bug, and a pretty serious one. If some innocent civilian is
> writing highmem data to disk and this timer interrupt fires and trashes
> his KM_USER0 slot, the disk contents will be corrupted.
>
> Something like this...
>
> --- a/drivers/gpu/drm/i915/i915_irq.c~a
> +++ a/drivers/gpu/drm/i915/i915_irq.c
> @@ -456,11 +456,15 @@ i915_error_object_create(struct drm_devi
>
>  	for (page = 0; page < page_count; page++) {
>  		void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> +		unsigned long flags;
> +
>  		if (d == NULL)
>  			goto unwind;
> -		s = kmap_atomic(src_priv->pages[page], KM_USER0);
> +		local_irq_save(flags);
> +		s = kmap_atomic(src_priv->pages[page], KM_IRQ0);
>  		memcpy(d, s, PAGE_SIZE);
> -		kunmap_atomic(s, KM_USER0);
> +		kunmap_atomic(s, KM_IRQ0);
> +		local_irq_restore(flags);
>  		dst->pages[page] = d;
>  	}
>  	dst->page_count = page_count;
> _
>
> Please let's get a tested fix for this into 2.6.34.

The change that I actually want is to replace the kmap_atomic(cpu_page) with an
io_mapping_map_atomic_wc(gtt_page): if there is any incoherency between
the CPU and the GPU, we want to record what the GPU actually executed. Do you
know whether similar precautions are required with io_mapping_map_atomic_wc()?

--
Chris Wilson, Intel Open Source Technology Centre
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
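
Andrew Morton's diagnosis quoted in the message above (an interrupt-time
user of KM_USER0 trashing a slot that process context is still using) can be
illustrated with a small userspace model. This is only a sketch, not kernel
code: every model_* name and MODEL_* constant is hypothetical, a "slot" is
reduced to a pointer that records which page is currently mapped, and the
model leaves the interrupt's mapping in place after it finishes, which is a
simplification.

```c
#include <assert.h>
#include <string.h>

#define MODEL_PAGE_SIZE 8

/* Hypothetical stand-ins for the KM_USER0 / KM_IRQ0 slot indices. */
enum model_slot { MODEL_USER0, MODEL_IRQ0, MODEL_NR_SLOTS };

/* slot -> page currently mapped there (models one CPU's atomic-kmap slots) */
static char *model_slots[MODEL_NR_SLOTS];

/* Installing a mapping silently replaces whatever the slot held before. */
static void model_map(enum model_slot slot, char *page)
{
	model_slots[slot] = page;
}

/* All accesses go through the slot, as they would through its fixed address. */
static void model_write(enum model_slot slot, const char *data)
{
	assert(model_slots[slot] != NULL);
	memcpy(model_slots[slot], data, MODEL_PAGE_SIZE);
}

/* Interrupt-time snapshot of a page, taken through the given slot. */
static void model_irq_capture(enum model_slot slot, char *gpu_page,
			      char *snapshot)
{
	model_map(slot, gpu_page);
	memcpy(snapshot, model_slots[slot], MODEL_PAGE_SIZE);
}

/*
 * Process context maps a file page in MODEL_USER0, is interrupted, then
 * writes. Returns 1 if the file page received the data, 0 if the write
 * was misdirected because the interrupt reused the slot.
 */
static int model_disk_write(enum model_slot irq_slot)
{
	char file_page[MODEL_PAGE_SIZE] = "old....";
	char gpu_page[MODEL_PAGE_SIZE] = "gpu....";
	char snapshot[MODEL_PAGE_SIZE];

	model_map(MODEL_USER0, file_page);               /* process context */
	model_irq_capture(irq_slot, gpu_page, snapshot); /* timer fires here */
	model_write(MODEL_USER0, "new....");             /* process resumes */

	return memcmp(file_page, "new....", MODEL_PAGE_SIZE) == 0;
}
```

In this model, model_disk_write(MODEL_USER0) returns 0 because the write
lands in the page the interrupt mapped, while model_disk_write(MODEL_IRQ0)
returns 1 because the interrupt used a slot of its own.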
From: Chris Wilson on
On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton <akpm(a)linux-foundation.org> wrote:
> No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
> it hardwires use of KM_USER0. I suggest that io_mapping_create_wc(),
> io_mapping_map_atomic_wc() etc be changed so that the caller passes in the
> KM_foo kmap slot index.

Argh, sorry for the noise, read the mail in the wrong order. Thanks for
the review. It would be sensible to go with your simpler patch whilst
io_mapping_map_atomic_wc() is improved.

--
Chris Wilson, Intel Open Source Technology Centre
From: Andrew Morton on
On Tue, 11 May 2010 19:52:31 +0100
Chris Wilson <chris(a)chris-wilson.co.uk> wrote:

> On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton <akpm(a)linux-foundation.org> wrote:
> > No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
> > it hardwires use of KM_USER0. I suggest that io_mapping_create_wc(),
> > io_mapping_map_atomic_wc() etc be changed so that the caller passes in the
> > KM_foo kmap slot index.
>
> Argh, sorry for the noise, read the mail in the wrong order. Thanks for
> the review. It would be sensible to go with your simpler patch whilst
> io_mapping_map_atomic_wc() is improved.

OK. I'll be sending a bunch of fixes Linuswards in an hour or two.
Should I include this?


Subject: drivers/gpu/drm/i915/i915_irq.c:i915_error_object_create(): use correct kmap-atomic slot
From: Andrew Morton <akpm(a)linux-foundation.org>

i915_error_object_create() is called from the timer interrupt and hence
can corrupt the KM_USER0 slot. Use KM_IRQ0 instead.

Reported-by: Jaswinder Singh Rajput <jaswinderlinux(a)gmail.com>
Tested-by: Jaswinder Singh Rajput <jaswinderlinux(a)gmail.com>
Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Dave Airlie <airlied(a)linux.ie>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

drivers/gpu/drm/i915/i915_irq.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff -puN drivers/gpu/drm/i915/i915_irq.c~drivers-gpu-drm-i915-i915_irqc-i915_error_object_create-use-correct-kmap-atomic-slot drivers/gpu/drm/i915/i915_irq.c
--- a/drivers/gpu/drm/i915/i915_irq.c~drivers-gpu-drm-i915-i915_irqc-i915_error_object_create-use-correct-kmap-atomic-slot
+++ a/drivers/gpu/drm/i915/i915_irq.c
@@ -461,11 +461,15 @@ i915_error_object_create(struct drm_devi

 	for (page = 0; page < page_count; page++) {
 		void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
+		unsigned long flags;
+
 		if (d == NULL)
 			goto unwind;
-		s = kmap_atomic(src_priv->pages[page], KM_USER0);
+		local_irq_save(flags);
+		s = kmap_atomic(src_priv->pages[page], KM_IRQ0);
 		memcpy(d, s, PAGE_SIZE);
-		kunmap_atomic(s, KM_USER0);
+		kunmap_atomic(s, KM_IRQ0);
+		local_irq_restore(flags);
 		dst->pages[page] = d;
 	}
 	dst->page_count = page_count;
_

From: Chris Wilson on
On Tue, 11 May 2010 12:10:01 -0700, Andrew Morton <akpm(a)linux-foundation.org> wrote:
> On Tue, 11 May 2010 19:52:31 +0100
> Chris Wilson <chris(a)chris-wilson.co.uk> wrote:
>
> > On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton <akpm(a)linux-foundation.org> wrote:
> > > No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
> > > it hardwires use of KM_USER0. I suggest that io_mapping_create_wc(),
> > > io_mapping_map_atomic_wc() etc be changed so that the caller passes in the
> > > KM_foo kmap slot index.
> >
> > Argh, sorry for the noise, read the mail in the wrong order. Thanks for
> > the review. It would be sensible to go with your simpler patch whilst
> > io_mapping_map_atomic_wc() is improved.
>
> OK. I'll be sending a bunch of fixes Linuswards in an hour or two.
> Should I include this?

Yes.

Acked-by: Chris Wilson <chris(a)chris-wilson.co.uk>

--
Chris Wilson, Intel Open Source Technology Centre
From: Dave Airlie on
On Wed, May 12, 2010 at 5:57 AM, Chris Wilson <chris(a)chris-wilson.co.uk> wrote:
> On Tue, 11 May 2010 12:10:01 -0700, Andrew Morton <akpm(a)linux-foundation.org> wrote:
>> On Tue, 11 May 2010 19:52:31 +0100
>> Chris Wilson <chris(a)chris-wilson.co.uk> wrote:
>>
>> > On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton <akpm(a)linux-foundation.org> wrote:
>> > > No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
>> > > it hardwires use of KM_USER0. I suggest that io_mapping_create_wc(),
>> > > io_mapping_map_atomic_wc() etc be changed so that the caller passes in the
>> > > KM_foo kmap slot index.
>> >
>> > Argh, sorry for the noise, read the mail in the wrong order. Thanks for
>> > the review. It would be sensible to go with your simpler patch whilst
>> > io_mapping_map_atomic_wc() is improved.
>>
>> OK. I'll be sending a bunch of fixes Linuswards in an hour or two.
>> Should I include this?
>
> Yes.
>
> Acked-by: Chris Wilson <chris(a)chris-wilson.co.uk>
>

I'm not sure pushing this in at this point is a good idea. If I'm
reading it correctly, we have no idea what KM_IRQ is being used for, and
this codepath is called from non-irq contexts just as much as from irq
contexts.

I'd rather we just back out the hangcheck stuff touching copies altogether
at this point, and try again later, doing it properly with a slow work or
something.

Dave.
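
The patch discussed in this thread brackets the KM_IRQ0 mapping with
local_irq_save()/local_irq_restore(). A rough userspace sketch of why that
bracketing matters when the same path also runs with interrupts enabled
follows. It is a model under stated assumptions, not kernel code: all
model_* names are hypothetical, "interrupts" are a pending flag that is
checked at one explicit point, and the other interrupt handler that shares
the slot is invented for illustration.

```c
#include <assert.h>
#include <string.h>

#define MODEL_PAGE_SIZE 8

/* Page currently mapped in the modelled KM_IRQ0 slot. */
static const char *model_irq0_slot;
static int model_irqs_enabled = 1;
static int model_irq_pending;
static const char model_other_page[MODEL_PAGE_SIZE] = "other..";

/* Hypothetical unrelated interrupt handler that also uses the slot. */
static void model_other_irq(void)
{
	model_irq0_slot = model_other_page;
}

/* Called wherever the modelled CPU could take an interrupt. */
static void model_irq_point(void)
{
	if (model_irqs_enabled && model_irq_pending) {
		model_irq_pending = 0;
		model_other_irq();
	}
}

/* Modelled local_irq_save(): disable interrupts, return previous state. */
static int model_irq_save(void)
{
	int flags = model_irqs_enabled;

	model_irqs_enabled = 0;
	return flags;
}

/* Modelled local_irq_restore(): a deferred interrupt may now run. */
static void model_irq_restore(int flags)
{
	model_irqs_enabled = flags;
	model_irq_point();
}

/*
 * Copy one page through the modelled KM_IRQ0 slot from process context.
 * Returns 1 if the copy matches src, 0 if an interrupt remapped the slot
 * between the map and the copy.
 */
static int model_capture(const char *src, int protect)
{
	char dst[MODEL_PAGE_SIZE];
	int flags = 0;

	model_irqs_enabled = 1;
	model_irq_pending = 1;		/* an interrupt is about to arrive */

	if (protect)
		flags = model_irq_save();
	model_irq0_slot = src;		/* stands in for kmap_atomic(.., KM_IRQ0) */
	model_irq_point();		/* window between map and copy */
	assert(model_irq0_slot != NULL);
	memcpy(dst, model_irq0_slot, MODEL_PAGE_SIZE);
	if (protect)
		model_irq_restore(flags);

	return memcmp(dst, src, MODEL_PAGE_SIZE) == 0;
}
```

In this model, model_capture("gpu....", 0) returns 0 because the pending
interrupt remaps the slot before the copy, and model_capture("gpu....", 1)
returns 1 because the interrupt is held off until after the copy.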