From: Benjamin Herrenschmidt on
On Wed, 2010-03-03 at 12:47 +0900, FUJITA Tomonori wrote:
> The ways to improve the approach (introducing PG_arch_2 or marking a
> page clean on dma_unmap_* with DMA_FROM_DEVICE like ia64 does) is up
> to architectures.

How does the above work ? IE, the dma unmap will flush the D side but
not the I side ... or is the ia64 flush primitive magic enough to do
both ?

Cheers,
Ben.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: James Bottomley on
On Wed, 2010-03-03 at 16:10 +1100, Benjamin Herrenschmidt wrote:
> On Wed, 2010-03-03 at 12:47 +0900, FUJITA Tomonori wrote:
> > The ways to improve the approach (introducing PG_arch_2 or marking a
> > page clean on dma_unmap_* with DMA_FROM_DEVICE like ia64 does) is up
> > to architectures.
>
> How does the above work ? IE, the dma unmap will flush the D side but
> not the I side ... or is the ia64 flush primitive magic enough to do
> both ?

The point is that in a well regulated system, the I cache shouldn't need
extra flushing in the kernel. We should only be faulting in R-X pages.
If we're operating on RWX pages (i.e. self modifying code), it's the job
of userspace to keep I/D coherency.
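
A minimal sketch of what that userspace responsibility could look like, assuming a GCC- or Clang-compatible compiler that provides __builtin___clear_cache. The helper name sketch_install_code is hypothetical, and the caller is assumed to have already mapped dst writable and executable:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy freshly generated instructions into a buffer
 * and ask the compiler runtime to make the instruction side see them.
 * Mapping dst with the right permissions is left to the caller.
 */
void *sketch_install_code(void *dst, const void *src, size_t len)
{
    char *begin = dst;

    /* The stores go through the D side. */
    memcpy(begin, src, len);

    /* Compiler builtin; it does whatever the target needs to make
     * [begin, begin + len) coherent for instruction fetch. */
    __builtin___clear_cache(begin, begin + len);

    return dst;
}
```

On targets with hardware-coherent I and D caches the builtin may compile to nothing, so the call costs little where it is not needed.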

So the only case the kernel needs to worry about is the R-X fault case
for executable text code.

James


From: FUJITA Tomonori on
On Wed, 03 Mar 2010 16:10:32 +1100
Benjamin Herrenschmidt <benh(a)kernel.crashing.org> wrote:

> On Wed, 2010-03-03 at 12:47 +0900, FUJITA Tomonori wrote:
> > The ways to improve the approach (introducing PG_arch_2 or marking a
> > page clean on dma_unmap_* with DMA_FROM_DEVICE like ia64 does) is up
> > to architectures.
>
> How does the above work ? IE, the dma unmap will flush the D side but
> not the I side ... or is the ia64 flush primitive magic enough to do
> both ?

On the ia64 platform, the I (and D) cache is coherent with the memory
that you did DMA to, I think. But it's better to ask an ia64 guru. :)
From: Russell King - ARM Linux on
On Wed, Mar 03, 2010 at 11:10:09AM +0530, James Bottomley wrote:
> On Wed, 2010-03-03 at 16:10 +1100, Benjamin Herrenschmidt wrote:
> > On Wed, 2010-03-03 at 12:47 +0900, FUJITA Tomonori wrote:
> > > The ways to improve the approach (introducing PG_arch_2 or marking a
> > > page clean on dma_unmap_* with DMA_FROM_DEVICE like ia64 does) is up
> > > to architectures.
> >
> > How does the above work ? IE, the dma unmap will flush the D side but
> > not the I side ... or is the ia64 flush primitive magic enough to do
> > both ?
>
> The point is that in a well regulated system, the I cache shouldn't need
> extra flushing in the kernel. We should only be faulting in R-X pages.

James, that's a pipedream. If you have a processor which doesn't support
NX, then the kernel marks all regions executable, even if the app only
asks for RW protection.

You end up with the protection masks always having VM_EXEC set in them,
so there's no way to distinguish from the kernel POV which pages are
going to be executed and those which aren't.

And if you can't do that, you have to _always_ flush the I cache for
every page fault, because you don't know if the I cache is out of sync
with the page that you've just read in from disk - and therefore you
may end up executing bad code instead of the glibc text that was
intended.
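
The consequence can be modelled in a few lines of C. This is only a sketch of the behaviour described here, not kernel code: the NX_PROT_* constants and both function names are hypothetical, and the model assumes the exec bit is forced on for every region when the CPU lacks NX.

```c
#include <stdbool.h>

/* Hypothetical protection bits for this model only. */
enum { NX_PROT_READ = 1, NX_PROT_WRITE = 2, NX_PROT_EXEC = 4 };

/* Without NX, every region is treated as executable, whatever the
 * application asked for (assumption taken from the text above). */
static unsigned int nx_effective_prot(unsigned int requested, bool cpu_has_nx)
{
    if (!cpu_has_nx)
        return requested | NX_PROT_EXEC;
    return requested;
}

/* A flush policy keyed on the effective exec bit: on a CPU without NX
 * it returns true for every fault, since the bit is always set. */
static bool nx_flush_on_fault(unsigned int requested, bool cpu_has_nx)
{
    return (nx_effective_prot(requested, cpu_has_nx) & NX_PROT_EXEC) != 0;
}
```

In this model a plain RW mapping triggers a flush on a CPU without NX, but not on one with NX.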

So here's the question: in a system where the responsibility for I-cache
flushing is in userspace, how do you ensure that you can execute code
in userspace to do this I-cache flushing without first having flushed
the (speculatively prefetching) I-cache?
From: James Bottomley on
On Wed, 2010-03-03 at 09:36 +0000, Russell King - ARM Linux wrote:
> On Wed, Mar 03, 2010 at 11:10:09AM +0530, James Bottomley wrote:
> > On Wed, 2010-03-03 at 16:10 +1100, Benjamin Herrenschmidt wrote:
> > > On Wed, 2010-03-03 at 12:47 +0900, FUJITA Tomonori wrote:
> > > > The ways to improve the approach (introducing PG_arch_2 or marking a
> > > > page clean on dma_unmap_* with DMA_FROM_DEVICE like ia64 does) is up
> > > > to architectures.
> > >
> > > How does the above work ? IE, the dma unmap will flush the D side but
> > > not the I side ... or is the ia64 flush primitive magic enough to do
> > > both ?
> >
> > The point is that in a well regulated system, the I cache shouldn't need
> > extra flushing in the kernel. We should only be faulting in R-X pages.
>
> James, that's a pipedream. If you have a processor which doesn't support
> NX, then the kernel marks all regions executable, even if the app only
> asks for RW protection.

I'm not talking about what the processor supports ... I'm talking about
what the user sets on the VMA. My point is that the kernel only has
responsibility in specific situations ... it's those paths we do the I/D
coherency on.

> You end up with the protection masks always having VM_EXEC set in them,
> so there's no way to distinguish from the kernel POV which pages are
> going to be executed and those which aren't.

I think you're talking about the pte page flags, I'm talking about the
VMA ones above.

> And if you can't do that, you have to _always_ flush the I cache for
> every page fault, because you don't know if the I cache is out of sync
> with the page that you've just read in from disk - and therefore you
> may end up executing bad code instead of the glibc text that was
> intended.

If you're handling a not-present fault in an executable VMA region, I
agree ... since that's the start of the lifecycle, where we have to begin
with I/D coherency.

> So here's the question: in a system where the responsibility for I-cache
> flushing is in userspace, how do you ensure that you can execute code
> in userspace to do this I-cache flushing without first having flushed
> the (speculatively prefetching) I-cache?

I'm not saying the common path (faulting in text sections) is the
responsibility of user space. I'm saying the uncommon path, write
modification of binaries, is. So the kernel only needs to worry about
the ordinary text fault path.
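
Roughly, the rule could be sketched like this. It is a standalone model, not kernel code: the struct and flag names are hypothetical stand-ins, and the flags are assumed to be the ones the user set on the VMA rather than the pte bits.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the permissions the user set on the VMA. */
enum { SKETCH_VM_READ = 1, SKETCH_VM_WRITE = 2, SKETCH_VM_EXEC = 4 };

struct sketch_vma {
    unsigned int user_flags;  /* what userspace asked for on the mapping */
};

struct sketch_fault {
    bool pte_present;         /* false for a not-present fault */
};

/* True when, under this model, the kernel has to make I and D coherent
 * before mapping the page: a not-present fault in an executable VMA. */
static bool sketch_kernel_must_sync(const struct sketch_vma *vma,
                                    const struct sketch_fault *fault)
{
    if (fault->pte_present)
        return false;  /* page already mapped; later writes are userspace's job */

    return (vma->user_flags & SKETCH_VM_EXEC) != 0;
}
```

A not-present fault on an R-X text mapping returns true here; a fault on a plain RW data mapping returns false.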

James

