From: James Bottomley on
On Thu, 2010-03-04 at 14:51 +0100, Pavel Machek wrote:
> > On Wed, 2010-03-03 at 21:54 +0000, Pavel Machek wrote:
> > > > With some drivers (those doing PIO) or subsystems (SCSI mass storage
> > > > over USB HCD), there is no call to flush_dcache_page() for page cache
> > > > pages, hence the ARM implementation of update_mmu_cache() doesn't flush
> > > > the D-cache (and only invalidating the I-cache doesn't help).
> > > >
> > > > The viable solutions so far:
> > > >
> > > > 1. Implement a PIO mapping API similar to the DMA API which takes
> > > > care of the D-cache flushing. This means that PIO drivers would
> > > > need to be modified to use an API like pio_kmap()/pio_kunmap()
> > > > before writing to a page cache page.
> > > > 2. Invert the meaning of PG_arch_1 to denote a clean page. This
> > > > means that by default newly allocated page cache pages are
> > > > considered dirty and even if there isn't a call to
> > > > flush_dcache_page(), update_mmu_cache() would flush the D-cache.
> > > > This is the PowerPC approach.
> > >
> > > What about option
> > >
> > > 3. Forget about PG_arch_1 and always do the flush?
> > >
> > > How big is the performance impact? Note that current code does not
> > > even *work* so working, 10% slower code will be an improvement.
> >
> > The driver fix is as simple as calling a flush_dcache_page() and I've
> > been carrying such patches in my tree for some time now. The question is
> > whether we need to do it in the driver or not (would need to update
> > Documentation/cachetlb.txt as well).
> >
> > The reason I'm not in favour of always doing the flush is that we penalise
> > DMA drivers where there is no need for extra D-cache flushing (already
> > handled by the DMA API; option 1 above is similar, just that it is meant
> > for PIO usage). An ARM patch I proposed for inverting the meaning of
> > PG_arch_1 also marks a page as clean in the dma_map_* functions.
>
> But you are not fixing driver bug, are you?

Technically, he is. In the old days, most VI architectures were high
end enough not to require PIO transfers. The only exception was an IDE
driver used by sparc, which led to the arch-specific ide in/out string
instructions, in which sparc actually did all the necessary flushing.

So no drivers other than old IDE grew up with cache flushing in the PIO
case (and almost no high end VI hardware had an IDE interface, so they
rarely got implemented in the arch layer). However, recently, with the
transition from old IDE to libata and the prevalence of ARM with more
commodity hardware, the deficiency is becoming exposed. Even the PA8000
workstations now come with an IDE CD, which means we're starting to have
problems with them as well.

> Seems like ARM has requirement other architectures do not, that is
> a) not documented anywhere
> b) causes problems
>
> You could argue that performance improvement (how big is it, anyway?)
> is worth it, but this should be agreed to by wider community...

Performance is always worth it provided we don't sacrifice correctness.
The thing which was discovered in this thread is basically that ARM is
handling deferred flushing (for D/I coherency) in a slightly different
way from everyone else ... once that's fixed, ARM will likely not have
the D/I problem, but we'll still have the libata (and other PIO systems)
D flushing issue.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Russell King - ARM Linux on
On Thu, Mar 04, 2010 at 07:51:52PM +0530, James Bottomley wrote:
> On Thu, 2010-03-04 at 14:51 +0100, Pavel Machek wrote:
> > Seems like ARM has requirement other architectures do not, that is
> > a) not documented anywhere
> > b) causes problems
> >
> > You could argue that performance improvement (how big is it, anyway?)
> > is worth it, but this should be agreed to by wider community...
>
> Performance is always worth it provided we don't sacrifice correctness.
> The thing which was discovered in this thread is basically that ARM is
> handling deferred flushing (for D/I coherency) in a slightly different
> way from everyone else ... once that's fixed, ARM will likely not have
> the D/I problem, but we'll still have the libata (and other PIO systems)
> D flushing issue.

I think you've got that backwards.

Reversing the meaning of PG_arch_1 will probably fix the D aliasing issue -
since we'll interpret '0' to mean "page is dirty, it needs flushing before
hitting userspace", whereas '1' means "page has been cleaned; there are no
aliases."
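
For readers following along, the two conventions can be compared in a small
user-space model. This is only a sketch under stated assumptions: struct
model_page, its arch_1 field and every function name below are hypothetical
stand-ins, not the kernel's actual PG_arch_1 handling, and a counter stands
in for the real D-cache flush.

```c
#include <stdbool.h>

/* Hypothetical model of a page cache page; not a kernel structure. */
struct model_page {
    bool arch_1;         /* stands in for the PG_arch_1 page flag */
    int  dcache_flushes; /* counts simulated D-cache flushes */
};

/* Current convention in this model: bit set means "dirty, flush later". */
static inline void current_flush_dcache_page(struct model_page *pg)
{
    pg->arch_1 = true;
}

static inline void current_update_mmu_cache(struct model_page *pg)
{
    if (pg->arch_1) {            /* only flushes if someone marked it */
        pg->dcache_flushes++;
        pg->arch_1 = false;
    }
}

/* Inverted convention: bit set means "clean", so a new page (0) is dirty. */
static inline void inverted_flush_dcache_page(struct model_page *pg)
{
    pg->arch_1 = false;          /* mark as needing a flush */
}

static inline void inverted_update_mmu_cache(struct model_page *pg)
{
    if (!pg->arch_1) {           /* 0 means dirty: flush before userspace */
        pg->dcache_flushes++;
        pg->arch_1 = true;       /* page has been cleaned */
    }
}

/* A PIO fill that forgets to call flush_dcache_page(): flag is untouched. */
static inline void pio_fill_without_flush(struct model_page *pg)
{
    (void)pg;
}
```

In this model, a zero-initialised page filled by pio_fill_without_flush() is
flushed on its first mapping only under the inverted convention; under the
current one the flush is skipped. The model covers the D-cache side only.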

This does not address the I/D coherency issue, where the Icache needs
attention to get rid of speculatively loaded cache lines while old data
was present in the cache.
From: Catalin Marinas on
On Thu, 2010-03-04 at 14:27 +0000, Russell King - ARM Linux wrote:
> On Thu, Mar 04, 2010 at 07:51:52PM +0530, James Bottomley wrote:
> > On Thu, 2010-03-04 at 14:51 +0100, Pavel Machek wrote:
> > > Seems like ARM has requirement other architectures do not, that is
> > > a) not documented anywhere
> > > b) causes problems
> > >
> > > You could argue that performance improvement (how big is it, anyway?)
> > > is worth it, but this should be agreed to by wider community...
> >
> > Performance is always worth it provided we don't sacrifice correctness.
> > The thing which was discovered in this thread is basically that ARM is
> > handling deferred flushing (for D/I coherency) in a slightly different
> > way from everyone else ... once that's fixed, ARM will likely not have
> > the D/I problem, but we'll still have the libata (and other PIO systems)
> > D flushing issue.
>
> I think you've got that backwards.
>
> Reversing the meaning of PG_arch_1 will probably fix the D aliasing issue -
> since we'll interpret '0' to mean "page is dirty, it needs flushing before
> hitting userspace", whereas '1' means "page has been cleaned; there are no
> aliases."
>
> This does not address the I/D coherency issue, where the Icache needs
> attention to get rid of speculatively loaded cache lines while old data
> was present in the cache.

The I-cache flushing is already handled in update_mmu_cache (or
set_pte_at in a future patch; I'm not talking about other issues on
ARM11MPCore here).

We always invalidate the I-cache currently (since we may have DMA
transfers and the page's D-cache is clean). As an optimisation, we could
use PG_arch_2 for I-cache but I don't think there is much performance
benefit compared to always invalidating the I-cache.

My understanding from this long discussion is that we cannot get the
kernel modifying a page cache page which is already mapped in user space
(well, ptrace does this but we flush the cache there already).

--
Catalin

From: Catalin Marinas on
On Thu, 2010-03-04 at 13:51 +0000, Pavel Machek wrote:
> > On Wed, 2010-03-03 at 21:54 +0000, Pavel Machek wrote:
> > > > With some drivers (those doing PIO) or subsystems (SCSI mass storage
> > > > over USB HCD), there is no call to flush_dcache_page() for page cache
> > > > pages, hence the ARM implementation of update_mmu_cache() doesn't flush
> > > > the D-cache (and only invalidating the I-cache doesn't help).
> > > >
> > > > The viable solutions so far:
> > > >
> > > > 1. Implement a PIO mapping API similar to the DMA API which takes
> > > > care of the D-cache flushing. This means that PIO drivers would
> > > > need to be modified to use an API like pio_kmap()/pio_kunmap()
> > > > before writing to a page cache page.
> > > > 2. Invert the meaning of PG_arch_1 to denote a clean page. This
> > > > means that by default newly allocated page cache pages are
> > > > considered dirty and even if there isn't a call to
> > > > flush_dcache_page(), update_mmu_cache() would flush the D-cache.
> > > > This is the PowerPC approach.
> > >
> > > What about option
> > >
> > > 3. Forget about PG_arch_1 and always do the flush?
> > >
> > > How big is the performance impact? Note that current code does not
> > > even *work* so working, 10% slower code will be an improvement.
> >
> > The driver fix is as simple as calling a flush_dcache_page() and I've
> > been carrying such patches in my tree for some time now. The question is
> > whether we need to do it in the driver or not (would need to update
> > Documentation/cachetlb.txt as well).
> >
> > The reason I'm not in favour of always doing the flush is that we penalise
> > DMA drivers where there is no need for extra D-cache flushing (already
> > handled by the DMA API; option 1 above is similar, just that it is meant
> > for PIO usage). An ARM patch I proposed for inverting the meaning of
> > PG_arch_1 also marks a page as clean in the dma_map_* functions.
>
> But you are not fixing driver bug, are you?

Some drivers I fixed already: db8516f61b481e8, 2d68b7fe55d9e19.
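
As a generic illustration of a driver-side fix of this kind (not the
content of those commits), here is a user-space sketch. Everything in it is
an assumption made for the example: struct pio_page, model_pio_read() and
model_flush_dcache_page() are hypothetical, with one buffer standing in for
data sitting in the D-cache and another for what a user mapping would see.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 16

/* Hypothetical model of a page: "cache" holds CPU writes made through the
 * kernel mapping, "ram" is what a user mapping would observe. */
struct pio_page {
    unsigned char cache[MODEL_PAGE_SIZE];
    unsigned char ram[MODEL_PAGE_SIZE];
    int cache_dirty;
};

/* Stand-in for flush_dcache_page(): write dirty cache contents back. */
static inline void model_flush_dcache_page(struct pio_page *pg)
{
    if (pg->cache_dirty) {
        memcpy(pg->ram, pg->cache, MODEL_PAGE_SIZE);
        pg->cache_dirty = 0;
    }
}

/* PIO read: the CPU copies device data into the page, so the data lands
 * in the cache. The flush argument selects the fixed or unfixed driver. */
static inline void model_pio_read(struct pio_page *pg,
                                  const unsigned char *fifo,
                                  size_t len, int flush)
{
    if (len > MODEL_PAGE_SIZE)
        len = MODEL_PAGE_SIZE;   /* never write past the model page */
    memcpy(pg->cache, fifo, len);
    pg->cache_dirty = 1;
    if (flush)
        model_flush_dcache_page(pg); /* the one-line driver fix */
}
```

Without the flush, the data stays in the cache buffer and the ram buffer
still holds the old contents, which is the failure mode in this model.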

> Seems like ARM has requirement other architectures do not, that is
> a) not documented anywhere
> b) causes problems

Well, ARM is pretty similar to other architectures in this respect. And
I'm sure other architectures have similar problems, except that they only
become visible in circumstances they may not have encountered (i.e.
PIO drivers + a filesystem that doesn't call flush_dcache_page, like ext*).
Some other architectures may do heavier flushing.

Of course, a Documentation/arm/cachetlb.txt file would make sense.

--
Catalin

From: Russell King - ARM Linux on
On Thu, Mar 04, 2010 at 03:25:23PM +0000, Catalin Marinas wrote:
> On Thu, 2010-03-04 at 14:27 +0000, Russell King - ARM Linux wrote:
> > On Thu, Mar 04, 2010 at 07:51:52PM +0530, James Bottomley wrote:
> > > On Thu, 2010-03-04 at 14:51 +0100, Pavel Machek wrote:
> > > > Seems like ARM has requirement other architectures do not, that is
> > > > a) not documented anywhere
> > > > b) causes problems
> > > >
> > > > You could argue that performance improvement (how big is it, anyway?)
> > > > is worth it, but this should be agreed to by wider community...
> > >
> > > Performance is always worth it provided we don't sacrifice correctness.
> > > The thing which was discovered in this thread is basically that ARM is
> > > handling deferred flushing (for D/I coherency) in a slightly different
> > > way from everyone else ... once that's fixed, ARM will likely not have
> > > the D/I problem, but we'll still have the libata (and other PIO systems)
> > > D flushing issue.
> >
> > I think you've got that backwards.
> >
> > Reversing the meaning of PG_arch_1 will probably fix the D aliasing issue -
> > since we'll interpret '0' to mean "page is dirty, it needs flushing before
> > hitting userspace", whereas '1' means "page has been cleaned; there are no
> > aliases."
> >
> > This does not address the I/D coherency issue, where the Icache needs
> > attention to get rid of speculatively loaded cache lines while old data
> > was present in the cache.
>
> The I-cache flushing is already handled in update_mmu_cache (or
> set_pte_at in a future patch; I'm not talking about other issues on
> ARM11MPCore here).

You may not have been; my message was addressed to James to correct
his message, which seems to have the issues confused.