From: Zach Pfeffer on
On Thu, Jul 15, 2010 at 09:55:35AM +0100, Russell King - ARM Linux wrote:
> On Wed, Jul 14, 2010 at 06:29:58PM -0700, Zach Pfeffer wrote:
> > The VCM ensures that all mappings that map a given physical buffer:
> > IOMMU mappings, CPU mappings and one-to-one device mappings all map
> > that buffer using the same (or compatible) attributes. At this point
> > the only attribute that users can pass is CACHED. In the absence of
> > CACHED all accesses go straight through to the physical memory.
>
> So what you're saying is that if I have a buffer in kernel space
> which I already have its virtual address, I can pass this to VCM and
> tell it !CACHED, and it'll setup another mapping which is not cached
> for me?

Not quite. The existing mapping will be represented by a reservation
from the prebuilt VCM of the VM. This reservation has been marked
non-cached. Another reservation on an IOMMU VCM, also marked
non-cached, will be backed with the same physical memory. This is
legal on ARM, allowing the vcm_back call to succeed. If you instead
passed CACHED on the second mapping, the first mapping would be
non-cached and the second would be cached. If the underlying
architecture supported that combination, then the vcm_back would go
through.
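
To make that concrete, here is a rough sketch of the call sequence
(the names follow the VCM RFC postings; treat this as illustrative,
not the final API):

	/*
	 * Illustrative only: vcm_create(), vcm_reserve(), vcm_back() and
	 * the physical allocator follow the VCM RFC naming and may not
	 * match the final API.
	 */
	struct vcm *iommu_vcm = vcm_create(dev_base, dev_len);

	/* Reserve device virtual space, marked non-cached. */
	struct res *res = vcm_reserve(iommu_vcm, SZ_1M, 0 /* !CACHED */);

	/* Hypothetical allocator for the shared physical memory. */
	struct physmem *mem = vcm_phys_alloc(SZ_1M, 0);

	/* vcm_back() is where attribute compatibility against existing
	 * mappings of the same physical memory is checked; a cached/
	 * non-cached mix would fail here on architectures that forbid it. */
	if (vcm_back(res, mem))
		return -EINVAL;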

>
> You are aware that multiple V:P mappings for the same physical page
> with different attributes are being outlawed with ARMv6 and ARMv7
> due to speculative prefetching. The cache can be searched even for
> a mapping specified as 'normal, uncached' and you can get cache hits
> because the data has been speculatively loaded through a separate
> cached mapping of the same physical page.

I didn't know that. Thanks for the heads up.

> FYI, during the next merge window, I will be pushing a patch which makes
> ioremap() of system RAM fail, which should be the last core code creator
> of mappings with different memory types. This behaviour has been outlawed
> (as unpredictable) in the architecture specification and does cause
> problems on some CPUs.

That's fair enough, but it seems like it should only be outlawed for
those processors on which it breaks.
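
For anyone following along, my understanding is that the guard amounts
to something like this in the ARM ioremap path (a sketch, not the
exact diff):

	/* Sketch: refuse to ioremap() kernel-managed RAM, since that would
	 * create a second mapping with device attributes over memory the
	 * kernel already maps cacheable. */
	if (WARN_ON(pfn_valid(pfn)))
		return NULL;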

>
> We've also the issue of multiple mappings with differing cache attributes
> which needs addressing too...

The VCM has been architected to handle these things.
From: Michael Bohan on

On 7/16/2010 12:58 AM, Russell King - ARM Linux wrote:

> The patch has been out for RFC since early April on the linux-arm-kernel
> mailing list (Subject: [RFC] Prohibit ioremap() on kernel managed RAM),
> and no comments have come back from Qualcomm folk.

Would it be unreasonable to allow a map request to succeed if the
requested attributes matched those of the preexisting mapping?
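
Something along these lines, say (hypothetical sketch; MT_MEMORY
stands in for whatever type the kernel direct map already uses):

	/* Hypothetical: reject the remap only when the requested type
	 * differs from the kernel's existing mapping of that RAM, so no
	 * conflicting-attribute alias is ever created. */
	if (pfn_valid(pfn) && mtype != MT_MEMORY)
		return NULL;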

Michael

--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum
From: Zach Pfeffer on
On Fri, Jul 16, 2010 at 08:58:56AM +0100, Russell King - ARM Linux wrote:
> On Thu, Jul 15, 2010 at 08:48:36PM -0400, Tim HRM wrote:
> > Interesting, since I seem to remember the MSM devices mostly conduct
> > IO through regions of normal RAM, largely accomplished through
> > ioremap() calls.
> >
> > Without more public domain documentation of the MSM chips and AMSS
> > interfaces I wouldn't know how to avoid this, but I can imagine it
> > creates a bit of urgency for Qualcomm developers as they attempt to
> > upstream support for this most interesting SoC.
>
> The patch has been out for RFC since early April on the linux-arm-kernel
> mailing list (Subject: [RFC] Prohibit ioremap() on kernel managed RAM),
> and no comments have come back from Qualcomm folk.
>
> The restriction on creation of multiple V:P mappings with differing
> attributes is also fairly hard to miss in the ARM architecture
> specification when reading the sections about caches.

As you mention in your patch, the things that can't conflict are memory
type (strongly-ordered/device/normal), cache policy
(cacheable/non-cacheable, copy-back/write-through), and coherency
realm (non-shareable/inner-shareable/outer-shareable). You can
conflict in allocation preferences (write-allocate/write-no-allocate),
as those are just "hints".

You can also conflict in access permissions which can and do conflict
(which are what multiple mappings are all about...some buffer can get
some access, while others get different access).
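
Spelled out in code, the rule reads roughly like this (illustrative
types, not from any tree):

	/* Illustrative encoding of the ARM rules above. */
	struct mapping_attrs {
		int mem_type;		/* strongly-ordered / device / normal */
		int cache_policy;	/* non-cacheable / copy-back / write-through */
		int shareability;	/* non- / inner- / outer-shareable */
		int alloc_hint;		/* write-allocate vs. no-allocate: a hint */
		int perms;		/* access permissions: per-mapping */
	};

	static bool attrs_compatible(const struct mapping_attrs *a,
				     const struct mapping_attrs *b)
	{
		/* Memory type, cache policy and shareability must match;
		 * allocation hints and permissions may differ per mapping. */
		return a->mem_type == b->mem_type &&
		       a->cache_policy == b->cache_policy &&
		       a->shareability == b->shareability;
	}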

The VCM API allows the same memory to be mapped as long as it makes
sense, and allows those attributes that can change to be specified. It
could be the alternative, globally applicable approach you're looking
for, as requested in your patch.

Without the VCM API (or something like it) there will just be a bunch
of duplicated code that's basically doing ioremap. This code will
probably fail to configure its mappings correctly, in which case your
patch is a bad idea because it'll spawn bugs all over the place
instead of at a known location. We could instead change ioremap to
match the attributes of System RAM if that's what it's mapping.



From: Zach Pfeffer on
On Tue, Jul 20, 2010 at 09:54:33PM +0100, Russell King - ARM Linux wrote:
> On Tue, Jul 20, 2010 at 01:45:17PM -0700, Zach Pfeffer wrote:
> > You can also conflict in access permissions which can and do conflict
> > (which are what multiple mappings are all about...some buffer can get
> > some access, while others get different access).
>
> Access permissions don't conflict between mappings - each mapping has
> unique access permissions.

Yes. Bad choice of words.

> > The VCM API allows the same memory to be mapped as long as it makes
> > sense, and allows those attributes that can change to be specified. It
> > could be the alternative, globally applicable approach you're looking
> > for, as requested in your patch.
>
> I very much doubt it - there's virtually no call for creating an
> additional mapping of existing kernel memory with different permissions.
> The only time kernel memory gets remapped is with vmalloc(), where we
> want to create a virtually contiguous mapping from a collection of
> (possibly) non-contiguous pages. Such allocations are always created
> with R/W permissions.
>
> There are some cases where the vmalloc APIs are used to create mappings
> with different memory properties, but as already covered, this has become
> illegal with ARMv6 and v7 architectures.
>
> So no, VCM doesn't help because there's nothing that could be solved here.
> Creating read-only mappings is pointless, and creating mappings with
> different memory type, shareability or cache attributes is illegal.

I don't think it's pointless; it may have limited utility, but things
like read-only mappings can be useful.
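
For example, something like this sketch, assuming the architecture
provides a PAGE_KERNEL_RO protection (not all do):

	#include <linux/vmalloc.h>

	/* Sketch: a second, read-only view of pages the kernel already
	 * maps read-write. Memory type and cacheability are unchanged, so
	 * this avoids the conflicting-attribute problem; only the access
	 * permissions differ, which is allowed. */
	void *ro_view = vmap(pages, npages, VM_MAP, PAGE_KERNEL_RO);
	if (!ro_view)
		return -ENOMEM;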

> > Without the VCM API (or something like it) there will just be a bunch
> > of duplicated code that's basically doing ioremap. This code will
> > probably fail to configure its mappings correctly, in which case your
> > patch is a bad idea because it'll spawn bugs all over the place
> > instead of at a known location. We could instead change ioremap to
> > match the attributes of System RAM if that's what it's mapping.
>
> And as I say, what is the point of creating another identical mapping to
> the one we already have?

As you say, probably not much. We do still have a problem (and other
people have it as well): we need to map in large contiguous buffers
with various attributes and point the kernel and various engines at
them. This seems like something that would be globally useful. The
feedback I've gotten is that we should just keep our usage private to
our mach-msm branch.
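
Concretely, the pattern we keep needing looks roughly like this (every
name here is hypothetical; the allocator is the device-specific part):

	/* Hypothetical sketch of the recurring pattern. */
	phys_addr_t base = reserve_large_contiguous(SZ_16M); /* boot-time carve-out */

	/* A CPU view of the buffer with explicit attributes... */
	void *cpu_va = map_into_kernel(base, SZ_16M, ATTR_NONCACHED);

	/* ...and a device view of the same memory through its IOMMU, with
	 * attributes the architecture considers compatible. */
	dma_addr_t dev_va = map_into_iommu(dev, base, SZ_16M, ATTR_NONCACHED);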

I've got a couple of questions:

Do you think a global solution to this problem is appropriate?

What would that solution need to look like, transparent huge pages?

How should people change various mapping attributes for these large
sections of memory?
From: Zach Pfeffer on
On Mon, Jul 19, 2010 at 09:22:13AM +0100, Russell King - ARM Linux wrote:
> On Wed, Jul 14, 2010 at 06:41:48PM -0700, Zach Pfeffer wrote:
> > On Thu, Jul 15, 2010 at 08:07:28AM +0900, FUJITA Tomonori wrote:
> > > Why do we need a new abstraction layer to solve a problem that the
> > > current API can handle?
> >
> > The current API can't really handle it because the DMA API doesn't
> > separate buffer allocation from buffer mapping.
>
> That's not entirely correct. The DMA API provides two things:
>
> 1. An API for allocating DMA coherent buffers
> 2. An API for mapping streaming buffers
>
> Some implementations of (2) end up using (1) to work around broken
> hardware - but that's a separate problem (and causes its own set of
> problems.)
>
> > For instance: I need ten 1 MB physical buffers and a 64 KB physical
> > buffer. With the DMA API I need to allocate 10*1MB/PAGE_SIZE +
> > 64KB/PAGE_SIZE scatterlist elements, fix them all up to follow the
> > chaining specification, and then go through all of them again to fix
> > up their virtual mappings for the mapper that's mapping the physical
> > buffer.
>
> You're making it sound like extremely hard work.
>
> 	struct scatterlist *sg;
> 	void *buf;
> 	size_t len;
> 	int i, nents = 11;
>
> 	sg = kmalloc(sizeof(*sg) * nents, GFP_KERNEL);
> 	if (!sg)
> 		return -ENOMEM;
>
> 	sg_init_table(sg, nents);
> 	for (i = 0; i < nents; i++) {
> 		/* ten 1 MB buffers, then one 64 KB buffer */
> 		if (i != nents - 1)
> 			len = 1048576;
> 		else
> 			len = 64 * 1024;
> 		buf = alloc_buffer(len);	/* stand-in allocator */
> 		sg_set_buf(&sg[i], buf, len);
> 	}
>
> There's no need to split the scatterlist elements up into individual
> pages - the block layer doesn't do that when it passes scatterlists
> down to block device drivers.

Okay. Thank you for the example.

>
> I'm not saying that it's reasonable to pass (or even allocate) a 1MB
> buffer via the DMA API.

But given a bunch of large chunks of memory, is there any API that can
manage them? (I asked this on the other thread as well.)

> > If I want to share the buffer with another device I have to
> > make a copy of the entire thing then fix up the virtual mappings for
> > the other device I'm sharing with.
>
> This is something the DMA API doesn't do - probably because there hasn't
> been a requirement for it.
>
> One of the issues for drivers is that by separating the mapped scatterlist
> from the input buffer scatterlist, it creates something else for them to
> allocate, which causes an additional failure point - and as all users sit
> well with the current API, there's little reason to change especially
> given the number of drivers which would need to be updated.
>
> What you can do is:
>
> 	struct map {
> 		dma_addr_t addr;
> 		size_t len;
> 	};
>
> 	int map_sg(struct device *dev, struct scatterlist *list,
> 		   unsigned int nents, struct map *map,
> 		   enum dma_data_direction dir)
> 	{
> 		struct scatterlist *sg;
> 		unsigned int i, j = 0;
>
> 		for_each_sg(list, sg, nents, i) {
> 			map[j].addr = dma_map_page(dev, sg_page(sg),
> 						   sg->offset, sg->length, dir);
> 			map[j].len = sg->length;
> 			if (dma_mapping_error(dev, map[j].addr))
> 				break;
> 			j++;
> 		}
>
> 		return j;
> 	}
>
> 	void unmap(struct device *dev, struct map *map, unsigned int nents,
> 		   enum dma_data_direction dir)
> 	{
> 		while (nents) {
> 			dma_unmap_page(dev, map->addr, map->len, dir);
> 			map++;
> 			nents--;
> 		}
> 	}
>
> Note: this may not be portable to all architectures. It may also break
> if there's something like the dmabounce or swiotlb code remapping buffers
> which don't fit the DMA mask for the device - that's a different problem.

True, but a higher-level map(virtual_range, physical_chunks) interface
wouldn't break on all architectures.
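
Something with a shape like this (hypothetical types and signature):

	struct virtual_range { unsigned long start; size_t len; };
	struct phys_chunk    { phys_addr_t base; size_t len; };

	/* Hypothetical: bind a set of physical chunks into one virtual
	 * range; each architecture realizes it with whatever page or
	 * section sizes it supports, so callers never touch page tables. */
	int map(struct virtual_range *vr, struct phys_chunk *chunks,
		unsigned int nchunks, unsigned long attrs);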

> You can then map the same scatterlist into multiple different 'map'
> arrays for several devices simultaneously. What you can't do is access
> the buffers from the CPU while they're mapped to any device.

Which is considered a feature ;)

> I'm not saying that you should do the above - I'm just proving that it's
> not as hard as you seem to be making out.

That's fair. I didn't mean to say things were hard, just that using
the DMA API for big-buffer management and mapping was not ideal, since
our goals are to allocate big buffers using a device-specific
algorithm, give them various attributes, and share them. What we
created looked generally useful.