mm: iommu: An API to unify IOMMU, CPU and device memory management [Kernel]

Prev: [PATCH] Add a pair of system calls to make extended file stats available [ver #3]
Next: [PATCHv3 3/9] dspbridge: rename bridge_brd_mem_map/unmap to a proper name

From: Zach Pfeffer on 20 Jul 2010 20:50

On Mon, Jul 19, 2010 at 05:21:35AM -0400, Tim HRM wrote:
> On Fri, Jul 16, 2010 at 8:01 PM, Larry Bassel <lbassel(a)codeaurora.org> wrote:
> > On 16 Jul 10 08:58, Russell King - ARM Linux wrote:
> >> On Thu, Jul 15, 2010 at 08:48:36PM -0400, Tim HRM wrote:
> >> > Interesting, since I seem to remember the MSM devices mostly conduct
> >> > IO through regions of normal RAM, largely accomplished through
> >> > ioremap() calls.
> >> >
> >> > Without more public domain documentation of the MSM chips and AMSS
> >> > interfaces I wouldn't know how to avoid this, but I can imagine it
> >> > creates a bit of urgency for Qualcomm developers as they attempt to
> >> > upstream support for this most interesting SoC.
> >>
> >> As the patch has been out for RFC since early April on the linux-arm-kernel
> >> mailing list (Subject: [RFC] Prohibit ioremap() on kernel managed RAM),
> >> and no comments have come back from Qualcomm folk.
> >
> > We are investigating the impact of this change on us, and I
> > will send out more detailed comments next week.
> >
> >>
> >> The restriction on creation of multiple V:P mappings with differing
> >> attributes is also fairly hard to miss in the ARM architecture
> >> specification when reading the sections about caches.
> >>
> >
> > Larry Bassel
> >
> > --
> > Sent by an employee of the Qualcomm Innovation Center, Inc.
> > The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
> >
>
> Hi Larry and Qualcomm people.
> I'm curious what your reason for introducing this new api (or adding
> to dma) is. Specifically how this would be used to make the memory
> mapping of the MSM chip dynamic in contrast to the fixed _PHYS defines
> in the Android and Codeaurora trees.

The MSM has many integrated engines that allow offloading a variety of
workloads. These engines have always addressed memory using physical
addresses, because of this we had to reserve large (10's MB) buffers
at boot. These buffers are never freed regardless of whether an engine
is actually using them. As you can imagine, needing to reserve memory
for all time on a device that doesn't have a lot of memory in the
first place is not ideal because that memory could be used for other
things, running apps, etc.

To solve this problem we put IOMMUs in front of a lot of the
engines. IOMMUs allow us to map physically discontiguous memory into a
virtually contiguous address range. This means that we could ask the
OS for 10 MB of pages and map all of these into our IOMMU space and
the engine would still see a contiguous range.

In reality, limitations in the hardware meant that we needed to map
memory using larger mappings to minimize the number of TLB
misses. This, plus the number of IOMMUs and the extreme use cases we
needed to design for led us to a generic design.

This generic design solved our problem and the general mapping
problem. We thought other people, who had this same big-buffer
interoperation problem would also appreciate a common API that was
built with their needs in mind so we pushed our idea up.

>
> I'm also interested in how this ability to map memory regions as files
> for devices like KGSL/DRI or PMEM might work and why this is better
> suited to that purpose than existing methods, where this fits into
> camera preview and other issues that have been dealt with in these
> trees in novel ways (from my perspective).

The file based approach was driven by Android's buffer passing scheme
and the need to write userspace drivers for multimedia, etc...

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: FUJITA Tomonori on 20 Jul 2010 21:50

On Tue, 20 Jul 2010 15:20:01 -0700
Zach Pfeffer <zpfeffer(a)codeaurora.org> wrote:

> > I'm not saying that it's reasonable to pass (or even allocate) a 1MB
> > buffer via the DMA API.
>
> But given a bunch of large chunks of memory, is there any API that can
> manage them (asked this on the other thread as well)?

What is the problem about mapping a 1MB buffer with the DMA API?

Possibly, an IOMMU can't find space for 1MB but it's not the problem
of the DMA API.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Zach Pfeffer on 22 Jul 2010 00:10

On Tue, Jul 20, 2010 at 09:44:12PM -0400, Timothy Meade wrote:
> On Tue, Jul 20, 2010 at 8:44 PM, Zach Pfeffer <zpfeffer(a)codeaurora.org> wrote:
> > On Mon, Jul 19, 2010 at 05:21:35AM -0400, Tim HRM wrote:
> >> On Fri, Jul 16, 2010 at 8:01 PM, Larry Bassel <lbassel(a)codeaurora.org> wrote:
> >> > On 16 Jul 10 08:58, Russell King - ARM Linux wrote:
> >> >> On Thu, Jul 15, 2010 at 08:48:36PM -0400, Tim HRM wrote:
> >> >> > Interesting, since I seem to remember the MSM devices mostly conduct
> >> >> > IO through regions of normal RAM, largely accomplished through
> >> >> > ioremap() calls.
> >> >> >
> >> >> > Without more public domain documentation of the MSM chips and AMSS
> >> >> > interfaces I wouldn't know how to avoid this, but I can imagine it
> >> >> > creates a bit of urgency for Qualcomm developers as they attempt to
> >> >> > upstream support for this most interesting SoC.
> >> >>
> >> >> As the patch has been out for RFC since early April on the linux-arm-kernel
> >> >> mailing list (Subject: [RFC] Prohibit ioremap() on kernel managed RAM),
> >> >> and no comments have come back from Qualcomm folk.
> >> >
> >> > We are investigating the impact of this change on us, and I
> >> > will send out more detailed comments next week.
> >> >
> >> >>
> >> >> The restriction on creation of multiple V:P mappings with differing
> >> >> attributes is also fairly hard to miss in the ARM architecture
> >> >> specification when reading the sections about caches.
> >> >>
> >> >
> >> > Larry Bassel
> >> >
> >> > --
> >> > Sent by an employee of the Qualcomm Innovation Center, Inc.
> >> > The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
> >> >
> >>
> >> Hi Larry and Qualcomm people.
> >> I'm curious what your reason for introducing this new api (or adding
> >> to dma) is. ?Specifically how this would be used to make the memory
> >> mapping of the MSM chip dynamic in contrast to the fixed _PHYS defines
> >> in the Android and Codeaurora trees.
> >
> > The MSM has many integrated engines that allow offloading a variety of
> > workloads. These engines have always addressed memory using physical
> > addresses, because of this we had to reserve large (10's MB) buffers
> > at boot. These buffers are never freed regardless of whether an engine
> > is actually using them. As you can imagine, needing to reserve memory
> > for all time on a device that doesn't have a lot of memory in the
> > first place is not ideal because that memory could be used for other
> > things, running apps, etc.
> >
> > To solve this problem we put IOMMUs in front of a lot of the
> > engines. IOMMUs allow us to map physically discontiguous memory into a
> > virtually contiguous address range. This means that we could ask the
> > OS for 10 MB of pages and map all of these into our IOMMU space and
> > the engine would still see a contiguous range.
> >
>
>
> I see. Much like I suspected, this is used to replace the static
> regime of the earliest Android kernel. You mention placing IOMMUs in
> front of the A11 engines, you are involved in this architecture as an
> engineer or similar?

I'm involved to the extent of designing and implementing VCM and,
finding it useful for this class of problems, trying push it upstream.

> Is there a reason a cooperative approach using
> RPC or another mechanism is not used for memory reservation, this is
> something that can be accomplished fully on APPS side?

It can be accomplished a few ways. At this point we let the
application processor manage the buffers. Other cooperative approaches
have been talked about. As you can see in the short, but voluminous
cannon of MSM Linux support there is a degree of RPC used to
communicate with other nodes in the system. As time progresses the
cannon of code shows this usage going down.

>
> > In reality, limitations in the hardware meant that we needed to map
> > memory using larger mappings to minimize the number of TLB
> > misses. This, plus the number of IOMMUs and the extreme use cases we
> > needed to design for led us to a generic design.
> >
> > This generic design solved our problem and the general mapping
> > problem. We thought other people, who had this same big-buffer
> > interoperation problem would also appreciate a common API that was
> > built with their needs in mind so we pushed our idea up.
> >
> >>
> >> I'm also interested in how this ability to map memory regions as files
> >> for devices like KGSL/DRI or PMEM might work and why this is better
> >> suited to that purpose than existing methods, where this fits into
> >> camera preview and other issues that have been dealt with in these
> >> trees in novel ways (from my perspective).
> >
> > The file based approach was driven by Android's buffer passing scheme
> > and the need to write userspace drivers for multimedia, etc...
> >
> >
> So the Android file backed approach is obiviated by GEM and other mechanisms?

Aye.

>
> Thanks you for you help,
> Timothy Meade
> -tmzt #htc-linux (facebook.com/HTCLinux)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Zach Pfeffer on 22 Jul 2010 00:30

On Mon, Jul 19, 2010 at 12:44:49AM -0700, Eric W. Biederman wrote:
> Zach Pfeffer <zpfeffer(a)codeaurora.org> writes:
>
> > On Thu, Jul 15, 2010 at 09:55:35AM +0100, Russell King - ARM Linux wrote:
> >> On Wed, Jul 14, 2010 at 06:29:58PM -0700, Zach Pfeffer wrote:
> >> > The VCM ensures that all mappings that map a given physical buffer:
> >> > IOMMU mappings, CPU mappings and one-to-one device mappings all map
> >> > that buffer using the same (or compatible) attributes. At this point
> >> > the only attribute that users can pass is CACHED. In the absence of
> >> > CACHED all accesses go straight through to the physical memory.
> >>
> >> So what you're saying is that if I have a buffer in kernel space
> >> which I already have its virtual address, I can pass this to VCM and
> >> tell it !CACHED, and it'll setup another mapping which is not cached
> >> for me?
> >
> > Not quite. The existing mapping will be represented by a reservation
> > from the prebuilt VCM of the VM. This reservation has been marked
> > non-cached. Another reservation on a IOMMU VCM, also marked non-cached
> > will be backed with the same physical memory. This is legal in ARM,
> > allowing the vcm_back call to succeed. If you instead passed cached on
> > the second mapping, the first mapping would be non-cached and the
> > second would be cached. If the underlying architecture supported this
> > than the vcm_back would go through.
>
> How does this compare with the x86 pat code?

First, thanks for asking this question. I wasn't aware of the x86 pat
code and I got to read about it. From my initial read the VCM differs in 2 ways:

1. The attributes are explicitly set on virtual address ranges. These
reservations can then map physical memory with these attributes.

2. We explicitly allow multiple mappings (as long as the attributes are
compatible). One such mapping may come from a IOMMU's virtual address
space while another comes from the CPUs virtual address space. These
mappings may exist at the same time.

>
> >> You are aware that multiple V:P mappings for the same physical page
> >> with different attributes are being outlawed with ARMv6 and ARMv7
> >> due to speculative prefetching. The cache can be searched even for
> >> a mapping specified as 'normal, uncached' and you can get cache hits
> >> because the data has been speculatively loaded through a separate
> >> cached mapping of the same physical page.
> >
> > I didn't know that. Thanks for the heads up.
> >
> >> FYI, during the next merge window, I will be pushing a patch which makes
> >> ioremap() of system RAM fail, which should be the last core code creator
> >> of mappings with different memory types. This behaviour has been outlawed
> >> (as unpredictable) in the architecture specification and does cause
> >> problems on some CPUs.
> >
> > That's fair enough, but it seems like it should only be outlawed for
> > those processors on which it breaks.
>
> To my knowledge mismatch of mapping attributes is a problem on most
> cpus on every architecture. I don't see it making sense to encourage
> coding constructs that will fail in the strangest most difficult to
> debug ways.

Yes it is a problem, as Russell has brought up, but there's something
I probably haven't communicated well. I'll use the following example:

There are 3 devices: A CPU, a decoder and a video output device. All 3
devices need to map the same 12 MB buffer at the same time. Once this
buffer has served its purpose it gets freed and goes back into the
pool of big buffers. When the same usage case exists again the buffer
needs to get reallocated and the same devices need to map to it.

This usage case does exist, not only for Qualcomm but for all of these
SoC media engines that have started running Linux. The VCM API
attempts to cover this case for the Linux kernel.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Zach Pfeffer on 22 Jul 2010 00:40

On Wed, Jul 21, 2010 at 10:44:37AM +0900, FUJITA Tomonori wrote:
> On Tue, 20 Jul 2010 15:20:01 -0700
> Zach Pfeffer <zpfeffer(a)codeaurora.org> wrote:
>
> > > I'm not saying that it's reasonable to pass (or even allocate) a 1MB
> > > buffer via the DMA API.
> >
> > But given a bunch of large chunks of memory, is there any API that can
> > manage them (asked this on the other thread as well)?
>
> What is the problem about mapping a 1MB buffer with the DMA API?
>
> Possibly, an IOMMU can't find space for 1MB but it's not the problem
> of the DMA API.

This goes to the nub of the issue. We need a lot of 1 MB physically
contiguous chunks. The system is going to fragment and we'll never get
our 12 1 MB chunks that we'll need, since the DMA API allocator uses
the system pool it will never succeed. For this reason we reserve a
pool of 1 MB chunks (and 16 MB, 64 KB etc...) to satisfy our
requests. This same use case is seen on most embedded "media" engines
that are getting built today.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

First | Prev | Next | Last
Pages: 1 2 3 4 5 6
Prev: [PATCH] Add a pair of system calls to make extended file stats available [ver #3]
Next: [PATCHv3 3/9] dspbridge: rename bridge_brd_mem_map/unmap to a proper name