From: Frank van Eijkelenburg on
On Jul 2, 9:04 pm, Charles Gardiner <charles.gardi...(a)invalid.invalid>
wrote:
> Hi Frank,
>
>
>
> > I am not sure if we understand each other.
>
> Yes, it certainly sounds like that.
>
> > What do you mean by
> > completing the request with IoCompleteRequest? There is no request
> > from software point of view.
>
> I think this might clear up the reason why your data is missing. (See also below
> about the type of DMA). I don't think the S/G list you are getting is describing
> your application buffer. This is best done by specifying DO_DIRECT_IO as the DMA
> method for your device. If you specify DO_BUFFERED_IO you will get an S/G List
> describing an intermediate buffer in kernel space and this probably never gets
> copied over to your application space buffer unless you terminate the request.
> I've never done the 'neither' method myself and from what I hear, it's a
> complicated beast.
>
> > The FPGA will do a DMA write (data from
> > FPGA to PC memory) at its own initiative. The allocated memory is used
> > as long as the software is running. I do not allocate new memory for
> > each new DMA transfer, but at startup a large piece of memory is
> > allocated and the physical addresses are written to the FPGA by the
> > driver software.
>
> Sounds like you are doing something like a circular buffer in memory which stays
> alive as long as your device does?
>
>
>
> > And yes, we use a DMA adapter in combination with the
> > GetScatterGatherList method. We already used this in another project
> > but that was PCI and DMA read (data from PC memory to FPGA).
>
> > By the way, where can I set the type of DMA?
>
> Typically, you set the DMA buffering method in your AddDevice function after you
> create your device object. Quoting from Oney's book,
>
> NTSTATUS AddDevice(..) {
>    PDEVICE_OBJECT    fdo;
>
>    IoCreateDevice(....., &fdo);
>    fdo->Flags |= DO_BUFFERED_IO;
>             <or>
>    fdo->Flags |= DO_DIRECT_IO;
>             <or>
>    fdo->Flags |= 0;  // i.e. neither Direct nor Buffered
>
> And, you can't change your mind afterwards.
>
> By the way if my assumption about the circular buffer in your design is correct,
> there is a slightly more standard solution (standard in the sense that everybody
> on the Microsoft drivers newsgroup seems to do it). However, it requires two threads
> in your application. The first one requests a buffer (using new or malloc) and
> sets up an I/O Request ReadFile, WriteFile or DeviceIoControl referencing this
> buffer. This is performed as an asynchronous request.
>
> The driver recognises this request and pends it indefinitely (typically you terminate
> it when your driver is shutting down, otherwise Windows will probably hang).
> Pending the request has the nice side effect that the buffer now becomes locked
> down permanently.
>
> Assuming you have set up your driver to use DO_DIRECT_IO DMA, you can get the S/G
> list describing the application space buffer as you are currently doing and feed
> this to your FPGA.
>
> Using the second thread in your application you can constantly read data from the
> locked down pages (your app. space buffer) that are being written by your FPGA.
>
> Assuming the DO_DIRECT_IO solves your problem (I think there is a good chance), I
> would however still consider migrating to a KMDF based driver, particularly if
> you are writing a new one. It's much easier to maintain and is probably more
> portable for future MS versions.
>
>
>
> > best regards,
>
> > Frank
>
> best regards,
> Charles

Hi Charles,

We tried your suggestion (we were using BUFFERED_IO). Unfortunately it
was not the (final) solution. Perhaps there are more causes for the
problem. Anyway, thanks for your suggestion. We are almost out of
ideas of what we can test. Do you have other ideas or tests we can do
to find the cause? I hope to fix the problem before my vacation (only
one day left :)

best regards,

Frank
From: Charles Gardiner on
Frank van Eijkelenburg schrieb:

> We tried your suggestion (we were using BUFFERED_IO). Unfortunately it
> was not the (final) solution.

Was there any noticeable change in the behaviour at all?

Is it still valid that your FPGA can _read_ data from the buffer when your
application writes it there?

With DO_DIRECT_IO specified, it's not clear to me off-hand why you are not seeing
the memory locations in both directions now.

> Perhaps there are more causes for the
> problem. Anyway, thanks for your suggestion. We are almost out of
> ideas of what we can test. Do you have other ideas or tests we can do
> to find the cause? I hope to fix the problem before my vacation (only
> one day left :)

Oops, that's tight. I'm just on my way to a customer's, so I don't have my usual
references at hand. Have you tried the flush (zero length read from FPGA) after a
write to memory? To be honest, though, I don't think that's the solution (just a
straw to grasp at in case your system has some caching behaviour I haven't seen
before). My last (KMDF based) design was similar to yours. The FPGA was streaming
to memory and the SW application reading from the buffer shared between
application memory and kernel memory. I never had any data loss, even without the
zero length read.

If you can send me as much relevant info as possible, I'll have another look this
evening.

Regards,
Charles
From: Michael S on
On Jul 6, 11:00 am, Frank van Eijkelenburg
<fei.technolut...(a)gmail.com> wrote:
> On Jul 2, 9:04 pm, Charles Gardiner <charles.gardi...(a)invalid.invalid>
> wrote:
>
>
>
> > Hi Frank,
>
> > > I am not sure if we understand each other.
>
> > Yes, it certainly sounds like that.
>
> > > What do you mean by
> > > completing the request with IoCompleteRequest? There is no request
> > > from software point of view.
>
> > I think this might clear up the reason why your data is missing. (See also below
> > about the type of DMA). I don't think the S/G list you are getting is describing
> > your application buffer. This is best done by specifying DO_DIRECT_IO as the DMA
> > method for your device. If you specify DO_BUFFERED_IO you will get an S/G List
> > describing an intermediate buffer in kernel space and this probably never gets
> > copied over to your application space buffer unless you terminate the request.
> > I've never done the 'neither' method myself and from what I hear, it's a
> > complicated beast.
>
> > > The FPGA will do a DMA write (data from
> > > FPGA to PC memory) at its own initiative. The allocated memory is used
> > > as long as the software is running. I do not allocate new memory for
> > > each new DMA transfer, but at startup a large piece of memory is
> > > allocated and the physical addresses are written to the FPGA by the
> > > driver software.
>
> > Sounds like you are doing something like a circular buffer in memory which stays
> > alive as long as your device does?
>
> > > And yes, we use a DMA adapter in combination with the
> > > GetScatterGatherList method. We already used this in another project
> > > but that was PCI and DMA read (data from PC memory to FPGA).
>
> > > By the way, where can I set the type of DMA?
>
> > Typically, you set the DMA buffering method in your AddDevice function after you
> > create your device object. Quoting from Oney's book,
>
> > NTSTATUS AddDevice(..) {
> >    PDEVICE_OBJECT    fdo;
>
> >    IoCreateDevice(....., &fdo);
> >    fdo->Flags |= DO_BUFFERED_IO;
> >             <or>
> >    fdo->Flags |= DO_DIRECT_IO;
> >             <or>
> >    fdo->Flags |= 0;  // i.e. neither Direct nor Buffered
>
> > And, you can't change your mind afterwards.
>
> > By the way if my assumption about the circular buffer in your design is correct,
> > there is a slightly more standard solution (standard in the sense that everybody
> > on the Microsoft drivers newsgroup seems to do it). However, it requires two threads
> > in your application. The first one requests a buffer (using new or malloc) and
> > sets up an I/O Request ReadFile, WriteFile or DeviceIoControl referencing this
> > buffer. This is performed as an asynchronous request.
>
> > The driver recognises this request and pends it indefinitely (typically you terminate
> > it when your driver is shutting down, otherwise Windows will probably hang).
> > Pending the request has the nice side effect that the buffer now becomes locked
> > down permanently.
>
> > Assuming you have set up your driver to use DO_DIRECT_IO DMA, you can get the S/G
> > list describing the application space buffer as you are currently doing and feed
> > this to your FPGA.
>
> > Using the second thread in your application you can constantly read data from the
> > locked down pages (your app. space buffer) that are being written by your FPGA.
>
> > Assuming the DO_DIRECT_IO solves your problem (I think there is a good chance), I
> > would however still consider migrating to a KMDF based driver, particularly if
> > you are writing a new one. It's much easier to maintain and is probably more
> > portable for future MS versions.
>
> > > best regards,
>
> > > Frank
>
> > best regards,
> > Charles
>
> Hi Charles,
>
> We tried your suggestion (we were using BUFFERED_IO).

If you were using BUFFERED_IO why was your driver locking the pages?
In case of BUFFERED_IO the pages come from kernel non-paged pool and
don't have to be specifically locked. The only case where the driver
is responsible for locking/unlocking pages is NEITHER I/O.

> Unfortunately it
> was not the (final) solution. Perhaps there are more causes for the
> problem. Anyway, thanks for your suggestion. We are almost out of
> ideas of what we can test. Do you have other ideas or tests we can do
> to find the cause? I hope to fix the problem before my vacation (only
> one day left :)
>
> best regards,
>
> Frank

Another typical mistake is that the driver forgets to call IoMarkIrpPending().
KMDF does it automatically, but in plain WDM it's the responsibility of
your driver. However, a forgotten IoMarkIrpPending() normally shows
different symptoms.


From: Michael S on
On Jul 6, 11:00 am, Frank van Eijkelenburg
<fei.technolut...(a)gmail.com> wrote:

> I hope to fix the problem before my vacation (only one day left :)
>

Something I most certainly DO NOT RECOMMEND as a final solution, but
it could help you go on vacation in a better mood.
Scrap all the schoolbook nice & complex Windows DMA API stuff. Instead,
take your Irp->MdlAddress, do MmGetMdlPfnArray() and access the physical
addresses directly. It's wrong, it's immoral, but on a simple x86/x64 PC
or on a small dual-processor server it always works.
Just don't forget to bring back the official DMA API when you are back
from vacation and have more than a few hours.
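
For illustration only, the address arithmetic behind this shortcut can be
modelled in plain user-space C. This is a sketch under assumptions, not driver
code: the 4 KiB page size, the type sg_element and the function
build_sg_from_pfns are made up for the example. In a real driver the three
inputs would presumably come from MmGetMdlPfnArray(), MmGetMdlByteOffset() and
MmGetMdlByteCount() applied to Irp->MdlAddress.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12u                       /* assumption: 4 KiB pages */
#define MODEL_PAGE_SIZE  (1u << MODEL_PAGE_SHIFT)

/* One physically contiguous chunk to program into the FPGA (hypothetical type). */
typedef struct {
    uint64_t phys_addr;
    uint32_t length;
} sg_element;

/*
 * Convert a page-frame-number array into a list of physical chunks.
 * byte_offset is the offset of the buffer start within the first page and
 * must be smaller than MODEL_PAGE_SIZE; pfns must cover the whole buffer.
 * Physically adjacent pages are merged into one element.
 * Returns the number of elements written, or 0 if 'out' is too small.
 */
size_t build_sg_from_pfns(const uint64_t *pfns, uint32_t byte_offset,
                          uint32_t byte_count, sg_element *out, size_t max_out)
{
    size_t n = 0;
    size_t page = 0;
    uint32_t offset = byte_offset;
    uint32_t remaining = byte_count;

    while (remaining > 0) {
        /* Bytes of this page that belong to the buffer. */
        uint32_t chunk = MODEL_PAGE_SIZE - offset;
        if (chunk > remaining) {
            chunk = remaining;
        }
        uint64_t addr = (pfns[page] << MODEL_PAGE_SHIFT) + offset;

        if (n > 0 && out[n - 1].phys_addr + out[n - 1].length == addr) {
            out[n - 1].length += chunk;            /* extends the previous chunk */
        } else {
            if (n == max_out) {
                return 0;                          /* caller's array is too small */
            }
            out[n].phys_addr = addr;
            out[n].length = chunk;
            n++;
        }

        remaining -= chunk;
        offset = 0;                                /* only the first page is offset */
        page++;
    }
    return n;
}
```

With the made-up frame numbers 0x100, 0x101 and 0x200, a byte offset of 0x10
and a length of 8192 bytes, the first two pages are adjacent and collapse into
one element, so the model yields two chunks. Note that this treats physical
addresses as bus addresses, which is exactly the assumption the official DMA
API exists to avoid.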