From: Brad Smallridge on
> You'll also find that changes (like switching the Nobl SRAM to DRAM as an
> example) can be accommodated without having to change *everything*.

That has been on my mind because there is a DRAM on my board. Not only
will the DRAM require more cycles but perhaps too a varying number of
cycles depending on the sequentiality or randomness of the addressing.

I have seen controllers on the Xilinx site, but nothing that talks about
several ports and how the handshaking is handled. My FAE has said that
some multiport examples are available.

Brad Smallridge
AiVision




From: KJ on
On May 5, 12:13 pm, "Brad Smallridge" <bradsmallri...(a)dslextreme.com>
wrote:
> > You'll also find that changes (like switching the Nobl SRAM to DRAM as an
> > example) can be accommodated without having to change *everything*.
>
> That has been on my mind because there is a DRAM on my board. Not only
> will the DRAM require more cycles but perhaps too a varying number of
> cycles depending on the sequentiality or randomness of the addressing.
>

Except for the most special case examples, DRAM access will be a
variable delay because of page changes and memory refresh.

Trying to design a state machine that is simply trying to *access*
memory for some algorithmic purpose would likely result in a difficult
to maintain design.

Designing a request/acknowledge interface to some other process or
entity (in this case the 'other' being a DRAM controller) results in a
much easier to maintain design.

Using the exact same interface signal functionality whether one is
talking to internal FPGA memory, NoBL or SDRAM or SPI results in a
design that can be reused, retargeted and improved upon if necessary.
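As a rough illustration of the point above, here is a minimal cycle-level
sketch in Python (not HDL). Everything in it is an assumption made for the
example: the names MemoryModel and run_master, the read_latency and
busy_every knobs, and the simple wait-request style handshake, which only
loosely resembles what specifications such as Avalon define. The same
master runs unchanged against a fast model and a slow, stalling one.

```python
from collections import deque


class MemoryModel:
    """Toy slave with a wait-request handshake (hypothetical, for illustration)."""

    def __init__(self, read_latency, busy_every=0):
        self.read_latency = read_latency  # clocks from accepted read to data
        self.busy_every = busy_every      # 0 = never stalls, N = stalls every Nth clock
        self.cycle = 0
        self.store = {}
        self.in_flight = deque()          # (ready_cycle, data) for accepted reads

    def tick(self, request=None):
        """Advance one clock.

        request is None, ('rd', addr) or ('wr', addr, data).
        Returns (waitrequest, readdata_or_None).
        """
        self.cycle += 1
        wait = bool(self.busy_every) and self.cycle % self.busy_every == 0
        if request is not None and not wait:
            if request[0] == 'wr':
                self.store[request[1]] = request[2]
            else:
                data = self.store.get(request[1], 0)
                self.in_flight.append((self.cycle + self.read_latency, data))
        readdata = None
        if self.in_flight and self.in_flight[0][0] <= self.cycle:
            readdata = self.in_flight.popleft()[1]
        return wait, readdata


def run_master(mem, values):
    """Write values to addresses 0..n-1, read them back, return (data, clocks)."""
    requests = [('wr', a, v) for a, v in enumerate(values)]
    requests += [('rd', a) for a in range(len(values))]
    results = []
    i = 0
    while len(results) < len(values):
        req = requests[i] if i < len(requests) else None
        wait, readdata = mem.tick(req)
        if req is not None and not wait:
            i += 1                        # accepted: present the next request
        if readdata is not None:
            results.append(readdata)
    return results, mem.cycle


fast = MemoryModel(read_latency=1)                 # stands in for internal RAM
slow = MemoryModel(read_latency=5, busy_every=4)   # stands in for a DRAM controller
print(run_master(fast, [10, 20, 30]))
print(run_master(slow, [10, 20, 30]))
```

The master never looks at which model it is talking to; only the clock
count differs between the two runs.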

Using the same signal naming functionality as an existing documented
specification (i.e. Avalon, Wishbone) allows others to (re)use your
design without getting bogged down in details that they are not
currently interested in and allows them (and you when you re-use the
design) to be more productive.

Figure out where you are and where you want to be in the design
productivity chain. The synthesis cost in terms of logic resources is
zero; the upfront learning cost will start to pay back in the form of
quicker debug and reusable designs.

Kevin Jennings

From: Kevin Neilson on
Eric Smith wrote:
> Kevin Neilson wrote:
>> Having two bits hot in a one-hot FSM would normally be a bad thing.
>> But I was wondering if anybody does this purposely, in order to fork,
>> which might be a syntactically nicer way to have a concurrent FSM.
>
> DEC used that style of design in the PDP-16 Register Transfer Modules.
> Possibly also in the control units of some of their asynchronous
> processors such as the PDP-6 and KA10.

That's interesting--I'm not even familiar with an "asynchronous
processor". What does that mean?
-Kevin

From: Kevin Neilson on
KJ wrote:
> On May 5, 12:13 pm, "Brad Smallridge" <bradsmallri...(a)dslextreme.com>
> wrote:
>>> You'll also find that changes (like switching the Nobl SRAM to DRAM as an
>>> example) can be accommodated without having to change *everything*.
....
> Designing a request/acknowledge interface to some other process or
> entity (in this case the 'other' being a DRAM controller) results in a
> much easier to maintain design.
>
> Using the exact same interface signal functionality whether one is
> talking to internal FPGA memory, NoBL or SDRAM or SPI results in a
> design that can be reused, retargeted and improved upon if necessary.
....
> Kevin Jennings

This is a great example, because switching from one type of RAM to
another means you *do* have to change everything, if you want the
controller to be good. You can certainly modularize the code and make
concurrent SMs with handshaking and this is easy to maintain. And a lot
of DRAM controllers are designed this way. But here is the problem:
while you are waiting around for acknowledges, you have just wasted a
bunch of memory bandwidth. If you want to make better use of your
bandwidth, you can't use handshaking. You have to start another burst
while one is in the pipe. You have to look ahead in the command FIFO to
see if the next request is going to be in the same row/bank to see if
you need to close the row during this burst and precharge or if you can
continue in the same open row in a different bank, etc. If I do all
that with handshaking, I'm frittering away cycles. And to do this in a
way that doesn't fritter away cycles with standard methodology means
everything is so tightly bound together that to change from SDRAM to
some other type of RAM means I have to tear up most of the design.
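The look-ahead described above can be sketched roughly as follows, in
Python rather than HDL. This is a heavily simplified assumption of what
such a scheduler decides, not anyone's actual controller: the Cmd tuple,
the plan_burst function, and the two outcome strings are all hypothetical,
and real SDRAM timing (tRCD, tRP, and so on) is ignored.

```python
from collections import deque, namedtuple

Cmd = namedtuple("Cmd", "bank row col")


def plan_burst(fifo, open_rows):
    """Pop the head command and peek at the next one to decide how to end the burst.

    open_rows maps bank -> currently open row and is updated in place.
    Returns (command, needs_activate, ending).
    """
    cur = fifo.popleft()
    nxt = fifo[0] if fifo else None

    # A row must be activated first if this bank does not already have it open.
    needs_activate = open_rows.get(cur.bank) != cur.row

    # Close the row with this burst only if the next request hits the same
    # bank but a different row; otherwise leaving it open costs nothing.
    if nxt is not None and nxt.bank == cur.bank and nxt.row != cur.row:
        ending = "auto_precharge"
        open_rows.pop(cur.bank, None)
    else:
        ending = "leave_open"
        open_rows[cur.bank] = cur.row
    return cur, needs_activate, ending


fifo = deque([Cmd(0, 5, 0), Cmd(0, 5, 8), Cmd(0, 7, 0), Cmd(1, 3, 0)])
open_rows = {}
while fifo:
    print(plan_burst(fifo, open_rows))
```

The decision depends on SDRAM concepts (banks, rows, precharge), which is
the coupling the post is complaining about.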

Another issue I came up with today in the design of my current SM is
that I updated a value x and then in the next cycle realized I wanted
the old value of x. But I hadn't really updated x; I had issued a
request that gets put into a matching delay line and then goes to a
concurrent FSM which then updates x. So even though I had "updated" x,
I could still use the old value for a few cycles and didn't need a
temporary storage register. Again, I can't just send the request to
update x and then wait for an ack because the SM has to keep on
trucking. This is confusing, and I'd like to have some sort of
methodology that would be as efficient as what I'm doing but somewhat
more abstract.
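A minimal Python sketch of that delayed-update behaviour, under the
assumption that the "matching delay line" is a fixed-depth shift register
and that the concurrent FSM simply applies whatever falls out of it. The
DelayedRegister class and its depth parameter are hypothetical names for
illustration.

```python
from collections import deque


class DelayedRegister:
    """Register whose updates arrive through a fixed-depth delay line."""

    def __init__(self, value, depth):
        self.value = value
        self.pipe = deque([None] * depth)  # None marks an empty pipeline slot

    def tick(self, update=None):
        """One clock: push a request (or a bubble) in, apply whatever falls out.

        Returns the value of the register as seen after this clock.
        """
        self.pipe.append(update)
        done = self.pipe.popleft()
        if done is not None:
            self.value = done
        return self.value


x = DelayedRegister(value=1, depth=3)
seen = [x.tick(42), x.tick(), x.tick(), x.tick()]
print(seen)
```

The old value stays readable for as many clocks as the delay line is deep,
which is why no temporary register was needed.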
-Kevin

From: KJ on

"Kevin Neilson" <kevin_neilson(a)removethiscomcast.net> wrote in message
news:fvo2o9$p0m1(a)cnn.xsj.xilinx.com...
> KJ wrote:
>> On May 5, 12:13 pm, "Brad Smallridge" <bradsmallri...(a)dslextreme.com>
>> wrote:
>>>> You'll also find that changes (like switching the Nobl SRAM to DRAM as
>>>> an
>>>> example) can be accommodated without having to change *everything*.
> ...
>> Designing a request/acknowledge interface to some other process or
>> entity (in this case the 'other' being a DRAM controller) results in a
>> much easier to maintain design.
>>
>> Using the exact same interface signal functionality whether one is
>> talking to internal FPGA memory, NoBL or SDRAM or SPI results in a
>> design that can be reused, retargeted and improved upon if necessary.
> ...
>> Kevin Jennings
>
> This is a great example, because switching from one type of RAM to another
> means you *do* have to change everything, if you want the controller to be
> good.

The methodology I use makes use of every clock cycle: DRAMs running full
tilt, transfers from a fast FPGA through a PCI bus to some other processor,
etc., the whole nine yards.

> You can certainly modularize the code and make concurrent SMs with
> handshaking and this is easy to maintain. And a lot of DRAM controllers
> are designed this way. But here is the problem: while you are waiting
> around for acknowledges, you have just wasted a bunch of memory bandwidth.

Then you're waiting for the wrong acknowledgement. Taking the DRAM again as
an example, every data transfer consists of two parts: address/command and
data. During a memory write, all of this happens on the same clock cycle.
When the controller 'fills up' it sets the wait request to hold off until it
can accept more commands (reads or writes).

During a read though, the address/command portion happens on one clock
cycle, the actual delivery of the data back to the requestor occurs sometime
later. The state machine that requests the read does not necessarily have
to wait for the data to come back before starting up the next read. The
acknowledge that comes back from a 'memory read' command is that the request
to read has been accepted, another command (read or write) can now be
started. There are also situations where one really does need to wait until
the data is returned to continue on, but in many data processing
applications, the data can lag significantly with no real impact on
performance, the read requests can be queued up as fast as the controller
can accept them.

Although I've been using the DRAM as an example, nothing in the handshaking
or methodology is 'DRAM specific'; it simply has to do with
transmitting information (i.e. writing) and requesting information (i.e.
reading) and having a protocol that separates the request for information
from the delivery of that information (i.e. specifically allowing for
latency and allowing multiple commands to be queued up).
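To put rough numbers on the difference between waiting for the data and
waiting only for the request to be accepted, here is a small Python sketch.
It assumes a fixed read latency and a memory that accepts one command per
clock; run_reads and its parameters are hypothetical names, and a real
controller's latency would vary.

```python
from collections import deque


def run_reads(n_reads, latency, wait_for_data):
    """Count the clocks needed to complete n_reads against a fixed-latency memory.

    wait_for_data=True models a master that stalls until each word returns.
    wait_for_data=False models one that issues the next read as soon as the
    previous request has been accepted.
    """
    cycle = 0
    issued = 0
    received = 0
    in_flight = deque()  # clock at which each outstanding read returns
    while received < n_reads:
        cycle += 1
        # Data phase: at most one word comes back per clock.
        if in_flight and in_flight[0] <= cycle:
            in_flight.popleft()
            received += 1
        # Request phase: blocked only if this master insists on waiting.
        blocked = wait_for_data and bool(in_flight)
        if issued < n_reads and not blocked:
            in_flight.append(cycle + latency)
            issued += 1
    return cycle


print("blocking :", run_reads(8, 6, wait_for_data=True))
print("pipelined:", run_reads(8, 6, wait_for_data=False))
```

Both masters use the same handshake; the only difference is which
acknowledge they wait for before issuing the next read.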

> If you want to make better use of your bandwidth, you can't use
> handshaking.

I disagree.

> You have to start another burst while one is in the pipe.

That's correct...but you can't start one if the pipe is full (which can
happen when a memory refresh or a page change occurs and the pipe fills up
waiting while those things get serviced). The handshake tells you that the
pipe is full and you absolutely need to have it. The 'pipe full' signal is
a handshake, when it is full, it says 'wait', when it is not full, it says
'got it'.
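The 'pipe full' handshake can be sketched in Python along these lines. The
simulate function, the FIFO depth, and the set of refresh clocks are all
assumptions made for illustration; the full flag is sampled at the start of
each clock, as a registered flag would be.

```python
from collections import deque


def simulate(commands, depth, refresh_cycles):
    """Feed commands through a command FIFO of the given depth.

    refresh_cycles is the set of clocks on which the memory side is busy
    and drains nothing. Returns (commands in drained order, requestor stalls).
    """
    fifo = deque()
    pending = deque(commands)
    drained = []
    stalls = 0
    cycle = 0
    while pending or fifo:
        cycle += 1
        full = len(fifo) >= depth  # 'wait' when True, 'got it' when False
        # Memory side: take one command unless a refresh is in progress.
        if fifo and cycle not in refresh_cycles:
            drained.append(fifo.popleft())
        # Requestor side: push only when the pipe is not full.
        if pending:
            if full:
                stalls += 1
            else:
                fifo.append(pending.popleft())
    return drained, stalls


print(simulate(range(6), depth=2, refresh_cycles={3, 4, 5}))
print(simulate(range(6), depth=2, refresh_cycles=set()))
```

With no refresh the requestor never stalls; with a refresh it stalls only
while the FIFO is full, and no command is lost or reordered.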

> You have to look ahead in the command FIFO to see if the next request is
> going to be in the same row/bank to see if you need to close the row
> during this burst and precharge or if you can continue in the same open
> row in a different bank, etc.

OK

> If I do all that with handshaking, I'm frittering away cycles.

Then you're not doing it properly. It's called pipelining, not frittering.

> And to do this in a way that doesn't fritter away cycles with standard
> methodology means everything is so tightly bound together that to change
> from SDRAM to some other type of RAM means I have to tear up most of the
> design.
>

Latency matters in some situations and not in others. If there is a
situation where latency matters, one would have to come up with a way for
the requestor to start the read cycle earlier...but if there is such a way
to start it earlier, then that change could be applied equally well to the
lower latency situation, which means that you could have a common design.

Kevin Jennings