From: Andy on
Coding clock enables in a combinatorial process requires an additional
assignment for the clock-disable case (otherwise you get a latch,
regardless of your naming convention). Only one assignment is required
(the enabled assignment) in a clocked process, and it is the
assignment you had to make anyway.
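
To make the difference concrete, here is a minimal Verilog sketch of both styles. The module and signal names (clock_enable_styles, clk, en, d, q_clocked, q_two_proc, next_q) are made up for illustration and are not from anyone's actual design.

```verilog
// Illustrative only: all names here are hypothetical.
module clock_enable_styles (
    input  wire clk,
    input  wire en,
    input  wire d,
    output reg  q_clocked,
    output reg  q_two_proc
);
    // Clocked process: the enabled assignment is the only one needed.
    // When en is low, q_clocked simply keeps its value (a flop with enable).
    always @(posedge clk)
        if (en)
            q_clocked <= d;

    // Two-process style: the combinatorial block must also cover the
    // clock-disable case, otherwise next_q would be inferred as a latch.
    reg next_q;
    always @* begin
        next_q = q_two_proc;   // the additional "hold" assignment
        if (en)
            next_q = d;
    end

    always @(posedge clk)
        q_two_proc <= next_q;
endmodule
```

Both outputs should behave identically. The only difference is the extra hold assignment that the combinatorial block needs.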

KJ has already well stated the problems with latches (requiring
additional assignments) in combinatorial processes/blocks, regardless
of the naming convention employed.

Any decent simulator (maybe not a half-decent one) will merge
processes or always blocks that share the same sensitivity list. Since
they are usually identical for all synchronous processes clocked by
the same clock, they get merged, thus improving performance by
avoiding duplicative process-related overhead. Since combinatorial
processes rarely share the same sensitivity list, they don't get
merged, and performance suffers.

Andy

From: Kim Enkovaara on
glen herrmannsfeldt wrote:
> combinations. One that I have heard of, though haven't actually
> tried, is having a logic block where the delay is greater than
> one clock cycle, but less than two. Maybe some tools can do that,
> but I don't believe that all can.

That is just a normal multicycle path, which tools have supported for
a long time. At least Altera, Xilinx, Synplify, PrimeTime and
Precision support it.

--Kim
From: Kim Enkovaara on
rickman wrote:
> But you can't verify timing by testing. You can never have any level
> of certainty that you have tested all the ways the timing can fail.

Especially with an ASIC, you can't verify the design by testing. There are
so many signoff corners and modes in the timing analysis. The old
worst/best case analysis in normal and test mode is long gone. Even 6-corner
analysis in 2+ modes is only for low-end processes with big extra margins.
With multiple adjustable internal voltage areas, power-down areas, etc.,
the analysis is hard even with STA.

--Kim
From: Patrick Maupin on
On Apr 23, 10:34 am, Andy <jonesa...(a)comcast.net> wrote:

> Coding clock enables in a combinatorial process requires an additional
> assignment for the clock-disable case (otherwise you get a latch,
> regardless of your naming convention). Only one assignment is required
> (the enabled assignment) in a clocked process, and it is the
> assignment you had to make anyway.

Oh, I see the point you're trying to make. Two points I tried (and
obviously failed) to make are that (1) I don't mind extra typing,
because it's really all about readability (obviously, from the
discussion, my opinion of what is readable may differ from others);
and (2) with the canonical two-process technique, the sequential
process becomes boilerplate (even to the point of being able to be
generated by a script or editor macro, in most cases) that just
assigns a bunch of 'x <= next_x' statements. The top of the
combinatorial process becomes boilerplate as well, with corresponding
'next_x = x' statements (for some variables, it could be other things,
e.g. 'next_x = 0'). But you can just glance at those and not think
about it. So, when reading, you aren't really looking at that, or the
register declarations.

Once you accept that the sequential block, and the top of the
combinatorial block, are both boilerplate that you don't even need to
look at, then it's no more work than anything else. (In fact, if you
can type faster than 20 wpm and/or know how to write scripts or editor
macros, it's less work overall.)
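
As a rough sketch of what that boilerplate looks like in practice (a made-up counter with hypothetical names, and no reset for brevity, so the registers start out as X in simulation):

```verilog
// Illustrative only: module and signal names are hypothetical.
module two_process_counter (
    input  wire       clk,
    input  wire       start,
    output reg  [3:0] count,
    output reg        busy,
    output reg        done
);
    reg [3:0] next_count;
    reg       next_busy;
    reg       next_done;

    // Sequential boilerplate: nothing but 'x <= next_x'.
    always @(posedge clk) begin
        count <= next_count;
        busy  <= next_busy;
        done  <= next_done;
    end

    always @* begin
        // Combinatorial boilerplate: defaults, so no latches are inferred.
        next_count = count;
        next_busy  = busy;
        next_done  = 1'b0;      // a pulse, so the default is 0 rather than 'done'

        // The part you actually read: the logic itself.
        if (start && !busy) begin
            next_count = 4'd0;
            next_busy  = 1'b1;
        end else if (busy) begin
            next_count = count + 4'd1;
            if (count == 4'd15) begin
                next_busy = 1'b0;
                next_done = 1'b1;
            end
        end
    end
endmodule
```

The first always block and the three default assignments are the parts a script or editor macro could generate from the register list.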

> KJ has already well stated the problems with latches (requiring
> additional assignments) in combinatorial processes/blocks, regardless
> of the naming convention employed.

I understand the issue with latches. I just never see them. The
coding style makes it easy to check and avoid them. It can even be
completely automatic if you have a script write your boilerplate.

> Any decent simulator (maybe not a half-decent one) will merge
> processes or always blocks that share the same sensitivity list. Since
> they are usually identical for all synchronous processes clocked by
> the same clock, they get merged, thus improving performance by
> avoiding duplicative process-related overhead. Since combinatorial
> processes rarely share the same sensitivity list, they don't get
> merged, and performance suffers.

I'm pretty sure that verilator is smart enough to figure all this
out. That's the simulator I use if I care about execution time.

Regards,
Pat
From: Jan Decaluwe on
On Apr 21, 4:34 pm, Patrick Maupin <pmau...(a)gmail.com> wrote:
> On Apr 21, 8:19 am, Jan Decaluwe <j...(a)jandecaluwe.com> wrote:

> > Your coding style provides a very verbose workaround for temporary
> > variables. I just can't imagine this is how you do test benches, that
> > are presumably much more complex than your RTL code. Presumably
> > there you use temporary variables directly where you need them without
> > great difficulty. Why would it have to be so different for
> > synthesizable RTL?
>
> You're right.  Testbenches do not suffer from this limitation.  But,
> in point of fact, I can use any sort of logic in my testbench.  I use
> constructs all the time that aren't realistically synthesizable, so
> comparing how I code synthesizable RTL vs how I code testbenches would
> turn up a lot more differences than just this.

As you say, synthesizable RTL has a lot of inherent restrictions.
I just don't see the logic in adding artificial restrictions on top of
those.

> > Most importantly: your coding style doesn't support non-temporary
> > variables. In other words, register inferencing from variables is not
> > supported and therefore ignored as technique. In this sense, this is
> > actually a good illustration of the point I'm trying to make.
>
> Well, it may be a good illustration to you, but now you're waxing
> philosophical again.  Care to show an example (preferably in verilog)
> of how not using this coding style supports your preferred technique?

In my experience, we are talking about a paradigm shift here.
Easy once you "get it", but apparently confusing to many
engineers in the meantime.

Therefore, I now think that a meaningful discussion must be
more elaborate than a typical newsgroup post can bear :-)
What I can offer you is a rather lengthy discussion of two
design variants that highlight the issues through their (subtle)
differences. The case is based on a real ambiguity that I once
detected in the Xilinx ISE examples.

Unfortunately, the source code is in Python :-) (MyHDL).
However, there is equivalent converted Verilog available
in the article.

http://www.myhdl.org/doku.php/cookbook:jc2

Jan