From: John_H on
On Jan 22, 5:21 am, Martin Thompson <martin.j.thomp...(a)trw.com> wrote:
>
> For what definition of "poor" - do you mean code that would have been
> sub-optimal in terms of frequency or LUT usage *in the past*?
>
> "Poor" and "Better" (in my book) are about readability,
> maintainability and debuggability.  Usually that is in direct
> opposition to clever synthesis tricks.  So if I can write readable
> code and the synthesizer does a good job on my "poor" code, then I
> regard that as a win!
>
> Or did I misunderstand your comment?

Your definition of "better" fits pretty well, but there's a little more
to it. When little consideration is given to how the hardware will
implement the Verilog tossed at it, even very readable code can be a
problem. Memory inference wants a properly clocked and enabled
assignment for a single memory write and proper references for the
memory reads; registers on the read address and/or read values are
critical for a proper implementation on a given target architecture.
Flip-flops like single clock edges and usually either asynchronous
set/reset controls or synchronous ones, not a mix (though a mix can be
supported by pushing the synchronous controls behind LUTs into the
logic).
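
For concreteness, here's a minimal sketch of the kind of
inference-friendly memory I mean (the module and signal names are
made up for illustration, not from any particular design):

module ram_sketch #(parameter AW = 10, DW = 8) (
    input               clk,
    input               we,
    input  [AW-1:0]     addr,
    input  [DW-1:0]     din,
    output reg [DW-1:0] dout
);
    reg [DW-1:0] mem [0:(1<<AW)-1];

    always @(posedge clk) begin
        if (we)
            mem[addr] <= din;   // single clocked, enabled write port
        dout <= mem[addr];      // registered read; maps onto the
                                // block RAM's output register
    end
endmodule

In my experience most tools pull that straight into a block RAM; drop
the read register or gate the write with something unclocked and
you're back to distributed logic or a simulation mismatch.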

It's the people who have never designed at the hardware level, whether
with old TTL chips or with a dog-eared, printed copy of an FPGA
architecture chapter at their side, who can produce something that
looks okay from a software viewpoint - even readable - but will cause
problems in clean synthesis.

Clever synthesis tricks can be read, maintained, and debugged with
ease when done well. Clever tricks not done well can be "poor" code,
indeed.


> Personally, I'm impressed that I can write a simple description of my
> logic and get a complicated near-optimal synthesis result out in a
> large number of cases - I wonder if I'm alone? :)

A "simple description" in my book is one that's clean relative to the
hardware. When I see code that's trying to use dual-clocked
registers, case statements with (unintended) unpopulated states,
unintended combinatorial latches, abuses of asynchronous data
transfers or ugly workarounds to match pipelining, things get tough
for compilers. But they usually still produce results.
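
The unintended latch is probably the easiest of those to show. Two
fragments (hypothetical signals; assume y is a reg and sel, a, b, c
are declared):

// An incomplete combinational case infers a latch on y, because y
// must hold its old value for the unlisted selections:
always @* begin
    case (sel)
        2'b00: y = a;
        2'b01: y = b;
    endcase
end

// A default assignment ahead of the case keeps it pure combinatorial
// logic - no latch:
always @* begin
    y = c;
    case (sel)
        2'b00: y = a;
        2'b01: y = b;
    endcase
end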

Clean, readable code designed to keep a heavily loaded signal as
close to the register as possible too often ends up with several
levels of LUTs in the critical path, despite synthesis timing
constraints designed to keep those paths short. Synthesizers decide
to do things differently than the implied coding intent because
they're smarter. Really? Too often I've had to manually piece
together a logic cone so the critical paths do what they should have
from the beginning.
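
To show the kind of coding intent I mean (all names hypothetical):
compute the wide decode early and keep the heavily loaded signal as
the last gate before the flop.

wire decode = &wide_bus;        // wide AND: several LUT levels,
                                // computed early in the cone

always @(posedge clk)
    q <= late_sig & decode;     // intent: late_sig is the only late
                                // input to the final LUT at the register

Nothing in the language forces the tool to honor that structure,
which is exactly the complaint: the synthesizer is free to flatten
and re-factor the cone, and sometimes it drags the loaded signal
several LUT levels away from the flop anyway.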
From: Mike Treseler on
Martin Thompson wrote:

> Personally, I'm impressed that I can write a simple description of my
> logic and get a complicated near-optimal synthesis result out in a
> large number of cases - I wonder if I'm alone? :)

I agree.
If it sims and meets timing I'm usually done.
But then, I don't need that last LUT and that last ns.

-- Mike Treseler
From: Martin Thompson on
John_H <newsgroup(a)johnhandwork.com> writes:

> On Jan 22, 5:21 am, Martin Thompson <martin.j.thomp...(a)trw.com> wrote:
>
> Your definition of "better" fits pretty well, but there's a little more
> to it. When little consideration is given to how the hardware will
> implement the Verilog tossed at it, even very readable code can be a
> problem. Memory inference wants a properly clocked and enabled
> assignment for a single memory write and proper references for the
> memory reads; registers on the read address and/or read values are
> critical for a proper implementation on a given target architecture.
> Flip-flops like single clock edges and usually either asynchronous
> set/reset controls or synchronous ones, not a mix (though a mix can be
> supported by pushing the synchronous controls behind LUTs into the
> logic).
>
> It's the people who have never designed at the hardware level, whether
> with old TTL chips or with a dog-eared, printed copy of an FPGA
> architecture chapter at their side, who can produce something that
> looks okay from a software viewpoint - even readable - but will cause
> problems in clean synthesis.
>


Agreed - that's a description I can go with... you have to give the
synth a fighting chance. "Poor code" can also include describing
behaviour that really doesn't match what you have available. A bit
like using floating-point code on a fixed-point processor and not
understanding the FP library shenanigans that are going to have to
support you.

Heh - when I look back to my early days VHDLing, I used to forget it
was going to end up as hardware - and I used to solder 4000-series
CMOS in my bedroom as a teenager :) So, yes, I've written code like
that.

> Clever synthesis tricks can be read, maintained, and debugged with
> ease when done well. Clever tricks not done well can be "poor" code,
> indeed.
>

Agreed!

>
>> Personally, I'm impressed that I can write a simple description of my
>> logic and get a complicated near-optimal synthesis result out in a
>> large number of cases - I wonder if I'm alone? :)
>
> A "simple description" in my book is one that's clean relative to the
> hardware. When I see code that's trying to use dual-clocked
> registers, case statements with (unintended) unpopulated states,
> unintended combinatorial latches, abuses of asynchronous data
> transfers or ugly workarounds to match pipelining, things get tough
> for compilers. But they usually still produce results.
>

Yes, I had not interpreted your "poor code" to cover that sort of
thing at all, but I can entirely see your point there.

> Clean, readable code designed to keep a heavily loaded signal as
> close to the register as possible too often ends up with several
> levels of LUTs in the critical path, despite synthesis timing
> constraints designed to keep those paths short. Synthesizers decide
> to do things differently than the implied coding intent because
> they're smarter. Really? Too often I've had to manually piece
> together a logic cone so the critical paths do what they should have
> from the beginning.

Maybe I've got luckier there, then :) Apart from that, it sounds like
we don't differ as wildly as I implied - sorry!

Cheers,
Martin

--
martin.j.thompson(a)trw.com
TRW Conekt - Consultancy in Engineering, Knowledge and Technology
http://www.conekt.net/electronics.html
From: Andy on
On Jan 22, 11:19 am, Mike Treseler <mtrese...(a)gmail.com> wrote:
> Martin Thompson wrote:
> > Personally, I'm impressed that I can write a simple description of my
> > logic and get a complicated near-optimal synthesis result out in a
> > large number of cases - I wonder if I'm alone? :)
>
> I agree.
> If it sims and meets timing I'm usually done.
> But then, I don't need that last LUT and that last ns.
>
>      -- Mike Treseler

I think the most common problem when designers try to be smarter than
their synthesizer is that they don't let the synthesizer try it first;
they just assume that they have to code this up-then-down count
sequence to get an acceptable implementation.
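
For example, a plain behavioral description like this (a made-up
up/down counter, not anyone's actual code) is usually all the tool
needs:

module updown #(parameter W = 8) (
    input              clk,
    input              rst,
    input              up,
    input              down,
    output reg [W-1:0] count
);
    always @(posedge clk)
        if (rst)
            count <= 0;
        else if (up)
            count <= count + 1'b1;
        else if (down)
            count <= count - 1'b1;
endmodule

Let the synthesizer have a shot at that first; only if the result
misses timing or area is it worth hand-crafting the count sequence.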

I'm as guilty of this as most, but I'm getting better at it. The tools
keep getting better (often faster than I'm getting better!), so I
continually have to recalibrate my estimate of when I will need to get
more creative.

In practice, this usually means that I (should) first write a fairly
straightforward behavioral (cycle-accurate) description of what I
want, along with the testbench, which is often easier to debug with a
simple, functional test case. Then if I need to "improve" my code to
compensate for "weaknesses" in the synthesis tool, at least I have a
testbench that is already debugged and functionally correct.
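
Even something as small as this self-checking bench (written against
the hypothetical updown counter above) protects the later
"improvements":

module tb;
    reg        clk = 0, rst = 1, up = 0, down = 0;
    wire [7:0] count;

    updown dut (.clk(clk), .rst(rst), .up(up), .down(down),
                .count(count));

    always #5 clk = ~clk;   // free-running clock, period 10 time units

    initial begin
        @(posedge clk) rst <= 0;    // one cycle of reset, then release
        up <= 1;
        repeat (3) @(posedge clk);  // count up for three cycles
        up <= 0;
        @(posedge clk);
        if (count !== 8'd3) $display("FAIL: count = %0d", count);
        else                $display("PASS");
        $finish;
    end
endmodule

Once the "improved" RTL drops in, the same bench tells me immediately
whether I broke the function while chasing the implementation.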

And like Mike said, if it works fine as originally written, I'm done.

I suppose where this gets tricky is when you are writing IP that is
intended to be used by others, where you do not know how much
performance or space they will need (i.e., the definition of "works
fine" is a bit fuzzy). Then you simply have to look at what can be
optimized, in the time and budget available, to get the most bang for
the buck. But getting it working first, with a good testbench, then
optimizing later is still a good route to success.

Andy