From: Ehsan on
Hi fellows,

I have been working on a Virtex5 LX110 design using VHDL and the ISE
tools. My problem is that I cannot meet my timing constraints for the
desired clock frequency. When I looked at the timing report, I realized
that the large delay is due to a signal with a large fanout, which
causes the delay to be dominated by routing (82%). Here is the relevant
portion of the timing report:

Maximum Data Path: dut_inst/userlogic/CM1/d3_tem_0 to dut_inst/userlogic/YC1/Y3_out_n_4
  Location            Delay type       Delay(ns)  Physical Resource
                                                  Logical Resource(s)
  ----------------------------------------------  -------------------
  SLICE_X20Y56.DQ     Tcko                 0.471  dut_inst/userlogic/CM1/d3_tem<0>
                                                  dut_inst/userlogic/CM1/d3_tem_0
  SLICE_X14Y142.AX    net (fanout=77)      6.659  dut_inst/userlogic/CM1/d3_tem<0>
  SLICE_X14Y142.COUT  Taxcy                0.439  dut_inst/userlogic/YC1/Madd_Y3_n_0_addsub0000_cy
                                                  dut_inst/userlogic/YC1/Madd_Y3_n_0_addsub0000_cy
  SLICE_X14Y143.CIN   net (fanout=1)       0.000  dut_inst/userlogic/YC1/Madd_Y3_n_0_addsub0000_cy<3>
  SLICE_X14Y143.COUT  Tbyp                 0.104  dut_inst/userlogic/YC1/Madd_Y3_n_0_addsub0000_cy<7>
                                                  dut_inst/userlogic/YC1/Madd_Y3_n_0_addsub0000_cy<7>
  SLICE_X14Y144.CIN   net (fanout=1)       0.000  dut_inst/userlogic/YC1/Madd_Y3_n_0_addsub0000_cy<7>
  SLICE_X14Y144.CMUX  Tcinc                0.334  dut_inst/userlogic/YC1/Y3_n_0_addsub0000<11>
                                                  dut_inst/userlogic/YC1/Madd_Y3_n_0_addsub0000_xor<11>
  SLICE_X14Y149.B1    net (fanout=2)       1.078  dut_inst/userlogic/YC1/Y3_n_0_addsub0000<10>
  SLICE_X14Y149.BMUX  Topbb                0.613  dut_inst/userlogic/YC1/Y3_n_0_cmp_le0000
                                                  dut_inst/userlogic/YC1/Mcompar_Y3_n_0_cmp_le0000_lut<5>
                                                  dut_inst/userlogic/YC1/Mcompar_Y3_n_0_cmp_le0000_cy<5>
  SLICE_X16Y146.A1    net (fanout=13)      1.106  dut_inst/userlogic/YC1/Y3_n_0_cmp_le0000
  SLICE_X16Y146.A     Tilo                 0.094  fifo_in_inst/bram_fifo_36x512_gen.bram_fifo_36x512_inst/BU2/U0/grf.rf/gcx.clkx/wr_pntr_gc_asreg<8>
                                                  dut_inst/userlogic/YC1/Y3_out_n_mux0002<4>_SW2
  SLICE_X17Y147.A2    net (fanout=1)       0.867  N1189
  SLICE_X17Y147.CLK   Tas                  0.026  dut_inst/userlogic/YC1/Y3_out_n<7>
                                                  dut_inst/userlogic/YC1/Y3_out_n_mux0002<4>
                                                  dut_inst/userlogic/YC1/Y3_out_n_4
  ----------------------------------------------  ---------------------------
  Total                                 11.791ns  (2.081ns logic, 9.710ns route)
                                                  (17.6% logic, 82.4% route)

##########################################################################

The signal with the large fanout is the output of a flip-flop. The
synthesizer, on the other hand, reports a different critical path with
much less delay. So I guess the timing error happens because the
synthesizer cannot detect or optimize this path, but I don't know how
to fix it. Do I need to change my VHDL code or use other constraints in
my project? I appreciate your help.

-ehsan
From: General Schvantzkoph on
On Thu, 29 Apr 2010 09:09:54 -0700, Ehsan wrote:

> Hi fellows,
>
> I have been working on a Virtex5 LX110 design using VHDL and ISE tools.
> My problem is that I cannot meet my timing constraints or the desired
> clock frequency. When I look at the timing report, I realized that the
> large delay is due to a signal with a large fanout. This has caused the
> delay to be dominated by routing (82%). Here is the portion of timing
> report:
>
> [timing report snipped]
>
> The signal with the large fanout is the output of a flip-flop. The
> Synthesizer, on the other hand, finds another critical path with much
> less delay. So, I guess the timing error happens because the synthesizer
> cannot detect/optimize this path. But I don't know how to fix this
> problem. Do I need to change my VHDL code or use other constraints in my
> project. I appreciate your help.
>
> -ehsan

You can set a general max_fanout limit in the XST script:

-max_fanout 32

Or you can put a max_fanout synthesis attribute on the signal in your
RTL code.
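
For the per-signal form, a minimal VHDL sketch follows. It assumes XST as
the synthesizer and uses the XST max_fanout attribute. The entity name,
port names and 8-bit width are hypothetical; only the signal name d3_tem
comes from the timing report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical wrapper around the register named in the timing report.
-- The 8-bit width and the port names are assumptions for illustration.
entity cm1_sketch is
  port (
    clk     : in  std_logic;
    d3_next : in  std_logic_vector(7 downto 0);
    d3_out  : out std_logic_vector(7 downto 0)
  );
end entity cm1_sketch;

architecture rtl of cm1_sketch is
  signal d3_tem : std_logic_vector(7 downto 0);

  -- XST attribute: ask synthesis to replicate the driver so that each
  -- copy drives roughly 32 loads at most. XST treats the value as a
  -- guide, so the limit may not be respected exactly.
  attribute max_fanout : string;
  attribute max_fanout of d3_tem : signal is "32";
begin
  -- The high-fanout register itself.
  process (clk)
  begin
    if rising_edge(clk) then
      d3_tem <= d3_next;
    end if;
  end process;

  d3_out <= d3_tem;
end architecture rtl;
```

The attribute applies only to the named signal, so the rest of the design
keeps whatever global limit is set in the XST script.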


From: Chris Maryan on
On Apr 29, 12:09 pm, Ehsan <ehsan.hosse...(a)gmail.com> wrote:
> Hi fellows,
>
> I have been working on a Virtex5 LX110 design using VHDL and ISE
> tools. My problem is that I cannot meet my timing constraints or the
> desired clock frequency. When I look at the timing report, I realized
> that the large delay is due to a signal with a large fanout. This has
> caused the delay to be dominated by routing (82%). Here is the portion
> of timing report:
>
> [timing report snipped]
>
> The signal with the large fanout is the output of a flip-flop. The
> Synthesizer, on the other hand, finds another critical path with much
> less delay. So, I guess the timing error happens because the
> synthesizer cannot detect/optimize this path. But I don't know how to
> fix this problem. Do I need to change my VHDL code or use other
> constraints in my project. I appreciate your help.
>
> -ehsan

I think your problem has little or nothing to do with fanout. The
high-fanout net just happens to be the one that's hard to meet timing
on; 6.6ns is a very long net. Usually this happens because the
destinations on the fanout are spread around the chip. If timing is met
to one destination, it fails to destinations on the other side of the
chip, which is generally a tough situation for the router. So the
problem is routing; forget the fanout. Focus on area-grouping the logic
on that fanout into a tighter region. Alternatively, if you can tolerate
the latency and it's easy to put in, you can add a register on that path
(retiming in synthesis can more or less do this too).
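
The extra register could look something like the sketch below. This is
only an illustration: the entity name, port names and 8-bit width are
assumptions, and the rest of the design is not known from the thread.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical pipeline stage placed between the CM1 register and the
-- YC1 adder. Names and the 8-bit width are assumptions.
entity d3_pipe_sketch is
  port (
    clk        : in  std_logic;
    d3_tem     : in  std_logic_vector(7 downto 0);  -- from the CM1 register
    d3_tem_dly : out std_logic_vector(7 downto 0)   -- to the YC1 adder, one cycle later
  );
end entity d3_pipe_sketch;

architecture rtl of d3_pipe_sketch is
  signal d3_tem_r : std_logic_vector(7 downto 0) := (others => '0');
begin
  -- One extra clock of latency: the long route now ends at this register,
  -- and the adder/compare logic starts a fresh cycle from it.
  process (clk)
  begin
    if rising_edge(clk) then
      d3_tem_r <= d3_tem;
    end if;
  end process;

  d3_tem_dly <= d3_tem_r;
end architecture rtl;
```

Any other inputs to the same adder would need a matching one-cycle delay
so that the data stays aligned.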

Chris
From: Brian Drummond on
On Thu, 29 Apr 2010 09:09:54 -0700 (PDT), Ehsan <ehsan.hosseini(a)gmail.com>
wrote:

>Hi fellows,
>
>I have been working on a Virtex5 LX110 design using VHDL and ISE
>tools. My problem is that I cannot meet my timing constraints or the
>desired clock frequency. When I look at the timing report, I realized
>that the large delay is due to a signal with a large fanout. This has
>caused the delay to be dominated by routing (82%). Here is the portion
>of timing report:

>SLICE_X20Y56.DQ Tcko 0.471
>(fanout=77) 6.659 dut_inst/userlogic/CM1/d3_tem<0>
>SLICE_X14Y142.COUT Taxcy 0.439 dut_inst/

77 is not a particularly large fanout, but 6.6ns is a very long time in a V5.

This looks like a pathologically bad placement between source (X20Y56) and
destination (X14Y142).

Three approaches:

(1) Run MAP and PAR again with a different seed (set the -t option to 2, 3, 4,
etc.) until you find a setting that passes timing. Note that MAP and PAR now
both need the -t flag, and both need the same value for it.

This requires no real thought, only CPU time, so it is often an easy way to make
the problem go away until you change the design and MAP comes up with another
pathological placement (then you change the seed again).

(2) Using FPGA Editor, move the offending component (the source in this case,
since the destination appears to be a carry chain) closer to the destination,
then run a re-entrant routing pass from the command line to improve timings.
(ISE10 has a silly bug preventing reentrant routing from the GUI)

(3) Change the design somehow. Setting the "max fanout" limit, as the General
suggests, will force synthesis to use two or more copies of the signal source.
This will reduce the fanout but, more importantly, it gives MAP a chance to
place one copy closer to the destination, to reduce the routing length.
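
The same duplication can also be written by hand, which gives control over
which loads each copy drives. A minimal sketch, assuming XST and its
equivalent_register_removal attribute (without it, synthesis may merge
the identical registers back into one). The entity name, port names,
8-bit width and the split of loads are all hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical hand-duplicated version of the high-fanout register.
entity d3_dup_sketch is
  port (
    clk      : in  std_logic;
    d3_next  : in  std_logic_vector(7 downto 0);
    d3_tem_a : out std_logic_vector(7 downto 0);  -- e.g. the loads in YC1
    d3_tem_b : out std_logic_vector(7 downto 0)   -- the remaining loads
  );
end entity d3_dup_sketch;

architecture rtl of d3_dup_sketch is
  signal d3_copy_a : std_logic_vector(7 downto 0);
  signal d3_copy_b : std_logic_vector(7 downto 0);

  -- Ask XST not to merge the two identical registers.
  attribute equivalent_register_removal : string;
  attribute equivalent_register_removal of d3_copy_a : signal is "no";
  attribute equivalent_register_removal of d3_copy_b : signal is "no";
begin
  -- Both copies load the same value on the same clock edge, so they are
  -- functionally identical but can be placed independently.
  process (clk)
  begin
    if rising_edge(clk) then
      d3_copy_a <= d3_next;
      d3_copy_b <= d3_next;
    end if;
  end process;

  d3_tem_a <= d3_copy_a;
  d3_tem_b <= d3_copy_b;
end architecture rtl;
```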

- Brian
From: Symon on
On 4/29/2010 5:09 PM, Ehsan wrote:
> Hi fellows,
>
> I have been working on a Virtex5 LX110 design using VHDL and ISE
> tools. My problem is that I cannot meet my timing constraints or the
> desired clock frequency. When I look at the timing report, I realized
> that the large delay is due to a signal with a large fanout.

Dear Ehsan,
When the tools find a net which they cannot route while meeting timing,
they give up trying to meet timing on all the other nets as well. Unless
your 'large fanout' net is the only net which is failing timing, the
problem may well lie with another net.
HTH., Syms.