very wide counter (42-bit) [FPGA]

Prev: Picoblaze bit file block ram remplacement
Next: spartan 3 and multiprocessor

From: Andy on 7 Dec 2009 11:34

I hope your comment on the declaration of WD is not what you really
wanted...

Also, en='0' disables the cnt increment, but not the prescaler (temp),
which will lead to problems if en is disabled at the wrong time or for
long enough.
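
To illustrate the enable issue, here is a small behavioral sketch in
Python (not the original VHDL, which is not reproduced in this thread).
It assumes cnt increments only on a cycle where both en and the
prescaler's terminal count are true; the function name `run` and the
`gate_prescaler` switch are hypothetical names for this illustration.

```python
def run(en_seq, prescale, gate_prescaler):
    """Return the final count after feeding one 'en' value per clock."""
    pre = 0   # prescaler register (called temp in the thread)
    cnt = 0   # main counter
    for en in en_seq:
        tick = (pre == prescale - 1)      # prescaler terminal count
        if en or not gate_prescaler:      # free-running unless gated by en
            pre = 0 if tick else pre + 1
        if en and tick:                   # cnt only moves on an enabled tick
            cnt += 1
    return cnt

# en drops for exactly the cycle in which the prescaler reaches its terminal count
en_seq = [1, 1, 1, 0, 1, 1, 1, 1, 1]
free_running = run(en_seq, 4, gate_prescaler=False)
gated = run(en_seq, 4, gate_prescaler=True)
print(free_running, gated)  # prints: 1 2
```

With 8 enabled cycles and a prescale of 4, the gated version counts 2,
while the free-running prescaler swallows one tick and counts only 1.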

Depending on how much latency you can tolerate (see the other posts
regarding register retiming/rebalancing), you may want to register the
output of the prescaler comparison, so that its logic path does not add
to the counter path.

Andy

From: kendor on 9 Dec 2009 07:40

Thank you all for your follow-ups!

In the comment I certainly meant prescaler, not divider ;)

I am using timespecs for the high and low times. ISE 11 manages to do its
job, but only if I increase its effort level, which leads to quite long
processing times (30+ minutes).
I believe adding a pipeline stage would be a good idea. However, I'm
processing 4*1024 multiplexed signals, and for each signal my algorithm
has 10 clock cycles to finish. (I switch between the individual incoming
signals, process each one, and then wait until the same signal is selected
again, around 100 us later.) Since I use the counter value right from the
beginning, I would need to increase the counter time at the moment I
switch to the new signal. At the moment the data path needs 8 of those 10
clock cycles, so there is not much margin for another pipeline stage
without adding stages throughout the whole algorithm (which works with
feedback and loops of different delays). So I'd prefer the easy way :)

I hadn't considered the "from : to" style timing constraint, since I did
not want to add 42 of them. But I'll give this a try.
Registering the prescaler comparison sounds good too.

Thanks!

---------------------------------------
This message was sent using the comp.arch.fpga web interface on
http://www.FPGARelated.com

From: Gabor on 9 Dec 2009 14:19

On Dec 9, 7:40 am, "kendor" <jonas.re...(a)bfh.ch> wrote:
> I didn't think of the "from : to style timing constraint" since I was not
> wanting to add 42 of those. But I'll give this a try.
> Registering the prescaler comparison sounds good too.

No need to add 42 constraints. You make a timing group
out of the counter bits. Then you have one constraint
from that group to itself, using the clock period multiplied
by the prescaler count as the delay. One good approach,
as mentioned, is to register the prescaler output to create
a single-cycle pulse at the prescale rate, and to write the
counter logic such that it only changes when that signal
is active (the "clock enable"). Then you can create
the timing group based on the clock enable signal and
perhaps catch some multicycle paths you didn't think of.
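
A rough behavioral sketch in Python of why the group-to-itself
constraint can be relaxed. It assumes a free-running prescaler whose
terminal-count comparison is registered into a one-cycle clock enable;
the function name and the 10 ns period (from the 100 MHz figure
mentioned elsewhere in the thread) are illustrative assumptions. The
counter register only loads on cycles where the enable is high, so
consecutive loads are always `prescale` clocks apart.

```python
def ce_update_cycles(prescale, cycles):
    """Return (final count, cycle indices at which the counter loads)."""
    pre = 0          # prescaler register
    ce_reg = False   # registered comparison: the clock enable pulse
    cnt = 0
    updates = []
    for t in range(cycles):
        if ce_reg:                         # counter changes only when enabled
            cnt += 1
            updates.append(t)
        ce_reg = (pre == prescale - 1)     # comparison result is registered
        pre = 0 if pre == prescale - 1 else pre + 1
    return cnt, updates

cnt, updates = ce_update_cycles(prescale=4, cycles=20)
gaps = [b - a for a, b in zip(updates, updates[1:])]
clock_period_ns = 10.0                     # assumed 100 MHz clock
budget_ns = min(gaps) * clock_period_ns    # delay usable in the FROM/TO constraint
print(updates, gaps, budget_ns)
```

Every gap equals the prescale value, which is the "clock multiplied by
the prescaler count" figure used as the delay.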

Regards,
Gabor

From: glen herrmannsfeldt on 11 Dec 2009 13:16

kendor <jonas.reber(a)bfh.ch> wrote:

> for a measuring utility (running @ 100MHZ) I need a counter of 42-bit width
> whose value is used by several sub blocks of my design. As a first, somehow
> dirty solution I have implemented this like follows. Since this approach
> needs quite a huge amount of FFs and leads to long delaytimes (bit 0 to 42)
> I am looking for an alternative. I was thinking about using Block RAM
> (Spartan3) to reduce routing effort and delaytimes. (see also
> http://courses.ece.illinois.edu/ece412/References/datasheets/xapp463.pdf)

Someone else suggested an LFSR, which seems like it might work.

It depends somewhat on what you do with the count later.

I was just thinking that you could cascade counters with a latch
between the carry out of one and the carry in of the next.
That causes the carry to occur one cycle late, which results in
a strange count sequence, but one that is fairly easy to correct
externally.
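
A small Python model of that idea, as a sketch only. It assumes two
cascaded stages with the low stage's carry out held in a register for
one clock before it reaches the high stage. The correction shown
(adding the pending carry back to the high part) is one possible
external fix, not necessarily the one intended here; all names are
hypothetical.

```python
def delayed_carry_counts(low_bits, cycles):
    """Return (raw, corrected) count sequences for a registered-carry cascade."""
    mask = (1 << low_bits) - 1
    low = high = carry_reg = 0
    raw, fixed = [], []
    for _ in range(cycles):
        raw.append((high << low_bits) | low)
        # external correction: account for the carry still waiting in the register
        fixed.append(((high + carry_reg) << low_bits) | low)
        next_carry = 1 if low == mask else 0   # carry out of the low stage
        high += carry_reg                      # high stage sees last cycle's carry
        low = (low + 1) & mask
        carry_reg = next_carry
    return raw, fixed

raw, fixed = delayed_carry_counts(low_bits=2, cycles=13)
print(raw)    # prints: [0, 1, 2, 3, 0, 5, 6, 7, 4, 9, 10, 11, 8]
print(fixed)  # prints: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
```

The raw value is wrong only in the single cycle after the low stage
wraps, which is exactly when the carry register is set.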

Though propagating the value to other subblocks seems likely to
take about as long as getting the carry through 42 bits. That might
require more pipeline registers throughout the design.

Otherwise, 50MHz or 25MHz should be easy. A one or two bit
counter at 100MHz with the appropriate logic to generate
and latch a carry signal should also work.

-- glen

From: Peter Alfke on 11 Dec 2009 17:50

On Dec 4, 10:15 am, "kendor" <jonas.re...(a)bfh.ch> wrote:
> hello there
>
> for a measuring utility (running @ 100MHZ) I need a counter of 42-bit width
> whose value is used by several sub blocks of my design.
> kendor

The conventional design of a synchronous counter would concatenate 42
flip-flops, using the built-in dedicated carry chain. Its carry
propagation delay is extremely short, but the total delay might be too
long for 100 MHz operation.
You can maintain the synchronous nature of the design, but decode an
additional count enable from the first 2 flip-flops and route that
signal to all the remaining 40 flip-flops in parallel. That gives the
long carry chain not 10 ns, but 40 ns to stabilize, which is more than
sufficient. And you still have a totally synchronous counter where all
bits change on the same clock.
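
A behavioral sketch in Python of this split (function and parameter
names are hypothetical): the low 2 bits count every clock, and the
upper bits load their incremented value only when the enable decoded
from those 2 bits is active. The model checks only the count sequence.
The 40 ns figure follows from the upper register loading once every 4
clocks at an assumed 10 ns clock period.

```python
def split_counter(cycles, low_bits=2, width=42):
    """Return the sequence of counter values seen on successive clocks."""
    low_mask = (1 << low_bits) - 1
    high_mask = (1 << (width - low_bits)) - 1
    low = high = 0
    values = []
    for _ in range(cycles):
        values.append((high << low_bits) | low)
        enable = (low == low_mask)          # decoded from the low flip-flops
        if enable:                          # upper bits load once per 2**low_bits clocks
            high = (high + 1) & high_mask
        low = (low + 1) & low_mask
    return values

values = split_counter(cycles=1000)
clock_period_ns = 10.0                                 # assumed 100 MHz clock
carry_budget_ns = (1 << 2) * clock_period_ns           # 4 clocks between upper loads
print(values[:9], carry_budget_ns)
```

All bits still change on the same clock edge, so the combined value
matches an ordinary binary counter.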

If you think that 42 flip-flops are too many, you can use BlockRAMs.
Each dual-ported 4K BlockRAM can implement an 8-bit counter per port,
easily concatenated to 16 bits per BRAM. (The two ports have the same
look-up functionality, just different addressing inputs, fed back from
their own outputs.) Two BlockRAMs can thus form a 32-bit fully
synchronous counter, and a third BRAM can extend that to 48 bits.
There is some trickery in gating the carry signals, but it never
involves more than one level of combinatorial logic, no problem at 100
MHz. And you can also of course always use a pre-scaler, as described
above.
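
One possible reading of the BlockRAM scheme, modeled in Python as a
sketch. The 512 x 8 organization, the use of the ninth address bit as
carry-in, and the way the carry is gated are assumptions made for this
illustration; the actual carry "trickery" is not spelled out here.
Each port is treated as a registered lookup table whose output is fed
back to its own address.

```python
def make_rom():
    """512 x 8 lookup: address bit 8 is carry-in, bits 7..0 the current count."""
    return [((addr & 0xFF) + (addr >> 8)) & 0xFF for addr in range(512)]

def bram_counter(cycles):
    """Return the 16-bit values produced by two 8-bit lookup ports."""
    rom = make_rom()
    q_lo = q_hi = 0          # registered port outputs, fed back as addresses
    values = []
    for _ in range(cycles):
        values.append((q_hi << 8) | q_lo)
        cin_hi = 1 if q_lo == 0xFF else 0      # carry into the upper port
        # both ports read on the same clock edge, using the old outputs
        q_lo, q_hi = rom[(1 << 8) | q_lo], rom[(cin_hi << 8) | q_hi]
    return values

values = bram_counter(cycles=70000)
print(values[254:259])  # prints: [254, 255, 256, 257, 258]
```

The same table contents serve both ports, in line with the remark that
the two ports share the look-up functionality and differ only in their
addressing inputs.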

Now, if you use more modern FPGAs, like Spartan-3A DSP, or Spartan-6, or
Virtex-4, -5, or -6, then you can use the ready-made 48-bit accumulator
(an accumulator that adds 1 per clock tick is a counter) without any
design effort at all, and at a speed of up to 500 MHz.

Old FPGA families may sometimes look cheaper, but that may be
deceptive.
Would you today buy a car with drum brakes, no fuel injection, no CD
player, no airbags and no air conditioning?

Peter Alfke
