From: John_H on
On Mar 26, 11:39 pm, Randy Yates <ya...(a)ieee.org> wrote:
> Hi,
>
> I'm looking for a device that will perform something on
> the order of hundreds of millions of 12x12 multiplies
> per second, and I need it small. I only need about
> 30-40 pins of I/O.
>
> Is the Xilinx CoolRunner series completely out of the
> picture? Any suggestions would be appreciated.
> --
> Randy Yates                      % "Bird, on the wing,
> Digital Signal Labs              %   goes floating by
> mailto://ya...(a)ieee.org          %   but there's a teardrop in his eye..."
> http://www.digitalsignallabs.com % 'One Summer Dream', *Face The Music*, ELO

How do you expect to get hundreds of millions of operands onto and off
the chip through 30-40 IO?

Depending on the operations you're doing, some tricks might be
available to you but your best bet is probably to get the smallest
FPGA-with-multipliers to do your dirty work. But to fully scope out
what part is needed you need to figure out not only I/O bandwidth and
multiplier throughput but also what you intend to do with all those
operands. Are you storing operands, constants, or results?

The CoolRunner most probably won't do you any good. It has no
embedded multipliers and no storage beyond the macrocells (at least
per my recollection).
From: Randy Yates on
John_H <newsgroup(a)johnhandwork.com> writes:

> On Mar 26, 11:39 pm, Randy Yates <ya...(a)ieee.org> wrote:
>> Hi,
>>
>> I'm looking for a device that will perform something on
>> the order of hundreds of millions of 12x12 multiplies
>> per second, and I need it small. I only need about
>> 30-40 pins of I/O.
>>
>> Is the Xilinx CoolRunner series completely out of the
>> picture? Any suggestions would be appreciated.
>> --
>> Randy Yates                      % "Bird, on the wing,
>> Digital Signal Labs              %   goes floating by
>> mailto://ya...(a)ieee.org          %   but there's a teardrop in his eye..."
>> http://www.digitalsignallabs.com % 'One Summer Dream', *Face The Music*, ELO
>
Hi John,

> How do you expect to get hundreds of millions of operands onto and off
> the chip through 30-40 IO?

It's a SOQPSK modulator, so the input data is 2 bits per baud. OK,
that takes 2 bits. The output is 14-bit I/Q. OK, that's a total of
30 bits. Add a few for control. Done.

> Depending on the operations you're doing, some tricks might be
> available to you but your best bet is probably to get the smallest
> FPGA-with-multipliers to do your dirty work. But to fully scope out
> what part is needed you need to figure out not only I/O bandwidth and
> multiplier throughput but also what you intend to do with all those
> operands. Are you storing operands, constants, or results?

It's basically a filter, with perhaps some FEC on top. I don't know
myself, and yes, I'm being way too premature.

> The coolrunner most probably won't do you any good. There are no
> embedded multipliers and no storage beyond the macrocells (at least
> per my recollection).

That was my feeling too, but I wanted to get a more professional
opinion.
--
Randy Yates % "Watching all the days go by...
Digital Signal Labs % Who are you and who am I?"
mailto://yates(a)ieee.org % 'Mission (A World Record)',
http://www.digitalsignallabs.com % *A New World Record*, ELO
From: Symon on
On 3/27/2010 2:07 PM, Randy Yates wrote:
>
>> The coolrunner most probably won't do you any good. There are no
>> embedded multipliers and no storage beyond the macrocells (at least
>> per my recollection).
>
> That was my feeling too, but I wanted to get a more professional
> opinion.

Hi Randy,

I don't have much experience of CPLDs, but you may well be able to get
the multiplier performance you need. Even though a CPLD probably won't
have dedicated multipliers, there's more than one way to skin a cat.
Check out distributed arithmetic solutions.

http://www.andraka.com/distribu.htm
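
For readers who haven't met the technique, here is a minimal sketch of the distributed-arithmetic idea, written in Python rather than HDL. It computes a dot product of constant coefficients with signed 12-bit samples using only table lookups, shifts and adds. The coefficient values and the names `build_da_lut` and `da_dot` are made up for illustration; they don't come from the linked page, and real hardware would step through the bit planes serially, one lookup per clock.

```python
def build_da_lut(coeffs):
    """LUT[addr] = sum of coeffs[k] for every k whose bit is set in addr."""
    n = len(coeffs)
    return [sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
            for addr in range(1 << n)]


def da_dot(samples, lut, bits=12):
    """Dot product of signed `bits`-wide samples with the LUT's coefficients.

    One table lookup per bit plane; no multiplier is used.
    """
    acc = 0
    for b in range(bits):
        # Gather bit b of every sample into a single LUT address.
        addr = 0
        for k, x in enumerate(samples):
            addr |= ((x >> b) & 1) << k
        term = lut[addr] << b
        # In two's complement the MSB carries weight -2**(bits - 1).
        acc += -term if b == bits - 1 else term
    return acc


coeffs = [3, -7, 12, 5]            # hypothetical 4-tap filter constants
samples = [100, -2048, 2047, -1]   # signed 12-bit inputs
lut = build_da_lut(coeffs)
assert da_dot(samples, lut) == sum(c * x for c, x in zip(coeffs, samples))
```

The table grows as 2^taps, so longer filters are normally split into several small tables whose outputs are added together.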

Also, you mention FEC. This might use up a fair chunk of hardware, but
again you can serialise it, if timing permits.

As for your size requirements, I thought the smallest CoolRunner was
6x6 mm = 36 mm², too big for your spec.

Cheers, Syms.

From: John_H on
On Mar 27, 10:07 am, Randy Yates <ya...(a)ieee.org> wrote:
>
> > How do you expect to get hundreds of millions of operands onto and off
> > the chip through 30-40 IO?
>
> It's a SOQPSK modulator, so the input data is 2 bits per baud. OK,
> that takes 2 bits. The output is 14-bit I/Q. OK, that's a total of
> 30 bits. Add a few for control. Done.
>
<snip>
>
> It's basically a filter, with perhaps some FEC on top. I don't know
> myself, and yes, I'm being way too premature.

I'd recommend you prototype with an FPGA. Whether filters are the
right approach, or simple lookups into an I/Q "response" table indexed
by the last few symbols of the sequence, you'll be able to trade off
complexity against results. If you want to do a receiver, you have work
to do. If you have a transmitter only, I'd think the implementation
could be much simpler than you're imagining. My quick glance at
SOQPSK with the various weightings suggests that the inter-symbol
interference from shaping the bits at the phase or frequency level is
very limited. Perhaps my glance seriously oversimplifies the issue.
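
To make the lookup idea concrete, here is a rough Python sketch under loudly stated assumptions: a shaping pulse that spans only `SPAN` symbols, `SPS` output samples per symbol, and a ternary symbol alphabet. The pulse taps and every name below are invented for illustration and are not the SOQPSK pulse; the sketch produces one output rail, and a second table of the same shape would serve the other. The point is only that when the memory is short, every possible output sample can be precomputed and the multiplies disappear.

```python
from itertools import product

SPS = 4     # output samples per symbol (assumed)
SPAN = 3    # symbols of memory in the shaping pulse (assumed)
# Hypothetical integer-scaled pulse with SPAN * SPS taps.
PULSE = [1, 3, 6, 10, 14, 17, 17, 14, 10, 6, 3, 1]


def filter_output(history, phase):
    """Direct form: one output sample from the last SPAN symbols, newest first."""
    return sum(sym * PULSE[j * SPS + phase] for j, sym in enumerate(history))


# Precompute every possible sample: 3**SPAN symbol histories x SPS phases.
TABLE = {
    (history, phase): filter_output(history, phase)
    for history in product((-1, 0, 1), repeat=SPAN)
    for phase in range(SPS)
}


def modulate(symbols):
    """Shape a ternary symbol stream using table lookups only."""
    history = (0,) * SPAN
    out = []
    for sym in symbols:
        history = (sym,) + history[:-1]   # shift in the newest symbol
        out.extend(TABLE[(history, phase)] for phase in range(SPS))
    return out


print(len(TABLE))              # number of stored samples
print(modulate([1, -1, 0, 1]))
```

The table size is 3**SPAN * SPS entries, so this only stays small while the pulse memory is a few symbols long.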

If you pursued a lookup approach, something as cheap as the MAX II
might even give you the performance you need because it's LE-based
rather than product-term-based and can get you down to 25 mm^2. Some
Actel offerings might get you to a similar size/performance without
multipliers.

If your issues are centered on cost, the Spartan-3A(N) might be a good
offering with the smaller XC3S50A(N), though even the smallest packages
are huge by your square-millimeter standard.

From: John Adair on
Randy

I presume you mean physically small, judging from some of the other
messages in the thread. A part to look at is the XC3S500E in
the CPG132 package. You can see an example of this on our Craignell1
product http://www.enterpoint.co.uk/component_replacements/craignell.html.
An XC3S500E has 20 multipliers that will run in the 100-200 MHz range.
The XC3SD3400A is also physically small in the CSG484 package. You can
see that on our monster Merrick1 and it has 126 multipliers available.
Going up the cost scale, there are also some small Virtex parts.
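
As a back-of-envelope check on how many hardware multipliers a given throughput needs, here is a small Python sketch. It assumes each multiplier completes one product per clock, which is an idealisation, and it pins "hundreds of millions" of multiplies per second to 300 million purely for illustration; the function name is made up.

```python
def multipliers_needed(multiplies_per_second, clock_hz):
    """Multipliers required if each one finishes one product per clock cycle."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return (multiplies_per_second + clock_hz - 1) // clock_hz


REQUIRED_RATE = 300_000_000   # illustrative stand-in for "hundreds of millions"

for clock_hz in (100_000_000, 200_000_000):
    count = multipliers_needed(REQUIRED_RATE, clock_hz)
    print(f"{clock_hz // 1_000_000} MHz clock: {count} multiplier(s)")
```

On those assumptions only a handful of multipliers is needed, so the choice between the parts above is likely to be driven more by package size than by multiplier count.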

Spartan-6 is new and only certain parts like the XC6SLX16 are
shipping. The XC6SLX16 in the CSG324 package that we use in our Drigmorn3
http://www.enterpoint.co.uk/drigmorn/drigmorn3.html is a very nice
size and will give you 32 multipliers to use. The XC6SLX45 in the same
package will give 58 multipliers. The XC6SLX150 in the CSG484 package
isn't far off being available and will offer 180 multipliers.

There are many other solutions from Xilinx and other vendors but I
would need to know more of the application to be more accurate.

John Adair
Enterpoint Ltd.- Home of FPGA HPC solutions.


On 27 Mar, 03:39, Randy Yates <ya...(a)ieee.org> wrote:
> Hi,
>
> I'm looking for a device that will perform something on
> the order of hundreds of millions of 12x12 multiplies
> per second, and I need it small. I only need about
> 30-40 pins of I/O.
>
> Is the Xilinx CoolRunner series completely out of the
> picture? Any suggestions would be appreciated.
> --
> Randy Yates                      % "Bird, on the wing,
> Digital Signal Labs              %   goes floating by
> mailto://ya...(a)ieee.org          %   but there's a teardrop in his eye..."
> http://www.digitalsignallabs.com % 'One Summer Dream', *Face The Music*, ELO