From: Nial Stewart on
> OK, good idea. Thanks, will try.
> Not sure what you mean by top/bottom, but I presume it is something like
> having one long 1-bit-wide, 128-bit SR with outputs at 0, 16, 32, etc., while the parallel 128-bit word has
> its bits reordered so that each shift produces the next 16-bit word to come out. If I'm missing
> something, let me know.

Something like this, with a load value....

signal shift_reg : std_logic_vector(127 downto 0);
signal output : std_logic_vector(15 downto 0);
signal load_value : std_logic_vector(127 downto 0);


:
:
:

process(clk,rst)
begin
  if(rst = '1') then
    shift_reg <= (others => '0');
    output    <= (others => '0');
  elsif(rising_edge(clk)) then
    if(load = '1') then
      -- parallel load of the whole 128-bit word
      shift_reg <= load_value;
    else
      -- shift up by one 16-bit word per clock
      shift_reg(127 downto 112) <= shift_reg(111 downto 96);
      shift_reg(111 downto 96)  <= shift_reg(95 downto 80);
      shift_reg(95 downto 80)   <= shift_reg(79 downto 64);
      shift_reg(79 downto 64)   <= shift_reg(63 downto 48);
      :
      :
      shift_reg(31 downto 16)   <= shift_reg(15 downto 0);

    end if;

    -- registered output: the top word as it was before this clock edge
    output <= shift_reg(127 downto 112);

  end if;
end process;
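
For reference, the word-by-word assignments above can also be written as a loop, which avoids typing out each slice by hand. This is only a minimal sketch: the entity name, the generics WORD_W and N_WORDS, and the port list are assumptions added to make it self-contained, and they are not part of the original snippet.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Sketch only: entity name, generics and port list are assumptions.
entity word_shift_reg is
  generic (
    WORD_W  : positive := 16;  -- width of each output word
    N_WORDS : positive := 8    -- 8 x 16 = 128 bits in total
  );
  port (
    clk        : in  std_logic;
    rst        : in  std_logic;
    load       : in  std_logic;
    load_value : in  std_logic_vector(WORD_W*N_WORDS - 1 downto 0);
    output     : out std_logic_vector(WORD_W - 1 downto 0)
  );
end entity word_shift_reg;

architecture rtl of word_shift_reg is
  signal shift_reg : std_logic_vector(WORD_W*N_WORDS - 1 downto 0);
begin
  process (clk, rst)
  begin
    if rst = '1' then
      shift_reg <= (others => '0');
      output    <= (others => '0');
    elsif rising_edge(clk) then
      if load = '1' then
        shift_reg <= load_value;
      else
        -- Move every word up by one word position. The lowest word is
        -- not assigned here, so it keeps its value, as in the
        -- hand-written version.
        for i in N_WORDS - 1 downto 1 loop
          shift_reg(WORD_W*(i+1) - 1 downto WORD_W*i) <=
            shift_reg(WORD_W*i - 1 downto WORD_W*(i-1));
        end loop;
      end if;
      -- Registered output: the top word as it was before this clock edge.
      output <= shift_reg(shift_reg'high downto shift_reg'high - WORD_W + 1);
    end if;
  end process;
end architecture rtl;
```

The loop is unrolled by synthesis, so it should produce the same word-wide register chain as the explicit assignments; it just makes the word width and count easy to change.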


Nial


From: LC on
Thanks for the clarification.

Yes, now I've tested both: the 1-bit-wide, 128-bit SR with bit
reordering on both sides (which only rearranges the bit order, so it
should not cost any precious time),
and the SR in 16-bit chunks that you suggested.

Both gave identical results (as expected; after all, they are not too
different if we think of the data path delay).
Both were indeed a bit faster than my previous counter/mux approach.

Now I'm closer to 400 MHz...

I believe that is about as much as I can do with this technology.

Again, thanks,
Luis C.


From: Symon on
On 6/15/2010 12:57 PM, LC wrote:
> Symon wrote:
>> On 6/14/2010 1:45 PM, LC wrote:
>>>
>>> Should I expect this to be about the upper limit of what I can do?
>>> Is there any clever design of this front end that would allow a higher speed?
>>>
>> Does XAPP265 give you any architectural hints that you can use in your
>> Altera part?
>> HTH., Syms.
>
> Thanks, Symon,
> Indeed, this reading suggests some variations that I'll try.
> Thanks.
>
> Luis C.

Hi Luis,
You might want to pay particular attention to the DDR registers in the
IOBs. I expect your Altera part has the same features, but I dunno for
sure. The registers mean that your internal logic can run at half the
speed of the external signals. Which is nice.
HTH, Syms.


From: rickman on
On Jun 16, 9:31 am, Symon <symon_bre...(a)hotmail.com> wrote:
> Hi Luis,
> You might want to pay particular attention to the DDR registers in the
> IOBs. I expect your Altera part has the same features, but I dunno for
> sure. The registers mean that your internal logic can run at half the
> speed of the external signals. Which is nice.
> HTH, Syms.

That's what I would suggest. By using the DDR registers, the data
stream can be split into odd/even words with parallel paths. Then
each stream would only need to run at half the rate seen on the I/O pins.
Since you already have the 500 MHz clock you can just divide that by
two to generate two enables, one for the odd and one for the even data
streams. I've never used the DDR registers. You probably want to
look closely at the example code that Altera provides.
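
The two-enable idea can be sketched in plain fabric registers. This is only an illustration of the divide-by-two enable scheme, not of the DDR I/O registers themselves (those are vendor primitives and are not shown). The entity name, the port names, the WIDTH generic and the assumption that the full-rate stream is being split on the way in are all hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Sketch only: entity, port and signal names are assumptions.
entity odd_even_split is
  generic (WIDTH : positive := 16);
  port (
    clk      : in  std_logic;  -- full-rate clock
    rst      : in  std_logic;
    din      : in  std_logic_vector(WIDTH - 1 downto 0);
    even_out : out std_logic_vector(WIDTH - 1 downto 0);
    odd_out  : out std_logic_vector(WIDTH - 1 downto 0)
  );
end entity odd_even_split;

architecture rtl of odd_even_split is
  signal phase : std_logic;  -- toggles every clock: the divide-by-two
begin
  process (clk, rst)
  begin
    if rst = '1' then
      phase    <= '0';
      even_out <= (others => '0');
      odd_out  <= (others => '0');
    elsif rising_edge(clk) then
      phase <= not phase;
      if phase = '0' then
        even_out <= din;  -- "even" enable: captures words 0, 2, 4, ...
      else
        odd_out  <= din;  -- "odd" enable: captures words 1, 3, 5, ...
      end if;
    end if;
  end process;
end architecture rtl;
```

Each output register changes only on every second clock, so the logic behind it has two clock periods to settle, provided the timing constraints are written to reflect that.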

Rick


From: LC on
Many thanks, folks.
Very good tips.

Thanks,
Luis C.