From: Philip Pemberton on
Hi,
I'm trying to tidy up the "sdram_wb" SDRAM controller IP core -- and my
first target is its excessive use of device resources in the cache
control logic. Specifically, it's setting up a 32x32 multi-port
read/write RAM which Xst doesn't recognise, and which thus gets
implemented as 1024 separate flip-flops. This is rather annoying, as it's
eating up slices that I need for other things (like, say, the LCD
controller).

Platform is a Xilinx Spartan-3A XC3S700A on an Enterpoint Drigmorn2
development board. Software is Xilinx ISE 12.1 on Ubuntu 10.04 x86_64.

I've reduced the logic down so that it "only" needs two write ports and
two read ports. My problem now is that Xst refuses to synthesize it; I
get this in the error log:

Synthesizing Unit <sdram_wb_cacheline_ram>.
    Related source file is "sdram_wb_cacheline_ram.v".
WARNING:Xst:647 - Input <wr0_mask> is never used. This port will be
preserved and left unconnected if it belongs to a top-level block or it
belongs to a sub-block and the hierarchy of this sub-block is preserved.
    Found 32x32-bit quad-port RAM <Mram_cacheline_data> for signal
<cacheline_data>.
    Summary:
        inferred   1 RAM(s).
Unit <sdram_wb_cacheline_ram> synthesized.

[...]

Synthesizing (advanced) Unit <sdram_wb_cacheline_ram>.
ERROR:Xst - You are apparently trying to describe a RAM with several
write ports for signal <Mram_cacheline_data>. This RAM cannot be
implemented using distributed resources.

What am I doing wrong here? My code is an almost like-for-like copy of
Xilinx's examples (from the Xst manual), except that I've set it up for
asynchronous reads. Even making the reads synchronous (by converting the
continuous assignments into nonblocking assignments in the respective
always @ blocks) doesn't appease the demon that is Xst...

I really can't think of anything else to try...

Here's the code I'm using:

module sdram_wb_cacheline_ram #(
  // This is the width of the SDRAM in bits.
  parameter SDRAM_DAT_BITS = 32,
  // The number of cacheline words
  parameter CACHELINE_WORDS = 32)
(
  input                               clk_i,              // Clock input
  input  [log2(CACHELINE_WORDS)-1:0]  wr0_adr, wr1_adr,   // Write port 0 and 1 address
  input  [SDRAM_DAT_BITS-1:0]         wr0_dat, wr1_dat,   // Write port 0 and 1 data
  input  [(SDRAM_DAT_BITS/8)-1:0]     wr0_mask,           // Mask bits for write port 0
  input                               we0, we1,           // Write enable 0 and 1

  input  [log2(CACHELINE_WORDS)-1:0]  rd0_adr, rd1_adr,   // Read port 0 and 1 address
  output [SDRAM_DAT_BITS-1:0]         rd0_dat, rd1_dat    // Read port 0 and 1 data
);

function integer log2;
  input [31:0] value;
  for (log2=-1; value>0; log2=log2+1)
    value = value>>1;
endfunction

// Cacheline memory array
reg [SDRAM_DAT_BITS-1:0] cacheline_data [CACHELINE_WORDS-1:0];

// Dual-in/dual-out RAM with asynchronous read
always @(posedge clk_i) begin
  if (we0) begin
    // TODO: masking
    cacheline_data[wr0_adr] <= wr0_dat;
  end
end

always @(posedge clk_i) begin
  if (we1) begin
    // TODO: masking
    cacheline_data[wr1_adr] <= wr1_dat;
  end
end

assign rd0_dat = cacheline_data[rd0_adr];
assign rd1_dat = cacheline_data[rd1_adr];

endmodule


Thanks,
--
Phil.
usenet10(a)philpem.me.uk
http://www.philpem.me.uk/
If mail bounces, replace "10" with the last two digits of the current year
From: John McCaskill on
On May 19, 12:15 pm, Philip Pemberton <usene...(a)philpem.me.uk> wrote:
> [...]

There are two main problems with this.

First, the block RAMs are synchronous. If the reads are not
synchronous, the array cannot be mapped to a block RAM.

Second, the block RAMs have only one address per port, shared by reads
and writes on that port. Your code uses separate read and write
addresses, four independent addresses in total, which is more than the
two ports of a block RAM provide.

What you are describing in this code does not match what the block
RAMs are capable of doing.
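
For comparison, here is a minimal sketch of a description that does fit
those two constraints: two ports, each with a single address used for both
its read and its write, and registered read data. The module name, port
names and parameters are made up for illustration and are not taken from
sdram_wb. Whether Xst actually maps it onto a block RAM depends on the Xst
version and synthesis settings, so treat it as a starting point and check
the synthesis report.

```verilog
// Hypothetical example: true dual-port RAM, one address per port,
// synchronous (registered) reads. Names are illustrative only.
module tdp_ram_sketch #(
  parameter DATA_BITS = 32,
  parameter ADDR_BITS = 5               // 2**5 = 32 words
) (
  input                       clk_i,    // Single clock for both ports

  // Port A: a_adr is used for both the write and the read
  input                       a_we,
  input      [ADDR_BITS-1:0]  a_adr,
  input      [DATA_BITS-1:0]  a_wdat,
  output reg [DATA_BITS-1:0]  a_rdat,

  // Port B: b_adr is used for both the write and the read
  input                       b_we,
  input      [ADDR_BITS-1:0]  b_adr,
  input      [DATA_BITS-1:0]  b_wdat,
  output reg [DATA_BITS-1:0]  b_rdat
);

  reg [DATA_BITS-1:0] mem [(1<<ADDR_BITS)-1:0];

  // Port A: write when enabled, and always register the read data.
  // With nonblocking assignments the read returns the old contents
  // when the same address is written in the same cycle (read-first).
  always @(posedge clk_i) begin
    if (a_we)
      mem[a_adr] <= a_wdat;
    a_rdat <= mem[a_adr];
  end

  // Port B: same structure as port A
  always @(posedge clk_i) begin
    if (b_we)
      mem[b_adr] <= b_wdat;
    b_rdat <= mem[b_adr];
  end

endmodule
```

Note that this gives two accesses per clock in total, not two writes plus
two independent reads, and the read data arrives one clock after the
address is presented.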

Regards,

John McCaskill
www.FasterTechnology.com