Optimisation of RAM - vhdl

I'm currently debugging my DPRAM. As usual, simulation works perfectly but in real life it fails. The syntax is as such:
ram[Address][Data]
I can get the data to write to the first 8 addresses, but any more and the data is just lost (even in ChipScope). As a workaround, I stitched two DPRAMs together with a simple logical switch to re-route the data to the second DPRAM after it hit the eighth address. This worked, but it looks messy.
My thinking is that the RAM is being optimised away. Even if it isn't, this will be a good learning exercise anyway (and any thoughts on this are welcome).
Here is the RAM declaration in the DPRAM, with my attempt at stopping it from being optimised away:
type ram_array is array(16 downto 0) of std_logic_vector(31 downto 0);
shared variable ram: ram_array;
attribute KEEP: string;
attribute KEEP of ram_array : type is "TRUE";
I think I also need to add a line to the UCF file, though I can't seem to get the syntax right (with the entity name path obviously changed):
NET "entity/name/path/dpram/ram_array" KEEP ="TRUE";
So is this how I'd add the code if I wanted to stop optimisation?
EDIT:
Output (guess there wasn't optimisation):
Found 17x32-bit dual-port RAM <Mram_ram> for signal <ram>.
Summary:
inferred 1 RAM(s).
inferred 65 D-type flip-flop(s).
Unit <dpram> synthesized.
Thanks =)

Don't use shared variable for synthesizable code
If your code needs to be synthesizable, then don't use a shared variable.
Inferring BlockRAM
If you need sample code to implement your memory without using a shared variable, use the documentation provided for your FPGA.
Sample doc links :
(xilinx) http://www.xilinx.com/support/documentation/sw_manuals/xilinx12_2/xst_v6s6.pdf
(altera) http://www.altera.com/literature/hb/qts/qts_qii51007.pdf
Hope this helps.
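As a minimal sketch of the kind of template those documents describe, the memory can be held in a plain signal and accessed from a clocked process. The entity name, port names, and generics below are assumptions for illustration and are not taken from the question; the depth is a power of two here, whereas the question's array has 17 entries. Whether a given tool maps this to block RAM or distributed RAM depends on the device and tool settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Simple dual-port RAM: port A writes, port B reads.
-- All names here are illustrative assumptions.
entity dpram_sketch is
  generic (
    ADDR_WIDTH : positive := 4;   -- 2**4 = 16 words
    DATA_WIDTH : positive := 32
  );
  port (
    clk    : in  std_logic;
    we_a   : in  std_logic;
    addr_a : in  std_logic_vector(ADDR_WIDTH - 1 downto 0);
    din_a  : in  std_logic_vector(DATA_WIDTH - 1 downto 0);
    addr_b : in  std_logic_vector(ADDR_WIDTH - 1 downto 0);
    dout_b : out std_logic_vector(DATA_WIDTH - 1 downto 0)
  );
end entity dpram_sketch;

architecture rtl of dpram_sketch is
  type ram_array is array (0 to 2**ADDR_WIDTH - 1)
    of std_logic_vector(DATA_WIDTH - 1 downto 0);
  -- A plain signal is used instead of a shared variable.
  signal ram : ram_array := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Synchronous write on port A.
      if we_a = '1' then
        ram(to_integer(unsigned(addr_a))) <= din_a;
      end if;
      -- Synchronous (registered) read on port B. Because ram is a
      -- signal, a read of the address being written returns the old data.
      dout_b <= ram(to_integer(unsigned(addr_b)));
    end if;
  end process;
end architecture rtl;
```

If the synthesis report shows the RAM being inferred with the expected depth and width, as in the question's edit, the memory itself is unlikely to be the part that was optimised away.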


How to drive the DDS Compiler IP core from Xilinx

I completed Anton Potočnik's introductory guide to the Red Pitaya board and I am now able to send commands from the Linux machine running on the SoC to its FPGA logic.
I would like to further modify the project so that I can control the phase of the signal that is being transmitted via the Red Pitaya's DAC. Some pins (from 7 down to 1) of the first GPIO port were still unused, so I started setting them from within the OS and used the Red Pitaya's LEDs to confirm that they were being set without interfering with the functionality of Anton Potočnik's "high bandwidth averager".
I then set the DDS compiler's Phase Offset Programmability to "streaming" mode so that it can be configured on the fly using the bits that are currently controlling the Red Pitaya's LEDs. I used some slices to connect my signals to the AXI4-Stream Constant IP core, which in turn drives the DDS compiler.
Unfortunately the DAC is just giving me a constant output of 500 mV.
I created a new project with a testbench for the DDS compiler, because synthesis takes a long time and doesn't give me much insight into what is happening.
Unfortunately all the output signals of the DDS compiler are undefined.
My question:
What am I doing wrong, and how can I proceed to control the DAC's phase?
EDIT1; here is my test bench
The IP core is configured as follows, so many of the control signals that I provided should not be required:
EDIT2; I changed declarations of the form m_axis_data_tready => '0' to m_axis_phase_tready => m_axis_phase_tready_signal. I also took a look at the wrapper file called dds_compiler_0.vhd and saw that it treats both m_axis_phase_tready and m_axis_data_tready as inputs.
My simulation results remained unchanged...
My new test bench can be found here.
EDIT3: Vivado was just giving me the old simulation results. Creating a new testbench, deleting the file <project_name>.sim/sim_1/behav/xsim/simulate.log and restarting Vivado solved this problem.
I noticed that the wrapper file (dds_compiler_0.vhd) only has five ports:
aclk (in)
s_axis_phase_tvalid (in)
s_axis_phase_tdata (in)
m_axis_data_tvalid (out)
and m_axis_data_tdata (out)
So I removed all the unnecessary control signals and got a new simulation result, but I am still not receiving any useful output from the dds_compiler:
The corresponding testbench can be found here.
I also don't get any valid output when I include the control signals.
The corresponding testbench can be found here.
Looks like m_axis_data_tready is not connected. No data will come out unless that's asserted.

VHDL ATTRIBUTE keep

I am currently studying an SR latch in VHDL, and I have come to a part which I don't understand.
Can anyone explain what ATTRIBUTE keep: boolean means and what it does in VHDL?
Thank you.
Warning heavy Xilinx bias...
The attributes available in VHDL differ between tools and even change between versions of the same tool. The "keep" attribute for Xilinx is used to ensure that the signal is not optimized away during the Vivado synthesis process. It has been renamed recently to "syn_keep" to avoid confusion. I've used similar attributes before to fix build issues in which the tools made false assumptions.
NOTE: To avoid optimization during implementation for Xilinx, use "dont_touch".
Example:
A clock coming into the FPGA needed to be buffered through the Xilinx BUFG, but I also needed the raw signal for a specific IP core. So I split the route, buffered the clock, and fed the raw clock signal to the IP. The Vivado 2016.4 tool optimized out the unbuffered route, creating timing-constraint critical warnings and misbehavior on the hardware. The issue was found by tracing through the synthesized design schematic, observing the proper routing, and then viewing the implemented design schematic and seeing that the route had been altered. I fixed this by adding the dont_touch attribute to the unbuffered signal.
attribute dont_touch : boolean;
attribute clock_signal : string;
attribute dont_touch of clk_in : signal is true;
attribute clock_signal of clk_in : signal is "yes";
...
CLK_BUFG: component BUFG
port map (
I => clk_in,
O => buf_clk_in
);
It is a user-defined attribute, and thus not part of the VHDL standard itself. It is typically used to instruct the synthesis tool to keep a certain signal, for example a flip-flop, even though the synthesis tool may determine that the signal can be removed during optimization.
For Altera Quartus synthesis tool, see this description: keep VHDL Synthesis Attribute
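As a minimal sketch of how such an attribute is declared and applied, here is a gated SR latch with keep attached to its internal signals. The entity and signal names are assumptions for illustration, and whether a tool expects the attribute as a boolean or as a string ("true") varies, so check the manual of the synthesis tool you use.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Gated SR latch built from cross-coupled NOR gates.
-- All names here are illustrative assumptions.
entity sr_latch_sketch is
  port (
    clk, r, s : in  std_logic;
    q         : out std_logic
  );
end entity sr_latch_sketch;

architecture structural of sr_latch_sketch is
  signal r_g, s_g, qa, qb : std_logic;

  -- First declare the attribute (its name and type) ...
  attribute keep : boolean;
  -- ... then attach it to the signals that should be preserved.
  attribute keep of r_g, s_g, qa, qb : signal is true;
begin
  r_g <= r and clk;          -- gated reset
  s_g <= s and clk;          -- gated set
  qa  <= not (r_g or qb);    -- cross-coupled NOR pair
  qb  <= not (s_g or qa);
  q   <= qa;
end architecture structural;
```

Without the attribute, a synthesis tool is free to merge these intermediate signals into a single logic function; the attribute asks it to keep them as separate nets.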

iCEstick + yosys - using the Global Set/Reset (GSR)

This is probably more of an iCEstick question than a yosys one, but asking here since I'm using the Icestorm tool chain.
I want to specify startup behavior of my design, which various places on the internet seem to agree is related to the typically named rst signal. It wasn't obvious to me where such a signal comes from, so I dug into the powerup sequence. Current understanding is from Figure 2 in this document.
After CDONE is pulled high by the device, all of the internal registers have been reset, to some initial value. Now, I've found plenty of lattice documents about how each type of flip-flop or hard IP receives a reset signal and does something with its internal state, but I still don't quite understand how I specify what those states are (or even just know what they are so I can use them).
For example, if I wanted to bring an LED high for 1 second after powerup (and only after powerup) I would want to start a counter after this reset signal (whatever it is) disables.
Poking around the ice40 family data sheet and the Lattice site, I found this document about using the Global Set/Reset signal. I confirmed this GSR is mentioned in the family data sheet, referenced on page 2-3 under "Clock/Control Distribution Network". It seems that a global reset signal is usable by one of the global buffers GBUF[0-7] and can be routed (up to 4 of them) to all LUTs with the global/high-fanout distribution network.
This seems like exactly what I was after, but I can't find any other info about how to use this in my designs. The document on using the GSR states that you can instantiate a native GSR component like this:
GSR GSR_INST (.GSR (<global reset sig>));
but I can't tell whether this is just for simulation. Am I going in completely the wrong direction here, or just missing something? I'm very inexperienced with FPGAs and hardware, so it's entirely possible my whole approach is flawed.
I'm not sure if that GSR document actually is about iCE40. The Lattice iCEcube tool interestingly accepts instances of GSR cells, but it seems to simply treat them as constant zero drivers. There is also no simulation model for the GSR cell type in the iCE40 sim library and no description of it in the iCE40 tech library documentation provided by Lattice.
Furthermore, I have built the following two designs with the Lattice tools, and apart from the timestamp in the "comment field" of the generated bit-stream file, the generated bit-streams are identical! (This test was performed with Lattice LSE as the synthesis tool, not Synplify. I had problems getting Synplify to run on my machine for some reason and gave up trying to do so over a year ago.)
This is the first test design I've used:
module top (
input clk,
output rst,
output reg val
);
always @(posedge clk, posedge rst)
if (rst)
val = 1;
else
val = 0;
GSR GSR_INST (.GSR (rst));
endmodule
And this is the second test design:
module top (
input clk,
output rst,
output val
);
assign val = 0, rst = 0;
endmodule
Given these results, I think it is safe to say that the Lattice tools simply ignore GSR cells in iCE40 designs. (Maybe for compatibility with their other FPGA families?)
So how does one generate a rst signal then? For example, the following is a simple reset generator that asserts (pulls low) resetn for the first 15 cycles:
input clk;
...
wire resetn;
reg [3:0] rststate = 0;
assign resetn = &rststate;
always @(posedge clk) rststate <= rststate + !resetn;
(The IceStorm flow does support arbitrary initialization values for registers, whereas the Lattice tools ignore the initialization value and simply initialize all FFs to zero. So if you want your designs to be portable between the tools, it is recommended to only initialize regs to zero.)
If you are using a PLL, then it is customary to use the PLL LOCK output to drive the resetn signal. Unfortunately the "iCE40 sysCLOCK PLL Design and Usage Guide" does not state whether the generated LOCK signal is already synchronous to the generated clock, so it would be a good idea to synchronize it to the clock to avoid problems with metastability:
wire clk, resetn, PLL_LOCKED;
reg [3:0] PLL_LOCKED_BUF;
...
SB_PLL40_PAD #( ... ) PLL_INST (
...
.PLLOUTGLOBAL(clk),
.LOCK(PLL_LOCKED)
);
always @(posedge clk)
PLL_LOCKED_BUF <= {PLL_LOCKED_BUF, PLL_LOCKED};
assign resetn = PLL_LOCKED_BUF[3];
Regarding usage of global nets: You can explicitly route the resetn signal via a global net (using the SB_GB primitive), but using the IceStorm flow, arachne-pnr will automatically route a set/reset signal (when used by more than just a few FFs) over a global net, if a global net is available.

How to effectively utilize a VHDL module?

There's a few questions in here, so bear with me, and thanks for taking the time to read this...
I recently wrote an SPI master, and have fully simulated it to make sure it works as expected.
From here I'd like to use it in another design where I've already got a 7 segment display component set up to take the value received from an ADC on the SPI bus, however I think I've confused myself with things at this point.
I need to send a pulse with other parameters to the SPI master to initiate a transfer, and wait for a busy signal to be de-asserted before I can send anything else. I'm not really sure of the best way to implement the SPI master within the new design.
Would I use it in the design as a component? is there a better way?
If it has to be a component, is there any way I can set it up to directly output from that component to pins rather than me having to map to new inputs/outputs in the top level design?
For example, I have SCLK, MOSI, MISO, and CS; Can I not just have them output directly rather than having to be mapped through the top level? Seems like it'd simplify the top level and make it less clunky.
Also, would it be possible to set up a function to just say "Send this data over SPI and then return what's received"?
I'm still getting my head around how to put these things together so help/examples would be greatly appreciated. It seems like all the examples/tutorials available are based on things like using two half-adders, logic gates, etc. which only help to a point when they're so simple.
edit: Entity of my SPI Master
entity SPI_master is
generic(data_width: integer := 8;
clock_select: integer := 0);
port(SCLK: out std_logic;
MOSI: out std_logic;
MISO: in std_logic;
CS: out std_logic;
Mclk_in: in std_logic;
RST: in std_logic;
CPOL: in std_logic;
CPHA: in integer;
send_packet: in std_logic;
busy: out std_logic;
Tx_data: in std_logic_vector(data_width-1 downto 0);
Rx_data: out std_logic_vector(data_width-1 downto 0));
end SPI_master;
Your entity looks reasonable, though better names or comments on CPOL,CPHA would be useful!
Partial answers :
1) You CAN use it in your design as a component, but as previously mentioned, direct entity instantiation is simpler and less verbose.
2) No you can't directly output from deep in the hierarchy, and even if you could it would be a terrible idea!
Are you familiar with "Design Patterns" from C++, Ada or Java programming? If so, think of your top level design as the "Facade" pattern.
It's the only thing the external world needs to know about your design. And it will often be written as structural HDL, instantiating your other entities, and making interconnections between sub-units and connections to external ports.
There are ways to reduce the pain of these interconnections, especially across multiple layers of hierarchy, but ultimately you must break out the SPI signals to individual pins on the top level design, so that they can be connected to the correct wires on the PCB!
3) would it be possible to set up a function to just say "Send this data over SPI and then return what's received" ... not a function, no.
But certainly you can introduce a hardware wrapper to provide the rest of your design with a simple view of a complex task. For example, (assuming "send_packet" is asserted to write a byte on SPI, and "busy" goes high until the write is complete) you can create an entity taking an array of bytes and a "start" signal as inputs. Its architecture contains a process to count the bytes, outputting each in turn to SPI and waiting while "busy", and it can signal to its "caller" when done.
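As a rough sketch of points 1) and 2), a structural top level might instantiate the SPI master directly and break its pins out as top-level ports. Everything except SPI_master and its port names is an assumption for illustration: the top-level names, the start input, the constant transmit word, the CPOL/CPHA settings, and the assumption that Rx_data is valid when busy falls.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical top level; only SPI_master and its ports come from the question.
entity top_sketch is
  port (
    clk      : in  std_logic;
    rst      : in  std_logic;
    start    : in  std_logic;  -- assumed request from elsewhere
    -- SPI pins appear here so they can be assigned to package pins.
    spi_sclk : out std_logic;
    spi_mosi : out std_logic;
    spi_miso : in  std_logic;
    spi_cs   : out std_logic;
    adc_data : out std_logic_vector(7 downto 0)
  );
end entity top_sketch;

architecture structural of top_sketch is
  constant TX_ZERO : std_logic_vector(7 downto 0) := (others => '0');
  signal busy      : std_logic;
  signal busy_d    : std_logic := '0';
  signal rx_data   : std_logic_vector(7 downto 0);
begin
  -- Direct entity instantiation: no component declaration is needed.
  -- Assumes SPI_master has been compiled into library "work".
  spi_i : entity work.SPI_master
    generic map (data_width => 8, clock_select => 0)
    port map (
      SCLK        => spi_sclk,
      MOSI        => spi_mosi,
      MISO        => spi_miso,
      CS          => spi_cs,
      Mclk_in     => clk,
      RST         => rst,
      CPOL        => '0',
      CPHA        => 0,
      send_packet => start,
      busy        => busy,
      Tx_data     => TX_ZERO,
      Rx_data     => rx_data
    );

  -- Latch the received word on the falling edge of busy.
  capture : process (clk)
  begin
    if rising_edge(clk) then
      busy_d <= busy;
      if busy_d = '1' and busy = '0' then
        adc_data <= rx_data;
      end if;
    end if;
  end process capture;
end architecture structural;
```

In a real design, adc_data would more likely feed the 7-segment display entity than a top-level port, and the multi-byte wrapper described in 3) would sit between start and send_packet.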

Preserving the widths of ports

I am trying to re-use netlists in other designs, without success.
I have a component which is translated to the netlist:
entity c is
port (... sel : in std_logic_vector(31 downto 0); ... );
In the design I am using just sel(4 downto 0).
The synthesis tool notices this behaviour and gives a warning:
'WARNING:Xst:647 - Input sel<31:5> is never used ..
I am generating netlist with properties:
keep hierarchy = true
add I/O buffers = off
Whenever I try to instantiate this netlist as a black-box module in another circuit, I get an error:
ERROR:NgdBuild:76 - cannot be merged into block because one or more pins on the block, including pin "sel<31>", were not found in the file.
How can I preserve the size of sel?
I should mention that sel needs to be 32 bits wide since it's connected to the bus.
You could try driving the unused input ports to zero.
Can you use the component directly instead of as a pre-synthesised black-box?
You may get things to work by putting a KEEP attribute (see your synth tools manual) on the port. I've only ever tried this on signals, but it may work.
This sort of task is often described as "pushing on the rope" of the synthesiser, as it's such a pain to get it to not be as clever as it wants to be (and then in the next release of tools you need a different attribute :)
