How to effectively utilize a VHDL module? - vhdl

There's a few questions in here, so bear with me, and thanks for taking the time to read this...
I recently wrote an SPI master, and have fully simulated it to make sure it works as expected.
From here I'd like to use it in another design where I've already got a 7 segment display component set up to take the value received from an ADC on the SPI bus, however I think I've confused myself with things at this point.
I need to send a pulse with other parameters to the SPI master to initiate a transfer, and wait on a busy signal to be de-asserted before I can send anything else. I'm not really sure how best way to implement the SPI master within the new design.
Would I use it in the design as a component? is there a better way?
If it has to be a component, is there any way I can set it up to directly output from that component to pins rather than me having to map to new inputs/outputs in the top level design?
For example, I have SCLK, MOSI, MISO, and CS; Can I not just have them output directly rather than having to be mapped through the top level? Seems like it'd simplify the top level and make it less clunky.
Also, would it be possible to set up a function to just say "Send this data over SPI and then return what's received"?
I'm still getting my head around how to put these things together so help/examples would be greatly appreciated. It seems like all the examples/tutorials available are based on things like using two half-adders, logic gates, etc. which only help to a point when they're so simple.
edit: Entity of my SPI Master
entity SPI_master is
generic(data_width: integer := 8;
clock_select: integer := 0);
port(SCLK: out std_logic;
MOSI: out std_logic;
MISO: in std_logic;
CS: out std_logic;
Mclk_in: in std_logic;
RST: in std_logic;
CPOL: in std_logic;
CPHA: in integer;
send_packet: in std_logic;
busy: out std_logic;
Tx_data: in std_logic_vector(data_width-1 downto 0);
Rx_data: out std_logic_vector(data_width-1 downto 0));
end SPI_master;

Your entity looks reasonable, though better names or comments on CPOL,CPHA would be useful!
Partial answers :
1) You CAN use it in your design as a component, but as previously mentioned, direct entity instantiation is simpler and less verbose.
2) No you can't directly output from deep in the hierarchy, and even if you could it would be a terrible idea!
Are you familiar with "Design Patterns" from C++, Ada or Java programming? If so, think of your top level design as the "Facade" pattern.
It's the only thing the external world needs to know about your design. And it will often be written as structural HDL, instantiating your other entities, and making interconnections between sub-units and connections to external ports.
There are ways to reduce the pain of these interconnections, especially across multiple layers of hierarchy, but ultimately you must break out the SPI signals to individual pins on the top level design, so that they can be connected to the correct wires on the PCB!
3) would it be possible to set up a function to just say "Send this data over SPI and then return what's received" ... not a function, no.
But certainly you can introduce a hardware wrapper to provide the rest of your design with a simple view of a complex task. For example, (assuming "send_packet" is asserted to write a byte on SPI, and "busy" goes high until the write is complete) you can create an entity taking an array of bytes and a "start" signal as inputs. Its architecture contains a process to count the bytes, outputting each in turn to SPI and waiting while "busy", and it can signal to its "caller" when done.

Related

iCEstick + yosys - using the Global Set/Reset (GSR)

This is probably more of an iCEstick question than a yosys one, but asking here since I'm using the Icestorm tool chain.
I want to specify startup behavior of my design, which various places on the internet seem to agree is related to the typically named rst signal. It wasn't obvious to me where such a signal comes from, so I dug into the powerup sequence. Current understanding is from Figure 2 in this document.
After CDONE is pulled high by the device, all of the internal registers have been reset, to some initial value. Now, I've found plenty of lattice documents about how each type of flip-flop or hard IP receives a reset signal and does something with its internal state, but I still don't quite understand how I specify what those states are (or even just know what they are so I can use them).
For example, if I wanted to bring an LED high for 1 second after powerup (and only after powerup) I would want to start a counter after this reset signal (whatever it is) disables.
Poking around the ice40 family data sheet and the Lattice site, I found this document about using the Global Set/Reset signal. I confirmed this GSR is mentioned in the family data sheet, referenced on page 2-3 under "Clock/Control Distribution Network". It seems that a global reset signal is usable by one of the global buffers GBUF[0-7] and can be routed (up to 4 of them) to all LUTs with the global/high-fanout distribution network.
This seems like exactly what I was after, I but I can't find any other info about how to use this in my designs. The document on using the GSR states that you can instantiate a native GSR component like this:
GSR GSR_INST (.GSR (<global reset sig>));
but I can't tell whether this is just for simulation. Am I completely going in the wrong direction here or just missing something? I'm very inexperienced with FPGAs and hardware, so its entirely possible my entire approach is flawed.
I'm not sure if that GSR document actually is about iCE40. The Lattice iCEcube tool interestingly accepts instances of GSR cells, but it seems to simply treat them as constant zero drivers. There is also no simulation model for the GSR cell type in the iCE40 sim library and no description of it in the iCE40 tech library documentation provided by Lattice.
Furthermore, I have built the following two designs with the lattice tools, and besides the timestamp in the "comment field" of the generated bit-stream file, the generated bit-streams are identical! (This test was performed with Lattice LSE as synthesis tool, not Synplify. I had problems getting Synplify to run on my machine for some reason and gave up trying to do so over a year ago..)
This is the first test design I've used:
module top (
input clk,
output rst,
output reg val
);
always #(posedge clk, posedge rst)
if (rst)
val = 1;
else
val = 0;
GSR GSR_INST (.GSR (rst));
endmodule
And this is the second test design:
module top (
input clk,
output rst,
output val
);
assign val = 0, rst = 0;
endmodule
Given this results I think it is safe to say that the lattice tools simply ignore GSR cells in iCE40 designs. (Maybe for compatibility with their other FPGA families?)
So how does one generate a rst signal then? For example, the following is a simple reset generator that asserts (pulls low) resetn for the first 15 cycles:
input clk;
...
wire resetn;
reg [3:0] rststate = 0;
assign resetn = &rststate;
always #(posedge clk) rststate <= rststate + !resetn;
(The IceStorm flow does support arbitrary initialization values for registers, whereas the lattice tools ignore the initialization value and simply initialize all FFs to zero. So if you want your designs to be portable between the tools, it is recommended to only initialize regs to zero.)
If you are using a PLL, then it is custom to use the PLL LOCK output to drive the resetn signal. Unfortunately the "iCE40 sysCLOCK PLL Design and Usage Guide" does not state if the generated LOCK signal is already synchronous to the generated clock, so it would be a good idea to synchronize it to the clock to avoid problems with metastability:
wire clk, resetn, PLL_LOCKED;
reg [3:0] PLL_LOCKED_BUF;
...
SB_PLL40_PAD #( ... ) PLL_INST (
...
.PLLOUTGLOBAL(clk),
.LOCK(PLL_LOCKED)
);
always #(posedge clk)
PLL_LOCKED_BUF <= {PLL_LOCKED_BUF, PLL_LOCKED};
assign resetn = PLL_LOCKED_BUF[3];
Regarding usage of global nets: You can explicitly route the resetn signal via a global net (using the SB_GB primitive), but using the IceStorm flow, arachne-pnr will automatically route a set/reset signal (when used by more than just a few FFs) over a global net, if a global net is available.

Optimisation of RAM

I'm currently debugging my DPRAM. As usual, simulation works perfectly but in real life it fails. The syntax is as such:
ram[Address][Data]
I can get the data to write to the first 8 addresses but anymore and the data is just lost (even on chipscope). As a work around, I stitched two pieces of DPRAM together with a simple logical switch to re-route the data to the second DPRAM after it hit the eighth address. This worked but it just looks so messy.
My thinking is that it is being optimised away, even if it isn't this will be a good learning curve anyway (and any thoughts on this are welcome).
Here is the signal variable in the DPRAM with my effort at stopping it from being optimised away:
type ram_array is array(16 downto 0) of std_logic_vector(31 downto 0);
shared variable ram: ram_array;
attribute KEEP: string;
attribute KEEP of ram_array : type is "TRUE";
I think I need to add a line to the UCF file also though I can't seem to get the syntax right, with entity name path obviously changed:
NET "entity/name/path/dpram/ram_array" KEEP ="TRUE";
So is this how I'd add the code if I wanted to stop optimisation?
EDIT:
Output (guess there wasn't optimisation):
Found 17x32-bit dual-port RAM <Mram_ram> for signal <ram>. Summary: inferred 1 RAM(s). inferred 65 D-type flip-flop(s). Unit <dpram> synthesized.
Thanks =)
Don't use shared variable for synthesizable code
If your code needs to be synthesizable, then don't use shared variable.
Inferring BlockRAM
If you need sample code to implement your memory without using a shared variable, use the documentation provided for your FPGA.
Sample doc links :
(xilinx) http://www.xilinx.com/support/documentation/sw_manuals/xilinx12_2/xst_v6s6.pdf
(altera) http://www.altera.com/literature/hb/qts/qts_qii51007.pdf
Hope this helps.

Due to SelectIO banking constraints, the IOBs in your design cannot be automatically placed

I'm using Spartan 3E starter kit. In creating a custom peripheral. I use the default settings except interfacing it to the PLB bus. I also generated XISE project. I added my ports which only consists of:
phy_tx_data : out std_logic_vector (0 to 3);
phy_tx_en : out std_logic;
phy_tx_clk : in std_logic;
phy_crs : in std_logic;
That is only but a few ports but the IOBs exceeded the limit of the available resources after synthesizing my ethernet transmit module. I would want to know how to actually implement it in the FPGA. Does the IOBs pertains to the ports of the top-level module? If so, I just added a few ports and it already exceeded. Why is that so? How can I overcome it? It seemed that the plb slave module consumed most of the available ports.. Or does the IOBs pertain to all signals and registers.. I really need help here.
IOB = physical IO pin. You can't put a design on an FPGA that requires more physical pins than the FPGA package provides. There are two solutions to this: either get a bigger FPGA or use fewer IO pins.

How do User Constraint Files actually work?

I got WebPack up and running on my machine again and after synthesizing a simple design and uploading it to my FPGA, I encountered quite a problem with my understanding.
When there is a line like this in the user constraint file:
NET "W1A<0>" LOC = "P18" ;
How exactly does the synthesis software determine how this pin gets assigned to by the VHDL code?
For instance, take this example code I've been provided with:
entity Webpack_Quickstart is
Port (
W1A : out STD_LOGIC_VECTOR(15 downto 0);
rx : in STD_LOGIC;
tx : inout STD_LOGIC;
clk : in STD_LOGIC
);
end Webpack_Quickstart;
architecture Behavioral of Webpack_Quickstart is
signal counter : STD_LOGIC_VECTOR(47 downto 0) := (others => '0');
begin
W1A(0) <= '1';
end;
How exactly does this code make the WIA0 pin on my FPGA turn on? What is the link? Is it just the name of the port in the entity declaration is there more magic involved?
Your .ucf constraints are applied in the implementation phase. At this point your design has been synthesized, and the available top-level nets are thus "known". So yes, it is only a matter of matching equally named nets to equally named constraints.
The syntax is slightly different though (using <> instead of () for indexing vectors for instance), but otherwise it's just a simple string match.
The easiest way to initially configure your pin constraints, especially for large designs, is to just use one of the graphical tools (PlanAhead, if it's included in the WebPack) to assign the pins, and generate an initial .ucf file.
I find that making small changes later on is easiest to do by hand using the standard ISE text editor directly though.

Preserving the widths of ports

I am trying to re-use netlists in other designs without the success.
I have a component which is translated to the netlist:
entity c is
port (... sel : in std_logic_vector(31 downto 0); ... );
In the design I am using just sel(4 downto 0).
The synthesis tools notices this behaviour and gives a warning:
'WARNING:Xst:647 - Input sel<31:5> is never used ..
I am generating netlist with properties:
keep hierarchy = true
add I/O buffers = off
Whenever I want to instantiate this netlist as an black-box module in other circuit I got an error:
ERROR:NgdBuild:76 - cannot be merged into block because one or more pins on the block, including pin "sel<31>", were not found in the file.
How can I preserve the size of sel?
I should mention that the sel needs to be 32bits width since it's connected to the bus.
You could try driving the unused input ports to zero.
Can you use the component directly instead of as a pre-synthesised black-box?
You may get things to work by putting a KEEP attribute (see your synth tools manual) on the port. I've only ever tried this on signals, but it may work.
This sort of task is often described as "pushing on the rope" of the synthesiser, as it's such a pain to get it to not be as celever as it wants to be (and then in the next release of tools you need a different attribute :)

Resources