LFSR in VHDL always generating zero - vhdl

I have written a LFSR in VHDL. I have tested it in simulation and it works as expected (generates random integers between 1 and 512). However when I put it onto hardware it always generates "000000000"
The code is as follows:
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
entity LFSR is
port(clk, reset : in bit;
random : out std_logic_vector (8 downto 0));
end entity LFSR;
architecture behaviour of LFSR is
signal temp : std_logic_vector (8 downto 0) := (8 => '1', others => '0');
begin
process(clk)
begin
if(clk'event and clk='1') then
if(reset='0') then --reset on signal high, carry out normal function
temp(0) <= temp(8);
temp(1) <= temp(0);
temp(2) <= temp(1) XOR temp(8);
temp(3) <= temp(2) XOR temp(8);
temp(4) <= temp(3) XOR temp(8);
temp(8 downto 5) <= temp(7 downto 4);
else
--RESET
temp <= "100000000";
end if;
end if;
random <= temp;
end process;
end architecture behaviour;
It was tested in Modelsim and compiled in Quartus II for a Cyclone III DE0 board.
Can anyone see why it is not working (in practice, the simulation is fine) and explain what I need to change to get it to work?

If reset is directly from a FPGA pin, then it is probably not synchronized
with clk, so proper synchronous reset operation is not ensured.
Add two flip-flops for synchronization of reset to clk before it is used in
the process. This can be done with:
...
signal reset_meta : bit; -- Meta-stable flip-flop
signal reset_sync : bit; -- Synchronized reset
begin
process(clk)
begin
if(clk'event and clk='1') then
reset_meta <= reset;
reset_sync <= reset_meta;
if (reset_sync = '0') then -- Normal operation
...
Altera have some comment about this in External Reset Should be Correctly
Synchronized.
The description covers flip-flops with asynchronous reset, but the use of two
flip-flops for synchronisation of external reset applies equally in your case.
Still remember to move the random <= temp inside the if as David pointed
out.

If you haven't any luck you might take a look at the synthesized schematic or perform post-synthesis simulation. I occasionally swap RTL models out for post - synth models for verification if there's something that isn't obvious in behavioral simulation.
-Jerry

Related

UART Transmitter only functions when embedded logic analyzer is running

I have been trying to implement a UART in order to communicate between my Lattice MachXO3D board and my computer. At the moment I am attempting to implement the transmission from the FPGA.
Upon testing on the hardware, I encountered a very strange issue. If run normally, it will function for a few seconds and then it will suddenly stop functioning (The CH340 connected to my computer will report it is receiving messages containing 0x0). However, if I embed a logic analyzer onto the FPGA through the Lattice Diamond software, and I run the analyzer, it will function perfectly for an extended period of time.
Sadly, I don't have a logic analyzer, so the embedded logic analyzer is my only chance at knowing what is actually being transmitted.
These are the files related to my implementation:
baud_gen
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;
-- Generates 16 ticks per bit
ENTITY baud_gen IS
GENERIC(divider: INTEGER := 13 -- 24M/115200*16
);
PORT(
clk, reset: IN STD_LOGIC;
s_tick: OUT STD_LOGIC
);
END baud_gen;
ARCHITECTURE working OF baud_gen IS
BEGIN
PROCESS(clk)
VARIABLE counter: UNSIGNED(3 DOWNTO 0) := to_unsigned(0,4);
BEGIN
IF clk'EVENT AND clk='1' THEN
IF reset='1' THEN
s_tick <= '0';
counter := to_unsigned(0,4);
ELSIF counter=to_unsigned(divider-1,4) then
s_tick <= '1';
counter:= to_unsigned(0,4);
ELSE
s_tick <= '0';
counter := counter + 1;
END IF;
END IF;
END PROCESS;
END working;
rs232_tx
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;
ENTITY rs232_tx IS
PORT(clk: IN std_logic;
tx: OUT std_logic;
rst: IN std_logic;
fifo_empty: IN std_logic;
fifo_RdEn, fifo_RdClock: OUT std_logic;
fifo_data: IN STD_LOGIC_VECTOR(7 DOWNTO 0);
Q: OUT STD_LOGIC_VECTOR(1 DOWNTO 0));
END rs232_tx;
ARCHITECTURE working OF rs232_tx IS
TYPE state IS (idle,start,data);
SIGNAL tx_pulse: STD_LOGIC := '1';
SIGNAL s_tick: STD_LOGIC;
SIGNAL pr_state, nx_state: state := idle;
SIGNAL data_val: std_logic_vector(7 DOWNTO 0):=(others=>'0');
SIGNAL data_count: unsigned(2 DOWNTO 0):=to_unsigned(0,3);
BEGIN
process(s_tick, rst)
VARIABLE count: unsigned(3 DOWNTO 0):= to_unsigned(0,4);
BEGIN
IF rising_edge(s_tick) THEN
count := count + to_unsigned(1,4);
IF count = to_unsigned(15,4) THEN
tx_pulse <= '1';
ELSE
tx_pulse <='0';
END IF;
END IF;
END PROCESS;
process(tx_pulse,rst)
BEGIN
IF rising_edge(tx_pulse) THEN
IF rst='1' THEN
pr_state <= idle;
data_val <= (others=>'0');
data_count <= to_unsigned(0,3);
ELSE
pr_state <= nx_state;
CASE pr_state IS
WHEN idle =>
data_count <= to_unsigned(0,3);
WHEN data =>
data_count <= data_count + to_unsigned(1,3);
WHEN start =>
data_val <= fifo_data;
WHEN OTHERS =>
END case;
END IF;
END IF;
END process;
process(fifo_empty,rst,data_count,pr_state,data_count)
BEGIN
case pr_state is
when idle =>
Q <= ('1','1');
fifo_RdEn <= '0';
tx <= '1';
IF fifo_empty='0' AND rst='0' THEN
nx_state <= start;
ELSE
nx_state <= idle;
END IF;
WHEN start =>
Q <= ('1','0');
fifo_RdEn <= '1';
tx <= '0';
nx_state <= data;
WHEN data =>
Q <= ('0','1');
fifo_RdEn <= '0';
tx <= data_val(to_integer(data_count));
if data_count=to_unsigned(7,3) then
nx_state <= idle;
ELSE
nx_state <= data;
end if;
end case;
END process;
fifo_RdClock <= tx_pulse;
baud_gen: ENTITY work.baud_gen
PORT MAP(clk,reset=>'0',s_tick=>s_tick);
END working;
testbench
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;
ENTITY rs232_tx_test IS
GENERIC(clk_period: TIME := 41666666.7 fs;
baud_period: TIME := 8680.55556 ns);
END rs232_tx_test;
ARCHITECTURE working OF rs232_tx_test IS
SIGNAL clk: STD_LOGIC := '0';
SIGNAL tx, rst, fifo_empty, fifo_RdEn, fifo_RdClock: STD_LOGIC;
SIGNAL fifo_data: STD_LOGIC_VECTOR(7 DOWNTO 0);
BEGIN
clk <= NOT clk AFTER clk_period/2;
rst <= '1', '0' AFTER 100 ns;
PROCESS
BEGIN
fifo_empty <= '1';
WAIT FOR baud_period;
fifo_empty <= '0';
WAIT FOR baud_period*16;
END PROCESS;
fifo_data <= ('1','1','0','0','1','0','1','1') WHEN fifo_RdEn='1' ELSE (others=>'0');
dut: ENTITY work.rs232_tx
PORT MAP(clk,tx,rst,fifo_empty,fifo_RdEn,fifo_RdClock,fifo_data);
END working;
EDIT
I have tested another UART design I found online # 9600 bps and it fails in the same way. It can send a constant character, in this case 'a', to a terminal on my computer, and then it suddenly stops sending anything. However, if I start listening to the soft logic analyzer I generated in Lattice Diamond, it works without a problem and does not fail.
This smells like a classic timing issue.
In the comments, others have explained where there are weaknesses in the code that could indirectly lead to this. I'm going to concentrate on what is occurring.
As you have stated, in simulation your code works as you expect. However, that is only half the story. To create an FPGA bit stream the build tools take your code and several other files, synthesises, and conducts Place & Route. Your timing issue occurs during P&R. This is why your simulation doesn't pick up on any errors, as I'm assuming it's an RTL (pre-place & route) simulation.
During P&R the tools lay the logic in the best way to fit the timing model of the device so all paths meet their timing requirements. The path timing requirements are derived from explicit statements in a Timing Constraints file and inferred from your code (that's why your coding style matters btw).
Once P&R is complete, the tools will put the build artefact through a static timing analyser (STA) tool and report back whether the build fails to meet the timing requirements.
This leads to two questions:
Does the build report a timing error?
Do you have a Timing Constraints file - if you're unsure, the answer is no.
The way to debug your problem is to use the Timing Report generated by Lattice Diamond to see where the failures are.
If there are no reported failures it means your timing model is wrong because of a lack of appropriate timing constraints. As a minimum, you will need to constrain all IO in the design and describe all the clocks in your design.
Here is a good document to help you out:
https://www.latticesemi.com/-/media/LatticeSemi/Documents/UserManuals/RZ/Timing_Closure_Document.ashx?document_id=45588
The reason that your design works when you use the embedded Logic Analyser is that extra logic has been placed in the design, which changes the timing model. The P&R tools lay out the design differently, and by luck have placed the design in such a way as meet the real timing requirements on it.
As I often say to my Software Engineer colleagues, software languages create a set of instructions, HDL creates a set of suggestions.

VHDL - synthesis results is not the same as behavioral

I have to write program in VHDL which calculate sqrt using Newton method. I wrote the code which seems to me to be ok but it does not work.
Behavioral simulation gives proper output value but post synthesis (and launched on hardware) not.
Program was implemented as state machine. Input value is an integer (used format is std_logic_vector), and output is fixed point (for calculation
purposes input value was multiplied by 64^2 so output value has 6 LSB bits are fractional part).
I used function to divide in vhdl from vhdlguru blogspot.
In behavioral simulation calculating sqrt takes about 350 ns (Tclk=10 ns) but in post synthesis only 50 ns.
Used code:
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
use ieee.std_logic_unsigned.all;
entity moore_sqrt is
port (clk : in std_logic;
enable : in std_logic;
input : in std_logic_vector (15 downto 0);
data_ready : out std_logic;
output : out std_logic_vector (31 downto 0)
);
end moore_sqrt;
architecture behavioral of moore_sqrt is
------------------------------------------------------------
function division (x : std_logic_vector; y : std_logic_vector) return std_logic_vector is
variable a1 : std_logic_vector(x'length-1 downto 0):=x;
variable b1 : std_logic_vector(y'length-1 downto 0):=y;
variable p1 : std_logic_vector(y'length downto 0):= (others => '0');
variable i : integer:=0;
begin
for i in 0 to y'length-1 loop
p1(y'length-1 downto 1) := p1(y'length-2 downto 0);
p1(0) := a1(x'length-1);
a1(x'length-1 downto 1) := a1(x'length-2 downto 0);
p1 := p1-b1;
if(p1(y'length-1) ='1') then
a1(0) :='0';
p1 := p1+b1;
else
a1(0) :='1';
end if;
end loop;
return a1;
end division;
--------------------------------------------------------------
type state_type is (s0, s1, s2, s3, s4, s5, s6); --type of state machine
signal current_state,next_state: state_type; --current and next state declaration
signal xk : std_logic_vector (31 downto 0);
signal temp : std_logic_vector (31 downto 0);
signal latched_input : std_logic_vector (15 downto 0);
signal iterations : integer := 0;
signal max_iterations : integer := 10; --corresponds with accuracy
begin
process (clk,enable)
begin
if enable = '0' then
current_state <= s0;
elsif clk'event and clk = '1' then
current_state <= next_state; --state change
end if;
end process;
--state machine
process (current_state)
begin
case current_state is
when s0 => -- reset
output <= "00000000000000000000000000000000";
data_ready <= '0';
next_state <= s1;
when s1 => -- latching input data
latched_input <= input;
next_state <= s2;
when s2 => -- start calculating
-- initial value is set as a half of input data
output <= "00000000000000000000000000000000";
data_ready <= '0';
xk <= "0000000000000000" & division(latched_input, "0000000000000010");
next_state <= s3;
iterations <= 0;
when s3 => -- division
temp <= division ("0000" & latched_input & "000000000000", xk);
next_state <= s4;
when s4 => -- calculating
if(iterations < max_iterations) then
xk <= xk + temp;
next_state <= s5;
iterations <= iterations + 1;
else
next_state <= s6;
end if;
when s5 => -- shift logic right by 1
xk <= division(xk, "00000000000000000000000000000010");
next_state <= s3;
when s6 => -- stop - proper data
-- output <= division(xk, "00000000000000000000000001000000"); --the nearest integer value
output <= xk; -- fixed point 24.6, sqrt = output/64;
data_ready <= '1';
end case;
end process;
end behavioral;
Below screenshoots of behavioral and post-sythesis simulation results:
Behavioral simulation
Post-synthesis simulation
I have only little experience with VHDL and I have no idea what can I do to fix problem. I tried to exclude other process which was for calculation but it also did not work.
I hope you can help me.
Platform: Zynq ZedBoard
IDE: Vivado 2014.4
Regards,
Michal
A lot of the problems can be eliminated if you rewrite the state machine in single process form, in a pattern similar to this. That will eliminate both the unwanted latches, and the simulation /synthesis mismatches arising from sensitivity list errors.
I believe you are also going to have to rewrite the division function with its loop in the form of a state machine - either a separate state machine, handshaking with the main one to start a divide and signal its completion, or as part of a single hierarchical state machine as described in this Q&A.
This code is neither correct for simulation nor for synthesis.
Simulation issues:
Your sensitivity list is not complete, so the simulation does not show the correct behavior of the synthesized hardware. All right-hand-side signals should be include if the process is not clocked.
Synthesis issues:
Your code produces masses of latches. There is only one register called current_state. Latches should be avoided unless you know exactly what you are doing.
You can't divide numbers in the way you are using the function, if you want to keep a proper frequency of your circuit.
=> So check your Fmax report and
=> the RTL schematic or synthesis report for resource utilization.
Don't use the devision to shift bits. Neither in software the compiler implements a division if a value is shifted by a power of two. Us a shift operation to shift a value.
Other things to rethink:
enable is a low active asynchronous reset. Synchronous resets are better for FPGA implementations.
VHDL code may by synthesizable or not, and the synthesis result may behave as the simulation, or not. This depends on the code, the synthesizer, and the target platform, and is very normal.
Behavioral code is good for test-benches, but - in general - cannot be synthesized.
Here I see the most obvious issue with your code:
process (current_state)
begin
[...]
iterations <= iterations + 1;
[...]
end process;
You are iterating over a signal which does not appear in the sensitivity list of the process. This might be ok for the simulator which executes the process blocks just like software. On the other hand side, the synthesis result is totally unpredictable. But adding iterations to the sensitivity list is not enough. You would just end up with an asynchronous design. Your target platform is a clocked device. State changes may only occur at the trigger edge of the clock.
You need to tell the synthesizer how to map the iterations required to perform this calculation over the clock cycles. The safest way to do that is to break down the behavioural code into RTL code (https://en.wikipedia.org/wiki/Register-transfer_level#RTL_in_the_circuit_design_cycle).

VHDL - FSM not starting (JUST in timing simulation)

I'm working for my master thesis and I'm pretty new to VHDL, but still I have to implement some complex things. This is one of the easiest structures I had to write, and still I'm encountering some problems.
It's a FSM implementing a 24bit shift register with an active-low sync signal (to program a DAC). It's just the end of a complex elaboration chain I created for my project. I followed the example model of a FSM as much as I could.
The behavioral simulation works fine, actually the whole elaboration chain I created works perfectly fine as far as the behavioral simulation concerns. However, once I try the Post-translate simulation things start to go wrong: lots of 'X' output signals.
With this simple shift register I DON'T get any 'X', however I can't get to the load_and_prepare_data phase. It seems that the current_state changes (by inspecting some signals), but the elaboration doesn't go on.
Please keep in mind that since I'm new to the language, I have no idea of what timing constraints I should set on this FSM (and I wouldn't know how to write them on the top.ucf anyway)
Can you see what's wrong?
Thanks in advance
EDIT
I followed your advices and cleaned up the FSM by using a single state process. I still have some doubts about "where to put what" but I really like the new implementation. Anyway I now get a clean behavioral simulation but 'X' on all outputs in post translate simulation.
What is causing this?
I'll post the both the new code and the testbench:
----------------------------------------------------------------------------------
-- Company:
-- Engineer:
--
-- Create Date: 14:44:03 11/28/2014
-- Design Name:
-- Module Name: dac_ad5764r_24bit_sr_programmer_v2 - Behavioral
-- Project Name:
-- Target Devices:
-- Tool versions:
-- Description: This is a PISO shift register that gets a 24bit parallel input word.
-- It outputs the 24bit input word starting from the MSB and enables
-- an active low ChipSelect line for 24 clock periods.
-- Dependencies:
--
-- Revision:
-- Revision 0.01 - File Created
-- Additional Comments:
--
----------------------------------------------------------------------------------
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
-- Uncomment the following library declaration if using
-- arithmetic functions with Signed or Unsigned values
use IEEE.NUMERIC_STD.ALL;
-- Uncomment the following library declaration if instantiating
-- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;
entity dac_ad5764r_24bit_sr_programmer_v2 is
Port ( clk : in STD_LOGIC;
start : in STD_LOGIC;
reset : in STD_LOGIC; -- Note that this reset is for the FSM not for the DAC
reset_all_dac : in STD_LOGIC;
data_in : in STD_LOGIC_VECTOR (23 downto 0);
serial_data_out : out STD_LOGIC;
sync_out : out STD_LOGIC; -- This is a chip select
reset_out : out STD_LOGIC;
busy : out STD_LOGIC
);
end dac_ad5764r_24bit_sr_programmer_v2;
architecture Behavioral of dac_ad5764r_24bit_sr_programmer_v2 is
-- Stati
type state_type is (idle, load_and_prepare_data, transmission);
--ATTRIBUTE ENUM_ENCODING : STRING;
--ATTRIBUTE ENUM_ENCODING OF state_type: TYPE IS "001 010 100";
signal state: state_type := idle;
--signal next_state: state_type := idle;
-- Clock counter
--signal clk_counter_enable : STD_LOGIC := '0';
signal clk_counter : unsigned(4 downto 0) := (others => '0');
-- Shift register
signal stored_data: STD_LOGIC_VECTOR (23 downto 0) := (others => '0');
begin
FSM_single_process: process(clk)
begin
if rising_edge(clk) then
if reset = '1' then
serial_data_out <= '0';
sync_out <= '1';
reset_out <= '1';
busy <= '0';
state <= idle;
else
-- Default
serial_data_out <= '0';
sync_out <= '1';
reset_out <= '1';
busy <= '0';
case (state) is
when transmission =>
serial_data_out <= stored_data(23);
sync_out <= '0';
busy <= '1';
clk_counter <= clk_counter + 1;
stored_data <= stored_data(22 downto 0) & "0";
state <= transmission;
if (clk_counter = 23) then
state <= idle;
end if;
when others => -- Idle
if start = '1' then
serial_data_out <= data_in(23);
sync_out <= '0';
reset_out <= '1';
busy <= '1';
stored_data <= data_in;
clk_counter <= "00001";
state <= transmission;
end if;
end case;
-- if (reset_all_dac = '1') then
-- reset_out <= '0';
-- end if;
end if;
end if;
end process;
end;
And the testbench:
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
-- Uncomment the following library declaration if using
-- arithmetic functions with Signed or Unsigned values
--USE ieee.numeric_std.ALL;
ENTITY dac_ad5764r_24bit_sr_programmer_tb IS
END dac_ad5764r_24bit_sr_programmer_tb;
ARCHITECTURE behavior OF dac_ad5764r_24bit_sr_programmer_tb IS
-- Component Declaration for the Unit Under Test (UUT)
COMPONENT dac_ad5764r_24bit_sr_programmer_v2
PORT(
clk : IN std_logic;
start : IN std_logic;
reset : IN std_logic;
data_in : IN std_logic_vector(23 downto 0);
serial_data_out : OUT std_logic;
reset_all_dac : IN std_logic;
sync_out : OUT std_logic;
reset_out : OUT std_logic;
--finish : OUT std_logic;
busy : OUT std_logic
);
END COMPONENT;
--Inputs
signal clk : std_logic := '0';
signal start : std_logic := '0';
signal reset : std_logic := '0';
signal data_in : std_logic_vector(23 downto 0) := (others => '0');
signal reset_all_dac : std_logic := '0';
--Outputs
signal serial_data_out : std_logic;
signal sync_out : std_logic;
signal reset_out : std_logic;
--signal finish : std_logic;
signal busy : std_logic;
-- Clock period definitions
constant clk_period : time := 100 ns;
BEGIN
-- Instantiate the Unit Under Test (UUT)
uut: dac_ad5764r_24bit_sr_programmer_v2 PORT MAP (
clk => clk,
start => start,
reset => reset,
data_in => data_in,
reset_all_dac => reset_all_dac,
serial_data_out => serial_data_out,
sync_out => sync_out,
reset_out => reset_out,
--finish => finish,
busy => busy
);
-- Clock process definitions
clk_process :process
begin
clk <= '0';
wait for clk_period/2;
clk <= '1';
wait for clk_period/2;
end process;
-- Stimulus process
stim_proc: process
begin
-- hold reset state for 100 ns.
wait for clk_period*10;
reset <= '1' after 25 ns;
wait for clk_period*1;
reset <= '0' after 25 ns;
wait for clk_period*3;
reset_all_dac <= '1' after 25 ns;
wait for clk_period*1;
reset_all_dac <= '0' after 25 ns;
wait for clk_period*5;
data_in <= "111111111111111111111111" after 25 ns;
wait for clk_period*3;
start <= '1' after 25 ns;
wait for clk_period*1;
start <= '0' after 25 ns;
wait;
end process;
END;
UPDATE 1
Updated with the last design: this code is not causing any 'X' (can't figure out why, this doesn't but the previous did). However it's not starting (in POST-TRANSLATE simulation) just like the first 3 process machine, and the signal sync_out is stuck at 0 while it should be '1' by default.
UPDATE 2
I've been looking into the tecnology schematic, starting from the problem of the sync_out=0: it's implemented with a FDS, S is the FSM reset signal, D is coming from a LUT3 with I = state&reset&start and INIT = 45 = "00101101". I've looked for this LUT3 in the simulation and I've noticed that it has INIT = "00000000"!
Is there something I'm missing about how to run this simulation? It seems that every LUT in the design have not been set!
UPDATE 3
It seems that the Post-Translate simulation is buggy in some way, or I'm not configuring it correctly for some reason: the Post-Map and the Post-PAR simulations work and display some outputs.
However there is an odd bug: the stored_data register is not updated with the complete data_in vector, after that, the FSM operates correctly and outputs the data stored.
I've looked in the tecnology schematic just after synthesis and for some reason the bits 23,22,21,19,18 are not connected to the corresponding data_in bit. You can see the effect in this screenshot from Post-Map simulation. Same happens in Post-PAR, but it seems that this problems comes directly from the synthesis!
Solved: the strange output comes from the Synthesis optimization. The tool realized that the previous block in the elaboration chain will never output a bit different from 0 for those specific bit. My mistake was assuming that I could test the single block alone: what I was really testing was the block synthetized for the FPGA taking into account everything else in the design!
Thanks to everybody helped me, I'm going to follow your advices!
I prefer the single-process form of state machine, which is cleaner, simpler and much less prone to bugs like sensitivity-list errors. I would also endorse the points in Paebbels' excellent answer. However I don't think any of these are the problem here.
One thing to be aware of in post-synth and post-PAR simulations is that their model of time is different from the behavioural model. The behavioural model follows simple rules as I described in this answer and ensures that in a typical design flow you can go straight to hardware - without post-synth simulation, without worry.
Indeed I only use post-synth or post-PAR simulations if I'm chasing a suspected tool bug. (For FPGA designs, not ASIC, that is!)
However, that simple timing model has its limitations. You may be familiar with problems like a clock signal assigned via signal assignment (usually buried in a 3rd party model where you don't expect it) which consumes a delta cycle, and ensures that your clocked data arrives before your clock instead of after, and everything subsequently occurs one cycle earlier than intended...
In behavioural modelling, a little discipline will keep clear of such troubles. But the same is not true of post-PAR modelling.
Your testbench is probably set up the same way as the behavioural model. And if so, that is likely to be the problem.
Here's what I do in this situation : I claim no formal authority for it, just experience. It also works well when interfacing the FPGA to external memory models with realistic timings.
1) I assume the simple (behavioural) timing model works correctly for all signals INTERNAL to the design.
2) I assume nothing of the sort for inputs and outputs from the design.
3) I take note of the estimated setup and hold timings on the inputs, (a) from the FPGA datasheet or better, (b) from the worst case values shown in the post-synth or post-PAR report, and structure the testbench around them.
Worked example : setup time 1 ns, hold time 2 ns, clock period 10 ns. This means that any input between 2 ns and 9 ns after a clock edge is guaranteed to be corrrectly read. I choose (arbitrarily) 5 ns.
signal_to_fpga <= driving_value after 5 ns;
(Note that Xilinx makes this absurdly counter-intuitive by expressing them as "offset in/out before/after" which refers timings to a previous or future clock edge instead of the one you're looking at)
Alternatively, if the input is fed from a CPU or memory in the real world, I use datasheet timing specifications for that device.
4) I take note of the worst case clk-out timing reported in the datasheet or report, and structure the design around them. (say, 7 ns)
fpga_output_pin <= driving_value after 7 ns;
Note that this "after" clause is obviously ignored by synthesis; however the post-synth back-annotation will introduce something very like it.
5) If this turns out to be not good enough, then (possibly in a wrapper component to avoid polluting the synthesisable code) improve accuracy like
fpga_output_pin <= 'X' after 1 ps, driving_value after 7 ns;
6) I re-run the behavioural simulation. Typically, it now fails, because it was written without realistic timings in mind.
7) I fix those failures. This may include adding realistic delays before testing values output from the design. It can be an iterative process.
Now, I have a reasonable expectation that the post-PAR simulation model will drop straight in to the testbench and work.
Here are some hints to improve your code:
You can remove the Xilinx dependencies to UNISIM, because you are not using any Xilinx Primitves.
Applying attribute ENUM_ENCODING has no effect on state encoding unless you also define the attribute FSM_ENCODING and set it's value to user. One-Hot encoding can be forced by setting FSM_ENCODING to one-hot. Normally synthesis is smart enough to find the best encoding.
read more ...
None of your registers has a default value:
signal current_state : state_type := idle;
Your FSM is no FSM in the eyes of Xilinx synthesis tool (XST). I'm sure if you look into your synthesis report, you won't find that XST reports a FSM for current_state.
So what's wrong with your FSM?
Your FSM has no initial state.
Your FSM has multiple reset states (idle, load_and_prepare_data)
Your FSM has no transition from idle to load_and_prepare_data (reset is no transition)
Writing next_state transitions for the current state can cause XST to think it's no FSM
the default assignment next_state <= current_state; is sufficient.
If you change the type of signal clk_counter to unsigned you can do arithmetic much easier.
increment: clk_counter <= clk_counter + 1;
clear: clk_counter <= (others => '0');
compare: if (clk_counter = 23) then
It's no good style to use the FSM's state signal outside of the FSM processes.
FSM_next_state_process: process(current_state, start, clk_counter, reset_all_dac)
begin
next_state <= current_state;
OutReg_busy <= '1';
OutReg_reset_out <= '1';
OutReg_sync_out <= '1';
clk_counter_enable <= '0';
case (current_state) is
when idle =>
OutReg_busy <= '0';
if (reset_all_dac = '1') then
OutReg_reset_out <= '0';
end if;
when load_and_prepare_data =>
next_state <= transmission;
when transmission =>
clk_counter_enable <= '1';
OutReg_sync_out <= '0';
if (clk_counter = 23) then
next_state <= idle;
end if;
when others =>
next_state <= idle;
end case ;
end process;

Why it is necessary to use internal signal for process?

I'm learning VHDL from the root, and everything is OK except this. I found this from Internet. This is the code for a left shift register.
library ieee;
use ieee.std_logic_1164.all;
entity lsr_4 is
port(CLK, RESET, SI : in std_logic;
Q : out std_logic_vector(3 downto 0);
SO : out std_logic);
end lsr_4;
architecture sequential of lsr_4 is
signal shift : std_logic_vector(3 downto 0);
begin
process (RESET, CLK)
begin
if (RESET = '1') then
shift <= "0000";
elsif (CLK'event and (CLK = '1')) then
shift <= shift(2 downto 0) & SI;
end if;
end process;
Q <= shift;
SO <= shift(3);
end sequential;
My problem is the third line from bottom. My question is, why we need to pass the internal signal value to the output? Or in other words, what would be the problem if I write Q <= shift (2 downto 0) & SI?
In the case of the shown code, the Q output of the lsr_4 entity comes from a register (shift representing a register stage and being connected to Q). If you write the code as you proposed, the SI input is connected directly (i.e. combinationally) to the Q output. This can also work (assuming you leave the rest of the code in place), it will perform the same operation logically expect eliminate one clock cycle latency. However, it's (generally) considered good design practice to have an entity's output being registered in order to not introduce long "hidden" combinational paths which are not visible when not looking inside an entity. It usually makes designing easier and avoids running into timing problems.
First, this is just a shift register, so no combinational blocks should be inferred (except for input and output buffers, which are I/O related, not related to the circuit proper).
Second, the signal called "shift" can be eliminated altogether by specifying Q as "buffer" instead of "out" (this is needed because Q would appear on both sides of the expression; "buffer" has no side effects on the inferred circuit). A suggestion for your code follows.
Note: After compiling your code, check in the Netlist Viewers / Technology Map Viewer tool what was actually implemented.
library ieee;
use ieee.std_logic_1164.all;
entity generic_shift_register is
generic (
N: integer := 4);
port(
CLK, RESET, SI: in std_logic;
Q: buffer std_logic_vector(N-1 downto 0);
SO: out std_logic);
end entity;
architecture sequential of generic_shift_register is
begin
process (RESET, CLK)
begin
if (RESET = '1') then
Q <= (others => '0');
elsif rising_edge(CLK) then
Q <= Q(N-2 downto 0) & SI;
end if;
end process;
SO <= Q(N-1);
end architecture;

VHDL error can't infer register because its behavior does not match any supported register model

I am new to VHDL and trying to make a delay/gate application for programmable FPGA, with adjustable lenght of delay and gate output. As soon as the input signal is recieved, the thing should ignore any other inputs, until generating of gate signal is finished.
I want to use this component for 8 different inputs and 8 different outputs later, and set desired delay/gate prameters separately for each one by means of writing registers.
When trying to compile in Quartus II v 11.0 i am getting this error:
Error (10821): HDL error at clkgen.vhd(46): can't infer register for "control_clkgen" because its behavior does not match any supported register model
And as well
Error (10822): HDL error at clkgen.vhd(37): couldn't implement registers for assignments on this clock edge
No idea whats wrong, here is the code of the component:
library ieee;
use IEEE.Std_Logic_1164.all;
use IEEE.Std_Logic_arith.all;
use IEEE.Std_Logic_unsigned.all;
ENTITY clkgen is
port(
lclk : in std_logic;
start_clkgen : in std_logic;
gate_clkgen : in std_logic_vector(31 downto 0);
delay_clkgen : in std_logic_vector(31 downto 0);
output_clkgen : out std_logic
);
END clkgen ;
ARCHITECTURE RTL of clkgen is
signal gate_cycles_clkgen : std_logic_vector(32 downto 0);
signal delay_cycles_clkgen : std_logic_vector(32 downto 0);
signal total_cycles_clkgen : std_logic_vector(32 downto 0);
signal counter_clkgen : std_logic_vector(32 downto 0);
signal control_clkgen : std_logic;
begin
gate_cycles_clkgen <= '0' & gate_clkgen;
delay_cycles_clkgen <= '0' & delay_clkgen;
total_cycles_clkgen <= gate_cycles_clkgen + delay_cycles_clkgen;
start_proc: process(lclk, start_clkgen)
begin
if (start_clkgen'event and start_clkgen = '1') then
if control_clkgen = '0' then
control_clkgen <= '1';
end if;
end if;
if (lclk'event and lclk = '1') then
if control_clkgen = '1' then
counter_clkgen <= counter_clkgen + 1;
if (counter_clkgen > delay_cycles_clkgen - 1 AND counter_clkgen < total_cycles_clkgen + 1) then
output_clkgen <= '1';
elsif (counter_clkgen = total_cycles_clkgen) then
counter_clkgen <= (others => '0');
output_clkgen <= '0';
control_clkgen <= '0';
end if;
end if;
end if;
end process start_proc;
END RTL;
Big thanks in advance for help.
The problem is that in the way you has described the element control_clkgen - it is edge sensitive to two different signals (lclk, and start_clkgen). What the tools are telling you is that "hey, as I am trying to make your valid VHDL design fit into a real piece of hardware, I have found that there are not any pieces of hardware that can implement what you want. Basically, there are no flip flops that can be edge sensitive to two signals (only one, typically the clock.
Possible solution: Do you really need control_clkgen to be sensitive to the edge of start_clkgen? Would it be good enough, or could you find another solution where start_proc is sensitive only to lclk and you simply check if start_clkgen is high?
start_proc: process(lclk)
begin
if (rising_edge(lclk)) then
start_clkgen_d <= start_clkgen;
if (start_clkgen='1' and start_clkgen_d='0') then
if control_clkgen = '0' then
control_clkgen <= '1';
end if;
end if;
end if;
...
You are describing a register control_clkgen which has two clocks start_clkgen and lclk. I guess that's not supported by your synthesis tool.
You have to describe this behavior in another way. Maybe use start_clkgen as asynchronous or synchronous preset signal or combine those two signals into one single clock signal or use more than one flipflop for that functionality.

Resources