So I saw this VHDL code for a DFF testbench somewhere and I don't quite get a few things.
1) Why are there 5 cases? Why aren't there just two: when the input is 0 and when it is 1?
2) Why did he pick those waiting periods? 12, 28, 2, 10 and 20 ns seem very randomly chosen. What was the logic behind that?
architecture testbench of dff_tb is
signal T_din: std_logic;
signal T_dclk: std_logic;
signal T_qout: std_logic;
signal T_nqout: std_logic;
component dff
port ( din: in std_logic;
dclk: in std_logic;
qout: out std_logic;
nqout: out std_logic
);
end component;
begin
dut_dff: dff port map (T_din,T_dclk,T_qout,T_nqout);
process
begin
T_dclk <= '0';
wait for 5 ns;
T_dclk <= '1';
wait for 5 ns;
end process;
process
variable err_cnt: integer := 0;
begin
--case1
T_din <= '1';
wait for 12 ns;
assert (T_qout='1') report "Error1!" severity error;
-- case 2
T_din <= '0';
wait for 28 ns;
assert (T_qout='0') report "Error2!" severity error;
-- case 3
T_din <= '1';
wait for 2 ns;
assert (T_qout='0') report "Error3!" severity error;
-- case 4
T_din <= '0';
wait for 10 ns;
assert (T_qout='0') report "Error4!" severity error;
-- case 5
T_din <= '1';
wait for 20 ns;
assert (T_qout='1') report "Error5!" severity error;
wait;
end process;
end testbench;
In the following, case 1 through case 5 are represented by named markers A through E:
Case 1 checks to see that T_qout doesn't get updated on the falling edge of T_dclk with T_din = '1'. See marker A.
Case 2 (marker B) checks to see that T_qout doesn't get updated on the falling edge of T_dclk with T_din = '0'.
(And about now you get the impression a student was supposed to do a gate level implementation of dff).
Case 3 (marker C) checks to see if T_qout remains a '0' (an assertion occurs when the condition is False), i.e. that the dff is clocked and that a '1' on T_din alone doesn't cause the output to change.
Case 4 (marker D) checks to see if T_qout remains a '0' for the opposite value of T_din (the assertion tests T_qout='0').
(These all appear to be checking gate level dff implementations).
Case 5 (marker E) appears to be checking that a master-slave edge-triggered D flip-flop isn't oscillating or 'relaxing' to the original state.
The testbench appears to be specific to a class assignment for implementing a DFF as a gate level model.
Now the question is, did the instructor cover all possible cases for a student to get it wrong?
You have to look closer on the clock. The clock switches every 5 ns.
Case 1
So in the beginning the DFF should be '1'.
Case 2
The output should become '0' 15 ns after the start, not after 12 ns; that's what you have to check.
Case 3
You set the input to '1', but the DFF should never react to it, because the pulse is too short: the input is back at '0' before the next rising clock edge at 45 ns.
Case 4 & Case 5
They just make sure that everything is all right after Case 3. Again, as in Case 2, you can check here after how much time the DFF really switches.
This testbench IS a little bit large if you consider that it only tests a DFF. But if you are learning about hardware description and testbenching, it is good to know what you have to look out for before you start to implement more complicated and complex designs. Especially if you are going to turn it into silicon, there is never enough testing done :)
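To follow the timing by hand, it helps to have a reference model. The following is only a minimal behavioural sketch, assuming the device under test is meant to be a rising-edge-triggered flip-flop with no propagation delay (the question does not show the actual dff). The port names are taken from the component declaration in the testbench; the architecture name is made up.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity dff is
  port (
    din   : in  std_logic;
    dclk  : in  std_logic;
    qout  : out std_logic;
    nqout : out std_logic);
end entity dff;

-- Assumption: rising-edge triggered, zero delay.
architecture behavioural of dff is
begin
  process (dclk)
  begin
    if rising_edge(dclk) then
      qout  <= din;       -- sample the input on the rising edge only
      nqout <= not din;   -- complementary output
    end if;
  end process;
end architecture behavioural;
```

With the 10 ns clock of the testbench the rising edges are at 5, 15, 25, ... ns and the five checks happen at 12, 40, 42, 52 and 72 ns. This model samples '1' at 5 ns, '0' from 15 ns on, ignores the 2 ns pulse between 40 and 42 ns, and samples '1' again at 55 ns, so none of the five assertions should fire.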
I have been modelling a few simple VHDL gates, but I can't seem to get the time delay right. I have the following code:
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
ENTITY AND_4 IS
GENERIC (delay : delay_length := 0 ns);
PORT (a, b, c, d : IN std_logic;
x : OUT STD_logic);
END ENTITY AND_4;
ARCHITECTURE dflow OF AND_4 IS
BEGIN
x <= ( a and b and c and d) AFTER delay;
END ARCHITECTURE dflow;
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
ENTITY TEST_AND_4 IS
END ENTITY TEST_AND_4;
ARCHITECTURE IO OF TEST_AND_4 IS
COMPONENT AND_4 IS
GENERIC (delay : delay_length := 0 ns);
PORT (a, b, c, d : IN std_logic;
x : OUT STD_logic);
END COMPONENT AND_4;
SIGNAL a,b,c,d,x : std_logic := '0';
BEGIN
G1 : AND_4 GENERIC MAP (delay => 5 ns) PORT MAP (a,b,c,d,x);
PROCESS
VARIABLE error_count : integer:= 0;
BEGIN
WAIT FOR 1 NS;
a <= '1';
b <= '0';
c <= '0';
d <= '0';
ASSERT (x = '1') REPORT "output error" SEVERITY error;
IF (x /= '1') THEN
error_count := error_count + 1;
END IF;
--Repeated test vector -- omitted
END PROCESS;
END ARCHITECTURE IO;
CONFIGURATION TESTER1 OF TEST_AND_4 IS
FOR IO
FOR G1 : AND_4
USE ENTITY work.AND_4(dflow)
GENERIC MAP (delay);
END FOR;
END FOR;
END CONFIGURATION TESTER1;
When I simulate the model I only get the 1 ns delay that I added to each test vector. I'm guessing the problem is how I pass the delay to the component declaration in the test bench. I've tried a few things and reread the topic in the book I have but still no joy. Any help ?
Many thanks
D
Modifying the unlabelled stimulus process in your testbench:
process
variable error_count : integer:= 0;
begin
wait for 1 ns;
a <= '1';
-- b <= '0';
-- c <= '0';
-- d <= '0';
-- assert (x = '1') report "output error" severity error;
-- if (x /= '1') then
-- error_count := error_count + 1;
-- end if;
--repeated test vector -- omitted
b <= '1';
c <= '1';
d <= '1';
wait for 5 ns;
wait for 5 ns;
wait;
end process;
to simply demonstrate the delay shows that the generic delay is being passed to the instantiated component.
If you get something different perhaps you could convert your question to a Minimal, Complete, and Verifiable example by ensuring that the example actually reproduces the problem and that we know your results:
Describe the problem. "It doesn't work" is not a problem statement. Tell us what the expected behavior should be. Tell us what the exact wording of the error message is, and which line of code is producing it. Put a brief summary of the problem in the title of your question.
The little bit of stimulus you left in your testbench doesn't appear to properly test the and_4.
In the event there was more stimulus and you weren't waiting the pulse rejection limit implied by your signal assignment delay mechanism, you'd get nothing but those annoying assertions.
See IEEE Std 1076-2008 10.5.2 Simple signal assignment statements, 10.5.2.1 General, paragraphs 5 and 6:
The right-hand side of a simple waveform assignment may optionally specify a delay mechanism. A delay mechanism consisting of the reserved word transport specifies that the delay associated with the first waveform element is to be construed as transport delay. Transport delay is characteristic of hardware devices (such as transmission lines) that exhibit nearly infinite frequency response: any pulse is transmitted, no matter how short its duration. If no delay mechanism is present, or if a delay mechanism including the reserved word inertial is present, the delay is construed to be inertial delay. Inertial delay is characteristic of switching circuits: a pulse whose duration is shorter than the switching time of the circuit will not be transmitted, or in the case that a pulse rejection limit is specified, a pulse whose duration is shorter than that limit will not be transmitted.
Every inertially delayed signal assignment has a pulse rejection limit. If the delay mechanism specifies inertial delay, and if the reserved word reject followed by a time expression is present, then the time expression specifies the pulse rejection limit. In all other cases, the pulse rejection limit is specified by the time expression associated with the first waveform element.
(Note you can go to 10.5.2.2 Executing a simple assignment statement and see the after time_expression is part of the waveform_element and not the delay mechanism).
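The difference between the delay mechanisms quoted above can be seen in a small experiment. This is only an illustrative sketch: the entity and signal names are made up, and the 5 ns delay and 1 ns pulse are chosen to mirror the numbers in the question.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity delay_demo_tb is
end entity delay_demo_tb;

architecture sim of delay_demo_tb is
  signal a           : std_logic := '0';
  signal y_inertial  : std_logic := '0';
  signal y_reject    : std_logic := '0';
  signal y_transport : std_logic := '0';
begin
  -- Default (inertial) delay: the pulse rejection limit equals the 5 ns delay.
  y_inertial  <= a after 5 ns;
  -- Inertial delay with an explicit, shorter pulse rejection limit.
  y_reject    <= reject 500 ps inertial a after 5 ns;
  -- Transport delay: every pulse is passed on, however short.
  y_transport <= transport a after 5 ns;

  stim : process
  begin
    wait for 10 ns;
    a <= '1';            -- start of a 1 ns pulse
    wait for 1 ns;
    a <= '0';            -- end of the pulse at 11 ns
    wait for 4500 ps;    -- now at 15.5 ns, inside the delayed pulse window
    assert y_transport = '1' report "transport lost the pulse" severity error;
    assert y_reject    = '1' report "reject 500 ps lost the pulse" severity error;
    assert y_inertial  = '0' report "inertial passed a 1 ns pulse" severity error;
    wait;
  end process;
end architecture sim;
```

The 1 ns pulse is shorter than the 5 ns default rejection limit, so y_inertial never moves, while the other two outputs show the pulse from 15 ns to 16 ns. This is the effect that hides the gate output when the stimulus changes every 1 ns.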
Sure
ENTITY TEST_AND_4 IS
END ENTITY TEST_AND_4;
ARCHITECTURE IO OF TEST_AND_4 IS
COMPONENT AND_4 IS
GENERIC (delay : delay_length := 0 ns);
PORT (a, b, c, d : IN std_logic;
x : OUT STD_logic);
END COMPONENT AND_4;
SIGNAL a,b,c,d,x : std_logic := '0';
BEGIN
G1 : AND_4 GENERIC MAP (delay => 5 NS) PORT MAP (a,b,c,d,x);
PROCESS
VARIABLE error_count : integer:= 0;
BEGIN
WAIT FOR 6 NS; -- Changed to 6 ns so that the wait is longer than the
-- generic gate propagation delay
a <= '1';
b <= '1';
c <= '1';
d <= '1';
ASSERT (x = '1') REPORT "output error" SEVERITY error;
IF (x /= '1') THEN
error_count := error_count + 1;
END IF;
I have noted the change I made to the test bench model above, seems kinda obvious now but yesterday it had me pulling my hair out.
Cheers
D
The 'fix' was to change the WAIT value in the sequential test bench model from 1 ns to 6 ns. This gives the gate the time to change state because it has a 5 ns inertial delay.
WAIT FOR 6 NS; -- Changed to 6 ns so that the wait is longer than the
-- generic gate propagation delay
Thanks for the help, but I spotted the problem this morning after reading USER115520's post. The delay I set was 'inertial' and set generically at 5 ns. In my test bench process I had only set 1 ns wait statements in between input signal changes. Thus the gate would not perform the transition when the correct stimuli were introduced.
I inserted a 6 ns delay after a=1 b=1 c=1 d=1 and got the correct response from the gate.
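For reference, the pattern described here (apply the vector, wait longer than the gate delay, then check) could look like the sketch below. It is an assumption-laden illustration rather than the original testbench: the entity name and the GATE_DELAY constant are made up, and a concurrent assignment stands in for the AND_4 instance so that the example is self-contained.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and4_wait_tb is
end entity and4_wait_tb;

architecture sim of and4_wait_tb is
  constant GATE_DELAY : time := 5 ns;   -- same value as the generic in the question
  signal a, b, c, d, x : std_logic := '0';
begin
  -- Stand-in for the AND_4 instance (inertial delay, as in the dflow architecture).
  x <= (a and b and c and d) after GATE_DELAY;

  stim : process
    variable error_count : integer := 0;
  begin
    -- Vector 1: all inputs high, output expected high.
    a <= '1'; b <= '1'; c <= '1'; d <= '1';
    wait for GATE_DELAY + 1 ns;          -- wait BEFORE checking, longer than the delay
    assert x = '1' report "output error" severity error;
    if x /= '1' then
      error_count := error_count + 1;
    end if;

    -- Vector 2: one input low, output expected low.
    d <= '0';
    wait for GATE_DELAY + 1 ns;
    assert x = '0' report "output error" severity error;
    if x /= '0' then
      error_count := error_count + 1;
    end if;

    report "errors: " & integer'image(error_count);
    wait;
  end process;
end architecture sim;
```

The important detail is the order: the wait sits between the assignment and the assertion, so the check sees the output produced by the current vector and not the previous one.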
In the past I asked a question about resets, and how to divide a high clock frequency down to a series of lower clock square wave frequencies, where each output is a harmonic of one another e.g. the first output is 10 Hz, second is 20 Hz etc.
I received several really helpful answers recommending what appears to be the convention of using a clock enable pin to create lower frequencies.
An alternative has since occurred to me: using an n-bit number that is constantly incremented, and taking the last x bits of the number as the clock outputs, where x is the number of outputs.
It works in synthesis for me, but I'm curious to know, as I've never seen it mentioned anywhere online or on SO: am I missing something that means it's actually a terrible idea and I'm simply creating problems for later?
I'm aware that the limitations on this are that I can only produce frequencies that are the input frequency divided by a power of 2, and so most of the time it will only approximate the desired output frequency (but will still be of the right order). Is this limitation the only reason it isn't recommended?
Thanks very much!
David
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;
library UNISIM;
use UNISIM.VComponents.all;
use IEEE.math_real.all;
ENTITY CLK_DIVIDER IS
GENERIC(INPUT_FREQ : INTEGER; --Can only divide the input frequency by a power of 2
OUT1_FREQ : INTEGER
);
PORT(SYSCLK : IN STD_LOGIC;
RESET_N : IN STD_LOGIC;
OUT1 : OUT STD_LOGIC; --Actual divider is 2^(ceiling[log2(input/freq)])
OUT2 : OUT STD_LOGIC); --Actual output is input over value above
END CLK_DIVIDER;
architecture Behavioral of Clk_Divider is
constant divider : integer := INPUT_FREQ / OUT1_FREQ;
constant counter_bits : integer := integer(ceil(log2(real(divider))));
signal counter : unsigned(counter_bits - 1 downto 0) := (others => '0');
begin
proc : process(SYSCLK)
begin
if rising_edge(SYSCLK) then
counter <= counter + 1;
if RESET_N = '0' then
counter <= (others => '0');
end if;
end if;
end process;
OUT1 <= counter(counter'length - 1);
OUT2 <= not counter(counter'length - 2);
end Behavioral;
Functionally the two outputs OUT1 and OUT2 can be used as clocks, but that method of making clocks does not scale and is likely to cause problems in the implementation, so it is a bad habit. However, it is of course important to understand why this is so.
The reason it does not scale is that every signal used as a clock in an FPGA has to be distributed through a special clock net, where the latency and skew are well-defined, so that all flip-flops and memories on each clock are updated synchronously. The number of such clock nets is very limited, usually in the range of 10 to 40 in an FPGA device, and restrictions on use and location typically make it even more critical to plan the use of clock nets. So it is typically required to reserve clock nets for real asynchronous clocks only, where there is no alternative to using a clock net.
The reason it is likely to cause problems is that clocks created from bits in a counter have no guaranteed timing relation. So if it is required to move data between these clock domains, additional constraints for synchronization are needed, in order to be sure that the Clock Domain Crossing (CDC) is handled correctly. This is done through constraints for synthesis and/or Static Timing Analysis (STA), and is usually a little tricky to get right, so using a design methodology that simplifies STA is a habit that saves design time.
So in designs where it is possible to use a common clock and then generate synchronous clock enable signals, this should be the preferred approach. For the specific design above, a clock enable can be generated simply by detecting the '0' to '1' transition of the relevant counter bit, and then asserting the clock enable in the single cycle where the transition is detected. Then a single clock net can be used, together with 2 clock enables like CE1 and CE2, and no special STA constraints are required.
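A rough sketch of that clock-enable approach, reusing the SYSCLK and RESET_N names from the question, might look as follows. The entity name, the COUNTER_BITS generic and the internal signal names are made up for illustration; CE1 and CE2 are the enable names suggested above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity clk_enable_gen is
  generic (COUNTER_BITS : integer range 2 to 31 := 4);  -- hypothetical generic
  port (
    SYSCLK  : in  std_logic;
    RESET_N : in  std_logic;
    CE1     : out std_logic := '0';   -- one pulse every 2**COUNTER_BITS cycles
    CE2     : out std_logic := '0');  -- one pulse every 2**(COUNTER_BITS-1) cycles
end entity clk_enable_gen;

architecture rtl of clk_enable_gen is
  signal counter : unsigned(COUNTER_BITS - 1 downto 0) := (others => '0');
  signal prev_hi : std_logic := '0';  -- previous value of the top counter bit
  signal prev_lo : std_logic := '0';  -- previous value of the next lower bit
begin
  process (SYSCLK)
  begin
    if rising_edge(SYSCLK) then
      if RESET_N = '0' then
        counter <= (others => '0');
        prev_hi <= '0';
        prev_lo <= '0';
        CE1     <= '0';
        CE2     <= '0';
      else
        counter <= counter + 1;
        prev_hi <= counter(counter'high);
        prev_lo <= counter(counter'high - 1);
        -- Enable is high for the single cycle in which a '0' -> '1'
        -- transition of the counter bit is detected.
        CE1 <= counter(counter'high) and not prev_hi;
        CE2 <= counter(counter'high - 1) and not prev_lo;
      end if;
    end if;
  end process;
end architecture rtl;
```

Downstream logic then stays on SYSCLK and merely qualifies its updates, for example with "if rising_edge(SYSCLK) then if CE1 = '1' then ... end if; end if;", so everything remains in one clock domain.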
Morten already pointed out the theory in his answer.
With the aid of two examples, I will demonstrate the problems you encounter when using a generated clock instead of clock enables.
Clock Distribution
At first, one must take care that a clock arrives at (almost) the same time at all destination flip-flops. Otherwise, even a simple shift register with 2 stages like this one would fail:
process(clk_gen)
begin
if rising_edge(clk_gen) then
tmp <= d;
q <= tmp;
end if;
end process;
The intended behavior of this example is that q gets the value of d after two rising edges of the generated clock clk_gen.
If the generated clock is not buffered by a global clock buffer, then the delay will be different for each destination flip-flop because it will be routed via the general-purpose routing.
Thus, the behavior of the shift register can be described as follows with some explicit delays:
library ieee;
use ieee.std_logic_1164.all;
entity shift_reg is
port (
clk_gen : in std_logic;
d : in std_logic;
q : out std_logic);
end shift_reg;
architecture rtl of shift_reg is
signal ff_0_q : std_logic := '0'; -- output of flip-flop 0
signal ff_1_q : std_logic := '0'; -- output of flip-flop 1
signal ff_0_c : std_logic; -- clock input of flip-flop 0
signal ff_1_c : std_logic; -- clock input of flip-flop 1
begin -- rtl
-- different clock delay per flip-flop if general-purpose routing is used
ff_0_c <= transport clk_gen after 500 ps;
ff_1_c <= transport clk_gen after 1000 ps;
-- two closely packed registers with clock-to-output delay of 100 ps
ff_0_q <= d after 100 ps when rising_edge(ff_0_c);
ff_1_q <= ff_0_q after 100 ps when rising_edge(ff_1_c);
q <= ff_1_q;
end rtl;
The following test bench just feeds in a '1' at input d, so that q should be '0' after one clock edge and '1' after two clock edges.
library ieee;
use ieee.std_logic_1164.all;
entity shift_reg_tb is
end shift_reg_tb;
architecture sim of shift_reg_tb is
signal clk_gen : std_logic;
signal d : std_logic;
signal q : std_logic;
begin -- sim
DUT: entity work.shift_reg port map (clk_gen => clk_gen, d => d, q => q);
WaveGen_Proc: process
begin
-- Note: registers inside DUT are initialized to zero
d <= '1'; -- shift in '1'
clk_gen <= '0';
wait for 2 ns;
clk_gen <= '1'; -- just one rising edge
wait for 2 ns;
assert q = '0' report "Wrong output" severity error;
wait;
end process WaveGen_Proc;
end sim;
But, the simulation waveform shows that q already gets '1' after the first clock edge (at 3.1 ns) which is not the intended behavior.
That's because FF 1 already sees the new value from FF 0 when the clock arrives there.
This problem can be solved by distributing the generated clock via a clock tree which has a low skew.
To access one of the clock trees of the FPGA, one must use a global clock buffer, e.g., BUFG on Xilinx FPGAs.
Data Handover
The second problem is the handover of multi-bit signals between two clock domains.
Let's assume we have 2 registers with 2 bits each. Register 0 is clocked by the original clock and register 1 is clocked by the generated clock.
The generated clock is already distributed by clock tree.
Register 1 just samples the output from register 0.
But now, the different wire delays for both register bits in between play an important role. These have been modeled explicitly in the following design:
library ieee;
use ieee.std_logic_1164.all;
library unisim;
use unisim.vcomponents.all;
entity handover is
port (
clk_orig : in std_logic; -- original clock
d : in std_logic_vector(1 downto 0); -- data input
q : out std_logic_vector(1 downto 0)); -- data output
end handover;
architecture rtl of handover is
signal div_q : std_logic := '0'; -- output of clock divider
signal bufg_o : std_logic := '0'; -- output of clock buffer
signal clk_gen : std_logic; -- generated clock
signal reg_0_q : std_logic_vector(1 downto 0) := "00"; -- output of register 0
signal reg_1_d : std_logic_vector(1 downto 0); -- data input of register 1
signal reg_1_q : std_logic_vector(1 downto 0) := "00"; -- output of register 1
begin -- rtl
-- Generate a clock by dividing the original clock by 2.
-- The 100 ps delay is the clock-to-output time of the flip-flop.
div_q <= not div_q after 100 ps when rising_edge(clk_orig);
-- Add global clock-buffer as well as mimic some delay.
-- Clock arrives at (almost) same time on all destination flip-flops.
clk_gen_bufg : BUFG port map (I => div_q, O => bufg_o);
clk_gen <= transport bufg_o after 1000 ps;
-- Sample data input with original clock
reg_0_q <= d after 100 ps when rising_edge(clk_orig);
-- Different wire delays between register 0 and register 1 for each bit
reg_1_d(0) <= transport reg_0_q(0) after 500 ps;
reg_1_d(1) <= transport reg_0_q(1) after 1500 ps;
-- All flip-flops of register 1 are clocked at the same time due to clock buffer.
reg_1_q <= reg_1_d after 100 ps when rising_edge(clk_gen);
q <= reg_1_q;
end rtl;
Now, just feed in the new data value "11" via register 0 with this testbench:
library ieee;
use ieee.std_logic_1164.all;
entity handover_tb is
end handover_tb;
architecture sim of handover_tb is
signal clk_orig : std_logic := '0';
signal d : std_logic_vector(1 downto 0);
signal q : std_logic_vector(1 downto 0);
begin -- sim
DUT: entity work.handover port map (clk_orig => clk_orig, d => d, q => q);
WaveGen_Proc: process
begin
-- Note: registers inside DUT are initialized to zero
d <= "11";
clk_orig <= '0';
for i in 0 to 7 loop -- 4 clock periods
wait for 2 ns;
clk_orig <= not clk_orig;
end loop; -- i
wait;
end process WaveGen_Proc;
end sim;
As can be seen in the following simulation output, the output of register 1 toggles to an intermediate value of "01" at 3.1 ns first because the input of register 1 (reg_1_d) is still changing when the rising edge of the generated clock occurs.
The intermediate value was not intended and can lead to undesired behavior. The correct value is not seen until the next rising edge of the generated clock.
To solve this issue, one can use:
special codes, where only one bit flips at a time, e.g., gray code, or
cross-clock FIFOs, or
handshaking with the help of single control bits.
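As an illustration of the first option, here is a minimal sketch of a counter whose registered output is Gray coded, so that only one bit changes per clock edge. This only helps for values that count up or down by one (such as FIFO pointers), not for arbitrary data like the "11" used in the example above. The entity name, the WIDTH generic and the signal names are made up.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity gray_counter is
  generic (WIDTH : positive := 2);   -- hypothetical generic
  port (
    clk  : in  std_logic;
    gray : out std_logic_vector(WIDTH - 1 downto 0));
end entity gray_counter;

architecture rtl of gray_counter is
  signal bin    : unsigned(WIDTH - 1 downto 0) := (others => '0');
  signal gray_r : std_logic_vector(WIDTH - 1 downto 0) := (others => '0');
begin
  process (clk)
    variable nxt : unsigned(WIDTH - 1 downto 0);
  begin
    if rising_edge(clk) then
      nxt := bin + 1;
      bin <= nxt;
      -- Binary to Gray conversion: g = b xor (b >> 1).
      -- The result is registered so that the output is glitch-free.
      gray_r <= std_logic_vector(nxt xor shift_right(nxt, 1));
    end if;
  end process;

  gray <= gray_r;
end architecture rtl;
```

Because consecutive Gray codes differ in exactly one bit, a receiving register that samples while the value is changing sees either the old or the new code, never an unrelated intermediate one.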
I'm working on my master's thesis and I'm pretty new to VHDL, but still I have to implement some complex things. This is one of the easiest structures I had to write, and still I'm encountering some problems.
It's an FSM implementing a 24-bit shift register with an active-low sync signal (to program a DAC). It's just the end of a complex elaboration chain I created for my project. I followed the example model of an FSM as much as I could.
The behavioral simulation works fine; actually the whole elaboration chain I created works perfectly fine as far as the behavioral simulation is concerned. However, once I try the Post-translate simulation things start to go wrong: lots of 'X' output signals.
With this simple shift register I DON'T get any 'X', however I can't get to the load_and_prepare_data phase. It seems that the current_state changes (by inspecting some signals), but the elaboration doesn't go on.
Please keep in mind that since I'm new to the language, I have no idea of what timing constraints I should set on this FSM (and I wouldn't know how to write them on the top.ucf anyway)
Can you see what's wrong?
Thanks in advance
EDIT
I followed your advice and cleaned up the FSM by using a single state process. I still have some doubts about "where to put what" but I really like the new implementation. Anyway, I now get a clean behavioral simulation but 'X' on all outputs in post-translate simulation.
What is causing this?
I'll post both the new code and the testbench:
----------------------------------------------------------------------------------
-- Company:
-- Engineer:
--
-- Create Date: 14:44:03 11/28/2014
-- Design Name:
-- Module Name: dac_ad5764r_24bit_sr_programmer_v2 - Behavioral
-- Project Name:
-- Target Devices:
-- Tool versions:
-- Description: This is a PISO shift register that gets a 24bit parallel input word.
-- It outputs the 24bit input word starting from the MSB and enables
-- an active low ChipSelect line for 24 clock periods.
-- Dependencies:
--
-- Revision:
-- Revision 0.01 - File Created
-- Additional Comments:
--
----------------------------------------------------------------------------------
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
-- Uncomment the following library declaration if using
-- arithmetic functions with Signed or Unsigned values
use IEEE.NUMERIC_STD.ALL;
-- Uncomment the following library declaration if instantiating
-- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;
entity dac_ad5764r_24bit_sr_programmer_v2 is
Port ( clk : in STD_LOGIC;
start : in STD_LOGIC;
reset : in STD_LOGIC; -- Note that this reset is for the FSM not for the DAC
reset_all_dac : in STD_LOGIC;
data_in : in STD_LOGIC_VECTOR (23 downto 0);
serial_data_out : out STD_LOGIC;
sync_out : out STD_LOGIC; -- This is a chip select
reset_out : out STD_LOGIC;
busy : out STD_LOGIC
);
end dac_ad5764r_24bit_sr_programmer_v2;
architecture Behavioral of dac_ad5764r_24bit_sr_programmer_v2 is
-- States
type state_type is (idle, load_and_prepare_data, transmission);
--ATTRIBUTE ENUM_ENCODING : STRING;
--ATTRIBUTE ENUM_ENCODING OF state_type: TYPE IS "001 010 100";
signal state: state_type := idle;
--signal next_state: state_type := idle;
-- Clock counter
--signal clk_counter_enable : STD_LOGIC := '0';
signal clk_counter : unsigned(4 downto 0) := (others => '0');
-- Shift register
signal stored_data: STD_LOGIC_VECTOR (23 downto 0) := (others => '0');
begin
FSM_single_process: process(clk)
begin
if rising_edge(clk) then
if reset = '1' then
serial_data_out <= '0';
sync_out <= '1';
reset_out <= '1';
busy <= '0';
state <= idle;
else
-- Default
serial_data_out <= '0';
sync_out <= '1';
reset_out <= '1';
busy <= '0';
case (state) is
when transmission =>
serial_data_out <= stored_data(23);
sync_out <= '0';
busy <= '1';
clk_counter <= clk_counter + 1;
stored_data <= stored_data(22 downto 0) & "0";
state <= transmission;
if (clk_counter = 23) then
state <= idle;
end if;
when others => -- Idle
if start = '1' then
serial_data_out <= data_in(23);
sync_out <= '0';
reset_out <= '1';
busy <= '1';
stored_data <= data_in;
clk_counter <= "00001";
state <= transmission;
end if;
end case;
-- if (reset_all_dac = '1') then
-- reset_out <= '0';
-- end if;
end if;
end if;
end process;
end;
And the testbench:
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
-- Uncomment the following library declaration if using
-- arithmetic functions with Signed or Unsigned values
--USE ieee.numeric_std.ALL;
ENTITY dac_ad5764r_24bit_sr_programmer_tb IS
END dac_ad5764r_24bit_sr_programmer_tb;
ARCHITECTURE behavior OF dac_ad5764r_24bit_sr_programmer_tb IS
-- Component Declaration for the Unit Under Test (UUT)
COMPONENT dac_ad5764r_24bit_sr_programmer_v2
PORT(
clk : IN std_logic;
start : IN std_logic;
reset : IN std_logic;
data_in : IN std_logic_vector(23 downto 0);
serial_data_out : OUT std_logic;
reset_all_dac : IN std_logic;
sync_out : OUT std_logic;
reset_out : OUT std_logic;
--finish : OUT std_logic;
busy : OUT std_logic
);
END COMPONENT;
--Inputs
signal clk : std_logic := '0';
signal start : std_logic := '0';
signal reset : std_logic := '0';
signal data_in : std_logic_vector(23 downto 0) := (others => '0');
signal reset_all_dac : std_logic := '0';
--Outputs
signal serial_data_out : std_logic;
signal sync_out : std_logic;
signal reset_out : std_logic;
--signal finish : std_logic;
signal busy : std_logic;
-- Clock period definitions
constant clk_period : time := 100 ns;
BEGIN
-- Instantiate the Unit Under Test (UUT)
uut: dac_ad5764r_24bit_sr_programmer_v2 PORT MAP (
clk => clk,
start => start,
reset => reset,
data_in => data_in,
reset_all_dac => reset_all_dac,
serial_data_out => serial_data_out,
sync_out => sync_out,
reset_out => reset_out,
--finish => finish,
busy => busy
);
-- Clock process definitions
clk_process :process
begin
clk <= '0';
wait for clk_period/2;
clk <= '1';
wait for clk_period/2;
end process;
-- Stimulus process
stim_proc: process
begin
-- wait 10 clock periods before pulsing reset
wait for clk_period*10;
reset <= '1' after 25 ns;
wait for clk_period*1;
reset <= '0' after 25 ns;
wait for clk_period*3;
reset_all_dac <= '1' after 25 ns;
wait for clk_period*1;
reset_all_dac <= '0' after 25 ns;
wait for clk_period*5;
data_in <= "111111111111111111111111" after 25 ns;
wait for clk_period*3;
start <= '1' after 25 ns;
wait for clk_period*1;
start <= '0' after 25 ns;
wait;
end process;
END;
UPDATE 1
Updated with the last design: this code is not causing any 'X' (can't figure out why, this doesn't but the previous did). However it's not starting (in POST-TRANSLATE simulation) just like the first 3 process machine, and the signal sync_out is stuck at 0 while it should be '1' by default.
UPDATE 2
I've been looking into the technology schematic, starting from the problem of sync_out = 0: it's implemented with an FDS, S is the FSM reset signal, D is coming from a LUT3 with I = state&reset&start and INIT = 45 = "00101101". I've looked for this LUT3 in the simulation and I've noticed that it has INIT = "00000000"!
Is there something I'm missing about how to run this simulation? It seems that none of the LUTs in the design have been initialized!
UPDATE 3
It seems that the Post-Translate simulation is buggy in some way, or I'm not configuring it correctly for some reason: the Post-Map and the Post-PAR simulations work and display some outputs.
However there is an odd bug: the stored_data register is not updated with the complete data_in vector, after that, the FSM operates correctly and outputs the data stored.
I've looked in the technology schematic just after synthesis and for some reason the bits 23, 22, 21, 19 and 18 are not connected to the corresponding data_in bits. You can see the effect in this screenshot from Post-Map simulation. The same happens in Post-PAR, but it seems that this problem comes directly from the synthesis!
Solved: the strange output comes from the synthesis optimization. The tool realized that the previous block in the elaboration chain will never output anything other than 0 for those specific bits. My mistake was assuming that I could test the single block alone: what I was really testing was the block synthesized for the FPGA taking into account everything else in the design!
Thanks to everybody who helped me, I'm going to follow your advice!
I prefer the single-process form of state machine, which is cleaner, simpler and much less prone to bugs like sensitivity-list errors. I would also endorse the points in Paebbels' excellent answer. However I don't think any of these are the problem here.
One thing to be aware of in post-synth and post-PAR simulations is that their model of time is different from the behavioural model. The behavioural model follows simple rules as I described in this answer and ensures that in a typical design flow you can go straight to hardware - without post-synth simulation, without worry.
Indeed I only use post-synth or post-PAR simulations if I'm chasing a suspected tool bug. (For FPGA designs, not ASIC, that is!)
However, that simple timing model has its limitations. You may be familiar with problems like a clock signal assigned via signal assignment (usually buried in a 3rd party model where you don't expect it) which consumes a delta cycle, and ensures that your clocked data arrives before your clock instead of after, and everything subsequently occurs one cycle earlier than intended...
In behavioural modelling, a little discipline will keep clear of such troubles. But the same is not true of post-PAR modelling.
Your testbench is probably set up the same way as the behavioural model. And if so, that is likely to be the problem.
Here's what I do in this situation : I claim no formal authority for it, just experience. It also works well when interfacing the FPGA to external memory models with realistic timings.
1) I assume the simple (behavioural) timing model works correctly for all signals INTERNAL to the design.
2) I assume nothing of the sort for inputs and outputs from the design.
3) I take note of the estimated setup and hold timings on the inputs, (a) from the FPGA datasheet or better, (b) from the worst case values shown in the post-synth or post-PAR report, and structure the testbench around them.
Worked example : setup time 1 ns, hold time 2 ns, clock period 10 ns. This means that any input between 2 ns and 9 ns after a clock edge is guaranteed to be correctly read. I choose (arbitrarily) 5 ns.
signal_to_fpga <= driving_value after 5 ns;
(Note that Xilinx makes this absurdly counter-intuitive by expressing them as "offset in/out before/after" which refers timings to a previous or future clock edge instead of the one you're looking at)
Alternatively, if the input is fed from a CPU or memory in the real world, I use datasheet timing specifications for that device.
4) I take note of the worst case clk-out timing reported in the datasheet or report, and structure the design around them. (say, 7 ns)
fpga_output_pin <= driving_value after 7 ns;
Note that this "after" clause is obviously ignored by synthesis; however the post-synth back-annotation will introduce something very like it.
5) If this turns out to be not good enough, then (possibly in a wrapper component to avoid polluting the synthesisable code) improve accuracy like
fpga_output_pin <= 'X' after 1 ps, driving_value after 7 ns;
6) I re-run the behavioural simulation. Typically, it now fails, because it was written without realistic timings in mind.
7) I fix those failures. This may include adding realistic delays before testing values output from the design. It can be an iterative process.
Now, I have a reasonable expectation that the post-PAR simulation model will drop straight in to the testbench and work.
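Put together, a testbench following these rules could be sketched like this. The numbers are the ones from the worked example (10 ns clock, inputs driven 5 ns after the edge, 7 ns clock-to-out); the entity, constant and signal names are invented, and a small clocked process stands in for the post-PAR model so the sketch is self-contained.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity io_timing_tb is
end entity io_timing_tb;

architecture sim of io_timing_tb is
  constant CLK_PERIOD : time := 10 ns;
  constant T_IN       : time := 5 ns;   -- input offset inside the 2..9 ns window
  constant T_CO       : time := 7 ns;   -- assumed worst-case clock-to-out
  signal clk  : std_logic := '0';
  signal d, q : std_logic := '0';
  signal done : boolean   := false;
begin
  clk_gen : process
  begin
    while not done loop
      clk <= '0'; wait for CLK_PERIOD / 2;
      clk <= '1'; wait for CLK_PERIOD / 2;
    end loop;
    wait;
  end process;

  -- Stand-in for the design: a register whose output appears T_CO after the edge.
  dut : process (clk)
  begin
    if rising_edge(clk) then
      q <= d after T_CO;
    end if;
  end process;

  stim : process
  begin
    wait until rising_edge(clk);
    d <= '1' after T_IN;           -- input changes 5 ns after the clock edge
    wait until rising_edge(clk);   -- this edge samples the new value
    wait for T_CO + 1 ns;          -- only look once the output has settled
    assert q = '1' report "q did not follow d" severity error;
    done <= true;
    wait;
  end process;
end architecture sim;
```

The same stimulus and checks work unchanged whether the design under test is the behavioural model or a back-annotated netlist, which is the point of structuring the testbench around the I/O timings.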
Here are some hints to improve your code:
You can remove the Xilinx dependencies on UNISIM, because you are not using any Xilinx primitives.
Applying attribute ENUM_ENCODING has no effect on state encoding unless you also define the attribute FSM_ENCODING and set its value to user. One-hot encoding can be forced by setting FSM_ENCODING to one-hot. Normally synthesis is smart enough to find the best encoding.
None of your registers has a default value:
signal current_state : state_type := idle;
Your FSM is no FSM in the eyes of Xilinx synthesis tool (XST). I'm sure if you look into your synthesis report, you won't find that XST reports a FSM for current_state.
So what's wrong with your FSM?
Your FSM has no initial state.
Your FSM has multiple reset states (idle, load_and_prepare_data)
Your FSM has no transition from idle to load_and_prepare_data (reset is no transition)
Writing next_state transitions for the current state can cause XST to think it's no FSM
the default assignment next_state <= current_state; is sufficient.
If you change the type of signal clk_counter to unsigned you can do arithmetic much easier.
increment: clk_counter <= clk_counter + 1;
clear: clk_counter <= (others => '0');
compare: if (clk_counter = 23) then
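Put together, the three operations above might look like the following sketch. The entity name, the 5-bit width and the clk_counter_done output are assumptions for illustration; clk_counter, clk_counter_enable and the compare value 23 are taken from the hints.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical stand-alone counter; in the real design this would be a
-- process inside the existing architecture.
entity clk_counter_sketch is
  port (
    clk                : in  std_logic;
    clk_counter_enable : in  std_logic;
    clk_counter_done   : out std_logic
  );
end entity clk_counter_sketch;

architecture rtl of clk_counter_sketch is
  -- 5 bits are enough to count to 23 (assumed width).
  signal clk_counter : unsigned(4 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if clk_counter_enable = '1' then
        clk_counter <= clk_counter + 1;        -- increment
      else
        clk_counter <= (others => '0');        -- clear
      end if;
    end if;
  end process;

  -- compare: numeric_std allows comparing unsigned with an integer literal
  clk_counter_done <= '1' when clk_counter = 23 else '0';
end architecture rtl;
```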
It's not good style to use the FSM's state signal outside of the FSM processes.
FSM_next_state_process: process(current_state, start, clk_counter, reset_all_dac)
begin
  next_state <= current_state;
  OutReg_busy <= '1';
  OutReg_reset_out <= '1';
  OutReg_sync_out <= '1';
  clk_counter_enable <= '0';
  case (current_state) is
    when idle =>
      OutReg_busy <= '0';
      if (reset_all_dac = '1') then
        OutReg_reset_out <= '0';
      end if;
    when load_and_prepare_data =>
      next_state <= transmission;
    when transmission =>
      clk_counter_enable <= '1';
      OutReg_sync_out <= '0';
      if (clk_counter = 23) then
        next_state <= idle;
      end if;
    when others =>
      next_state <= idle;
  end case;
end process;
Description:
I want to include vhdl assert statements to report when set_delay and hold_delay time violations occur. I am not sure how to do this with my code and I have been to many places on the web and I don't understand. Please give examples with my code.
Code:
LIBRARY ieee;
USE ieee.std_logic_1164.all;
ENTITY dff IS
GENERIC (set_delay : TIME := 3 NS; prop_delay : TIME := 12 NS;
hold_delay : TIME := 5 NS);
PORT (d, set, rst, clk : IN BIT; q : OUT BIT; nq : OUT BIT := '1');
END dff;
--
ARCHITECTURE dff OF dff IS
SIGNAL state : BIT := '0';
BEGIN
dff: PROCESS
BEGIN
wait until rst;
wait until set;
wait until clk;
IF set = '1' THEN
q <= '1' AFTER set_delay;
nq <= '0' AFTER set_delay;
ELSIF rst = '1' THEN
q <= '0' AFTER prop_delay;
nq <= '1' AFTER prop_delay;
ELSIF clk = '1' AND clk'EVENT THEN
q <= d AFTER hold_delay;
nq <= NOT d AFTER hold_delay;
END IF;
END PROCESS dff;
END dff;
I do understand that the general assert syntax is:
ASSERT
condition
REPORT
"message"
SEVERITY
severity level;
Part of my problem is that I don't know where to put these assert statements and I am not sure how I would write them.
I would introduce additional signals in which you store the time of the last change. Then I'd add other processes which manage those signals and check the times:
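time_debug : block
  signal t_setup, t_hold : time := 0 ns;
begin
  setup_check : process (clk)
  begin
    if clk'event and clk = '1' then
      t_hold <= now;  -- remember the time of the last rising clock edge
      -- d must have been stable for at least set_delay before this edge
      assert (now - t_setup) >= set_delay REPORT "Setup time violated." SEVERITY note;
    end if;
  end process setup_check;
  hold_check: process (d)
  begin
    if d'event then
      t_setup <= now;  -- remember the time of the last change on d
      -- d must not change within hold_delay after the last rising edge
      assert (now - t_hold) >= hold_delay REPORT "Hold time violated." SEVERITY note;
    end if;
  end process hold_check;
end block time_debug;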
What this does is it saves the time of the last positive clock edge and the time of the last input change. Now every time either d changes or the clock rises the delays are checked. I couldn't verify this in a compiler because I don't have one set up here, but I'll gladly do so if there are problems with this solution.
I personally like to keep debug stuff in a dedicated block, so I can easily keep track of which signals I only use for debugging and can later easily remove them. It also makes it easier to add all debug signals to e.g. modelsim's wave screen.
Also note that these asserts and reports will only work in simulation.
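An alternative that avoids the extra bookkeeping signals is to use the predefined attributes 'stable and 'last_event. The following is only a sketch: the entity name dff_timing_check is made up, the generics reuse the names and defaults from the question, and the hold check assumes the clock stays high for at least hold_delay after each rising edge.

```vhdl
-- Hypothetical stand-alone checker; the two processes could equally be
-- placed inside the dff architecture or a debug block.
entity dff_timing_check is
  generic (
    set_delay  : time := 3 ns;
    hold_delay : time := 5 ns
  );
  port (d, clk : in bit);
end entity dff_timing_check;

architecture sim of dff_timing_check is
begin
  -- Setup: at each rising clock edge, d must not have changed
  -- during the preceding set_delay.
  setup_check : process (clk)
  begin
    if clk'event and clk = '1' then
      assert d'stable(set_delay)
        report "Setup time violated." severity warning;
    end if;
  end process setup_check;

  -- Hold: when d changes while clk is high, the last clk event was the
  -- rising edge, so clk'last_event is the time elapsed since that edge.
  hold_check : process (d)
  begin
    if d'event then
      assert clk = '0' or clk'last_event >= hold_delay
        report "Hold time violated." severity warning;
    end if;
  end process hold_check;
end architecture sim;
```

Severity warning is used here instead of note simply so violations stand out in the simulator log; that choice is a matter of taste.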
I have this signal that should be zero until another signal Start = 0. How can I accomplish this? Here is the relevant code:
din<=0;
wait until falling_edge(start);
for i in 0 to 63 loop
wait until clk = '1' and clk'event;
if i = 0 then
Start <= '1','0' after clk_period;
end if;
if (i < 24) then
din <= 255;
elsif (i > 40) then
din <= 255;
else
din <= 0;
end if;
end loop;
wait;
I thought I could just make din = 0 until the falling edge of start but it stops at the rising edge of start. I want to start reading the din values when start =0. Before that din = 0.
Here is a pic:
EDIT: Actually I got it to start at the correct signal values, but the dout value always has an intermediate value that isn't necessary. In this case it's 78450. I know this has to do with the testbench code, but I can't get it to just calculate the correct value at the correct time. What changes can be made to the code below to get rid of the intermediate value?
din<=0;
for i in 0 to 63 loop
wait until clk = '1' and clk'event;
if i = 0 then
Start <= '1','0' after clk_period;
elsif (i < 24) then
din <= 255;
elsif (i > 40) then
din <= 255;
else
din <= 0;
end if;
end loop;
First of all I assume (and hope) you are writing a testbench. If not, you should avoid using wait statements, as these have very limited support in synthesis tools.
Even in a testbench, it is best to use time-based wait or after statements only to generate the clock, and make all other signals dependent on an event (e.g. rising_edge(clk)). This avoids the problem of having multiple signals changing during delta cycle 0 along with the clock.
Consider the following code for a typical register:
process(clk) begin
if(rising_edge(clk)) then
a <= b;
end if;
end process;
and assume that clk and b are generated in a testbench as follows:
clk <= not clk after 1 ns; -- clk needs an initial value in its declaration, otherwise "not clk" stays 'U'
process begin
b <= '1', '0' after 10 ns;
wait;
end process;
At time 0 delta 0, clk changes to '1' and b would change to '1'.
At time 0 delta 1, the register process would run since clk changed, and a would change to '1'.
No further sensitivity exists, so time would update to the next event at 1 ns.
At time 1 delta 0, clk changes to '0'.
At time 1 delta 1, the register process is run since clk changed, but nothing happens because rising_edge(clk) is false.
The above repeats for time 2-9 ns.
At time 10 delta 0, clk changes to '1' and b changes to '0'. Note that clk and b change in the same delta cycle.
At time 10 delta 1, the register process runs and a changes to '0'! As far as the result is concerned, this means that b changed before the rising clock edge!
Even if this behavior is understandable in this simple system, it can lead to some incredibly difficult to find simulation bugs. It is therefore better to base all signals off of the appropriate clock.
process begin
-- Initialize b to 1.
b <= '1';
-- Wait for 5 cycles.
for i in 1 to 5 loop
wait until rising_edge(clk);
end loop;
-- Set b to 0.
b <= '0';
-- Done.
wait;
end process;
This avoids unexpected behavior, since all signals will change at least one delta cycle after the associated clock, meaning causality is maintained throughout all of your processes.
I have this signal that should be zero until another signal Start = 0. How can I accomplish this?
Maybe you can use a handshake signal and put it in the sensitivity list of the process. It will behave like a reset signal.
process (handshake_s, ...)
begin
if (handshake_s = '1') then -- falling edge of start
din <= 0;
else
-- do something
end if;
end process;
Use another process to update handshake_s.
process (start, ...)
begin
if falling_edge(start) then
handshake_s <= '1', '0' after 10 ns; -- produce a pulse
end if;
-- do something
end process;
Would you mind posting all your code here so that we can understand the waveform better?
Testbench or RTL code?
For a testbench, your coding style is mostly ok, however, your signal Start has a problem and will never be '1' during a rising edge of clock. It goes to '1' just after the rising edge of clock and will return to '0' either simultaneously with clock or 1 delta cycle before clock (depending on your clock setup). Either way, anything running on rising_edge clock, such as your design, will not see it as a '1'.
A simple way to avoid this is to use nominal delays (25% of tperiod_Clk) on all of your testbench outputs that go to the DUT (Device Under Test). The pattern for a pulse is as follows.
wait until clk = '1' ; -- I recommend using rising_edge(Clk) for readability
Start <= '1' after tpd, '0' after tpd + tperiod_clk ;
Alternatively, you can avoid this issue by not using waveform assignments, as in the following. In this case you don't need the tpd; however, if it really is a testbench, I recommend using it.
wait until clk = '1' ;
if i = 0 then
Start <= '1' after tpd ;
else
Start <= '0' after tpd ;
end if ;
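To see the effect of the nominal delay, here is a small self-checking testbench sketch. The entity name start_pulse_tb, the 10 ns clock period and the assertion messages are assumptions for illustration; tpd, tperiod_clk and Start follow the naming used above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity start_pulse_tb is
end entity start_pulse_tb;

architecture sim of start_pulse_tb is
  constant tperiod_clk : time := 10 ns;           -- assumed clock period
  constant tpd         : time := tperiod_clk / 4; -- nominal 25% delay
  signal clk   : std_logic := '0';
  signal Start : std_logic := '0';
begin
  -- Free-running clock; run the simulation for a fixed time (e.g. 100 ns).
  clk <= not clk after tperiod_clk / 2;

  stimulus : process
  begin
    wait until rising_edge(clk);
    -- One-cycle pulse, shifted by tpd so it straddles the next clock edge.
    Start <= '1' after tpd, '0' after tpd + tperiod_clk;

    wait until rising_edge(clk);
    assert Start = '1'
      report "Start was not high at the clock edge" severity error;

    wait until rising_edge(clk);
    assert Start = '0'
      report "Start was still high one cycle later" severity error;

    wait;
  end process stimulus;
end architecture sim;
```

With these numbers the clock rises at 5 ns, 15 ns and 25 ns, and Start is high from 7.5 ns to 17.5 ns, so exactly one rising edge sees it as '1'.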
For RTL code, you need to explore a different approach. Very briefly, one way to approach it is as follows. Note: do not use any delays, waveform assignments, or loops.
-- Counter to count from 0 to 63. Use "+ 1". Use "mod 64" if using type integer.
-- Start logic = decoder (can be coded separately)
-- Din Logic = decoder (can be coded separately)
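A minimal sketch of that outline, assuming an integer counter and reusing the thresholds from the question's loop (255 for counts below 24 or above 40, otherwise 0). The entity name stim_gen, the port types and the choice of count = 0 for Start are illustrative guesses, not taken from the original design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity stim_gen is
  port (
    clk   : in  std_logic;
    Start : out std_logic;
    din   : out integer range 0 to 255
  );
end entity stim_gen;

architecture rtl of stim_gen is
  signal count : integer range 0 to 63 := 0;
begin
  -- Counter from 0 to 63, wrapping with "mod 64".
  counter : process (clk)
  begin
    if rising_edge(clk) then
      count <= (count + 1) mod 64;
    end if;
  end process counter;

  -- Start logic: a decoder on the counter value (assumed: high at count 0).
  Start <= '1' when count = 0 else '0';

  -- Din logic: a second decoder on the same counter.
  din <= 255 when (count < 24 or count > 40) else 0;
end architecture rtl;
```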