UART Transmitter only functions when embedded logic analyzer is running - vhdl

I have been trying to implement a UART in order to communicate between my Lattice MachXO3D board and my computer. At the moment I am attempting to implement the transmission from the FPGA.
Upon testing on the hardware, I encountered a very strange issue. If run normally, it will function for a few seconds and then it will suddenly stop functioning (The CH340 connected to my computer will report it is receiving messages containing 0x0). However, if I embed a logic analyzer onto the FPGA through the Lattice Diamond software, and I run the analyzer, it will function perfectly for an extended period of time.
Sadly, I don't have a logic analyzer, so the embedded logic analyzer is my only chance at knowing what is actually being transmitted.
These are the files related to my implementation:
baud_gen
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;
-- Generates 16 ticks per bit
ENTITY baud_gen IS
GENERIC(divider: INTEGER := 13 -- 24 MHz / (115200 * 16) = 13.02, rounded to 13
);
PORT(
clk, reset: IN STD_LOGIC;
s_tick: OUT STD_LOGIC
);
END baud_gen;
ARCHITECTURE working OF baud_gen IS
BEGIN
PROCESS(clk)
VARIABLE counter: UNSIGNED(3 DOWNTO 0) := to_unsigned(0,4);
BEGIN
IF clk'EVENT AND clk='1' THEN
IF reset='1' THEN
s_tick <= '0';
counter := to_unsigned(0,4);
ELSIF counter=to_unsigned(divider-1,4) then
s_tick <= '1';
counter:= to_unsigned(0,4);
ELSE
s_tick <= '0';
counter := counter + 1;
END IF;
END IF;
END PROCESS;
END working;
rs232_tx
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;
ENTITY rs232_tx IS
PORT(clk: IN std_logic;
tx: OUT std_logic;
rst: IN std_logic;
fifo_empty: IN std_logic;
fifo_RdEn, fifo_RdClock: OUT std_logic;
fifo_data: IN STD_LOGIC_VECTOR(7 DOWNTO 0);
Q: OUT STD_LOGIC_VECTOR(1 DOWNTO 0));
END rs232_tx;
ARCHITECTURE working OF rs232_tx IS
TYPE state IS (idle,start,data);
SIGNAL tx_pulse: STD_LOGIC := '1';
SIGNAL s_tick: STD_LOGIC;
SIGNAL pr_state, nx_state: state := idle;
SIGNAL data_val: std_logic_vector(7 DOWNTO 0):=(others=>'0');
SIGNAL data_count: unsigned(2 DOWNTO 0):=to_unsigned(0,3);
BEGIN
process(s_tick, rst)
VARIABLE count: unsigned(3 DOWNTO 0):= to_unsigned(0,4);
BEGIN
IF rising_edge(s_tick) THEN
count := count + to_unsigned(1,4);
IF count = to_unsigned(15,4) THEN
tx_pulse <= '1';
ELSE
tx_pulse <='0';
END IF;
END IF;
END PROCESS;
process(tx_pulse,rst)
BEGIN
IF rising_edge(tx_pulse) THEN
IF rst='1' THEN
pr_state <= idle;
data_val <= (others=>'0');
data_count <= to_unsigned(0,3);
ELSE
pr_state <= nx_state;
CASE pr_state IS
WHEN idle =>
data_count <= to_unsigned(0,3);
WHEN data =>
data_count <= data_count + to_unsigned(1,3);
WHEN start =>
data_val <= fifo_data;
WHEN OTHERS =>
END case;
END IF;
END IF;
END process;
process(fifo_empty,rst,data_count,pr_state,data_count)
BEGIN
case pr_state is
when idle =>
Q <= ('1','1');
fifo_RdEn <= '0';
tx <= '1';
IF fifo_empty='0' AND rst='0' THEN
nx_state <= start;
ELSE
nx_state <= idle;
END IF;
WHEN start =>
Q <= ('1','0');
fifo_RdEn <= '1';
tx <= '0';
nx_state <= data;
WHEN data =>
Q <= ('0','1');
fifo_RdEn <= '0';
tx <= data_val(to_integer(data_count));
if data_count=to_unsigned(7,3) then
nx_state <= idle;
ELSE
nx_state <= data;
end if;
end case;
END process;
fifo_RdClock <= tx_pulse;
baud_gen: ENTITY work.baud_gen
PORT MAP(clk,reset=>'0',s_tick=>s_tick);
END working;
testbench
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;
ENTITY rs232_tx_test IS
GENERIC(clk_period: TIME := 41666666.7 fs;
baud_period: TIME := 8680.55556 ns);
END rs232_tx_test;
ARCHITECTURE working OF rs232_tx_test IS
SIGNAL clk: STD_LOGIC := '0';
SIGNAL tx, rst, fifo_empty, fifo_RdEn, fifo_RdClock: STD_LOGIC;
SIGNAL fifo_data: STD_LOGIC_VECTOR(7 DOWNTO 0);
BEGIN
clk <= NOT clk AFTER clk_period/2;
rst <= '1', '0' AFTER 100 ns;
PROCESS
BEGIN
fifo_empty <= '1';
WAIT FOR baud_period;
fifo_empty <= '0';
WAIT FOR baud_period*16;
END PROCESS;
fifo_data <= ('1','1','0','0','1','0','1','1') WHEN fifo_RdEn='1' ELSE (others=>'0');
dut: ENTITY work.rs232_tx
PORT MAP(clk,tx,rst,fifo_empty,fifo_RdEn,fifo_RdClock,fifo_data);
END working;
EDIT
I have tested another UART design I found online, running at 9600 bps, and it fails in the same way. It can send a constant character, in this case 'a', to a terminal on my computer, and then it suddenly stops sending anything. However, if I start listening with the soft logic analyzer I generated in Lattice Diamond, it works without a problem and does not fail.

This smells like a classic timing issue.
In the comments, others have explained where there are weaknesses in the code that could indirectly lead to this. I'm going to concentrate on what is occurring.
As you have stated, in simulation your code works as you expect. However, that is only half the story. To create an FPGA bitstream, the build tools take your code and several other files, synthesise them, and run Place & Route (P&R). Your timing issue arises during P&R. That is why your simulation doesn't pick up any errors: I'm assuming it's an RTL (pre-place-and-route) simulation.
During P&R the tools lay out the logic in the way that best fits the timing model of the device, so that all paths meet their timing requirements. The path timing requirements are derived from explicit statements in a Timing Constraints file and inferred from your code (that's why your coding style matters, by the way).
Once P&R is complete, the tools will put the build artefact through a static timing analyser (STA) tool and report back whether the build fails to meet the timing requirements.
This leads to two questions:
Does the build report a timing error?
Do you have a Timing Constraints file? If you're unsure, the answer is no.
The way to debug your problem is to use the Timing Report generated by Lattice Diamond to see where the failures are.
If there are no reported failures it means your timing model is wrong because of a lack of appropriate timing constraints. As a minimum, you will need to constrain all IO in the design and describe all the clocks in your design.
Here is a good document to help you out:
https://www.latticesemi.com/-/media/LatticeSemi/Documents/UserManuals/RZ/Timing_Closure_Document.ashx?document_id=45588
The reason that your design works when you use the embedded Logic Analyser is that extra logic has been placed in the design, which changes the timing model. The P&R tools lay out the design differently, and by luck have placed it in such a way as to meet the real timing requirements on it.
As I often say to my Software Engineer colleagues, software languages create a set of instructions, HDL creates a set of suggestions.
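The comments mentioned above are not reproduced here, so the following is only a guess at one of the weaknesses they point to: in rs232_tx, s_tick and tx_pulse are used as clocks (rising_edge(s_tick), rising_edge(tx_pulse)) although they are ordinary logic signals. A minimal sketch of the usual alternative keeps everything in the clk domain and turns the divided tick into a one-cycle enable. The entity name bit_enable_gen and the port bit_en are hypothetical, not part of the original design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical helper: turns the 16x oversampling tick into a
-- one-clk-wide enable per bit period, without creating a new clock.
entity bit_enable_gen is
  port (
    clk    : in  std_logic;  -- the only clock in the design
    rst    : in  std_logic;  -- synchronous, active high
    s_tick : in  std_logic;  -- one-clk-wide tick from baud_gen
    bit_en : out std_logic   -- high for one clk cycle every 16 ticks
  );
end entity bit_enable_gen;

architecture rtl of bit_enable_gen is
  signal count : unsigned(3 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      bit_en <= '0';                    -- default: no strobe this cycle
      if rst = '1' then
        count <= (others => '0');
      elsif s_tick = '1' then
        if count = 15 then
          count  <= (others => '0');
          bit_en <= '1';                -- every 16th tick
        else
          count <= count + 1;
        end if;
      end if;
    end if;
  end process;
end architecture rtl;
```

The state register would then be written as "if rising_edge(clk) then if bit_en = '1' then ...", so the timing analyser only has one clock to reason about. That clock still needs a constraint in the Diamond preference file.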

Related

Switch on LED after receiving Ethernet packets

I'm a novice in VHDL programming and am currently trying to write a design in which the LED on the FPGA board switches on after every 10 Ethernet packets, which I generate from a Linux server. The code I've written is below; it doesn't work properly. I've been trying to figure out the problem but haven't solved it yet. Any help would be much appreciated.
---------------------------------------------
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;
use IEEE.std_logic_unsigned.all;
---------------------------------------------
entity notification is
port (clk, reset, qdv: in std_logic;
LED: out std_logic
);
end notification;
architecture behavior of notification is
signal qdv_a: std_logic;
signal qdv_b: std_logic;
signal packet_count: std_logic_vector (3 downto 0);
begin
no_1: process(clk, reset)
begin
if (reset = '1') then
qdv_a <= '0';
elsif rising_edge (clk) then
qdv_a <= qdv;
end if;
end process no_1;
qdv_b <= qdv and (not qdv_a);
no_2: process(clk, reset)
begin
if (reset = '1') then
packet_count <= "0000";
elsif rising_edge (clk) then
if qdv_b = '1' then
if packet_count < "1010" then
packet_count <= packet_count + 1;
LED <= '0';
else
LED <= '1';
packet_count <= (others => '0');
end if;
end if;
end if;
end process no_2;
end behavior;
I am making assumptions based on your code:
1) You are trying to increment packet_count every time you see a rising edge on qdv,
2) The pulse width of qdv is longer than a period of the 25MHz clock (clk_25MHz) - 40ns and
3) You want an asynchronous reset. (Trying to decide which is better - a synchronous or asynchronous reset - is like trying to decide which is better - a Mac or a PC.)
So,
If (1) and (2) are true, you need a synchronous edge detector:
signal qdv_d : std_logic;
signal qdv_r : std_logic;
...
process (clk_25MHz, reset)
begin
if reset = '1' then
qdv_d <= '0';
elsif rising_edge (clk_25MHz) then
qdv_d <= qdv;
end if;
end process;
qdv_r <= qdv and not qdv_d;
Please draw this out as a schematic so that you can see how it works.
Then, assuming (3), you need to sort out your main process. If you're coding sequential logic, you should stick to a template. Here is the template for sequential logic with an asynchronous reset, which all synthesis tools should understand:
process(clock, async_reset) -- nothing else should go in the sensitivity list
begin
-- never put anything here
if async_reset ='1' then -- or '0' for an active low reset
-- set/reset the flip-flops here
-- ie drive the signals to their initial values
elsif rising_edge(clock) then -- or falling_edge(clock)
-- put the synchronous stuff here
-- ie the stuff that happens on the rising or falling edge of the clock
end if;
-- never put anything here
end process;
Only clock and reset go in the sensitivity list, because the outputs of the sequential process (though they depend on all the inputs) only change when clock and/or reset change. On a real D-type flip-flop, reset takes priority over clock, so we test that first and do the resetting should reset be asserted. If there is a change on clock (when reset is not asserted) and that change is a rising edge, then do all the stuff that should happen on the rising edge of a clock (stuff that will get synthesised to combinational logic driving the D inputs of the flip-flops).
So, using that template, here is how I would write your main process:
process(clk_25MHz, reset)
begin
if reset = '1' then
packet_count <= "0000";
LED <= '0'; -- give the LED a defined value during reset as well
elsif rising_edge (clk_25MHz) then
if qdv_r = '1' then
if packet_count < "1010" then
packet_count <= packet_count + 1;
LED <= '0';
else
LED <= '1';
packet_count <= (others => '0');
end if;
end if;
end if;
end process;
Now we have a synchronous process which increments packet_count and drives the LED output. (What is q bringing to the party?)
Please bear in mind that I haven't simulated any of this, and don't just type it in without trying to understand how it works.
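For reference, here is a sketch that puts the edge detector and the counter into one entity, written with IEEE.numeric_std rather than std_logic_unsigned. It rests on a few assumptions: the entity name notification_sketch is made up, qdv is taken to be synchronous to clk_25MHz, and "every 10 packets" is read as lighting the LED on each tenth rising edge of qdv (the comparison against 9 is my choice, not the original threshold).

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity notification_sketch is          -- hypothetical name
  port (
    clk_25MHz : in  std_logic;
    reset     : in  std_logic;         -- asynchronous, active high
    qdv       : in  std_logic;         -- assumed synchronous to clk_25MHz
    LED       : out std_logic
  );
end entity notification_sketch;

architecture rtl of notification_sketch is
  signal qdv_d        : std_logic;
  signal qdv_r        : std_logic;
  signal packet_count : unsigned(3 downto 0);
begin
  -- High for one clock period after each rising edge of qdv
  qdv_r <= qdv and not qdv_d;

  process (clk_25MHz, reset)
  begin
    if reset = '1' then
      qdv_d        <= '0';
      packet_count <= (others => '0');
      LED          <= '0';
    elsif rising_edge(clk_25MHz) then
      qdv_d <= qdv;
      if qdv_r = '1' then
        if packet_count = 9 then       -- tenth packet since the last wrap
          packet_count <= (others => '0');
          LED          <= '1';
        else
          packet_count <= packet_count + 1;
          LED          <= '0';
        end if;
      end if;
    end if;
  end process;
end architecture rtl;
```

With this reading the LED stays on from the tenth packet until the next one arrives. If qdv comes from another clock domain, it would need to be synchronised before the edge detector.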

VHDL state machine testbench - works when on board but not on simulation

I have a VHDL implementation that works on the board: it detects the sequence 01110 and raises a flag for 2 clock counts. It also detects overlapping sequences, so 011101110 raises the flag twice.
I've checked my implementation with a logic analyzer on the board and am fairly confident that it works. I am feeding in a repeating sequence of 0111 at 10 kHz. The board has a 100 MHz clock, which I scale down to 10 kHz with a prescaler.
My problem is that when I try to recreate a similar scenario in simulation, I do not get the expected outputs.
Image from logic analyzer from board
Image from Test Bench
Test Bench Code
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
entity test_FSM_prac4 is
-- Port ( );
end test_FSM_prac4;
architecture Behavioral of test_FSM_prac4 is
component FSM_prac4 is
port (
inputSignal : in STD_LOGIC;
pushButton : in STD_LOGIC;
clk100mhz : in STD_LOGIC;
logic_analyzer : out STD_LOGIC_VECTOR (7 downto 0);
LEDs: out STD_LOGIC
); end component;
signal inputSignal : std_logic := '0';
signal pushButton: std_logic := '0';
signal clk100mhz: std_logic := '0';
signal logic_analyzer: std_logic_vector(7 downto 0);
signal LEDs : std_logic;
begin
uut : FSM_prac4 port map(
inputSignal => inputSignal,
pushButton => pushButton,
clk100mhz => clk100mhz,
logic_analyzer => logic_analyzer,
LEDs => LEDs
);
--generate clock 100mhz
clock_tic: process begin
loop
clk100mhz <= '0';
wait for 5ns;
clk100mhz <= '1';
wait for 5ns;
end loop;
end process;
input_changes: process begin
loop
inputSignal <= '0';
wait for 100us;
inputSignal <= '1';
wait for 100us;
inputSignal <= '1';
wait for 100us;
inputSignal <= '1';
wait for 100us;
end loop;
end process;
end Behavioral;
To show the mapping for logic Analyzer
logic_analyzer(0) <= masterReset;
logic_analyzer(1) <= newClock; -- 10 kHz clock
logic_analyzer(2) <= outputZ;
--FSM States
logic_analyzer(3) <= '1' when y = A ELSE '0';
logic_analyzer(4) <= '1' when y = B ELSE '0';
logic_analyzer(5) <= '1' when y = C ELSE '0';
logic_analyzer(6) <= '1' when y = D ELSE '0';
logic_analyzer(7) <= '1' when y = E ELSE '0';
Could anyone point out what I am doing wrong in the test bench, and how to get results similar to the first image? In simulation the FSM always stays in state A and the new clock does not toggle, which suggests that clk100mhz is somehow not connected, but I can't figure out why.
Any help is greatly appreciated, thanks guys
edit:
I wrote a simple design to test my clock scaler
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
entity scaler_clk is
Port (
pushButton : in std_logic;
indicator : out std_logic;
clk100mhz : in STD_LOGIC;
clk10khz: out STD_LOGIC
);
end scaler_clk;
architecture Behavioral of scaler_clk is
signal clockScalers : std_logic_vector (12 downto 0):= (others => '0') ;
signal prescaler: std_logic_vector(12 downto 0) := "1001110001000";
signal newClock: std_logic := '0';
signal masterReset : std_logic;
begin
clk10khz <= newClock;
masterReset <= pushButton;
process (clk100mhz,masterReset) begin
if(masterReset <= '1') then -- <--- error occurs here
clockScalers <= "0000000000000";
newClock <= '0';
indicator <= '1';
elsif (clk100mhz'event and clk100mhz = '1')then
indicator <= '0';
clockScalers <= clockScalers + 1;
if(clockScalers > prescaler) then
newClock <= not newClock;
clockScalers <= (others => '0');
end if;
end if;
end process;
end Behavioral;
test bench code
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
entity test_scaler_clk is
-- Port ( );
end test_scaler_clk;
architecture Behavioral of test_scaler_clk is
component scaler_clk Port (
pushButton : in std_logic;
indicator : out std_logic;
--input clock
clk100mhz : in STD_LOGIC;
clk10khz: out STD_LOGIC
);end component;
signal clk100mhz: std_logic := '0';
signal clk10khz : std_logic;
signal pushButton: std_logic;
signal indicator : std_logic;
begin
uut: scaler_clk port map(
pushButton => pushButton,
indicator => indicator,
clk100mhz => clk100mhz,
clk10khz => clk10khz
);
pushButton <= '0';
clock_tic: process begin
loop
clk100mhz <= '0';
wait for 5ns;
clk100mhz <= '1';
wait for 5ns;
end loop;
end process;
end Behavioral;
Even though I set pushButton to '0', masterReset is still triggered, which is why the 10 kHz clock isn't working. Does anyone know why?
There are several things that you could (should) improve in your code. As Brian already explained, in your Behavioral architecture of scaler_clk, you should have:
if(masterReset = '1') then
instead of:
if(masterReset <= '1') then
Now, let's start with the most likely cause of your initial problem: unbound components. Your test benches instantiate the designs to validate as components. VHDL components are a kind of prototype of actual entities. Prototypes are enough to compile, because the compiler can perform all the necessary syntax and type checking. But they are not enough to simulate, because the simulator also needs the implementation behind the prototype. Some tools have a default binding strategy for unbound components: if they find an entity with the same name and it has only one architecture, they use that. Your simulator apparently does not use such a strategy (at least not by default; there may be an option for it that is disabled). Note that most simulators I know issue warnings when they find unbound components. You probably missed these warnings.
Anyway, your component instances are unbound (they have no associated entity/architecture) and the simulator considers them as black boxes. Their outputs are not driven, except by the initial values you declared (1).
How to fix this? Two options:
Use a configuration to specify which entity/architecture pair shall be used for each component instance:
for all: scaler_clk use entity work.scaler_clk(Behavioral);
Use entity instantiations instead of components:
uut: entity work.scaler_clk(Behavioral) port map...
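To show where each of the two fixes goes, here is a small self-checking sketch. The entity inv_stub is a made-up stand-in for the real design, and tb_inv_stub is an equally hypothetical test bench. The configuration specification sits in the declarative part of the test bench architecture, after the component declaration; the direct entity instantiation needs no component declaration at all.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical stand-in for the design under test
entity inv_stub is
  port (a : in std_logic; y : out std_logic);
end entity inv_stub;

architecture rtl of inv_stub is
begin
  y <= not a;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

entity tb_inv_stub is
end entity tb_inv_stub;

architecture sim of tb_inv_stub is
  component inv_stub is
    port (a : in std_logic; y : out std_logic);
  end component;

  -- Option 1: bind every instance of the component explicitly
  for all : inv_stub use entity work.inv_stub(rtl);

  signal a      : std_logic := '0';
  signal y_comp : std_logic;
  signal y_ent  : std_logic;
begin
  -- Component instance, bound by the configuration specification above
  uut_comp : inv_stub port map (a => a, y => y_comp);

  -- Option 2: direct entity instantiation, no component involved
  uut_ent : entity work.inv_stub(rtl) port map (a => a, y => y_ent);

  check : process
  begin
    a <= '0';
    wait for 10 ns;
    assert y_comp = '1' and y_ent = '1'
      report "output not driven: instance unbound?" severity failure;
    a <= '1';
    wait for 10 ns;
    assert y_comp = '0' and y_ent = '0'
      report "output not driven: instance unbound?" severity failure;
    wait;
  end process check;
end architecture sim;
```

Leaving y_comp and y_ent without initial values is deliberate: if an instance were unbound, the output would stay at 'U' and the assertion would fire.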
Now, let's go through some other aspects of your code that could be improved:
You are using non-standard packages, that are frequently not even compatible: IEEE.STD_LOGIC_ARITH and IEEE.STD_LOGIC_UNSIGNED. As they are not standard they should not even be in the standard IEEE library. You should use IEEE.NUMERIC_STD instead, and only that one. It declares the SIGNED and UNSIGNED types (with the same declaration as STD_LOGIC_VECTOR) and overloads the arithmetic operators on them.
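As an illustration of the numeric_std style, here is a sketch of the same kind of 13-bit counter with only IEEE.NUMERIC_STD in use. The entity name prescale_counter and the port q are hypothetical; the limit of 5000 is the decimal value of the "1001110001000" prescaler in your code.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity prescale_counter is             -- hypothetical name
  port (
    clk   : in  std_logic;
    reset : in  std_logic;             -- asynchronous, active high
    q     : out std_logic_vector(12 downto 0)
  );
end entity prescale_counter;

architecture rtl of prescale_counter is
  -- Same value as the "1001110001000" literal, written in decimal
  constant prescaler : unsigned(12 downto 0) := to_unsigned(5000, 13);
  signal   count     : unsigned(12 downto 0);
begin
  process (clk, reset)
  begin
    if reset = '1' then
      count <= (others => '0');
    elsif rising_edge(clk) then
      if count > prescaler then        -- unsigned comparison from numeric_std
        count <= (others => '0');
      else
        count <= count + 1;            -- unsigned + integer from numeric_std
      end if;
    end if;
  end process;

  -- Convert explicitly at the port boundary
  q <= std_logic_vector(count);
end architecture rtl;
```

The arithmetic lives on an unsigned signal, and the conversion to std_logic_vector happens once, at the output.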
Your test benches generate the 100MHz clock with:
clock_tic: process begin
loop
clk100mhz <= '0';
wait for 5ns;
clk100mhz <= '1';
wait for 5ns;
end loop;
end process;
The infinite loop is useless: a process is already an infinite loop:
clock_tic: process
begin
clk100mhz <= '0';
wait for 5ns;
clk100mhz <= '1';
wait for 5ns;
end process clock_tic;
would do the same. Same remark for your input_changes process.
Your input_changes process uses wait for <duration> statements. This is not a good idea because you do not know when the inputSignal signal toggles relative to the clock. Is it just before, just after, or exactly at the same time as the rising edge of clk100mhz? And if it is exactly at the same time, what will happen? Of course, you can carefully choose the <durations> to avoid such ambiguities, but that is error-prone. You should use wait for <duration> only in the clock-generating process. Everywhere else, it is better to synchronize with the clock:
input_changes: process
begin
inputSignal <= '0';
for i in 1 to 10000 loop
wait until rising_edge(clk100mhz);
end loop;
inputSignal <= '1';
for i in 1 to 10000 loop
wait until rising_edge(clk100mhz);
end loop;
inputSignal <= '1';
for i in 1 to 10000 loop
wait until rising_edge(clk100mhz);
end loop;
inputSignal <= '1';
for i in 1 to 10000 loop
wait until rising_edge(clk100mhz);
end loop;
end process input_changes;
This guarantees that inputSignal changes just after the rising edge of the clock. And you could rewrite it in a bit more elegant way (and probably a bit easier to maintain):
input_changes: process
constant values: std_logic_vector(0 to 3) := "0111";
begin
for i in values'range loop
inputSignal <= values(i);
for j in 1 to 10000 loop -- separate index name so it does not hide the outer i
wait until rising_edge(clk100mhz);
end loop;
end loop;
end process input_changes;
You are using resolved types (STD_LOGIC and STD_LOGIC_VECTOR). These types allow multiple drive, that is, having a hardware wire (VHDL signal) that is driven by several devices (VHDL processes). Usually you do not want this. Usually you even want to avoid this like the plague because it can cause short-circuits. In most cases it is wiser to use non-resolved types (STD_ULOGIC and STD_ULOGIC_VECTOR) because the compiler and/or the simulator will raise errors if you accidentally create a short circuit in your design.
One last thing: if, as its name suggests, you intend to use the clk10khz signal as a real clock, you should reconsider this decision. It is a signal that you generate with your custom logic. Clocks have very specific electrical and timing constraints that cannot really be fulfilled by regular signals. Before using clk10khz as a clock you must deal with clock skew, clock buffering... Not impossible but tricky. If you did use it as a clock your synthesizer probably issued critical warnings that you also missed (have a look maybe at the timing report). Moreover, this is probably useless in your case: an enable signal generated from clk100mhz could probably be used instead, avoiding all these problems. Instead of:
process (clk100mhz,masterReset) begin
if(masterReset = '1') then
clockScalers <= "0000000000000";
newClock <= '0';
indicator <= '1';
elsif (clk100mhz'event and clk100mhz = '1')then
indicator <= '0';
clockScalers <= clockScalers + 1;
if(clockScalers > prescaler) then
newClock <= not newClock;
clockScalers <= (others => '0');
end if;
end if;
end process;
use:
signal tick10khz: std_ulogic;
...
process(clk100mhz, masterReset) begin
if (masterReset = '1') then
clockScalers <= "0000000000000";
tick10khz <= '0';
elsif rising_edge(clk100mhz) then
clockScalers <= clockScalers + 1;
tick10khz <= '0';
-- One-clock pulse every (prescaler + 2) cycles. The original toggled newClock
-- at this rate, so a true 10 kHz tick needs roughly twice the count.
if(clockScalers > prescaler) then
tick10khz <= '1';
clockScalers <= (others => '0');
end if;
end if;
end process;
And then, instead of:
process(clk10khz)
begin
if rising_edge(clk10khz) then
reg <= reg_input; -- "register" is a reserved word in VHDL, hence "reg"
end if;
end process;
use:
process(clk100mhz)
begin
if rising_edge(clk100mhz) then
if tick10khz = '1' then
reg <= reg_input;
end if;
end if;
end process;
The result will be the same but with only one single 100MHz clock, which avoids clock skew, clock buffering and clock domain crossing problems.
(1) This illustrates why declaring variables and signals with initial values is usually not a good idea: it hides potential problems. Without this your signals would have been stuck at 'U' (uninitialized) and it would maybe have helped understanding where the problem comes from.

VHDL - FSM not starting (JUST in timing simulation)

I'm working on my master's thesis and I'm pretty new to VHDL, but I still have to implement some complex things. This is one of the simplest structures I have had to write, and I'm still encountering some problems.
It's a FSM implementing a 24bit shift register with an active-low sync signal (to program a DAC). It's just the end of a complex elaboration chain I created for my project. I followed the example model of a FSM as much as I could.
The behavioral simulation works fine; in fact the whole elaboration chain I created works perfectly as far as behavioral simulation is concerned. However, once I try the Post-Translate simulation, things start to go wrong: lots of 'X' output signals.
With this simple shift register I DON'T get any 'X', however I can't get to the load_and_prepare_data phase. It seems that the current_state changes (by inspecting some signals), but the elaboration doesn't go on.
Please keep in mind that since I'm new to the language, I have no idea of what timing constraints I should set on this FSM (and I wouldn't know how to write them on the top.ucf anyway)
Can you see what's wrong?
Thanks in advance
EDIT
I followed your advice and cleaned up the FSM by using a single state process. I still have some doubts about "where to put what", but I really like the new implementation. Anyway, I now get a clean behavioral simulation but 'X' on all outputs in the Post-Translate simulation.
What is causing this?
I'll post both the new code and the testbench:
----------------------------------------------------------------------------------
-- Company:
-- Engineer:
--
-- Create Date: 14:44:03 11/28/2014
-- Design Name:
-- Module Name: dac_ad5764r_24bit_sr_programmer_v2 - Behavioral
-- Project Name:
-- Target Devices:
-- Tool versions:
-- Description: This is a PISO shift register that gets a 24bit parallel input word.
-- It outputs the 24bit input word starting from the MSB and enables
-- an active low ChipSelect line for 24 clock periods.
-- Dependencies:
--
-- Revision:
-- Revision 0.01 - File Created
-- Additional Comments:
--
----------------------------------------------------------------------------------
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
-- Uncomment the following library declaration if using
-- arithmetic functions with Signed or Unsigned values
use IEEE.NUMERIC_STD.ALL;
-- Uncomment the following library declaration if instantiating
-- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;
entity dac_ad5764r_24bit_sr_programmer_v2 is
Port ( clk : in STD_LOGIC;
start : in STD_LOGIC;
reset : in STD_LOGIC; -- Note that this reset is for the FSM not for the DAC
reset_all_dac : in STD_LOGIC;
data_in : in STD_LOGIC_VECTOR (23 downto 0);
serial_data_out : out STD_LOGIC;
sync_out : out STD_LOGIC; -- This is a chip select
reset_out : out STD_LOGIC;
busy : out STD_LOGIC
);
end dac_ad5764r_24bit_sr_programmer_v2;
architecture Behavioral of dac_ad5764r_24bit_sr_programmer_v2 is
-- Stati
type state_type is (idle, load_and_prepare_data, transmission);
--ATTRIBUTE ENUM_ENCODING : STRING;
--ATTRIBUTE ENUM_ENCODING OF state_type: TYPE IS "001 010 100";
signal state: state_type := idle;
--signal next_state: state_type := idle;
-- Clock counter
--signal clk_counter_enable : STD_LOGIC := '0';
signal clk_counter : unsigned(4 downto 0) := (others => '0');
-- Shift register
signal stored_data: STD_LOGIC_VECTOR (23 downto 0) := (others => '0');
begin
FSM_single_process: process(clk)
begin
if rising_edge(clk) then
if reset = '1' then
serial_data_out <= '0';
sync_out <= '1';
reset_out <= '1';
busy <= '0';
state <= idle;
else
-- Default
serial_data_out <= '0';
sync_out <= '1';
reset_out <= '1';
busy <= '0';
case (state) is
when transmission =>
serial_data_out <= stored_data(23);
sync_out <= '0';
busy <= '1';
clk_counter <= clk_counter + 1;
stored_data <= stored_data(22 downto 0) & "0";
state <= transmission;
if (clk_counter = 23) then
state <= idle;
end if;
when others => -- Idle
if start = '1' then
serial_data_out <= data_in(23);
sync_out <= '0';
reset_out <= '1';
busy <= '1';
stored_data <= data_in;
clk_counter <= "00001";
state <= transmission;
end if;
end case;
-- if (reset_all_dac = '1') then
-- reset_out <= '0';
-- end if;
end if;
end if;
end process;
end;
And the testbench:
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
-- Uncomment the following library declaration if using
-- arithmetic functions with Signed or Unsigned values
--USE ieee.numeric_std.ALL;
ENTITY dac_ad5764r_24bit_sr_programmer_tb IS
END dac_ad5764r_24bit_sr_programmer_tb;
ARCHITECTURE behavior OF dac_ad5764r_24bit_sr_programmer_tb IS
-- Component Declaration for the Unit Under Test (UUT)
COMPONENT dac_ad5764r_24bit_sr_programmer_v2
PORT(
clk : IN std_logic;
start : IN std_logic;
reset : IN std_logic;
data_in : IN std_logic_vector(23 downto 0);
serial_data_out : OUT std_logic;
reset_all_dac : IN std_logic;
sync_out : OUT std_logic;
reset_out : OUT std_logic;
--finish : OUT std_logic;
busy : OUT std_logic
);
END COMPONENT;
--Inputs
signal clk : std_logic := '0';
signal start : std_logic := '0';
signal reset : std_logic := '0';
signal data_in : std_logic_vector(23 downto 0) := (others => '0');
signal reset_all_dac : std_logic := '0';
--Outputs
signal serial_data_out : std_logic;
signal sync_out : std_logic;
signal reset_out : std_logic;
--signal finish : std_logic;
signal busy : std_logic;
-- Clock period definitions
constant clk_period : time := 100 ns;
BEGIN
-- Instantiate the Unit Under Test (UUT)
uut: dac_ad5764r_24bit_sr_programmer_v2 PORT MAP (
clk => clk,
start => start,
reset => reset,
data_in => data_in,
reset_all_dac => reset_all_dac,
serial_data_out => serial_data_out,
sync_out => sync_out,
reset_out => reset_out,
--finish => finish,
busy => busy
);
-- Clock process definitions
clk_process :process
begin
clk <= '0';
wait for clk_period/2;
clk <= '1';
wait for clk_period/2;
end process;
-- Stimulus process
stim_proc: process
begin
-- hold reset state for 100 ns.
wait for clk_period*10;
reset <= '1' after 25 ns;
wait for clk_period*1;
reset <= '0' after 25 ns;
wait for clk_period*3;
reset_all_dac <= '1' after 25 ns;
wait for clk_period*1;
reset_all_dac <= '0' after 25 ns;
wait for clk_period*5;
data_in <= "111111111111111111111111" after 25 ns;
wait for clk_period*3;
start <= '1' after 25 ns;
wait for clk_period*1;
start <= '0' after 25 ns;
wait;
end process;
END;
UPDATE 1
Updated with the last design: this code is not causing any 'X' (can't figure out why, this doesn't but the previous did). However it's not starting (in POST-TRANSLATE simulation) just like the first 3 process machine, and the signal sync_out is stuck at 0 while it should be '1' by default.
UPDATE 2
I've been looking into the technology schematic, starting from the problem of sync_out=0: it's implemented with an FDS, where S is the FSM reset signal and D comes from a LUT3 with I = state&reset&start and INIT = 45 = "00101101". I looked for this LUT3 in the simulation and noticed that it has INIT = "00000000"!
Is there something I'm missing about how to run this simulation? It seems that none of the LUTs in the design has been initialised!
UPDATE 3
It seems that the Post-Translate simulation is buggy in some way, or I'm not configuring it correctly for some reason: the Post-Map and the Post-PAR simulations work and display some outputs.
However there is an odd bug: the stored_data register is not updated with the complete data_in vector, after that, the FSM operates correctly and outputs the data stored.
I've looked in the technology schematic just after synthesis, and for some reason bits 23, 22, 21, 19 and 18 are not connected to the corresponding data_in bits. You can see the effect in this screenshot from the Post-Map simulation. The same happens in Post-PAR, but it seems that this problem comes directly from synthesis!
Solved: the strange output comes from synthesis optimization. The tool realized that the previous block in the elaboration chain will never output anything other than 0 for those specific bits. My mistake was assuming that I could test the single block alone: what I was really testing was the block as synthesized for the FPGA, taking into account everything else in the design!
Thanks to everybody who helped me, I'm going to follow your advice!
I prefer the single-process form of state machine, which is cleaner, simpler and much less prone to bugs like sensitivity-list errors. I would also endorse the points in Paebbels' excellent answer. However I don't think any of these are the problem here.
One thing to be aware of in post-synth and post-PAR simulations is that their model of time is different from the behavioural model. The behavioural model follows simple rules as I described in this answer and ensures that in a typical design flow you can go straight to hardware - without post-synth simulation, without worry.
Indeed I only use post-synth or post-PAR simulations if I'm chasing a suspected tool bug. (For FPGA designs, not ASIC, that is!)
However, that simple timing model has its limitations. You may be familiar with problems like a clock signal assigned via signal assignment (usually buried in a 3rd party model where you don't expect it) which consumes a delta cycle, and ensures that your clocked data arrives before your clock instead of after, and everything subsequently occurs one cycle earlier than intended...
In behavioural modelling, a little discipline will keep clear of such troubles. But the same is not true of post-PAR modelling.
Your testbench is probably set up the same way as the behavioural model. And if so, that is likely to be the problem.
Here's what I do in this situation : I claim no formal authority for it, just experience. It also works well when interfacing the FPGA to external memory models with realistic timings.
1) I assume the simple (behavioural) timing model works correctly for all signals INTERNAL to the design.
2) I assume nothing of the sort for inputs and outputs from the design.
3) I take note of the estimated setup and hold timings on the inputs, (a) from the FPGA datasheet or better, (b) from the worst case values shown in the post-synth or post-PAR report, and structure the testbench around them.
Worked example : setup time 1 ns, hold time 2 ns, clock period 10 ns. This means that any input between 2 ns and 9 ns after a clock edge is guaranteed to be correctly read. I choose (arbitrarily) 5 ns.
signal_to_fpga <= driving_value after 5 ns;
(Note that Xilinx makes this absurdly counter-intuitive by expressing them as "offset in/out before/after" which refers timings to a previous or future clock edge instead of the one you're looking at)
Alternatively, if the input is fed from a CPU or memory in the real world, I use datasheet timing specifications for that device.
4) I take note of the worst case clk-out timing reported in the datasheet or report, and structure the design around it. (say, 7 ns)
fpga_output_pin <= driving_value after 7 ns;
Note that this "after" clause is obviously ignored by synthesis; however the post-synth back-annotation will introduce something very like it.
5) If this turns out to be not good enough, then (possibly in a wrapper component to avoid polluting the synthesisable code) improve accuracy like
fpga_output_pin <= 'X' after 1 ps, driving_value after 7 ns;
6) I re-run the behavioural simulation. Typically, it now fails, because it was written without realistic timings in mind.
7) I fix those failures. This may include adding realistic delays before testing values output from the design. It can be an iterative process.
Now, I have a reasonable expectation that the post-PAR simulation model will drop straight in to the testbench and work.
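The steps above can be sketched as a small self-checking testbench. This is a minimal sketch under stated assumptions, not a drop-in file: dut_stub is a hypothetical one-register stand-in for the real design, and the 10 ns period, 5 ns input delay and 7 ns clock-to-out figure are the example numbers from the worked example, not values from any datasheet.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical stand-in for the design: a single register whose output
-- carries the assumed 7 ns clock-to-out delay.
entity dut_stub is
  port (clk : in std_logic; d : in std_logic; q : out std_logic);
end entity dut_stub;

architecture behav of dut_stub is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      q <= d after 7 ns;  -- the "after" clause is ignored by synthesis
    end if;
  end process;
end architecture behav;

library ieee;
use ieee.std_logic_1164.all;

entity tb_io_timing is
end entity tb_io_timing;

architecture sim of tb_io_timing is
  constant CLK_PERIOD : time := 10 ns;
  constant IN_DELAY   : time := 5 ns;  -- inside the 2 ns .. 9 ns safe window
  constant OUT_DELAY  : time := 7 ns;  -- assumed worst-case clock-to-out
  signal clk             : std_logic := '0';
  signal done            : boolean   := false;
  signal signal_to_fpga  : std_logic := '0';
  signal fpga_output_pin : std_logic;
begin
  -- Free-running clock that stops once the stimulus process has finished.
  clk <= not clk after CLK_PERIOD / 2 when not done else '0';

  dut : entity work.dut_stub
    port map (clk => clk, d => signal_to_fpga, q => fpga_output_pin);

  stimulus : process
  begin
    wait until rising_edge(clk);
    -- Drive the input in the middle of the safe window, not on the edge.
    signal_to_fpga <= '1' after IN_DELAY;
    wait until rising_edge(clk);  -- the design samples the new value here
    -- Only look at the output once the clock-to-out delay has elapsed.
    wait for OUT_DELAY + 1 ns;
    assert fpga_output_pin = '1'
      report "output did not follow the input within the expected delay"
      severity error;
    done <= true;
    wait;
  end process stimulus;
end architecture sim;
```

The point of the structure is that the testbench never drives or samples a pin on the clock edge itself, so swapping dut_stub for a post-PAR netlist should not change what the testbench sees.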
Here are some hints to improve your code:
You can remove the Xilinx dependencies on UNISIM, because you are not using any Xilinx primitives.
Applying attribute ENUM_ENCODING has no effect on state encoding unless you also define the attribute FSM_ENCODING and set its value to user. One-hot encoding can be forced by setting FSM_ENCODING to one-hot. Normally synthesis is smart enough to find the best encoding.
None of your registers has a default value:
signal current_state : state_type := idle;
Your FSM is no FSM in the eyes of Xilinx synthesis tool (XST). I'm sure if you look into your synthesis report, you won't find that XST reports a FSM for current_state.
So what's wrong with your FSM?
Your FSM has no initial state.
Your FSM has multiple reset states (idle, load_and_prepare_data)
Your FSM has no transition from idle to load_and_prepare_data (reset is no transition)
Writing next_state transitions for the current state can cause XST to think it's no FSM
the default assignment next_state <= current_state; is sufficient.
If you change the type of signal clk_counter to unsigned you can do arithmetic much easier.
increment: clk_counter <= clk_counter + 1;
clear: clk_counter <= (others => '0');
compare: if (clk_counter = 23) then
It's not good style to use the FSM's state signal outside of the FSM processes.
FSM_next_state_process: process(current_state, start, clk_counter, reset_all_dac)
begin
next_state <= current_state;
OutReg_busy <= '1';
OutReg_reset_out <= '1';
OutReg_sync_out <= '1';
clk_counter_enable <= '0';
case (current_state) is
when idle =>
OutReg_busy <= '0';
if (reset_all_dac = '1') then
OutReg_reset_out <= '0';
end if;
when load_and_prepare_data =>
next_state <= transmission;
when transmission =>
clk_counter_enable <= '1';
OutReg_sync_out <= '0';
if (clk_counter = 23) then
next_state <= idle;
end if;
when others =>
next_state <= idle;
end case ;
end process;

VHDL: creating a very slow clock pulse based on a very fast clock

(I'd post this in EE but it seems there are far more VHDL questions here...)
Background: I'm using the Xilinx Spartan-6LX9 FPGA with the Xilinx ISE 14.4 (webpack).
I stumbled upon the dreaded "PhysDesignRules:372 - Gated clock" warning today, and I see there's a LOT of discussion out there concerning that in general. The consensus seems to be to use one of the DCMs on the FPGA to do clock division but... my DCM doesn't appear to be capable of going from 32 MHz to 4.096 KHz (per the wizard it bottoms out at 5MHz based on 32MHz... and it seems absurd to try to chain multiple DCMs for this low-frequency purpose).
My current design uses clk_in to count up to a specified value (15265), resets that value to zero and toggles the clk_out bit (so I end up with a duty cycle of 50%, FWIW). It does the job, and I can easily use the rising edge of clk_out to drive the next stage of my design. It seems to work just fine, but... gated clock (even though it isn't in the range where clock skew would IMHO be very relevant). (Note: All clock tests are done using the rising_edge() function in processes sensitive to the given clock.)
So, my questions:
If we're talking about deriving a relatively slow clk_out from a much faster clk_in, is gating still considered bad? Or is this sort of "count to x and send a pulse" thing pretty typical for FPGAs to generate a "clock" in the KHz range and instead some other unnecessary side-effect may be triggering this warning instead?
Is there a better way to create a low KHz-range clock from a MHz-range master clock, keeping in mind that using multiple DCMs appears to be overkill here (if it's possible at all given the very low output frequency)? I realize the 50% duty cycle may be superfluous but assuming one clock in and not using the on-board DCMs how else would one perform major clock division with an FPGA?
Edit: Given the following (where CLK_MASTER is the 32 MHz input clock and CLK_SLOW is the desired slow-rate clock, and LOCAL_CLK_SLOW was a way to store the state of the clock for the whole duty-cycle thing), I learned that this configuration causes the warning:
architecture arch of clock is
constant CLK_MASTER_FREQ: natural := 32000000; -- time := 31.25 ns
constant CLK_SLOW_FREQ: natural := 2048;
constant MAX_COUNT: natural := CLK_MASTER_FREQ/CLK_SLOW_FREQ;
shared variable counter: natural := 0;
signal LOCAL_CLK_SLOW: STD_LOGIC := '0';
begin
clock_proc: process(CLK_MASTER)
begin
if rising_edge(CLK_MASTER) then
counter := counter + 1;
if (counter >= MAX_COUNT) then
counter := 0;
LOCAL_CLK_SLOW <= not LOCAL_CLK_SLOW;
CLK_SLOW <= LOCAL_CLK_SLOW;
end if;
end if;
end process;
end arch;
Whereas this configuration does NOT cause the warning:
architecture arch of clock is
constant CLK_MASTER_FREQ: natural := 32000000; -- time := 31.25 ns
constant CLK_SLOW_FREQ: natural := 2048;
constant MAX_COUNT: natural := CLK_MASTER_FREQ/CLK_SLOW_FREQ;
shared variable counter: natural := 0;
begin
clock_proc: process(CLK_MASTER)
begin
if rising_edge(CLK_MASTER) then
counter := counter + 1;
if (counter >= MAX_COUNT) then
counter := 0;
CLK_SLOW <= '1';
else
CLK_SLOW <= '0';
end if;
end if;
end process;
end arch;
So, in this case it was all for lack of an else (like I said, the 50% duty cycle was originally interesting but wasn't a requirement in the end, and the toggle of the "local" clock bit seemed quite clever at the time...) I was mostly on the right track it appears.
What's not clear to me at this point is why using a counter (which stores lots of bits) isn't causing warnings, but a stored-and-toggled output bit does cause warnings. Thoughts?
If you just need a clock to drive another part of your logic in the FPGA, the easy answer is to use a clock enable.
That is, run your slow logic on the same (fast) clock as everything else, but use a slow enable for it. Example:
signal clk_enable_200kHz : std_logic;
signal clk_enable_counter : std_logic_vector(9 downto 0);
--Create the clock enable:
process(clk_200MHz)
begin
if(rising_edge(clk_200MHz)) then
clk_enable_counter <= clk_enable_counter + 1;
if(clk_enable_counter = 0) then
clk_enable_200kHz <= '1';
else
clk_enable_200kHz <= '0';
end if;
end if;
end process;
--Slow process:
process(clk_200MHz)
begin
if(rising_edge(clk_200MHz)) then
if(reset = '1') then
--Do reset
elsif(clk_enable_200kHz = '1') then
--Do stuff
end if;
end if;
end process;
The 200 kHz is only approximate (a 10-bit counter divides 200 MHz by 1024, giving about 195.3 kHz), but the above can be extended to basically any clock enable frequency you need. Also, it should be supported directly by the FPGA hardware in most FPGAs (it is in Xilinx parts at least).
Gated clocks are almost always a bad idea, as people often forget that they are creating new clock-domains, and thus do not take the necessary precautions when interfacing signals between these. It also uses more clock-lines inside the FPGA, so you might quickly use up all your available lines if you have a lot of gated clocks.
Clock enables have none of these drawbacks. Everything runs in the same clock domain (although at different speeds), so you can easily use the same signals without any synchronizers or similar.
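To hit an exact rate rather than a power-of-two division, the counter can wrap at a computed terminal value. The sketch below assumes the 32 MHz master clock and 2048 Hz target from the question; the entity name clk_enable_gen and its generics are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity; the generic defaults are the question's constants.
entity clk_enable_gen is
  generic (
    CLK_MASTER_FREQ : natural := 32000000;  -- Hz
    ENABLE_FREQ     : natural := 2048       -- Hz
  );
  port (
    clk    : in  std_logic;
    enable : out std_logic
  );
end entity clk_enable_gen;

architecture rtl of clk_enable_gen is
  -- 32000000 / 2048 = 15625 master clock cycles per enable pulse.
  constant MAX_COUNT : natural := CLK_MASTER_FREQ / ENABLE_FREQ;
  signal counter : natural range 0 to MAX_COUNT - 1 := 0;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if counter = MAX_COUNT - 1 then
        counter <= 0;
        enable  <= '1';  -- high for exactly one clk period
      else
        counter <= counter + 1;
        enable  <= '0';
      end if;
    end if;
  end process;
end architecture rtl;
```

Downstream logic stays clocked by clk and simply tests enable = '1' inside its rising_edge(clk) branch, so no new clock domain is created. Note that integer division truncates, so the rate is only exact when the master frequency is a multiple of the target.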
Note for this example to work this line,
signal clk_enable_counter : std_logic_vector(9 downto 0);
must be changed to
signal clk_enable_counter : unsigned(9 downto 0);
and you'll need to include this library,
library ieee;
use ieee.numeric_std.all;
Both your samples create a signal, one of which toggles at a slow rate, and one of which pulses a narrow pulse at a "slow-rate". If both those signals go to the clock-inputs of other flipflops, I would expect warnings about clock routing being non-optimal.
I'm not sure why you get a gated clock warning, that usually comes about when you do:
gated_clock <= clock when en = '1' else '0';
Here's a complete code sample:
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;
USE IEEE.NUMERIC_STD.ALL;
ENTITY Test123 IS
GENERIC (
clk_1_freq_generic : unsigned(31 DOWNTO 0) := to_unsigned(0, 32); -- Presented in Hz
clk_in1_freq_generic : unsigned(31 DOWNTO 0) := to_unsigned(0, 32) -- Presented in Hz, Also
);
PORT (
clk_in1 : IN std_logic := '0';
rst1 : IN std_logic := '0';
en1 : IN std_logic := '0';
clk_1 : OUT std_logic := '0'
);
END ENTITY Test123;
ARCHITECTURE Test123_Arch OF Test123 IS
--
SIGNAL clk_en_en : std_logic := '0';
SIGNAL clk_en_cntr1 : unsigned(31 DOWNTO 0) := (OTHERS => '0');
--
SIGNAL clk_1_buffer : std_logic := '0';
SIGNAL clk_1_freq : unsigned(31 DOWNTO 0) := (OTHERS => '0'); -- Presented in Hz, Also
SIGNAL clk_in1_freq : unsigned(31 DOWNTO 0) := (OTHERS => '0'); -- Presented in Hz
--
SIGNAL clk_prescaler1 : unsigned(31 DOWNTO 0) := (OTHERS => '0'); -- Presented in Cycles (Relative To The Input Clk.)
SIGNAL clk_prescaler1_halved : unsigned(31 DOWNTO 0) := (OTHERS => '0');
--
BEGIN
clk_en_gen : PROCESS (clk_in1, clk_en_en) -- clk_en_en Acts As An Asynchronous Clear, So It Must Be In The Sensitivity List
BEGIN
IF (clk_en_en = '0') THEN -- Divider Disabled : Hold Everything Cleared
clk_1_buffer <= '0';
clk_1 <= '0';
clk_en_cntr1 <= (OTHERS => '0'); -- Clear Counter 'clk_en_cntr1'
ELSIF (rising_edge(clk_in1)) THEN
clk_en_cntr1 <= clk_en_cntr1 + 1;
IF ((clk_en_cntr1 + 1) = clk_prescaler1_halved) THEN -- a Register's (F/F) Output Only Updates Upon a Clock-Edge : That's Why This Comparison Is Done This Way !
clk_1_buffer <= NOT clk_1_buffer;
clk_1 <= clk_1_buffer;
clk_en_cntr1 <= (OTHERS => '0');
END IF;
END IF;
END PROCESS;
update_clk_prescalers : PROCESS (clk_in1_freq, clk_1_freq)
BEGIN
clk_prescaler1 <= (OTHERS => '0');
clk_prescaler1_halved <= (OTHERS => '0');
clk_en_en <= '0';
IF ((clk_in1_freq > 0) AND (clk_1_freq > 0)) THEN
clk_prescaler1 <= (clk_in1_freq / clk_1_freq); -- a Register's (F/F) Output Only Updates Upon a Clock-Edge : That's Why This Assignment Is Done This Way !
clk_prescaler1_halved <= ((clk_in1_freq / clk_1_freq) / 2); -- (Same Thing Here)
IF (((clk_in1_freq / clk_1_freq) / 2) > 0) THEN -- (Same Thing Here, Too)
clk_en_en <= '1';
END IF;
ELSE
NULL;
END IF;
END PROCESS;
clk_1_freq <= clk_1_freq_generic;
clk_in1_freq <= clk_in1_freq_generic;
END ARCHITECTURE Test123_Arch;

VHDL error can't infer register because its behavior does not match any supported register model

I am new to VHDL and trying to make a delay/gate application for a programmable FPGA, with adjustable length of delay and gate output. As soon as the input signal is received, it should ignore any other inputs until the gate signal has finished being generated.
I want to use this component for 8 different inputs and 8 different outputs later, and set the desired delay/gate parameters separately for each one by writing registers.
When trying to compile in Quartus II v11.0 I am getting this error:
Error (10821): HDL error at clkgen.vhd(46): can't infer register for "control_clkgen" because its behavior does not match any supported register model
And as well
Error (10822): HDL error at clkgen.vhd(37): couldn't implement registers for assignments on this clock edge
No idea what's wrong; here is the code of the component:
library ieee;
use IEEE.Std_Logic_1164.all;
use IEEE.Std_Logic_arith.all;
use IEEE.Std_Logic_unsigned.all;
ENTITY clkgen is
port(
lclk : in std_logic;
start_clkgen : in std_logic;
gate_clkgen : in std_logic_vector(31 downto 0);
delay_clkgen : in std_logic_vector(31 downto 0);
output_clkgen : out std_logic
);
END clkgen ;
ARCHITECTURE RTL of clkgen is
signal gate_cycles_clkgen : std_logic_vector(32 downto 0);
signal delay_cycles_clkgen : std_logic_vector(32 downto 0);
signal total_cycles_clkgen : std_logic_vector(32 downto 0);
signal counter_clkgen : std_logic_vector(32 downto 0);
signal control_clkgen : std_logic;
begin
gate_cycles_clkgen <= '0' & gate_clkgen;
delay_cycles_clkgen <= '0' & delay_clkgen;
total_cycles_clkgen <= gate_cycles_clkgen + delay_cycles_clkgen;
start_proc: process(lclk, start_clkgen)
begin
if (start_clkgen'event and start_clkgen = '1') then
if control_clkgen = '0' then
control_clkgen <= '1';
end if;
end if;
if (lclk'event and lclk = '1') then
if control_clkgen = '1' then
counter_clkgen <= counter_clkgen + 1;
if (counter_clkgen > delay_cycles_clkgen - 1 AND counter_clkgen < total_cycles_clkgen + 1) then
output_clkgen <= '1';
elsif (counter_clkgen = total_cycles_clkgen) then
counter_clkgen <= (others => '0');
output_clkgen <= '0';
control_clkgen <= '0';
end if;
end if;
end if;
end process start_proc;
END RTL;
Big thanks in advance for help.
The problem is in the way you have described control_clkgen: it is edge sensitive to two different signals (lclk and start_clkgen). What the tools are telling you is: "Hey, as I am trying to make your valid VHDL design fit into a real piece of hardware, I have found that there is no piece of hardware that can implement what you want." Basically, there are no flip-flops that can be edge sensitive to two signals (only one, typically the clock).
Possible solution: Do you really need control_clkgen to be sensitive to the edge of start_clkgen? Would it be good enough, or could you find another solution where start_proc is sensitive only to lclk and you simply check if start_clkgen is high?
start_proc: process(lclk)
begin
if (rising_edge(lclk)) then
start_clkgen_d <= start_clkgen;
if (start_clkgen='1' and start_clkgen_d='0') then
if control_clkgen = '0' then
control_clkgen <= '1';
end if;
end if;
end if;
...
You are describing a register control_clkgen which has two clocks start_clkgen and lclk. I guess that's not supported by your synthesis tool.
You have to describe this behavior in another way. Maybe use start_clkgen as asynchronous or synchronous preset signal or combine those two signals into one single clock signal or use more than one flipflop for that functionality.
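As one possible way of using more than one flip-flop, the whole component could be rewritten around a single clock, with start_clkgen synchronized to lclk and its rising edge detected there. This is a sketch under assumptions, not a verified replacement: it assumes start_clkgen stays high for at least one lclk period, it switches to numeric_std, the internal signal names are hypothetical, and the exact cycle counts of the delay and gate should be checked against your requirements.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity clkgen is
  port (
    lclk          : in  std_logic;
    start_clkgen  : in  std_logic;
    gate_clkgen   : in  std_logic_vector(31 downto 0);
    delay_clkgen  : in  std_logic_vector(31 downto 0);
    output_clkgen : out std_logic
  );
end entity clkgen;

architecture rtl of clkgen is
  signal delay_cycles : unsigned(32 downto 0);
  signal total_cycles : unsigned(32 downto 0);
  signal counter      : unsigned(32 downto 0) := (others => '0');
  signal control      : std_logic := '0';
  -- Hypothetical helper: two synchronizer stages plus one for edge detection.
  signal start_sync   : std_logic_vector(2 downto 0) := (others => '0');
begin
  delay_cycles <= resize(unsigned(delay_clkgen), 33);
  total_cycles <= resize(unsigned(delay_clkgen), 33)
                  + resize(unsigned(gate_clkgen), 33);

  -- Every register is clocked by lclk only.
  process (lclk)
  begin
    if rising_edge(lclk) then
      start_sync <= start_sync(1 downto 0) & start_clkgen;

      if control = '0' then
        -- Idle: wait for a rising edge on the synchronized start input.
        output_clkgen <= '0';
        counter       <= (others => '0');
        if start_sync(2) = '0' and start_sync(1) = '1' then
          control <= '1';
        end if;
      else
        -- Busy: further start edges are ignored until the gate has finished.
        if counter = total_cycles then
          output_clkgen <= '0';
          control       <= '0';
          counter       <= (others => '0');
        else
          if counter >= delay_cycles then
            output_clkgen <= '1';
          end if;
          counter <= counter + 1;
        end if;
      end if;
    end if;
  end process;
end architecture rtl;
```

The synchronizer adds a couple of lclk cycles of latency between the start edge and the beginning of the delay count, which may or may not matter for your application.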

Resources