VHDL: creating a very slow clock pulse based on a very fast clock - vhdl

(I'd post this in EE but it seems there are far more VHDL questions here...)
Background: I'm using the Xilinx Spartan-6LX9 FPGA with the Xilinx ISE 14.4 (webpack).
I stumbled upon the dreaded "PhysDesignRules:372 - Gated clock" warning today, and I see there's a LOT of discussion out there concerning that in general. The consensus seems to be to use one of the DCMs on the FPGA to do clock division but... my DCM doesn't appear to be capable of going from 32 MHz to 4.096 KHz (per the wizard it bottoms out at 5MHz based on 32MHz... and it seems absurd to try to chain multiple DCMs for this low-frequency purpose).
My current design uses clk_in to count up to a specified value (15265), resets that value to zero and toggles the clk_out bit (so I end up with a duty cycle of 50%, FWIW). It does the job, and I can easily use the rising edge of clk_out to drive the next stage of my design. It seems to work just fine, but... gated clock (even though it isn't in the range where clock skew would IMHO be very relevant). (Note: All clock tests are done using the rising_edge() function in processes sensitive to the given clock.)
So, my questions:
If we're talking about deriving a relatively slow clk_out from a much faster clk_in, is gating still considered bad? Or is this sort of "count to x and send a pulse" thing pretty typical for FPGAs to generate a "clock" in the KHz range and instead some other unnecessary side-effect may be triggering this warning instead?
Is there a better way to create a low KHz-range clock from a MHz-range master clock, keeping in mind that using multiple DCMs appears to be overkill here (if it's possible at all given the very low output frequency)? I realize the 50% duty cycle may be superfluous but assuming one clock in and not using the on-board DCMs how else would one perform major clock division with an FPGA?
Edit: Given the following (where CLK_MASTER is the 32 MHz input clock and CLK_SLOW is the desired slow-rate clock, and LOCAL_CLK_SLOW was a way to store the state of the clock for the whole duty-cycle thing), I learned that this configuration causes the warning:
architecture arch of clock is
constant CLK_MASTER_FREQ: natural := 32000000; -- time := 31.25 ns
constant CLK_SLOW_FREQ: natural := 2048;
constant MAX_COUNT: natural := CLK_MASTER_FREQ/CLK_SLOW_FREQ;
shared variable counter: natural := 0;
signal LOCAL_CLK_SLOW: STD_LOGIC := '0';
begin
clock_proc: process(CLK_MASTER)
begin
if rising_edge(CLK_MASTER) then
counter := counter + 1;
if (counter >= MAX_COUNT) then
counter := 0;
LOCAL_CLK_SLOW <= not LOCAL_CLK_SLOW;
CLK_SLOW <= LOCAL_CLK_SLOW;
end if;
end if;
end process;
end arch;
Whereas this configuration does NOT cause the warning:
architecture arch of clock is
constant CLK_MASTER_FREQ: natural := 32000000; -- time := 31.25 ns
constant CLK_SLOW_FREQ: natural := 2048;
constant MAX_COUNT: natural := CLK_MASTER_FREQ/CLK_SLOW_FREQ;
shared variable counter: natural := 0;
begin
clock_proc: process(CLK_MASTER)
begin
if rising_edge(CLK_MASTER) then
counter := counter + 1;
if (counter >= MAX_COUNT) then
counter := 0;
CLK_SLOW <= '1';
else
CLK_SLOW <= '0';
end if;
end if;
end process;
end arch;
So, in this case it was all for lack of an else (like I said, the 50% duty cycle was originally interesting but wasn't a requirement in the end, and the toggle of the "local" clock bit seemed quite clever at the time...) I was mostly on the right track it appears.
What's not clear to me at this point is why using a counter (which stores lots of bits) isn't causing warnings, but a stored-and-toggled output bit does cause warnings. Thoughts?

If you just need a clock to drive another part of your logic in the FPGA, the easy answer is to use a clock enable.
That is, run your slow logic on the same (fast) clock as everything else, but use a slow enable for it. Example:
signal clk_enable_200kHz : std_logic;
signal clk_enable_counter : std_logic_vector(9 downto 0);
--Create the clock enable:
process(clk_200MHz)
begin
if(rising_edge(clk_200MHz)) then
clk_enable_counter <= clk_enable_counter + 1;
if(clk_enable_counter = 0) then
clk_enable_200kHz <= '1';
else
clk_enable_200kHz <= '0';
end if;
end if;
end process;
--Slow process:
process(clk_200MHz)
begin
if(rising_edge(clk_200MHz)) then
if(reset = '1') then
--Do reset
elsif(clk_enable_200kHz = '1') then
--Do stuff
end if;
end if;
end process;
The 200kHz is approximate though, but the above can be extended to basically any clock enable frequency you need. Also, it should be supported directly by the FPGA hardware in most FPGAs (it is in Xilinx parts at least).
Gated clocks are almost always a bad idea, as people often forget that they are creating new clock-domains, and thus do not take the necessary precautions when interfacing signals between these. It also uses more clock-lines inside the FPGA, so you might quickly use up all your available lines if you have a lot of gated clocks.
Clock enables have none of these drawbacks. Everything runs in the same clock domain (although at different speeds), so you can easily use the same signals without any synchronizers or similar.

Note for this example to work this line,
signal clk_enable_counter : std_logic_vector(9 downto 0);
must be changed to
signal clk_enable_counter : unsigned(9 downto 0);
and you'll need to include this library,
library ieee;
use ieee.numeric_std.all;

Both your samples create a signal, one of which toggles at a slow rate, and one of which pulses a narrow pulse at a "slow-rate". If both those signals go to the clock-inputs of other flipflops, I would expect warnings about clock routing being non-optimal.
I'm not sure why you get a gated clock warning, that usually comes about when you do:
gated_clock <= clock when en = '1' else '0';

Here's a Complete Sample Code :
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;
USE IEEE.NUMERIC_STD.ALL;
ENTITY Test123 IS
GENERIC (
clk_1_freq_generic : unsigned(31 DOWNTO 0) := to_unsigned(0, 32); -- Presented in Hz
clk_in1_freq_generic : unsigned(31 DOWNTO 0) := to_unsigned(0, 32) -- Presented in Hz, Also
);
PORT (
clk_in1 : IN std_logic := '0';
rst1 : IN std_logic := '0';
en1 : IN std_logic := '0';
clk_1 : OUT std_logic := '0'
);
END ENTITY Test123;
ARCHITECTURE Test123_Arch OF Test123 IS
--
SIGNAL clk_en_en : std_logic := '0';
SIGNAL clk_en_cntr1 : unsigned(31 DOWNTO 0) := (OTHERS => '0');
--
SIGNAL clk_1_buffer : std_logic := '0';
SIGNAL clk_1_freq : unsigned(31 DOWNTO 0) := (OTHERS => '0'); -- Presented in Hz, Also
SIGNAL clk_in1_freq : unsigned(31 DOWNTO 0) := (OTHERS => '0'); -- Presented in Hz
--
SIGNAL clk_prescaler1 : unsigned(31 DOWNTO 0) := (OTHERS => '0'); -- Presented in Cycles (Relative To The Input Clk.)
SIGNAL clk_prescaler1_halved : unsigned(31 DOWNTO 0) := (OTHERS => '0');
--
BEGIN
clk_en_gen : PROCESS (clk_in1)
BEGIN
IF (clk_en_en = '1') THEN
IF (rising_edge(clk_in1)) THEN
clk_en_cntr1 <= clk_en_cntr1 + 1;
IF ((clk_en_cntr1 + 1) = clk_prescaler1_halved) THEN -- a Register's (F/F) Output Only Updates Upon a Clock-Edge : That's Why This Comparison Is Done This Way !
clk_1_buffer <= NOT clk_1_buffer;
clk_1 <= clk_1_buffer;
clk_en_cntr1 <= (OTHERS => '0');
END IF;
END IF;
ELSIF (clk_en_en = '0') THEN
clk_1_buffer <= '0';
clk_1 <= clk_1_buffer;
clk_en_cntr1 <= (OTHERS => '0'); -- Clear Counter 'clk_en_cntr1'
END IF;
END PROCESS;
update_clk_prescalers : PROCESS (clk_in1_freq, clk_1_freq)
BEGIN
clk_prescaler1 <= (OTHERS => '0');
clk_prescaler1_halved <= (OTHERS => '0');
clk_en_en <= '0';
IF ((clk_in1_freq > 0) AND (clk_1_freq > 0)) THEN
clk_prescaler1 <= (clk_in1_freq / clk_1_freq); -- a Register's (F/F) Output Only Updates Upon a Clock-Edge : That's Why This Assignment Is Done This Way !
clk_prescaler1_halved <= ((clk_in1_freq / clk_1_freq) / 2); -- (Same Thing Here)
IF (((clk_in1_freq / clk_1_freq) / 2) > 0) THEN -- (Same Thing Here, Too)
clk_en_en <= '1';
END IF;
ELSE
NULL;
END IF;
END PROCESS;
clk_1_freq <= clk_1_freq_generic;
clk_in1_freq <= clk_in1_freq_generic;
END ARCHITECTURE Test123_Arch;

Related

UART Transmitter only functions when embedded logic analyzer is running

I have been trying to implement a UART in order to communicate between my Lattice MachXO3D board and my computer. At the moment I am attempting to implement the transmission from the FPGA.
Upon testing on the hardware, I encountered a very strange issue. If run normally, it will function for a few seconds and then it will suddenly stop functioning (The CH340 connected to my computer will report it is receiving messages containing 0x0). However, if I embed a logic analyzer onto the FPGA through the Lattice Diamond software, and I run the analyzer, it will function perfectly for an extended period of time.
Sadly, I don't have a logic analyzer, so the embedded logic analyzer is my only chance at knowing what is actually being transmitted.
These are the files related to my implementation:
baud_gen
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;
-- Generates 16 ticks per bit
ENTITY baud_gen IS
GENERIC(divider: INTEGER := 13 -- 24M/115200*16
);
PORT(
clk, reset: IN STD_LOGIC;
s_tick: OUT STD_LOGIC
);
END baud_gen;
ARCHITECTURE working OF baud_gen IS
BEGIN
PROCESS(clk)
VARIABLE counter: UNSIGNED(3 DOWNTO 0) := to_unsigned(0,4);
BEGIN
IF clk'EVENT AND clk='1' THEN
IF reset='1' THEN
s_tick <= '0';
counter := to_unsigned(0,4);
ELSIF counter=to_unsigned(divider-1,4) then
s_tick <= '1';
counter:= to_unsigned(0,4);
ELSE
s_tick <= '0';
counter := counter + 1;
END IF;
END IF;
END PROCESS;
END working;
rs232_tx
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;
ENTITY rs232_tx IS
PORT(clk: IN std_logic;
tx: OUT std_logic;
rst: IN std_logic;
fifo_empty: IN std_logic;
fifo_RdEn, fifo_RdClock: OUT std_logic;
fifo_data: IN STD_LOGIC_VECTOR(7 DOWNTO 0);
Q: OUT STD_LOGIC_VECTOR(1 DOWNTO 0));
END rs232_tx;
ARCHITECTURE working OF rs232_tx IS
TYPE state IS (idle,start,data);
SIGNAL tx_pulse: STD_LOGIC := '1';
SIGNAL s_tick: STD_LOGIC;
SIGNAL pr_state, nx_state: state := idle;
SIGNAL data_val: std_logic_vector(7 DOWNTO 0):=(others=>'0');
SIGNAL data_count: unsigned(2 DOWNTO 0):=to_unsigned(0,3);
BEGIN
process(s_tick, rst)
VARIABLE count: unsigned(3 DOWNTO 0):= to_unsigned(0,4);
BEGIN
IF rising_edge(s_tick) THEN
count := count + to_unsigned(1,4);
IF count = to_unsigned(15,4) THEN
tx_pulse <= '1';
ELSE
tx_pulse <='0';
END IF;
END IF;
END PROCESS;
process(tx_pulse,rst)
BEGIN
IF rising_edge(tx_pulse) THEN
IF rst='1' THEN
pr_state <= idle;
data_val <= (others=>'0');
data_count <= to_unsigned(0,3);
ELSE
pr_state <= nx_state;
CASE pr_state IS
WHEN idle =>
data_count <= to_unsigned(0,3);
WHEN data =>
data_count <= data_count + to_unsigned(1,3);
WHEN start =>
data_val <= fifo_data;
WHEN OTHERS =>
END case;
END IF;
END IF;
END process;
process(fifo_empty,rst,data_count,pr_state,data_count)
BEGIN
case pr_state is
when idle =>
Q <= ('1','1');
fifo_RdEn <= '0';
tx <= '1';
IF fifo_empty='0' AND rst='0' THEN
nx_state <= start;
ELSE
nx_state <= idle;
END IF;
WHEN start =>
Q <= ('1','0');
fifo_RdEn <= '1';
tx <= '0';
nx_state <= data;
WHEN data =>
Q <= ('0','1');
fifo_RdEn <= '0';
tx <= data_val(to_integer(data_count));
if data_count=to_unsigned(7,3) then
nx_state <= idle;
ELSE
nx_state <= data;
end if;
end case;
END process;
fifo_RdClock <= tx_pulse;
baud_gen: ENTITY work.baud_gen
PORT MAP(clk,reset=>'0',s_tick=>s_tick);
END working;
testbench
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;
ENTITY rs232_tx_test IS
GENERIC(clk_period: TIME := 41666666.7 fs;
baud_period: TIME := 8680.55556 ns);
END rs232_tx_test;
ARCHITECTURE working OF rs232_tx_test IS
SIGNAL clk: STD_LOGIC := '0';
SIGNAL tx, rst, fifo_empty, fifo_RdEn, fifo_RdClock: STD_LOGIC;
SIGNAL fifo_data: STD_LOGIC_VECTOR(7 DOWNTO 0);
BEGIN
clk <= NOT clk AFTER clk_period/2;
rst <= '1', '0' AFTER 100 ns;
PROCESS
BEGIN
fifo_empty <= '1';
WAIT FOR baud_period;
fifo_empty <= '0';
WAIT FOR baud_period*16;
END PROCESS;
fifo_data <= ('1','1','0','0','1','0','1','1') WHEN fifo_RdEn='1' ELSE (others=>'0');
dut: ENTITY work.rs232_tx
PORT MAP(clk,tx,rst,fifo_empty,fifo_RdEn,fifo_RdClock,fifo_data);
END working;
EDIT
I have tested another UART design I found online # 9600 bps and it fails in the same way. It can send a constant character, in this case 'a', to a terminal on my computer, and then it suddenly stops sending anything. However, if I start listening to the soft logic analyzer I generated in Lattice Diamond, it works without a problem and does not fail.
This smells like a classic timing issue.
In the comments, others have explained where there are weaknesses in the code that could indirectly lead to this. I'm going to concentrate on what is occurring.
As you have stated, in simulation your code works as you expect. However, that is only half the story. To create an FPGA bit stream the build tools take your code and several other files, synthesises, and conducts Place & Route. Your timing issue occurs during P&R. This is why your simulation doesn't pick up on any errors, as I'm assuming it's an RTL (pre-place & route) simulation.
During P&R the tools lay the logic in the best way to fit the timing model of the device so all paths meet their timing requirements. The path timing requirements are derived from explicit statements in a Timing Constraints file and inferred from your code (that's why your coding style matters btw).
Once P&R is complete, the tools will put the build artefact through a static timing analyser (STA) tool and report back whether the build fails to meet the timing requirements.
This leads to two questions:
Does the build report a timing error?
Do you have a Timing Constraints file - if you're unsure, the answer is no.
The way to debug your problem is to use the Timing Report generated by Lattice Diamond to see where the failures are.
If there are no reported failures it means your timing model is wrong because of a lack of appropriate timing constraints. As a minimum, you will need to constrain all IO in the design and describe all the clocks in your design.
Here is a good document to help you out:
https://www.latticesemi.com/-/media/LatticeSemi/Documents/UserManuals/RZ/Timing_Closure_Document.ashx?document_id=45588
The reason that your design works when you use the embedded Logic Analyser is that extra logic has been placed in the design, which changes the timing model. The P&R tools lay out the design differently, and by luck have placed the design in such a way as meet the real timing requirements on it.
As I often say to my Software Engineer colleagues, software languages create a set of instructions, HDL creates a set of suggestions.

Why does the output signals post-synthesis not work as usual?

I had written a small VHD file for simulating the behavior of a quadrature decoder, enclosed below. Simulating the design with a generic testbench works as expected. But after generating a synthesizable design with Quartus, I run into one of two problems (while playing with using unsigned, for example)
1. The position and direction signal are always at a constant 0 value throughout the post-synthesis simulation.
2. The position value seems to jump 10 values every 3-4 clock cycles, which I attribute to some jitter in data.
Does anyone have any recommendations to solve this issue? Is this mainly a timing problem or is there a major flaw in my design?
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.NUMERIC_STD.ALL;
entity quad_decoder is
port(rst : in std_logic;
clk : in std_logic;
a : in std_logic;
b : in std_logic;
direction : out std_logic;
position : out std_logic_vector(8 DOWNTO 0));
end quad_decoder;
architecture behavioral of quad_decoder is
begin
process(clk)
variable counter : integer range 0 to 360 := 0;
variable chanA,chanB : std_logic;
variable int_direction : std_logic;
begin
if (rst = '0') then
int_direction := '0';
counter := 0;
elsif (rising_edge(clk)) then
chanA := a;
chanB := b;
if (chanA = '1') and (chanB = '0') then
if (counter = 360) then
counter := 0;
else
counter:= counter + 1;
end if;
int_direction := '1';
elsif (chanA = '0') and (chanB = '1') then
if (counter = 0) then
counter := 360;
else
counter := counter-1;
end if;
int_direction := '0';
else
counter := counter;
int_direction := int_direction;
end if;
position <= std_logic_vector(to_unsigned(counter,9));
direction <= int_direction;
end if;
end process;
end behavioral;
The expected pre-synthesis snap is here.
I've linked an example snap of the post-synthesis simulation here. As seen, no change to position nor direction in multiple clock cycles.
If anyone is inquisitive, doing assignments right at the clock edge as well as turning the reset signal high proved to introduce all kinds of timing issues, which passed the multi-corner timing analysis test, but failed other tests in Quartus that I had failed to notice.
I can go into more details if my answer is vague.

VHDL - synthesis results is not the same as behavioral

I have to write program in VHDL which calculate sqrt using Newton method. I wrote the code which seems to me to be ok but it does not work.
Behavioral simulation gives proper output value but post synthesis (and launched on hardware) not.
Program was implemented as state machine. Input value is an integer (used format is std_logic_vector), and output is fixed point (for calculation
purposes input value was multiplied by 64^2 so output value has 6 LSB bits are fractional part).
I used function to divide in vhdl from vhdlguru blogspot.
In behavioral simulation calculating sqrt takes about 350 ns (Tclk=10 ns) but in post synthesis only 50 ns.
Used code:
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
use ieee.std_logic_unsigned.all;
entity moore_sqrt is
port (clk : in std_logic;
enable : in std_logic;
input : in std_logic_vector (15 downto 0);
data_ready : out std_logic;
output : out std_logic_vector (31 downto 0)
);
end moore_sqrt;
architecture behavioral of moore_sqrt is
------------------------------------------------------------
function division (x : std_logic_vector; y : std_logic_vector) return std_logic_vector is
variable a1 : std_logic_vector(x'length-1 downto 0):=x;
variable b1 : std_logic_vector(y'length-1 downto 0):=y;
variable p1 : std_logic_vector(y'length downto 0):= (others => '0');
variable i : integer:=0;
begin
for i in 0 to y'length-1 loop
p1(y'length-1 downto 1) := p1(y'length-2 downto 0);
p1(0) := a1(x'length-1);
a1(x'length-1 downto 1) := a1(x'length-2 downto 0);
p1 := p1-b1;
if(p1(y'length-1) ='1') then
a1(0) :='0';
p1 := p1+b1;
else
a1(0) :='1';
end if;
end loop;
return a1;
end division;
--------------------------------------------------------------
type state_type is (s0, s1, s2, s3, s4, s5, s6); --type of state machine
signal current_state,next_state: state_type; --current and next state declaration
signal xk : std_logic_vector (31 downto 0);
signal temp : std_logic_vector (31 downto 0);
signal latched_input : std_logic_vector (15 downto 0);
signal iterations : integer := 0;
signal max_iterations : integer := 10; --corresponds with accuracy
begin
process (clk,enable)
begin
if enable = '0' then
current_state <= s0;
elsif clk'event and clk = '1' then
current_state <= next_state; --state change
end if;
end process;
--state machine
process (current_state)
begin
case current_state is
when s0 => -- reset
output <= "00000000000000000000000000000000";
data_ready <= '0';
next_state <= s1;
when s1 => -- latching input data
latched_input <= input;
next_state <= s2;
when s2 => -- start calculating
-- initial value is set as a half of input data
output <= "00000000000000000000000000000000";
data_ready <= '0';
xk <= "0000000000000000" & division(latched_input, "0000000000000010");
next_state <= s3;
iterations <= 0;
when s3 => -- division
temp <= division ("0000" & latched_input & "000000000000", xk);
next_state <= s4;
when s4 => -- calculating
if(iterations < max_iterations) then
xk <= xk + temp;
next_state <= s5;
iterations <= iterations + 1;
else
next_state <= s6;
end if;
when s5 => -- shift logic right by 1
xk <= division(xk, "00000000000000000000000000000010");
next_state <= s3;
when s6 => -- stop - proper data
-- output <= division(xk, "00000000000000000000000001000000"); --the nearest integer value
output <= xk; -- fixed point 24.6, sqrt = output/64;
data_ready <= '1';
end case;
end process;
end behavioral;
Below screenshoots of behavioral and post-sythesis simulation results:
Behavioral simulation
Post-synthesis simulation
I have only little experience with VHDL and I have no idea what can I do to fix problem. I tried to exclude other process which was for calculation but it also did not work.
I hope you can help me.
Platform: Zynq ZedBoard
IDE: Vivado 2014.4
Regards,
Michal
A lot of the problems can be eliminated if you rewrite the state machine in single process form, in a pattern similar to this. That will eliminate both the unwanted latches, and the simulation /synthesis mismatches arising from sensitivity list errors.
I believe you are also going to have to rewrite the division function with its loop in the form of a state machine - either a separate state machine, handshaking with the main one to start a divide and signal its completion, or as part of a single hierarchical state machine as described in this Q&A.
This code is neither correct for simulation nor for synthesis.
Simulation issues:
Your sensitivity list is not complete, so the simulation does not show the correct behavior of the synthesized hardware. All right-hand-side signals should be include if the process is not clocked.
Synthesis issues:
Your code produces masses of latches. There is only one register called current_state. Latches should be avoided unless you know exactly what you are doing.
You can't divide numbers in the way you are using the function, if you want to keep a proper frequency of your circuit.
=> So check your Fmax report and
=> the RTL schematic or synthesis report for resource utilization.
Don't use the devision to shift bits. Neither in software the compiler implements a division if a value is shifted by a power of two. Us a shift operation to shift a value.
Other things to rethink:
enable is a low active asynchronous reset. Synchronous resets are better for FPGA implementations.
VHDL code may by synthesizable or not, and the synthesis result may behave as the simulation, or not. This depends on the code, the synthesizer, and the target platform, and is very normal.
Behavioral code is good for test-benches, but - in general - cannot be synthesized.
Here I see the most obvious issue with your code:
process (current_state)
begin
[...]
iterations <= iterations + 1;
[...]
end process;
You are iterating over a signal which does not appear in the sensitivity list of the process. This might be ok for the simulator which executes the process blocks just like software. On the other hand side, the synthesis result is totally unpredictable. But adding iterations to the sensitivity list is not enough. You would just end up with an asynchronous design. Your target platform is a clocked device. State changes may only occur at the trigger edge of the clock.
You need to tell the synthesizer how to map the iterations required to perform this calculation over the clock cycles. The safest way to do that is to break down the behavioural code into RTL code (https://en.wikipedia.org/wiki/Register-transfer_level#RTL_in_the_circuit_design_cycle).

Gated Clock in Clock Divider for a Square Wave

I recently have been designing a clock divider for my system - which I redesigned, and now has an asynchronous reset, which generates a synchronous reset for the rest of the system. To do this, I followed the answers & advice to my own question and used a clock enable to toggle an output, thus generating clock with a 50% duty cycle (which is desired).
However, this has thrown up the error when generating the bitstream of
PhysDesignRules:372 - Gated clock. Clock net ... is sourced by a combinatorial pin. This is not good design practice. Use the CE pin to control the loading of data into the flip-flop.
Reading into this question about using the clock enable, it appears when creating a clock enable is correct, however, because I need a square wave (and not just a 1/200MHz pulse), and thus using the enable to toggle another signal, it appears in this question that this is an intentional gated clock.
So my questions are; is this gated clock warning significant? Both in simulation, and on an oscilloscope it appears to function correctly (so I'm inclined to ignore it), but am I storing problems for later? Is there a way of getting a very slow 50% duty cycle pulse without a gated clock?
I've put my code below!
Thanks very much (especially to the handful of people who have collectively given a lot of time to my recent non-stop questions)
David
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;
library UNISIM;
use UNISIM.VComponents.all;
ENTITY CLK_DIVIDER IS
GENERIC(INPUT_FREQ : INTEGER;
OUT1_FREQ : INTEGER;
OUT2_FREQ : INTEGER
);
PORT(SYSCLK : IN STD_LOGIC;
RESET_N : IN STD_LOGIC;
RESET_N_OUT : OUT STD_LOGIC;
OUT1 : OUT STD_LOGIC;
OUT2 : OUT STD_LOGIC);
END CLK_DIVIDER;
architecture Behavioral of Clk_Divider is
constant divider1 : integer := INPUT_FREQ / OUT1_FREQ / 2;
constant divider2 : integer := INPUT_FREQ / OUT2_FREQ / 2;
signal counter1 : integer := 0;
signal counter2 : integer := 0;
signal output1 : std_logic := '0';
signal output2 : std_logic := '0';
signal reset : boolean;
begin
reset_proc : process(RESET_N, SYSCLK)
variable cycles : integer := 0;
variable reset_counter : integer := 0;
begin
if rising_edge(SYSCLK) then
if cycles < 2 then
if reset_counter >= divider1 then
cycles := cycles + 1;
reset_counter := 0;
else
reset_counter := reset_counter + 1;
end if;
reset <= true;
else
reset <= false;
end if;
end if;
if RESET_N = '0' then
cycles := 0;
reset_counter := 0;
reset <= true;
end if;
end process;
output1_proc : process(SYSCLK)
begin
if rising_edge(SYSCLK) then
if counter1 >= divider1 - 1 then
output1 <= not output1;
counter1 <= 0;
else
counter1 <= counter1 + 1;
end if;
if RESET_N = '0' then
counter1 <= 0;
output1 <= '1';
end if;
end if;
end process;
output2_proc : process(SYSCLK)
begin
if rising_edge(SYSCLK) then
if counter2 >= divider2 - 1 then
output2 <= not output2;
counter2 <= 0;
else
counter2 <= counter2 + 1;
end if;
if RESET_N = '0' then
counter2 <= 0;
output2 <= '1';
end if;
end if;
end process;
OUT1 <= output1;
OUT2 <= output2;
RESET_N_OUT <= '0' when reset else '1';
end Behavioral;
The comments suggest, that the clock-divider is instantiated in a larger design.
If you want to use the generated clock there, you must add a BUFG between signal output2 and the output out2 like this:
out2_bufg : BUFG port map(I => output2, O => out2);
The component BUFG is defined in the library unisim.vcomponents. Same applies to output out1.
The BUFG ensures, that the generated clock is distributed using a clock-tree, so that the clock signal arrives at all destination flip-flops at the same time. This minimizes hold-time violations and gives more room for the setup of data signals.
If you still get the warning/error, then you combined the generated clock signal with another signal elsewhere in the larger design.
The reset_proc process is written with a elsif cycles > 1 then outside the clocked part, and with the condition derived from cycles which is also assigned as part of reset. It is unclear what hardware is actually described here, but it makes the synthesis tool unhappy.
If you want to use RESET_N as asynchronous reset, then just make a simple if RESET_N = '0' then ... end if;, and assign whatever signals or variables required to be reset. Other assignments should be moved to the clocked part.
NOTE: Code was changed after the above update... which may have removed the reason for the question.
Btw. RESET_N is used as asynchronous reset in the reset_proc, but as synchronous reset in the other processes, and this inconsistent use looks like there may be a potential problem with the reset scheme.

Cannot create latch and counter with 2 clock signals in VHDL

I am completely new to programming CPLDs and I want to program a latch + counter in Xilinx ISE Project Navigator using VHDL language. This is how it must work and it MUST be only this way: this kind of device gets 2 clock signals. When one of them gets from HIGH to LOW state, data input bits get transferred to outputs and they get latched. When the 2nd clock gets from LOW to HIGH state, the output bits get incremented by 1. Unfortunately my code doesn't want to work....
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;
entity counter8bit is
port(CLKDA, CLKLD : in std_logic;
D : in std_logic_vector(7 downto 0);
Q : out std_logic_vector(7 downto 0));
end counter8bit;
architecture archi of counter8bit is
signal tmp: std_logic_vector(7 downto 0);
begin
process (CLKDA, CLKLD, D)
begin
if (CLKLD'event and CLKLD='0') then
tmp <= D;
elsif (CLKDA'event and CLKDA='1') then
tmp <= tmp + 1;
end if;
end process;
Q <= tmp;
end archi;
Is there any other way around to achieve this?? Please for replies. Any kind of help/suggestions will be strongly appreciated. Many thanks in advance!!
Based on the added comments on what the counter is for, I came up with the following idea. Whether it would work in reality is hard to decide, because I would need a proper timing diagram for the EPROM interface. Importantly, there could be clock domain crossing issues depending on what restrictions there are on how the two clock signals are asserted; if there can be a falling edge of CLKLD close to a rising edge of CLKDA, the design may not work reliably.
signal new_start_address : std_logic := '0';
signal start_address : std_logic_vector(D'range) := (others => '0');
...
process (CLKLD, CLKDA)
begin
if (CLKDA = '1') then
new_start_address <= '0';
elsif (falling_edge(CLKLD)) then
start_address <= D;
new_start_address <= '1';
end if;
end process;
process (CLKDA)
begin
if (rising_edge(CLKDA)) then
if (new_start_address = '1') then
tmp <= start_address;
else
tmp <= tmp + 1;
end if;
end if;
end process;
I'm not completely clear on the interface, but it could be that the line tmp <= start_address; should become tmp <= start_address + 1;
You may also need to replace your assignment of Q with:
Q <= start_address when new_start_address = '1' else tmp;
Again, it's hard to know for sure without a timing diagram.

Resources