The task is to make a 4-bit multiplier that uses an FSM. The steps would be 1) multiply, 2) shift, 3) add.
      1011   (11 in decimal)
    x 1110   (14 in decimal)
    ------
      0000   (1011 x 0)
     1011    (1011 x 1, shifted 1 position to the left)
    1011     (1011 x 1, shifted 2 positions to the left)
   1011      (1011 x 1, shifted 3 positions to the left)
  --------
  10011010   (154 in decimal)
Here is my code:
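library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL; -- provides "+" on STD_LOGIC_VECTOR
entity test is
Port ( RESET : in STD_LOGIC;
CLK : in STD_LOGIC;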
Input : in STD_LOGIC_VECTOR (3 downto 0);
Confirm : in STD_LOGIC;
Output : out STD_LOGIC_VECTOR (7 downto 0));
end test;
architecture Behavioral of test is
type state is (R,S0,S1,S2,S3,S4);
signal pstate, nstate: state;
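signal A_sig, B_sig: STD_LOGIC_VECTOR(3 downto 0);
begin
-- process label and sensitivity list assumed
nstate_output: process(pstate, Confirm, Input)
variable temp_var: STD_LOGIC_VECTOR(3 downto 0);
variable tempMult_var,tempProd_var: STD_LOGIC_VECTOR(7 downto 0);
begin
case pstate is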
when R =>
nstate <= S0;
tempMult_var := (others => '0');
tempProd_var := (others => '0');
A_sig <= (others => '0');
B_sig <= (others => '0');
Output <= (others => '0');
when S0 =>
nstate <= S0;
if (Confirm = '1') then
A_sig <= Input;
nstate <= S1;
end if;
when S1 =>
nstate <= S1;
if (Confirm = '0') then
nstate <= S2;
end if;
when S2 =>
nstate <= S2;
if (Confirm = '1') then
B_sig <= Input;
nstate <= S3;
end if;
when S3 =>
nstate <= S3;
if (Confirm = '0') then
nstate <= S4;
end if;
when S4 =>
nstate <= S0;
for x in 0 to 3 loop
temp_var := (A_sig AND (B_sig(x)&B_sig(x)&B_sig(x)&B_sig(x) ) );
tempMult_var := "0000" & temp_var;
if (x=0) then tempMult_var := tempMult_var;
elsif (x=1) then tempMult_var := tempMult_var(6 downto 0)&"0";
elsif (x=2) then tempMult_var := tempMult_var(5 downto 0)&"00";
elsif (x=3) then tempMult_var := tempMult_var(4 downto 0)&"000";
end if;
tempProd_var := tempProd_var + tempMult_var;
end loop;
Output <= tempProd_var;
tempProd_var := (others => '0');
end case;
end process;
state_transition: process(CLK,RESET)
begin
if RESET = '1' then
pstate <= R;
elsif rising_edge(CLK) then
pstate <= nstate;
end if;
end process;
end Behavioral;
Here are the warnings:
after "Synthesize - XST"
WARNING:Xst - Property "use_dsp48" is not applicable for this technology.
WARNING:Xst:737 - Found 4-bit latch for signal <B_sig>.
WARNING:Xst:737 - Found 8-bit latch for signal <Output>.
WARNING:Xst:737 - Found 4-bit latch for signal <A_sig>.
after "Implement Design"
WARNING:Route:447 - CLK Net:A_sig_not0001 may have excessive skew because
WARNING:Route:447 - CLK Net:B_sig_not0001 may have excessive skew because
after "Generate Programming File"
WARNING:PhysDesignRules:372 - Gated clock. Clock net A_sig_not0001 is sourced by
WARNING:PhysDesignRules:372 - Gated clock. Clock net B_sig_not0001 is sourced by
WARNING:PhysDesignRules:372 - Gated clock. Clock net Output_or0000 is sourced by
The simulations are correct, but the actual board doesn't produce the correct output. What might be the problem?

My humble advice:
Combine your two processes into a single clocked process.
That way you avoid a whole category of asynchronous logic mistakes that are easy for a beginner to make and painful to track down.

Also, whenever you see warnings about latches or gated clocks, revisit your code - both are clear indicators that something is most probably wrong.
Latches typically come from combinatorial processes where signals are only assigned in some cases. For instance A_sig is not assigned in S0, if confirm = 0, and will thus cause a latch to be inferred. In this case, just make sure that A_sig is always set to something, no matter the combination of the control signal values.
In this case the gated clocks probably come from your rather complex combinatorial process, but most often they come from using logic-generated signals to clock synchronous processes. This can lead to all kinds of problems (high FPGA clock line usage and timing/routing issues), especially if you're not aware that you're creating additional clock domains. It can mostly be avoided by running the process in question on the main (global or local) system clock and using a clock enable to scale it down if necessary.
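As a rough illustration of the single clocked process advice, here is a minimal sketch of the same state sequence. It is not the asker's code: the entity name mult4_single_proc is made up, the reset is synchronous rather than asynchronous, numeric_std replaces the vector arithmetic, and it assumes Confirm is already synchronized to CLK and debounced. Every signal is assigned only on the clock edge, so no latches or gated clocks should be inferred.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity name; ports mirror the question's.
entity mult4_single_proc is
  port (
    CLK     : in  std_logic;
    RESET   : in  std_logic;
    Input   : in  std_logic_vector(3 downto 0);
    Confirm : in  std_logic;  -- assumed synchronized and debounced
    Output  : out std_logic_vector(7 downto 0)
  );
end entity;

architecture rtl of mult4_single_proc is
  type state is (S0, S1, S2, S3, S4);
  signal pstate       : state := S0;
  signal A_reg, B_reg : unsigned(3 downto 0) := (others => '0');
begin
  process (CLK)
    variable prod : unsigned(7 downto 0);
  begin
    if rising_edge(CLK) then
      if RESET = '1' then                 -- synchronous reset (an assumption)
        pstate <= S0;
        A_reg  <= (others => '0');
        B_reg  <= (others => '0');
        Output <= (others => '0');
      else
        case pstate is
          when S0 =>                      -- wait for first Confirm press
            if Confirm = '1' then
              A_reg  <= unsigned(Input);
              pstate <= S1;
            end if;
          when S1 =>                      -- wait for Confirm release
            if Confirm = '0' then
              pstate <= S2;
            end if;
          when S2 =>                      -- wait for second Confirm press
            if Confirm = '1' then
              B_reg  <= unsigned(Input);
              pstate <= S3;
            end if;
          when S3 =>                      -- wait for Confirm release
            if Confirm = '0' then
              pstate <= S4;
            end if;
          when S4 =>                      -- shift and add, then start over
            prod := (others => '0');
            for x in 0 to 3 loop
              if B_reg(x) = '1' then
                prod := prod + shift_left(resize(A_reg, 8), x);
              end if;
            end loop;
            Output <= std_logic_vector(prod);
            pstate <= S0;
        end case;
      end if;
    end if;
  end process;
end architecture;
```

Because A_reg, B_reg and Output are registers here, "not assigned in this branch" simply means "hold the previous value", which is what the original code relied on latches to do.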


How to remove redundant processes in VHDL

I am unfortunately new to VHDL but not new to software development. What is the equivalent of functions in VHDL? Specifically, in the code below I need to debounce four push buttons instead of one. Obviously repeating my process code four times and suffixing each of my signals with a number to make them unique for the four instances is neither the professional nor the correct way of doing this. How do I collapse all this down into one process "function" to which I can "pass" the signals so I can excise all this duplicate code?
-- Debounced pushbutton examples
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL; -- provides "+" on std_logic_vector
entity pushbutton is
generic (
counter_size : integer := 19 -- counter size (19 bits gives 10.5ms with 50MHz clock)
);
port (
CLK : in std_logic; -- input clock
BTN : in std_logic_vector(0 to 3); -- input buttons
AN : out std_logic_vector(0 to 3); -- 7-segment digit anodes ports
LED : out std_logic_vector(0 to 3) -- LEDs
);
end pushbutton;
architecture pb of pushbutton is
signal flipflops0 : std_logic_vector(1 downto 0); -- input flip flops
signal flipflops1 : std_logic_vector(1 downto 0);
signal flipflops2 : std_logic_vector(1 downto 0);
signal flipflops3 : std_logic_vector(1 downto 0);
signal counter_set0 : std_logic; -- sync reset to zero
signal counter_set1 : std_logic;
signal counter_set2 : std_logic;
signal counter_set3 : std_logic;
signal counter_out0 : std_logic_vector(counter_size downto 0) := (others => '0'); -- counter output
signal counter_out1 : std_logic_vector(counter_size downto 0) := (others => '0');
signal counter_out2 : std_logic_vector(counter_size downto 0) := (others => '0');
signal counter_out3 : std_logic_vector(counter_size downto 0) := (others => '0');
signal button0 : std_logic; -- debounce input
signal button1 : std_logic;
signal button2 : std_logic;
signal button3 : std_logic;
signal result0 : std_logic; -- debounced signal
signal result1 : std_logic;
signal result2 : std_logic;
signal result3 : std_logic;
begin
-- Make sure Mercury BaseBoard 7-Seg Display is disabled (anodes are pulled high)
AN <= (others => '1');
-- Feed buttons into debouncers
button0 <= BTN(0);
button1 <= BTN(1);
button2 <= BTN(2);
button3 <= BTN(3);
-- Start or reset the counter at the right time
counter_set0 <= flipflops0(0) xor flipflops0(1);
counter_set1 <= flipflops1(0) xor flipflops1(1);
counter_set2 <= flipflops2(0) xor flipflops2(1);
counter_set3 <= flipflops3(0) xor flipflops3(1);
-- Feed LEDs from the debounce circuitry
LED(0) <= result0;
LED(1) <= result1;
LED(2) <= result2;
LED(3) <= result3;
-- Debounce circuit 0
process (CLK)
begin
if (CLK'EVENT and CLK = '1') then
flipflops0(0) <= button0;
flipflops0(1) <= flipflops0(0);
if (counter_set0 = '1') then -- reset counter because input is changing
counter_out0 <= (others => '0');
elsif (counter_out0(counter_size) = '0') then -- stable input time is not yet met
counter_out0 <= counter_out0 + 1;
else -- stable input time is met
result0 <= flipflops0(1);
end if;
end if;
end process;
-- Debounce circuit 1
process (CLK)
begin
if (CLK'EVENT and CLK = '1') then
flipflops1(0) <= button1;
flipflops1(1) <= flipflops1(0);
if (counter_set1 = '1') then -- reset counter because input is changing
counter_out1 <= (others => '0');
elsif (counter_out1(counter_size) = '0') then -- stable input time is not yet met
counter_out1 <= counter_out1 + 1;
else -- stable input time is met
result1 <= flipflops1(1);
end if;
end if;
end process;
-- Debounce circuit 2
process (CLK)
begin
if (CLK'EVENT and CLK = '1') then
flipflops2(0) <= button2;
flipflops2(1) <= flipflops2(0);
if (counter_set2 = '1') then -- reset counter because input is changing
counter_out2 <= (others => '0');
elsif (counter_out2(counter_size) = '0') then -- stable input time is not yet met
counter_out2 <= counter_out2 + 1;
else -- stable input time is met
result2 <= flipflops2(1);
end if;
end if;
end process;
-- Debounce circuit 3
process (CLK)
begin
if (CLK'EVENT and CLK = '1') then
flipflops3(0) <= button3;
flipflops3(1) <= flipflops3(0);
if (counter_set3 = '1') then -- reset counter because input is changing
counter_out3 <= (others => '0');
elsif (counter_out3(counter_size) = '0') then -- stable input time is not yet met
counter_out3 <= counter_out3 + 1;
else -- stable input time is met
result3 <= flipflops3(1);
end if;
end if;
end process;
end pb;
VHDL has functions but function calls are expressions and not statements or expression statements as in some programming languages. A function call always returns a value of a type and an expression can't represent a portion of a design hierarchy.
Consider the other class of subprogram, procedures, whose calls are statements.
The debouncer processes and associated declarations can also be simplified without using a procedure:
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity pushbutton is
generic (
counter_size: integer := 19 -- The left bound of debounce counters
);
port (
clk: in std_logic;
btn: in std_logic_vector(0 to 3);
an: out std_logic_vector(0 to 3);
led: out std_logic_vector(0 to 3)
);
end entity pushbutton;
architecture pb1 of pushbutton is
-- There are two flip flops for each of four buttons:
subtype buttons is std_logic_vector(0 to 3);
type flip_flops is array (0 to 1) of buttons;
signal flipflops: flip_flops;
signal counter_set: std_logic_vector(0 to 3);
use ieee.numeric_std.all;
type counter is array (0 to 3) of
unsigned(counter_size downto 0);
signal counter_out: counter := (others => (others => '0'));
begin
an <= (others => '1');
counter_set <= flipflops(0) xor flipflops(1);
process (clk)
begin
if rising_edge (clk) then
flipflops(0) <= btn;
flipflops(1) <= flipflops(0);
for i in 0 to 3 loop
if counter_set(i) = '1' then
counter_out(i) <= (others => '0');
elsif counter_out(i)(counter_size) = '0' then
counter_out(i) <= counter_out(i) + 1;
else
led(i) <= flipflops(1)(i);
end if;
end loop;
end if;
end process;
end architecture pb1;
Moving part of the design specification into a procedure:
architecture pb2 of pushbutton is
-- There are two flip flops for each of four buttons:
subtype buttons is std_logic_vector(0 to 3);
type flip_flops is array (0 to 1) of buttons;
signal flipflops: flip_flops;
signal counter_set: std_logic_vector(0 to 3);
use ieee.numeric_std.all;
type counter is array (0 to 3) of
unsigned(counter_size downto 0);
signal counter_out: counter := (others => (others => '0'));
procedure debounce (
-- Can eliminate formals of mode IN within the scope of their declaration:
-- signal counter_set: in std_logic_vector (0 to 3);
-- signal flipflops: in flip_flops;
signal counter_out: inout counter;
signal led: out std_logic_vector(0 to 3)
) is
begin
for i in 0 to 3 loop
if counter_set(i) = '1' then
counter_out(i) <= (others => '0');
elsif counter_out(i)(counter_size) = '0' then
counter_out(i) <= counter_out(i) + 1;
else
led(i) <= flipflops(1)(i);
end if;
end loop;
end procedure;
begin
an <= (others => '1');
counter_set <= flipflops(0) xor flipflops(1);
process (clk)
begin
if rising_edge (clk) then
flipflops(0) <= btn;
flipflops(1) <= flipflops(0);
-- debounce(counter_set, flipflops, counter_out, led);
debounce (counter_out, led);
end if;
end process;
end architecture pb2;
Here the procedure serves as a collection of sequential statements and doesn't save any lines of code.
Sequential procedure calls can be useful to hide repetitious clutter. The clutter has been removed already by consolidating declarations and using the loop statement. There's a balancing act between the design entry effort, code maintenance effort and user readability, which can also be affected by coding style. Coding style is also affected by RTL constructs implying hardware.
Moving the clock evaluation into a procedure would require the procedure call to be a concurrent statement, similar to an instantiation, which you already have. It doesn't seem worthwhile here if you consolidate the signals declared as block declarative items in the architecture body or use a loop statement.
Note that result and button declarations have been eliminated. Also the use of package numeric_std and type unsigned for the counters prevents inadvertent assignment to other objects with the same subtype. The counter values are treated as unsigned numbers while counter_set for instance is not.
Also there's an independent counter for each input being debounced just as in the original. Without independent counters some events might be lost for independent inputs when a single counter is repetitively cleared.
This code hasn't been validated by simulation, lacking a testbench. With the entity both architectures analyze and elaborate.
There doesn't appear to be anything here other than sequential statements now found in a for loop that would benefit from a function call. Because a function call returns a value the type of that value would either need to be a composite (here a record type) or be split into separate function calls for each assignment target.
There's also the generate statement which can elaborate zero or more copies of declarations and concurrent statements (here a process) as block statements with block declarative items. Any signal used only in an elaborated block can be a block declarative item.
architecture pb3 of pushbutton is
begin
DEBOUNCERS: -- a generate statement needs a label; this name is illustrative
for i in btn'range generate
signal flipflops: std_logic_vector (0 to 1);
signal counter_set: std_logic;
signal counter_out: unsigned (counter_size downto 0) :=
(others => '0');
begin
counter_set <= flipflops(0) xor flipflops(1);
process (clk)
begin
if rising_edge (clk) then
flipflops(0) <= btn(i);
flipflops(1) <= flipflops(0);
if counter_set = '1' then
counter_out <= (others => '0');
elsif counter_out(counter_size) = '0' then
counter_out <= counter_out + 1;
else
led(i) <= flipflops(1);
end if;
end if;
end process;
end generate;
end architecture pb3;
The OP pointed out an error made in the above code due to a lack of simulation and complexity hidden by abstraction when synthesizing architecture pb2. While the time for the debounce counter was given at 10.5 ms (50 MHz clock) the name of the generic (counter_size) is also actually the left bound of the counter (given as an unsigned binary counter using type unsigned).
Both the mistake (there should be two flip flops in the synchronizer for each of the four buttons) and the counter declaration, which now simply follows the OP's naming convention, have been corrected in the above code.
The OP's synthesis error in the comment relates to the requirement that each element on the left-hand side of an assignment statement has a matching element on the right-hand side.
Without synthesizing the code (which the OP did) the error can't be found without simulation. Because the only thing necessary to find the particular error assigning flipflops(0) is the clock a simple testbench can be written:
library ieee;
use ieee.std_logic_1164.all;
entity pushbutton_tb is
end entity;
architecture fum of pushbutton_tb is
signal clk: std_logic := '0';
signal btn: std_logic_vector (0 to 3);
signal an: std_logic_vector(0 to 3);
signal led: std_logic_vector(0 to 3);
begin
CLOCK:
process
begin
wait for 0.5 ms;
clk <= not clk;
if now > 50 ms then
wait; -- stop the clock so the simulation ends
end if;
end process;
DUT:
entity work.pushbutton (pb2)
generic map (
counter_size => 4 -- FOR SIMULATION
)
port map (
clk => clk,
btn => btn,
an => an,
led => led
);
STIMULUS:
process
begin
btn <= (others => '0');
wait for 20 ms;
btn(0) <= '1';
wait for 2 ms;
btn(1) <= '1';
wait for 3 ms;
btn(2) <= '1';
wait for 6 ms;
btn(3) <= '1';
wait; -- stimulus is applied once
end process;
end architecture;
The corrected code and a testbench demonstrate that there are no matching element errors in assignment during simulation.
Simulation was provided for both architectures with identical results.
The generic was used to reduce the size of the debounce counters using a 1 millisecond clock in the testbench (to avoid simulation time with 50 MHz clock events that don't add to the narrative).
Here's the output of the first architecture's simulation:
The caution here is that designs should be simulated. There's a class of VHDL semantic error conditions that are checked only at runtime (or in synthesis).
Abstraction added to reduce 'uniquified' but otherwise identical code can introduce such errors.
The generate statement wouldn't have that issue using names in a design hierarchy:
The concurrent statements and declarations found in a generate statement are replicated in any generated block statements implied by the generate statement. Each block statement represents a portion of a design hierarchy.
There's been a trade off between design complexity and waveform display organization for debugging.
A design description depending on hiding repetitious detail should be simulated anyway. Here there are two references to the generate parameter i used in selected names, susceptible to the same errors as ranges should parameter substitution be overlooked.
A multiple bit debouncing circuit might look like this:
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;
use work.Utilities.all;
entity Debouncer is
generic (
CLOCK_PERIOD_NS : positive := 10;
DEBOUNCE_TIME_MS : positive := 3;
BITS : positive
);
port (
Clock : in std_logic;
Input : in std_logic_vector(BITS - 1 downto 0);
Output : out std_logic_vector(BITS - 1 downto 0) := (others => '0')
);
end entity;
architecture rtl of Debouncer is
begin
genBits: for i in Input'range generate
constant DEBOUNCE_COUNTER_MAX : positive := (DEBOUNCE_TIME_MS * 1000000) / CLOCK_PERIOD_NS;
constant DEBOUNCE_COUNTER_BITS : positive := log2(DEBOUNCE_COUNTER_MAX); -- assumed form; log2 is provided by work.Utilities
signal DebounceCounter : signed(DEBOUNCE_COUNTER_BITS downto 0) := to_signed(DEBOUNCE_COUNTER_MAX - 3, DEBOUNCE_COUNTER_BITS + 1);
begin
process (Clock)
begin
if rising_edge(Clock) then
-- restart counter, whenever Input(i) was unstable within DEBOUNCE_TIME_MS
if (Input(i) /= Output(i)) then
DebounceCounter <= DebounceCounter - 1;
else
DebounceCounter <= to_signed(DEBOUNCE_COUNTER_MAX - 3, DebounceCounter'length);
end if;
-- latch input bit, if input was stable for DEBOUNCE_TIME_MS
if (DebounceCounter(DebounceCounter'high) = '1') then
Output(i) <= Input(i);
end if;
end if;
end process;
end generate;
end architecture;
Instead of a counter size, it expects the user to provide a clock frequency (as a period in nanoseconds) and a debounce time (in milliseconds).
The referenced package implements a log2 function.
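The Utilities package itself is not shown in the answer. As a minimal sketch of what such a package might contain, assuming log2 is meant as a ceiling logarithm (the package contents, the function signature and the rounding behavior are all assumptions, not the answerer's actual code):

```vhdl
-- Hypothetical stand-in for the package referenced as work.Utilities.
package Utilities is
  function log2(value : positive) return natural;
end package;

package body Utilities is
  -- Returns the smallest n such that 2**n >= value (ceiling of log2).
  -- Intended for elaboration-time constants; values above 2**30 would
  -- overflow the 32-bit integer range in "p * 2".
  function log2(value : positive) return natural is
    variable n : natural  := 0;
    variable p : positive := 1;
  begin
    while p < value loop
      p := p * 2;
      n := n + 1;
    end loop;
    return n;
  end function;
end package body;
```

With the default generics (10 ns period, 3 ms debounce time) DEBOUNCE_COUNTER_MAX is 300000 and this function returns 19, so a signed counter with bits 19 downto 0 is wide enough to hold the start value and still use its top bit as the "expired" flag.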

Trying to make a 4-bit multiplier in VHDL with 3x4 keypad input and 2x16 LCD to be implemented on a Spartan 3E board

Hello everyone. I'm trying to make a 4-bit multiplier in VHDL. It is to be implemented on a Spartan 3E board using the built-in 2x16 LCD and a 3x4 keypad through a C922 IC.
Using it goes like this: user inputs a number through the keypad, presses a button for confirmation, enters a second number, presses the confirmation button again, and the product shows up on the LCD.
So far, the code for the keypad+C922 and the LCD is okay. The code for the multiplier is almost okay. The problem is that the confirmation button only works when another number (which is not ultimately used) is pressed.
Here is my code. Following it are screenshots of the Xilinx simulation.
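library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL; -- provides "+" on STD_LOGIC_VECTOR
entity multi_4bit is
Port ( CLK : in STD_LOGIC;
RESET : in STD_LOGIC;
DAVBL : in STD_LOGIC; -- data-available strobe from the C922 keypad encoder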
Input : in STD_LOGIC_VECTOR (3 downto 0);
Confirm : in STD_LOGIC;
Output : out STD_LOGIC_VECTOR (7 downto 0));
end multi_4bit;
architecture Behavioral of multi_4bit is
type state is (R,S0,S1,S2,S3,S4);
signal pstate, nstate: state;
signal A_sig, B_sig: STD_LOGIC_VECTOR(3 downto 0);
begin
state_transition: process(CLK,RESET)
begin
if RESET = '1' then
pstate <= R;
elsif rising_edge(CLK) then
pstate <= nstate;
end if;
end process;
nstate_output: process(pstate,DAVBL,Input)
variable temp_var: STD_LOGIC_VECTOR(3 downto 0);
variable tempMult_var,tempProd_var: STD_LOGIC_VECTOR(7 downto 0);
begin
case pstate is
when R =>
nstate <= S0;
tempMult_var := (others => '0');
tempProd_var := (others => '0');
A_sig <= (others => '0');
B_sig <= (others => '0');
Output <= (others => '0');
when S0 =>
nstate <= S0;
if (DAVBL = '1') then
A_sig <= Input;
nstate <= S1;
end if;
when S1 =>
nstate <= S1;
if (Confirm = '1') then
nstate <= S2;
end if;
when S2 =>
nstate <= S2;
if (DAVBL = '1') then
B_sig <= Input;
nstate <= S3;
end if;
when S3 =>
nstate <= S3;
if (Confirm = '1') then
nstate <= S4;
end if;
when S4 =>
nstate <= S0;
for x in 0 to 3 loop
temp_var := (A_sig AND (B_sig(x)&B_sig(x)&B_sig(x)&B_sig(x) ) );
tempMult_var := "0000" & temp_var;
if (x=0) then tempMult_var := tempMult_var;
elsif (x=1) then tempMult_var := tempMult_var(6 downto 0)&"0";
elsif (x=2) then tempMult_var := tempMult_var(5 downto 0)&"00";
elsif (x=3) then tempMult_var := tempMult_var(4 downto 0)&"000";
end if;
tempProd_var := tempProd_var + tempMult_var;
end loop;
Output <= tempProd_var;
tempProd_var := (others => '0');
end case;
end process;
end Behavioral;
The simulation when no number is pressed with the confirmation button:
The simulation when a number is pressed with the confirmation button:
I've been going through my code for an hour now but still can't see what's wrong.
Thanks in advance to anyone who can help.
As mentioned by fru1tbat, Brian, and David, you have a problem with the sensitivity list in the combinatorial portion of your state machine. Specifically, you are missing the Confirm input. Since the Confirm input is not in the sensitivity list, the state machine will not "wake up" to evaluate new outputs/state transitions when confirm changes state. You need to wait for another condition (a different button press) in order to evaluate properly and update the output.
There are multiple ways that this can be resolved.
You can make sure you have a complete sensitivity list for your state machine logic
With VHDL-2008, you can use process(all) to automatically list all necessary inputs in the sensitivity list (your compiler support may vary).
You can choose to use a "single process state machine" style, which avoids this issue. However, different styles of state machines have their strengths and pitfalls that need to be considered.
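To make the first two options concrete, here is a cut-down sketch with only two states, one waiting on DAVBL and one waiting on Confirm. The entity name sens_demo and the waiting_confirm output are invented for illustration; the point is only that every signal read inside the combinatorial process appears in its sensitivity list.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical, cut-down example: two states instead of six.
entity sens_demo is
  port (
    CLK, RESET, DAVBL, Confirm : in  std_logic;
    waiting_confirm            : out std_logic
  );
end entity;

architecture rtl of sens_demo is
  type state is (S0, S1);
  signal pstate, nstate : state;
begin
  state_transition: process (CLK, RESET)
  begin
    if RESET = '1' then
      pstate <= S0;
    elsif rising_edge(CLK) then
      pstate <= nstate;
    end if;
  end process;

  -- Every signal read below is listed: pstate, DAVBL and Confirm.
  -- With VHDL-2008 the header could instead be: process (all)
  nstate_output: process (pstate, DAVBL, Confirm)
  begin
    nstate          <= pstate;  -- default: stay in the current state
    waiting_confirm <= '0';     -- default output, so no latch is inferred
    case pstate is
      when S0 =>
        if DAVBL = '1' then
          nstate <= S1;
        end if;
      when S1 =>
        waiting_confirm <= '1';
        if Confirm = '1' then
          nstate <= S0;
        end if;
    end case;
  end process;
end architecture;
```

The default assignments at the top of the combinatorial process are a separate habit from the sensitivity list, but they address the latch warnings seen with this style of state machine.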

Having troubles with running FSM on Nexys2

I am trying to run a simple FSM that scans the LEDs. I implemented this by shifting the bits to the left, using the & operator. It does not shift at all: only the LSB glows and that is it. I slowed down the clock as well, to 1.5 Hz. Will someone please tell me what's wrong here?
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;
entity scan is
Port (
clk : in STD_LOGIC;
led : out STD_LOGIC_VECTOR (7 downto 0);
reset : in STD_LOGIC
);
end scan;
architecture Behavioral of scan is
Type state is (RESET_ST, S1);
signal n_state : state;
signal c_state : state;
signal input_temp :unsigned (7 downto 0):= "00000001";
begin
--------------------------CURRENT STATE ASSIGNMENT------------------------
STATE_ASSIGNMENT: process (clk, reset)
begin
if (reset = '1') then
c_state <= RESET_ST;
elsif (clk'event and clk = '1') then
c_state <= n_state;
end if;
end process;
----------------------------- INTPUT BLOCK--------------------------------
INPUT_BLOCK : process (c_state)
begin
case (c_state) is
when RESET_ST =>
input_temp <= "00000001";
n_state <= S1;
when S1 =>
input_temp <= input_temp (6 downto 0) & '0';
n_state <= S1;
when others =>
n_state <= RESET_ST;
end case;
end process;
----------------------------- OUTPUT BLOCK--------------------------------
OUTPUT_BLOCK : process (c_state, input_temp)
begin
case (c_state) is
when RESET_ST =>
led <= std_logic_vector (input_temp);
when S1 =>
led <= std_logic_vector (input_temp);
when others =>
led <= (others => '1');
end case;
end process OUTPUT_BLOCK;
end Behavioral;
There are two immediately visible things wrong.
First counter is not declared (comment it out, easy enough).
Second, once in S1 n_state <= S1, in other words you go to S1 and sit there. The consequence of this is that process INPUT_BLOCK doesn't have a trigger event - the sensitivity contains only c_state and there is no further change in c_state.
I'd imagine Brian Drummond would be telling you about now to use one process for your FSM. Essentially input_temp should be changed to something with storage and moved into the clocked process.
You could note there isn't anything to detect when input_temp goes static (all '0's), either.
From your comment:
Okay so if i add the next state i.e. n_state in the sensitivity list,
will it work?
No. If you look at the waveform above n_state always contains S1.
Secondly yeah when it goes all 0's I wont see anything,
but what about the shifting part?
Eventually that one '1' will get lost once it has reached input_temp(7).
I defined the outputs explicitly for
each state, should i put a limit here?
There are three things you could do: 1. let the outputs all go to '0's, 2. recirculate the '1' (a ring counter), or 3. stop with some LED displaying.
If the states were one hot the new 8 states could drive an LED each. – David Koontz 2 hours ago
Could you by any chance show me an example or something? It would help me more to understand
In general this is not the correct venue for teaching basic VHDL or digital design skills, certainly not in a comment thread. It's a venue for asking and answering specific VHDL questions. See How do I ask a good question?
You asked:
Will someone please tell me whats wrong here.
I answered, including a picture.
Right about here you could note that the waveform image above conflicts with the first paragraph of your question:
It does not shift at all only the LSB glows and that is it, i slowed down the clock as well, using 1.5Hz clock.
If you note in the waveform it shifts exactly once, with your code unmodified (other than to remove the assignment to an undeclared counter which you edited out of your question, see your first comment below).
What you have defined is a two state state machine, either reset or shift. It doesn't work because it's not properly written. Essentially it describes an intended shift register (input_temp) that currently shifts left and empties. Your state is a flip flop run off of an asynchronous reset that, when released, simply toggles to the other state and supposedly enables shifts.
Implement an 8 bit shift register that shifts left (or is connected in reverse order) and has a synchronous load (to "00000001") hooked up to reset. 8 clocks later it's all '0's.
There are nine states defined (one for each of the LEDs that are illuminated and one where all LEDS are extinguished) You can add a 10th state by adding that state flip flop in. You could use 10 flip flops for a one hot state machine, 8 flip flops for just a shift register or 9 to include c_state (and the reset holdover).
I could I suppose generate three different architectures for the above two paragraphs but I'm not going to do that.
Here's the simplest implementation with the least amount of change to your code:
architecture foo of scan is
type state is ( RESET_ST, S1 );
signal n_state: state;
signal c_state: state;
-- signal input_temp: unsigned (7 downto 0):= "00000001";
signal shft_reg: std_logic_vector (7 downto 1);
begin
process (clk, reset)
begin
if reset = '1' then
c_state <= RESET_ST;
-- counter <= (others => '0');
shft_reg <= (others => '0');
elsif clk'event and clk = '1' then
c_state <= n_state;
if c_state = RESET_ST then
shft_reg <= shft_reg (6 downto 1) & '1';
elsif shft_reg /= "1000000" then
shft_reg <= shft_reg (6 downto 1) & '0';
end if;
end if;
end process;
--input_block :
process (c_state)
begin
case (c_state) is
when RESET_ST =>
-- input_temp <= "00000001";
n_state <= S1;
when S1 =>
-- input_temp <= input_temp (6 downto 0) & '0';
n_state <= S1;
when others =>
n_state <= RESET_ST;
end case;
end process;
-- output_block:
-- process (c_state, input_temp)
-- begin
-- case (c_state) is
-- when RESET_ST =>
-- led <= std_logic_vector (input_temp);
-- when S1 =>
-- led <= std_logic_vector (input_temp);
-- when others =>
-- led <= (others => '1');
-- end case;
-- end process;
-- LED0_OUT:
-- led(0) <= '1' when c_state = RESET_ST else '0';
process (c_state, shft_reg)
begin
if c_state = RESET_ST then
led(0) <= '1';
else
led(0) <= '0';
end if;
led (7 downto 1) <= shft_reg; -- shft_reg(7 downto 1)
end process;
end architecture foo;
library ieee;
use ieee.std_logic_1164.all;
entity scan_tb is
end entity;
architecture foo of scan_tb is
signal clk: std_logic := '0';
signal reset: std_logic := '1';
signal led: std_logic_vector ( 7 downto 0);
begin
DUT:
entity work.scan
port map (
clk => clk,
led => led,
reset => reset
);
CLOCK:
process
begin
wait for 0.33 sec; -- one half clock period, 1.5 Hz
clk <= not clk;
if Now > 20 sec then
wait; -- stop the clock so the simulation ends
end if;
end process;
STIMULUS:
process
begin
wait until rising_edge(clk);
wait for 0.33 sec;
wait until rising_edge(clk);
reset <= '0';
wait; -- release reset once, then idle
end process;
end architecture;
And here's what the waveforms look like:
Note the Radix for led has been changed in the waveform to binary.
Also note that the first part of the two waveforms match. I also added a shft_reg state recognizer to freeze shft_reg when led(7) is set.
You could also note there's an optimization. The first LED is driven off c_state, the remaining 7 are driven off the 7 bit shift register (shft_reg). Also of note that there are only 8 flip flops used.
And as sonicwave notes in a comment to your question you should really simulate this stuff first, so here's a simple test bench.
This was simulated, using your entity declaration with the use clause for package numeric_std removed (shft_reg is type std_logic_vector), a new architecture foo and the entity/architecture pair for scan_tb using ghdl-0.31:
david_koontz#Macbook: ghdl -a scan.vhdl
david_koontz#Macbook: ghdl -e scan_tb
david_koontz#Macbook: ghdl -r scan_tb --wave=scan_tb.ghw
On a Mac running OS X 10.9.3, Where scan_tb.ghw is a ghdl specific waveform dump file format highly suited for VHDL.
Now, please no more slipping more questions in on the comments to the question you initially asked. Also you could have commented out the assignment to the undeclared signal counter in your sample code instead of editing it out. It ruins the continuity between questions and answers.
The state assignment process can be written without evaluating c_state:
process (clk, reset)
begin
if reset = '1' then
c_state <= RESET_ST;
-- counter <= (others => '0');
shft_reg <= (others => '0');
elsif clk'event and clk = '1' then
c_state <= n_state;
-- if c_state = RESET_ST then
if shft_reg = "0000000" then
shft_reg <= shft_reg (6 downto 1) & '1';
elsif shft_reg /= "1000000" then
shft_reg <= shft_reg (6 downto 1) & '0';
end if;
end if;
end process;
And it does the same thing.
Now comment a bit more of it out:
process (clk, reset)
begin
if reset = '1' then
c_state <= RESET_ST;
-- counter <= (others => '0');
shft_reg <= (others => '0');
elsif clk'event and clk = '1' then
c_state <= n_state;
-- if c_state = RESET_ST then
if shft_reg = "0000000" then
shft_reg <= shft_reg (6 downto 1) & '1';
-- elsif shft_reg /= "1000000" then
else
shft_reg <= shft_reg (6 downto 1) & '0';
end if;
end if;
end process;
And make the same decision change in the LEDOUT process:
process (shft_reg)
begin
if shft_reg = "0000000" then
led(0) <= '1';
else
led(0) <= '0';
end if;
led (7 downto 1) <= shft_reg; -- shft_reg(7 downto 1)
end process;
And you get the scanning LEDs to keep on scanning:
We've switched LED(0) to depend on no shft_reg position being set to '1', that is, on shft_reg being all '0's.

4bit ALU VHDL code

I am writing code for a 4-bit ALU and I have a problem with the shift left operation. I have two inputs (OperandA and OperandB). I want to convert OperandB into decimal (for example "0010" into 2) and then shift OperandA 2 times to the left. My code compiles, but I am not sure that it is correct. Thank you in advance.
entity ALU is
port (
reset_n : in std_logic;
clk : in std_logic;
OperandA : in std_logic_vector(3 downto 0);
OperandB : in std_logic_vector(3 downto 0);
Operation : in std_logic_vector(2 downto 0);
Start : in std_logic;
Result_Low : out std_logic_vector(3 downto 0);
Result_High : out std_logic_vector(3 downto 0);
Ready : out std_logic;
Errorsig : out std_logic);
end ALU;
architecture behavior of ALU is
signal loop_nr : integer range 0 to 15;
begin
process (reset_n, clk, operation)
variable tempHigh : std_logic_vector(4 downto 0);
begin
if (reset_n = '0') then
Result_Low <= (others => '0');
Result_High <= (others => '0');
Errorsig <= '0';
elsif (clk'event and clk = '1') then
case operation is
when "001" =>
for i in 0 to loop_nr loop
loop_nr <= to_integer(unsigned(OperandB));
Result_Low <= OperandA(2 downto 0)&'0';
Result_High <= tempHigh(2 downto 0) & OperandA(3);
end loop;
Ready <= '1';
Errorsig <= '0';
when "010" =>
Result_Low <= OperandB(0)& OperandA(3 downto 1);
Result_High <= OperandB(3 downto 1);
Ready <= '1';
when others =>
Result_Low <= (others => '0');
ready <= '0';
Errorsig <= '0';
end case;
end if;
end process;
end behavior;
For shifting left twice the syntax should be the following:
A <= A sll 2; -- left shift logical 2 bits
I don't quite understand why it is required to convert OperandB to decimal. A value can be used as binary, decimal, or for that matter hexadecimal at any point, irrespective of the base it was written in.
The operator sll may not always work as expected before VHDL-2008, so
consider instead using functions from ieee.numeric_std for shifting, like:
y <= std_logic_vector(shift_left(unsigned(OperandA), to_integer(unsigned(OperandB))));
Note also that Result_High is declared in the port as std_logic_vector(3 downto
0), but is assigned in line 41 as Result_High <= OperandB(3 downto 1), so the
assigned value is one bit shorter than the target.
The code above assumes that ieee.numeric_std is used.
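As a rough sketch of how the numeric_std suggestion could fill both result nibbles, the operand can be widened to 8 bits before shifting. The entity name shl_sketch and the internal signal wide are made up for illustration and are not part of the original design:

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical stand-alone entity, combinational only.
entity shl_sketch is
    port (
        OperandA    : in  std_logic_vector(3 downto 0);
        OperandB    : in  std_logic_vector(3 downto 0);
        Result_Low  : out std_logic_vector(3 downto 0);
        Result_High : out std_logic_vector(3 downto 0)
    );
end entity;

architecture rtl of shl_sketch is
    signal wide : unsigned(7 downto 0);  -- assumed helper signal
begin
    -- Widen OperandA to 8 bits first, so bits shifted out of the low
    -- nibble land in the high nibble instead of being discarded.
    wide <= shift_left(resize(unsigned(OperandA), 8),
                       to_integer(unsigned(OperandB)));

    Result_Low  <= std_logic_vector(wide(3 downto 0));
    Result_High <= std_logic_vector(wide(7 downto 4));
end architecture;
```

With OperandA = "1100" and OperandB = "0010" this yields "0011" in Result_High and "0000" in Result_Low. Shift distances of 8 or more give all zeros.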
The reason you've been urged to use the likes of sll is because in general
synthesis tools don't support loop statements with non-static bounds
(loop_nr). Loops are unfolded which requires a static value to determine how
many loop iterations are unfolded (how much hardware to generate).
As Morten points out, your code doesn't analyze, contrary to your assertion
that it compiles.
After inserting the following four lines at the beginning of your code we see
an error at line 41:
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
--(blank, a spacer that doesn't show up in the code highlighter)
ghdl -a ALU.vhdl
ALU.vhdl:41:26: length of value does not match length of target
ghdl: compilation error
Which looks like
Result_High <= '0' & OperandB(3 downto 1);
was intended in the case statement, choice "010" (an srl equivalent hard
coded to a distance of 1, presumably to match the correct behavior of the sll
equivalent). After which your design description analyzes.
Further there are other algorithm description errors not reflected in VHDL
syntax or semantic errors.
Writing a simple test bench:
library ieee;
use ieee.std_logic_1164.all;
entity alu_tb is
end entity;
architecture foo of alu_tb is
signal reset_n: std_logic := '0';
signal clk: std_logic := '0';
signal OperandA: std_logic_vector(3 downto 0) :="1100"; -- X"C"
signal OperandB: std_logic_vector(3 downto 0) :="0010"; -- 2
signal Operation: std_logic_vector(2 downto 0):= "001"; -- shift left
signal Start: std_logic; -- Not currently used
signal Result_Low: std_logic_vector(3 downto 0);
signal Result_High: std_logic_vector(3 downto 0);
signal Ready: std_logic;
signal Errorsig: std_logic;
begin
DUT:
entity work.ALU
port map (
reset_n => reset_n,
clk => clk,
OperandA => OperandA,
OperandB => OperandB,
Operation => Operation,
Start => Start,
Result_Low => Result_Low,
Result_High => Result_High,
Ready => Ready,
Errorsig => Errorsig
);
CLOCK:
process
begin
wait for 10 ns;
clk <= not clk;
if Now > 100 ns then
wait; -- stop the clock so the simulation ends
end if;
end process;
STIMULUS:
process
begin
wait for 20 ns;
reset_n <= '1';
wait; -- release reset once, then suspend forever
end process;
end architecture;
Gives us a demonstration:
The first thing that sticks out is that Result_High gets some 'U's. This is
caused by tempHigh not being initialized or assigned.
The next thing to notice is that the shift result is wrong (both Result_Low
and Result_High). I'd expect you'd want a "0011" in Result_High and "0000" in
Result_Low.
You see the result of exactly one left shift - ('U','U','U','1') in
Result_High and "1000" in Result_Low.
This is caused by executing the loop statement with no intervening passage of
simulation time. Signal assignments do not take effect until the process
suspends, so every iteration of the loop reads the same old values. In a
process statement there is only one driver for each signal. The net effect is
that there is only one future value for the current simulation time, and the
last value assigned is the one that is scheduled in the projected output
waveform. (Essentially, the assignment in the loop statement to a signal
occurs once, and because successive shifts depend on the earlier assignment
having taken effect, it looks like there was only one assignment.)
There are two ways to fix this behavior. The first is to use variables
assigned inside the loop and assign the corresponding signals to the
variables following the loop statement. As noted before, the loop bound isn't
static and you can't synthesize the loop.
The second way is to eliminate the loop by executing the shift assignments
sequentially. Essentially 1 shift per clock, signaling Ready after the last
shift occurs.
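The second approach (one shift per clock) might look roughly like the sketch below. It is only an illustration under stated assumptions: the entity name seq_shl_sketch and the signals shreg, remaining and busy are invented here, and Start is assumed to be a one-clock pulse that launches an operation (the original code does not use Start yet).

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical sequential shifter: one left shift per clock edge.
entity seq_shl_sketch is
    port (
        clk         : in  std_logic;
        reset_n     : in  std_logic;
        Start       : in  std_logic;  -- assumed: pulse starts a shift
        OperandA    : in  std_logic_vector(3 downto 0);
        OperandB    : in  std_logic_vector(3 downto 0);
        Result_Low  : out std_logic_vector(3 downto 0);
        Result_High : out std_logic_vector(3 downto 0);
        Ready       : out std_logic
    );
end entity;

architecture rtl of seq_shl_sketch is
    signal shreg     : std_logic_vector(7 downto 0);  -- high & low nibble
    signal remaining : unsigned(3 downto 0);          -- shifts still to do
    signal busy      : std_logic;
begin
    process (reset_n, clk)
    begin
        if reset_n = '0' then
            shreg     <= (others => '0');
            remaining <= (others => '0');
            busy      <= '0';
            Ready     <= '0';
        elsif rising_edge(clk) then
            if busy = '0' then
                if Start = '1' then
                    -- Load the operand and the shift distance.
                    shreg     <= "0000" & OperandA;
                    remaining <= unsigned(OperandB);
                    busy      <= '1';
                    Ready     <= '0';
                end if;
            elsif remaining /= 0 then
                -- Exactly one shift per clock.
                shreg     <= shreg(6 downto 0) & '0';
                remaining <= remaining - 1;
            else
                -- Last shift has occurred: signal completion.
                busy  <= '0';
                Ready <= '1';
            end if;
        end if;
    end process;

    Result_Low  <= shreg(3 downto 0);
    Result_High <= shreg(7 downto 4);
end architecture;
```

The cost of this form is latency: a shift by n takes n clocks plus the load and completion cycles, in exchange for a loop-free description with static hardware.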
There's also a way to sidestep the static bounds issue for loops by using a
case statement (or, in VHDL-2008, a sequential conditional signal assignment
or sequential selected signal assignment, should your synthesis tool vendor
support them). This has the advantage of operating in one clock.
Note all of these require having an integer variable holding the shift distance.
And all of this can be side stepped when your synthesis tool vendor supports
sll (and srl for the other case) or SHIFT_LEFT and SHIFT_RIGHT from package
numeric_std, and you are allowed to use them.
A universal (pre VHDL 2008) fix without using sll or SHIFT_LEFT might be:
process (reset_n, clk, operation)
variable tempHigh : std_logic_vector(4 downto 0);
variable loop_int: integer range 0 to 15;
begin
if (reset_n = '0') then
Result_Low <= (others => '0');
Result_High <= (others => '0');
Errorsig <= '0';
elsif (clk'event and clk = '1') then
case operation is
when "001" =>
loop_int := to_integer(unsigned(OperandB));
case loop_int is
when 0 =>
Result_Low <= OperandA;
Result_High <= (others => '0');
when 1 =>
Result_Low <= OperandA(2 downto 0) & '0';
Result_High <= "000" & OperandA(3);
when 2 =>
Result_Low <= OperandA(1 downto 0) & "00";
Result_High <= "00" & OperandA(3 downto 2);
when 3 =>
Result_Low <= OperandA(0) & "000";
Result_High <= "0" & OperandA(3 downto 1);
when 4 =>
Result_Low <= (others => '0');
Result_High <= OperandA(3 downto 0);
when 5 =>
Result_Low <= (others => '0');
Result_High <= OperandA(2 downto 0) & '0';
when 6 =>
Result_Low <= (others => '0');
Result_High <= OperandA(1 downto 0) & "00";
when 7 =>
Result_Low <= (others => '0');
Result_High <= OperandA(0) & "000";
when others =>
Result_Low <= (others => '0');
Result_High <= (others => '0');
end case;
-- for i in 0 to loop_nr loop
-- loop_nr <= to_integer(unsigned(OperandB));
-- Result_Low <= OperandA(2 downto 0)&'0';
-- Result_High <= tempHigh(2 downto 0) & OperandA(3);
-- end loop;
Ready <= '1';
Errorsig <= '0';
Which gives:
The right answer (all without using signal loop_nr).
Note that all the choices in the case statement aren't covered by the simple
test bench.
And of course, like most things, there are more than two ways to get the desired result.
You could use successive 2 to 1 multiplexers for both Result_High and
Result_Low, with each stage fed from the output of the previous stage (or
OperandA for the first stage) as the A input, the select being the appropriate
'bit' from OperandB, and the B input to the multiplexers being the previous
stage output shifted logically ('0' filled) by that bit's weight (1, 2, 4 or 8).
The multiplexers can be functions, components or procedure statements. By
using a three to one multiplexer you can implement both symmetrical shift
operations specified by Operation (left and right). Should you want to include signed shifts,
instead of '0' filled right shifts you can fill with the sign bit value. ...
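A minimal sketch of the multiplexer-stage idea for the left shift only, using a function as the multiplexer. The names barrel_shl_sketch, mux2 and s0 to s4 are hypothetical, and the sketch assumes each stage shifts by the weight of its select bit (1, 2, 4, 8) on a combined 8-bit high/low word:

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical combinational barrel shifter built from 2-to-1 muxes.
entity barrel_shl_sketch is
    port (
        OperandA    : in  std_logic_vector(3 downto 0);
        OperandB    : in  std_logic_vector(3 downto 0);
        Result_Low  : out std_logic_vector(3 downto 0);
        Result_High : out std_logic_vector(3 downto 0)
    );
end entity;

architecture rtl of barrel_shl_sketch is
    subtype word is std_logic_vector(7 downto 0);

    -- 2-to-1 multiplexer: returns b when sel = '1', otherwise a.
    function mux2 (sel : std_logic; a, b : word) return word is
    begin
        if sel = '1' then
            return b;
        else
            return a;
        end if;
    end function;

    signal s0, s1, s2, s3, s4 : word;
begin
    -- Stage inputs: A is the previous stage, B is that stage shifted.
    s0 <= "0000" & OperandA;
    s1 <= mux2(OperandB(0), s0, s0(6 downto 0) & '0');     -- shift by 1
    s2 <= mux2(OperandB(1), s1, s1(5 downto 0) & "00");    -- shift by 2
    s3 <= mux2(OperandB(2), s2, s2(3 downto 0) & "0000");  -- shift by 4
    s4 <= mux2(OperandB(3), s3, "00000000");               -- shift by 8

    Result_Low  <= s4(3 downto 0);
    Result_High <= s4(7 downto 4);
end architecture;
```

Because every stage has a fixed shift distance, nothing here depends on a non-static loop bound, and the whole shift resolves combinationally.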
You should also be assigning Ready <= '0' for those cases where valid
successive Operation values can be dispatched.
And because your comment on one of the answers requires the use of a loop with an integer value:
process (reset_n, clk, operation)
variable tempHigh : std_logic_vector(4 downto 0);
variable tempLow: std_logic_vector(3 downto 0); --added
variable loop_int: integer range 0 to 15; --added
begin
if (reset_n = '0') then
Result_Low <= (others => '0');
Result_High <= (others => '0');
Errorsig <= '0';
elsif (clk'event and clk = '1') then
case operation is
when "001" =>
tempLow := OperandA; --added
tempHigh := (others => '0'); --added
loop_int := to_integer(unsigned(OperandB)); --added
-- for i in 0 to loop_nr loop
-- loop_nr <= to_integer(unsigned(OperandB));
-- Result_Low <= OperandA(2 downto 0)&'0';
-- Result_High <= tempHigh(2 downto 0) & OperandA(3);
-- end loop;
-- More added:
if loop_int /= 0 then
for i in 1 to loop_int loop
tempHigh (3 downto 0) := tempHigh (2 downto 0) & tempLow(3);
-- 'read' tempLow(3) before it's updated
tempLow := tempLow(2 downto 0) & '0';
end loop;
Result_Low <= tempLow;
Result_High <= tempHigh(3 downto 0);
else
Result_Low <= OperandA;
Result_High <= (others => '0');
end if;
Ready <= '1';
Errorsig <= '0';
Which gives:
And to demonstrate both halves of Result are working OperandA's default value has been changed to "0110":
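Also notice the loop starts at 1 instead of 0 to prevent an extra shift, and there's a check for non-zero loop_int so the zero-shift case is handled explicitly. (Strictly, a for loop over the null range 1 to 0 executes zero times, so the check is a readability choice rather than a necessity.)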
And is it possible to make a synthesizable loop in these circumstances?
The loop has to address all possible shifts (the range of loop_int) and test whether or not i falls under the shift threshold:
process (reset_n, clk, operation)
variable tempHigh : std_logic_vector(4 downto 0);
variable tempLow: std_logic_vector(3 downto 0); --added
subtype loop_range is integer range 0 to 15;
variable loop_int: integer range 0 to 15; --added
begin
if (reset_n = '0') then
Result_Low <= (others => '0');
Result_High <= (others => '0');
Errorsig <= '0';
elsif (clk'event and clk = '1') then
case operation is
when "001" =>
tempLow := OperandA; --added
tempHigh := (others => '0'); --added
loop_int := to_integer(unsigned(OperandB)); --added
for i in loop_range loop
if i < loop_int then
tempHigh (3 downto 0) := tempHigh (2 downto 0) & tempLow(3);
-- 'read' tempLow(3) before it's updated
tempLow := tempLow(2 downto 0) & '0';
end if;
end loop;
Result_Low <= tempLow;
Result_High <= tempHigh(3 downto 0);

How to Rewrite FSM not to use Latches

I have an FSM and it works. The synthesizer, however, complains that there are latches for "acc_x", "acc_y", and "data_out" and I understand why and why it is bad. I have no idea, however, how to rewrite the FSM so the state-part goes to the clocked process. Any ideas where to start from? Here is the code of the FSM:
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;
entity storage is
port (
clk_in : in std_logic;
reset : in std_logic;
element_in : in std_logic;
data_in : in signed(11 downto 0);
addr : in unsigned(9 downto 0);
add : in std_logic; -- add = '1' means add to RAM
-- add = '0' means write to RAM
dump : in std_logic;
element_out : out std_logic;
data_out : out signed(31 downto 0)
);
end storage;
architecture rtl of storage is
component bram is
port (
clk : in std_logic;
we : in std_logic;
en : in std_logic;
addr : in unsigned(9 downto 0);
di : in signed(31 downto 0);
do : out signed(31 downto 0)
);
end component bram;
type state is (st_startwait, st_add, st_write);
signal current_state : state := st_startwait;
signal next_state : state := st_startwait;
signal we : std_logic;
signal en : std_logic;
signal di : signed(31 downto 0);
signal do : signed(31 downto 0);
signal acc_x : signed(31 downto 0);
signal acc_y : signed(31 downto 0);
begin
ram : bram port map
(
clk => clk_in,
we => we,
en => en,
addr => addr,
di => di,
do => do
);
process (clk_in)
begin
if rising_edge(clk_in) then
if (reset = '1') then
current_state <= st_startwait;
else
current_state <= next_state;
end if;
end if;
end process;
process(current_state, element_in, add, dump, data_in, do, acc_x, acc_y)
begin
element_out <= '0';
en <= '1';
we <= '0';
di <= (others => '0');
case current_state is
when st_startwait =>
if (element_in = '1') then
acc_x <= resize(data_in, acc_x'length);
next_state <= st_add;
else
next_state <= st_startwait;
end if;
when st_add =>
if (add = '1') then
acc_y <= acc_x + do;
else
acc_y <= acc_x;
end if;
next_state <= st_write;
when st_write =>
if (dump = '1') then
data_out <= acc_y;
element_out <= '1';
di <= acc_y;
we <= '1';
end if;
next_state <= st_startwait;
end case;
end process;
end rtl;
This is personal preference, but I think most people on here will agree with me on this one... do not use two processes to control your state machine. The whole previous_state next_state thing is total garbage in my opinion. It's really confusing and it tends to make latches - SURPRISE - You found that out. Try rewriting your state machine with a single clocked process and only one state machine signal.
Here's my attempt at rewriting your state machine. Note that I'm not sure the functionality that I have below will work for you. Simulate it to make sure it behaves the way you expect. For example the signal en is always tied to '1', not sure if you want that...
process (clk_in)
begin
if rising_edge(clk_in) then
element_out <= '0';
en <= '1'; -- this is set to 1 always?
we <= '0';
di <= (others => '0');
case state is
when st_startwait =>
if (element_in = '1') then
acc_x <= resize(data_in, acc_x'length);
state <= st_add;
end if;
when st_add =>
if (add = '1') then
acc_y <= acc_x + do;
else
acc_y <= acc_x;
end if;
state <= st_write;
when st_write =>
if (dump = '1') then
data_out <= acc_y;
element_out <= '1';
di <= acc_y;
we <= '1';
end if;
state <= st_startwait;
end case;
end if;
end process;
The reason for the inferred latches is that the case in the last process does
not drive all signals in all possible combinations of the sensitive signals.
So the process can finish without altering some of the output data for some of
the signal values to the process. To hold output in this way is the operation
of a latch, so latches are therefore inferred by the synthesis tool.
The latches apply only to acc_x, acc_y, and data_out, since all other
signals are assigned a default value in the beginning of the process.
You can fix this either by driving a default value for the last three signals
at the beginning of the process, for example 'X' for all bits to give the
synthesis tool freedom:
data_out <= (others => 'X');
acc_x <= (others => 'X');
acc_y <= (others => 'X');
Alternatively, you can ensure that all outputs are driven in all branches of
the case, and you should then also add a when others => branch to the case.
I suggest that you assign a default value to all signals, since this is
easier to write and maintain than keeping track of assignments to all driven
signals in all branches of the case.
Clearly you need supplemental (clocked) registers (D flip flops).
You need to ask yourself "what happens to acc_x if the FSM is in (let's say) state st_add?". Your answer is "I don't want to modify acc_x in this state". So write it explicitly, using clocked registers (such as the one used for state; you can augment the clocked process with these supplemental registers). Do that everywhere. That is the rule. Otherwise, synthesizers will infer transparent latches to memorize the previous value of acc_x, but these transparent latches violate synchronous design principles. They structurally imply combinatorial loops in your design, which are bad.
Put another way: ask yourself what is combinatorial and where the registers are. If you have registers in mind, code them explicitly. Do not have combinatorial signals that are both assigned and read in the same process.
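To make the "explicit registers" advice concrete, here is a cut-down sketch, not a drop-in replacement. It assumes a hypothetical entity storage_sketch in which the BRAM and its write path are left out, ram_do stands in for the RAM output do, and element_out is registered so that it lines up with the registered data_out (one clock later than in the original combinational version):

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical reduced entity: accumulators and dump output only.
entity storage_sketch is
    port (
        clk_in      : in  std_logic;
        reset       : in  std_logic;
        element_in  : in  std_logic;
        data_in     : in  signed(11 downto 0);
        add         : in  std_logic;
        dump        : in  std_logic;
        ram_do      : in  signed(31 downto 0);  -- assumed RAM read data
        element_out : out std_logic;
        data_out    : out signed(31 downto 0)
    );
end entity;

architecture rtl of storage_sketch is
    type state_t is (st_startwait, st_add, st_write);
    signal current_state, next_state : state_t := st_startwait;
    signal acc_x, acc_y : signed(31 downto 0) := (others => '0');
begin
    -- Combinational process: next-state logic only. next_state is
    -- assigned in every branch, so nothing has to be remembered here.
    process (current_state, element_in)
    begin
        case current_state is
            when st_startwait =>
                if element_in = '1' then
                    next_state <= st_add;
                else
                    next_state <= st_startwait;
                end if;
            when st_add =>
                next_state <= st_write;
            when st_write =>
                next_state <= st_startwait;
        end case;
    end process;

    -- Clocked process: state plus the data registers. A signal that is
    -- not assigned on a given edge holds its value in a flip-flop.
    process (clk_in)
    begin
        if rising_edge(clk_in) then
            element_out <= '0';  -- default: strobe low
            if reset = '1' then
                current_state <= st_startwait;
                acc_x    <= (others => '0');
                acc_y    <= (others => '0');
                data_out <= (others => '0');
            else
                current_state <= next_state;
                case current_state is
                    when st_startwait =>
                        if element_in = '1' then
                            acc_x <= resize(data_in, acc_x'length);
                        end if;
                    when st_add =>
                        if add = '1' then
                            acc_y <= acc_x + ram_do;
                        else
                            acc_y <= acc_x;
                        end if;
                    when st_write =>
                        if dump = '1' then
                            data_out    <= acc_y;
                            element_out <= '1';
                        end if;
                end case;
            end if;
        end if;
    end process;
end architecture;
```

Whether the one-clock delay on data_out and element_out is acceptable depends on the surrounding design, so simulate before adopting this structure.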
