I need to create a 4 bit multiplier as a part of a 4-bit ALU in VHDL code, however the requirement is that we have to use repeated addition, meaning if A is one of the four bit number and B is the other 4 bit number, we would have to add A + A + A..., B number of times. I understand this requires either a for loop or a while loop while also having a temp variable to store the values, but my code just doesn't seem to be working and I just don't really understand how the functionality of it would work.
PR and T are temporary buffer standard logic vectors and A and B are the two input 4 bit numbers and C and D are the output values, but the loop just doesn't seem to work. I don't understand how to loop it so it keeps adding the A bit B number of times and thus do the multiplication of A * B.
WHEN "010" =>
PR <= "00000000";
T <= "0000";
WHILE(T < B)LOOP
PR <= PR + A;
T <= T + 1;
END LOOP;
C <= PR(3 downto 0);
D <= PR(7 downto 4);
This will never work, because when a line with a signal assignment (<=) like this one:
PR <= PR + A;
is executed, the target of the signal assignment (PR in this case) is not updated immediately; instead an event (a future change) is scheduled. When is this event (change) actioned? When all processes have suspended (reached wait statements or end process statements).
So, your loop:
WHILE(T < B)LOOP
PR <= PR + A;
T <= T + 1;
END LOOP;
just schedules more and more events on PR and T, but these events never get actioned because the process is still executing. There is more information here.
So, what's the solution to your problem? Well, it depends what hardware you are trying to achieve. Are you trying to achieve a block of combinational logic? Or sequential? (where the multiply takes multiple clock cycles)
I advise you to try not to think in terms of "temporary variables", "for loops" and "while loops". These are software constructions that can be useful, but ultimately you are designing a piece of hardware. You need to try to think about what physical pieces of hardware can be connected together to achieve your design, then how you might describe them using VHDL. This is difficult at first.
You should provide more information about what exactly you want to achieve (and on what kind of hardware) to increase the probability of getting a good answer.
You don't mention whether your multiplier needs to operate on signed or unsigned inputs. Let's assume signed, because that's a bit harder.
As has been noted, this whole exercise makes little sense if implemented combinationally, so let's assume you want a clocked (sequential) implementation.
You also don't mention how often you expect new inputs to arrive. This makes a big difference in the implementation. I don't think either one is necessarily more difficult to write than the other, but if you expect frequent inputs (e.g. every clock cycle), then you need a pipelined implementation (which uses more hardware). If you expect infrequent inputs (e.g. every 16 or more clock cycles) then a cheaper serial implementation should be used.
Let's assume you want a serial implementation, then I would start somewhere along these lines:
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity loopy_mult is
generic(
g_a_bits : positive := 4;
g_b_bits : positive := 4
);
port(
clk : in std_logic;
srst : in std_logic;
-- Input
in_valid : in std_logic;
in_a : in signed(g_a_bits-1 downto 0);
in_b : in signed(g_b_bits-1 downto 0);
-- Output
out_valid : out std_logic;
out_ab : out signed(g_a_bits+g_b_bits-1 downto 0)
);
end loopy_mult;
architecture rtl of loopy_mult is
signal a : signed(g_a_bits-1 downto 0);
signal b_sign : std_logic;
signal countdown : unsigned(g_b_bits-1 downto 0);
signal sum : signed(g_a_bits+g_b_bits-1 downto 0);
begin
mult_proc : process(clk)
begin
if rising_edge(clk) then
if srst = '1' then
out_valid <= '0';
countdown <= (others => '0');
else
if in_valid = '1' then -- (Initialize)
-- Record the value of A and sign of B for later
a <= in_a;
b_sign <= in_b(g_b_bits-1);
-- Initialize countdown
if in_b(g_b_bits-1) = '0' then
-- Input B is positive
countdown <= unsigned(in_b);
else
-- Input B is negative
countdown <= unsigned(-in_b);
end if;
-- Initialize sum
sum <= (others => '0');
-- Set the output valid flag if we're already finished (B=0)
if in_b = 0 then
out_valid <= '1';
else
out_valid <= '0';
end if;
elsif countdown > 0 then -- (Loop)
-- Let's assume the target is an FPGA with efficient add/sub
if b_sign = '0' then
sum <= sum + a;
else
sum <= sum - a;
end if;
-- Set the output valid flag when we get to the last loop
if countdown = 1 then
out_valid <= '1';
else
out_valid <= '0';
end if;
-- Decrement countdown
countdown <= countdown - 1;
else
-- (Idle)
out_valid <= '0';
end if;
end if;
end if;
end process mult_proc;
-- Output
out_ab <= sum;
end rtl;
This is not immensely efficient, but is intended to be relatively easy to read and understand. There are many, many improvements you could make depending on your requirements.
Related
I want to access specific elements of a vector and add their values in one single clock. The cache inputs have been written prior to this calc, so they can be accessed immediately.
type cache_type is array (89 downto 0) of std_logic_vector(14 downto 0);
signal input_cache : cache_type := (others => (others => '0'));
signal cluster_square_sum : integer := 0;
calc: process (clk)
begin
if rising_edge(clk) and cache_ready then
cluster_square_sum <= conv_integer(input_cache(5)) +
conv_integer(input_cache(6)) + conv_integer(input_cache(7)) +
conv_integer(input_cache(12)) + conv_integer(input_cache(13)) +
conv_integer(input_cache(14)) + conv_integer(input_cache(15)) + ...
end if;
end process;
How can I implement this behavior without writing all desired elements explicitly in the addition? I've thought about a variable iterator in the process, but it would not work because of the single clock computation.
Actually, I don't need a process in the first place, I could write my summation directly in the architecture with a "when statement"
cluster_square_sum <= conv_integer(input_cache(5)) +
conv_integer(input_cache(6)) + conv_integer(input_cache(7)) +
conv_integer(input_cache(12)) + conv_integer(input_cache(13)) +
conv_integer(input_cache(14)) + conv_integer(input_cache(15)) + ...
when cache_ready <= '1';
Unfortunately, I don't know a way in implementing my behavior in one of the ways.
Several things about your question are unclear, but one thing about the question is clear : better design would let you use the type system instead of fighting it every line.
One thing that's unclear : what are you storing in the cache? Is it exclusively reserved for numbers, or is this a CPU cache which can store numbers, text, instructions etc?
If it's exclusively for numeric data then
type cache_type is array (89 downto 0) of natural range 0 to 2**14 - 1;
will simplify your sum a lot. Or the equivalent signed integer range, whichever is appropriate for your application. This also makes the design intent clear. Or you can use signed or unsigned from numeric_std.
cluster_square_sum <= input_cache(5) + input_cache(6) + input_cache(7) + ...
when cache_ready <= '1';
It's also unclear what subset of cache you are summing over, especially since this isn't an [MCVE] and appears to cover a non-contiguous range. So I'll just have to guess it's a fixed continuous subset; you could employ a loop.
calc: process (clk)
variable running_total : natural;
begin
if rising_edge(clk) then
if cache_ready then
running_total := 0;
for i in lower_bound to upper_bound loop
running_total := running_total + input_cache(6)(i);
end loop;
cluster_square_sum <= running_total;
end if;
end if;
end process;
The variable allows instant assignment (variable assignment semantics) rather than postponed assignment (signal assignment semantics) so you get a running total.
If the bounds aren't constant (or generics), or the range isn't contiguous, this strategy needs some modification.
Another unclear thing is your speed (latency and throughput) and logic size goals. This will perform every addition in a single cycle, but that cycle is likely to be rather slow. If you potentially need an output every clock cycle, with a realistic clock rate, you'll have to pipeline it.
Or if you have plenty of cycles after cache_ready before you need an output, embed the loop and computation in a state machine triggered by cache_ready. This can perform one addition (and loop iteration) per clock cycle, allowing input_cache to be a Block Ram rather than individually accessible registers, for a very small and fast-clocked design. (And in that case, with only one addition per clock, you can safely use either a variable or a signal for running_total.
In all cases, I would suggest to use a function.
There doesn't seem to be any logic in the elements you want to add, so I made the functions accept a set of element indices.
"Minimal" example:
library IEEE;
use IEEE.std_logic_1164.all;
entity e is
port(clk: in std_logic);
end entity;
architecture a of e is
use IEEE.numeric_std.all;
type cache_type is array(0 to 89) of std_logic_vector(14 downto 0);
signal input_cache : cache_type := (
5 => "000010101010101",
6 => "000001010101010",
7 => "000000101010101",
12 => "000000010101010",
13 => "000000001010101",
14 => "000000000101010",
15 => "000000000010101",
others => (others => '0'));
type integer_array is array(natural range <>) of integer;
constant elements_to_add : integer_array := (5, 6, 7, 12, 13, 14, 15);
signal cluster_square_sum : integer := 0;
function add_elements(input_array : cache_type; element_idxs: integer_array) return integer is
variable output : integer := 0;
begin
for idx in element_idxs'low to element_idxs'high loop
output := output + to_integer(unsigned(input_array(element_idxs(idx))));
end loop;
return output;
end;
begin
calc: process(clk)
begin
if rising_edge(clk) then
cluster_square_sum <= add_elements(input_cache, elements_to_add);
end if;
end process;
end architecture;
Note that you could also do the process-less/direct assignment with this function.
Plus testbench:
entity e_tb is end entity;
library IEEE;
architecture a of e_tb is
use IEEE.std_logic_1164.all;
signal clk : std_logic := '0';
begin
UUT : entity work.e port map (clk => clk);
test: process begin
wait for 1 ns;
clk <= '1';
wait for 1 ns;
clk <= '0';
wait;
end process;
end architecture;
Please note that it if you want to implement this in an FPGA, you will require quite some resources and lose performance. It would then be better to pipeline your design: use multiple clock cycles to add the elements.
This is a branch off of a separate question I asked. I am going to explain more in depth on what I am trying to do and what it is not liking. This is a school project and doesn't need to follow standards.
I am attempting to make the SIMON game. Right now, what I am trying to do is use a switch case for levels and each level is supposed to be faster (hence different frequency dividers). The first level is supposed to be the first frequency and a pattern of LEDs is supposed to light up and disappear. Before I put in a switch case, the first level was by itself (no second level stuff) and it lit up and disappeared like it should. I also used compare = 0 in order to compare in output to an input. (The user is supposed to flip up the switches in the light pattern they saw). This worked when the first level was by itself but now that it is in a switch case, it doesn't like compare. I'm not sure how to get around that in order to compare an output to an input.
The errors I am getting are similar to before:
Error (10821): HDL error at FP.vhd(75): can't infer register for "compare" because its behavior does not match any supported register model
Error (10821): HDL error at FP.vhd(75): can't infer register for "count[0]" because its behavior does not match any supported register model
Error (10821): HDL error at FP.vhd(75): can't infer register for "count[1]" because its behavior does not match any supported register model
Error (10821): HDL error at FP.vhd(75): can't infer register for "count[2]" because its behavior does not match any supported register model
Error (10822): HDL error at FP.vhd(80): couldn't implement registers for assignments on this clock edge
Error (10822): HDL error at FP.vhd(102): couldn't implement registers for assignments on this clock edge
Error (12153): Can't elaborate top-level user hierarchy
I also understand that it doesn't like the rising_edge(toggle) but I need that in order to make the LED pattern light up and disappear.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use ieee.std_logic_unsigned.all;
entity FP is
port(
clk, reset : in std_logic;
QF : out std_logic_vector (3 downto 0);
checkbtn : in std_logic;
Switch : in std_logic_vector(3 downto 0);
sel : in std_logic_vector (1 downto 0);
score : out std_logic_vector (6 downto 0)
);
end FP;
architecture behavior of FP is
signal time_count: integer:=0;
signal toggle : std_logic;
signal toggle1 : std_logic;
signal count : std_logic_vector (2 downto 0);
signal seg : std_logic_vector (3 downto 0);
signal compare : integer range 0 to 1:=0;
type STATE_TYPE is (level1, level2);
signal level : STATE_TYPE;
--signal input : std_logic_vector (3 downto 0);
--signal sev : std_logic_vector (6 downto 0);
begin
process (clk, reset, sel)
begin
if (reset = '0') then
time_count <= 0;
toggle <= '0';
elsif rising_edge (clk) then
case sel is
when "00" =>
if (time_count = 1249999) then
toggle <= not toggle;
time_count <= 0;
else
time_count <= time_count+1;
end if;
when "01" =>
if (time_count = 2499999) then
toggle1 <= not toggle1;
time_count <= 0;
else
time_count <= time_count+1;
end if;
when "10" =>
if (time_count = 4999999) then
toggle <= not toggle;
time_count <= 0;
else
time_count <= time_count+1;
end if;
when "11" =>
if (time_count = 12499999) then
toggle <= not toggle;
time_count <= 0;
else
time_count <= time_count+1;
end if;
end case;
end if;
end process;
Process (toggle, compare, switch)
begin
case level is
when level1 =>
if sel = "00" then
count <= "001";
seg <= "1000";
elsif (rising_edge (toggle)) then
count <= "001";
compare <= 0;
if (count = "001") then
count <= "000";
else
count <= "000";
end if;
end if;
if (switch = "1000") and (compare = 0) and (checkbtn <= '0') then
score <= "1111001";
level <= level2;
else
score <= "1000000";
level <= level1;
end if;
when level2 =>
if sel = "01" then
count <= "010";
seg <= "0100";
elsif (rising_edge (toggle1)) then
count <= "010";
compare <= 1;
if (count = "010") then
count <= "000";
else
count <= "000";
end if;
end if;
if (switch = "0100") and (compare = 1) and (checkbtn <= '0') then
score <= "0100100";
else
score <= "1000000";
level <= level1;
end if;
end case;
case count is
when "000"=>seg<="0000";
when "001"=>seg<="1000";
when "010"=>seg<="0100";
when "011"=>seg<="0110";
when "100"=>seg<="0011";
when others=>seg<="0000";
end case;
end process;
QF <= seg;
end behavior;
Thanks again in advance!
Well... it is hard to tell what is wrong, because this state machine is written in wrong way. You should look for references about proper modeling of FSM in VHDL. One good example is here.
If you use Quartus, you could also look for Altera's description on how to model FSM specifically for their compiler.
I will now give you just two advices. First is that you shouldn't (or mabye even you can't) use is two
if rising_edge (clk)
checks in one process. If your process is supposed to be sensitive on clock edge, write it once at the beginning.
Second thing is that if you want to model FSM with one process with synchronous reset, then put just clk on sensitivity list.
EDIT after question and code edit:
Ok, much better now. But another few things:
Your FSM is still not like it should. Look again at the example in the source I gave you above and edit it to be like there, or make it one process FSM like in example in this link.
Intends! Very important. I couldn't spot some of obvious errors, before I made proper intendation in your code. This leads me to...
Look at the places, there you assign values to count, in particular the if statements. No mater what, you assign the same value of "000".
Similar story with another signal - seg. You assign to it some value in the process, and then at the end of this process there is case statement in which you assign to it some other value, making this previous assignments irrelevant.
Use rising_edge only once in the process, only to clock, and only at the very beginning of the process, or in the way you did in the first process, that has asynchronous reset. In second process you did all this three things.
In sequential process with rising_edge, like the first one, you don't have to put to sensitivity list anything more than clock, and reset if it is asynchronous, like in your case.
Sensitivity list in second process. It is parallel process, so you should put there signals, that you check in a process, and can change outside of it. It is not the case for compare. But there should be signals: level, sel and toggle1.
As I'm still not sure what are you trying to achieve, I will not tell you what exactly to do. Fix your code according to points above, then maybe it will just work.
I am completely new to programming CPLDs and I want to program a latch + counter in Xilinx ISE Project Navigator using VHDL language. This is how it must work and it MUST be only this way: this kind of device gets 2 clock signals. When one of them gets from HIGH to LOW state, data input bits get transferred to outputs and they get latched. When the 2nd clock gets from LOW to HIGH state, the output bits get incremented by 1. Unfortunately my code doesn't want to work....
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;
entity counter8bit is
port(CLKDA, CLKLD : in std_logic;
D : in std_logic_vector(7 downto 0);
Q : out std_logic_vector(7 downto 0));
end counter8bit;
architecture archi of counter8bit is
signal tmp: std_logic_vector(7 downto 0);
begin
process (CLKDA, CLKLD, D)
begin
if (CLKLD'event and CLKLD='0') then
tmp <= D;
elsif (CLKDA'event and CLKDA='1') then
tmp <= tmp + 1;
end if;
end process;
Q <= tmp;
end archi;
Is there any other way around to achieve this?? Please for replies. Any kind of help/suggestions will be strongly appreciated. Many thanks in advance!!
Based on the added comments on what the counter is for, I came up with the following idea. Whether it would work in reality is hard to decide, because I would need a proper timing diagram for the EPROM interface. Importantly, there could be clock domain crossing issues depending on what restrictions there are on how the two clock signals are asserted; if there can be a falling edge of CLKLD close to a rising edge of CLKDA, the design may not work reliably.
signal new_start_address : std_logic := '0';
signal start_address : std_logic_vector(D'range) := (others => '0');
...
process (CLKLD, CLKDA)
begin
if (CLKDA = '1') then
new_start_address <= '0';
elsif (falling_edge(CLKLD)) then
start_address <= D;
new_start_address <= '1';
end if;
end process;
process (CLKDA)
begin
if (rising_edge(CLKDA)) then
if (new_start_address = '1') then
tmp <= start_address;
else
tmp <= tmp + 1;
end if;
end if;
end process;
I'm not completely clear on the interface, but it could be that the line tmp <= start_address; should become tmp <= start_address + 1;
You may also need to replace your assignment of Q with:
Q <= start_address when new_start_address = '1' else tmp;
Again, it's hard to know for sure without a timing diagram.
So I've been using VHDL to make a register, where it loads in the input X if LOAD is '1' , and outputs the data in serial fashion , basically a parallel in serial out register. The input X is a 4 bit ( 3 downto 0 ) input , what I want to make the program do is constantly output 0 when the register has successfully output all the btis in the input.
It works when "count" is defined as a signal , however , when count is defined as a variable , the output is a constant 0 , regardless of whether load is '1' or not. My code is as shown:
entity qn14 is
Port ( clk : in STD_LOGIC;
reset : in STD_LOGIC;
LOAD : in STD_LOGIC;
X : in STD_LOGIC_VECTOR (3 downto 0);
output : out STD_LOGIC);
end qn14;
architecture qn14_beh of qn14 is
type states is ( IDLE , SHIFT );
signal state : states;
signal count: STD_LOGIC_VECTOR(1 downto 0);
begin
process(clk , reset)
variable temp: STD_LOGIC;
variable data: STD_LOGIC_VECTOR(3 downto 0);
begin
if reset = '1' then
state <= IDLE;
count <= "00";
output <= '0';
elsif clk'event and clk = '1' then
case state is
when IDLE =>
if LOAD = '1' then
data := X;
output <= '0';
state <= SHIFT;
elsif LOAD = '0' then
output <= '0';
end if;
when SHIFT =>
if LOAD ='1' then
output <= '0';
elsif LOAD = '0' then
output <= data( conv_integer(count) );
count <= count + 1;
if (count >= 3) then
state <= IDLE ;
end if;
end if;
end case;
end if;
end process;
end qn14_beh;
Hoping to seek clarification on this.
Thank you.
This may not fully answer your question, but I will cover several issues that I see.
The elsif LOAD = '0' then could just be else unless you're trying to cover the other states (X,U...) but you would still want an else to cover those.
count = 3 is clearer than count >= 3. count is a 2-bit vector, so it can never be greater than 3.
Although you output 0 when LOAD is asserted while in the SHIFT state, you don't actually load a new value. Did you intend to?
Changing count to a variable without changing the position of the assignments will cause your first X -> output sequence to abort a cycle early (you increment count before you test it against 3). It will cause subsequent X -> output sequences to go "X(3), X(0), X(1), X(2)"
The variable temp is never used. The variable data would work just as well as a signal. You do not need the properties of variables for this use. This also brings up the question of why you tried count as variable; unless you need the instant assignments of variables, it is typically better to use a signal because signals are easier to view in (most) simulators and harder to make mistakes with. I hope that you aren't trying to end up with count as a variable, but just have an academic curiosity why it doesn't work.
Have you looked at this in a simulator? Are you changing states properly? Are all the bits of all your inputs strongly driven to a defined value ('1' or '0')?
I don't see anything that would cause the failure you described merely from changing count to a variable from a signal, but the change would cause the undesired behavior described above. My best guess is that your symptom arises from an issue with how you drive your inputs.
I was wondering if there was a method where you can split a n bit long vector for example:
10100111
Into n individual binary units to be used later on? I'm trying to incorporate a method where i can get a 8 bit long vector and have 8 LED's light up depending if that n'th value is 0 or 1. Googling the question returns people cutting up a larger vector into smaller ones, but if we were to have 16 bits for example then i'd have to make 16 separate variables for it work using:
entity test is
port (
myVector : in std_logic_vector(16 downto 0);
LED : out std_logic_vector(16 downto 0)
);
END test
architecture behavior of test is
SIGNAL led1 : std_logic;
...
SIGNAL led16 : std_logic;
BEGIN
led1 <= myVector(0);
...
led16 <= myVector(16);
LED(1) <= '1' when led1 = '1' else '0';
...
LED(16) <= '1' when led16 = '1' else '0';
END behavior
Doesn't seem tidy when it needs to be repeated several times in the code.
If you want to declare and assign to identifiers with different names, then you have to make a line for each name, since there is no loop statement for identifier name manipulation in VHDL.
But VHDL provides arrays, where you can loop over the entries, like with this example code:
...
signal led_sig : std_logic_vector(16 downto 0);
begin
loop_g: for idx in myVector'range generate
led_sig(idx) <= myVector(idx);
LED(idx) <= '1' when led_sig(idx) = '1' else '0';
end generate;
...
Where the assigns are equivalent to the shorter:
led_sig <= myVector;
LED <= led_sig;