I'm new to VHDL and am trying to code up Booth's Multiplication Algorithm. I'm using XILINX and when I synthesize my code, I end up with a lot of warnings:
Upper is assigned but never used,
Product is used but never assigned,
LowerPrevLSB is assigned but never used,
Lower is assigned but never used,
A_2sComp is assigned but never used,
Z has a constant value of 0,
Product has a constant value of 0.
I thought I assigned and wrote the code correctly, but evidently I am not. Any advice and help would be appreciated.
library IEEE;
use IEEE.NUMERIC_STD.ALL;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_arith.ALL;
use IEEE.STD_LOGIC_unsigned.ALL;
-- Uncomment the following library declaration if using
-- arithmetic functions with Signed or Unsigned values
--use IEEE.NUMERIC_STD.ALL;
-- Uncomment the following library declaration if instantiating
-- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;
-- X * Y = Z
entity BoothMultiplier is
generic
(
numBits_X : integer := 8;
numBits_Y : integer := 8
);
port
(
CLK : in std_logic;
X : in std_logic_vector((numBits_X - 1) downto 0);
Y : in std_logic_vector((numBits_Y - 1) downto 0);
Z : out std_logic_vector((numBits_X + numBits_Y - 1) downto 0)
);
end BoothMultiplier;
architecture Behavioral of BoothMultiplier is
-- Two's Complement Function
function TwosComplement(inputNum : std_logic_vector) return std_logic_vector;
function TwosComplement(inputNum : std_logic_vector) return std_logic_vector is
variable temp : std_logic_vector(inputNum'range);
begin
temp := (not inputNum) + 1;
return temp;
end TwosComplement;
-- End Two's Complement Function
-- MIN Function
function MIN(Left, Right : integer) return integer;
function MIN(Left, Right : integer) return integer is
begin
if Left < Right then return Left;
else return Right;
end if;
end Min;
-- End MIN Function
-- MAX Function
function MAX(Left, Right : integer) return integer;
function MAX(Left, Right : integer) return integer is
begin
if Left > Right then return Left;
else return Right;
end if;
end MAX;
-- End MAX Function
-- Signals
signal Upper : std_logic_vector(MAX(numBits_X, numBits_Y) - 1 downto 0)
:= (others => '0');
signal Lower : std_logic_vector(MIN(numBits_X, numBits_Y) - 1 downto 0)
:= (others => '0');
signal LowerPrevLSB : std_logic := '0';
signal Product : std_logic_vector(numBits_X + numBits_Y - 1 downto 0)
:= (others => '0');
signal A, A_2sComp : std_logic_vector(MAX(numBits_X, numBits_y) - 1 downto 0)
:= (others => '0');
signal counter : integer := 0;
-- End Signals
begin
assert Z'length = (X'length + Y'length) report "Bad Product Length" severity failure;
Lower <= X when (numBits_X <= numBits_Y) else Y;
A <= X when (numBits_X > numBits_Y) else Y;
A_2sComp <= TwosComplement(A);
process(CLK)
begin
if rising_edge(CLK) then
if (Lower(0) = '0' and LowerPrevLSB = '1') then
Upper <= Upper + A;
elsif (Lower(0) = '1' and LowerPrevLSB = '0') then
Upper <= Upper + A_2sComp;
end if;
LowerPrevLSB <= Lower(0);
Product <= Upper & Lower;
for i in 0 to Product'length - 2 loop
Product(i) <= Product(i+1);
end loop;
Product(Product'length-1) <= Product(Product'length-1);
Upper <= Product(Product'length - 1 downto MIN(numBits_X, numBits_Y));
Lower <= Product(MIN(numBits_X, numBits_Y) - 1 downto 0);
counter <= counter + 1;
if (counter = MIN(numBits_X, numBits_Y)) then
Z <= Product;
end if;
end if;
end process;
end Behavioral;
In VHDL, successive assignments to the same signal in a process overrides previous assignments, thus:
if (Lower(0) = '0' and LowerPrevLSB = '1') then
Upper <= Upper + A;
elsif (Lower(0) = '1' and LowerPrevLSB = '0') then
Upper <= Upper + A_2sComp;
end if;
...
Upper <= Product(Product'length - 1 downto MIN(numBits_X, numBits_Y));
The first assignments, in the if block, is completely ignored. If you look at your code, assignments to Product, Upper and Lower are overridden.
I suggest you simulate your design before synthesizing your design with Xilinx. It will be much easier to test and debug. For instance, your counter signal is never reset, and will count up to 2^31-1, then wrap to -2^31. What will happen to your design in those cases? Simulation would point out these error easily, leave synthesis for later!
Related
As the title says, I'm getting a compiler exiting error with the if-statement when I use the lines commented out with the "sum" value but not with the "sum" 2 value and I'm not sure why.
Code:
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use IEEE.std_logic_unsigned.all;
entity test2 is port
(
a, b : IN unsigned( 3 DOWNTO 0 );
cin : IN unsigned;
sum : OUT unsigned( 4 DOWNTO 0 )
);
end test2;
architecture behavioral of test2 is
signal a_5, b_5, cin_5, sum2 : unsigned(4 downto 0) := (others => '0');
signal x, y : unsigned(4 downto 0) := (others => '0');
signal z : std_logic;
begin
a_5 <= ('0' & a);
b_5 <= ('0' & b);
cin_5 <= ('0' & '0' & '0' & '0' & cin);
sum <= a_5 + b_5 + cin_5;
sum2 <= a_5 + b_5 + cin_5;
process (sum2, b_5)
--process (sum, b_5)
begin
if (sum2 > b_5) then
--if (sum > b_5) then
z <= '1';
else
z <= '0';
end if;
end process;
end behavioral;
For some context:
I'm working on an adder that adds two 4bit numbers and eventually displays the decimal value on a 7seg display.
I want to take the "sum" value and check if it is greater than decimal value 9 and if so then it sets a flag to always have the 7seg display for the 10s value display a 1. (I only need to count up to decimal value 19). I can probably do this another method but I started doing it this way and got stuck here and I think this is something fundamental I am just not understanding.
sum is a port, which has the type "out". Ports out type "out" can't be read. If you want to read an output port, you must use the type "buffer". sum2 instead is a signal, which can always be read
(By the way you should only use numeric_std and not std_logic_unsigned, which is an old solution and not preferred anymore).
I am implementing the following module:
library ieee;
use ieee.std_logic_1164.all;
entity Grant_Logic is
generic (
N : positive := 4
);
Port (
Priority_Logic0 : in std_logic_vector(N-1 downto 0);
Priority_Logic1 : in std_logic_vector(N-1 downto 0);
Priority_Logic2 : in std_logic_vector(N-1 downto 0);
Priority_Logic3 : in std_logic_vector(N-1 downto 0);
Gnt : out std_logic_vector (N-1 downto 0)
);
end Grant_Logic;
architecture Behavioral of Grant_Logic is
begin
gnt(0) <= Priority_Logic0(0) or Priority_Logic1(3) or Priority_Logic2(2) or Priority_Logic3(1);
gnt(1) <= Priority_Logic0(1) or Priority_Logic1(0) or Priority_Logic2(3) or Priority_Logic3(2);
gnt(2) <= Priority_Logic0(2) or Priority_Logic1(1) or Priority_Logic2(0) or Priority_Logic3(3);
gnt(3) <= Priority_Logic0(3) or Priority_Logic1(2) or Priority_Logic2(1) or Priority_Logic3(0);
end Behavioral;
I want to take advantage of for ... generate to implement the same circuit when N changes. I am using Xilinx so Vivado (vhdl'93) does not support custom types for ports in the IP generation.
However, for the architecture I would like to use for ... generate. The issue is that some logic is needed to generate each bit of gnt. What I have so far is:
gen_gnt_vertical: for y in 0 to N-1 generate
constant val, index : integer := 0;
begin
s_result <= '0';
gen_gnt_horizontal: for x in 0 to N-1 generate
begin
LOGIC BASED ON val, x and y to obtain the index
s_result <= s_result or s_Priority_Logic(x)(index);
end generate;
gnt(y) <= s_result;
end generate;
The logic to compute index is:
if(x>0)
{
val = y - x;
if (val < 0)
{
index = N + val;
}
else
index = val;
}
else
{
index = y;
}
I have a script that generates a vhdl file based on N but I would like to do it directly on vhdl. Is that possible?
Thanks for the help
EDIT: As #Tricky replied, a function did the trick. So, I have the following:
function index( N, x,y : natural) return natural is
variable val : integer;
variable index : integer := 0;
begin
if(x>0) then
val := y - x;
if(val < 0) then
index := N + val;
else
index := val;
end if;
else
index := y;
end if;
return index;
end function;
And the architecture:
architecture Behavioral of Grant_Logic is
signal s_Priority_Logic : t_Priority_logic;
signal s_result : std_logic_vector(N-1 downto 0) := (others=>'0');
begin
s_Priority_Logic(0) <= Priority_Logic0;
s_Priority_Logic(1) <= Priority_Logic1;
s_Priority_Logic(2) <= Priority_Logic2;
s_Priority_Logic(3) <= Priority_Logic3;
process(s_Priority_Logic)
variable result : std_logic;
begin
for y in 0 to N-1 loop
result := '0';
for x in 0 to N-1 loop
result := result or s_Priority_logic(x)(index(N, x, y));
end loop;
gnt_g(y) <= result;
end loop;
end process;
end Behavioral;
#Tricky using a function was the correct thing to get the index value. Alo for generate was not correct but for loop. I edited my question to show the final result
I am developing a 8 bit unsigned prime number detector in VHDL, synthesizable, for a project.The objective is to do not only avoiding to use any sort of loops or latches as well as restricting it to only the FPGA 50Mhz clock.
We tried a clock-based using successive divisions but such implementation does not meet the timing requirements in Quartus Timequest when we try to output the result.
When we comment the output it seems to work just fine, and we don't fully understand the why.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity primeChecker is
generic(NumCyclesWait :integer := 1000);
port( clock: in std_logic;
enable: in std_logic;
reset : in std_logic;
input : in std_logic_vector(7 downto 0);
output : out std_logic_vector(7 downto 0);
isPrime : out std_logic_vector(0 downto 0));
end primeChecker;
architecture arch of primeChecker is
signal count: integer := 0;
signal numToCheck : integer := 0;
signal prime, primeOut : integer := 1;
signal s_out: unsigned(7 downto 0);
signal div : integer := 2;
signal clockCount : unsigned(0 downto 0) := "0";
begin
numToCheck <= to_integer(unsigned(input));
process(clock)
begin
if(rising_edge(clock)) then
if(count = NumCyclesWait) then
if ((numToCheck = 1) OR (numToCheck = 0)) then
prime <= 0; --Not Prime
elsif(numToCheck > 2 and prime =1 and div < numToCheck) then
if (numToCheck rem div = 0) then
-- if the number is divisible
prime <= 0;
end if;
div <= div +1;
end if;
else
count <= count +1;
end if;
if(enable = '1') then
s_out <= to_unsigned(numToCheck,8);
count <= 0;
div <= 2;
primeOut <= prime;
prime <= 1;
else
s_out <= s_out;
end if;
end if;
end process;
isPrime <= std_logic_vector(to_unsigned(primeOut,1));
output <= std_logic_vector(s_out);
end arch ; -- arch
I expect for it not to trigger the "Timing requirement not met" error and for it to fully compile.
For fastest constant time response, I would take a different approach. Your task is to deal with eight bit numbers only and your FPGA probably has more than enough RAM to set up an 8-bit prime number lookup table where each table entry just indicates whether its index is a prime number or not:
type prime_numbers_type is array(0 to 255) of std_ulogic;
constant prime_numbers : prime_numbers_type :=
( '0', '0', '1', '1', '0', '1', ... );
This makes the vital part of your prime number detector overly simple:
is_prime <= prime_numbers(to_integer(unsigned(num_to_check)));
I would probably just write a small Tcl script to set up the lookup table.
I'm attempting to create synthesizable VHDL (function or procedure) for an ASIC (it must be part of the ASIC) that will look for the first '1' in a standard_logic_vector and output which vector position that '1' was in. For example, I have an 8-bit slv of "10001000" (a '1' in position 3 and 7). If I use this slv, the output should be 4 (the output is 1 based).
The actual VHDL will be searching a large slv, up to 512 bits in length. I tried implementing a binary search function but I get synthesis errors that states "Could not synthesize non-constant range values. [CDFG-231] [elaborate]
The non-constant range values are in file '...' on line 61" I indicated in the code below where it complains. I'm not sure how to implement a binary search algorithm without having non-constant range values. How would I modify this code so it's synthesizable?
I have attempted to search for binary search algorithms for HDL for potential code to look at and for my error, but I didn't find anything.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use ieee.std_logic_misc.all;
entity bin_search is
generic (
constant NREGS : positive := 16 -- number of registers
);
port (
clk_i : in std_logic; -- clock
bin_i : in unsigned( NREGS-1 downto 0 ); -- input
en_i : in std_logic; -- input enable
addr_o : out natural range 0 to NREGS -- first binary location
);
end bin_search;
architecture rtl of bin_search is
function f_bin_search( input: unsigned; nob: positive ) return natural is
constant nbits : positive := 2**nob;
variable lower : natural range 0 to 1 := 0;
variable upper : natural range 0 to 1 := 0;
variable idx : natural range 0 to nob := 4;
variable cnt : natural range 0 to nbits := 0;
variable mid : positive range 1 to nbits := nbits/2; --
variable ll : natural range 0 to nbits := 0;
variable ul : positive range 1 to nbits := nbits; --
begin
if input = 0 then
cnt := 0;
return cnt;
else
loop1: while ( idx > 0 ) loop
if ( input( mid-1 downto ll ) > 0 ) then -- <===WHERE SYNTH COMPLAINS
lower := 1;
else
lower := 0;
end if;
if ( input( ul-1 downto mid ) > 0 ) then
upper := 1;
else
upper := 0;
end if;
if ( idx = 1 ) then
if ( lower = 1 ) then
cnt := mid;
else
cnt := ul;
end if;
elsif ( lower = 1 ) then
ul := mid;
mid := ( ( ll+ul )/2 );
elsif ( upper = 1 ) then
ll := mid;
mid := ( ll+ul )/2;
else
cnt := 0;
exit loop1;
end if;
idx := idx-1;
end loop loop1;
return cnt;
end if;
end f_bin_search;
begin
test_proc: process ( clk_i )
begin
if rising_edge( clk_i ) then
if en_i = '1' then
addr_o <= f_bin_search( bin_i, 4 );
end if;
end if;
end process test_proc;
end rtl;
Here's a simple test bench where the input is inc'd by '1'. The addr_o should be the location (1 based) of the input lsb with a '1'.
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_misc.all;
use ieee.numeric_std.all;
entity bin_search_tb is
end bin_search_tb;
architecture behavior of bin_search_tb is
constant NREGS : positive := 16;
signal clk : std_logic;
signal input : unsigned( NREGS-1 downto 0 );
signal start : std_logic;
signal addr : natural range 0 to NREGS;
constant clk_per : time := 1 ns;
signal row : natural range 0 to 2**NREGS-1;
begin
bin_search_inst: entity work.bin_search( rtl )
generic map (
NREGS => NREGS
)
port map (
clk_i => clk, -- master clock
bin_i => input, -- captured events
en_i => start, -- start binary search
addr_o => addr -- addr where the first '1' appears
);
-- master clock process
clk_proc: process
begin
clk <= '0';
wait for clk_per / 2;
clk <= '1';
wait for clk_per / 2;
end process clk_proc;
--
stim1_proc: process
begin
input <= ( others => '0' );
start <= '0';
row <= 1;
wait until clk'event and clk = '1';
loop
wait until clk'event and clk = '1';
input <= to_unsigned( row, input'length );
start <= '1';
wait until clk'event and clk = '1';
start <= '0';
wait for 4*clk_per;
row <= row+1;
end loop;
end process stim1_proc;
end architecture behavior;
Thanks for your assistance!
-Jason
Edited code and added a testbench
Your design will most certainly depend on latency and other performance requirements, but, you could use some combination of or-reduction, sequencers (for mux selection of sliced vectors), shift register, and counters. I drew up a simple circuit that should find your lsb instance of "1" in ~30 clock cycles
The RTL translation that implements this design should be straight forward.
You say that you are thinking in hardware, but in fact you're not. Or you are misleading yourself.
input( mid-1 downto ll ) > 0
is not an OR-reduction, but a comparison operation. You must know > is the larger than comparison operator. The synthesis will therefor infer a comparator. But how many inputs must that comparator have, I ask? Well, there's your problem: it depends on the value of mid, which:
initially depends on the value of nbits, which depends on the value of nob which is a variable input for the function.
is changed within the loop. Thus it's value is not constant.
A hardware component cannot have a variable amount of wires.
But why do you want binary search? Why not keep-it-simple?
library ieee;
use ieee.std_logic_1164.all;
entity detect_one is
generic(
input_size : positive := 512);
port(
input : in std_logic_vector (input_size-1 downto 0);
output : out natural range 0 to input_size);
end entity;
architecture rtl of detect_one is
begin
main: process(input)
begin
output <= 0;
for i in input_size-1 downto 0 loop
if input(i)='1' then
output <= i+1;
end if;
end loop;
end process;
end architecture;
entity detect_one_tb is end entity;
library ieee;
architecture behavior of detect_one_tb is
constant input_size : positive := 512;
use ieee.std_logic_1164.all;
signal input : std_logic_vector (input_size-1 downto 0) := (others => '0');
signal output : integer;
begin
DUT : entity work.detect_one
generic map ( input_size => input_size )
port map(
input => input,
output => output);
test: process begin
wait for 1 ns;
assert (output = 0) report "initial test failure" severity warning;
for i in 0 to input_size-1 loop
input <= (others => '0');
input(i) <= '1';
wait for 1 ns;
assert (output = i+1) report "single ones test failure" severity warning;
end loop;
input <= (others => '1');
wait for 1 ns;
assert (output = 1) report "initial multiple ones test failure" severity warning;
for i in 0 to input_size-2 loop
input(i) <= '0';
wait for 1 ns;
assert (output = i+2) report "multiple ones test failure" severity warning;
end loop;
wait;
end process;
end architecture;
I implemented a Booth modified multiplier in vhdl. I need to make a synthesis with Vivado but it's not possible because of this error:
"complex assignment not supported".
This is the shifter code that causes the error:
entity shift_register is
generic (
N : integer := 6;
M : integer := 6
);
port (
en_s : in std_logic;
cod_result : in std_logic_vector (N+M-1 downto 0);
position : in integer;
shift_result : out std_logic_vector(N+M-1 downto 0)
);
end shift_register;
architecture shift_arch of shift_register is
begin
process(en_s)
variable shift_aux : std_logic_vector(N+M-1 downto 0);
variable i : integer := 0; --solo per comoditÃ
begin
if(en_s'event and en_s ='1') then
i := position;
shift_aux := (others => '0');
shift_aux(N+M-1 downto i) := cod_result(N+M-1-i downto 0); --ERROR!!
shift_result <= shift_aux ;
end if;
end process;
end shift_arch;
the booth multiplier works with any operator dimension. So I can not change this generic code with a specific one.
Please help me! Thanks a lot
There's a way to make your index addressing static for synthesis.
First, based on the loop we can tell position must have a value within the range of shift_aux, otherwise you'd end up with null slices (IEEE Std 1076-2008 8.5 Slice names).
That can be shown in the entity declaration:
library ieee;
use ieee.std_logic_1164.all;
entity shift_register is
generic (
N: integer := 6;
M: integer := 6
);
port (
en_s: in std_logic;
cod_result: in std_logic_vector (N + M - 1 downto 0);
position: in integer range 0 to N + M - 1 ; -- range ADDED
shift_result: out std_logic_vector(N + M - 1 downto 0)
);
end entity shift_register;
What's changed is the addition of a range constraint to the port declaration of position. The idea is to support simulation where the default value of can be integer is integer'left. Simulating your shift_register would fail on the rising edge of en_s if position (the actual driver) did not provide an initial value in the index range of shift_aux.
From a synthesis perspective an unbounded integer requires you take both positive and negative integer values in to account. Your for loop is only using positive integer values.
The same can be done in the declaration of the variable i in the process:
variable i: integer range 0 to N + M - 1 := 0; -- range ADDED
To address the immediate synthesis problem we look at the for loop.
Xilinx support issue AR# 52302 tells us the issue is using dynamic values for indexes.
The solution is to modify what the for loop does:
architecture shift_loop of shift_register is
begin
process (en_s)
variable shift_aux: std_logic_vector(N + M - 1 downto 0);
-- variable i: integer range 0 to N + M - 1 := 0; -- range ADDED
begin
if en_s'event and en_s = '1' then
-- i := position;
shift_aux := (others => '0');
for i in 0 to N + M - 1 loop
-- shift_aux(N + M - 1 downto i) := cod_result(N + M - 1 - i downto 0);
if i = position then
shift_aux(N + M - 1 downto i)
:= cod_result(N + M - 1 - i downto 0);
end if;
end loop;
shift_result <= shift_aux;
end if;
end process;
end architecture shift_loop;
If i becomes a static value when the loop is unrolled in synthesis it can be used in calculation of indexes.
Note this gives us an N + M input multiplexer where each input is selected when i = position.
This construct can actually be collapsed into a barrel shifter by optimization, although you might expect the number of variables involved for large values of N and M might take a prohibitive synthesis effort or simply fail.
When synthesis is successful you'll collapse each output element in the assignment into a separate multiplexer that will match Patrick's
barrel shifter.
For sufficiently large values of N and M we can defined the depth in number of multiplexer layers in the barrel shifter based on the number of bits in a binary expression of the integer range of distance.
That either requires a declared integer type or subtype for position or finding the log2 value of N + M. We can use the log2 value because it would only be used statically. (XST supports log2(x) where x is a Real for determining static values, the function is found in IEEE package math_real). This gives us the binary length of position. (How many bits are required to to describe the shift distance, the number of levels of multiplexers).
architecture barrel_shifter of shift_register is
begin
process (en_s)
use ieee.math_real.all; -- log2 [real return real]
use ieee.numeric_std.all; -- to_unsigned, unsigned
constant DISTLEN: natural := integer(log2(real(N + M))); -- binary lengh
type muxv is array (0 to DISTLEN - 1) of
unsigned (N + M - 1 downto 0);
variable shft_aux: muxv;
variable distance: unsigned (DISTLEN - 1 downto 0);
begin
if en_s'event and en_s = '1' then
distance := to_unsigned(position, DISTLEN); -- position in binary
shft_aux := (others => (others =>'0'));
for i in 0 to DISTLEN - 1 loop
if i = 0 then
if distance(i) = '1' then
shft_aux(i) := SHIFT_LEFT(unsigned(cod_result), 2 ** i);
else
shft_aux(i) := unsigned(cod_result);
end if;
else
if distance(i) = '1' then
shft_aux(i) := SHIFT_LEFT(shft_aux(i - 1), 2 ** i);
else
shft_aux(i) := shft_aux(i - 1);
end if;
end if;
end loop;
shift_result <= std_logic_vector(shft_aux(DISTLEN - 1));
end if;
end process;
end architecture barrel_shifter;
XST also supports ** if the left operand is 2 and the value of i is treated as a constant in the sequence of statements found in a loop statement.
This could be implemented with signals instead of variables or structurally in a generate statement instead of a loop statement inside a process, or even as a subprogram.
The basic idea here with these two architectures derived from yours is to produce something synthesis eligible.
The advantage of the second architecture over the first is in reduction in the amount of synthesis effort during optimization for larger values of N + M.
Neither of these architectures have been verified lacking a testbench in the original. They both analyze and elaborate.
Writing a simple case testbench:
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity shift_register_tb is
end entity;
architecture foo of shift_register_tb is
constant N: integer := 6;
constant M: integer := 6;
signal clk: std_logic := '0';
signal din: std_logic_vector (N + M - 1 downto 0)
:= (0 => '1', others => '0');
signal dout: std_logic_vector (N + M - 1 downto 0);
signal dist: integer := 0;
begin
DUT:
entity work.shift_register
generic map (
N => N,
M => M
)
port map (
en_s => clk,
cod_result => din,
position => dist,
shift_result => dout
);
CLOCK:
process
begin
wait for 10 ns;
clk <= not clk;
if now > (N + M + 2) * 20 ns then
wait;
end if;
end process;
STIMULI:
process
begin
for i in 1 to N + M loop
wait for 20 ns;
dist <= i;
din <= std_logic_vector(SHIFT_LEFT(unsigned(din),1));
end loop;
wait;
end process;
end architecture;
And simulating reveals that the range of position and the number of loop iterations only needs to cover the number of bits in the multiplier and not the multiplicand. We don't need a full barrel shifter.
That can be easily fixed in both shift_register architectures and has the side effect of making the shift_loop architecture much more attractive, it would be easier to synthesize based on the multiplier bit length (presumably M) and not the product bit length (N+ M).
And that would give you:
library ieee;
use ieee.std_logic_1164.all;
entity shift_register is
generic (
N: integer := 6;
M: integer := 6
);
port (
en_s: in std_logic;
cod_result: in std_logic_vector (N + M - 1 downto 0);
position: in integer range 0 to M - 1 ; -- range ADDED
shift_result: out std_logic_vector(N + M - 1 downto 0)
);
end entity shift_register;
architecture shift_loop of shift_register is
begin
process (en_s)
variable shift_aux: std_logic_vector(N + M - 1 downto 0);
-- variable i: integer range 0 to M - 1 := 0; -- range ADDED
begin
if en_s'event and en_s = '1' then
-- i := position;
shift_aux := (others => '0');
for i in 0 to M - 1 loop
-- shift_aux(N + M - 1 downto i) := cod_result(N + M - 1 - i downto 0);
if i = position then -- This creates an N + M - 1 input MUX
shift_aux(N + M - 1 downto i)
:= cod_result(N + M - 1 - i downto 0);
end if;
end loop; -- The loop is unrolled in synthesis, i is CONSTANT
shift_result <= shift_aux;
end if;
end process;
end architecture shift_loop;
Modifying the testbench:
STIMULI:
process
begin
for i in 1 to M loop -- WAS N + M loop
wait for 20 ns;
dist <= i;
din <= std_logic_vector(SHIFT_LEFT(unsigned(din),1));
end loop;
wait;
end process;
gives a result showing the shifts are over the range of the multiplier value (specified by M):
So the moral here is you don't need a full barrel shifter, only one that works over the multiplier range and not the product range.
The last bit of code should be synthesis eligible.
You are trying to create a range using a run-time varying value, and this is not supported by the synthesis tool. cod_result(N+M-1 downto 0); would be supported, because N, M, and 1 are all known at synthesis time.
If you're trying to implement a multiplier, you will get the best result using x <= a * b, and letting the synthesis tool choose the best way to implement it. If you have operands wider than the multiplier widths in your device, then you need to look at the documentation to determine the best route, which will normally involve pipelining of some sort.
If you need a run-time variable shift, look for a 'Barrel Shifter'. There are existing answers on these, for example this one.