Vivado VHDL width mismatch - how can I fix it? - vhdl

Please consider this very simple minimal reproducible code:
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
entity test is
generic ( LENGTH : integer range 1 to 16 := 5 );
Port ( x : in STD_LOGIC;
y : out STD_LOGIC_VECTOR(15 downto 0)
);
end test;
architecture Behavioral of test is
signal a : std_logic_vector (15 downto 0);
signal b : std_logic_vector (LENGTH - 1 downto 0);
signal i : integer range 0 to LENGTH-1 := 1;
begin
y <= a;
process
begin
if i = LENGTH then
i <= 1;
else
a <= a(15 downto i + 1) & b(i downto 0);
end if;
i <= i + 1;
end process;
end Behavioral;
I need to join some elements of b into a, depending on i. When I run RTL synthesis in Vivado, it reports:
[Synth 8-690] width mismatch in assignment; target has 16 bits, source has 20 bits
I don't really get why. Anyhow, the overall range will be 15 - (i + 1) + (i - 0) = 15 ... 0 and fits in the 16 bits of output -- what's the deal for 20 bits?
I should say the problem vanishes (obviously) if I use plain constants instead of i, but I still don't get what's going on.

For a runtime variable I (as per the question):
Instead of a big CASE, you can use the value of I to generate masks and evaluate (A and MASKA) or (B and MASKB). This is equivalent to the multiplexer the synthesis tool would generate if it weren't broken.
For a generic I (it's not fair to move the goalposts in the comments!):
This approach generates unnecessary hardware, but any competent synthesis tool will optimise it out.
(There are of course other problems with this code; I assume you deleted the clock, taking the MCVE notion a bit too far. You should leave it as valid synthesisable code.)
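The mask idea can be sketched as follows. This is only a minimal illustration under stated assumptions: the entity name mask_merge, the port i, and the internal names mask_b and b_ext are hypothetical and not taken from the question, and the clocking that the original code is missing is left out on purpose so that only the masking is shown.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: merges the low bits of b into a, selected by i.
entity mask_merge is
  generic (LENGTH : integer range 1 to 16 := 5);
  port (
    a : in  std_logic_vector(15 downto 0);
    b : in  std_logic_vector(LENGTH - 1 downto 0);
    i : in  integer range 0 to LENGTH - 1;
    y : out std_logic_vector(15 downto 0)
  );
end entity mask_merge;

architecture rtl of mask_merge is
begin
  process (a, b, i)
    variable mask_b : std_logic_vector(15 downto 0);
    variable b_ext  : std_logic_vector(15 downto 0);
  begin
    -- Zero-extend b to the width of a.
    b_ext := (others => '0');
    b_ext(LENGTH - 1 downto 0) := b;

    -- Build a mask with bits i downto 0 set.
    -- Every index k is static once the loop is unrolled.
    mask_b := (others => '0');
    for k in 0 to LENGTH - 1 loop
      if k <= i then
        mask_b(k) := '1';
      end if;
    end loop;

    -- Same value as a(15 downto i + 1) & b(i downto 0),
    -- but without any run-time dependent slice.
    y <= (a and not mask_b) or (b_ext and mask_b);
  end process;
end architecture rtl;
```

All slices and indexes here have fixed bounds, so the width of every expression is 16 bits regardless of the value of i.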

Related

Arithmetic operations in vhdl. How to multiply std_logic vector by real number?

For a school tutorial I need to make a component that receives integer values in the interval 0 to 1000. The output returns S=V*C, where C depends on V:
C=1 when V is in range [0,10]
C=0.75 when V is in range [11,100]
C=0.3125 when V is in range [101,1000]
I tried the code below, but it doesn't compile. I guess, I have a problem with types. How should I program a real number to multiply with a std_logic_vector?
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity comp1 is
port(
V: in std_logic_vector(9 downto 0);
S: out std_logic_vector(13 downto 0));
end comp1;
architecture syn of comp1 is
begin
process is
variable c: unsigned(4 downto 0);
variable r: unsigned(13 downto 0);
begin
if unsigned(V) < 11 then
c:=1;
elsif unsigned(V) < 101 then
c:=0.75;
elsif others =>
c:=0.3125;
end if;
r:=unsigned(V)*C;
S<=std_logic_vector(r(13 downto 0));
end process;
end architecture;
Your question is not fully clear: what do you need the code for? Depending on the answer, there are several possible solutions.
Prior problem: representation and rounding
As you have already found out (seeing that you use numeric_std), a std_logic_vector by itself doesn't represent any value. It's just an array of std_logic elements. Therefore you should not perform arithmetic on bare std_logic_vectors.
Now, by casting the vector to an unsigned type, you define it as representing an (unsigned) integer value. But now you get the problem that integers cannot represent fractions. So what's 0.1 in integer? 0. That's easy. But what's 0.9 in integer? Well, that depends on your rounding scheme. If you simply truncate the number, then the answer is 0 again. But using standard rounding (+0.5), the answer is 1. You haven't told us what rounding scheme you want (I don't know if you thought about it).
P.S. Why is your S 14 bits wide? If V is 10 bits and the largest factor is 1.0, then S also needs only 10 bits.
Implementations
Let's first define an entity
library ieee;
use ieee.std_logic_1164.all;
entity comp1 is
port(
V : in std_logic_vector(9 downto 0);
S : out std_logic_vector(9 downto 0)
);
end entity;
Real operations
You can just convert everything to floating point (real) and perform your operation. This will solve all rounding for you and gives you a lot of freedom. The problem is that real types are not (yet) supported in synthesis. Still, for a test it works as it should.
architecture behavior of comp1 is
signal V_real, S_real : real;
use ieee.numeric_std.all;
begin
V_real <= real(to_integer(unsigned(V)));
S_real <=
V_real when V_real <= 10.0 else
V_real * 0.75 when V_real <= 100.0
else V_real * 0.3125;
S <= std_logic_vector(to_unsigned(integer(S_real), S'length));
end architecture;
Fixed-point
VHDL-2008 tried to bridge the lack of a fractional number representation for synthesis by introducing synthesizable fixed-point packages. When using these packages, you can even select the rounding scheme you want, because rounding requires extra resources and is not always necessary. Warning: using the packages takes some getting used to.
architecture behavior of comp1 is
use ieee.fixed_pkg.all;
signal V_fix : ufixed(9 downto 0);
signal C : ufixed(0 downto -15);
signal S_fix : ufixed(10 downto -15); -- range of V*C+1
use ieee.numeric_std.all;
begin
V_fix <= to_ufixed(V, V_fix);
C <= to_ufixed(1, C) when V_fix <= 10 else
to_ufixed(0.75, C) when V_fix <= 100 else
to_ufixed(0.3125, C);
S_fix <= V_fix * C;
S <= std_logic_vector(to_unsigned(S_fix, S'length));
end architecture;
P.S. As mentioned, you need to compile in VHDL-2008 mode for this to work.
Integer arithmetic
If you look at your multiplication factors, you can see that they can be represented by fractions:
0.75 = 3/4
0.3125 = 5/16
This means you can simply use integer arithmetic to perform the scaling.
architecture behavior of comp1 is
signal V_int, S_int : integer range 0 to 1000;
use ieee.numeric_std.all;
begin
V_int <= to_integer(unsigned(V));
S_int <=
V_int when V_int <= 10 else
V_int*3/4 when V_int <= 100
else V_int*5/16;
S <= std_logic_vector(to_unsigned(S_int, S'length));
end architecture;
NB Integer arithmetic has no rounding scheme, thus numbers are truncated!
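If rounding to nearest is wanted instead of truncation, one common trick is to add half of the divisor before dividing. The following is a sketch under that assumption (round half up); the entity name comp1_round is hypothetical and the ranges are widened to 1023 so that every 10-bit input is legal in simulation.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical variant of comp1 that rounds to nearest (half up).
entity comp1_round is
  port (
    V : in  std_logic_vector(9 downto 0);
    S : out std_logic_vector(9 downto 0)
  );
end entity comp1_round;

architecture behavior of comp1_round is
  signal V_int : integer range 0 to 1023;
  signal S_int : integer range 0 to 1023;
begin
  V_int <= to_integer(unsigned(V));

  -- Adding half of the divisor (2 for /4, 8 for /16) before the
  -- truncating integer division rounds the quotient to nearest.
  S_int <=
    V_int                 when V_int <= 10  else
    (V_int * 3 + 2) / 4   when V_int <= 100 else
    (V_int * 5 + 8) / 16;

  S <= std_logic_vector(to_unsigned(S_int, S'length));
end architecture behavior;
```

For example, V = 13 gives 13*0.75 = 9.75: the truncating version returns 9, while this version returns 10.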
Low-level optimizations: Shift-and-add
In the comments Brian referred to using shift and add operations. Going back to the integer arithmetic section of my answer, you can see that the denominators are powers of 2, so the divisions can be realized with bit-shift operations:
x/4 = x/(2^2) = x>>2 (right shift by 2)
x/16 = x/(2^4) = x>>4 (right shift by 4)
At the same time, the numerators can also be realized using bit-shift and add operations:
x*3 = x*"11"b => x + (x<<1) (left shift by 1)
x*5 = x*"101"b => x + (x<<2) (left shift by 2)
Both can be combined in one operation. Note, however, that a right shift throws away the bits shifted out. This can cause a problem, as the fractional bits are required for correct results. So you need to add bits to calculate the intermediate results.
architecture behavior of comp1 is
use ieee.numeric_std.all;
signal V_uns4, S_uns4 : unsigned(13 downto 0); -- add 4 bits for correct adding
begin
V_uns4 <= resize(unsigned(V),V_uns4'length);
S_uns4 <=
shift_left(V_uns4,4) when V_uns4 <= 10 else
shift_left(V_uns4,3) + shift_left(V_uns4,2) when V_uns4 <= 100 -- "11" >> 2
else shift_left(V_uns4,2) + V_uns4; --"101" >> 4
S <= std_logic_vector(resize(shift_right(S_uns4,4),S'length));
end architecture;
This method will likely require the fewest resources in synthesis. But it does require low-level optimizations, which take additional design effort.
Testbench
Here's how I tested my code
entity comp1_tb is end entity;
library ieee;
architecture tb of comp1_tb is
use ieee.std_logic_1164.all;
signal V,S : std_logic_vector(9 downto 0);
use ieee.numeric_std.all;
signal V_int, S_int : integer range 0 to 1000;
begin
DUT: entity work.comp1
port map(
V => V,
S => S);
V <= std_logic_vector(to_unsigned(V_int, V'length));
S_int <= to_integer(unsigned(S));
V_stim : process begin
V_int <= 1;
wait for 1 ns;
assert (S_int = 1) report "S should be 1*1 = 1;" severity warning;
V_int <= 10;
wait for 1 ns;
assert (S_int = 10) report "S should be 10*1 = 10;" severity warning;
V_int <= 100;
wait for 1 ns;
assert (S_int = 75) report "S should be 100*0.75 = 75;" severity warning;
V_int <= 1000;
wait for 1 ns;
assert (S_int = 312 OR S_int = 313) report "S should be 1000*0.3125 = 312 or 313;" severity warning;
wait;
end process;
end architecture;
Add the V signal to the process sensitivity list.
Use shifting and addition instead of direct multiplication, as Brian said.

VHDL Data Flow description of Gray Code Incrementer

I am trying to write the VHDL code for a Gray Code incrementer using the Data Flow description style. I do not understand how to translate the for loop I used in the behavioral description into the Data Flow description. Any suggestion?
This is my working code in behavioral description
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;
entity graycode is
Generic (N: integer := 4);
Port ( gcode : in STD_LOGIC_VECTOR (N-1 downto 0);
nextgcode : out STD_LOGIC_VECTOR (N-1 downto 0));
end graycode;
architecture Behavioral of graycode is
begin
process(gcode)
variable bcode : STD_LOGIC_VECTOR(N-1 downto 0);
variable int_bcode : integer;
begin
for i in gcode'range loop
if(i < gcode'length - 1) then
bcode(i) := gcode(i) XOR bcode(i+1);
else
bcode(i) := gcode(i);
end if;
end loop;
int_bcode := to_integer(unsigned(bcode));
int_bcode := int_bcode + 1;
bcode := std_logic_vector(to_unsigned(int_bcode, N));
for i in gcode'range loop
if(i < gcode'length - 1) then
nextgcode(i) <= bcode(i) XOR bcode(i+1);
else
nextgcode(i) <= bcode(i);
end if;
end loop;
end process;
end Behavioral;
'Dataflow' means 'like it would look in a circuit diagram'. In other words, the flow of data through a real circuit, rather than a high-level algorithmic description. So, unroll your loops and see what you've actually described. Start with N=2, and draw out your unrolled circuit. You should get a 2-bit input bus with an xor gate in it, followed by a 2-bit (combinatorial) incrementer, followed by a 2-bit output bus with another xor gate in it. Done, for N=2.
Your problem now is to generalise N. One obvious way to do this is to put your basic N=2 circuit in a generate loop (yes, this is dataflow, since it just duplicates hardware), and extend it. Ask in another question if you can't do this.
BTW, your integer incrementer is clunky - you should be incrementing an unsigned bcode directly.
Dataflow means constructed of concurrent statements using signals.
That means using generate statements instead of loops. The if statement can be an if generate statement with an else in -2008 or, for earlier revisions of the VHDL standard, two if generate statements whose conditions give opposite boolean results for the same value.
It's easier to just promote the exception assignments to their own concurrent signal assignments:
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity graycode is
generic (N: natural := 4); -- CHANGED negative numbers wont be interesting
port (
gcode: in std_logic_vector (N - 1 downto 0);
nextgcode: out std_logic_vector (N - 1 downto 0)
);
end entity graycode;
architecture dataflow of graycode is
signal int_bcode: std_logic_vector (N - 1 downto 0); -- ADDED
signal bcode: std_logic_vector (N - 1 downto 0); -- ADDED
begin
int_bcode(N - 1) <= gcode (N - 1);
TO_BIN:
for i in N - 2 downto 0 generate
int_bcode(i) <= gcode(i) xor int_bcode(i + 1);
end generate;
bcode <= std_logic_vector(unsigned(int_bcode) + 1);
nextgcode(N - 1) <= bcode(N - 1);
TO_GRAY:
for i in N - 2 downto 0 generate
nextgcode(i) <= bcode(i) xor bcode(i + 1);
end generate;
end architecture dataflow;
Each iteration of a for generate scheme elaborates a block statement with an implicit label formed from the generate statement label and the string image of i.
In each of these blocks there's a declaration for the iterated value of i, and any concurrent statements are elaborated into those blocks.
The visibility rules tell us that any names not declared in the block statement that are visible in the enclosing declarative region are visible within the block.
This means the concurrent statements in the block are equivalent to concurrent statements in the architecture body with i replaced by its literal value.
The concurrent statements in the generate statements and architecture body give us a dataflow representation.
And with a testbench:
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity graycode_tb is
end entity;
architecture foo of graycode_tb is
constant N: natural := 4;
signal gcode: std_logic_vector (N - 1 downto 0);
signal nextgcode: std_logic_vector (N - 1 downto 0);
signal bcode: std_logic_vector (N - 1 downto 0);
begin
DUT:
entity work.graycode
generic map ( N => N)
port map (
gcode => gcode,
nextgcode => nextgcode
);
STIMULi:
process
variable gv: std_logic_vector (N - 1 downto 0);
variable bv: std_logic_vector (N - 1 downto 0);
begin
wait for 10 ns;
for i in 0 to 2 ** N - 1 loop
bv := std_logic_vector(to_unsigned( i, bv'length));
gv(N - 1) := bv (N - 1);
for i in N - 2 downto 0 loop
gv(i) := bv(i) xor bv(i + 1);
end loop;
gcode <= gv;
bcode <= bv;
wait for 10 ns;
end loop;
wait;
end process;
end architecture;
We can see the effects of incrementing int_bcode:
[simulation waveform not shown]
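For comparison, the variant with two if generate statements (the pre-2008 form mentioned in this answer) could look roughly like the sketch below. Only the Gray-to-binary half is shown; the entity name gray_to_bin, the internal signal b, and the labels MSB and LOWER are hypothetical names chosen for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper: Gray code to binary, dataflow style.
entity gray_to_bin is
  generic (N : natural := 4);
  port (
    gcode : in  std_logic_vector(N - 1 downto 0);
    bcode : out std_logic_vector(N - 1 downto 0)
  );
end entity gray_to_bin;

architecture dataflow of gray_to_bin is
  -- Internal copy so the result can be read back (pre-2008 friendly).
  signal b : std_logic_vector(N - 1 downto 0);
begin
  TO_BIN:
  for i in N - 1 downto 0 generate
    -- Two if generate statements with opposite conditions
    -- take the place of if/else.
    MSB: if i = N - 1 generate
      b(i) <= gcode(i);
    end generate MSB;
    LOWER: if i /= N - 1 generate
      b(i) <= gcode(i) xor b(i + 1);
    end generate LOWER;
  end generate TO_BIN;

  bcode <= b;
end architecture dataflow;
```

Because the LOWER branch is not elaborated for i = N - 1, the reference to b(i + 1) never goes out of range.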

VHDL multiplier whose output has the same size as its inputs

I'm using VHDL to describe a 32-bit multiplier for a system to be implemented on a Xilinx FPGA. I found on the web that, as a rule of thumb, if the inputs are N bits wide, the output must be 2*N bits wide. I'm using it in a feedback system; is it possible to have a multiplier whose output is the same size as its inputs?
I swear I once saw an FPGA application whose VHDL code had adder and multiplier blocks wired with signals of the same size. The person who wrote the code told me that you just have to put the result of the product in a 64-bit signal, and the output then has to take the most significant 32 bits of the result (which were not necessarily the most significant 32 bits of the 64-bit signal).
At the time I built a system (which apparently works) using the following code:
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity Multiplier32Bits is
port(
CLK: in std_logic;
A,B: in std_logic_vector(31 downto 0);
R: out std_logic_vector(31 downto 0)
);
end Multiplier32Bits;
architecture Behavioral of Multiplier32Bits is
signal next_state: std_logic_vector(63 downto 0);
signal state: std_logic_vector(31 downto 0);
begin
Sequential: process(CLK,state,next_state)
begin
if CLK'event and CLK = '1' then
state <= next_state(61 downto 30);
else
state <= state;
end if;
end process Sequential;
--Combinational part
next_state <= std_logic_vector(signed(A)*signed(B));
--Output assigment
R <= state;
end Behavioral;
I thought it was working, since at the time I simulated the block with the Active-HDL FPGA simulator, but now I'm simulating the whole 32-bit system using iSim from the Xilinx ISE Design Suite. I found that my output differs greatly from the real product of the A and B inputs, and I don't know whether that's just the accuracy lost by dropping 32 bits or whether my code is simply bad.
Your code has some problems:
next_state and state don't belong in the sensitivity list.
The expression CLK'event and CLK = '1' should be replaced by rising_edge(CLK).
state <= state; has no effect and causes some tools like ISE to misread the pattern. Remove it.
Putting spaces around operators doesn't hurt, but improves readability.
Why do you expect the result of a * b in bits 30 to 61 instead of 0 to 31?
state and next_state don't represent states of a state machine; state is just a register.
Improved code:
architecture Behavioral of Multiplier32Bits is
signal next_state: std_logic_vector(63 downto 0);
signal state: std_logic_vector(31 downto 0);
begin
Sequential: process(CLK)
begin
if rising_edge(CLK) then
state <= next_state(31 downto 0);
end if;
end process Sequential;
--Combinational part
next_state <= std_logic_vector(signed(A) * signed(B));
--Output assigment
R <= state;
end architecture Behavioral;
I totally agree with everything that Paebbels wrote. But I will explain the number of bits in the result.
I will explain it with examples in base 10.
9 * 9 = 81 (two 1 digit numbers gives maximum of 2 digits)
99 * 99 = 9801 (two 2 digit numbers gives maximum of 4 digits)
999 * 999 = 998001 (two 3 digit numbers gives maximum of 6 digits)
9999 * 9999 = 99980001 (4 digits -> 8 digits)
And so on... It is exactly the same in binary. That's why the output is 2*N bits wide for N-bit inputs.
But if your numbers are smaller, then the result will fit in the same number of digits as the factors:
3 * 3 = 9
10 * 9 = 90
100 * 99 = 990
And so on. So if your numbers are small enough, the result will fit in 32 bits. Of course, as Paebbels already wrote, the result will be in the least significant part of the signal.
And as J.H.Bonarius already pointed out, if your inputs are not integers but fixed-point numbers, then you have to do post-shifting. If this is your case, say so in a comment, and I will explain what to do.
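The post-shifting for the fixed-point case could be sketched as below. This is an assumption-laden illustration, not the asker's confirmed format: it assumes both inputs are signed fixed-point numbers with FRAC fractional bits, and FRAC = 30 is chosen only because the question's code slices bits 61 downto 30. The entity name fixmul32 and the generic FRAC are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical fixed-point multiplier: inputs and output share one format.
entity fixmul32 is
  generic (FRAC : integer range 0 to 32 := 30);  -- assumed fractional bits
  port (
    CLK  : in  std_logic;
    A, B : in  std_logic_vector(31 downto 0);
    R    : out std_logic_vector(31 downto 0)
  );
end entity fixmul32;

architecture rtl of fixmul32 is
  signal product : signed(63 downto 0);
begin
  -- The full product has 2*FRAC fractional bits.
  product <= signed(A) * signed(B);

  process (CLK)
  begin
    if rising_edge(CLK) then
      -- Dropping the lowest FRAC bits restores FRAC fractional bits.
      -- The upper bits above FRAC + 31 are discarded, so the result
      -- overflows silently if the product does not fit.
      R <= std_logic_vector(product(FRAC + 31 downto FRAC));
    end if;
  end process;
end architecture rtl;
```

With FRAC = 30, x"40000000" represents 1.0 and x"20000000" represents 0.5, and their product comes out as x"20000000" (0.5) one clock later. If the inputs are plain integers, FRAC = 0 selects bits 31 downto 0, as in the improved code above.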

Vivado synthesis: complex assignment not supported

I implemented a modified Booth multiplier in VHDL. I need to synthesize it with Vivado, but synthesis fails with this error:
"complex assignment not supported".
This is the shifter code that causes the error:
entity shift_register is
generic (
N : integer := 6;
M : integer := 6
);
port (
en_s : in std_logic;
cod_result : in std_logic_vector (N+M-1 downto 0);
position : in integer;
shift_result : out std_logic_vector(N+M-1 downto 0)
);
end shift_register;
architecture shift_arch of shift_register is
begin
process(en_s)
variable shift_aux : std_logic_vector(N+M-1 downto 0);
variable i : integer := 0; -- just for convenience
begin
if(en_s'event and en_s ='1') then
i := position;
shift_aux := (others => '0');
shift_aux(N+M-1 downto i) := cod_result(N+M-1-i downto 0); --ERROR!!
shift_result <= shift_aux ;
end if;
end process;
end shift_arch;
The Booth multiplier works with any operand size, so I cannot replace this generic code with a specific one.
Please help me! Thanks a lot.
There's a way to make your index addressing static for synthesis.
First, based on the loop we can tell position must have a value within the range of shift_aux, otherwise you'd end up with null slices (IEEE Std 1076-2008 8.5 Slice names).
That can be shown in the entity declaration:
library ieee;
use ieee.std_logic_1164.all;
entity shift_register is
generic (
N: integer := 6;
M: integer := 6
);
port (
en_s: in std_logic;
cod_result: in std_logic_vector (N + M - 1 downto 0);
position: in integer range 0 to N + M - 1 ; -- range ADDED
shift_result: out std_logic_vector(N + M - 1 downto 0)
);
end entity shift_register;
What's changed is the addition of a range constraint to the port declaration of position. The idea is to support simulation, where the default value of an integer is integer'left. Simulating your shift_register would fail on the rising edge of en_s if position (the actual driver) did not provide an initial value in the index range of shift_aux.
From a synthesis perspective, an unbounded integer requires you to take both positive and negative integer values into account. Your for loop is only using positive integer values.
The same can be done in the declaration of the variable i in the process:
variable i: integer range 0 to N + M - 1 := 0; -- range ADDED
To address the immediate synthesis problem we look at the for loop.
Xilinx support issue AR# 52302 tells us the issue is using dynamic values for indexes.
The solution is to modify what the for loop does:
architecture shift_loop of shift_register is
begin
process (en_s)
variable shift_aux: std_logic_vector(N + M - 1 downto 0);
-- variable i: integer range 0 to N + M - 1 := 0; -- range ADDED
begin
if en_s'event and en_s = '1' then
-- i := position;
shift_aux := (others => '0');
for i in 0 to N + M - 1 loop
-- shift_aux(N + M - 1 downto i) := cod_result(N + M - 1 - i downto 0);
if i = position then
shift_aux(N + M - 1 downto i)
:= cod_result(N + M - 1 - i downto 0);
end if;
end loop;
shift_result <= shift_aux;
end if;
end process;
end architecture shift_loop;
If i becomes a static value when the loop is unrolled in synthesis it can be used in calculation of indexes.
Note this gives us an N + M input multiplexer where each input is selected when i = position.
This construct can actually be collapsed into a barrel shifter by optimization, although for large values of N and M the number of variables involved might take a prohibitive synthesis effort or simply fail.
When synthesis is successful you'll collapse each output element in the assignment into a separate multiplexer that will match Patrick's barrel shifter.
For sufficiently large values of N and M we can define the depth in number of multiplexer layers in the barrel shifter based on the number of bits in a binary expression of the integer range of distance.
That either requires a declared integer type or subtype for position or finding the log2 value of N + M. We can use the log2 value because it would only be used statically. (XST supports log2(x) where x is a Real for determining static values; the function is found in IEEE package math_real.) This gives us the binary length of position (how many bits are required to describe the shift distance, i.e. the number of levels of multiplexers).
architecture barrel_shifter of shift_register is
begin
process (en_s)
use ieee.math_real.all; -- log2 [real return real]
use ieee.numeric_std.all; -- to_unsigned, unsigned
constant DISTLEN: natural := integer(ceil(log2(real(N + M)))); -- binary length, rounded up
type muxv is array (0 to DISTLEN - 1) of
unsigned (N + M - 1 downto 0);
variable shft_aux: muxv;
variable distance: unsigned (DISTLEN - 1 downto 0);
begin
if en_s'event and en_s = '1' then
distance := to_unsigned(position, DISTLEN); -- position in binary
shft_aux := (others => (others =>'0'));
for i in 0 to DISTLEN - 1 loop
if i = 0 then
if distance(i) = '1' then
shft_aux(i) := SHIFT_LEFT(unsigned(cod_result), 2 ** i);
else
shft_aux(i) := unsigned(cod_result);
end if;
else
if distance(i) = '1' then
shft_aux(i) := SHIFT_LEFT(shft_aux(i - 1), 2 ** i);
else
shft_aux(i) := shft_aux(i - 1);
end if;
end if;
end loop;
shift_result <= std_logic_vector(shft_aux(DISTLEN - 1));
end if;
end process;
end architecture barrel_shifter;
XST also supports ** if the left operand is 2 and the value of i is treated as a constant in the sequence of statements found in a loop statement.
This could be implemented with signals instead of variables or structurally in a generate statement instead of a loop statement inside a process, or even as a subprogram.
The basic idea here with these two architectures derived from yours is to produce something synthesis eligible.
The advantage of the second architecture over the first is in reduction in the amount of synthesis effort during optimization for larger values of N + M.
Neither of these architectures have been verified lacking a testbench in the original. They both analyze and elaborate.
Writing a simple case testbench:
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity shift_register_tb is
end entity;
architecture foo of shift_register_tb is
constant N: integer := 6;
constant M: integer := 6;
signal clk: std_logic := '0';
signal din: std_logic_vector (N + M - 1 downto 0)
:= (0 => '1', others => '0');
signal dout: std_logic_vector (N + M - 1 downto 0);
signal dist: integer := 0;
begin
DUT:
entity work.shift_register
generic map (
N => N,
M => M
)
port map (
en_s => clk,
cod_result => din,
position => dist,
shift_result => dout
);
CLOCK:
process
begin
wait for 10 ns;
clk <= not clk;
if now > (N + M + 2) * 20 ns then
wait;
end if;
end process;
STIMULI:
process
begin
for i in 1 to N + M loop
wait for 20 ns;
dist <= i;
din <= std_logic_vector(SHIFT_LEFT(unsigned(din),1));
end loop;
wait;
end process;
end architecture;
And simulating reveals that the range of position and the number of loop iterations only needs to cover the number of bits in the multiplier and not the multiplicand. We don't need a full barrel shifter.
That can be easily fixed in both shift_register architectures and has the side effect of making the shift_loop architecture much more attractive, it would be easier to synthesize based on the multiplier bit length (presumably M) and not the product bit length (N+ M).
And that would give you:
library ieee;
use ieee.std_logic_1164.all;
entity shift_register is
generic (
N: integer := 6;
M: integer := 6
);
port (
en_s: in std_logic;
cod_result: in std_logic_vector (N + M - 1 downto 0);
position: in integer range 0 to M - 1 ; -- range ADDED
shift_result: out std_logic_vector(N + M - 1 downto 0)
);
end entity shift_register;
architecture shift_loop of shift_register is
begin
process (en_s)
variable shift_aux: std_logic_vector(N + M - 1 downto 0);
-- variable i: integer range 0 to M - 1 := 0; -- range ADDED
begin
if en_s'event and en_s = '1' then
-- i := position;
shift_aux := (others => '0');
for i in 0 to M - 1 loop
-- shift_aux(N + M - 1 downto i) := cod_result(N + M - 1 - i downto 0);
if i = position then -- This creates an N + M - 1 input MUX
shift_aux(N + M - 1 downto i)
:= cod_result(N + M - 1 - i downto 0);
end if;
end loop; -- The loop is unrolled in synthesis, i is CONSTANT
shift_result <= shift_aux;
end if;
end process;
end architecture shift_loop;
Modifying the testbench:
STIMULI:
process
begin
for i in 1 to M loop -- WAS N + M loop
wait for 20 ns;
dist <= i;
din <= std_logic_vector(SHIFT_LEFT(unsigned(din),1));
end loop;
wait;
end process;
gives a result showing that the shifts cover the range of the multiplier value (specified by M):
[simulation waveform not shown]
So the moral here is you don't need a full barrel shifter, only one that works over the multiplier range and not the product range.
The last bit of code should be synthesis eligible.
You are trying to create a range using a value that varies at run time, and this is not supported by the synthesis tool. cod_result(N+M-1 downto 0) would be supported, because N, M, and 1 are all known at synthesis time.
If you're trying to implement a multiplier, you will get the best result using x <= a * b, and letting the synthesis tool choose the best way to implement it. If you have operands wider than the multiplier widths in your device, then you need to look at the documentation to determine the best route, which will normally involve pipelining of some sort.
If you need a run-time variable shift, look for a 'Barrel Shifter'. There are existing answers on these, for example this one.
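As a minimal sketch of a run-time variable shift, the shift_left function from ieee.numeric_std takes the shift count as an ordinary natural value, so no run-time dependent slice is needed. The entity name var_shift and the port name clk are hypothetical, and the range of position follows the reduced range (0 to M - 1) discussed in the other answer. Whether a particular Vivado version infers an efficient barrel shifter from this has not been verified here; treat it as an assumption to check in your own flow.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical registered left shifter with a run-time shift distance.
entity var_shift is
  generic (
    N : integer := 6;
    M : integer := 6
  );
  port (
    clk          : in  std_logic;
    cod_result   : in  std_logic_vector(N + M - 1 downto 0);
    position     : in  integer range 0 to M - 1;
    shift_result : out std_logic_vector(N + M - 1 downto 0)
  );
end entity var_shift;

architecture rtl of var_shift is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- shift_left fills the vacated low bits with '0' and drops
      -- the bits shifted out at the top; the width never changes.
      shift_result <=
        std_logic_vector(shift_left(unsigned(cod_result), position));
    end if;
  end process;
end architecture rtl;
```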

Multiplication with Fixed point representation in VHDL

For fixed-point arithmetic I represented 0.166 as 0000 0010101010100110 and multiplied it by itself. For this I wrote the VHDL code below. The output is assigned to y, which is a 41-bit signed value, since for signed multiplication A(a1,b1)*A(a2,b2)=A(a1+a2+1,b1+b2). However, during simulation it gives the error:
Target Size 41 and source size 40 for array dimension 0 does not match.
code:
entity file1 is
Port ( y : out signed(40 downto 0));
end file1;
architecture Behavioral of file1 is
signal a : signed(19 downto 0) := "00000010101010100110";
signal b : signed(19 downto 0) := "00000010101010100110";
begin
y<= (a*b); ----error
end Behavioral;
The result of multiplying 19+1 bits by 19+1 bits is 39+1 bits, while your port is 40+1 bits long. For example, let's multiply the maximum possible 19-bit values: 0x7FFFF * 0x7FFFF = 0x3FFFF00001 - so it's 39 bits (19 + 19 + carry) for the unsigned result and +1 bit for the sign.
So you should either "normalize" the result by extending it by 1 more bit, which should be equal to the sign of the result (bit#40 = bit#39), or just choose a 40-bit port as the output:
Port ( y : out signed(39 downto 0))
If you really need redundant 41st bit:
begin
y(39 downto 0) <= a * b;
y(40) <= y(39); -- sign extension; reading the out port y needs VHDL-2008
end Behavioral;
Or just use the resize function for signed values: How to convert 8 bits to 16 bits in VHDL?
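A sketch of the resize variant, keeping the 41-bit port from the question. The entity name file1_resize is hypothetical; resize comes from ieee.numeric_std and sign-extends signed values when the target is wider.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical variant of file1 that keeps the 41-bit output.
entity file1_resize is
  port (y : out signed(40 downto 0));
end entity file1_resize;

architecture Behavioral of file1_resize is
  signal a : signed(19 downto 0) := "00000010101010100110";
  signal b : signed(19 downto 0) := "00000010101010100110";
begin
  -- a * b is 40 bits wide; resize sign-extends it to y'length (41).
  y <= resize(a * b, y'length);
end architecture Behavioral;
```

This avoids reading the out port, so it does not depend on VHDL-2008.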
