Or Reduce An Array of Vectors - vhdl

This needs to run on a real board, so it has to be synthesizable.
I'm using an older VHDL revision, with these libraries included:
library IEEE;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use ieee.std_logic_misc.all;
Some signals:
type my_array is array (N-1 downto 0) of std_logic_vector(31 downto 0);
signal enable : my_array;
signal ored_enable: std_logic_vector(31 downto 0);
Signals get joined up in a generator:
my_gen: for i in 0 to (N-1) generate
    woah: entity work.my_entity
        port map(
            clk => clk,
            enable => enable(i)
        );
end generate;
ored_enable <= or_reduce(enable); -- this fails
I'm just trying to create a std_logic_vector which holds the ored signals from the array. Any ideas how I can simply achieve this?

First, I expect your last line to be a typo and read
ored_enable <= or_reduce(enable);
But this wouldn't work, since or_reduce is only defined for std_logic_vector, not for an array of std_logic_vector. You can create your own reduce function:
function or_reduce(a : my_array) return std_logic_vector is
    variable ret : std_logic_vector(31 downto 0) := (others => '0');
begin
    -- OR every element of the array into the accumulator
    for i in a'range loop
        ret := ret or a(i);
    end loop;
    return ret;
end function or_reduce;
Put it in your architecture's declarative part, after the declaration of my_array, and it should work.
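To see the function in context, here is a minimal self-checking sketch. The entity name or_reduce_demo, the depth N = 3 and the test values are assumptions made purely for illustration; only my_array, enable, ored_enable and the function body come from the thread.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity or_reduce_demo is
end entity or_reduce_demo;

architecture sim of or_reduce_demo is
  constant N : positive := 3;  -- assumed array depth, for illustration only
  type my_array is array (N-1 downto 0) of std_logic_vector(31 downto 0);

  -- Bitwise OR of all elements of the array.
  function or_reduce(a : my_array) return std_logic_vector is
    variable ret : std_logic_vector(31 downto 0) := (others => '0');
  begin
    for i in a'range loop
      ret := ret or a(i);
    end loop;
    return ret;
  end function or_reduce;

  signal enable      : my_array := (others => (others => '0'));
  signal ored_enable : std_logic_vector(31 downto 0);
begin
  ored_enable <= or_reduce(enable);

  check : process
  begin
    enable(0) <= x"00000001";
    enable(1) <= x"00000010";
    enable(2) <= x"80000000";
    wait for 10 ns;
    -- Each set bit of any element must show up in the result.
    assert ored_enable = x"80000011"
      report "unexpected OR result" severity failure;
    wait;
  end process check;
end architecture sim;
```

In a synthesizable design only the type, the function and the concurrent assignment would be kept; the check process is simulation-only.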


VHDL: big slv array slicing indexed by integer (big mux)

I want to slice a std_logic_vector in VHDL obtaining parts of it of fixed dimensions.
The general problem is:
din N*M bits
dout M bits
sel clog2(N) bits
Expected behaviour in an example (pseudocode): a 16-bit input, sliced into 4 subvectors of 4 bits each.
signal din: std_logic_vector(N*M-1 downto 0);
signal sel: integer;
-- with sel = 0
output <= din(M-1:0);
-- with sel = 1
output <= din(2M-1:M);
-- with sel = 2
output <= din(3M-1:2M);
-- with sel = N-1
output <= din(N*M-1:(N-1)M);
I know a couple of ways to do this, but I don't know which one is best practice and gives the best results in synthesis.
the entity
din: in std_logic_vector(15 downto 0);
dout: out std_logic_vector(3 downto 0);
sel: in std_logic_vector(1 downto 0)
case sel is
    when "00" => dout <= din(3 downto 0);
    when "01" => dout <= din(7 downto 4);
    when "10" => dout <= din(11 downto 8);
    when "11" => dout <= din(15 downto 12);
    when others => ....
end case;
It clearly implements a mux, but it's not generic at all, and if the input gets big it's really hard to write and to reach full code coverage.
sel_int <= to_integer(unsigned(sel));
dout <= din(4*(sel_int+1) - 1 downto 4*sel_int);
Extremely easy to write and to maintain, BUT it can have problems when the number of slices is not a power of 2. For example, if I want to slice a 24-bit vector into chunks of 4, what happens when the integer conversion of sel yields index 7?
sel_int <= to_integer(unsigned(sel));
gen_slices: for i in 0 to 3 generate
    din_slice(i) <= din(4*(i+1)-1 downto 4*i);
end generate;
dout <= din_slice(sel_int);
I'm looking for a solution that is general enough to be used with various input/output relationships and safe enough to synthesize consistently every time.
The case statement is the only one with an others branch (which feels really safe); the other solutions rely on the slv-to-integer conversion and indexing, which feels comfortable but not so reliable.
Which solution would you use?
Practical use case:
I have a 250-bit std_logic_vector and I need to select 10 contiguous bits inside of it, starting from a given position between 0 and 239. How can I do that in a way that is good for synthesis?
There is another option that is accepted by tools that support VHDL-2008 (which includes Vivado and Quartus Prime Pro). You can use an unconstrained array-of-vectors type from a package:
type slv_array_t is array(natural range <>) of std_logic_vector; --vhdl 2008 unconstrained array type
Then you can simply select the element you want, and it is as generic as you like.
library ieee;
use ieee.std_logic_1164.all;
use work.my_pkg.all;
entity mux is
    generic (
        N : natural;
        M : natural
    );
    port (
        sel : in natural;
        ip : in slv_array_t (N-1 downto 0)(M-1 downto 0);
        op : out std_logic_vector (M-1 downto 0)
    );
end entity;
architecture rtl of mux is
begin
    op <= ip(sel);
end architecture;
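For a flat input vector like the one in the question, the array still has to be filled from slices. The following is a sketch only, assuming VHDL-2008 support: the package body shown for my_pkg is a guess at its contents, and flat_mux, din, dout and the default N = M = 4 are hypothetical names and values. Constraining sel to 0 to N-1 makes an out-of-range select a simulation error rather than a silent wrap.

```vhdl
-- Assumed contents of my_pkg (VHDL-2008).
library ieee;
use ieee.std_logic_1164.all;

package my_pkg is
  type slv_array_t is array (natural range <>) of std_logic_vector;
end package my_pkg;

library ieee;
use ieee.std_logic_1164.all;
use work.my_pkg.all;

-- Hypothetical wrapper: splits a flat N*M-bit vector into N slices
-- of M bits each and selects one of them.
entity flat_mux is
  generic (
    N : positive := 4;
    M : positive := 4
  );
  port (
    din  : in  std_logic_vector(N*M-1 downto 0);
    sel  : in  natural range 0 to N-1;
    dout : out std_logic_vector(M-1 downto 0)
  );
end entity flat_mux;

architecture rtl of flat_mux is
  signal slices : slv_array_t(N-1 downto 0)(M-1 downto 0);
begin
  -- Pure rewiring: slice i holds bits (i+1)*M-1 downto i*M of din.
  gen_slices : for i in 0 to N-1 generate
    slices(i) <= din((i+1)*M-1 downto i*M);
  end generate gen_slices;

  dout <= slices(sel);
end architecture rtl;
```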
First you must extend the incoming data so that you always have as many bits as you need to connect all multiplexer inputs (see the code below, process p_extend).
This will not create any logic at synthesis.
Second you must convert the resulting vector into an array, which you can later access by an index (see the code below, process p_create_array).
Again, this will not create any logic at synthesis.
Finally you must access this array with the select input signal (see the code below, p_mux).
library ieee;
use ieee.std_logic_1164.all;
entity mux is
    generic (
        g_data_width : natural := 250;
        g_slice_width : natural := 10;
        g_sel_width : natural := 5;
        g_start_point : natural := 27
    );
    port (
        d_i : in std_logic_vector(g_data_width-1 downto 0);
        sel_i : in std_logic_vector(g_sel_width-1 downto 0);
        d_o : out std_logic_vector(g_slice_width-1 downto 0)
    );
end entity mux;
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
architecture struct of mux is
    signal data : std_logic_vector(g_slice_width * 2**g_sel_width-1 downto 0);
    type t_std_logic_slice_array is array (natural range <>) of std_logic_vector(g_slice_width-1 downto 0);
    signal mux_in : t_std_logic_slice_array (2**g_sel_width-1 downto 0);
begin
    -- Pad the input with '0' so that every mux input is driven
    p_extend: process(d_i)
    begin
        for i in 0 to g_slice_width * 2**g_sel_width-1 loop
            if i+g_start_point<g_data_width then
                data(i) <= d_i(i+g_start_point);
            else
                data(i) <= '0';
            end if;
        end loop;
    end process;
    -- Regroup the padded vector into an array of slices
    p_create_array: process (data)
    begin
        for i in 0 to 2**g_sel_width-1 loop
            mux_in(i) <= data((i+1)*g_slice_width-1 downto i*g_slice_width);
        end loop;
    end process;
    p_mux: d_o <= mux_in(to_integer(unsigned(sel_i)));
end architecture;
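For the practical use case in the question (10 contiguous bits starting at an arbitrary bit position), one possible variation is to index bit by bit with a range guard. This is a sketch under assumptions, not a tuned implementation: the entity bit_slice and its generic and port names are hypothetical, and reading '0' for positions past the end of din is an arbitrary choice.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity bit_slice is
  generic (
    G_DATA_WIDTH  : positive := 250;
    G_SLICE_WIDTH : positive := 10;
    G_SEL_WIDTH   : positive := 8   -- enough to address 0 to 239
  );
  port (
    din   : in  std_logic_vector(G_DATA_WIDTH-1 downto 0);
    start : in  std_logic_vector(G_SEL_WIDTH-1 downto 0);
    dout  : out std_logic_vector(G_SLICE_WIDTH-1 downto 0)
  );
end entity bit_slice;

architecture rtl of bit_slice is
begin
  process (din, start)
    variable idx : natural;
  begin
    for i in dout'range loop
      idx := to_integer(unsigned(start)) + i;
      -- Guard the index so an out-of-range start never indexes
      -- outside din; those output bits are forced to '0'.
      if idx < G_DATA_WIDTH then
        dout(i) <= din(idx);
      else
        dout(i) <= '0';
      end if;
    end loop;
  end process;
end architecture rtl;
```

The guard answers the worry about non-power-of-2 sizes: every value of start produces a defined output, so simulation and synthesis cannot disagree on the unused codes.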

TestBench for Bitwise Operators

Can someone help me to create a TestBench Program for the below Program, please?
library ieee;
use ieee.std_logic_1164.all;
entity bitwise is
port( a,b : in std_logic_vector(4 downto 0);
result1, result2, result3, result4, result5, result6 : out std_logic_vector(4 downto 0));
end bitwise;
architecture arch of bitwise is
begin
    result1 <= a and b;
    result2 <= a or b;
    result3 <= a xor b;
    result4 <= not a;
    result5 <= to_stdlogicvector(to_bitvector(a) sll 1);
    result6 <= to_stdlogicvector(to_bitvector(a) srl 1);
end arch;
My test bench is below. I am stuck on the stimulus process, where we have to test each and every possibility. It could be either a loop version or just testing possible numbers for each operator.
library ieee;
use ieee.std_logic_1164.all;
entity test_bitwise is
end test_bitwise;
architecture behavior of test_bitwise is
    signal tb_a : std_logic_vector(4 downto 0) := (others => '0');
    signal tb_b : std_logic_vector(4 downto 0) := (others => '0');
    signal tb_result1 : std_logic_vector(4 downto 0);
    signal tb_result2 : std_logic_vector(4 downto 0);
    signal tb_result3 : std_logic_vector(4 downto 0);
    signal tb_result4 : std_logic_vector(4 downto 0);
    signal tb_result5 : std_logic_vector(4 downto 0);
    signal tb_result6 : std_logic_vector(4 downto 0);
begin
    U1_Test : entity work.bitwise(arch)
        port map (a => tb_a,
                  b => tb_b,
                  result1 => tb_result1,
                  result2 => tb_result2,
                  result3 => tb_result3,
                  result4 => tb_result4,
                  result5 => tb_result5,
                  result6 => tb_result6);
    stim_proc : process
    begin
        wait; -- stimulus still to be written
    end process;
end behavior;
As others have stated in the comments, you should show some effort of your own. What have you tried, and why didn't it succeed? If you're having a hard time working out what to try and how to start, you could begin with the following. If you don't succeed, you can then edit your question or post a new one so that other members can help you.
Use a for loop to iterate over each and every possibility. Writing all the possible values to test by hand would be exhausting.
Because you have two inputs, use two nested for loops inside your process. One iterates over the values for input a and the other over the values for b.
Inside the loops, assign values to your signals tb_a and tb_b. The loop indices are integers, so you have to convert them to std_logic_vector before assigning.
Add some delay after each iteration with wait.
Print the output values, for example to the simulator console with report, or even use an assert statement.
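The steps above can be sketched as follows. This assumes the bitwise entity from the question (architecture arch) has been compiled into library work; the test bench name tb_bitwise, the result signal names r1 to r6 and the 10 ns delay are arbitrary choices made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity tb_bitwise is
end entity tb_bitwise;

architecture sim of tb_bitwise is
  signal tb_a, tb_b : std_logic_vector(4 downto 0) := (others => '0');
  signal r1, r2, r3, r4, r5, r6 : std_logic_vector(4 downto 0);
begin
  dut : entity work.bitwise(arch)
    port map (
      a => tb_a, b => tb_b,
      result1 => r1, result2 => r2, result3 => r3,
      result4 => r4, result5 => r5, result6 => r6
    );

  stim_proc : process
  begin
    -- Two nested loops cover all 32 x 32 input combinations.
    for i in 0 to 31 loop
      for j in 0 to 31 loop
        tb_a <= std_logic_vector(to_unsigned(i, tb_a'length));
        tb_b <= std_logic_vector(to_unsigned(j, tb_b'length));
        wait for 10 ns;  -- let the outputs settle before checking
        assert r1 = (tb_a and tb_b) report "and mismatch" severity error;
        assert r2 = (tb_a or tb_b)  report "or mismatch"  severity error;
        assert r3 = (tb_a xor tb_b) report "xor mismatch" severity error;
        assert r4 = (not tb_a)      report "not mismatch" severity error;
        -- Logical shifts by one fill the vacated bit with '0'.
        assert r5 = (tb_a(3 downto 0) & '0') report "sll mismatch" severity error;
        assert r6 = ('0' & tb_a(4 downto 1)) report "srl mismatch" severity error;
      end loop;
    end loop;
    report "all input combinations applied";
    wait;  -- stop the process so the simulation can end
  end process stim_proc;
end architecture sim;
```

Using assert instead of printing every value keeps the console quiet unless something is actually wrong.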

Query on VHDL generics in packages

I have written a simple VHDL code to add two matrices containing 32 bit floating point numbers. The matrix dimensions have been defined in a package. Currently, I specify the matrix dimensions in the vhdl code and use the corresponding type from the package. However, I would like to use generic in the design to deal with matrices of different dimensions. For this I would have to somehow use the right type defined in the package. How do I go about doing this?
My current VHDL code is as below.
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use work.mat_pak.all;
entity newproj is
Port ( clk : in STD_LOGIC;
clr : in STD_LOGIC;
start : in STD_LOGIC;
A_in : in t2;
B_in : in t2;
AplusB : out t2;
parallel_add_done : out STD_LOGIC);
end newproj;
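architecture Behavioral of newproj is
    component add
        port (
            a : IN STD_LOGIC_VECTOR(31 downto 0);
            b : IN STD_LOGIC_VECTOR(31 downto 0);
            clk : IN STD_LOGIC;
            sclr : IN STD_LOGIC;
            ce : IN STD_LOGIC;
            result : OUT STD_LOGIC_VECTOR(31 downto 0);
            rdy : OUT STD_LOGIC
        );
    end component;
    signal temp_out: t2 := (others=>(others=>(others=>'0')));
    signal add_over: t2bit:=(others=>(others=>'0'));
    signal check_all_done,init_val: std_logic:='0';
begin
    init_val <= '1';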
g0: for k in 0 to 1 generate
g1: for m in 0 to 1 generate
add_instx: add port map(A_in(k)(m), B_in(k)(m), clk, clr, start, temp_out(k)(m), add_over(k)(m));
end generate;
end generate;
g2: for k in 0 to 1 generate
g3: for m in 0 to 1 generate
check_all_done <= add_over(k)(m) and init_val;
end generate;
end generate;
process (temp_out, check_all_done)
begin
    AplusB <= temp_out;
    parallel_add_done <= check_all_done;
end process;
end Behavioral;
My package is as below
library IEEE;
use IEEE.STD_LOGIC_1164.all;
package mat_pak is
subtype small_int is integer range 0 to 2;
type t22 is array (0 to 1) of std_logic_vector(31 downto 0);
type t2 is array (0 to 1) of t22; --2*2 matrix
type t22bit is array (0 to 1) of std_logic;
type t2bit is array (0 to 1) of t22bit; --2*2 matrix bit
type t33 is array (0 to 2) of std_logic_vector(31 downto 0);
type t3 is array (0 to 2) of t33; --3*3 matrix
end mat_pak;
Any suggestions would be welcome. Thank you.
There are some logical issues with your design.
First, there's a limit to the number of ports a sub-hierarchy of a design can tolerate; with three 2x2 matrices of 32-bit elements you already have 384 bits of matrix inputs and outputs. Do you really believe this number should be configurable?
At some point it will only fit in very large FPGA devices, and shortly thereafter it won't fit there either.
Suppose the operation in add takes a variable number of clocks, and parallel_add_done signals when an aplusb datum is available, made up of the matrix elements contributed by all instantiated add components. Then the individual rdy signals have to be ANDed together. If the adds all take the same amount of time you could take rdy from any one of them (if your silicon were not that deterministic it would not be usable; there are registers in add).
The nested generate statements all assign to check_all_done the result of the AND between add_over(k)(m) and init_val (which is a synthesis constant of '1'). The effect is to wire-AND the add_over bits together (which doesn't work in VHDL and is likely not achievable in synthesis, either).
Note that I also show the proper indexing method for two-dimensional arrays.
Using Jonathan's method of sizing matrixes:
library ieee;
use ieee.std_logic_1164.all;
package mat_pak is
type matrix is array (natural range <>, natural range <>)
of std_logic_vector(31 downto 0);
type bmatrix is array (natural range <>, natural range <>)
of std_logic;
end package mat_pak;
library ieee;
use ieee.std_logic_1164.all;
use work.mat_pak.all;
entity newproj is
    generic ( size: natural := 2 );
    port (
        clk: in std_logic;
        clr: in std_logic;
        start: in std_logic;
        a_in: in matrix (0 to size - 1, 0 to size - 1);
        b_in: in matrix (0 to size - 1, 0 to size - 1);
        aplusb: out matrix (0 to size - 1, 0 to size - 1);
        parallel_add_done: out std_logic
    );
end entity newproj;
architecture behavioral of newproj is
    component add
        port (
            a: in std_logic_vector(31 downto 0);
            b: in std_logic_vector(31 downto 0);
            clk: in std_logic;
            sclr: in std_logic;
            ce: in std_logic;
            result: out std_logic_vector(31 downto 0);
            rdy: out std_logic
        );
    end component;
    signal temp_out: matrix (0 to size - 1, 0 to size - 1)
                := (others => (others => (others => '0')));
    signal add_over: bmatrix (0 to size - 1, 0 to size - 1)
                := (others => (others => '0'));
begin
    g0: for k in 0 to size - 1 generate
        g1: for m in 0 to size - 1 generate
            add_instx: add
                port map (
                    a => a_in(k,m),
                    b => b_in(k,m),
                    clk => clk,
                    sclr => clr,
                    ce => start,
                    result => temp_out(k,m),
                    rdy => add_over(k,m)
                );
        end generate;
    end generate;
    aplusb <= temp_out;
    -- AND all rdy flags into a single done signal
    process (add_over)
        variable check_all_done: std_logic;
    begin
        check_all_done := '1';
        for k in 0 to size - 1 loop
            for m in 0 to size - 1 loop
                check_all_done := check_all_done and add_over(k,m);
            end loop;
        end loop;
        parallel_add_done <= check_all_done;
    end process;
end architecture behavioral;
We find that we really want to AND the various rdy outputs (the add_over array) together. In VHDL-2008 this can be done with the unary AND; otherwise you're counting on the synthesis tool to flatten the chain of ANDs (and they generally do).
I made the assignment to aplusb a concurrent assignment.
I dummied up an add entity with an empty architecture; the above then analyzes, elaborates and simulates, which shows that none of the connectivity has length mismatches anywhere.
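As a sketch of the VHDL-2008 unary AND mentioned above: the reduction operator is defined for one-dimensional vectors such as std_logic_vector, so the two-dimensional bmatrix would first have to be flattened. The entity all_ready and the names rdy_flat and all_done are hypothetical and not part of the answer's code.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity all_ready is
  generic ( size : positive := 2 );
  port (
    -- One ready flag per matrix element, flattened to one dimension.
    rdy_flat : in  std_logic_vector(size*size - 1 downto 0);
    all_done : out std_logic
  );
end entity all_ready;

architecture rtl of all_ready is
begin
  -- VHDL-2008 unary (reduction) AND: '1' only when every bit is '1'.
  all_done <= and rdy_flat;
end architecture rtl;
```

Inside newproj the same idea would mean declaring a size*size-bit vector and connecting each adder with something like rdy => rdy_flat(k*size + m), which replaces the nested loop in the process.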
I'm not quite sure I understand perfectly, but I'll try to answer anyway ;)
You can use an unconstrained array like this:
library ieee;
use ieee.std_logic_1164.all;
package mat_pak is
    type matrix is array(natural range <>, natural range <>) of std_logic_vector(31 downto 0);
end package mat_pak;
library ieee;
use ieee.std_logic_1164.all;
use work.mat_pak.all;
entity newproj is
Generic ( size : natural );
Port ( clk : in STD_LOGIC;
clr : in STD_LOGIC;
start : in STD_LOGIC;
A_in : in matrix(0 to size-1, 0 to size-1);
B_in : in matrix(0 to size-1, 0 to size-1);
AplusB : out matrix(0 to size-1, 0 to size-1);
parallel_add_done : out STD_LOGIC);
end newproj;

how to check for any carry generated while adding std_logic_vector using operator overloading?

I am trying to add two std_logic_vectors using the notation given below:-
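library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL; -- assumed: provides "+" for std_logic_vector
entity adder is
    port( a:in std_logic_vector(31 downto 0);
          b:in std_logic_vector(31 downto 0);
          o:out std_logic_vector(31 downto 0));
end adder;
architecture Behavioral of adder is
begin
    o <= a + b;
end Behavioral;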
One possibility is to generate the result with carry, and then split that afterwards, like:
architecture Behavioral of adder is
    signal c_o : std_logic_vector(o'length downto 0); -- Result with carry
    signal c : std_logic; -- Carry only
begin
    c_o <= ('0' & a) + b; -- Result with carry; extended with '0' to keep carry
    o <= c_o(o'range); -- Result without carry
    c <= c_o(c_o'left); -- Carry only
end Behavioral;
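You can do this. The carry is not saved, but an overflow is reported.
-- Requires ieee.numeric_std; assumes Add1 and Add2 have the same length
function "+" (Add1: std_logic_vector; Add2: std_logic_vector) return std_logic_vector is
    variable big_sum: unsigned(Add1'LENGTH downto 0);
begin
    -- Add in one extra bit so the carry out is not lost
    big_sum := resize(unsigned(Add1), big_sum'LENGTH) + resize(unsigned(Add2), big_sum'LENGTH);
    assert big_sum(Add1'LENGTH) = '0'
        report "overflow"
        severity warning;
    return std_logic_vector(big_sum(Add1'LENGTH-1 downto 0));
end function "+";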
Of course you'll need to define a new package and also include that package in your already existing file.
Although I suggest you use unsigned/signed on your ports (and have a clock cycle of latency).
If you want the carry:
o_with_carry <= std_logic_vector('0'&unsigned(a)+unsigned(b));
o_carry <= o_with_carry(o_with_carry'high);
o <= o_with_carry(o'range);
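Following the suggestion to put unsigned on the ports, a minimal combinational sketch could look like this. The entity name adder_carry, the WIDTH generic and the port names sum and carry are assumptions made for illustration; no register stage is included.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity adder_carry is
  generic ( WIDTH : positive := 32 );
  port (
    a     : in  unsigned(WIDTH-1 downto 0);
    b     : in  unsigned(WIDTH-1 downto 0);
    sum   : out unsigned(WIDTH-1 downto 0);
    carry : out std_logic
  );
end entity adder_carry;

architecture rtl of adder_carry is
  -- One bit wider than the operands so the carry out is preserved.
  signal wide_sum : unsigned(WIDTH downto 0);
begin
  wide_sum <= resize(a, WIDTH+1) + resize(b, WIDTH+1);
  sum      <= wide_sum(WIDTH-1 downto 0);
  carry    <= wide_sum(WIDTH);
end architecture rtl;
```

With a = x"FFFFFFFF" and b = 1, sum is zero and carry is '1'. Resizing both operands explicitly avoids relying on how "+" treats operands of different lengths.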

Counting down from an input value in VHDL

I'm trying to assign the value of input aa to the signal t in the code below. It compiles successfully, but there is a warning:
WARNING[9]: C:/Modeltech_5.7f/examples/hassan1.vhd(14): (vcom-1013) Initial value of "t" depends on value of signal "aa".
Here is the code:
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all ;
use ieee.numeric_std.all;
entity counter is
port(clk :in std_logic;
reset : in std_logic;
aa: in std_logic_vector(3 downto 0);
check : out std_logic_vector(3 downto 0));
end counter;
architecture imp of counter is
    signal i:std_logic_vector(3 downto 0):="0000";
    signal t:std_logic_vector(3 downto 0):=aa;
begin
    process (clk)
    begin
        if rising_edge(clk) and (t>0) then
        end if;
    end process;
end imp;
What should I be doing in order to decrement the input 'aa' in the process? The program is meant to decrement the value at input aa to 0.
It looks like you are trying to implement a down-counter with a load input. In such a counter, when load_enable = '1' you should register the load input value (aa in your case) into an internal signal. When load_enable = '0', you would decrement this count value. Here is a code example that does that:
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std_unsigned.all;
entity down_counter is
    port (
        clock: in std_logic;
        reset: in std_logic;
        load_enable: in std_logic;
        load_data: in std_logic_vector(3 downto 0);
        output: out std_logic_vector(3 downto 0)
    );
end entity down_counter;
architecture rtl of down_counter is
    signal count: std_logic_vector(3 downto 0);
begin
    process (clock, reset) begin
        if reset then
            count <= (others => '0');
        elsif rising_edge(clock) then
            if load_enable then
                -- Register the load value
                count <= load_data;
            else
                count <= count - 1;
            end if;
        end if;
    end process;
    output <= count;
end architecture rtl;
For the record, the code above can be improved, but I didn't want to throw too much stuff at once. It is probably a good idea to use an integer instead of std_logic_vector for your count signal. Also you should check if the count proceeds as you expected, since the example uses the numeric_std_unsigned package. I'd recommend that you change it to numeric_std once you understand the code completely.
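As a sketch of the numeric_std variant recommended above, the counter below keeps the count as unsigned and does not depend on VHDL-2008 conditions. The entity name down_counter_ns is hypothetical, and stopping at zero instead of wrapping around is an assumption based on the question's wish to count down to 0.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity down_counter_ns is
  port (
    clock       : in  std_logic;
    reset       : in  std_logic;
    load_enable : in  std_logic;
    load_data   : in  std_logic_vector(3 downto 0);
    output      : out std_logic_vector(3 downto 0)
  );
end entity down_counter_ns;

architecture rtl of down_counter_ns is
  signal count : unsigned(3 downto 0) := (others => '0');
begin
  process (clock, reset)
  begin
    if reset = '1' then
      count <= (others => '0');
    elsif rising_edge(clock) then
      if load_enable = '1' then
        count <= unsigned(load_data);  -- register the start value
      elsif count > 0 then
        count <= count - 1;            -- stop at zero, no wrap to 15
      end if;
    end if;
  end process;

  output <= std_logic_vector(count);
end architecture rtl;
```

Pulsing load_enable for one clock with load_data = "0101" would make output step 5, 4, 3, 2, 1, 0 on the following clock edges and then hold at 0.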
