VHDL: Generate a generic case statement with adjustable amount of cases

I want to approximate the tanh function by storing its values in a LUT (which amounts to a quantization). I want to be able to choose the number of entries in the LUT.
As a non-working example, I imagine code like this:
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;
use ieee.fixed_pkg.all;
entity tanh_lut is
generic (
MIN_RANGE: real := 0.0; -- Minimum value of x
MAX_RANGE: real := 5.0; -- Maximum value of x
DATA_RANGE_int: positive:= 8;
DATA_RANGE_frac: positive:= 8;
);
Port ( DIN : in sfixed(DATA_RANGE_int-1 downto -(DATA_RANGE_frac-1));
DOUT : out sfixed(DATA_RANGE_int-1 downto -(DATA_RANGE_frac-1))
end tanh_lut;
architecture Behavioral of tanh_lut is
begin
lut_gen: for i in 0 to LUT_SIZE-1 generate
constant x_val : real := MIN_RANGE + (MAX_RANGE - MIN_RANGE) * i / (LUT_SIZE-1);
constant x_val_next : real := MIN_RANGE + (MAX_RANGE - MIN_RANGE) * (i+1) / (LUT_SIZE-1);
constant y_val : real := tanh(x_val);
if DIN>=x_val_previous AND DIN<x_val then
DOUT <= to_sfixed(tanh(y_val),DOUT ) ;
END IF
end generate;
end Behavioral;
For example, if I want 4 entries in the range 0 to 3, I want it to synthesize code like:
if DIN>0 AND DIN<=1 then
DOUT <= to_sfixed(0, DOUT);
else DIN>1 AND DIN<=2 then
DOUT <= to_sfixed(0.76159415595, DOUT);
else DIN>2 AND DIN<=3 then
DOUT <= to_sfixed(0.96402758007, DOUT);
else DIN>3 AND DIN<=4 then
DOUT <= to_sfixed(0.99505475368, DOUT);
End if
Is there any way to write code like this, or code that implements the idea behind it?
A simple LUT with addresses is not possible because the addresses are always integers and DIN is fixed point, e.g., 1.5.
The other possibility would be two LUTs, one for mapping the input to an address, another for mapping the address to the LUT entry, e.g., LUT1: 1.5 => address 5, LUT2: address 5 => 0.90. But this would double the amount of resources, which I don't want.
My requirements: functions like tanh(x) should not be synthesized, only the final value of tanh(x). It should also be hardware efficient.

It does not matter whether you use a nested "if-elsif" construct or a separate "if" construct for each check: the intervals do not overlap, so at most one condition can be true for a given DIN.
So you can create a loop like this:
for i in 0 to c_number_of_checks-1 loop
if c_boundaries(i)<DIN and DIN<=c_boundaries(i+1) then
DOUT <= c_output_values(i);
end if;
end loop;
Of course you must provide the constants c_number_of_checks, c_boundaries, and c_output_values. This can be done by:
constant c_number_of_checks : natural := 4;
type array_of_your_data_type is array (natural range <>) of your_data_type;
constant c_boundaries : array_of_your_data_type(c_number_of_checks downto 0) := init_c_boundaries(c_number_of_checks);
constant c_output_values : array_of_your_data_type(c_number_of_checks-1 downto 0) := init_c_output_values(c_number_of_checks);
This means you need the functions init_c_boundaries and init_c_output_values, which create the arrays of values used to initialize the constants c_boundaries and c_output_values.
This is not complicated (you can use the function TANH from ieee.math_real), because the functions need not be synthesizable: they are called only at compile (elaboration) time.
As you see, this takes some effort, so perhaps it is easier to follow the other suggestions. If you do so (using the value as the address of a LUT), you should think about automatic ROM inference, which several tool chains provide and which gives you very efficient (small) hardware.
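The loop and the two initialization functions described above can be put together roughly as follows. This is only a sketch under assumptions: it needs VHDL-2008 (ieee.fixed_pkg), the generic LUT_SIZE and the names INT_BITS, FRAC_BITS, data_t, data_array_t, init_boundaries and init_outputs are hypothetical, and whether a given synthesis tool evaluates ieee.math_real at elaboration should be checked for that tool.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.math_real.all;   -- TANH, only evaluated while initializing constants
use ieee.fixed_pkg.all;   -- VHDL-2008 fixed-point types

entity tanh_lut is
  generic (
    MIN_RANGE : real     := 0.0;  -- lower bound of x
    MAX_RANGE : real     := 4.0;  -- upper bound of x
    LUT_SIZE  : positive := 4;    -- hypothetical generic: number of LUT entries
    INT_BITS  : positive := 8;
    FRAC_BITS : positive := 8
  );
  port (
    DIN  : in  sfixed(INT_BITS-1 downto -FRAC_BITS);
    DOUT : out sfixed(INT_BITS-1 downto -FRAC_BITS)
  );
end entity tanh_lut;

architecture rtl of tanh_lut is
  subtype data_t is sfixed(INT_BITS-1 downto -FRAC_BITS);
  type data_array_t is array (natural range <>) of data_t;

  -- Width of one interval
  constant STEP : real := (MAX_RANGE - MIN_RANGE) / real(LUT_SIZE);

  -- LUT_SIZE+1 interval boundaries, converted to fixed point
  function init_boundaries return data_array_t is
    variable r : data_array_t(0 to LUT_SIZE);
  begin
    for i in r'range loop
      r(i) := to_sfixed(MIN_RANGE + STEP * real(i), INT_BITS-1, -FRAC_BITS);
    end loop;
    return r;
  end function;

  -- One output per interval: tanh at the lower boundary of the interval
  function init_outputs return data_array_t is
    variable r : data_array_t(0 to LUT_SIZE-1);
  begin
    for i in r'range loop
      r(i) := to_sfixed(tanh(MIN_RANGE + STEP * real(i)), INT_BITS-1, -FRAC_BITS);
    end loop;
    return r;
  end function;

  constant C_BOUNDARIES    : data_array_t(0 to LUT_SIZE)   := init_boundaries;
  constant C_OUTPUT_VALUES : data_array_t(0 to LUT_SIZE-1) := init_outputs;
begin
  process (DIN)
  begin
    -- Default for inputs at or below the lowest boundary (also avoids a latch)
    DOUT <= C_OUTPUT_VALUES(0);
    for i in 0 to LUT_SIZE-1 loop
      if C_BOUNDARIES(i) < DIN and DIN <= C_BOUNDARIES(i+1) then
        DOUT <= C_OUTPUT_VALUES(i);
      end if;
    end loop;
    -- Saturate for inputs above the highest boundary
    if DIN > C_BOUNDARIES(LUT_SIZE) then
      DOUT <= C_OUTPUT_VALUES(LUT_SIZE-1);
    end if;
  end process;
end architecture rtl;
```

With the defaults shown, the four intervals are (0,1], (1,2], (2,3] and (3,4], and the stored outputs are tanh(0) to tanh(3), matching the example in the question. Only the constant arrays and the comparators should end up in hardware; tanh itself is evaluated while the constants are initialized.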

Related

VHDL Integer Range Output Bus Width

I'm currently working on writing a simple counter in VHDL, trying to genericize it as much as possible. Ideally I end up with a counter that can pause, count up/down, and take just two integer (min, max) values to determine the appropriate bus widths.
As far as I can tell, in order to get an integer of a given range, I just need to declare
VARIABLE cnt: INTEGER RANGE min TO max := 0
Where min and max are defined as generics (both integers) in the entity. My understanding of this is that if min is 0, max is 5, for example, it will create an integer variable of 3 bits.
My problem is that I actually want to output this integer. So, naturally, I write
counterOut : OUT INTEGER RANGE min TO max
But this does not appear to be doing what I need. I'm generating a schematic block in Quartus Prime from this, and it creates a bus output from [min...max]. For example, if min = 0 and max = 65, it outputs a 66-bit bus instead of the 7-bit bus it should.
If I restricted the counter to unsigned values I might be able to just math out the output bus size, but I'd like to keep this as flexible as possible, and of course I'd like to know what I'm actually doing wrong and how to do it properly.
TL;DR: I want a VHDL entity to take generic min and max values and generate an integer output bus of the width required to hold that range of values. How do I do that?
If it matters, I'm using Quartus Prime Lite Edition V20.1.0 at the moment.
Note: I know I can use STD_LOGIC_VECTOR instead, but it is going to simulate significantly slower and is less easy to use than the integer type as far as I have read. I can provide more of my code if necessary, but it's really this one line that's the problem as far as I can tell.
I originally posted this on Stackexchange, but I think Stackoverflow might be a better place since it's more of a programming than a hardware problem.
EDIT: Complete code shown below
LIBRARY ieee;
USE ieee.std_logic_1164.all;
USE ieee.numeric_std.all;
USE ieee.std_logic_signed.all;
ENTITY Counter IS
GENERIC (modulo : INTEGER := 32;
min : INTEGER := 0;
max : INTEGER := 64);
PORT( pause : IN STD_LOGIC;
direction : IN STD_LOGIC; -- 1 is up, 0 is down
clk : IN STD_LOGIC;
counterOut : OUT INTEGER RANGE min TO max --RANGE 0 TO 32 -- THIS line is the one generating an incorrect output bus width
);
END ENTITY Counter;
-- or entity
ARCHITECTURE CounterArch OF Counter IS
BEGIN
PROCESS(direction, pause, clk)
VARIABLE cnt : INTEGER RANGE min TO max := 0;
VARIABLE dir : INTEGER;
BEGIN
IF direction = '1' THEN
dir := 1;
ELSE
dir := -1;
END IF;
IF clk'EVENT AND clk = '1' THEN
IF pause = '0'THEN
IF (cnt = modulo AND direction = '1') THEN
cnt := min; -- If we're counting up and hit modulo, reset to min value.
ELSIF (cnt = min AND direction = '0') THEN
cnt := modulo; --Counting down hit 0, go back to modulo.
ELSE
cnt := cnt + dir;
END IF;
END IF;
END IF;
counterOut <= cnt;
END PROCESS;
END ARCHITECTURE CounterArch;
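One way to "math out" the bus width mentioned in the question, for signed ranges as well, is a small elaboration-time function that sizes a std_logic_vector port from the generics. The following is only a sketch: the package counter_util, the function range_width, and the entity counter_out_sketch are hypothetical names, and the input port value merely stands in for the counter's internal cnt.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package counter_util is
  -- Hypothetical helper: number of bits needed to hold every value in lo..hi.
  -- Unsigned encoding when lo >= 0, two's complement otherwise.
  function range_width(lo, hi : integer) return positive;
end package counter_util;

package body counter_util is
  function range_width(lo, hi : integer) return positive is
  begin
    if lo >= 0 then
      -- Unsigned: smallest n with hi <= 2**n - 1 (stop at 30 to avoid overflow)
      for n in 1 to 30 loop
        if hi <= 2**n - 1 then
          return n;
        end if;
      end loop;
      return 31;
    else
      -- Signed: smallest n with -2**(n-1) <= lo and hi <= 2**(n-1) - 1
      for n in 1 to 31 loop
        if lo >= -(2**(n-1)) and hi <= 2**(n-1) - 1 then
          return n;
        end if;
      end loop;
      return 32;
    end if;
  end function range_width;
end package body counter_util;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use work.counter_util.all;

entity counter_out_sketch is
  generic (
    min : integer := 0;
    max : integer := 65
  );
  port (
    value      : in  integer range min to max;  -- stands in for the internal cnt
    counterOut : out std_logic_vector(range_width(min, max) - 1 downto 0)
  );
end entity counter_out_sketch;

architecture rtl of counter_out_sketch is
begin
  -- Pick the conversion that matches the encoding chosen by range_width
  gen_unsigned : if min >= 0 generate
    counterOut <= std_logic_vector(to_unsigned(value, counterOut'length));
  end generate gen_unsigned;

  gen_signed : if min < 0 generate
    counterOut <= std_logic_vector(to_signed(value, counterOut'length));
  end generate gen_signed;
end architecture rtl;
```

For min = 0 and max = 65, range_width returns 7, so counterOut is declared as a 7-bit vector. The counter itself can keep using a ranged integer internally; only the port changes type.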

VHDL - How to compare two bit_vectors for dynamic table lookup

I'm storing two tables in two signals. One table keeps the key (address) and the other keeps the value corresponding to the key. I need to compare an input to the key and, if they match, return the value stored.
The reason why I need this is for a dynamic lookup table for branch instruction prediction. In the fetch stage of a processor I get the input Instruction_Address and I return a branch_To_Address and a branch_Prediction. Initially I want to store 16 predictions/branch addresses and use a circular buffer ring to overwrite as needed.
I've been trying to use a FOR with a nested IF to search for the key inside keyTable.
The whole module seems to work fine, except when I compare two bit_vectors with the IF statement. I need this twice (one on read and another on write) and hence I need to "sweep" the keysTable so I can see if the address that is being looked up has an entry.
I noticed the error upon simulation, where the ELSE clause is being called always regardless of the keysTable having the right entries.
Verifiable example:
library IEEE;
use ieee.numeric_bit.all;
entity branch_prediction_table is
generic (
addrSize : NATURAL := 4;
tableSize : NATURAL := 4);
port (
clock : in bit;
input_addr: in bit_vector(addrSize-1 downto 0);
return_value : out bit );
end branch_prediction_table;
architecture branch_table of branch_prediction_table is
signal keysTable : bit_vector(addrSize*tableSize-1 downto 0) := ( others => '0');
signal valuesTable : bit_vector(tableSize*2-1 downto 0) := ( others => '0');
begin
tableProc: process(clock) is
variable valueFromTable : bit;
begin
if rising_edge(clock) then
search_table: for iR in (tableSize-1) to 0 loop
if (keysTable(addrSize*(iR+1)-1 downto addrSize*iR) = input_addr) then
valueFromTable := valuesTable((iR+1)*2-1);
EXIT search_table;
else
valueFromTable := '0';
end if;
end loop search_table;
return_value <= valueFromTable;
end if; -- rising_edge(clock)
end process tableProc;
end branch_table;
with verifiable testbench simulation TCL:
add wave -position insertpoint \
sim:/branch_prediction_table/addrSize \
sim:/branch_prediction_table/clock \
sim:/branch_prediction_table/input_addr \
sim:/branch_prediction_table/keysTable \
sim:/branch_prediction_table/return_value \
sim:/branch_prediction_table/tableSize \
sim:/branch_prediction_table/valuesTable
force -freeze sim:/branch_prediction_table/valuesTable 11111111 0
force -freeze sim:/branch_prediction_table/keysTable 1111101001100011 0
force -freeze sim:/branch_prediction_table/clock 0 0, 1 {5000 ps} -r {10 ns}
run 10 ns
force -freeze sim:/branch_prediction_table/input_addr 1010 0
run 20 ns
force -freeze sim:/branch_prediction_table/input_addr 1111 0
run 10 ns
The testbench simulation result shows that the error is indeed in the IF.
I have tried converting them with to_integer(unsigned(bit_vector1)) = to_integer(unsigned(bit_vector2)), to no avail.
As user1155120 pointed out, the problem lies in this line:
search_table: for iR in (tableSize-1) to 0 loop
It should have been "downto", since L > R. Because I used "to" with L > R, the range is a null range, and the for loop is considered complete without executing a single iteration.
IEEE Std 1076-2008, 5.2 Scalar types: "A range specifies a subset of values of a scalar type. A range is said to be a null range if the specified subset is empty. The range L to R is called an ascending range; if L > R, then the range is a null range. The range L downto R is called a descending range; if L < R, then the range is a null range."
IEEE Std 1076-2008, 10.10 Loop statement: "For the execution of a loop with a for iteration scheme, the discrete range is first evaluated. If the discrete range is a null range, the iteration scheme is said to be complete, ..."
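The effect of the null range can be reproduced in isolation with a small simulation-only sketch. The entity name null_range_demo and the counter variables are hypothetical; some tools may also emit a warning about the null range.

```vhdl
entity null_range_demo is
end entity null_range_demo;

architecture sim of null_range_demo is
begin
  process
    variable to_iterations     : natural := 0;
    variable downto_iterations : natural := 0;
  begin
    -- Ascending range with L > R: a null range, the body never executes
    for i in 3 to 0 loop
      to_iterations := to_iterations + 1;
    end loop;

    -- Descending range: iterates over 3, 2, 1, 0
    for i in 3 downto 0 loop
      downto_iterations := downto_iterations + 1;
    end loop;

    assert to_iterations = 0
      report "null range unexpectedly iterated" severity failure;
    assert downto_iterations = 4
      report "downto range did not iterate four times" severity failure;

    report "3 to 0 ran " & integer'image(to_iterations) &
           " times, 3 downto 0 ran " & integer'image(downto_iterations) & " times";
    wait;  -- stop the process
  end process;
end architecture sim;
```

In the question's code, changing the loop header to "for iR in (tableSize-1) downto 0 loop" gives the search loop a non-empty range.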

Synthesizable VHDL recursion, Vivado: simulator has terminated in an unexpected manner

I would like to implement a count min sketch with minimal update and access times.
Basically an input sample is hashed by multiple (d) hash functions and each of them increments a counter in the bucket that it hits. When querying for a sample, the counters of all the buckets corresponding to a sample are compared and the value of the smallest counter is returned as a result.
I am trying to find the minimum value of the counters in log_2(d) time with the following code:
entity main is
Port ( rst : in STD_LOGIC;
a_val : out STD_LOGIC_VECTOR(63 downto 0);
b_val : out STD_LOGIC_VECTOR(63 downto 0);
output : out STD_LOGIC_VECTOR(63 downto 0);
. .
. .
. .
CM_read_ready : out STD_LOGIC;
clk : in STD_LOGIC);
end main;
architecture Behavioral of main is
impure function min( LB, UB: in integer; sample: in STD_LOGIC_VECTOR(long_length downto 0)) return STD_LOGIC_VECTOR is
variable left : STD_LOGIC_VECTOR(long_length downto 0) := (others=>'0');
variable right : STD_LOGIC_VECTOR(long_length downto 0) := (others=>'0');
begin
if (LB < UB)
then
left := min(LB, ((LB + UB) / 2) - 1, sample);
right := min(((LB + UB) / 2) - 1, UB, sample);
if (to_integer(unsigned(left)) < to_integer(unsigned(right)))
then
return left;
else
return right;
end if;
elsif (LB = UB)
then
-- return the counter's value so that it can be compared further up in the stack.
return CM(LB, (to_integer(unsigned(hasha(LB)))*to_integer(unsigned(sample))
+ to_integer(unsigned(hashb(LB)))) mod width);
end if;
end min;
begin
CM_hashes_read_log_time: process (clk, rst)
begin
if (to_integer(unsigned(instruction)) = 2)
then
output <= min(0, depth - 1, sample);
end if;
end if;
end process;
end Behavioral;
When I run the above code, I get the following errors:
The simulator has terminated in an unexpected manner. Please review
the simulation log (xsim.log) for details.
[USF-XSim-62] 'compile' step failed with error(s). Please check the
Tcl console output or '/home/...sim/sim_1/behav/xsim/xvhdl.log' file
for more information.
[USF-XSim-62] 'elaborate' step failed with error(s). Please check the
Tcl console output or
'/home/...sim/sim_1/synth/func/xsim/elaborate.log' file for more
information.
I was not able to find any file called xsim.log and xvhdl.log was empty, but elaborate.log had some content:
Vivado Simulator 2018.2
Copyright 1986-1999, 2001-2018 Xilinx, Inc. All Rights Reserved.
Running: /opt/Xilinx/Vivado/2018.2/bin/unwrapped/lnx64.o/xelab -wto c199c4c74e8c44ef826c0ba56222b7cf --incr --debug typical --relax --mt 8 -L xil_defaultlib -L secureip --snapshot main_tb_behav xil_defaultlib.main_tb -log elaborate.log
Using 8 slave threads.
Starting static elaboration
Completed static elaboration
INFO: [XSIM 43-4323] No Change in HDL. Linking previously generated obj files to create kernel
Removing the following line solves the above errors:
output <= min(0, depth - 1, sample);
My questions:
Why am I not able to simulate this code?
Will this code be synthesizable once it is working?
Is there a better (and/or faster) way to obtain the minimum of all relevant hash buckets?
Not that I was able to find any real-world use for recursion, but just to surprise @EML (as requested in the comments above): you actually can define recursive hardware structures in VHDL.
In Quartus at least, this only works if you give the compiler a clear indication of the maximum recursion depth; otherwise it will try to unroll the recursion for any possible input, eventually dying from a stack overflow:
entity recursive is
generic
(
MAX_RECURSION_DEPTH : natural
);
port
(
clk : in std_ulogic;
n : in natural;
o : out natural
);
end recursive;
architecture Behavioral of recursive is
function fib(max_depth : natural; n : natural) return natural is
variable res : natural;
begin
if max_depth <= 1 then
res := 0;
return res;
end if;
if n = 0 then
res := 0;
elsif n = 1 or n = 2 then
res := 1;
else
res := fib(max_depth - 1, n - 1) + fib(max_depth - 1, n - 2);
end if;
return res;
end function fib;
begin
p_calc : process
begin
wait until rising_edge(clk);
o <= fib(MAX_RECURSION_DEPTH, n);
end process;
end Behavioral;
With a MAX_RECURSION_DEPTH of 6, this generates one single combinational circuit with more than 500 LEs (so the practical use is probably very limited), but at least it works.
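The question's minimum search can be written in the same bounded-recursion style: if the function recurses on slices of an array whose length is fixed at elaboration, the recursion depth is bounded by that length. The following is a sketch, assuming 8-bit counters and hypothetical names (min_tree_pkg, u_array_t, min_tree); it is checked here only by a simulation assert, and whether a particular synthesis tool accepts it is not something the answers above confirm.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

package min_tree_pkg is
  type u_array_t is array (natural range <>) of unsigned(7 downto 0);
  -- Minimum of a non-empty array, computed as a balanced comparison tree
  function min_tree(v : u_array_t) return unsigned;
end package min_tree_pkg;

package body min_tree_pkg is
  function min_tree(v : u_array_t) return unsigned is
    -- Normalize the index range to 0 .. length-1 so slices are easy to split
    alias a : u_array_t(0 to v'length - 1) is v;
    constant half : natural := v'length / 2;
    variable l, r : unsigned(7 downto 0);
  begin
    if a'length = 1 then
      return a(0);                            -- leaf: a single counter
    end if;
    l := min_tree(a(0 to half - 1));          -- lower half
    r := min_tree(a(half to a'length - 1));   -- upper half
    if l < r then
      return l;
    else
      return r;
    end if;
  end function min_tree;
end package body min_tree_pkg;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use work.min_tree_pkg.all;

entity min_tree_tb is
end entity min_tree_tb;

architecture sim of min_tree_tb is
  constant data : u_array_t(0 to 2) := (x"07", x"03", x"09");
begin
  process
  begin
    assert min_tree(data) = 3
      report "min_tree returned the wrong value" severity failure;
    wait;
  end process;
end architecture sim;
```

Each call splits the slice into two non-empty halves, so for d counters the comparison tree is about log2(d) levels deep, which is the structure the question was aiming for.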
Is recursion possible in VHDL?
I would say yes, but not recursion as we know it. That's the short answer. I have code that implements Quicksort (if anyone is interested), and it synthesizes quite happily. Anyone who knows Quicksort will know it is normally nowhere near the context of synthesis, but I managed to do it.
The trick (which is vexatious and hard to follow) is to emulate recursion with a strange state machine that backtracks to the beginning state, after pushing a "state" onto a (hardware) stack. You can synthesize this sort of data structure quite easily if you want.
I recall some fascinating stuff written by Thatcher, Goguen and Wright about semantic transformations from one kind of coding domain to others (different models of computation, in short).
It does strike me that this is possibly a genesis point for actual recursive expressions in a more general sense. But do be warned, it's very difficult.

VHDL pass range to procedure

I'm writing my own package to deal with generic matrix-like objects due to unavailability of VHDL-2008 (I'm only concerned with compilation and simulation for the time being).
My aim is getting a matrix M_out from a matrix M_in such that:
M_out(i downto 0, j downto 0) <= M_in(k+i downto k, l+j downto l);
using a subroutine of some sort. For, let's say, semantic convenience and analogy with software programming languages, my subroutine prototype should ideally look something like this:
type matrix is array(natural range <>, natural range <>) of std_logic;
...
procedure slice_matrix(signal m_out: out matrix;
constant rows: natural range<>;
constant cols: natural range<>;
signal m_in: in matrix);
The compiler does however regard this as an error:
** Error: custom_types.vhd(9): near "<>": syntax error
** Error: custom_types.vhd(9): near "<>": syntax error
Is it possible to pass a range as an argument in some way or shall I surrender and pass 4 separate indexes to calculate it locally?
An unconstrained index range such as natural range <> is not a VHDL object of class signal, variable, constant, or file, so it cannot be passed into a subprogram. I also wouldn't implement a slice operation as a procedure, because its behavior is function-like.
An implementation for working with matrices and slices thereof is provided by the PoC-Library. The implementation is provided in the vectors package.
function slm_slice(slm : T_SLM; RowIndex : natural; ColIndex : natural; Height : natural; Width : natural) return T_SLM is
variable Result : T_SLM(Height - 1 downto 0, Width - 1 downto 0) := (others => (others => '0'));
begin
for i in 0 to Height - 1 loop
for j in 0 to Width - 1 loop
Result(i, j) := slm(RowIndex + i, ColIndex + j);
end loop;
end loop;
return Result;
end function;
More specialized functions to slice off a row or column can be found in that file too. It also provides procedures to assign parts of a matrix.
This package works in simulation and synthesis.
Unfortunately, slicing multi-dimensional arrays will not be part of VHDL-2017. I'll make sure it's discussed for VHDL-202x again.
Passing ranges into a subprogram will be allowed in VHDL-2017. The language change LCS 2016-099 adds this capability.
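The slicing function above can be adapted to the matrix type from the question. This is only a sketch, not part of the PoC-Library; the package name matrix_pkg, the function name slice_matrix, and its parameter names are hypothetical. It passes a start index and a size for each dimension instead of a range.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package matrix_pkg is
  type matrix is array (natural range <>, natural range <>) of std_logic;
  -- Returns the height x width block of m whose lowest corner is (row_lo, col_lo)
  function slice_matrix(m : matrix;
                        row_lo, col_lo : natural;
                        height, width  : positive) return matrix;
end package matrix_pkg;

package body matrix_pkg is
  function slice_matrix(m : matrix;
                        row_lo, col_lo : natural;
                        height, width  : positive) return matrix is
    variable result : matrix(height - 1 downto 0, width - 1 downto 0);
  begin
    for i in 0 to height - 1 loop
      for j in 0 to width - 1 loop
        result(i, j) := m(row_lo + i, col_lo + j);
      end loop;
    end loop;
    return result;
  end function slice_matrix;
end package body matrix_pkg;

library ieee;
use ieee.std_logic_1164.all;
use work.matrix_pkg.all;

entity slice_matrix_tb is
end entity slice_matrix_tb;

architecture sim of slice_matrix_tb is
begin
  process
    variable m_in  : matrix(3 downto 0, 3 downto 0) := (others => (others => '0'));
    variable m_out : matrix(1 downto 0, 1 downto 0);
  begin
    m_in(2, 1) := '1';
    -- Take rows 2 downto 1 and columns 2 downto 1 of m_in
    m_out := slice_matrix(m_in, 1, 1, 2, 2);
    assert m_out(1, 0) = '1' and m_out(0, 0) = '0'
      report "slice_matrix returned an unexpected block" severity failure;
    wait;
  end process;
end architecture sim;
```

In the question's notation, M_out(i downto 0, j downto 0) <= M_in(k+i downto k, l+j downto l) would become M_out <= slice_matrix(M_in, k, l, i+1, j+1).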

Unsigned logic, vector and addition - How?

I'm creating a program counter that is supposed to use only unsigned numbers.
I have 2 STD_LOGIC_VECTOR and a couple of STD_LOGIC. Is there anything I need to do so that they only use unsigned? At the moment I only have:
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
I also need to increase one of the binary vectors by 1 under certain conditions (as you probably have guessed by now). Would you be so kind as to explain how to perform such actions (using unsigned and adding one), considering that one of the vectors is a 32-bit output?
I'm guessing (I tried) Output <= Output + 1; won't do. Oh and I'm using a process.
In brief, you can add the ieee.numeric_std package to your architecture (library ieee; use ieee.numeric_std.all;) and then do the addition using:
Output <= std_logic_vector(unsigned(Output) + 1);
to convert your std_logic_vector to an unsigned vector, increment it, and finally convert the result back to an std_logic_vector.
Note that if Output is an output port, this won't work because you can't access the value of an output port within the same block. If that is the case, you need to add a new signal and then assign Output from that signal, outside your process.
If you do need to add a signal, it might be simpler to make that signal a different type than std_logic_vector. For example, you could use an integer or the unsigned type above. For example:
architecture foo of bar is
signal Output_int : integer range 0 to (2**Output'length)-1;
begin
PR: process(clk, resetn)
begin
if resetn='0' then
Output_int <= 0;
elsif clk'event and clk='1' then
Output_int <= Output_int + 1;
end if;
end process;
Output <= std_logic_vector(to_unsigned(Output_int, Output'length));
end foo;
Output_int is declared with a range of valid values so that tools will be able to determine both the size of the integer as well as the range of valid values for simulation.
In the declaration of Output_int, Output'length is the width of the Output vector (as an integer), and the "**" operator is used for exponentiation, so the expression means "all unsigned integers that can be expressed with as many bits as Output has".
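For example, for an Output defined as std_logic_vector(31 downto 0), Output'length is 32. 2^32-1 is the highest value that can be expressed with an unsigned 32-bit integer. Thus, in the example case, the range 0 to (2**Output'length)-1 resolves to the range 0...4294967295 (2^32=4294967296), i.e. the full unsigned range that can be expressed with 32 bits. Be aware, however, that the VHDL standard only guarantees an integer range of -(2^31-1) to 2^31-1, and most tools implement exactly 32-bit signed integers, so this declaration is only safe for vectors of up to 31 bits; for a full 32-bit counter, use the unsigned type instead.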
Note that you'll need to add any wrapping logic manually: VHDL simulators will produce an error when you've reached the maximum value and try to increment by one, even if the synthesized logic will cleanly wrap around to 0.
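For a 32-bit Output, a variant with an unsigned signal sidesteps the limits of the integer type, and numeric_std addition wraps around on overflow in simulation as well. A minimal sketch, assuming hypothetical entity and port names (program_counter, inc) that do not appear in the question:

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity program_counter is
  port (
    clk    : in  std_logic;
    resetn : in  std_logic;                      -- asynchronous, active low
    inc    : in  std_logic;                      -- hypothetical increment enable
    Output : out std_logic_vector(31 downto 0)
  );
end entity program_counter;

architecture rtl of program_counter is
  -- Internal copy of the count, readable inside the architecture
  signal count : unsigned(Output'range);
begin
  process (clk, resetn)
  begin
    if resetn = '0' then
      count <= (others => '0');
    elsif rising_edge(clk) then
      if inc = '1' then
        count <= count + 1;  -- result has the same width, so it wraps to 0
      end if;
    end if;
  end process;

  Output <= std_logic_vector(count);
end architecture rtl;
```

Because count is an internal signal, it can be read in the process even where the output port cannot, and no manual wrapping logic is needed.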
