I have VHDL code where I'm trying to multiply pixel values. I have the following entity:
entity xGradient is
port(
clk : in std_logic;
x11, x12, x13, x21, x22, x23, x31, x32, x33 : in integer range 0 to 255;
gradientInX : out integer range 0 to 255
);
end xGradient;
I do the following operation on these:
gradientInX <= x11 + 2*x21 + x31 - x13 - 2*x23 - x33;
The issue as you may notice is that the values obtained from the operation exceed the range of possible values my output integer can be. My simulation crashes when this happens. I don't know the right word for what I'm trying to do and therefore can't seem to figure out how to fix this problem and it's driving me crazy. I basically just want to truncate, or reduce the size of the output to ensure it fits into the gradientInX output port. I tried using a process block like this:
-- declared signals
signal gradientMem: integer;
--beginning of my architecture
begin
SobelOperator : process(gradientMem)
begin
gradientMem <= x11 + 2*x21 + x31 - x13 - 2*x23 - x33;
if (gradientMem > 255) then
gradientMem <= 255;
elsif (gradientMem < 0) then
gradientMem <= 0;
end if;
end process SobelOperator;
then assigning gradientMem to gradientInX but it doesn't work for some reason. Any help would really be appreciated. Note: I haven't included all the VHDL code as I thought it would be unnecessarily long. The error message I get is that the signal resulting value from the operation is out of range (since the resulting value is negative and the output has a range of only 0 to 255).
Matthew
This is one of the few situations where you should use a variable instead of a signal.
SobelOperator : process(x11, x21, x31, x13, x23, x33)
variable gradientMem : integer;
begin
gradientMem := x11 + 2*x21 + x31 - x13 - 2*x23 - x33;
if (gradientMem > 255) then
gradientMem := 255;
elsif (gradientMem < 0) then
gradientMem := 0;
end if;
gradientInX <= gradientMem;
end process SobelOperator;
By the way, this operation is called clipping.
If you want to truncate, the process should be written like:
SobelOperator : process(x11, x21, x31, x13, x23, x33)
variable gradientMem : integer;
begin
gradientMem := x11 + 2*x21 + x31 - x13 - 2*x23 - x33;
gradientInX <= gradientMem mod 256;
end process SobelOperator;
As @Paebbels suggests in the comment, most likely the range of gradientMem is not sufficient to handle the output of the operation.
One thing to note: in integer operations, the intermediate results are (mostly) treated as 32-bit signed values, and the range is only checked at the assignment. This leaves the synthesis tool a lot to infer and can yield a much larger design than necessary.
To overcome this, use the unsigned and signed types and operations from ieee.numeric_std. This gives you full control over the intermediate vector lengths. It also forces you to understand what happens to the widths in addition, subtraction, multiplication and so on, which typically helps in RTL design.
However, note that with vectors the operations are evaluated in pairs, and each intermediate result takes the length of the longer of its two operands.
For example:
process
variable i1 : integer range 0 to 15 := 15;
variable i2 : integer range 0 to 15 := 9;
variable i3 : integer range 0 to 15 := 11;
variable i4 : integer range 0 to 31 := 2;
variable iSum : integer range 0 to 63;
variable u1 : unsigned(3 downto 0) := to_unsigned(15,4);
variable u2 : unsigned(3 downto 0) := to_unsigned(9,4);
variable u3 : unsigned(3 downto 0) := to_unsigned(11,4);
variable u4 : unsigned(5 downto 0) := to_unsigned(2,6);
variable uSum : unsigned(5 downto 0);
begin
iSum := i1 + i2 + i3 + i4;
uSum := u1 + u2 + u3 + u4;
write(output, "int: " & to_string(iSum) & lf &
"us : " & to_string(to_integer(uSum)) & lf);
wait;
end process;
Gives:
# int: 37
# us : 5
Using parentheses solves the problem:
uSum := (u1 + (u2 + (u3 + u4)));
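Applied to the gradient from the question, the same idea can be sketched as follows. This is only a sketch under assumptions: the entity name xGradientClip and the helper ext are made up for illustration, the ports are changed from integer to unsigned(7 downto 0), and the clk port is dropped because the logic is purely combinational. Every operand is widened to 11 bits before any arithmetic, since the result can lie anywhere in -1020 to 1020, and the result is then clipped to 0 to 255.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical variant of the question's entity using numeric_std types.
entity xGradientClip is
  port (
    x11, x21, x31, x13, x23, x33 : in  unsigned(7 downto 0);
    gradientInX                  : out unsigned(7 downto 0)
  );
end entity xGradientClip;

architecture rtl of xGradientClip is
  -- Zero-extend an 8-bit pixel to an 11-bit signed value (always non-negative).
  function ext(u : unsigned(7 downto 0)) return signed is
  begin
    return signed(resize(u, 11));
  end function ext;
begin
  SobelOperator : process (x11, x21, x31, x13, x23, x33)
    -- 11 bits signed covers -1024 .. 1023, enough for -1020 .. 1020.
    variable acc : signed(10 downto 0);
  begin
    -- All operands are already 11 bits wide, so no intermediate sum is truncated.
    acc := ext(x11) + shift_left(ext(x21), 1) + ext(x31)
         - ext(x13) - shift_left(ext(x23), 1) - ext(x33);

    -- Clip to the 8-bit output range.
    if acc > 255 then
      gradientInX <= to_unsigned(255, 8);
    elsif acc < 0 then
      gradientInX <= (others => '0');
    else
      gradientInX <= unsigned(acc(7 downto 0));
    end if;
  end process SobelOperator;
end architecture rtl;
```

Widening first and clipping last keeps the width of every intermediate value explicit, which is the control that the integer version leaves to the tool.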
There is an error in the sensitivity list of your process, SobelOperator : process(gradientMem). The sensitivity list must contain the signals that influence the result; in other words, it is the list of signals the process is sensitive to. Here those are x11, x21, x31, x13, x23 and x33, giving SobelOperator : process(x11, x21, x31, x13, x23, x33)
I'm currently working on writing a simple counter in VHDL, trying to genericize it as much as possible. Ideally I end up with a counter that can pause, count up/down, and take just two integer (min, max) values to determine the appropriate bus widths.
As far as I can tell, in order to get an integer of a given range, I just need to declare
VARIABLE cnt: INTEGER RANGE min TO max := 0
Where min and max are defined as generics (both integers) in the entity. My understanding of this is that if min is 0, max is 5, for example, it will create an integer variable of 3 bits.
My problem is that I actually want to output this integer. So, naturally, I write
counterOut : OUT INTEGER RANGE min TO max
But this does not appear to do what I need. I'm generating a schematic block in Quartus Prime from this, and it creates a bus output from [min...max]. For example, if min = 0 and max = 65, it outputs a 66-bit bus instead of the seven-bit bus it should.
If I restricted the counter to unsigned values I might be able to just math out the output bus size, but I'd like to keep this as flexible as possible, and of course I'd like to know what I'm actually doing wrong and how to do it properly.
TL;DR: I want a VHDL entity to take generic min,max values, and generate an integer output bus of the required width to hold the range of values. How do?
If it matters, I'm using Quartus Prime Lite Edition V20.1.0 at the moment.
Note: I know I can use STD_LOGIC_VECTOR instead, but it is going to simulate significantly slower and is less easy to use than the integer type as far as I have read. I can provide more of my code if necessary, but it's really this one line that's the problem as far as I can tell.
I originally posted this on Stackexchange, but I think Stackoverflow might be a better place since it's more of a programming than a hardware problem.
EDIT: Complete code shown below
LIBRARY ieee;
USE ieee.std_logic_1164.all;
USE ieee.numeric_std.all;
USE ieee.std_logic_signed.all;
ENTITY Counter IS
GENERIC (modulo : INTEGER := 32;
min : INTEGER := 0;
max : INTEGER := 64);
PORT( pause : IN STD_LOGIC;
direction : IN STD_LOGIC; -- 1 is up, 0 is down
clk : IN STD_LOGIC;
counterOut : OUT INTEGER RANGE min TO max --RANGE 0 TO 32 -- THIS line is the one generating an incorrect output bus width
);
END ENTITY Counter;
-- or entity
ARCHITECTURE CounterArch OF Counter IS
BEGIN
PROCESS(direction, pause, clk)
VARIABLE cnt : INTEGER RANGE min TO max := 0;
VARIABLE dir : INTEGER;
BEGIN
IF direction = '1' THEN
dir := 1;
ELSE
dir := -1;
END IF;
IF clk'EVENT AND clk = '1' THEN
IF pause = '0'THEN
IF (cnt = modulo AND direction = '1') THEN
cnt := min; -- If we're counting up and hit modulo, reset to min value.
ELSIF (cnt = min AND direction = '0') THEN
cnt := modulo; --Counting down hit 0, go back to modulo.
ELSE
cnt := cnt + dir;
END IF;
END IF;
END IF;
counterOut <= cnt;
END PROCESS;
END ARCHITECTURE CounterArch;
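For reference, "mathing out" the bus width from min and max could look roughly like the sketch below. The package range_width_pkg, the function bits_for_range and the entity CounterWidthSketch are hypothetical names, not part of the code above. It assumes an unsigned encoding when min is not negative and two's complement otherwise, and it says nothing about why Quartus draws the integer port the way it does.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package range_width_pkg is
  -- Hypothetical helper: number of bits needed to hold every value in lo .. hi.
  function bits_for_range(lo, hi : integer) return positive;
end package range_width_pkg;

package body range_width_pkg is
  function bits_for_range(lo, hi : integer) return positive is
  begin
    if lo >= 0 then
      -- Unsigned: smallest n with hi <= 2**n - 1.
      for n in 1 to 30 loop
        if hi <= 2**n - 1 then
          return n;
        end if;
      end loop;
      return 31;  -- anything larger still fits in 31 bits for a 32-bit integer
    else
      -- Two's complement: smallest n with -2**(n-1) <= lo and hi <= 2**(n-1) - 1.
      for n in 1 to 31 loop
        if lo >= -(2**(n - 1)) and hi <= 2**(n - 1) - 1 then
          return n;
        end if;
      end loop;
      return 32;
    end if;
  end function bits_for_range;
end package body range_width_pkg;

library ieee;
use ieee.std_logic_1164.all;
use work.range_width_pkg.all;

-- Hypothetical entity showing the helper used in a port declaration.
entity CounterWidthSketch is
  generic (min : integer := 0;
           max : integer := 65);
  port (counterOut : out std_logic_vector(bits_for_range(min, max) - 1 downto 0));
end entity CounterWidthSketch;
```

With the defaults above (min = 0, max = 65) the loop stops at n = 7, because 2**7 - 1 = 127 is the first value of that form that is at least 65.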
I would like to implement a count min sketch with minimal update and access times.
Basically an input sample is hashed by multiple (d) hash functions and each of them increments a counter in the bucket that it hits. When querying for a sample, the counters of all the buckets corresponding to a sample are compared and the value of the smallest counter is returned as a result.
I am trying to find the minimum value of the counters in log_2(d) time with the following code:
entity main is
Port ( rst : in STD_LOGIC;
a_val : out STD_LOGIC_VECTOR(63 downto 0);
b_val : out STD_LOGIC_VECTOR(63 downto 0);
output : out STD_LOGIC_VECTOR(63 downto 0);
. .
. .
. .
CM_read_ready : out STD_LOGIC;
clk : in STD_LOGIC);
end main;
architecture Behavioral of main is
impure function min( LB, UB: in integer; sample: in STD_LOGIC_VECTOR(long_length downto 0)) return STD_LOGIC_VECTOR is
variable left : STD_LOGIC_VECTOR(long_length downto 0) := (others=>'0');
variable right : STD_LOGIC_VECTOR(long_length downto 0) := (others=>'0');
begin
if (LB < UB)
then
left := min(LB, ((LB + UB) / 2) - 1, sample);
right := min(((LB + UB) / 2) - 1, UB, sample);
if (to_integer(unsigned(left)) < to_integer(unsigned(right)))
then
return left;
else
return right;
end if;
elsif (LB = UB)
then
-- return the counter's value so that it can be compared further up in the stack.
return CM(LB, (to_integer(unsigned(hasha(LB)))*to_integer(unsigned(sample))
+ to_integer(unsigned(hashb(LB)))) mod width);
end if;
end min;
begin
CM_hashes_read_log_time: process (clk, rst)
begin
if (to_integer(unsigned(instruction)) = 2)
then
output <= min(0, depth - 1, sample);
end if;
end if;
end process;
end Behavioral;
When I run the above code, I get the following errors:
The simulator has terminated in an unexpected manner. Please review
the simulation log (xsim.log) for details.
[USF-XSim-62] 'compile' step failed with error(s). Please check the
Tcl console output or '/home/...sim/sim_1/behav/xsim/xvhdl.log' file
for more information.
[USF-XSim-62] 'elaborate' step failed with error(s). Please check the
Tcl console output or
'/home/...sim/sim_1/synth/func/xsim/elaborate.log' file for more
information.
I was not able to find any file called xsim.log and xvhdl.log was empty, but elaborate.log had some content:
Vivado Simulator 2018.2
Copyright 1986-1999, 2001-2018 Xilinx, Inc. All Rights Reserved.
Running: /opt/Xilinx/Vivado/2018.2/bin/unwrapped/lnx64.o/xelab -wto c199c4c74e8c44ef826c0ba56222b7cf --incr --debug typical --relax --mt 8 -L xil_defaultlib -L secureip --snapshot main_tb_behav xil_defaultlib.main_tb -log elaborate.log
Using 8 slave threads.
Starting static elaboration
Completed static elaboration
INFO: [XSIM 43-4323] No Change in HDL. Linking previously generated obj files to create kernel
Removing the following line solves the above errors:
output <= min(0, depth - 1, sample);
My questions:
Why am I not able to simulate this code?
Will this code be synthesizable once it is working?
Is there a better (and/or faster) way to obtain the minimum of all relevant hash buckets?
Not that I was able to find any real-world use for recursion, but just to surprise @EML (as requested in the comments above): you actually can define recursive hardware structures in VHDL.
In Quartus at least, this only works if you give the compiler a clear indication of the maximum recursion depth; otherwise it will try to unroll the recursion for any possible input, eventually dying from a stack overflow:
entity recursive is
generic
(
MAX_RECURSION_DEPTH : natural
);
port
(
clk : in std_ulogic;
n : in natural;
o : out natural
);
end recursive;
architecture Behavioral of recursive is
function fib(max_depth : natural; n : natural) return natural is
variable res : natural;
begin
if max_depth <= 1 then
res := 0;
return res;
end if;
if n = 0 then
res := 0;
elsif n = 1 or n = 2 then
res := 1;
else
res := fib(max_depth - 1, n - 1) + fib(max_depth - 1, n - 2);
end if;
return res;
end function fib;
begin
p_calc : process
begin
wait until rising_edge(clk);
o <= fib(MAX_RECURSION_DEPTH, n);
end process;
end Behavioral;
With a MAX_RECURSION_DEPTH of 6, this generates one single combinational circuit with more than 500 LEs (so the practical use is probably very limited), but at least it works.
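The same principle can be applied to the minimum search from the question: if the recursion works on an array whose length is known at elaboration, the depth is bounded by that length rather than by a separate parameter. The following is only a sketch. The names min_tree_pkg, counter_array and tree_min are made up, the hashing and the CM lookup from the question are not reproduced, and whether a given synthesis tool accepts this form is an assumption that has to be checked.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

package min_tree_pkg is
  -- Hypothetical type: the d counters that belong to one sample.
  type counter_array is array (natural range <>) of unsigned(63 downto 0);
  -- Returns the smallest element by splitting the array in halves.
  function tree_min(c : counter_array) return unsigned;
end package min_tree_pkg;

package body min_tree_pkg is
  function tree_min(c : counter_array) return unsigned is
    -- Normalise the index range so the split does not depend on the caller's bounds.
    alias a : counter_array(0 to c'length - 1) is c;
    constant mid : natural := c'length / 2;
    variable l, r : unsigned(63 downto 0) := (others => '1');
  begin
    if c'length = 0 then
      return l;        -- empty input: return the largest value (an arbitrary choice)
    elsif c'length = 1 then
      return a(0);     -- recursion ends at a single counter
    end if;
    -- The two halves are disjoint and together cover the whole array,
    -- so each call works on a strictly shorter slice.
    l := tree_min(a(0 to mid - 1));
    r := tree_min(a(mid to c'length - 1));
    if l < r then
      return l;
    else
      return r;
    end if;
  end function tree_min;
end package body min_tree_pkg;
```

Note that the two recursive calls in the question use overlapping bounds (both involve ((LB + UB) / 2) - 1), so the ranges there do not always shrink; splitting into disjoint halves avoids that.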
Is recursion possible in VHDL?
I would say yes, but not recursion as we know it. That's the short answer. I have code that implements Quicksort (if anyone is interested), and it synthesizes quite happily. Quicksort would not normally come up anywhere near the context of synthesis, but I managed to do it.
The trick (which is vexatious and hard to follow) is to emulate recursion with a strange state machine that backtracks to the beginning state, after pushing a "state" onto a (hardware) stack. You can synthesize this sort of data structure quite easily if you want.
I recall some fascinating stuff written by Thatcher, Goguen and Wright about semantic transformations from one kind of coding domain to others (different models of computation, in short).
It does strike me that this is possibly a genesis point for actual recursive expressions in a more general sense. But do be warned, it's very difficult.
I am using Quartus II 13.0 SP1 (32-bit). The code compiles correctly, but when I try to create a symbol I get an error.
I searched for the error on Google but did not find it. As I understand it, the problem is with the integer f_in: when I use f_in in the declaration of clk_out_full_num, I get this error. Why do I get this error if the code compiles correctly?
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.Numeric_Std.ALL;
---------->>>>>>> this block is used to divide clock by 2 for every bit<<<<<<<<<<-------------
---------->>>>>>> this block is also used to manually choose the frequency of the signal <<<<-------
entity Clock_Divider is
GENERIC( resulation : INTEGER := 4 ); --size of the binary input numbers in bits
port (
Frequency_num : in STD_LOGIC_VECTOR(resulation-1 downto 0);-- this is used to change the frequency using manual buttons
f_in: in integer;
clk_in : in std_logic ;
clr : in std_logic;
clk_out: out std_logic;
clk_out_full_num: out std_logic_vector (f_in downto 0)
);
end Clock_Divider;
architecture logic_clock_divider of Clock_Divider is
signal q:std_logic_vector (16 downto 0);
signal bit_number : integer;-- this signal is used to convert Frequency_num(STD_LOGIC_VECTOR)--- to >>> integer
begin
bit_number <= to_integer(unsigned(Frequency_num)); --the converstion it self Frequency_num(STD_LOGIC_VECTOR)--- to >>> integer
--clock_divider
process(clk_in,clr )
begin
if (clk_in'event and clk_in='1') then
q<= q+1;
end if;
end process;
clk_out <=q(bit_number);
clk_out_full_num<=q;
end logic_clock_divider ;
Internal Error: Sub-system: VRFX, File:
/quartus/synth/vrfx/vrfx_analyzer_impl.cpp, Line: 2967 port_constraint
Stack Trace:
0x52a03: VRFX_ANALYZER_IMPL::vhdl_set_port_and_parameter_to_hdb_entity + 0xb33
0x48a71: VRFX_ANALYZER_IMPL::analyze + 0x2f1
0x483dc: VRFX_ANALYZER::analyze + 0xc
0x6a3f8: SGN_ANALYZER::analyze + 0x148
0x721a0: SGN_ANALYZER::process_curr_vrfx_file + 0x410
0x72965: SGN_ANALYZER::process_curr_file + 0x355
0x11a49: sgn_source_file_processing + 0x89
0x47d5: qsyn_execute_sgn + 0x2a5
0x1c924: QSYN_FRAMEWORK::execute_core + 0x104
0x1f12f: QSYN_FRAMEWORK::execute + 0x15f
0x11562: qexe_get_tcl_sub_option + 0x1f32
0x13a38: qexe_process_cmdline_arguments + 0x488
0x13bd4: qexe_standard_main + 0x84
0x19dd6: qsyn_main + 0xa6
0x4e21: msg_main_thread + 0x11
0x1c98: _thr_final_wrapper + 0x8
0x5515: msg_thread_wrapper + 0x85
0x3921: mem_thread_wrapper + 0x31
0x60f1: msg_exe_main + 0x81
0x1ba1c: _main + 0x1c
0x24cd7: __ftol2 + 0x1e1
0x162c3: BaseThreadInitThunk + 0x23
0x61f68: RtlSubscribeWnfStateChangeNotification + 0x438
0x61f33: RtlSubscribeWnfStateChangeNotification + 0x403
End-trace
Quartus II 32-bit Version 13.0.1 Build 232 06/12/2013 SJ Web Edition
Service Pack Installed:
1
I don't know how you synthesize your code. When I copy the code into Vivado, the IDE shows two errors:
15 - f_in cannot be used within its own interface list
19 - f_in is illegal in an expression
If you want a configurable width for a port, use a generic, as you already do for Frequency_num.
Generic ( InputSize : INTEGER := 4;
OutputSize : INTEGER := 5
);
Port ( Frequency_num : in STD_LOGIC_VECTOR(InputSize - 1 downto 0);
clk_in : in STD_LOGIC;
clr : in STD_LOGIC;
clk_out: out STD_LOGIC;
clk_out_full_num: out STD_LOGIC_VECTOR(OutputSize - 1 downto 0)
);
You have to set a fixed value for hardware synthesis. You can't create hardware with a variable bus width!
Set the width of q to the width of clk_out_full_num
signal q:std_logic_vector (OutputSize - 1 downto 0);
and you can synthesize your code.
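Put together, the whole divider might look like the sketch below. Some of it is assumption rather than taken from the question: the counter uses numeric_std instead of std_logic_unsigned, clr is treated as an active-high asynchronous reset (the original process never used it), and the default OutputSize of 16 is chosen so that every value of a 4-bit Frequency_num is a valid bit index.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity Clock_Divider is
  generic (
    InputSize  : positive := 4;   -- width of the frequency selector
    OutputSize : positive := 16   -- width of the free-running counter
  );
  port (
    Frequency_num    : in  std_logic_vector(InputSize - 1 downto 0);
    clk_in           : in  std_logic;
    clr              : in  std_logic;
    clk_out          : out std_logic;
    clk_out_full_num : out std_logic_vector(OutputSize - 1 downto 0)
  );
end entity Clock_Divider;

architecture rtl of Clock_Divider is
  signal q : unsigned(OutputSize - 1 downto 0) := (others => '0');
begin
  process (clk_in, clr)
  begin
    if clr = '1' then              -- assumed: asynchronous, active-high clear
      q <= (others => '0');
    elsif rising_edge(clk_in) then
      q <= q + 1;                  -- bit i of q toggles at clk_in / 2**(i+1)
    end if;
  end process;

  -- The selector value must stay below OutputSize, otherwise the index is out of range.
  clk_out          <= q(to_integer(unsigned(Frequency_num)));
  clk_out_full_num <= std_logic_vector(q);
end architecture rtl;
```

Both widths are now generics, so they are fixed at elaboration for each instance, which is what the tools need.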
I'm trying to create a flexible array of constants. I want to use a 2D array which may sometimes be for example a 2x1, 2x2, 3x2 array etc. For example:
type int_2d_array is array (integer range<>, integer range<>) of integer;
constant M : positive := 2;
constant nMax : positive := 1;
constant n : int_2d_array(M - 1 downto 0, nMax - 1 downto 0) := ( (1) , (2) ); -- wrong
error: type int_2d_array does not match with the integer literal
If I do this, it doesn't complain:
type int_2d_array is array (integer range<>, integer range<>) of integer;
constant M : positive := 2;
constant nMax : positive := 2;
constant n : int_2d_array(M - 1 downto 0, nMax - 1 downto 0) := ( ( 0,1 ) , ( 2,2 )); -- accepted
Is the first example even possible using a 2D array?
The LRM (section 9.3.3 Aggregates) states:
Aggregates containing a single element association
shall always be specified using named association in order to distinguish them from parenthesized expressions.
So, this is OK:
constant n : int_1d_array(0 downto 0) := ( 0 => 1 );
and this is not:
constant n : int_1d_array(0 downto 0) := ( 1 );
http://www.edaplayground.com/x/6a4
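For the two-dimensional case in the question, the same rule applies to each inner aggregate. A minimal sketch (the package name agg_example_pkg is made up; everything else is taken from the question):

```vhdl
package agg_example_pkg is
  type int_2d_array is array (integer range <>, integer range <>) of integer;
  constant M    : positive := 2;
  constant nMax : positive := 1;
  -- Each row has exactly one element, so each inner aggregate uses
  -- named association (0 => ...) instead of a bare parenthesized value.
  constant n : int_2d_array(M - 1 downto 0, nMax - 1 downto 0) :=
    ( (0 => 1), (0 => 2) );
end package agg_example_pkg;
```

The outer aggregate has two elements, so positional association is fine there; only the single-element inner aggregates need the index spelled out.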
I managed to compile the first example in the following ugly way:
type int_2d_array is array (integer range<>, integer range<>) of integer;
constant M : positive := 2;
constant nMax : positive := 1;
constant n : int_2d_array(M - 1 downto 0, nMax - 1 downto 0) := ( (others => 1) , (others => 2) );
Strange behavior, indeed.
I used Xilinx ISE 14.1 to write the following code.
I believe the syntax is correct, but the Xilinx IDE shows errors at lines 27 and 30.
I am trying to find the first partial derivatives of a matrix of numbers which is similar to finding the edges in an image.
The function by2i is used to convert the bytes (i.e. bits) to integer number.
In this VHDL code I am getting error messages:
"ERROR:HDLCompiler:806 B:/gxgyVHDL.vhd" Line 27: Syntax error near "return".
"ERROR:HDLCompiler:806 - "B:/gxgyVHDL.vhd" Line 30: Syntax error near ","".
I am unable to correct these errors as I know very little VHDL. I have only learned basic VHDL programming, such as implementing MUXes, counters, etc.
This is the first time I am writing a program for image processing, and I'm not sure whether this program works as expected, but it works well in MATLAB and Python.
Please help to correct these errors.
Here is vhdl code:
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;
use IEEE.math_real.ALL;
entity getGxGy is
generic (width : natural := 66;
height : natural := 130); --width and height of the image window.
Port( pxl : in STD_LOGIC_VECTOR(7 downto 0);
clk : in bit;
fv : out real); --need to configure 'fv' signal properly to appropriate bit vector.
end getHOGfv;
architecture behave of getGxGy is
function by2i (b : STD_LOGIC_VECTOR(7 downto 0)) return natural is
variable num : natural;
begin
num := 0;
for i in b'Range loop
if b(i) = '1' then
num := num + 2**i;
end if;
end loop
return num
end by2i;
type bufarr is array (1 to height, 1 to width) of natural;
type gxgy is array (1 to height-2, 1 to width-2) of integer;
--signal tempfv : mat4;
process(clk, pxl)
variable buf: bufarr;
variable gx, gy: gxgy;
begin
--Buffer to store/create 64*128 pixels/window
for h in 2 to height-1 loop
for w in 2 to width-1 loop
buf(h)(w) := by2i(pxl);
end loop;
end loop;
--1pixel padding
for w in 1 to width loop
buf(1)(w) := 0;
end loop;
for w in 1 to width loop
buf(height)(w) := 0;
end loop;
for h in 2 to height-1 loop
buf(h)(1) := 0;
end loop;
for h in 2 to height-1 loop
buf(h)(width) := 0;
end loop;
--compute gradients
for h in 2 to height-1 loop
for w in 2 to width-1 loop
gx(h)(w) := buf(h+1)(w)-buf(h-1)(w);
gy(h)(w) := buf(h)(w+1)-buf(h)(w-1);
mag(h)(w) := abs(gx(h)(w)+gy(h)(w));
ang(h)(w) := gy(h)(w)/gx(h)(w);
end loop;
end loop;
end process;
end behave;
Several problems:
Your entity names do not match. That is, entity getGxGy does not match end getHOGfv;
You are missing a trailing ; on the end loop in by2i
You are missing a trailing ; on the return in by2i
You are missing a begin statement in your architecture (between the type gxgy declaration and the process(clk, pxl)).
Your syntax for indexing multidimensional arrays is wrong. Rather than buf(1)(w), it should be buf(1, w).
Neither mag nor ang is defined.
When you have a large number of errors, it can be difficult to track down the exact cause. Often the compiler gets confused when reporting the errors. Start with the first one, fix it, and recompile. Continue until things clean up.
Also, a point of clarification. You don't need by2i. You can use numeric_std to do the conversion (thanks to scary_jeff for pointing this out). Use to_integer(unsigned(pxl)) to do the conversion.
And one further point. Do not use both std_logic_unsigned and numeric_std at the same time. numeric_std is the standard way to use signed and unsigned numbers. std_logic_unsigned was a vendor specific extension that is not standard.
Edit: You used the following syntax to define your arrays:
type bufarr is array (1 to height, 1 to width) of natural;
This is fine. And as I noted above you have to use the buf(h, w) syntax. But you could define it differently, such as:
type width_array is array(1 to width) of natural;
type bufarr is array(1 to height) of width_array;
Which you could then index using buf(h)(w).
I prefer the former.
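A small side-by-side sketch of the two styles, in case it helps. The package index_style_pkg, the function demo and the small sizes are made up for illustration:

```vhdl
package index_style_pkg is
  constant height : natural := 4;
  constant width  : natural := 3;

  -- Style 1: one true two-dimensional array, indexed as a(h, w).
  type bufarr_2d is array (1 to height, 1 to width) of natural;

  -- Style 2: an array of rows, indexed as b(h)(w).
  type width_array   is array (1 to width) of natural;
  type bufarr_nested is array (1 to height) of width_array;

  function demo return natural;
end package index_style_pkg;

package body index_style_pkg is
  function demo return natural is
    variable a   : bufarr_2d     := (others => (others => 0));
    variable b   : bufarr_nested := (others => (others => 0));
    variable row : width_array;
  begin
    a(2, 3) := 7;     -- comma-separated indices for the two-dimensional type
    b(2)(3) := 7;     -- one index per array level for the nested type
    row     := b(2);  -- the nested form lets you take out a whole row
    return a(2, 3) + row(3);
  end function demo;
end package body index_style_pkg;
```

One practical difference: a true two-dimensional array cannot be sliced, so there is no equivalent of row := b(2) for the first style.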
In addition to the syntax items and missing declarations noted by PlayDough, there are two superfluous context clauses, for packages numeric_std (which should not be mixed with the Synopsys arithmetic package std_logic_unsigned) and math_real (which isn't yet used).
After all the changes are edited in:
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;
-- use ieee.numeric_std.all;
-- use ieee.math_real.all;
entity getgxgy is
generic (width : natural := 66;
height : natural := 130); -- width and height of the image window.
port( pxl : in std_logic_vector(7 downto 0);
clk : in bit;
fv : out real); -- need to configure 'fv' signal properly to appropriate bit vector.
end getgxgy; -- WAS gethogfv;
architecture behave of getgxgy is
function by2i (b : std_logic_vector(7 downto 0)) return natural is
variable num : natural;
begin
num := 0;
for i in b'range loop
if b(i) = '1' then
num := num + 2 ** i;
end if;
end loop; -- MISSING ';'
return num; -- MISSING ';'
end function by2i;
type bufarr is array (1 to height, 1 to width) of natural;
type gxgy is array (1 to height - 2, 1 to width - 2) of integer;
--signal tempfv : mat4;
begin -- for architecture body WAS MISSING
process (clk, pxl)
variable buf: bufarr;
variable gx, gy: gxgy;
variable mag, ang: gxgy; -- MISSING DECLARATIONS
begin
--buffer to store/create 64*128 pixels/window
for h in 2 to height - 1 loop
for w in 2 to width - 1 loop
buf(h, w) := by2i(pxl); -- WAS buf(h)(w)
end loop;
end loop;
--1pixel padding
for w in 1 to width loop
buf(1, w) := 0; -- WAS buf(1)(w)
end loop;
for w in 1 to width loop
buf(height, w) := 0; -- WAS buf(height)(w)
end loop;
for h in 2 to height - 1 loop
buf(h, 1) := 0; -- WAS buf(h)(1)
end loop;
for h in 2 to height - 1 loop
buf(h, width) := 0; -- WAS buf(h)(width)
end loop;
--compute gradients
for h in 2 to height - 1 loop
for w in 2 to width - 1 loop
gx(h, w) := buf(h + 1, w) - buf(h - 1, w); -- WAS gx(h)(w), buf(h+1)(w) and buf(h-1)(w)
gy(h, w) := buf(h, w + 1) - buf(h, w - 1); -- WAS gy(h)(w), buf(h)(w+1) and buf(h)(w-1)
mag(h, w) := abs(gx(h, w) + gy(h, w)); -- WAS mag(h)(w), gx(h)(w) and gy(h)(w)
ang(h, w) := gy(h, w) / gx(h, w); --WAS ang(h)(w), gy(h)(w) and gx(h)(w)
end loop;
end loop;
end process;
end architecture behave;
Your code analyzes and elaborates, noting that there is no assignment to fv, that type REAL is not synthesis eligible and that there is no synthesis eligible use of clk.
If clk were std_logic (or std_ulogic) you could use the std_logic_1164 function rising_edge.
Adding a recognized sequential logic RTL construct for a clock edge gives:
process (clk) -- pxl NOT NEEDED , pxl)
variable buf: bufarr;
variable gx, gy: gxgy;
variable mag, ang: gxgy; -- MISSING DECLARATIONS
begin
if clk'event and clk = '1' then
--buffer to store/create 64*128 pixels/window
for h in 2 to height - 1 loop
for w in 2 to width - 1 loop
buf(h, w) := conv_integer(pxl); -- WAS buf(h)(w)
end loop; -- CHANGED to use conv_integer
end loop;
--1pixel padding
for w in 1 to width loop
buf(1, w) := 0; -- WAS buf(1)(w)
end loop;
for w in 1 to width loop
buf(height, w) := 0; -- WAS buf(height)(w)
end loop;
for h in 2 to height - 1 loop
buf(h, 1) := 0; -- WAS buf(h)(1)
end loop;
for h in 2 to height - 1 loop
buf(h, width) := 0; -- WAS buf(h)(width)
end loop;
--compute gradients
for h in 2 to height - 1 loop
for w in 2 to width - 1 loop
gx(h, w) := buf(h + 1, w) - buf(h - 1, w); -- WAS gx(h)(w), buf(h+1)(w) and buf(h-1)(w)
gy(h, w) := buf(h, w + 1) - buf(h, w - 1); -- WAS gy(h)(w), buf(h)(w+1) and buf(h)(w-1)
mag(h, w) := abs(gx(h, w) + gy(h, w)); -- WAS mag(h)(w), gx(h)(w) and gy(h)(w)
ang(h, w) := gy(h, w) / gx(h, w); --WAS ang(h)(w), gy(h)(w) and gx(h)(w)
end loop;
end loop;
end if;
end process;
Note also the switch from the function by2i to the conv_integer function from package std_logic_unsigned.
With these changes, and with the function by2i deleted, the code analyzes.
Genning up a testbench to look for bounds errors:
library ieee;
use ieee.std_logic_1164.all;
entity getgxgy_tb is
end entity;
architecture foo of getgxgy_tb is
signal pxl: std_logic_vector(7 downto 0) := (others => '0');
signal clk: bit;
signal fv: real;
begin
DUT:
entity work.getgxgy
port map (
pxl => pxl,
clk => clk,
fv => fv
);
CLOCK:
process
begin
wait for 10 ns;
clk <= not clk;
if now > 120 ns then
wait;
end if;
end process;
end architecture;
We elaborate and run the testbench and get a run-time error!
The error is a division by zero in the assignment to ang, so your algorithm still needs a bit of work.
Blocking that with an if statement, we find there's a bounds error in the assignment:
gx(h, w) := buf(h + 1, w) - buf(h - 1, w); -- WAS gx(h)(w), buf(h+1)(w) and buf(h-1)(w)
And that's caused by hitting w = 65 when
type gxgy is array (1 to height - 2, 1 to width - 2) of integer;
type gxgy's second dimension corresponding to w has a range to width - 2 while w reaches width - 1 which is out of bounds.
So a bit more algorithmic expression tuning still to do.
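One possible way to address both run-time errors is sketched below. It is simulation-only, and several things in it are assumptions: the entity name gradient_index_sketch, the small generic values, the choice to store the result for interior pixel (h, w) at index (h - 1, w - 1), and the placeholder value 0 for ang when gx is zero.

```vhdl
entity gradient_index_sketch is
  generic (width  : natural := 6;
           height : natural := 5);
end entity gradient_index_sketch;

architecture sim of gradient_index_sketch is
  type bufarr is array (1 to height, 1 to width) of natural;
  type gxgy   is array (1 to height - 2, 1 to width - 2) of integer;
begin
  process
    variable buf         : bufarr := (others => (others => 0));
    variable gx, gy, ang : gxgy;
  begin
    for h in 2 to height - 1 loop
      for w in 2 to width - 1 loop
        -- Interior pixel (h, w) maps to result element (h - 1, w - 1),
        -- which stays inside 1 to height - 2 and 1 to width - 2.
        gx(h - 1, w - 1) := buf(h + 1, w) - buf(h - 1, w);
        gy(h - 1, w - 1) := buf(h, w + 1) - buf(h, w - 1);
        -- Guard the division; 0 is only a placeholder for the undefined case.
        if gx(h - 1, w - 1) /= 0 then
          ang(h - 1, w - 1) := gy(h - 1, w - 1) / gx(h - 1, w - 1);
        else
          ang(h - 1, w - 1) := 0;
        end if;
      end loop;
    end loop;
    wait;
  end process;
end architecture sim;
```

The alternative is to declare gxgy over 2 to height - 1 and 2 to width - 1 so that the indices match the loop variables directly; which one reads better depends on how the results are consumed.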
It isn't particularly clear what you intend to register. If it's just fv, that could occur in a different process, with the current process's sensitivity list set to just pxl, and with gx, gy, mag and ang made into signals.
It's likely that all the abs, multiplies and divides will not fit in a target FPGA, requiring the operations to be spread over some number of clocks using common resources for arithmetic operations. VHDL describes hardware, and every operator invocation or function call can imply its own hardware.
In synthesis a loop statement has its sequence of statements 'unrolled', and where no interdependencies are found the iterations produce separate hardware. For the h in 2 to height - 1 and w in 2 to width - 1 ranges in your nested loops, your generic values imply 128 * 64 = 8192 subtracts for each of gx and gy, the same number of abs and additions for mag, and the same number of divides for ang, all from changing the value of pxl. This tells us your hardware isn't going to fit in any FPGA without sharing resources over some number of clocks, a time and space tradeoff.
So not only does your algorithm need a bit of work, you also need to take implementation resources into account.
You don't program in VHDL you describe hardware.