Multiply Accumulate Unit

Multiply Accumulate Unit - vhdl

I have a MAC unit for a transport triggered architecture processor, but for some reason. the multiply-accumulate part isn't working. I tested it separately and it worked, but when simulated out_reg_int is always 0 even though op_a and op_b are the correct values and I have included Numeric_STD.
-- These are just for the question
num_in_reg := 2;
num_reg := 4;
data_size := 71;
data_width := 64;
SIGNAL op_a, op_b : SIGNED(data_width-1 DOWNTO 0);
Signal out_reg_int, out_reg_out_int : SIGNED((data_width*2)-1 DOWNTO 0);
TYPE bus_array3 IS ARRAY (num_reg-1 DOWNTO num_in_reg) OF Signed(data_width-1 DOWNTO 0);
SIGNAL out_reg_in: bus_array3; -- for output register inputs
SIGNAL result: bus_array3; -- for results
op_a <= SIGNED(reg_out(0)(data_width-1 DOWNTO 0)); -- operand A
op_b <= SIGNED(reg_out(1)(data_width-1 DOWNTO 0)); -- operand B (trigger register)
-- Multiply Accumulate Operation
out_reg_int <= (op_a * op_b) + out_reg_out_int; -- multiply operands and add to current result
out_reg_out_int <= result(3) & result(2); -- combining the output registers lower and higher parts
out_reg_in(2) <= out_reg_int(data_width-1 downto 0); -- lower part of the result
out_reg_in(3) <= out_reg_int((data_width*2)-1 downto data_width); -- upper part of the result
------------------------------------------------------
-- Basically a register for pre place and route simulation.
------------------------------------------------------
result_regs: FOR J IN num_in_reg TO num_reg-1 GENERATE
Razor_result : entity work.Razor(Behavioral)
Generic Map (data_size => data_width)
Port Map (clk => CLK,
rst => Test_rst,
data => std_logic_vector(out_reg_in(J)),
wen => out_reg_en(J),
signed(data_out) => result(J),
error => Razor_error_int(J));
End Generate;
In the Testbench op_a is x64 and op_b is x100 and out_reg_int, out_reg_in and out_reg_out_int are all 0.
The full code for the unit is This :
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;
use work.DigEng.All;
entity MAC is
Generic (data_size : INTEGER := 71; -- size of data
num_in_reg: INTEGER := 2; -- number of input registers
data_width: INTEGER := 64; --size of data minus parity bits
num_reg : INTEGER := 4);-- number of input and output registers
PORT (CLK : IN STD_LOGIC;
rst : IN STD_LOGIC; -- reset
en_all : IN STD_LOGIC; -- enable for all sequential elements
test_mode : IN STD_LOGIC; -- Hamming Fault Tolerance Testing
reg_input : IN STD_LOGIC_VECTOR(data_size-1 downto 0);
reg_output : OUT STD_LOGIC_VECTOR(data_size-1 downto 0);
error_en : IN STD_LOGIC; -- Hamming Error enable
we_regs : IN STD_LOGIC_VECTOR(num_reg-1 DOWNTO 0); -- registers write enable
re_regs : IN STD_LOGIC_VECTOR(num_reg-1 DOWNTO 0); -- registers read enable
Razor_error : OUT STD_LOGIC; -- Timing error occured
HW_Error : OUT STD_LOGIC); -- Error signal for FU replacement
end MAC;
architecture Behavioral of MAC is
---------------------------
-- Components declaration
---------------------------
-- Declaring D flipflop
COMPONENT d_ff IS
PORT(clk : IN STD_LOGIC;
rst : IN STD_LOGIC;
en : IN STD_LOGIC;
data: IN STD_LOGIC;
q : OUT STD_LOGIC);
END COMPONENT;
--------------------
-- Internal signals
--------------------
-- Type declaration to array of buses for Outputs of all registers
TYPE bus_array IS ARRAY (num_reg-1 DOWNTO 0) OF STD_LOGIC_VECTOR(data_size-1 DOWNTO 0);
SIGNAL reg_out: bus_array; -- for registers outputs
TYPE bus_array4 IS ARRAY (num_in_reg-1 DOWNTO 0) OF STD_LOGIC_VECTOR(data_size-1 DOWNTO 0);
SIGNAL reg_input_int: bus_array4; -- for registers outputs
TYPE bus_array3 IS ARRAY (num_reg-1 DOWNTO num_in_reg) OF Signed(data_width-1 DOWNTO 0);
SIGNAL out_reg_in: bus_array3; -- for output register inputs
SIGNAL result: bus_array3; -- for results
TYPE bus_array5 IS ARRAY (num_reg-1 DOWNTO num_in_reg) OF STD_LOGIC_VECTOR(data_size-1 DOWNTO 0);
Signal post_op_hamming_in : bus_array5;
-- Input registers enables
TYPE reg_en IS ARRAY (num_in_reg-1 DOWNTO 0) OF STD_LOGIC;
SIGNAL in_reg_en: reg_en;
-- Output registers enables
TYPE reg_en2 IS ARRAY (num_reg-1 DOWNTO num_in_reg) OF STD_LOGIC;
SIGNAL out_reg_en: reg_en2;
-- Oprand A, operand B and operation trigger
SIGNAL op_a, op_b : SIGNED(data_width-1 DOWNTO 0);
Signal test_Input1, test_Input2, test_Output_High, test_Output_low : STD_LOGIC_VECTOR(data_width-1 DOWNTO 0);
SIGNAL t_op, test_rst, test_en, test_rst_int, test_en_delayed : STD_LOGIC; -- enable for the output registers
Signal out_reg_int, out_reg_out_int : SIGNED((data_width*2)-1 DOWNTO 0);
SIGNAL In_Hamming_out : STD_LOGIC_VECTOR(data_size-1 DOWNTO 0);
Signal Razor_error_int : STD_LOGIC_VECTOR(Num_reg-1 downto 0);
BEGIN
-----------------------------------------------
-- Logic for the sequential elements enable
-----------------------------------------------
-- For input registers
input_regs_en: FOR H IN 0 TO num_in_reg-1 GENERATE
in_reg_en(H) <= (we_regs(H) AND en_all) OR Test_en;
END GENERATE;
-- Instantiate D flipflop
Inst_d_ff: d_ff
PORT MAP(clk => clk ,
rst => rst,
en => en_all, -- delay the trigger signal for a clock cycle
data => we_regs(1), -- delayed reg(1) write enable is used as trigger
q => t_op);
-- For output registers
output_regs_en: FOR I IN num_in_reg TO num_reg-1 GENERATE
out_reg_en(I) <= (t_op AND we_regs(1) AND en_all) OR Test_en ;
END GENERATE;
-------------------------------------------------
-- Instantiate hamming module to check input data
-------------------------------------------------
In_Hamming_module: entity work.Hamming_Module(Behavioral)
Generic Map (data_size => data_size)
PORT MAP(Data_in => reg_input,
Data_out => In_Hamming_out,
Error_en => error_en,
Error_sig => OPEN);
--------------------------------------------------------
-- Razor Shadow Latch Unit for checking Timing on Input
--------------------------------------------------------
input_regs: FOR J IN 0 TO num_in_reg-1 GENERATE
Razor_In : entity work.Razor(Behavioral)
Generic Map (data_size => data_size)
Port Map (clk => CLK,
rst => rst,
data => reg_input_int(J),
wen => in_reg_en(J),
data_out => reg_out(J),
error => Razor_error_int(J));
END GENERATE;
------------------------------------------------------
-- Razor Shadow Latch Unit for checking Timing post OP
------------------------------------------------------
result_regs: FOR J IN num_in_reg TO num_reg-1 GENERATE
Razor_result : entity work.Razor(Behavioral)
Generic Map (data_size => data_width)
Port Map (clk => CLK,
rst => Test_rst,
data => std_logic_vector(out_reg_in(J)),
wen => out_reg_en(J),
signed(data_out) => result(J),
error => Razor_error_int(J));
End Generate;
------------------------------------------------------
-- Instantiate hamming module to recalculate
-- parity values after operations
------------------------------------------------------
hamming_post_ops: FOR J IN num_in_reg TO num_reg-1 GENERATE
post_op_hamming_in(J) <= std_logic_vector(resize(result(J), data_size));
hamming_post_ops: entity work.Hamming_Module(Behavioral)
Generic Map (data_size => data_size)
PORT MAP(Data_in => post_op_hamming_in(J),
Data_out => reg_out(J),
Error_en => '0',
Error_sig => OPEN);
End Generate;
----------------------------
-- MAC Testing
----------------------------
Mac_Tester_Unit : entity work.MAC_Tester(Behavioral)
Generic Map(data_size => data_size, -- size of data
num_in_reg => num_in_reg, -- number of input registers
data_width => data_width, -- size of data minus parity bits
num_reg => num_reg) -- number of input and output registers)
Port Map(clk => clk,
en_all => Test_en,
Input_reg1 => test_Input1,
Input_reg2 => test_Input2,
Output_reg_lower => test_Output_High,
Output_reg_higher => test_Output_low,
rst => test_rst_int);
--------------------------------------------------------
-- Assigning the output of input registers to operands
--------------------------------------------------------
op_a <= SIGNED(reg_out(0)(data_width-1 DOWNTO 0)); -- operand A
op_b <= SIGNED(reg_out(1)(data_width-1 DOWNTO 0)); -- operand B (trigger register)
-- Multiply Accumulate Operation
out_reg_int <= (op_a * op_b) + out_reg_out_int; -- multiply operands and add to current result
out_reg_out_int <= result(3) & result(2); -- combining the output registers lower and higher parts
out_reg_in(2) <= out_reg_int(data_width-1 downto 0); -- lower part of the result
out_reg_in(3) <= out_reg_int((data_width*2)-1 downto data_width); -- upper part of the result
----------------------------------------------------------------
-- Tri-state buffer to output the chosen register output to the
-- central data bus
----------------------------------------------------------------
connections2: FOR J IN num_in_reg TO num_reg-1 GENERATE
reg_output <= reg_out(J) WHEN re_regs(J) = '1' ELSE (OTHERS => 'Z');
END GENERATE;
Razor_error <= Razor_error_int(0) OR Razor_error_int(1) OR Razor_error_int(2) OR Razor_error_int(3);
-----------------------------
-- Test Vector Assignment
-----------------------------
Test_en <= '1' When we_regs = "0000" AND re_regs = "0000" AND en_all = '1' AND error_en = '1' else '0';
Test_rst <= Test_rst_int When Test_en = '1' else rst;
reg_input_int(0)(data_width-1 downto 0) <= test_Input1 when Test_en = '1' else
In_Hamming_out(data_width-1 downto 0);
reg_input_int(1)(data_width-1 downto 0) <= test_Input2 when Test_en = '1' else
In_Hamming_out(data_width-1 downto 0);
------------------------------------------------------
-- Instantiate D flipflop for delaying output testing
------------------------------------------------------
Test_d_ff: d_ff
PORT MAP(clk => clk ,
rst => rst,
en => en_all,
data => Test_en, -- delay Test_en for 1 clock cycle for output check
q => Test_en_delayed);
HW_Error <= '0' When Test_en_delayed <= '0' OR (signed(test_Output_High) = result(3) AND signed(test_Output_low) = result(2)) else
'1';
end Behavioral;

Related

VHDL Changing and holding the signal in if statement

Im super new with VHDL and I have an assigned project to do. Basically what I'm aiming is to display 2 numbers and substract and add them with the help of a switch. (On FPGA Board)
For example:
Assume I have a signal A with the bit value 9 and B with 2, whenever I open the switch it will operate A-B and display 7. The problem is, when I close the switch, I get 9 instead of 7. (It doesn't hold the value) What I want is to display all the substraction results with opening and closing the same switch : 9,7,5,3,1
What I did so far:
I coded a decoder for the seven segment display
I have a 5 bit bitslice adder substractor implementation
In my main module, I have them as components and instantiations
In main module, I have signals for the two numbers which I declared
as the input values of my bitslice instantiatons
In the process, I assigned the default values to my signals (9 and 2)
Inside of an if statement, I assigned the signal A to the result
signal (which is assigned to adder-substractor instantiations bit by bit) And assigned the signal for seven segment to result.
It works when I open the switch, but when I close the switch default value holds. How can I update the value of A with the result each time?
My Code:
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
-- Uncomment the following library declaration if using
-- arithmetic functions with Signed or Unsigned values
--use IEEE.NUMERIC_STD.ALL;
-- Uncomment the following library declaration if instantiating
-- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;
entity main is
Port (
S : in STD_LOGIC_VECTOR (1 downto 0);
Cin : in STD_LOGIC;
Cout: out STD_LOGIC;
StartGameSwitch : IN std_logic;
SevenSegControl : OUT std_logic_vector(7 downto 0):=x"ff";
SevenSegBus : OUT std_logic_vector(7 downto 0);
clk : IN std_logic);
end main;
architecture Behavioral of main is
--COMPONENTS-------------------------------------------------------------------------------------------------------
COMPONENT sevenSegment
PORT(
A : IN std_logic_vector(4 downto 0);
B : IN std_logic_vector(4 downto 0);
C : IN std_logic_vector(4 downto 0);
D : IN std_logic_vector(4 downto 0);
E : IN std_logic_vector(4 downto 0);
F : IN std_logic_vector(4 downto 0);
G : IN std_logic_vector(4 downto 0);
H : IN std_logic_vector(4 downto 0);
SevenSegControl : OUT std_logic_vector(7 downto 0);
SevenSegBus : OUT std_logic_vector(7 downto 0);
clk : IN std_logic
);
END COMPONENT;
COMPONENT logic
PORT(
A : IN std_logic;
B : IN std_logic;
COld : IN std_logic;
S : IN std_logic_vector(1 downto 0);
AG : IN std_logic;
BG : IN std_logic;
CNew : OUT std_logic;
NumberBit : OUT std_logic;
negativeSign : OUT std_logic
);
END COMPONENT;
COMPONENT comparator
PORT(
A : IN std_logic_vector(4 downto 0);
B : IN std_logic_vector(4 downto 0);
AG : OUT std_logic;
BG : OUT std_logic
);
END COMPONENT;
--SIGNALS-------------------------------------------------------------------------------------------------------
signal sA,sB,sC,sD,sE,sF,sG,sH: std_logic_vector (4 downto 0) ;
signal logicLed,result,negativeSign,sevenSegmentResult,sevenSegmentNegativeSign: std_logic_vector (4 downto 0);
signal c0,c1,c2,c3,c4: std_logic;
signal AG,BG: std_logic;
signal HP1,HP2,ATK1,ATK2: std_logic_vector (4 downto 0);
--CONSTANTS-------------------------------------------------------------------------------------------------------
constant charO:std_logic_vector(4 downto 0):= "01010"; --10
constant charP:std_logic_vector(4 downto 0):= "01011"; --11
constant charE:std_logic_vector(4 downto 0):= "01100"; --12
constant charN:std_logic_vector(4 downto 0):= "01101"; --13
constant charF:std_logic_vector(4 downto 0):= "01110"; --14
constant charI:std_logic_vector(4 downto 0):= "01111"; --15
constant charG:std_logic_vector(4 downto 0):= "10000"; --16
constant charH:std_logic_vector(4 downto 0):= "10001"; --17
constant charT:std_logic_vector(4 downto 0):= "10010"; --18
constant charA: std_logic_vector(4 downto 0):= "10011";--19
constant char0:std_logic_vector(4 downto 0):= "00000";
constant char1:std_logic_vector(4 downto 0):= "00001";
constant char2:std_logic_vector(4 downto 0):= "00010";
constant char3:std_logic_vector(4 downto 0):= "00011";
constant char4:std_logic_vector(4 downto 0):= "00100";
constant char5:std_logic_vector(4 downto 0):= "00101";
constant char6:std_logic_vector(4 downto 0):= "00110";
constant char7:std_logic_vector(4 downto 0):= "00111";
constant char8:std_logic_vector(4 downto 0):= "01000";
constant char9:std_logic_vector(4 downto 0):= "01001";
constant charEmpty:std_logic_vector(4 downto 0):= "11111";
begin
--INSTANTIATIONS-------------------------------------------------------------------------------------------------------
Inst_logic1: logic PORT MAP(
A =>HP2(0) ,
B =>ATK1(0) ,
COld =>c0 ,
CNew =>c1,
NumberBit =>result(0) ,
S =>S,
AG => AG,
BG => BG,
negativeSign => negativeSign(0)
);
Inst_logic2: logic PORT MAP(
A =>HP2(1) ,
B =>ATK1(1) ,
COld =>c1 ,
CNew =>c2,
NumberBit =>result(1) ,
S =>S,
AG => AG,
BG => BG,
negativeSign => negativeSign(1)
);
Inst_logic3: logic PORT MAP(
A =>HP2(2) ,
B =>ATK1(2) ,
COld =>c2 ,
CNew =>c3,
NumberBit =>result(2) ,
S =>S,
AG => AG,
BG => BG,
negativeSign => negativeSign(2)
);
Inst_logic4: logic PORT MAP(
A =>HP2(3) ,
B =>ATK1(3) ,
COld =>c3 ,
CNew =>c4,
NumberBit =>result(3) ,
S =>S,
AG => AG,
BG => BG,
negativeSign => negativeSign(3)
);
Inst_logic5: logic PORT MAP(
A =>HP2(4) ,
B =>ATK1(4) ,
COld =>c4 ,
CNew =>Cout,
NumberBit =>result(4) ,
S =>S,
AG => AG,
BG => BG,
negativeSign => negativeSign(4)
);
Inst_comparator: comparator PORT MAP(
A => HP2,
B => ATK1,
AG => AG,
BG => BG
);
--GAME LOGIC-------------------------------------------------------------------------------------------------------
process(StartGameSwitch)
begin
--DEFAULT VALUES FOR HP1,HP2,ATK1,ATK2--
-- I WANT HP1 AND HP2 TO CHANGE ACCORDING TO SWITCHES --
HP2 <= char9;
ATK1 <= char2;
HP1 <= char9;
ATK2 <= char3;
if(StartGameSwitch = '0') then -- OPEN P35 ON FPGA
sA <= charO; -- s_ are the signals for the seven segment
sB <= charP;
sC <= charE;
sD <= charN;
sE <= charEmpty;
sF <= charP;
sG <= char7;
sH <= char8;
else -- WHEN P35 IS OPENED ( WHEN THE GAME STARTS)
sA <= HP1; --HP POINT FOR P1
sB <= charEmpty;
sC <= ATK1; --Attack POINT FOR P1
sD <= charEmpty;
sE <= charEmpty;
sF <= ATK2; --Attack POINT FOR P2
sG <= charEmpty;
sH <= HP2; --HP POINT FOR P2
if(S = "01") then -- WHEN PLAYER 1 ATTACKS PLAYER 2 ( HP2 - ATK1 = RESULT (i.e 9-2 = 7))
HP2 <= result;
sH <= HP2; -- When SWITCH IS OPEN, IT SHOWS 7 WITHOUT ANY PROBLEM
end if;
-- HOWEVER, WHEN THE SWITCH IS AGAIN BACK TO 00, sH displays 9 instead of 7, HOW CAN I SAVE THE VALUE OF HP2?
end if;
end process;
-- SIGNALS ASSIGNED TO DISPLAY
Inst_sevenSegment: sevenSegment PORT MAP(
A =>sA, --1ST PLAYER HEALTH
B =>sB, -- 1ST PLAYER DMG
C =>sC ,
D =>sD ,
E =>sE ,
F =>sF ,
G =>sG , -- 2ND PLAYER DMG
H =>sH , --2ND PLAYER HEALTH
SevenSegControl =>SevenSegControl ,
clk => clk,
SevenSegBus => SevenSegBus
);
end Behavioral;
Thanks

The problem is in your game logic process. Since your HP and ATK signals are in the top level of the process, they essentially get a continuous assignment to the default values all the time. The only time this changes is when you open the switch and you see the expected result displayed. When you close the switch again, the values get assigned their defaults again, rather than retaining the previous arithmetic result.
If you assign those signals their defaults only when you trigger StartGameSwitch (see below), then you should see the correct display when using your switch.
On a side note, your process sensitivity list was incomplete. The logic in your process depends on the switch signal S, so this should be included in the sensitivity list (as I've shown below). It doesn't really matter for synthesis, only for correctly simulating your code.
process(StartGameSwitch,S)
begin
--DEFAULT VALUES FOR HP1,HP2,ATK1,ATK2--
-- I WANT HP1 AND HP2 TO CHANGE ACCORDING TO SWITCHES --
--HP2 <= char9;
--ATK1 <= char2;
--HP1 <= char9;
--ATK2 <= char3;
if(StartGameSwitch = '0') then -- OPEN P35 ON FPGA
HP2 <= char9; --Assign default values here instead
ATK1 <= char2;
HP1 <= char9;
ATK2 <= char3;
sA <= charO; -- s_ are the signals for the seven segment
sB <= charP;
sC <= charE;
sD <= charN;
sE <= charEmpty;
sF <= charP;
sG <= char7;
sH <= char8;
else -- WHEN P35 IS OPENED ( WHEN THE GAME STARTS)

Why won't a VHDL "inout" signal be assigned a value when used as an output

I have a VHDL description for a bridge, and the bidirectional signal "mem_data_port0" is not getting assigned any value regardless of what I write to it. The pins on the FPGA are assigned accordingly, but no output.
I have the code below (its for an FPGA going into a larger system, so comments will reflect other system components that are not the FPGA)
FYI: The FPGA is a Lattice LCMXO2-7000HC
Any tips on assigning "mem_data_port0"?
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
use ieee.std_logic_unsigned.all;
--use ieee.std_logic_signed.all;
use ieee.std_logic_arith.all;
use ieee.numeric_std.all;
library machxo2;
use machxo2.all;
entity SinglePhasePowerAnalyzerBridge is
port(
output0 : out std_logic; -- dummy, unassigned outputs
output1 : out std_logic;
output2 : out std_logic;
output3 : out std_logic;
output4 : out std_logic;
output5 : out std_logic;
output6 : out std_logic;
output7 : out std_logic;
accq_data_in : in std_logic_vector (15 downto 0); -- accquisition data in
accq_clk : in std_logic; -- accquisition clock in
accq_data_ready : in std_logic; -- data ready in 0: sending voltage/current data, 1: sending frequency data
accq_reset : in std_logic; -- accquisition reset (active low)
accq_voltage_current : in std_logic; -- accquisition select for voltage and current 0: voltage, 1: current
buffer_data_port0 : out std_logic_vector (15 downto 0); -- buffer data
buffer_address_port0 : in std_logic_vector(12 downto 0); -- buffer address low bits
buffer_address_high_port0 : in std_logic_vector(2 downto 0); -- buffer address high bits
buffer_high_byte_en_port0 : in std_logic; -- high byte enable
buffer_low_byte_en_port0 : in std_logic; -- low byte enable
buffer_write_en_port0 : in std_logic; -- write enable
buffer_output_en_port0 : in std_logic; -- output enable
buffer_memory_en_port0 : in std_logic; -- memory enable
buffer_interrupt_out : out std_logic;
-- pins going to external SRAM memory
mem_data_port0 : inout std_logic_vector (15 downto 0);
mem_address_port0 : out std_logic_vector(12 downto 0);
mem_address_high_port0 : out std_logic_vector (3 downto 0);
mem_memory_en_port0 : out std_logic := '1';
mem_output_en_port0 : out std_logic := '1';
mem_write_en_port0 : out std_logic := '1';
mem_high_byte_en_port0 : out std_logic := '0';
mem_low_byte_en_port0 : out std_logic := '0';
debug_out : out std_logic -- debug output
);
end SinglePhasePowerAnalyzerBridge;
architecture rtl of SinglePhasePowerAnalyzerBridge is
signal frequency_storage_buffer : std_logic_vector (15 downto 0); -- frequency buffer
signal voltage_storage_pointer : integer range 0 to 8191;
signal current_storage_pointer : integer range 0 to 8191;
signal signal_accq_data_in : std_logic_vector (15 downto 0);
signal signal_accq_clk : std_logic;
signal signal_inverse_accq_clk : std_logic;
signal signal_accq_data_ready : std_logic;
signal signal_accq_reset : std_logic;
signal signal_accq_voltage_current : std_logic;
signal signal_delayed_inverse_accq_clk : std_logic;
signal signal_buffer_data_port0 : std_logic_vector (15 downto 0);
signal signal_buffer_address_port0 : std_logic_vector(12 downto 0);
signal signal_buffer_address_high_port0 : std_logic_vector(2 downto 0);
signal signal_buffer_high_byte_en_port0 : std_logic;
signal signal_buffer_low_byte_en_port0 : std_logic;
signal signal_buffer_write_en_port0 : std_logic;
signal signal_buffer_output_en_port0 : std_logic;
signal signal_buffer_memory_en_port0 : std_logic;
begin
signal_accq_data_in <= accq_data_in; -- connect all the pins to internal signals
signal_accq_clk <= accq_clk;
signal_accq_data_ready <= accq_data_ready;
signal_accq_reset <= accq_reset;
signal_accq_voltage_current <= accq_voltage_current;
signal_inverse_accq_clk <= not signal_accq_clk;
--debug_out <= signal_delayed_inverse_accq_clk;
signal_buffer_address_port0 <= buffer_address_port0;
signal_buffer_address_high_port0 <= buffer_address_high_port0;
signal_buffer_high_byte_en_port0 <= buffer_high_byte_en_port0;
signal_buffer_low_byte_en_port0 <= buffer_low_byte_en_port0;
signal_buffer_write_en_port0 <= buffer_write_en_port0;
signal_buffer_output_en_port0 <= buffer_output_en_port0;
signal_buffer_memory_en_port0 <= buffer_memory_en_port0;
buffer_interrupt_out <= not signal_accq_data_ready;
output2 <= signal_accq_data_ready; -- dummy outputs, so the input pins are not left uncommected
output3 <= signal_accq_clk;
output4 <= signal_accq_reset;
output5 <= signal_accq_voltage_current;
general_event : process(signal_accq_clk, signal_accq_data_ready, signal_accq_data_in, signal_accq_voltage_current, signal_accq_reset,
signal_buffer_memory_en_port0, signal_buffer_output_en_port0, signal_buffer_write_en_port0, signal_buffer_address_high_port0, signal_buffer_address_port0,
signal_buffer_data_port0, frequency_storage_buffer, signal_accq_reset, signal_accq_data_ready, voltage_storage_pointer, current_storage_pointer)
begin
if(signal_accq_data_ready = '0') then -- when data from the acquisition controller comes in
mem_output_en_port0 <= '1'; -- disable memory output
mem_memory_en_port0 <= '0'; -- enable memory
mem_write_en_port0 <= signal_inverse_accq_clk; -- send the acquisition clock to the memory write enable pin
if(signal_accq_reset = '1') then -- if reset is not activated...
accq_clk_edge : if(rising_edge(signal_accq_clk)) then -- process on clock rising edge
if(signal_accq_voltage_current = '1') then -- if sending current data
mem_data_port0 <= signal_accq_data_in; -- store the data in the current buffer
mem_address_port0 <= std_logic_vector(to_unsigned(current_storage_pointer, 13));
mem_address_high_port0 <= "0001"; -- sending current data to memory
current_storage_pointer <= (current_storage_pointer + 1); -- increment the counter
elsif (signal_accq_voltage_current = '0') then -- do the same if sending voltage data
--mem_data_port0 <= signal_accq_data_in; -- store the data in the voltage buffer
mem_data_port0 <= "1111111111111111";
mem_address_port0 <= std_logic_vector(to_unsigned(voltage_storage_pointer, 13));
mem_address_high_port0 <= "0000"; -- sending voltage data to memory
voltage_storage_pointer <= (voltage_storage_pointer + 1); -- increment the counter
end if;
end if accq_clk_edge;
else -- if reset is activated...
voltage_storage_pointer <= 0; -- reset everything to 0
current_storage_pointer <= 0; -- reset everything to 0
end if;
--end process accq_event;
elsif (signal_accq_data_ready = '1') then -- if data ready is high, the buffer is in read mode
mem_data_port0 <= "ZZZZZZZZZZZZZZZZ"; -- set memory data lines to input, or read mode
mem_write_en_port0 <= '1'; -- disable writing to the memory
mem_output_en_port0 <= '0'; -- enable memory output
mem_address_port0 <= signal_buffer_address_port0; -- select the appropriate addreess
mem_address_high_port0 (2 downto 0) <= signal_buffer_address_high_port0; -- do the same with the high bits
if(rising_edge(signal_accq_clk) and signal_accq_reset = '1') then -- if the accquisition MCU is writing with the data ready pin high
frequency_storage_buffer <= signal_accq_data_in; -- store the frequency value that it's sending
end if;
if(signal_accq_reset = '0') then -- reset as before if reset is enabled
voltage_storage_pointer <= 0; -- reset everything to 0
current_storage_pointer <= 0; -- reset everything to 0
end if;
if(signal_buffer_memory_en_port0 = '0' and signal_buffer_write_en_port0 = '1' and signal_accq_data_ready = '1' and signal_accq_reset = '1') then -- memory enabled and write enable high...
case signal_buffer_address_high_port0 is
when "000" => signal_buffer_data_port0 <= mem_data_port0; -- output data to downstream MCUs as needed
when "001" => signal_buffer_data_port0 <= mem_data_port0;
when "010" => signal_buffer_data_port0 <= frequency_storage_buffer;
when "011" => signal_buffer_data_port0 <= std_logic_vector(to_unsigned(voltage_storage_pointer, 16));
when "100" => signal_buffer_data_port0 <= std_logic_vector(to_unsigned(current_storage_pointer, 16));
when "111" => signal_buffer_data_port0 <= "1010000001010110"; -- 0xA056
when others=>
end case;
end if;
if(signal_buffer_output_en_port0 = '0') then
--buffer_data_port0(7 downto 0) <= signal_buffer_data_port0 (15 downto 8);
--buffer_data_port0(15 downto 8) <= signal_buffer_data_port0 (7 downto 0);
buffer_data_port0 <= signal_buffer_data_port0;
else
buffer_data_port0 <= "ZZZZZZZZZZZZZZZZ";
end if;
end if;
end process general_event;
output6 <= signal_buffer_high_byte_en_port0;
output7 <= signal_buffer_low_byte_en_port0;
end rtl;

I've tried your code on Modelsim. Although I can't comment on the correctness of mem_data_port0's behavior, it does get assigned values depending on the other relevant signals, so for the out direction it works.
If you're talking about the fact that you cannot assign values to it from outside, all I could think of is that you forgot to assign it the high impedance in input mode, but you do, so that's out.
An explanation could be that your entity is not the top level entity, which would render inout ports unusable (inout has no meaning inside the FPGA, only at the design's top level).

Multiple read of register in VHDL and encapsulation leads to wrong value

I currently confront one problem with reading two registers and send their value via proxy to tile on FPGA. There are three input channels for encoded signals which consis of pulses with frequency of 50khz, then the signal were sampled by a local 100Mhz clock on FPGA board. In the module two channels were counted for their pulses and I use bit shift for encapsulation to put the read value from two counter into one 32bits std_bit_vector to send via proxy.
If I only transfer counting value of one channel and sent them via proxy signals, it behaviors always correctly. However when reading two counting value from two register and doing the encapsulation, the 16 higher bits of the shifted value is always increasing faster than it supposed to be. In another word, counting process ifself is correct while there is also no problems with communication between counting module and tile.
I dont know the problem is caused by the process reading too fast from two registers and leads to synchronization problem. One channel input signal only increases by one per 500 counts, while another counting 500 per rotation.One rotation is controlled by hand on the encoder (THen u can imagine the low frequency of the input signals)
THen encapsulation were put in last two lines of the module.
THanks in advance.
THe module of counting is design as following:
library rdt_dtl_proxy_targetlib;
use rdt_dtl_proxy_targetlib.rdt_dtl_proxy_target_cmp_pkg.all;
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_ARITH.all;
use ieee.std_logic_unsigned.all;
use ieee.numeric_std.all;
entity dtl_pmod_rotary is
generic (
WIDTH_DTL_CMD_ADDR : natural:= 32;
WIDTH_DTL_CMD_BLOCK_SIZE : natural:= 5;
WIDTH_DTL_DATA : natural:= 32;
WIDTH_DTL_WR_MASK : natural:= 4
);
port (
clk : in std_logic;
rst_n : in std_logic;
rotary_a : in std_logic;
rotary_b : in std_logic;
rotary_i : in std_logic;
dtl_cmd_valid_t_proxy0 : in std_logic;
dtl_cmd_accept_t_proxy0 : out std_logic;
dtl_cmd_addr_t_proxy0 : in std_logic_vector(WIDTH_DTL_CMD_ADDR - 1 downto 0);
dtl_cmd_read_t_proxy0 : in std_logic;
dtl_cmd_block_size_t_proxy0 : in std_logic_vector(WIDTH_DTL_CMD_BLOCK_SIZE - 1 downto 0);
dtl_wr_valid_t_proxy0 : in std_logic;
dtl_wr_last_t_proxy0 : in std_logic;
dtl_wr_accept_t_proxy0 : out std_logic;
dtl_wr_data_t_proxy0 : in std_logic_vector(WIDTH_DTL_DATA - 1 downto 0);
dtl_wr_mask_t_proxy0 : in std_logic_vector(WIDTH_DTL_WR_MASK - 1 downto 0);
dtl_rd_last_t_proxy0 : out std_logic;
dtl_rd_valid_t_proxy0 : out std_logic;
dtl_rd_accept_t_proxy0 : in std_logic;
dtl_rd_data_t_proxy0 : out std_logic_vector(WIDTH_DTL_DATA - 1 downto 0)
);
end dtl_pmod_rotary;
architecture rtl of dtl_pmod_rotary is
signal dtl_rd_data_15 : std_logic_vector(WIDTH_DTL_DATA - 1 downto 0);
signal dtl_rd_accept_14 : std_logic;
signal dtl_cmd_block_size_6 : std_logic_vector(WIDTH_DTL_CMD_BLOCK_SIZE - 1 downto 0);
signal dtl_wr_data_10 : std_logic_vector(WIDTH_DTL_DATA - 1 downto 0);
signal dtl_cmd_read_5 : std_logic;
signal dtl_rd_last_12 : std_logic;
signal dtl_rst_n_1 : std_logic;
signal dtl_wr_valid_7 : std_logic;
signal dtl_wr_accept_9 : std_logic;
signal dtl_wr_last_8 : std_logic;
signal dtl_cmd_addr_4 : std_logic_vector(WIDTH_DTL_CMD_ADDR - 1 downto 0);
signal dtl_cmd_valid_2 : std_logic;
signal dtl_clk_0 : std_logic;
signal dtl_rd_valid_13 : std_logic;
signal dtl_wr_mask_11 : std_logic_vector(WIDTH_DTL_WR_MASK - 1 downto 0);
signal dtl_cmd_accept_3 : std_logic;
-- signals for counting
signal cnt_r : std_logic_vector(WIDTH_DTL_DATA-1 downto 0) := X"00000000";
signal cnt_r_a : std_logic_vector(WIDTH_DTL_DATA-1 downto 0) := X"00000000";
signal cnt_r_b : std_logic_vector(WIDTH_DTL_DATA-1 downto 0) := X"00000000";
signal cnt_r_i : std_logic_vector(WIDTH_DTL_DATA-1 downto 0) := X"00000000";
signal cnt_nxt_a : std_logic_vector(WIDTH_DTL_DATA-1 downto 0) := X"00000000";
signal cnt_nxt_b : std_logic_vector(WIDTH_DTL_DATA-1 downto 0) := X"00000000";
signal cnt_nxt_i : std_logic_vector(WIDTH_DTL_DATA-1 downto 0) := X"00000000";
signal cnt_ref_i : std_logic_vector(WIDTH_DTL_DATA-1 downto 0) := X"00000000";
signal rotary_a_r : std_logic; --register storing data from rotary_a.
signal rotary_a_2r : std_logic; --register storing data from rotary_a_r
signal rotary_b_r : std_logic; --register storing data from rotary_b.
signal rotary_b_2r : std_logic; --register storing data from rotary_b_r
signal rotary_i_r : std_logic; --register storing data from rotary_i.
signal rotary_i_2r : std_logic; --register storing data from rotary_i_r
begin
dtl_clk_0 <= clk;
dtl_rst_n_1 <= rst_n;
dtl_cmd_valid_2 <= dtl_cmd_valid_t_proxy0;
dtl_cmd_accept_t_proxy0 <= dtl_cmd_accept_3;
dtl_cmd_addr_4 <= dtl_cmd_addr_t_proxy0;
dtl_cmd_read_5 <= dtl_cmd_read_t_proxy0;
dtl_cmd_block_size_6 <= dtl_cmd_block_size_t_proxy0;
dtl_wr_valid_7 <= dtl_wr_valid_t_proxy0;
dtl_wr_last_8 <= dtl_wr_last_t_proxy0;
dtl_wr_accept_t_proxy0 <= dtl_wr_accept_9;
dtl_wr_data_10 <= dtl_wr_data_t_proxy0;
dtl_wr_mask_11 <= dtl_wr_mask_t_proxy0;
dtl_rd_last_t_proxy0 <= dtl_rd_last_12;
dtl_rd_valid_t_proxy0 <= dtl_rd_valid_13;
dtl_rd_accept_14 <= dtl_rd_accept_t_proxy0;
dtl_rd_data_t_proxy0 <= dtl_rd_data_15;
-- Begin child instances
proxy0 : rdt_dtl_proxy_target
generic map (
DTL_DATA_WIDTH => WIDTH_DTL_DATA,
DTL_ADDR_WIDTH => WIDTH_DTL_CMD_ADDR,
DTL_BLK_SIZE_WIDTH => WIDTH_DTL_CMD_BLOCK_SIZE,
DTL_WR_MASK_WIDTH => WIDTH_DTL_WR_MASK
)
port map (
dtl_clk => dtl_clk_0,
dtl_rst_n => dtl_rst_n_1,
dtl_cmd_accept_t => dtl_cmd_accept_3,
dtl_cmd_addr_t => dtl_cmd_addr_4,
dtl_cmd_block_size_t => dtl_cmd_block_size_6,
dtl_cmd_read_t => dtl_cmd_read_5,
dtl_cmd_valid_t => dtl_cmd_valid_2,
dtl_rd_accept_t => dtl_rd_accept_14,
dtl_rd_data_t => dtl_rd_data_15,
dtl_rd_last_t => dtl_rd_last_12,
dtl_rd_valid_t => dtl_rd_valid_13,
dtl_wr_accept_t => dtl_wr_accept_9,
dtl_wr_data_t => dtl_wr_data_10,
dtl_wr_last_t => dtl_wr_last_8,
dtl_wr_mask_t => dtl_wr_mask_11,
dtl_wr_valid_t => dtl_wr_valid_7,
cmd_accept => '1', --target accept handshake,input port of Proxy
cmd_addr => open,
cmd_block_size => open,
cmd_read => open, --output port of Proxy
cmd_valid => open, --output port of Proxy
rd_accept => open, --output port of Proxy, return the accept info from initializer
--rd_data => x"00000003", --test value
rd_data => cnt_r, --input port of proxy, for sending the counter value
rd_last => '1',
rd_valid => '1',
wr_accept => '0', --return value to Proxy indicating prepared for receiving written data from initiator
wr_data => open, --receive data from Proxy that written by initializer
wr_last => open,
wr_mask => open,
wr_valid => open
);
-- Counting pulses
-- combinatorial process
comb_counter_process_a : process(rotary_a_r, rotary_a_2r)
begin
if rotary_a_r = '1' and rotary_a_2r = '0' then
cnt_nxt_a <= cnt_r_a + 1;
else
cnt_nxt_a <= cnt_r_a;
end if;
end process;
-- sequential process with synchronous reset
seq_counter_process_a : process(clk)
begin
if rising_edge(clk) then
if rst_n = '0' then
cnt_r_a <= (others => '0');
rotary_a_r <= '0';
rotary_a_2r <= '0';
else
-- registering
cnt_r_a <= cnt_nxt_a ;
rotary_a_2r <= rotary_a_r;
rotary_a_r <= rotary_a;
--cnt_r <= cnt_r_a(8 downto 0);
end if;
end if;
end process;
comb_counter_process_b : process(rotary_b_r, rotary_b_2r)
begin
if rotary_b_r = '1' and rotary_b_2r = '0' then
cnt_nxt_b <= cnt_r_b + 1;
else
cnt_nxt_b <= cnt_r_b;
end if;
end process;
-- sequential process with synchronous reset
seq_counter_process_b : process(clk)
begin
if rising_edge(clk) then
if rst_n = '0' then
cnt_r_b <= (others => '0');
rotary_b_r <= '0';
rotary_b_2r <= '0';
else
-- registering
cnt_r_b <= cnt_nxt_b;
rotary_b_2r <= rotary_b_r;
rotary_b_r <= rotary_b;
--cnt_r(31 downto 16) <= X"ABCD";
end if;
end if;
end process;
comb_counter_process_i : process(rotary_i_r, rotary_i_2r)
begin
if rotary_i_r = '1' and rotary_i_2r = '0' then
cnt_nxt_i <= cnt_r_i + 1;
else
cnt_nxt_i <= cnt_r_i;
end if;
end process;
-- sequential process with synchronous reset
seq_counter_process_i : process(clk)
begin
if rising_edge(clk) then
if rst_n = '0' then
cnt_r_i <= (others => '0');
rotary_i_r <= '0';
rotary_i_2r <= '0';
else
-- registering
cnt_r_i <= cnt_nxt_i;
rotary_i_2r <= rotary_i_r;
rotary_i_r <= rotary_i;
--cnt_r(15 downto 0) <= X"1234"; --the digits for cnt_r_i may change
end if;
end if;
end process;
cnt_r(31 downto 16) <= cnt_r_b(15 downto 0);
cnt_r(15 downto 0) <= cnt_r_i(15 downto 0);
end rtl;

nested generate statements for 32 x 8 register VHDL

My circuit has a grid of 32 x 8 D flip flops. each row should be producing a 32 bit vectors that contain the Q values from the D-ff's - which are then sent to a 8x1 MUX. The following code is me trying to properly generate the 32 x 8 D flip flops and test if I can get a vector out of them (the 32 bit I0 vector).
the circuit I'm trying to write the implementation for can be seen in the figure posted in this question:
test bench of a 32x8 register file VHDL
library ieee;
use ieee.std_logic_1164.all;
entity REG is
port (
REG_WRT : in std_logic;
WRT_REG_NUM : in std_logic_vector(2 downto 0);
WRT_DATA : in std_logic_vector(31 downto 0);
READ_REG_A : in std_logic_vector(2 downto 0);
READ_REG_B : in std_logic_vector(2 downto 0);
PORT_A : out std_logic_vector(31 downto 0);
PORT_B : out std_logic_vector(31 downto 0)
);
end REG;
architecture BEHV_32x8_REG of REG is
-- decoder component
component DCDR
port (
I_in : in std_logic_vector(2 downto 0);
O_out : out std_logic_vector(7 downto 0)
);
end component;
-- D flip flop component
component D_FF
port (
D_in : in std_logic;
CLK : in std_logic;
Q_out : out std_logic;
QN_out : out std_logic -- Q not
);
end component;
-- MUX copmonent
component MUX
port (
S_in : in std_logic_vector(2 downto 0);
I7, I6, I5, I4, I3, I2, I1, I0 : in std_logic_vector(31 downto 0);
O_out : out std_logic_vector(31 downto 0)
);
end component;
-- internal signals used
signal I_in : std_logic_vector(2 downto 0);
signal O_out : std_logic_vector(7 downto 0);
signal CLK_vals : std_logic_vector(7 downto 0);
signal MUXA_O_out : std_logic_vector(31 downto 0);
signal MUXB_O_out : std_logic_vector(31 downto 0);
-- two arrays of eight 32 bit vectors - the Q and QN outputs of all D_FFs
type reg_array is array (0 to 7) of std_logic_vector(31 downto 0);
signal Q, QN: reg_array;
begin
-- decoder instance
DCDR1 : DCDR port map(I_in, O_out);
GEN_D_FF:
for ROW in 0 to 7 generate
begin
GEN_D_FF0:
for COL in 0 to 31 generate
begin
DFF_X : D_FF port map(WRT_DATA(COL), CLK_vals(ROW), Q(ROW)(COL), QN(ROW)(COL));
end generate GEN_D_FF0;
end generate GEN_D_FF;
DCDR_AND : process(O_out, REG_WRT)
begin
I_in <= WRT_REG_NUM;
for I in 0 to 7 loop
CLK_vals(I) <= O_out(I) and not REG_WRT;
end loop;
end process DCDR_AND;
-- MUX instances
MUX_A : MUX port map(READ_REG_A, Q(7), Q(6), Q(5), Q(4), Q(3), Q(2), Q(1), Q(0), MUXA_O_out);
MUX_B : MUX port map(READ_REG_B, Q(7), Q(6), Q(5), Q(4), Q(3), Q(2), Q(1), Q(0), MUXB_O_out);
process(MUXA_O_out, MUXB_O_out)
begin
PORT_A <= MUXA_O_out;
PORT_B <= MUXB_O_out;
end process;
end BEHV_32x8_REG;
When I simulate the above code in ModelSim, I don't get any output for I0. Where is my design flawed? Am I violating any VHDL best practices? Assuming I can get this to function properly, how could I get 8 different 32-bit vectors (from each row of flip flops) to send to the MUX(s)?
I appreciate any advice I receive!
EDIT: I've updated the code to reflect advice given in the answers

You have 8 ROWs of 32 bit COLs connected to I0. Without a reset input to the D_FFs at best you'd have to write to all 8 rows to get 'X's instead of 'U's.
Your MUX isn't instantiated for either read port. If you were to implement the array value:
type reg_array is array (0 to 7) of std_logic_vector(31 downto 0);
signal Q,QN: reg_array;
These would replace I0 and Q_out.
From the referenced answer (you apparently just marked as useful - thanks) you could replace I0(COL) and QN_out(COL) in the D_FF instantiation in the inner generate statement with Q(ROW)(COL) and QN(ROW)(COL).
Note If you're not using the Q NOT outputs of the D_FFs you can either not provide them as ports or not connect them (open). You could also use Q outputs for one read MUX and QN outputs for the other read MUX, inverting the output of that MUX. With only two ports you aren't reducing the load significantly, you could just use Q.
For a MUX using signal Q defined as the reg_array above the MUX inputs would be Q(0) through Q(7) and the output would be either PORT_A or PORT_B. S_in would be connected to READ_REG_A orREAD_REG_B respectively.
One thing that's not apparent from reading your VHDL design description is why there is wait for 10 ns in your process DCDR_AND? It delays the write past the low baud of CLK (the low portion of clock). In a zero timed model you could simply used not CLK instead of CLK (CLK_vals(I) <= O_out(I) and not CLK, delete the wait for 10 ns; line). For a timed model resulting from synthesis, wait can't be synthesized. If you were to intend to synthesize, CLK can be used if you can count on input holdover should WR_DATA be clock synchronous.
And then your model has discretely instantiated D_FFs and uses MUXes for read ports.
I resisted the urge to modify your code and show it on the off chance you're doing the same class exercise. If any of this is unclear ask in a comment on this answer and I'll add to the answer, clearly mark any corrections to it or demonstrate code if necessary.
In response to the comment "...yet I still see UUU's for PORT_A and PORT_B"
Note that REG_WRT already shows as an inverted clock from the test bench from the previous effort, so I removed the preceding not in process DCDR_AND, otherwise using the previous effort's test bench unchanged other than matching your port names.
Also notice that the PORT_A and PORT_B outputs remain uninitialized until the address (READ_REG_A or READ_REG_B) is written to, which was the point of that particular test bench.
The idea behind writing to the flip flops (collectively 8 32 bit registers) in the middle of a clock cell (REG_WRT) was to avoid clock skew issues, in the case of the test bench caused by writing inputs based on delay values instead of on clock edges.
You could similarly have stimulus in a clocked process, which might require balancing clock delays to insure WRT_DATA and WRT_REG_NUM are valid at the right time. This is also cured by using not REG_WRT.
If you make REG_WRT in the test bench an upright clock instead of inverted you can leave the not in process DCDR_AND.
There's also concurrent signal assignment statements which incidentally can go in generate statements allowing the process DCDR_AND to folded into the first generate statement:
GEN_D_FF:
for ROW in 0 to 7 generate
begin
GEN_D_FF0:
for COL in 0 to 31 generate
begin
DFF_X:
D_FF
port map(
D_in => WRT_DATA(COL),
CLK => CLK_vals(ROW),
Q_out => Q(ROW)(COL),
QN_out => QN(ROW)(COL)
);
end generate;
DCDR_AND:
CLK_vals(ROW) <= O_out(ROW) and REG_WRT;
end generate;
-- DCDR_AND:
-- process (O_out, REG_WRT)
-- begin
--
-- I_in <= WRT_REG_NUM;
-- for I in 0 to 7 loop
-- CLK_vals(I) <= O_out(I) and REG_WRT;
-- end loop;
-- end process;
And could also be used on the PORT_A and PORT_B assignments instead of inside the process statement. You could also assign PORT_A and PORT_B as actuals to O_out in the two MUX instantiations, doing away with either a concurrent signal assignment or a process, for example:
MUX_A:
MUX
port map (
S_in => READ_REG_A,
I7 => Q(7),
I6 => Q(6),
I5 => Q(5),
I4 => Q(4),
I3 => Q(3),
I2 => Q(2),
I1 => Q(1),
I0 => Q(0),
O_out => Port_A
);
You could do this because you aren't using the read port data internally and the ports are mode out.
And while doing this I found that eliminating the assignment to I_in as above can cause all the 'U's on your read ports which can be cured similarly:
DCDR1:
DCDR
port map (
I_in => WRT_REG_NUM,
O_out => O_out
);
Allowing signal declarations for I_in, MUXA_O_out and MUXB_O_out to be eliminated:
-- internal signals used
-- signal I_in: std_logic_vector(2 downto 0);
signal O_out: std_logic_vector(7 downto 0);
signal CLK_vals: std_logic_vector(7 downto 0);
-- signal MUXA_O_out: std_logic_vector(31 downto 0);
-- signal MUXB_O_out: std_logic_vector(31 downto 0);
I didn't have a case of always having 'U's on the read ports except when I had accidentally eliminated the inclusion of WRT_REG_NUM in CLK_vals as noted above.
I didn't quite finish prettifying your code:
library ieee;
use ieee.std_logic_1164.all;
entity DCDR is
port (
I_in: in std_logic_vector (2 downto 0);
O_out: out std_logic_vector (7 downto 0)
);
end entity;
architecture foo of DCDR is
signal input: std_logic_vector (2 downto 0);
begin
input <= TO_X01Z(I_in);
O_out <= "00000001" when input = "000" else
"00000010" when input = "001" else
"00000100" when input = "010" else
"00001000" when input = "011" else
"00010000" when input = "100" else
"00100000" when input = "101" else
"01000000" when input = "110" else
"10000000" when input = "111" else
(others => 'X');
end architecture;
library ieee;
use ieee.std_logic_1164.all;
entity D_FF is
port (
D_in: in std_logic;
CLK: in std_logic;
Q_out: out std_logic;
QN_out: out std_logic
);
end entity;
architecture foo of D_FF is
signal Q: std_logic;
begin
FF:
process (CLK)
begin
if CLK'EVENT and CLK = '1' then
Q <= D_in;
end if;
end process;
Q_out <= Q;
QN_out <= not Q;
end architecture;
library ieee;
use ieee.std_logic_1164.all;
entity MUX is
port (
S_in: in std_logic_vector(2 downto 0);
I7, I6, I5, I4, I3, I2, I1, I0: in std_logic_vector(31 downto 0);
O_out: out std_logic_vector(31 downto 0)
);
end entity;
architecture foo of MUX is
begin
O_out <= I0 when S_in = "000" else
I1 when S_in = "001" else
I2 when S_in = "010" else
I3 when S_in = "011" else
I4 when S_in = "100" else
I5 when S_in = "101" else
I6 when S_in = "110" else
I7 when S_in = "111" else
(others => 'X');
end architecture;
library ieee;
use ieee.std_logic_1164.all;
entity REG is
port (
REG_WRT: in std_logic;
WRT_REG_NUM: in std_logic_vector(2 downto 0);
WRT_DATA: in std_logic_vector(31 downto 0);
READ_REG_A: in std_logic_vector(2 downto 0);
READ_REG_B: in std_logic_vector(2 downto 0);
PORT_A: out std_logic_vector(31 downto 0);
PORT_B: out std_logic_vector(31 downto 0)
);
end REG;
architecture BEHV_32x8_REG of REG is
-- decoder component
component DCDR
port (
I_in: in std_logic_vector(2 downto 0);
O_out: out std_logic_vector(7 downto 0)
);
end component;
-- D flip flop component
component D_FF
port (
D_in: in std_logic;
CLK: in std_logic;
Q_out: out std_logic;
QN_out: out std_logic -- Q not
);
end component;
-- MUX component
component MUX
port (
S_in: in std_logic_vector(2 downto 0);
I7, I6, I5, I4, I3, I2, I1, I0: in std_logic_vector(31 downto 0);
O_out: out std_logic_vector(31 downto 0)
);
end component;
-- internal signals used
-- signal I_in: std_logic_vector(2 downto 0);
signal O_out: std_logic_vector(7 downto 0);
signal CLK_vals: std_logic_vector(7 downto 0);
-- signal MUXA_O_out: std_logic_vector(31 downto 0);
-- signal MUXB_O_out: std_logic_vector(31 downto 0);
-- two arrays of eight 32 bit vectors - the Q and QN outputs of all D_FFs
type reg_array is array (0 to 7) of std_logic_vector(31 downto 0);
signal Q, QN: reg_array;
begin
-- decoder instance
DCDR1:
DCDR
port map (
I_in => WRT_REG_NUM,
O_out => O_out
);
GEN_D_FF:
for ROW in 0 to 7 generate
begin
GEN_D_FF0:
for COL in 0 to 31 generate
begin
DFF_X:
D_FF
port map(
D_in => WRT_DATA(COL),
CLK => CLK_vals(ROW),
Q_out => Q(ROW)(COL),
QN_out => QN(ROW)(COL)
);
end generate;
CLK_vals(ROW) <= O_out(ROW) and REG_WRT;
end generate;
-- DCDR_AND:
-- process (O_out, REG_WRT)
-- begin
--
-- I_in <= WRT_REG_NUM;
-- for I in 0 to 7 loop
-- CLK_vals(I) <= O_out(I) and REG_WRT;
-- end loop;
-- end process;
-- MUX instances
MUX_A:
MUX
port map (
S_in => READ_REG_A,
I7 => Q(7),
I6 => Q(6),
I5 => Q(5),
I4 => Q(4),
I3 => Q(3),
I2 => Q(2),
I1 => Q(1),
I0 => Q(0),
O_out => Port_A
);
MUX_B:
MUX
port map (
S_in => READ_REG_B,
I7 => Q(7),
I6 => Q(6),
I5 => Q(5),
I4 => Q(4),
I3 => Q(3),
I2 => Q(2),
I1 => Q(1),
I0 => Q(0),
O_out => Port_B
);
end architecture;
library ieee;
use ieee.std_logic_1164.all;
entity reg_tb is
end entity;
architecture fum of reg_tb is
component REG
port (
REG_WRT: in std_logic;
WRT_REG_NUM: in std_logic_vector (2 downto 0);
WRT_DATA: in std_logic_vector (31 downto 0);
READ_REG_A: in std_logic_vector (2 downto 0);
READ_REG_B: in std_logic_vector (2 downto 0);
PORT_A: out std_logic_vector (31 downto 0);
PORT_B: out std_logic_vector (31 downto 0)
);
end component;
signal REG_WRT: std_logic := '1';
signal WRT_REG_NUM: std_logic_vector (2 downto 0) := "000";
signal WRT_DATA: std_logic_vector (31 downto 0) := (others => '0');
signal READ_REG_A: std_logic_vector (2 downto 0) := "000";
signal READ_REG_B: std_logic_vector (2 downto 0) := "000";
signal PORT_A: std_logic_vector (31 downto 0);
signal PORT_B: std_logic_vector (31 downto 0);
begin
DUT:
REG
port map (
REG_WRT => REG_WRT,
WRT_REG_NUM => WRT_REG_NUM,
WRT_DATA => WRT_DATA,
READ_REG_A => READ_REG_A,
READ_REG_B => READ_REG_B,
PORT_A => PORT_A,
PORT_B => PORT_B
);
STIMULUS:
process
begin
wait for 20 ns;
REG_WRT <= '0';
wait for 20 ns;
REG_WRT <= '1';
wait for 20 ns;
WRT_DATA <= x"feedface";
WRT_REG_NUM <= "001";
REG_WRT <= '0';
wait for 20 ns;
REG_WRT <= '1';
READ_REG_A <= "001";
wait for 20 ns;
WRT_DATA <= x"deadbeef";
WRT_REG_NUM <= "010";
READ_REG_B <= "010";
REG_WRT <= '0';
wait for 20 ns;
REG_WRT <= '1';
wait for 20 ns;
wait for 20 ns;
wait;
end process;
end architecture;
But it runs and produces the waveform shown above. This was done with Tristan Gingold's ghdl (ghdl-0.31) on a Mac (OS X 10.9.2) using Tony Bybell's gtkwave. See Sourceforge pages for ghdl-updates and gtkwave.

Why Does This VHDL Work in Sumulation and Does not Work on the Virtex 5 Device

I have spent the whole day trying to solve the following problem. I am building a small averaging multichannel oscilloscope and I have the following module for storing the signal:
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;
use IEEE.numeric_std.all;
entity storage is
port
(
clk_in : in std_logic;
reset : in std_logic;
element_in : in std_logic;
data_in : in std_logic_vector(11 downto 0);
addr : in std_logic_vector(9 downto 0);
add : in std_logic; -- add = '1' means add to RAM
-- add = '0' means write to RAM
dump : in std_logic;
element_out : out std_logic;
data_out : out std_logic_vector(31 downto 0)
);
end storage;
architecture rtl of storage is
component bram is
port
(
clk : in std_logic;
we : in std_logic;
en : in std_logic;
addr : in std_logic_vector(9 downto 0);
di : in std_logic_vector(31 downto 0);
do : out std_logic_vector(31 downto 0)
);
end component bram;
type state is (st_startwait, st_add, st_write);
signal current_state : state := st_startwait;
signal next_state : state := st_startwait;
signal start : std_logic;
signal we : std_logic;
signal en : std_logic;
signal di : std_logic_vector(31 downto 0);
signal do : std_logic_vector(31 downto 0);
signal data : std_logic_vector(11 downto 0);
begin
ram : bram port map
(
clk => clk_in,
we => we,
en => en,
addr => addr,
di => di,
do => do
);
process(clk_in, reset, start)
begin
if rising_edge(clk_in) then
if (reset = '1') then
current_state <= st_startwait;
else
start <= '0';
current_state <= next_state;
if (element_in = '1') then
start <= '1';
end if;
end if;
end if;
end process;
process(current_state, start, dump)
variable acc : std_logic_vector(31 downto 0);
begin
element_out <= '0';
en <= '1';
we <= '0';
case current_state is
when st_startwait =>
if (start = '1') then
acc(11 downto 0) := data_in;
acc(31 downto 12) := (others => '0');
next_state <= st_add;
else
next_state <= st_startwait;
end if;
when st_add =>
if (add = '1') then
acc := acc + do;
end if;
we <= '1';
di <= acc;
next_state <= st_write;
when st_write =>
if (dump = '1') then
data_out <= acc;
element_out <= '1';
end if;
next_state <= st_startwait;
end case;
end process;
end rtl;
Below is the BRAM module as copied from the XST manual. This is a no-change type of BRAM and I believe there is the problem. The symptom is that, while this simulates fine, I read only zeroes from the memory when I use the design on the device.
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;
entity bram is
port
(
clk : in std_logic;
we : in std_logic;
en : in std_logic;
addr : in std_logic_vector(9 downto 0);
di : in std_logic_vector(31 downto 0);
do : out std_logic_vector(31 downto 0)
);
end bram;
architecture rtl of bram is
type ram_type is array (0 to 999) of std_logic_vector (31 downto 0);
signal buf : ram_type;
begin
process(clk, en, we)
begin
if rising_edge(clk) then
if en = '1' then
if we = '1' then
buf(conv_integer(addr)) <= di;
else
do <= buf(conv_integer(addr));
end if;
end if;
end if;
end process;
end rtl;
What follows is a description of the chip use and the expected output. "clk_in" is a 50 MHz clock. "element_in" is '1' for 20 ns and '0' for 60 ns. "addr_in" iterates from 0 to 999 and changes every 80 ns. "element_in", "data_in", and "addr" are all aligned and synchronous. Now "add" is '1' for 1000 elements, then both "add" and "dump" are zero for 8000 elements and, finally "dump" is '1' for 1000 elements. Now, if I have a test bench that supplies "data_in" from 0 to 999, I expect data_out to be 0, 10, 20, 30, ..., 9990 when "dump" is '1'. That is according to the simulation. In reality I get 0, 1, 2, 3, ..., 999....

Some initial issues to address are listed below.
The process(current_state, start, dump) in storage entity looks like it is
intended to implement a combinatorial element (gates), but the signal (port)
data_in is not in the sensitivity list.
This is very likely to cause a difference between simulation and synthesis
behavior, since simulation will typically only react to the signals in the
sensitivity list, where synthesis will implement the combinatorial design and
react on all used signals, but may give a warning about incomplete sensitivity
list or inferred latches. If you are using VHDL-2008 then use can use a
sensitivity list of (all) to have the process sensitivity to all used
signals, and otherwise you need to add missing signals manually.
The case current_state is in process(current_state, start, dump) lacks an
when others => ..., so the synthesis tool has probably given you a warning
about inferred latches. This should be fixed by adding the when others =>
with and assign all signals driven by the process to the relevant value.
The use clause lists:
use IEEE.std_logic_unsigned.all;
use IEEE.numeric_std.all;
But both of these should not be used at the same time, since they declare some
of the same identifiers, for example is unsigned declared in both. Since the
RAM uses std_logic_unsigned I suggest that you stick with that only, and
delete use of numeric_std. For new code I would though recommend use of
numeric_std.
Also the process(clk_in, reset, start) in storage entity implements a
sequential element (flip flop) sensitive to only rising edge of clk_in, so
the two last signals in sensitivity list ..., reset, start) are unnecessary,
but does not cause a problem.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

Multiply Accumulate Unit - vhdl

Related

VHDL Changing and holding the signal in if statement

Why won't a VHDL "inout" signal be assigned a value when used as an output

Multiple read of register in VHDL and encapsulation leads to wrong value

nested generate statements for 32 x 8 register VHDL

Why Does This VHDL Work in Sumulation and Does not Work on the Virtex 5 Device

Categories

Resources