test bench of a 32x8 register file VHDL - vhdl

I wrote the assembly code for this circuit in vhdl already. I want to simulate it with a test bench.
RegWrite: 1 bit input (clock)
Write Register Number: 3-bit input(write addresses)
Write Data: 32-bit input (data in) Read
Register Number A: 3-bit input (read addresses)
Read Register Number B: 3-bit input (read adddresses)
Port A: 32-bit output (data out)
Port B: 32-bit output (data out)
I think my problem is that I don't understand what this circuit does. I chose random values to assign to the inputs, but it didn't output anything. What are good inputs to choose for this circuit?
here is my test bench file for reference:
library ieee;
use ieee.std_logic_1164.all;
entity Reg_TB is -- entity declaration
end Reg_TB;
architecture TB of Reg_TB is
component RegisterFile_32x8
port ( RegWrite: in std_logic;
WriteRegNum: in std_logic_vector(2 downto 0);
WriteData: in std_logic_vector(31 downto 0);
ReadRegNumA: in std_logic_vector(2 downto 0);
ReadRegNumB: in std_logic_vector(2 downto 0);
PortA: out std_logic_vector(31 downto 0);
PortB: out std_logic_vector(31 downto 0)
);
end component;
signal T_RegWrite : std_logic;
signal T_WriteRegNum: std_logic_vector(2 downto 0);
signal T_WriteData: std_logic_vector(31 downto 0);
signal T_ReadRegNumA: std_logic_vector(2 downto 0);
signal T_ReadRegNumB: std_logic_vector(2 downto 0);
signal T_PortA : std_logic_vector(31 downto 0);
signal T_PortB : std_logic_vector(31 downto 0);
begin
T_WriteRegNum <= "011";
T_WriteData <= "00000000000000000000000000000001";
T_ReadRegNumA <= "001";
T_ReadRegNumB <= "100";
U_RegFile: RegisterFile_32x8 port map
(T_RegWrite, T_WriteRegNum, T_WriteData,T_ReadRegNumA, T_ReadRegNumB, T_PortA, T_PortB);
-- concurrent process to offer clock signal
process
begin
T_RegWrite <= '0';
wait for 5 ns;
T_RegWrite <= '1';
wait for 5 ns;
end process;
process
begin
wait for 12 ns;
-- case 2
wait for 28 ns;
-- case 3
wait for 2 ns;
-- case 4
wait for 10 ns;
-- case 5
wait for 20 ns;
wait;
end process;
end TB;
as you can see I chose
WriteRegNum = "011"
WriteData = "00000000000000000000000000000001"
ReadRegNumA = "001"
ReadRegNumB = "100"
I think that I chose bad inputs. The simulation does this:

In general reading an address before it is written doesn't produce any useful results.
Your block diagram shows a 32 bit wide 8 word deep register file with two read ports and one write port with RegWrite used as a clock gated by the decode of the write address. A stable WriteRegNum value and a rising edge on RegWrite effects a write to the address specified by WriteRegNum.
The two read ports appear completely independent. Specifying an address on the respective ReadRegNumA or ReadRegNumB should output the contents of that register to the respective output port.
To get something useful out, you have to write to that location first, otherwise it will be the default value ((others => 'U'),) suspiciously like your waveform.
Trying writing to a location before expecting valid read data from it. Use values that are distinguishable by register location. Theoretically you should be preserving set up and hold time on WriteRegNum with respect to the rising edge of RegWrite.
Example stimulus producing output:
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity registerfile_32x8 is
port (
RegWrite: in std_logic;
WriteRegNum: in std_logic_vector (2 downto 0);
WriteData: in std_logic_vector (31 downto 0);
ReadRegNumA: in std_logic_vector (2 downto 0);
ReadRegNumB: in std_logic_vector (2 downto 0);
PortA: out std_logic_vector (31 downto 0);
PortB: out std_logic_vector (31 downto 0)
);
end entity;
architecture fum of registerfile_32x8 is
type reg_array is array (0 to 7) of std_logic_vector(31 downto 0);
signal reg_file: reg_array;
begin
process(RegWrite)
begin
if rising_edge(RegWrite) then
reg_file(to_integer(unsigned(WriteRegNum))) <= WriteData;
end if;
end process;
PortA <= reg_file(to_integer(unsigned(ReadRegNumA)));
PortB <= reg_file(to_integer(unsigned(ReadRegNumB)));
end architecture;
library ieee;
use ieee.std_logic_1164.all;
entity reg_tb is
end entity;
architecture fum of reg_tb is
component registerfile_32x8
port (
RegWrite: in std_logic;
WriteRegNum: in std_logic_vector (2 downto 0);
WriteData: in std_logic_vector (31 downto 0);
ReadRegNumA: in std_logic_vector (2 downto 0);
ReadRegNumB: in std_logic_vector (2 downto 0);
PortA: out std_logic_vector (31 downto 0);
PortB: out std_logic_vector (31 downto 0)
);
end component;
signal RegWrite: std_logic := '1';
signal WriteRegNum: std_logic_vector (2 downto 0) := "000";
signal WriteData: std_logic_vector (31 downto 0) := (others => '0');
signal ReadRegNumA: std_logic_vector (2 downto 0) := "000";
signal ReadRegNumB: std_logic_vector (2 downto 0) := "000";
signal PortA: std_logic_vector (31 downto 0);
signal PortB: std_logic_vector (31 downto 0);
begin
DUT:
registerfile_32x8
port map (
RegWrite => RegWrite,
WriteRegNum => WriteRegNum,
WriteData => WriteData,
ReadRegNumA => ReadRegNumA,
ReadRegNumB => ReadRegNumB,
PortA => PortA,
PortB => PortB
);
STIMULUS:
process
begin
wait for 20 ns;
RegWrite <= '0';
wait for 20 ns;
RegWrite <= '1';
wait for 20 ns;
WriteData <= x"feedface";
WriteRegnum <= "001";
RegWrite <= '0';
wait for 20 ns;
RegWrite <= '1';
ReadRegNumA <= "001";
wait for 20 ns;
WriteData <= x"deadbeef";
WriteRegNum <= "010";
ReadRegNumB <= "010";
RegWrite <= '0';
wait for 20 ns;
RegWrite <= '1';
wait for 20 ns;
wait for 20 ns;
wait;
end process;
end architecture;
david_koontz#Macbook: ghdl -a regfile_32x8.vhdl
david_koontz#Macbook: ghdl -e reg_tb
david_koontz#Macbook: ghdl -r reg_tb --wave=reg_tb.ghw
david_koontz#Macbook: open reg_tb.gtkw
Essentially, the point is to have non 'U' values in a register file that's being read. If you notice the last write to WriteRegNum = "010", PortB shows undefined output until the write occurs.

Related

VHDL: muxes between 11 buses 8 bits wide output

i recieved this question as a pre interview question "Draw a diagram and write the VHDL code for a module that meets the following requirements:
a. Fully synchronous.
b. Muxes between 11 buses where each bus is 8-bits wide.
c. Has 2 cycles of latency.
d. Optimized for maximum clock frequency."
ive been trying to do it myself reading my old notes and assignments i have done in university but i just don't think i'm on the right track with this. i have the code soo far posted below:
library IEEE;
use IEEE.STD_LOGIC_1164.all;
entity Mux is
port(
A: in STD_LOGIC_vector(7 downto 0);
B: in STD_LOGIC_vector(7 downto 0);
C: in STD_LOGIC_vector(7 downto 0);
D: in STD_LOGIC_vector(7 downto 0);
E: in STD_LOGIC_vector(7 downto 0);
F: in STD_LOGIC_vector(7 downto 0);
G: in STD_LOGIC_vector(7 downto 0);
H: in STD_LOGIC_vector(7 downto 0);
I: in STD_LOGIC_vector(7 downto 0);
J: in STD_LOGIC_vector(7 downto 0);
K: in STD_LOGIC_vector(7 downto 0);
S0: in std_LOGIC_vector(3 downto 0);
Z: out STD_LOGIC_vector(7 downto 0)
);
end Mux;
architecture func of Mux is
begin
process (A,B,C,D,E,F,G,H,I,J,K,S0)
begin
if S0="0001" then
Z<= A;
elsif S0="0010" then
Z<= B;
elsif S0="0011" then
Z<= C;
elsif S0="0100" then
Z<= D;
elsif S0="0101" then
Z<= E;
elsif S0="0110" then
Z<= F;
elsif S0="0111" then
Z<= G;
elsif S0="1000" then
Z<= H;
elsif S0="1001" then
Z<= I;
elsif S0="1010" then
Z<= J;
elsif S0="1011" then
Z<= K;
else
Z<=A;
end if;
end process;
end func;
and this is the code i have for my second file:
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
use IEEE.std_logic_arith.all;
entity mux11test is
end entity mux11test;
architecture test of mux11test is
signal T_A: STD_LOGIC_vector(7 downto 0):="00000001";
signal T_B: STD_LOGIC_vector(7 downto 0):="00000010";
signal T_C: STD_LOGIC_vector(7 downto 0):="00000011";
signal T_D: STD_LOGIC_vector(7 downto 0):="00000100";
signal T_E: STD_LOGIC_vector(7 downto 0):="00000101";
signal T_F: STD_LOGIC_vector(7 downto 0):="00000110";
signal T_G: STD_LOGIC_vector(7 downto 0):="00000111";
signal T_H: STD_LOGIC_vector(7 downto 0):="00001000";
signal T_I: STD_LOGIC_vector(7 downto 0):="00001001";
signal T_J: STD_LOGIC_vector(7 downto 0):="00001010";
signal T_K: STD_LOGIC_vector(7 downto 0):="00001011";
signal T_S: STD_LOGIC_vector( 3 downto 0);
signal T_Z: STD_LOGIC_vector(7 downto 0);
component mux11 IS
port(
A: in STD_LOGIC_vector(7 downto 0);
B: in STD_LOGIC_vector(7 downto 0);
C: in STD_LOGIC_vector(7 downto 0);
D: in STD_LOGIC_vector(7 downto 0);
E: in STD_LOGIC_vector(7 downto 0);
F: in STD_LOGIC_vector(7 downto 0);
G: in STD_LOGIC_vector(7 downto 0);
H: in STD_LOGIC_vector(7 downto 0);
I: in STD_LOGIC_vector(7 downto 0);
J: in STD_LOGIC_vector(7 downto 0);
K: in STD_LOGIC_vector(7 downto 0);
S0: in std_LOGIC_vector(3 downto 0);
Z: out STD_LOGIC_vector(7 downto 0)
);
END COMPONENT ;
signal clk : std_LOGIC;
constant clk_period: time:=100ns;
begin
umux: Mux11 port map(T_A,T_B,T_C,T_D,T_E,T_F,T_G,T_H,T_I,T_J,T_K,T_S,T_Z);
clk_process:process
begin
clk<='0';
wait for clk_period/2;
clk <='1';
wait for clk_period/2;
end process;
PROCESS
begin
if T_S="0001" then
T_Z <= T_A ;
elsif T_S="0010" then
T_Z <= T_B ; wait for 100 ns;
elsif T_S="0011" then
T_Z <= T_C ; wait for 100 ns;
elsif T_S="0100" then
T_Z <= T_D ; wait for 100 ns;
elsif T_S="0101" then
T_Z <=T_E ; wait for 100 ns;
elsif T_S="0110" then
T_Z <= T_F ; wait for 100 ns;
elsif T_S="0111" then
T_Z <= T_G ; wait for 100 ns;
elsif T_S="1000" then
T_Z <= T_H ; wait for 100 ns;
elsif T_S="1001" then
T_Z <= T_I ; wait for 100 ns;
elsif T_S="1010" then
T_Z <= T_J ; wait for 100 ns;
elsif T_S="1011" then
T_Z <= T_K ; wait for 100 ns;
wait;
end if;
end PROCESS;
end architecture test;
is there anyone who could tell me if im on the right path and if this is fully synchronous and how would i start implementing or determining 2 cycles of latency?
I try to write a clear answer to help you.
First of all you need a clock in your design let's call it clk.
entity Mux is
port(
clk: in std_logic;
A: in STD_LOGIC_vector(7 downto 0);
B: in STD_LOGIC_vector(7 downto 0);
C: in STD_LOGIC_vector(7 downto 0);
D: in STD_LOGIC_vector(7 downto 0);
E: in STD_LOGIC_vector(7 downto 0);
F: in STD_LOGIC_vector(7 downto 0);
G: in STD_LOGIC_vector(7 downto 0);
H: in STD_LOGIC_vector(7 downto 0);
I: in STD_LOGIC_vector(7 downto 0);
J: in STD_LOGIC_vector(7 downto 0);
K: in STD_LOGIC_vector(7 downto 0);
S0: in std_LOGIC_vector(3 downto 0);
Z: out STD_LOGIC_vector(7 downto 0));
end Mux;
The idea when you use synchronous processes is to always update your values on an edge of your clock. Let's say the rising edge. Thus your processes have to be only sensitive to your input clk.
P : PROCESS (clk)
BEGIN
IF (rising_edge(clk)) THEN
...
END IF;
END PROCESS;
Concerning your multiplexer, your idea was good. Yet I would suggest to use a CASE statement because it is easier to read than IF ELSIF.
CASE S0 IS
WHEN "0001" => Z <= A;
WHEN "0010" => Z <= B;
...
WHEN "1011" => Z <= K;
END CASE;
EDIT : since I forgot to talk about the 2 cycles latency I'll say two words. There you need two intermediate signals (ie Z_i and Z_ii). Z_ii takes Z_i after one clock cycle and Z takes Z_ii after one clock cycle.
Z_ii <= Z_i;
Z <= Z_ii;
Of course you then need to drive Z_i (and not Z) in you process.

Can't get VHDL Sequential Multiplier to Multiply correctly

I have a School Lab that I must do pertaining to creating a sequential multiplier in VHDL. My issues is happening before making the finite state machine for the sequential multiplier. I can not get the base model to multiply correctly, I think I have a issue in my test bench but am not 100% sure of this. I still have doubt that the issue is in my code.
Top Design (basically calling the D-Flip-Flops, MUX and Adder)
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
--use ieee.std_logic_arith.all;
--use ieee.std_logic_unsigned.all;
entity toplvds is
port( A,B: in std_logic_vector(3 downto 0);
Zero: in std_logic_vector(3 downto 0);
clk, clr, load, loadP, sb: in std_logic;
Po: out std_logic_vector(7 downto 0));
end toplvds;
architecture Behavioral of toplvds is
component dffa
port( dina: in std_logic_vector(3 downto 0);
clr, clk, load: in std_logic;
q: out std_logic_vector(3 downto 0));
end component;
component dffb
port( dinb: in std_logic_vector(3 downto 0);
clr, clk, load, sb: in std_logic;
qb0: out std_logic);
end component;
component mux
port( d0,d1: in std_logic_vector(3 downto 0);
s: in std_logic;
y: out std_logic_vector(3 downto 0));
end component;
component adder
port( a,b: in std_logic_vector(3 downto 0);
cry: out std_logic;
r: out std_logic_vector(3 downto 0));
end component;
component dffP
port( dinp: in std_logic_vector(3 downto 0);
carry: in std_logic;
clr, clk, loadP, sb: in std_logic;
PHout: out std_logic_vector (3 downto 0);
P: out std_logic_vector(7 downto 0));
end component;
signal Wire1: std_logic_vector(3 downto 0);
signal Wire2: std_logic_vector(3 downto 0);
signal Wire3: std_logic;
signal Wire4: std_logic_vector(3 downto 0);
signal Wire5: std_logic_vector(3 downto 0);
signal Wire6: std_logic_vector(3 downto 0);
signal Wire7: std_logic;
begin
Wire1 <= Zero;
u1: dffa port map (dina=>A,clr=>clr,clk=>clk,load=>load,q=>Wire2);
u2: dffb port map (dinb=>B,clr=>clr,clk=>clk,load=>load,sb=>sb,qb0=>Wire3);
u3: mux port map (d0=>Wire2,d1=>Wire1,s=>Wire3,y=>Wire4);
u4: adder port map (a=>Wire6,b=>Wire4,cry=>Wire7,r=>Wire5);
u5: dffp port map (dinp=>Wire5,carry=>Wire7,clr=>clr,clk=>clk,loadP=>loadP,sb=>sb,PHout=>Wire6,P=>Po);
end Behavioral;
D-Flip-Flop for Multiplicand
library ieee;
use ieee.std_logic_1164.all;
entity dffa is
port( dina: in std_logic_vector(3 downto 0);
clr, clk, load: in std_logic;
q: out std_logic_vector(3 downto 0));
end dffa;
architecture beh of dffa is
begin
process(clk,clr)
begin
if(clr = '1') then
q <= ( others => '0');
elsif (rising_edge(clk)) then
if(load = '1') then
q <= dina;
end if;
end if;
end process;
end beh;
D-Flip-Flop for Multiplier
library ieee;
use ieee.std_logic_1164.all;
entity dffb is
port( dinb: in std_logic_vector(3 downto 0);
clr, clk, load, sb: in std_logic;
qb0: out std_logic);
end dffb;
architecture beh of dffb is
signal q: std_logic_vector(3 downto 0);
begin
qb0 <= q(0);
process(clk,clr, load, sb)
begin
if(clr = '1') then
q <= ( others => '0');
elsif (rising_edge(clk)) then
if(load = '1') then
q <= dinb;
elsif (sb = '1') then
q <= '0' & q ( 3 downto 1);
end if;
end if;
end process;
end beh;
MUX
library ieee;
use ieee.std_logic_1164.all;
entity mux is
port( d0,d1: in std_logic_vector(3 downto 0);
s: in std_logic;
y: out std_logic_vector(3 downto 0));
end mux;
architecture beh of mux is
begin
y <= d0 when s = '1' else d1;
end beh;
Adder
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;
use ieee.std_logic_arith.all;
entity adder is
port( a,b: in std_logic_vector(3 downto 0);
cry: out std_logic;
r: out std_logic_vector(3 downto 0));
end adder;
architecture beh of adder is
signal temp : std_logic_vector(4 downto 0);
begin
temp <= ('0' & a) + ('0' & b);
r <= temp(3 downto 0);
cry <= temp(4);
end beh;
D-Flip-Flop for Product
library ieee;
use ieee.std_logic_1164.all;
entity dffp is
port( dinp: in std_logic_vector(3 downto 0);
carry: in std_logic;
clr, clk, loadP, sb: in std_logic;
PHout: out std_logic_vector (3 downto 0);
P: out std_logic_vector(7 downto 0));
end dffp;
architecture beh of dffp is
signal q: std_logic_vector(7 downto 0);
begin
--qp0 <= q(0);
process(clk,clr, loadP, sb)
begin
if(clr = '1') then
q <= ( others => '0');
elsif (rising_edge(clk)) then
if(loadP = '1') then
--q <= "00000000";
q(7 downto 4) <= dinp;
elsif (sb = '1') then
q <= carry & q ( 7 downto 1);
--else
--q(7 downto 4) <= dinp;
end if;
end if;
end process;
PHout <= q(7 downto 4);
P <= q;
end beh;
TEST-BENCH Code
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
-- Uncomment the following library declaration if using
-- arithmetic functions with Signed or Unsigned values
--USE ieee.numeric_std.ALL;
ENTITY toplvds_tb IS
END toplvds_tb;
ARCHITECTURE behavior OF toplvds_tb IS
-- Component Declaration for the Unit Under Test (UUT)
COMPONENT toplvds
PORT(
A : IN std_logic_vector(3 downto 0);
B : IN std_logic_vector(3 downto 0);
Zero : IN std_logic_vector(3 downto 0);
clk : IN std_logic;
clr : IN std_logic;
load : IN std_logic;
loadP : IN std_logic;
sb : IN std_logic;
Po : OUT std_logic_vector(7 downto 0)
);
END COMPONENT;
--Inputs
signal A : std_logic_vector(3 downto 0) := (others => '0');
signal B : std_logic_vector(3 downto 0) := (others => '0');
signal Zero : std_logic_vector(3 downto 0) := (others => '0');
signal clk : std_logic := '0';
signal clr : std_logic := '0';
signal load : std_logic := '0';
signal loadP : std_logic := '0';
signal sb : std_logic := '0';
--Outputs
signal Po : std_logic_vector(7 downto 0);
-- Clock period definitions
constant clk_period : time := 10 ns;
BEGIN
-- Instantiate the Unit Under Test (UUT)
uut: toplvds PORT MAP (
A => A,
B => B,
Zero => Zero,
clk => clk,
clr => clr,
load => load,
loadP => loadP,
sb => sb,
Po => Po
);
-- Clock process definitions
clk_process :process
begin
clk <= '0';
wait for clk_period/2;
clk <= '1';
wait for clk_period/2;
end process;
-- Stimulus process
stim_proc: process
begin
A <= "1011";
B <= "1101";
Zero <="0000";
load <= '0';
sb <= '0';
clr <= '1';
wait for 12 ns;
clr <= '0'; load <= '1';
wait for 12 ns;
load <= '0'; sb <= '1';
wait for 12 ns;
sb <= '0'; loadP <= '1';
wait for 12 ns;
loadP <= '0'; sb <= '1';
wait for 12 ns;
sb <= '0'; loadP <= '1';
wait for 12 ns;
loadP <= '0'; sb <= '1';
wait for 12 ns;
sb <= '0'; loadP <= '1';
wait for 12 ns;
loadP <= '0'; sb <= '1';
wait for 12 ns;
sb <= '0'; loadP <= '1';
wait for 12 ns;
loadP <= '0'; sb <= '1';
wait for 20 ns;
loadP <= '0'; sb <= '0';
wait;
end process;
END;
Sorry that I have not commented the code for better understanding. I know this will be hard to follow but I hope someone will. I will also attach an image of the figure of the sequential multiplier I am following, the circuit design.
4 by 4 binary sequential multiplier circuit
4 by 4 binary sequential multiplier circuit - more
Well it was indeed something in the testbench that was giving issues. I worked it out in the lab with fellow classmates. Thank You for your help anyways it is much appreciated.
p.s. All I did was changed some timing values in the testbench at the very bottom to when the load and shift bit would happen and I got it to work.

Why is this Shift Register not loading properly in VHDL?

I have a custom designed shift register that has as input DL(leftmost input), DR(rightmost), CLR that clears and loads DR, S that shifts right and W that loads leftmost. After testing it, the rightmost is being loaded but not the left. I have reread the code multiple times, but I can't figure out what is wrong. Here's the code:
library IEEE;
use IEEE.std_logic_1164.all;
entity shiftregister is
port (
CLK, CLR: in STD_LOGIC;
S: in STD_LOGIC; --Shift right
W: in STD_LOGIC; --Write
Cin: in STD_LOGIC; --possible carry in from the addition
DL: in STD_LOGIC_VECTOR (7 downto 0); --left load for addition result
DR: in STD_LOGIC_VECTOR (7 downto 0); --right load for initial multiplier
Q: out STD_LOGIC_VECTOR (15 downto 0)
);
end shiftregister ;
architecture shiftregister of shiftregister is
signal IQ: std_logic_vector(15 downto 0):= (others => '0');
begin
process (CLK)
begin
if(CLK'event and CLK='1') then
if CLR = '1' then
IQ(7 downto 0) <= DR; --CLR clears and initializes the multiplier
IQ(15 downto 8) <= (others => '0');
else
if (S='1') then
IQ <= Cin & IQ(15 downto 1);
elsif (W='1') then
IQ(15 downto 8) <= DL;
end if;
end if;
end if;
end process;
Q<=IQ;
end shiftregister;
Waveform
TestBench
library IEEE;
use IEEE.std_logic_1164.all;
entity register_tb is
end register_tb;
architecture register_tb of register_tb is
component shiftregister is port (
CLK, CLR: in STD_LOGIC;
S: in STD_LOGIC; --Shift right
W: in STD_LOGIC; --Write
Cin: in STD_LOGIC; --possible carry in from the addition
DL: in STD_LOGIC_VECTOR (7 downto 0); --left load for addition result
DR: in STD_LOGIC_VECTOR (7 downto 0); --right load for initial multiplier
Q: out STD_LOGIC_VECTOR (15 downto 0)
);
end component;
signal CLK: std_logic:='0';
signal CLR: std_logic:='1';
signal Cin: std_logic:='0';
signal S: std_logic:='1';
signal W: std_logic:='0';
signal DL, DR: std_logic_vector(7 downto 0):="00000000";
signal Q: std_logic_vector(15 downto 0):="0000000000000000";
begin
U0: shiftregister port map (CLK, CLR, S, W, Cin, DL,DR,Q);
CLR <= not CLR after 20 ns;
CLK <= not CLK after 5 ns;
W <= not W after 10 ns;
DL <= "10101010" after 10 ns;
DR <= "00110011" after 10 ns;
end register_tb;
Your simulation shows that your S input is always high. The way you have your conditions setup, this means that the last elsif statement will not execute because S has priority over W. If you want your write to have priority over your shift operation, you should switch your conditions
if (W='1') then
IQ(15 downto 8) <= DL;
elsif (S='1') then
IQ <= Cin & IQ(15 downto 1);
end if;
Based on your comment for the desired behaviour, you could do something like this:
if (S='1' and W='1') then
IQ <= Cin & DL & IQ(7 downto 1);
elsif (W='1') then -- S=0
IQ(15 downto 8) <= DL;
elsif (S='1') then -- W=0
IQ <= Cin & IQ(15 downto 1);
end if; -- W=0 & S=0
Some improvements:
(1) Remove all signal but CLK from sensitivity list. Your process has no async signals, so only clock is needed in sensitivity list.
process(CLK)
(2) Assign zero only to the required bits -> question of taste ;)
IQ(7 downto 0) <= DR; --CLR clears and initializes the multiplier
IQ(15 downto 8) <= (others => '0');
(3) A elsif statement can clarify the assignment precedence:
if (S='1') then
IQ <= Cin & IQ(15 downto 1);
elsif (W='1') then
IQ(15 downto 8) <= DL;
end if;
(4) Line Q <= IQ; produces a second 16-bit register. I think this is not intended. Move this line outside of the process.

nested generate statements for 32 x 8 register VHDL

My circuit has a grid of 32 x 8 D flip flops. each row should be producing a 32 bit vectors that contain the Q values from the D-ff's - which are then sent to a 8x1 MUX. The following code is me trying to properly generate the 32 x 8 D flip flops and test if I can get a vector out of them (the 32 bit I0 vector).
the circuit I'm trying to write the implementation for can be seen in the figure posted in this question:
test bench of a 32x8 register file VHDL
library ieee;
use ieee.std_logic_1164.all;
entity REG is
port (
REG_WRT : in std_logic;
WRT_REG_NUM : in std_logic_vector(2 downto 0);
WRT_DATA : in std_logic_vector(31 downto 0);
READ_REG_A : in std_logic_vector(2 downto 0);
READ_REG_B : in std_logic_vector(2 downto 0);
PORT_A : out std_logic_vector(31 downto 0);
PORT_B : out std_logic_vector(31 downto 0)
);
end REG;
architecture BEHV_32x8_REG of REG is
-- decoder component
component DCDR
port (
I_in : in std_logic_vector(2 downto 0);
O_out : out std_logic_vector(7 downto 0)
);
end component;
-- D flip flop component
component D_FF
port (
D_in : in std_logic;
CLK : in std_logic;
Q_out : out std_logic;
QN_out : out std_logic -- Q not
);
end component;
-- MUX copmonent
component MUX
port (
S_in : in std_logic_vector(2 downto 0);
I7, I6, I5, I4, I3, I2, I1, I0 : in std_logic_vector(31 downto 0);
O_out : out std_logic_vector(31 downto 0)
);
end component;
-- internal signals used
signal I_in : std_logic_vector(2 downto 0);
signal O_out : std_logic_vector(7 downto 0);
signal CLK_vals : std_logic_vector(7 downto 0);
signal MUXA_O_out : std_logic_vector(31 downto 0);
signal MUXB_O_out : std_logic_vector(31 downto 0);
-- two arrays of eight 32 bit vectors - the Q and QN outputs of all D_FFs
type reg_array is array (0 to 7) of std_logic_vector(31 downto 0);
signal Q, QN: reg_array;
begin
-- decoder instance
DCDR1 : DCDR port map(I_in, O_out);
GEN_D_FF:
for ROW in 0 to 7 generate
begin
GEN_D_FF0:
for COL in 0 to 31 generate
begin
DFF_X : D_FF port map(WRT_DATA(COL), CLK_vals(ROW), Q(ROW)(COL), QN(ROW)(COL));
end generate GEN_D_FF0;
end generate GEN_D_FF;
DCDR_AND : process(O_out, REG_WRT)
begin
I_in <= WRT_REG_NUM;
for I in 0 to 7 loop
CLK_vals(I) <= O_out(I) and not REG_WRT;
end loop;
end process DCDR_AND;
-- MUX instances
MUX_A : MUX port map(READ_REG_A, Q(7), Q(6), Q(5), Q(4), Q(3), Q(2), Q(1), Q(0), MUXA_O_out);
MUX_B : MUX port map(READ_REG_B, Q(7), Q(6), Q(5), Q(4), Q(3), Q(2), Q(1), Q(0), MUXB_O_out);
process(MUXA_O_out, MUXB_O_out)
begin
PORT_A <= MUXA_O_out;
PORT_B <= MUXB_O_out;
end process;
end BEHV_32x8_REG;
When I simulate the above code in ModelSim, I don't get any output for I0. Where is my design flawed? Am I violating any VHDL best practices? Assuming I can get this to function properly, how could I get 8 different 32-bit vectors (from each row of flip flops) to send to the MUX(s)?
I appreciate any advice I receive!
EDIT: I've updated the code to reflect advice given in the answers
You have 8 ROWs of 32 bit COLs connected to I0. Without a reset input to the D_FFs at best you'd have to write to all 8 rows to get 'X's instead of 'U's.
Your MUX isn't instantiated for either read port. If you were to implement the array value:
type reg_array is array (0 to 7) of std_logic_vector(31 downto 0);
signal Q,QN: reg_array;
These would replace I0 and Q_out.
From the referenced answer (you apparently just marked as useful - thanks) you could replace I0(COL) and QN_out(COL) in the D_FF instantiation in the inner generate statement with Q(ROW)(COL) and QN(ROW)(COL).
Note If you're not using the Q NOT outputs of the D_FFs you can either not provide them as ports or not connect them (open). You could also use Q outputs for one read MUX and QN outputs for the other read MUX, inverting the output of that MUX. With only two ports you aren't reducing the load significantly, you could just use Q.
For a MUX using signal Q defined as the reg_array above the MUX inputs would be Q(0) through Q(7) and the output would be either PORT_A or PORT_B. S_in would be connected to READ_REG_A orREAD_REG_B respectively.
One thing that's not apparent from reading your VHDL design description is why there is wait for 10 ns in your process DCDR_AND? It delays the write past the low baud of CLK (the low portion of clock). In a zero timed model you could simply used not CLK instead of CLK (CLK_vals(I) <= O_out(I) and not CLK, delete the wait for 10 ns; line). For a timed model resulting from synthesis, wait can't be synthesized. If you were to intend to synthesize, CLK can be used if you can count on input holdover should WR_DATA be clock synchronous.
And then your model has discretely instantiated D_FFs and uses MUXes for read ports.
I resisted the urge to modify your code and show it on the off chance you're doing the same class exercise. If any of this is unclear ask in a comment on this answer and I'll add to the answer, clearly mark any corrections to it or demonstrate code if necessary.
In response to the comment "...yet I still see UUU's for PORT_A and PORT_B"
Note that REG_WRT already shows as an inverted clock from the test bench from the previous effort, so I removed the preceding not in process DCDR_AND, otherwise using the previous effort's test bench unchanged other than matching your port names.
Also notice that the PORT_A and PORT_B outputs remain uninitialized until the address (READ_REG_A or READ_REG_B) is written to, which was the point of that particular test bench.
The idea behind writing to the flip flops (collectively 8 32 bit registers) in the middle of a clock cell (REG_WRT) was to avoid clock skew issues, in the case of the test bench caused by writing inputs based on delay values instead of on clock edges.
You could similarly have stimulus in a clocked process, which might require balancing clock delays to insure WRT_DATA and WRT_REG_NUM are valid at the right time. This is also cured by using not REG_WRT.
If you make REG_WRT in the test bench an upright clock instead of inverted you can leave the not in process DCDR_AND.
There's also concurrent signal assignment statements which incidentally can go in generate statements allowing the process DCDR_AND to folded into the first generate statement:
GEN_D_FF:
for ROW in 0 to 7 generate
begin
GEN_D_FF0:
for COL in 0 to 31 generate
begin
DFF_X:
D_FF
port map(
D_in => WRT_DATA(COL),
CLK => CLK_vals(ROW),
Q_out => Q(ROW)(COL),
QN_out => QN(ROW)(COL)
);
end generate;
DCDR_AND:
CLK_vals(ROW) <= O_out(ROW) and REG_WRT;
end generate;
-- DCDR_AND:
-- process (O_out, REG_WRT)
-- begin
--
-- I_in <= WRT_REG_NUM;
-- for I in 0 to 7 loop
-- CLK_vals(I) <= O_out(I) and REG_WRT;
-- end loop;
-- end process;
And could also be used on the PORT_A and PORT_B assignments instead of inside the process statement. You could also assign PORT_A and PORT_B as actuals to O_out in the two MUX instantiations, doing away with either a concurrent signal assignment or a process, for example:
MUX_A:
MUX
port map (
S_in => READ_REG_A,
I7 => Q(7),
I6 => Q(6),
I5 => Q(5),
I4 => Q(4),
I3 => Q(3),
I2 => Q(2),
I1 => Q(1),
I0 => Q(0),
O_out => Port_A
);
You could do this because you aren't using the read port data internally and the ports are mode out.
And while doing this I found that eliminating the assignment to I_in as above can cause all the 'U's on your read ports which can be cured similarly:
DCDR1:
DCDR
port map (
I_in => WRT_REG_NUM,
O_out => O_out
);
Allowing signal declarations for I_in, MUXA_O_out and MUXB_O_out to be eliminated:
-- internal signals used
-- signal I_in: std_logic_vector(2 downto 0);
signal O_out: std_logic_vector(7 downto 0);
signal CLK_vals: std_logic_vector(7 downto 0);
-- signal MUXA_O_out: std_logic_vector(31 downto 0);
-- signal MUXB_O_out: std_logic_vector(31 downto 0);
I didn't have a case of always having 'U's on the read ports except when I had accidentally eliminated the inclusion of WRT_REG_NUM in CLK_vals as noted above.
I didn't quite finish prettifying your code:
library ieee;
use ieee.std_logic_1164.all;
entity DCDR is
port (
I_in: in std_logic_vector (2 downto 0);
O_out: out std_logic_vector (7 downto 0)
);
end entity;
architecture foo of DCDR is
signal input: std_logic_vector (2 downto 0);
begin
input <= TO_X01Z(I_in);
O_out <= "00000001" when input = "000" else
"00000010" when input = "001" else
"00000100" when input = "010" else
"00001000" when input = "011" else
"00010000" when input = "100" else
"00100000" when input = "101" else
"01000000" when input = "110" else
"10000000" when input = "111" else
(others => 'X');
end architecture;
library ieee;
use ieee.std_logic_1164.all;
entity D_FF is
port (
D_in: in std_logic;
CLK: in std_logic;
Q_out: out std_logic;
QN_out: out std_logic
);
end entity;
architecture foo of D_FF is
signal Q: std_logic;
begin
FF:
process (CLK)
begin
if CLK'EVENT and CLK = '1' then
Q <= D_in;
end if;
end process;
Q_out <= Q;
QN_out <= not Q;
end architecture;
library ieee;
use ieee.std_logic_1164.all;
entity MUX is
port (
S_in: in std_logic_vector(2 downto 0);
I7, I6, I5, I4, I3, I2, I1, I0: in std_logic_vector(31 downto 0);
O_out: out std_logic_vector(31 downto 0)
);
end entity;
architecture foo of MUX is
begin
O_out <= I0 when S_in = "000" else
I1 when S_in = "001" else
I2 when S_in = "010" else
I3 when S_in = "011" else
I4 when S_in = "100" else
I5 when S_in = "101" else
I6 when S_in = "110" else
I7 when S_in = "111" else
(others => 'X');
end architecture;
library ieee;
use ieee.std_logic_1164.all;
entity REG is
port (
REG_WRT: in std_logic;
WRT_REG_NUM: in std_logic_vector(2 downto 0);
WRT_DATA: in std_logic_vector(31 downto 0);
READ_REG_A: in std_logic_vector(2 downto 0);
READ_REG_B: in std_logic_vector(2 downto 0);
PORT_A: out std_logic_vector(31 downto 0);
PORT_B: out std_logic_vector(31 downto 0)
);
end REG;
architecture BEHV_32x8_REG of REG is
-- decoder component
component DCDR
port (
I_in: in std_logic_vector(2 downto 0);
O_out: out std_logic_vector(7 downto 0)
);
end component;
-- D flip flop component
component D_FF
port (
D_in: in std_logic;
CLK: in std_logic;
Q_out: out std_logic;
QN_out: out std_logic -- Q not
);
end component;
-- MUX component
component MUX
port (
S_in: in std_logic_vector(2 downto 0);
I7, I6, I5, I4, I3, I2, I1, I0: in std_logic_vector(31 downto 0);
O_out: out std_logic_vector(31 downto 0)
);
end component;
-- internal signals used
-- signal I_in: std_logic_vector(2 downto 0);
signal O_out: std_logic_vector(7 downto 0);
signal CLK_vals: std_logic_vector(7 downto 0);
-- signal MUXA_O_out: std_logic_vector(31 downto 0);
-- signal MUXB_O_out: std_logic_vector(31 downto 0);
-- two arrays of eight 32 bit vectors - the Q and QN outputs of all D_FFs
type reg_array is array (0 to 7) of std_logic_vector(31 downto 0);
signal Q, QN: reg_array;
begin
-- decoder instance
DCDR1:
DCDR
port map (
I_in => WRT_REG_NUM,
O_out => O_out
);
GEN_D_FF:
for ROW in 0 to 7 generate
begin
GEN_D_FF0:
for COL in 0 to 31 generate
begin
DFF_X:
D_FF
port map(
D_in => WRT_DATA(COL),
CLK => CLK_vals(ROW),
Q_out => Q(ROW)(COL),
QN_out => QN(ROW)(COL)
);
end generate;
CLK_vals(ROW) <= O_out(ROW) and REG_WRT;
end generate;
-- DCDR_AND:
-- process (O_out, REG_WRT)
-- begin
--
-- I_in <= WRT_REG_NUM;
-- for I in 0 to 7 loop
-- CLK_vals(I) <= O_out(I) and REG_WRT;
-- end loop;
-- end process;
-- MUX instances
MUX_A:
MUX
port map (
S_in => READ_REG_A,
I7 => Q(7),
I6 => Q(6),
I5 => Q(5),
I4 => Q(4),
I3 => Q(3),
I2 => Q(2),
I1 => Q(1),
I0 => Q(0),
O_out => Port_A
);
MUX_B:
MUX
port map (
S_in => READ_REG_B,
I7 => Q(7),
I6 => Q(6),
I5 => Q(5),
I4 => Q(4),
I3 => Q(3),
I2 => Q(2),
I1 => Q(1),
I0 => Q(0),
O_out => Port_B
);
end architecture;
library ieee;
use ieee.std_logic_1164.all;
entity reg_tb is
end entity;
architecture fum of reg_tb is
component REG
port (
REG_WRT: in std_logic;
WRT_REG_NUM: in std_logic_vector (2 downto 0);
WRT_DATA: in std_logic_vector (31 downto 0);
READ_REG_A: in std_logic_vector (2 downto 0);
READ_REG_B: in std_logic_vector (2 downto 0);
PORT_A: out std_logic_vector (31 downto 0);
PORT_B: out std_logic_vector (31 downto 0)
);
end component;
signal REG_WRT: std_logic := '1';
signal WRT_REG_NUM: std_logic_vector (2 downto 0) := "000";
signal WRT_DATA: std_logic_vector (31 downto 0) := (others => '0');
signal READ_REG_A: std_logic_vector (2 downto 0) := "000";
signal READ_REG_B: std_logic_vector (2 downto 0) := "000";
signal PORT_A: std_logic_vector (31 downto 0);
signal PORT_B: std_logic_vector (31 downto 0);
begin
DUT:
REG
port map (
REG_WRT => REG_WRT,
WRT_REG_NUM => WRT_REG_NUM,
WRT_DATA => WRT_DATA,
READ_REG_A => READ_REG_A,
READ_REG_B => READ_REG_B,
PORT_A => PORT_A,
PORT_B => PORT_B
);
STIMULUS:
process
begin
wait for 20 ns;
REG_WRT <= '0';
wait for 20 ns;
REG_WRT <= '1';
wait for 20 ns;
WRT_DATA <= x"feedface";
WRT_REG_NUM <= "001";
REG_WRT <= '0';
wait for 20 ns;
REG_WRT <= '1';
READ_REG_A <= "001";
wait for 20 ns;
WRT_DATA <= x"deadbeef";
WRT_REG_NUM <= "010";
READ_REG_B <= "010";
REG_WRT <= '0';
wait for 20 ns;
REG_WRT <= '1';
wait for 20 ns;
wait for 20 ns;
wait;
end process;
end architecture;
But it runs and produces the waveform shown above. This was done with Tristan Gingold's ghdl (ghdl-0.31) on a Mac (OS X 10.9.2) using Tony Bybell's gtkwave. See Sourceforge pages for ghdl-updates and gtkwave.

VHDL : Value not propagating to port map

I have the below VHDL file, where i am facing problem. The final sum is getting the value undefined always.
CL_Adder is the Carry lookahead adder and is check as individual component and is working fine. Regstr module is also working fine.
The problem is with the reslt, reslt_out1, reslt_out2 variables usage ..!
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use work.CS_Adder_Package.all;
-- Uncomment the following library declaration if using
-- arithmetic functions with Signed or Unsigned values
--use IEEE.NUMERIC_STD.ALL;
-- Uncomment the following library declaration if instantiating
-- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;
entity movingaverage is
Port ( sin : in STD_LOGIC_VECTOR (10 downto 0);
clk : in STD_LOGIC;
rst : in STD_LOGIC;
--reslt_in: in std_logic_vector(14 downto 0);
sout : out STD_LOGIC_VECTOR (10 downto 0)
--reslt_out: out std_logic_vector(14 downto 0)
);
end movingaverage;
architecture Structural of movingaverage is
component Regstr is
port ( d : in STD_LOGIC_VECTOR (10 downto 0);
clk : in STD_LOGIC;
rst : in STD_LOGIC;
q : out STD_LOGIC_VECTOR (10 downto 0));
end component;
component CL_Adder is
Port ( x : in STD_LOGIC_VECTOR (14 downto 0);
y : in STD_LOGIC_VECTOR (14 downto 0);
cin : in STD_LOGIC;
s : out STD_LOGIC_VECTOR (14 downto 0);
cout : out STD_LOGIC);
end component;
signal s: input_array;
signal s_se :std_logic_vector(14 downto 0):= (others =>'0');
signal s_se1 :std_logic_vector(14 downto 0):= (others =>'0');
signal s_se2 : std_logic_vector(14 downto 0):= (others =>'0');
signal reslt : std_logic_vector(14 downto 0):= (others =>'0');
signal reslt_out1: std_logic_vector(14 downto 0):= (others =>'0');
signal reslt_out2: std_logic_vector(14 downto 0):= (others =>'0');
signal c1,c2: std_logic;
begin
u0: for i in 15 downto 1 generate
u1:regstr port map(s(i-1)(10 downto 0),clk,rst,s(i)(10 downto 0));
end generate u0;
u7:regstr port map(sin,clk,rst,s(0)(10 downto 0));
s_se(14 downto 0) <= sin(10) & sin(10) & sin(10) & sin(10) & sin(10 downto 0);
reslt<= reslt_out2;
u8:CL_Adder port map(s_se,reslt,'0',reslt_out1,c1);
s_se1<= s(15)(10) & s(15)(10) & s(15)(10) & s(15)(10) & s(15)(10 downto 0);
s_se2 <= not(s_se1);
u9:CL_Adder port map(reslt_out1,s_se2,'1',reslt_out2,c2);
Sout <= reslt(14 downto 4); --divide by 16
end Structural;
Without more code I must add a little guessing, but could look like there is a
loop in the design in reslt => reslt_out1 => reslt_out2 => reslt, since
there is no clock (clk) on CL_Adder in the code:
reslt <= reslt_out2;
...
u8:CL_Adder port map(s_se, reslt, '0', reslt_out1, c1);
...
u9:CL_Adder port map(reslt_out1, s_se2, '1', reslt_out2, c2);
Whether this is the reason for the problem depends on how you see the
"undefined". In simulation the loop itself should not result in X (unknown),
or similar, but the loop hints a problem. Btw, you mention "variables usage",
but there are no variables in the shown code; only signals.
Addition:
If the purpose is to accumulate the value, then a sequential process (clocked process to make flip flops) may be used to capture the result of each iteration, and present as argument in next iteration. The reslt <= reslt_out2; may then be replaced with a process like:
process (clk, rst) is
begin
if rst = '1' then -- Reset if required
reslt <= (others => '0');
elsif rising_edge(clk) then -- Clock
reslt <= reslt_out2;
end if;
end process;

Resources