I've got a situation like the following:
library ieee;
use ieee.std_logic_1164.all;
entity clkin_to_clkout is
port (
clk_in : in std_logic;
clk_out : out std_logic);
end entity clkin_to_clkout;
architecture arch of clkin_to_clkout is
begin
clk_out <= clk_in;
end architecture arch;
The assignment of clk_in to clk_out isn't a problem for synthesis, but in a simulator it will induce a delta delay from clk_in to clk_out, thereby creating a clock crossing boundary. Is there any way to assign an entity output to an entity input without introducing a delta delay? Thanks.
Edit: Responses to some comments. First, I want this exact question answered, please. For clarification, I want the output port to behave exactly as if it were an alias of the input port. If the answer is, "In VHDL there is no possible way to make an output port an exact behavioral match of an input port", then that is the correct answer and I'll accept it as a limitation of the language. Second, if you don't see what the problem is, please instantiate the clkin_to_clkout entity in the following testbench and observe the difference between mr_sig_del_dly vs mr_sig_clk_dly when you simulate for a few clk1 cycles:
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity delta_delay is
end entity delta_delay;
architecture arch of delta_delay is
signal clk1: std_logic := '0';
signal clk2 : std_logic;
signal mr_sig : unsigned(7 downto 0) := (others => '0');
signal mr_sig_del_dly : unsigned(7 downto 0);
signal mr_sig_clk_dly : unsigned(7 downto 0);
component clkin_to_clkout is
port (
clk_in : in std_logic;
clk_out : out std_logic);
end component clkin_to_clkout;
begin
clk1 <= not clk1 after 10 ns;
clk_inst : clkin_to_clkout
port map (
clk_in => clk1,
clk_out => clk2);
mr_sig <= mr_sig + 1 when rising_edge(clk1);
mr_sig_del_dly <= mr_sig when rising_edge(clk2);
mr_sig_clk_dly <= mr_sig when rising_edge(clk1);
end architecture arch;
When you simulate, you will observe that mr_sig_clk_dly is delayed 1 clock cycle as expected because it is assigned on the same clock that mr_sig is on (clk1). mr_sig_del_dly is not delayed 1 clk1 cycle even though clk2 is just a passthrough of clk1 in the clkin_to_clkout module. This is because clk2 is a delta delayed version of clk1 because I used a signal assignment.
Again, thanks for all your responses.
In VHDL-2008 or before there is no possible way to make an output port an exact behavioral match of an input port.
Reference Jim Lewis's comment to the original question.
Thanks, Jim and to all who opined.
It seems you do not know what a delta delay is.
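A delta delay is an infinitesimally small delay: simulation time does not advance, but the update takes effect one simulation cycle later. Every signal assignment has (at least) a delta delay in simulation. That's just how VHDL works.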
edit:
After your comments, I see where you are coming from. The issue you are encountering is probably simulation-only, as synthesis will simplify it away. However, there is an electronic equivalent: multi-phase clocks. Suppose you want a 2-phase clock, i.e. a differential signal, where the second signal is the inverse of the first. If you realized these clocks using just one inverter, the second signal would have a phase offset because of the inverter's latency. Thus, in clock-generating logic (like a PLL or DCM) the non-inverted signal is also delayed (using a variable-latency buffer). In other words, all clock signals need to be processed, giving them the same (delta) delay.
The same solution can be applied in VHDL. Example:
library ieee;
use ieee.std_logic_1164.all;
entity clk_buffers is
port(
clk : in std_logic;
clk1 : out std_logic;
clk2_n : out std_logic
);
end entity;
architecture rtl of clk_buffers is begin
clk1 <= clk;
clk2_n <= not clk;
end architecture;
library ieee;
entity test_bench is end entity;
architecture behavioural of test_bench is
use ieee.std_logic_1164.all;
signal clk, clk1, clk2_n : std_logic := '1';
signal base, child1, child2 : integer := 0;
begin
clk <= not clk after 1 ns;
clk_buffers_inst : entity work.clk_buffers
port map(clk => clk, clk1 => clk1, clk2_n => clk2_n);
base <= base+1 when rising_edge(clk1);
child1 <= base when rising_edge(clk1);
child2 <= base when falling_edge(clk2_n);
end architecture;
I'm writing VHDL code to model an 8x1 multiplexer where each input is 32 bits wide. I created an array to model the MUX, but now I'm stuck on the test bench; it has gotten complicated. Here is my original file (I'm sure it has many redundancies). How can I make the test bench recognize my array (R_in) from the component's file, and how do I then stimulate it?
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;
ENTITY mux8_1 IS
PORT(Rs :IN STD_LOGIC_VECTOR(2 DOWNTO 0);
in0,in1,in2,in3,in4,in5,in6,in7 :IN STD_LOGIC_VECTOR(31 DOWNTO 0);
R_out :OUT STD_LOGIC_VECTOR(31 DOWNTO 0)
);
END mux8_1;
ARCHITECTURE behaviour OF mux8_1 IS
type t_array_mux is array (0 to 7) of STD_LOGIC_VECTOR(31 DOWNTO 0);
signal R_in:t_array_mux;
BEGIN
R_in(0) <= in0;
R_in(1) <= in1;
R_in(2) <= in2;
R_in(3) <= in3;
R_in(4) <= in4;
R_in(5) <= in5;
R_in(6) <= in6;
R_in(7) <= in7;
process(R_in, Rs)
BEGIN
CASE Rs IS
WHEN "000"=>R_out<=R_in(0);
WHEN "001"=>R_out<=R_in(1);
WHEN "010"=>R_out<=R_in(2);
WHEN "011"=>R_out<=R_in(3);
WHEN "100"=>R_out<=R_in(4);
WHEN "101"=>R_out<=R_in(5);
WHEN "110"=>R_out<=R_in(6);
WHEN "111"=>R_out<=R_in(7);
WHEN OTHERS=>R_out<= (others => '0');
END CASE;
END process;
END behaviour;
And here is my "in progress" test bench file. Please ignore the "stimulus process" part; I know it's wrong, I just couldn't figure out how to write it for a 32-bit signal.
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
use ieee.numeric_std.all;
ENTITY mux8_1_TB IS
END mux8_1_TB;
ARCHITECTURE behaviour OF mux8_1_TB IS
COMPONENT mux8_1
PORT(Rs :IN STD_LOGIC_VECTOR(2 DOWNTO 0);
in0,in1,in2,in3,in4,in5,in6,in7 :IN STD_LOGIC_VECTOR(31 DOWNTO 0);
R_out :OUT STD_LOGIC_VECTOR(31 DOWNTO 0)
);
END COMPONENT;
type t_array_mux is array (0 to 7) of STD_LOGIC_VECTOR(31 DOWNTO 0);
--Inputs
signal R_in:t_array_mux:=(others=>'0');
signal in0,in1,in2,in3,in4,in5,in6,in7 :STD_LOGIC_VECTOR(31 DOWNTO 0):=(others=>'0');
signal Rs :STD_LOGIC_VECTOR(2 DOWNTO 0):=(others=>'0');
--Outputs
signal R_out:STD_LOGIC_VECTOR(31 DOWNTO 0);
-- Instantiate the Unit Under Test + connect the ports to my signal
BEGIN
R_in(0) <= in0;
R_in(1) <= in1;
R_in(2) <= in2;
R_in(3) <= in3;
R_in(4) <= in4;
R_in(5) <= in5;
R_in(6) <= in6;
R_in(7) <= in7;
uut: mux8_1 PORT MAP(
Rs=>Rs,
R_in=>R_in,
R_out=>R_out
);
-- Stimulus process (where the values -> inputs are set)
PROCESS
begin
R_in<="01010101";
wait for 10 ns;
Rs<="001";
wait for 10 ns;
Rs<="010";
wait for 20 ns;
Rs<="011";
wait for 30 ns;
Rs<="100";
wait for 40 ns;
Rs<="101";
wait for 50 ns;
Rs<="110";
wait for 60 ns;
Rs<="111";
wait for 70 ns;
END PROCESS;
END;
You need to change your uut port map so instead of R_in, it has individual in0 - in7 ports to match your mux8_1 component definition. Then, map in0 - in7 testbench signals directly to these ports:
uut: mux8_1 port map(
...
in0 => in0,
in1 => in1,
...
);
Or if you want to keep the R_in signal, port map like this:
uut: mux8_1 port map(
...
in0 => R_in(0),
in1 => R_in(1),
...
);
This assignment to R_in in your testbench is incorrect:
R_in<="01010101";
R_in is defined as a t_array_mux type, so it can't be assigned a bit vector value. It has to be assigned to an array of 32-bit std_logic_vector. That line should really be removed altogether, as you're already making assignments to R_in in another location outside of the process. Multiple assignments will cause signal contention.
You're initializing R_in in your testbench like this:
signal R_in:t_array_mux:=(others=>'0');
The others keyword as you've used it will only work on an individual std_logic_vector. You need to nest others for your array of std_logic_vector:
signal R_in:t_array_mux:=(others=>(others=>'0'));
You'll want to assign values to your 32-bit in0 - in7 signals so you can see the output of your mux change in the sim. They can be assigned outside the stimulus process. You can assign them using hex-notation (x preceding "") or just binary:
in0 <= x"12345678"; --hex
or
in0 <= "00010010001101000101011001111000"; --binary
Your stimulus process looks fine. As you change Rs, you would expect to see the different input values on R_out. You could add a single wait; at the end of the process, or the process will keep repeating until the end of sim.
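Putting the points above together, a testbench along these lines might look like the following sketch. It is self-contained for illustration only: the compact mux8_1 is a stand-in with the same port list as in the question, the input values are arbitrary, and the loop, the assert and the name mux8_1_tb_sketch are additions that are not in the original code.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Compact stand-in for the mux8_1 from the question (same port list).
entity mux8_1 is
  port (
    Rs    : in  std_logic_vector(2 downto 0);
    in0, in1, in2, in3, in4, in5, in6, in7 : in std_logic_vector(31 downto 0);
    R_out : out std_logic_vector(31 downto 0));
end entity mux8_1;

architecture behaviour of mux8_1 is
begin
  with Rs select R_out <=
    in0 when "000",
    in1 when "001",
    in2 when "010",
    in3 when "011",
    in4 when "100",
    in5 when "101",
    in6 when "110",
    in7 when "111",
    (others => '0') when others;
end architecture behaviour;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical testbench name; not taken from the question.
entity mux8_1_tb_sketch is
end entity mux8_1_tb_sketch;

architecture sim of mux8_1_tb_sketch is
  type t_array_mux is array (0 to 7) of std_logic_vector(31 downto 0);
  -- Arbitrary, distinct test values so each selection is recognisable.
  signal R_in : t_array_mux := (
    x"00000000", x"11111111", x"22222222", x"33333333",
    x"44444444", x"55555555", x"66666666", x"77777777");
  signal Rs    : std_logic_vector(2 downto 0) := (others => '0');
  signal R_out : std_logic_vector(31 downto 0);
begin
  -- Each array element is mapped to one of the individual input ports.
  uut : entity work.mux8_1
    port map (
      Rs  => Rs,
      in0 => R_in(0), in1 => R_in(1), in2 => R_in(2), in3 => R_in(3),
      in4 => R_in(4), in5 => R_in(5), in6 => R_in(6), in7 => R_in(7),
      R_out => R_out);

  stimulus : process
  begin
    for i in 0 to 7 loop
      Rs <= std_logic_vector(to_unsigned(i, 3));
      wait for 10 ns;
      assert R_out = R_in(i)
        report "mux output does not match selected input" severity error;
    end loop;
    wait;  -- stop here instead of repeating until the end of the simulation
  end process stimulus;
end architecture sim;
```

R_in has exactly one driver here (its initial value), so there is no contention, and the final wait keeps the process from looping.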
Component ports with user-defined types
Alternatively, you could port map your R_in testbench signal directly to a R_in port on your component as you've done, but it would take a bit more work. Your mux8_1 component definition does not have an R_in port. You can add a t_array_mux type port named R_in, if you define the t_array_mux type in a package which you then include in your component and testbench files
library work;
use work.your_package_name.all;
in addition to library IEEE, etc. Then you can use the t_array_mux type in your component port definition:
ENTITY mux8_1 IS
PORT(Rs : IN STD_LOGIC_VECTOR(2 DOWNTO 0);
R_in : IN T_ARRAY_MUX; --User-defined port type
R_out : OUT STD_LOGIC_VECTOR(31 DOWNTO 0)
);
END mux8_1;
This will allow you to do the port mapping of your uut the way you currently have it. You'll have to add the package to the project or compile list in whatever tool you're using.
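As a rough sketch of that approach, the package and the modified entity could look like this. The package name your_package_name is the placeholder used above, and the indexed read in the architecture is a swapped-in alternative to the case statement from the question, not the original code.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Placeholder package name, as used in the text above.
package your_package_name is
  type t_array_mux is array (0 to 7) of std_logic_vector(31 downto 0);
end package your_package_name;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use work.your_package_name.all;

entity mux8_1 is
  port (
    Rs    : in  std_logic_vector(2 downto 0);
    R_in  : in  t_array_mux;  -- user-defined port type from the package
    R_out : out std_logic_vector(31 downto 0));
end entity mux8_1;

architecture behaviour of mux8_1 is
begin
  -- Index the array with the select value instead of an 8-way case.
  R_out <= R_in(to_integer(unsigned(Rs)));
end architecture behaviour;
```

One difference to be aware of: if Rs contains 'U' or 'X' in simulation, to_integer issues a warning and returns 0, so this version selects R_in(0) rather than driving all zeros as the original "when others" branch does.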
Using a testbench, you can test the correctness/output behavior of your module by giving a sequence of input signals and then comparing the output signals with the expected output.
Firstly, R_in is unknown to your testbench file, as it was an internal signal of your module. So, providing values to that signal doesn't make sense.
Secondly, you need to supply input to your in0, in1, ..., in7 signals, as they seem to drive your output signal R_out, along with the other input signal Rs
I tried implementing an FIR filter in VHDL, but during the first three clocks I get no output, and at 0 ps I get this message: Instance /filter_tb/uut/ : Warning: There is an 'U'|'X'|'W'|'Z'|'-' in an arithmetic operand, the result will be 'X'(es).
Source file (I also have 2 other files for D Flip-Flops):
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use ieee.std_logic_unsigned.all;
entity filter is
port ( x: in STD_LOGIC_VECTOR(3 downto 0);
clk: in STD_LOGIC;
y: out STD_LOGIC_VECTOR(9 downto 0));
end filter;
architecture struct of filter is
type array1 is array (0 to 3) of STD_LOGIC_VECTOR(3 downto 0);
signal coef : array1 :=( "0001", "0011", "0010", "0001");
signal c0, c1, c2, c3: STD_LOGIC_VECTOR(7 downto 0):="00000000";
signal s0, s1, s2, s3: STD_LOGIC_VECTOR(3 downto 0) :="0000";
signal sum: STD_LOGIC_VECTOR(9 downto 0):="0000000000";
component DFF is
Port ( d : in STD_LOGIC_VECTOR(3 downto 0);
clk : in STD_LOGIC;
q : out STD_LOGIC_VECTOR(3 downto 0));
end component;
component lDFF is
Port ( d : in STD_LOGIC_VECTOR(9 downto 0);
clk : in STD_LOGIC;
q : out STD_LOGIC_VECTOR(9 downto 0));
end component;
begin
s0<=x;
c0<=x*coef(0);
DFF1: DFF port map(s0,clk,s1);
c1<=s1*coef(1);
DFF2: DFF port map(s1,clk,s2);
c2<=s2*coef(2);
DFF3: DFF port map(s2,clk,s3);
c3<=s3*coef(3);
sum<=("00" & c0+c1+c2+c3);
lDFF1: lDFF port map(sum,clk,y);
end struct;
Testbench:
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
-- Uncomment the following library declaration if using
-- arithmetic functions with Signed or Unsigned values
use ieee.std_logic_unsigned.all;
ENTITY filter_tb IS
END filter_tb;
ARCHITECTURE behavior OF filter_tb IS
-- Component Declaration for the Unit Under Test (UUT)
COMPONENT filter
PORT(
x : IN STD_LOGIC_VECTOR(3 downto 0);
clk : IN std_logic;
y : OUT STD_LOGIC_VECTOR(9 downto 0)
);
END COMPONENT;
--Inputs
signal x : STD_LOGIC_VECTOR(3 downto 0) := (others => '0');
signal clk : std_logic := '0';
--Outputs
signal y : STD_LOGIC_VECTOR(9 downto 0);
-- Clock period definitions
constant clk_period : time := 10 ns;
BEGIN
-- Instantiate the Unit Under Test (UUT)
uut: filter PORT MAP (
x => x,
clk => clk,
y => y
);
-- Clock process definitions
clk_process :process
begin
clk <= '0';
wait for clk_period/2;
clk <= '1';
wait for clk_period/2;
end process;
-- Stimulus process
stim_proc1: process
begin
x<="0001";
wait for 10ns;
x<="0011";
wait for 10ns;
x<="0010";
wait for 10ns;
--x<="0011";
end process;
END;
Output:
If anyone could help, I'd appreciate it. I think it has something to do with the initial values of the signals c_i and s_i but I'm not too sure.
Your FIR filter contains flip-flops. These flip-flops have no reset input and so power up in an unknown state. Your simulator models this by initialising the flip-flops' outputs to "UUUU" (as they are four bits wide). A 'U' std_logic value represents an uninitialised value.
So, your code behaves as you ought to expect. If you're not happy with that behaviour, you need to add a reset input and connect it to your flip-flops.
You have built a cascade of three registers.
You have not provided a reset, so the register contents will be unknown. You use the registers for calculations without any condition, so your arithmetic will see the unknown values and fail as you have seen.
The first (simplest) solution would be to add a reset. But that is not the best solution. You will no longer get warnings, but the first three cycles of your output will be based on the register reset value, not on your input signal.
If you have a big stream and don't care about some incorrect values in the first clock cycle you can live with that.
The really correct way would be to have a 'valid' signal transported along side your data. You only present the output data when there is a 'valid'. This is the standard method to process data through any pipeline hardware structure.
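One way this could look for the 4-tap filter in the question is sketched below. The entity name, the x_valid and y_valid ports, and the fill counter are all hypothetical; the sketch also uses numeric_std unsigned instead of std_logic_unsigned and folds the delay line into one clocked process, so treat it as an illustration of the idea rather than a drop-in replacement.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity fir_valid_sketch is
  port (
    clk     : in  std_logic;
    x       : in  unsigned(3 downto 0);
    x_valid : in  std_logic;              -- hypothetical: marks a valid sample
    y       : out unsigned(9 downto 0);
    y_valid : out std_logic);             -- hypothetical: marks a valid result
end entity fir_valid_sketch;

architecture rtl of fir_valid_sketch is
  type tap_array is array (0 to 3) of unsigned(3 downto 0);
  constant coef : tap_array := ("0001", "0011", "0010", "0001");
  signal taps : tap_array := (others => (others => '0'));
  -- Number of valid samples already stored, saturating at 3.
  signal fill : natural range 0 to 3 := 0;
begin
  process (clk)
    variable new_taps : tap_array;
    variable acc      : unsigned(9 downto 0);
  begin
    if rising_edge(clk) then
      y_valid <= '0';
      if x_valid = '1' then
        -- Shift the new sample into the delay line.
        for i in 3 downto 1 loop
          new_taps(i) := taps(i - 1);
        end loop;
        new_taps(0) := x;
        taps <= new_taps;

        -- Multiply-accumulate over all four taps.
        acc := (others => '0');
        for i in 0 to 3 loop
          acc := acc + resize(new_taps(i) * coef(i), acc'length);
        end loop;
        y <= acc;

        -- Flag the result only once four real samples are in the line.
        if fill = 3 then
          y_valid <= '1';
        else
          fill <= fill + 1;
        end if;
      end if;
    end if;
  end process;
end architecture rtl;
```

Downstream logic would then look at y only while y_valid is '1', so the start-up values never need to be meaningful.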
By the way: you normally do not build D-ffs yourself. The synthesizer will do that for you. You just use a clocked process and process the data vectors in it.
I have some questions. If I add a reset pin, when will I toggle it from 1 to 0? How can I create this circuit without explicitly using D-ffs?
You make a reset signal in the same way as you make your clock.
As to D-registers: they come out if you use the standard register VHDL code:
reg : process (clk,reset_n)
begin
-- asynchronous active-low reset
if (reset_n='0') then
s0 <= "0000";
s1 <= "0000";
s2 <= "0000";
elsif (rising_edge(clk)) then
s0 <= x;
s1 <= s0;
s2 <= s1;
....
(Code entered as-is, not checked for syntax or typing errors)
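For reference, a minimal self-contained sketch of generating the reset next to the clock in a testbench, together with a complete register process, might look as follows. The entity name, the 22 ns release time and the constant value on x are assumptions chosen for illustration; the release time is simply picked so that it does not coincide with a rising clock edge.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical testbench name; run it for a fixed time, e.g. 100 ns.
entity reset_gen_sketch is
end entity reset_gen_sketch;

architecture sim of reset_gen_sketch is
  constant clk_period : time := 10 ns;
  signal clk        : std_logic := '0';
  signal reset_n    : std_logic := '0';
  signal x          : std_logic_vector(3 downto 0) := "0101";  -- arbitrary input
  signal s0, s1, s2 : std_logic_vector(3 downto 0);
begin
  -- Free-running clock: rising edges at 5 ns, 15 ns, 25 ns, ...
  clk <= not clk after clk_period / 2;

  -- Reset is made the same way: active (low) at start, released at 22 ns.
  reset_n <= '0', '1' after 22 ns;

  reg : process (clk, reset_n)
  begin
    -- asynchronous active-low reset
    if reset_n = '0' then
      s0 <= "0000";
      s1 <= "0000";
      s2 <= "0000";
    elsif rising_edge(clk) then
      s0 <= x;
      s1 <= s0;
      s2 <= s1;
    end if;
  end process reg;
end architecture sim;
```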
I want to see the speed of my VHDL design. As far as I know, it is indicated by Fmax in the Quartus II software. After compiling my design, it shows an Fmax of 653.59 MHz. I wrote a testbench and did some tests to make sure that the design is working as expected. The problem I have with the design is that at the rising edge of the clock, the inputs are set correctly, but the output only comes after one more cycle.
My question is: How can I check the speed of my design (longest delay between the input ports and the output port) and also get the output of the addition at the same time that the inputs are loaded/at the same cycle?
My testbench results are as follows:
a: 0001 and b: 0101 gives XXXX
a: 1001 and b: 0001 gives 0110 (the expected result from the previous
calculation)
a: 1001 and b: 1001 gives 1010 (the expected result from the previous
calculation)
etc
Code:
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity adder is
port(
clk : in STD_LOGIC;
a : in unsigned(3 downto 0);
b : in unsigned(3 downto 0);
sum : out unsigned(3 downto 0)
);
end adder;
architecture rtl of adder is
signal a_r, b_r, sum_r : unsigned(3 downto 0);
begin
sum_r <= a_r + b_r;
process(clk)
begin
if (rising_edge(clk)) then
a_r <= a;
b_r <= b;
sum <= sum_r;
end if;
end process;
end rtl;
Testbench:
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity testbench is
end entity;
architecture behavioral of testbench is
component adder is
port(
clk : in STD_LOGIC;
a : in unsigned(3 downto 0);
b : in unsigned(3 downto 0);
sum : out unsigned(3 downto 0)
);
end component;
signal a, b, sum : unsigned(3 downto 0);
signal clk : STD_LOGIC;
begin
uut: adder
port map(
clk => clk,
a => a,
b => b,
sum => sum
);
stim_process : process
begin
wait for 1 ns;
clk <= '0';
wait for 1 ns;
clk <= '1';
a <= "0001";
b <= "0101";
wait for 1 ns;
clk <= '0';
wait for 1 ns;
clk <= '1';
a <= "1001";
b <= "0001";
wait for 1 ns;
clk <= '0';
wait for 1 ns;
clk <= '1';
a <= "1001";
b <= "1001";
end process;
end behavioral;
Is there any issue with using sum_r as your output?
You don't need the input and output registers if you treat this ALU as purely combinational logic. Once you delete them, the reported Fmax will disappear: Fmax describes register-to-register paths, so it will then depend on what the block is connected from and what it is connected to, and it only applies if the incoming signals come from registers and the outgoing signals go to registers. If the path is only logic from an input pin to an output pin, I think it is extremely difficult to say what the propagation delay is, and I don't think vendor software such as Altera's has tools that are adequate for this kind of analysis.
That's why you will hear people talking about the difficulties of designing asynchronous logic.
I think such fine analysis is difficult to perform with certainty and accuracy, since in your case the propagation delay would be in the picosecond range. Even in the literature it is difficult to find quantitative answers on propagation delay.
Why is it difficult? Remember that propagation delay is determined by the total path capacitance. There is a way to estimate propagation delay for transistors, but I don't know the details of how the LUTs are constructed internally, so I cannot give you a good estimate. It depends heavily on the device family, the manufacturing process, the construction of the FPGA, and whether the load is connected to I/O.
You may, however, make your own estimate by going to the logic planner, looking at the path, and assuming about 20-100 ps of propagation delay per LUT that it travels through.
See the image below.
What you are trying to design is an ALU. By definition, an ALU should be in theory simply a combinatorial logic.
Therefore, strictly speaking, your adder code should only be this.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity adder is
port(
a : in unsigned(3 downto 0);
b : in unsigned(3 downto 0);
sum : out unsigned(3 downto 0)
);
end adder;
architecture rtl of adder is
begin
sum <= a + b;
end rtl;
No clock is required here, since this function is really a combinational process.
However, if you want to put your ALU into a registered stage as I have described, what you should actually be doing is this:
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity adder is
port(
clk : in STD_LOGIC;
a : in unsigned(3 downto 0);
b : in unsigned(3 downto 0);
sum : out unsigned(3 downto 0)
);
end adder;
architecture rtl of adder is
signal a_r, b_r, sum_r : unsigned(3 downto 0);
signal internal_sum : unsigned(3 downto 0);
begin
sum <= sum_r;
internal_sum <= a_r + b_r;
process(clk)
begin
if (rising_edge(clk)) then
a_r <= a;
b_r <= b;
sum_r <= internal_sum;
end if;
end process;
end rtl;
You have not mentioned carry out, so I will not discuss that here.
Finally, if you are using Altera, they have a very nice RTL viewer that lets you look at your synthesized design, under Tools->Netlist Viewers->RTL Viewer.
I am using ModelSim to simulate a pseudo-random pattern generator using the code below. The problem is that when I force the data_reg signal to a seed value (e.g. 0001010101101111), data_out shows the same value instead of a random value. I would really appreciate any help on this one.
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
entity dff is
Port ( CLK : in std_logic;
RSTn : in std_logic;
D : in std_logic;
Q : out std_logic);
end dff;
architecture Behavioral of dff is
begin
process(CLK)
begin
if CLK'event and CLK='1' then
if RSTn='1' then
Q <= '1';
else
Q <= D;
end if;
end if;
end process;
end Behavioral;
VHDL CODE FOR PRBS Generator using LFSR:
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
entity lfsr is
Port ( CLK : in STD_LOGIC;
RSTn : in STD_LOGIC;
data_out : out STD_LOGIC_VECTOR (15 downto 0));
end lfsr;
architecture Behavioral of lfsr is
component dff
Port ( CLK : in std_logic;
RSTn : in std_logic;
D : in std_logic;
Q : out std_logic);
end component;
signal data_reg : std_logic_vector(15 downto 0);
signal tap_data : std_logic;
begin
process(CLK)
begin
tap_data <= (data_reg(1) xor data_reg(2)) xor (data_reg(4) xor
data_reg(15));
end process;
stage0: dff
port map(CLK, RSTn, tap_data, data_reg(0));
g0:for i in 0 to 14 generate
stageN: dff
port map(CLK, RSTn, data_reg(i), data_reg(i+1));
end generate;
data_out <= data_reg after 3 ns;
end Behavioral;
First off. In your LFSR you have a process sensitive to CLK which should only be combinational:
process(CLK) -- Not correct
-- Change to the following (or "all" in VHDL-2008)
process(data_reg)
You could also just implement it as a continuous assignment outside of a formal process which is functionally the same in this case.
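A small sketch of the continuous-assignment form is shown below. The wrapper entity lfsr_tap_sketch and its ports exist only to make the snippet compile on its own; in the original lfsr architecture the single assignment line would replace the whole process.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical wrapper so the assignment can be compiled in isolation.
entity lfsr_tap_sketch is
  port (
    data_reg : in  std_logic_vector(15 downto 0);
    tap_data : out std_logic);
end entity lfsr_tap_sketch;

architecture rtl of lfsr_tap_sketch is
begin
  -- Re-evaluated whenever any referenced bit of data_reg changes,
  -- so no sensitivity list has to be maintained by hand.
  tap_data <= (data_reg(1) xor data_reg(2)) xor (data_reg(4) xor data_reg(15));
end architecture rtl;
```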
When you force data_reg to a value you are overriding the normal signal drivers instantiated in the design. In the GUI the force command defaults to "Freeze". Once that is in place, the drivers can't update data_reg because the freeze force is dominant until you cancel it. In the force dialog select the "Deposit" kind to change the state without overriding the drivers on subsequent clocks.
The Modelsim documentation has this to say about the different force kinds:
freeze -- Freezes the item at the specified value until it is forced again or until it is unforced with a noforce command.
drive -- Attaches a driver to the item and drives the specified value until the item is forced again or until it is unforced with a noforce command. This option is illegal for unresolved signals.
deposit -- Sets the item to the specified value. The value remains until there is a subsequent driver transaction, or until the item is forced again, or until it is unforced with a noforce command
Note: While a lot of instructional materials (unfortunately) demonstrate the use of the std_logic_arith and std_logic_unsigned packages, these are not actual IEEE standards and shouldn't be used in standard-conformant VHDL. Use numeric_std instead or, in your case, just eliminate them since you aren't using any arithmetic from those packages.
I've made a dual port register bank in VHDL, and I want to test it to make sure it works. How would I go about doing this? I know what I want to do (set register 2 to be a constant, read out of it in test program, write to register 3 and read it back out and see if I have the same results).
Only thing is, I'm new to VHDL, so I don't know if there's a console or how a test program is structured or how to instantiate the register file, or even what to compile it in (I've been using quartus so far).
Here's my register file:
use IEEE.STD_LOGIC_ARITH.all;
use IEEE.STD_LOGIC_UNSIGNED.all;
-- Register File
entity RF is
port(
signal clk, we: in std_logic;
signal ImmediateValue : in std_logic_vector(15 downto 0);
signal RegisterSelectA, RegisterSelectB : in integer range 0 to 15;
signal AOut, BOut : out std_logic_vector(15 downto 0)
);
end RF
architecture behavior of RF is
array std_logic_vector_field is array(15 downto 0) of std_logic_vector(15 downto 0);
variable registers : std_logic_vector(15 downto 0);
process (clk, we, RegisterSelectA, RegisterSelectB, ImmediateValue)
wait until clk'event and clk = '1';
registers(RegisterSelectA) := ImmediateValue when we = '1';
AOut <= registers(RegisterSelectA);
BOut <= registers(RegisterSelectB);
end process;
end behavior;
First of all, if you are new to VHDL design, you might be best off starting with a tutorial on the web, or grabbing a book like "The Designer's Guide to VHDL".
Anyway, just like a software design, to test a VHDL design, you have to write some test code. In hardware design, usually these tests are unit-test like, but are often called "testbenches".
For the design you've given, you'll need to create something like this:
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity test_RF is
end entity;
architecture test of test_RF is
signal clk, we: std_logic;
signal ImmediateValue : std_logic_vector(15 downto 0);
signal RegisterSelectA, RegisterSelectB : integer range 0 to 15;
signal AOut, BOut : std_logic_vector(15 downto 0);
begin
-- Instantiate the design under test
u_RF : entity work.RF
port map (
clk => clk,
we => we,
ImmediateValue => ImmediateValue,
RegisterSelectA => RegisterSelectA,
RegisterSelectB => RegisterSelectB,
AOut => AOut,
BOut => BOut
);
-- create a clock
process is
begin
clk <= '0';
loop
wait for 10 ns;
clk <= not clk;
end loop;
end process;
-- create one or more processes to drive the inputs and read the outputs
process is
begin
wait until rising_edge(clk);
-- do stuff
-- use assert to check things
-- etc
end process;
end architecture;
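To make the "do stuff" part more concrete, here is one possible self-checking sketch. Because the register file in the question does not compile as posted, the RF below is an assumed clean-up (writes go to the register addressed by RegisterSelectA, and both outputs are registered reads), not the asker's exact design. The test value x"BEEF", the done flag and the name test_RF_sketch are likewise made up for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Assumed clean-up of the register file, for illustration only.
entity RF is
  port (
    clk, we         : in  std_logic;
    ImmediateValue  : in  std_logic_vector(15 downto 0);
    RegisterSelectA : in  integer range 0 to 15;
    RegisterSelectB : in  integer range 0 to 15;
    AOut, BOut      : out std_logic_vector(15 downto 0));
end entity RF;

architecture behavior of RF is
  type reg_array is array (0 to 15) of std_logic_vector(15 downto 0);
  signal registers : reg_array := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if we = '1' then
        registers(RegisterSelectA) <= ImmediateValue;
      end if;
      AOut <= registers(RegisterSelectA);
      BOut <= registers(RegisterSelectB);
    end if;
  end process;
end architecture behavior;

library ieee;
use ieee.std_logic_1164.all;

entity test_RF_sketch is
end entity test_RF_sketch;

architecture test of test_RF_sketch is
  signal clk : std_logic := '0';
  signal we  : std_logic := '0';
  signal ImmediateValue : std_logic_vector(15 downto 0) := (others => '0');
  signal RegisterSelectA, RegisterSelectB : integer range 0 to 15 := 0;
  signal AOut, BOut : std_logic_vector(15 downto 0);
  signal done : boolean := false;  -- stops the clock so the simulation ends
begin
  u_RF : entity work.RF
    port map (
      clk => clk, we => we, ImmediateValue => ImmediateValue,
      RegisterSelectA => RegisterSelectA, RegisterSelectB => RegisterSelectB,
      AOut => AOut, BOut => BOut);

  clock : process
  begin
    while not done loop
      clk <= '0'; wait for 10 ns;
      clk <= '1'; wait for 10 ns;
    end loop;
    wait;
  end process clock;

  stimulus : process
  begin
    wait until rising_edge(clk);
    we <= '1';                      -- request a write to register 3
    RegisterSelectA <= 3;
    ImmediateValue  <= x"BEEF";
    wait until rising_edge(clk);    -- the write is performed on this edge
    we <= '0';
    RegisterSelectB <= 3;
    wait until rising_edge(clk);    -- registered read of register 3
    wait for 1 ns;                  -- sample the outputs after the edge
    assert AOut = x"BEEF" report "AOut readback failed" severity error;
    assert BOut = x"BEEF" report "BOut readback failed" severity error;
    done <= true;
    wait;
  end process stimulus;
end architecture test;
```

A failed check shows up as an assertion message in the simulator's transcript, which is the closest thing to a "console" in this flow.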