How are loops within a process synthesized in VHDL? - vhdl

I'm working on the implementation of a FIR filter in VHDL and need some advice regarding when to use and not to use process statements. Part of the code is presented below. Specifically, I'm wodering how the resetCoeffs loop will synthesize. Will it be a sequential reset, and so be very inefficeint in both speed and area I assume, or will it be done in parallel? If the former is the case, how would I go about writing it so that is can be a parallel reset.
process (clk) is begin
if rising_edge(clk) then
if rst = '1' then
-- Reset pointer
ptr <= (others => '0');
-- Reset coefficients
resetCoeffs: for i in 0 to ORDER - 1 loop
coeffs(i) <= (others => '0');
end loop;
else
-- Increment pointer
ptr <= ptr + 1;
-- Fetch input value
vals( to_integer(ptr) ) <= ival;
-- Write coefficient
if coeff_wen = '1' then
coeffs( to_integer(ptr) ) <= ival;
end if;
end if;
end if;
end process;

It's going to be parallel. It basically has to be since the whole loop (everything in the process, really) has to operate in a single clock cycle in hardware.
I'm curious, though. Since you've made the effort to write out the whole process, why not just synthesize it with your favorite synthesizer? That's what I did, to check my answer.

Once a process begin execution (because of a change in one of the signals in its sensitivity list) it must run until it suspends, either by reaching the end, or reaching a wait statement.
Because of this defined behaviour, loops will "execute" all their code until they exit before time is allowed to move on in the simulator. The synthesiser will therefore create the same behaviour by unrolling all the loop and making sure it "all happens at once" (or at least within the clock tick for a clocked process like yours).

Related

Execute two chunks of code in sequence but the code itself in parallel

I have a piece of code that needs to be executed after another. For example, I have an addition slv_reg2 <= slv_reg0 + slv_reg1; and then I need the result subtracted from a number.
architecture IMP of user_logic is
signal slv_reg0 : std_logic_vector(0 to C_SLV_DWIDTH-1); --32 bits wide
signal slv_reg1 : std_logic_vector(0 to C_SLV_DWIDTH-1);
signal slv_reg2 : std_logic_vector(0 to C_SLV_DWIDTH-1);
signal slv_reg3 : std_logic_vector(0 to C_SLV_DWIDTH-1);
signal flag : bit := '0';
begin
slv_reg2 <= slv_reg0 + slv_reg1;
flag <= '1';
process (flag)
begin
IF (flag = '1') THEN
slv_reg3 <= slv_reg0 - slv_reg2;
END IF;
end process;
end IMP;
I haven't tested the code above but I would like some feedback if my thought is correct. What is contained in the process doens't need to run in sequence, how can I make this part also run in parallel?
In summary, I have two chunks of code that need to be executed in sequence but the code itselft should run in parallel.
--UPDATE--
begin
slv_regX <= (slv_reg0 - slv_reg1) + (slv_reg2 - slv_reg3) +(slv_reg4 - slv_reg5); --...etc
process(clk) -- Process for horizontal counter
begin
if(rising_edge(clk)) then
if(flag = 1) then
flag <= 0;
end if;
end if;
end process;
It's concurrent, not parallel. Just use intermediate signals
I am afraid your thought process about VHDL is not entirely correct. You acknowledge the first assignment happening 'in parallel', but you must realise that the process is also running concurrently with the rest of the statements. Only what happens inside a process is evaluated sequentially - every time any of the signals in the sensitivity list change.
I need to make a lot of assumptions about what you need, but perhaps you're just after this style:
c <= a + b;
e <= d - c;
The signal e will contain d-(a+b) at all times. If you don't immediately understand why - this is where the difference between concurrent and parallel comes in. It is being continuously evaluated - just like if you'd hook the circuit up with copper wires.
If what you actually need is for something to happen based on a clock - you're really not there yet with your example and I recommend looking up examples and tutorials first. People will be happy to help once you provide VHDL code that is more complete.
Issues in your example
Assuming you are intending to write synthesisable VHDL, there several problems with your code:
Your process sensitivity list is not correct - you would either make it sensitive to only a clock, or to all the input signals to the process.
You are both initialising your 'flag' to '1' and assigning it a '0' in concurrent code, that doesn't make sense.
Signals ..reg0 and ..reg1 are not assigned. If they are inputs, don't declare them as signals.
You have named your signals as if they are anonymous numbered 'registers', but they are not registers, just signals: wires! (if ended up with this style because you are comparing to Verilog - let me argue that the Verilog 'reg' regularly don't make sense as they often end up not being registers.)
Try it
I haven't tested the code above but
You really should. Find code examples, tutorials, go and play!

Error (10818): Can't infer register for ... at ... because it does not hold its value outside the clock edge

I´m trying to verify four buttons. When one of them are pushed I need to check if the corresponding led are lit. So, I did the code, where a process checks which button has been pressed and compares the values with the value of the led (lit or not). The problem occurs when I want to increase the variable that controls the amount of hits (successes) of the player.
Remembering that "acertos" is a signal of the type std_logic_vector(3 downto 0)
process(pb0,pb1,pb2,pb3)
variable erro_int : STD_LOGIC;
begin
if (clk_game = '0') then
erro_int:='0';
if rising_edge(pb0) then
if pb0 /= led(0) then
erro_int:='1';
end if;
elsif rising_edge(pb1) then
if pb1 /= led(1) then
erro_int:='1';
end if;
elsif rising_edge(pb2) then
if pb2 /= led(2) then
erro_int:='1';
end if;
elsif rising_edge(pb3) then
if pb3 /= led(3) then
erro_int:='1';
end if;
else
acertos <= acertos+1;
end if;
end if;
end process;
See Error (10818): Can't infer register for “E” at clk200Hz.vhd(29) because it does not hold its value outside the clock edge which demonstrates the exact same Error case. Normally you'd also be encouraged to use the vendor as a resource for their error messages as well.
The error case here is expressing that the acertos assignment is not taking place inside a conditional assignment statement dependent on a signal event used as a clock.
However this isn't the end to potential problems you'll likely encounter with the process you've shown us from some design specification.
There's the question of whether signals pb0, pb1, pb2 and pb3 are de-bounce filtered. See Wentworth Institute of Technology Department of Electronics and Mechanical ELEC 236 Logic Circuits Switch Debounce Design, where the first scope trace shows a normally pulled high, momentary switch connecting to ground switch input. The issue is contact bounce. The web page talks about some solutions.
De-bounce would allow you to use your button inputs as clocks but there are other issues in your process.
For instance some vendors won't accept multiple clocks in the same process statement, your design specification may not be portable.
There isn't a recognized sequential event inferring storage element construct that allows multiple clocks for the same storage element (erro_int). Because there is no passage of simulation time expressed by successive if statements without inter-dependencies you might expect that perhaps only the last one get's expressed as hardware.
You could combine all the push buttons into a single signal:
button_press <= not (not pb0 and not pb1 and not pb2 and not bp3);
Where any button pressed can cause an edge event when transitioning from no buttons pressed.
If you de-bounce this say using a counter and testing for successive events you could use it as the sole clock.
Let's assume for the following it's de-bounced. Setting a default value in the process statement for erro_out would give you a process that looks something like:
process(button_press)
variable erro_int: std_logic;
begin
if (clk_game = '0') then
erro_int:='0';
if rising_edge(button_press) then
if pb0 /= led(0) or pb1 /= led(1) or pb1 /= led(2) or pb3 /=pb3 then
erro_int:= '1';
end if;
acertos <= acertos + 1;
end if;
end process;
(And I looked up the translation of acertos - hits, not necessarily valid hits, I take it this is a game.)
This still doesn't solve the issue that erro_int is a local variable. If it's used elsewhere it wants to be a declared as a signal. If you change the process to:
...
signal erro_int: std_logic; -- defined say as an architecture declarative item
...
process(button_press)
begin
if (clk_game = '0') then
erro_int <='0';
if rising_edge(button_press) then
if pb0 /= led(0) or pb1 /= led(1) or pb1 /= led(2) or pb3 /=pb3 then
erro_int <= '1';
end if;
acertos <= acertos + 1;
end if;
end process;
You can use it externally. The reason why this works is that a process only has a single driver for a signal and the last assignment at the current simulation time is the one that takes effect (there's only one assignment to the current simulation time).
And of course if you use a clock to de-bounce you could use button_press as an enable if it were gated to only last one clock.

In a state machine process is there a difference if I state specifically that the state stays the same?

In a process that controls a state machine, when the state stays the same, is there a difference if specifically stated that the state signal gets the same value as it has? In the example below, are the two lines inside the process with the notes needed?
--CLK and RST are input signals
type state_machine_states is
(
st_idle, st_1
);
signal sm : state_machine_states ;
signal next_state : std_logic;
begin
--assume that there is some logic which deals with the next_state signal
states_proc: process (RST, CLK)
begin
if (RST = '1')
sm <= 'st_idle'
elsif rising_edge(CLK) then
case sm is
when st_idle =>
if (next_state = '1') then
sm <= st_1;
else --Are these two lines needed, and is there
sm <= st_idle --any difference if they are written or not?
end if;
when st_1 =>
sm <= st_idle;
end case;
end if;
end process;
This is fine with most modern tools. If you omit an else in a combinatorial process then you'll infer a latch. But in a clocked process you will not.
It treats this it as an enable to drive the register. The input is not clocked to the output when the enable is not driven.
when using an else statement, or generally when each code path is covered, then you avoid using latches
Update: when the else is omitted, you use the latch as memory
The code shown along with the original question is not appropriate for FSM implementation. You are looking at sm and next state to decide what sm should be, which does not make sense. The following must occur in ANY state machine:
1) Look at the present state and to the input to decide what the next state should be. This must be REGISTERED.
2) Look at the present state to decide what the output should be (Moore case). This should NOT BE REGISTERED, unless a special condition is required (for example, glitch-free output or pipelined construction - See, for example, "Finite State Machines in Hardware: Theory and Design (with VHDL and SystemVerilog)").

VDHL loop with variable outside of the process (how to be)

How can I avoid variable in this loop (outside the process)?
variable var1 : std_logic_vector (ADRESS_WIDTH-1 downto 0) := (others => '0');
for i in 0 to ADRESS_WIDTH-2 loop
var1 := var1 + '1';
with r_addr select
fifo_data_out <= array_reg(i) when var1,
end loop;
array_reg(ADRESS_WIDTH-1) when others;
This version (in process) isn't correct too - syntax errors
process (r_addr, r_addr1, fifo_data_out, array_reg, r_data1)
variable var1 : std_logic_vector (ADRESS_WIDTH-1 downto 0) := (others => '0');
begin
case r_addr is
when "0000000000" => fifo_data_out <= array_reg(0);
for i in 1 to ADRESS_WIDTH-2 loop
when var1 => fifo_data_out <= array_reg(i);
var1 := var1 + '1';
end loop;
when others => fifo_data_out <= array_reg(ADRESS_WIDTH-1);
end case;
There's a bunch of things that aren't quite right on your implementation. Completely agnostic of what you're trying to accomplish there are some things with VHDL that should be remembered:
Every opening statement must have a closing one.
Every nesting level must have a closing statement to "un-nest" it
You can't put portions of one statement set (the case internals) inside another statement (the for loop), this seems odd, but think about it, what case is it going to attach itself to.
VHDL and hardware programming in general is ridiculously parallel, from one iteration to the next the process is completely independent of all other processes.
Now, looking at your code I see what looks like what you want to accomplish, and this is a perfect example of why you should know a little bit of scripting in another language to help out hardware level programming. You should be as specific as possible when creating a process, know what you want to accomplish and in what bounds, I know this is like every other languages but hardware programming gives you all the tools to hang yourself very very thoroughly. Here's the best that I can make out from your code to clean things up.
async_process : process (r_addr, fifo_data_out, array_reg)
begin
case r_addr is
when "0000000000" => fifo_data_out <= array_reg(0);
when "0000000001" => fifo_data_out <= array_reg(1);
when "0000000002" => fifo_data_out <= array_reg(2);
when others => fifo_data_out <= array_reg(ADRESS_WIDTH-1);
end case;
end process;
r_addr_inc_process : process (clock <or other trigger>, reset)
<This is where you would deal with the bounds of the register address.
If you still want to deal with that variable, do it here.>
end process;
So, as you can see from this, you want to update as few things as possible when you're dealing with a process, this way your sensitivity list is extremely specific, and you can force most updating to occur synchronously instead of asynchronously. The reason your async process can be as such is that it will update every time r_addr is updated, and that happens every clock read, or on some flag, and it gives you a consistent reset state.
With how iterative the process is, you can see how using a scripting language to fill in the 100's of register values would help it from being very time consuming.

VHDL FSM set unit input and use output in same state

I am implementing a Mealy-type FSM in vhdl. I currently am using double process, although i've just read a single-process might be neater. Consider that a parameter of your answer.
The short version of the question is: May I have a state, inside of which the input of another component is changed, and, aftwerwards, in the same state, use an output of said component? Will that be safe or will it be a rat race, and I should make another state using the component's output?
Long version: I have a memory module. This is a fifo memory, and activating its reset signal takes a variable named queue_pointer to its first element. After writing to the memory, the pointer increases and, should it get out of range, it is (then also) reset to the first element, and an output signal done is activated. By the way, i call this component FIMEM.
My FSM first writes the whole FIMEM, then moves on to other matters. The last write will be done from the state:
when SRAM_read =>
READ_ACK <= '1';
FIMEM_enable <= '1';
FIMEM_write_readNEG <= '0';
if(FIMEM_done = '1') then --is that too fast? if so, we're gonna have to add another state
FIMEM_reset <= '1'; --this is even faster, will need check
data_pipe_to_FOMEM := DELAYS_FIMEM_TO_FOMEM;
next_state <= processing_phase1;
else
SRAM_address := SRAM_address + 1;
next_state <= SRAM_wait_read;
end if;
At this state, having enable and write active means data will be written on the FIMEM. If that was the last data space on the memory, FIMEM_done will activate, and the nice chunk of code within the if will take care of the future. But, will there be time enough? If not, and next state goes to SRAM_wait_read, and FIMEM_done gets activated then, there will be problems. The fact that FIMEM is totally synchronous (while this part of my code is in the asynchronous process) messes even more?
here's my memory code, just in case:
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;
entity memory is
generic (size: positive := 20);
Port ( clk,
reset,
enable,
write_readNEG: in std_logic;
done: out std_logic;
data_in: in STD_LOGIC_VECTOR(7 downto 0);
data_out: out STD_LOGIC_VECTOR(7 downto 0) );
end memory;
architecture Behavioral of memory is
subtype word is STD_LOGIC_VECTOR(7 downto 0);
type fifo_memory_t is array (0 to size-1) of word;
signal fifo_memory : fifo_memory_t :=((others=> (others=>'0')));
--Functionality instructions:
--Resetting sets the queue pointer to the first element, and done to 0
--Each cycle with enable active, a datum from the pointer position is
--written/read according to write_readNEG, and the pointer is incremented.
--If the operation was at the last element, the pointer returns to the first place
--and done is set to 1. When done is 1, enable is ignored.
Begin
process(clk,reset)
variable done_buf : std_logic;
variable queue_pointer: natural range 0 to size-1;
begin
if(reset = '1') then
queue_pointer := 0;
done_buf := '0';
elsif(rising_edge(clk)) then
if(done_buf = '0' and enable = '1') then
case write_readNEG is
when '0' =>
data_out <= fifo_memory(queue_pointer);
when '1' =>
fifo_memory(queue_pointer) <= data_in;
when others => null;
end case;
if(queue_pointer = size-1) then
done_buf := '1';
queue_pointer := 0;--check
else
queue_pointer := queue_pointer + 1;
end if;
end if; --enable x not done if
end if; --reset/rising edge end if
done <= done_buf;
end process;
End Behavioral;
More details inspired by the first comment:
The memory can read the data at the same cycle enable is activated, as seen here:
Notice how the "1", the value when enable is turned active, is actually written into the memory.
Unfortunately, the piece of code is in the asynchronous process! Although I'm VERY strongly thinking of moving to a single-process description.
In contrast to all the circuits I've designed until now, it is very hard for me to test it via simulation. This is a project in my university, where we download our vhdl programs to a xilinx spartan 3 FPGA. This time, we have been given a unit which transfers data between Matlab and the FPGA's SRAM (the functionality of which, I have no idea). Thus, I have to use this unit to transfer the data between the SRAM and my memory module. This means, in order to simulate, my testbench file will have to simulate the given unit! And this is hard..suppose I must try it, though...
first of all, whether to use a single process or a dual process type of FSM notation is a matter of preference (or company coding style rules). I find single process notation easier to write/read/manage.
your enable signal will have an effect on your memory code only after the next rising clock edge. the done signal related to the actual memory state will therefore be available one clock cycle after updating enable. I guess (and hope! but it's not visible in your posted code), your current_state<=next_state part of the FSM is synchronous! therefore your state machine will be in the SRAM_wait_read state by the time when done is updated!
btw: use a simulator! it will help to check the functionality!
thanks for adding the simulation view! strangely your done signal updates on neg. clock edge... in my simulation it updates on pos. edge; as it should, by the way!
to make the situation more clear I suggest, you move the done <= done_buf; line inside the "rising_edge-if" (this should be done anyhow when using synchronous processes!).

Resources