Latch signal without delay - vhdl

I would like to latch a signal, however when I try to do so, I get a delay of one cycle, how can I avoid this?
myLatch: process(wclk, we) -- Can I omit the we in the sensitivity list?
begin
  if wclk'event and wclk = '1' then
    lwe <= we;
  end if;
end process;
However, if I try this and look at the waves during simulation, lwe is delayed by one cycle of wclk. All I want to achieve is to sample we on the rising edge of wclk and keep it stable till the next rising edge. I then assign the latched signal to another entity's port map, which is defined in the architecture.
==============================================
Well I figured out that I have to omit the wclk'event to get a latch instead of a flip flop. This seems rather unintuitive to me. By simply shortening the time where I sample the signal to be latched I go from latch to flip flop. Can anyone explain why this is and where my perception is wrong. (I am a vhdl beginner)

First off, a few observations on the process you pasted above:
myLatch: process(wclk, we)
begin
if wclk'event and wclk = '1' then
lwe <= we;
end if;
end process;
The signal we can be omitted from the sensitivity list because you have described a clocked process. The only signals required in the sensitivity list of a process like this are the clock and the asynchronous reset if you choose to use one (a synchronous reset would not need to be added to the sensitivity list).
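Instead of using if wclk'event and wclk = '1' then, you should use if rising_edge(wclk) then or if falling_edge(wclk) then. For std_logic, wclk'event and wclk = '1' is true for any change that ends at '1', whereas rising_edge also requires the previous value to be '0' (or 'L'). There's a good blog post on the reasons why here.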
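To make the difference concrete, here is a minimal sketch of what rising_edge checks for a std_ulogic signal. The package name edge_sketch and the function name is_rising are made up for this illustration; to the best of my knowledge the body mirrors the definition in std_logic_1164, but treat it as a sketch rather than the library source.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package edge_sketch is
  -- Hypothetical helper, named only for this illustration.
  function is_rising(signal s : std_ulogic) return boolean;
end package edge_sketch;

package body edge_sketch is
  function is_rising(signal s : std_ulogic) return boolean is
  begin
    -- A real edge needs an event, a new value of '1' (or 'H'),
    -- and a previous value of '0' (or 'L').
    return s'event
       and To_X01(s) = '1'
       and To_X01(s'last_value) = '0';
  end function is_rising;
end package body edge_sketch;
```

With this check, a change from 'U' to '1' at the start of a simulation is not treated as a clock edge, while wclk'event and wclk = '1' would accept it.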
By omitting the wclk'event you changed the process from a clocked process to a combinatorial process, like so:
myLatch: process(wclk, we)
begin
if wclk = '1' then
lwe <= we;
end if;
end process;
In a combinatorial process all inputs should be present in the sensitivity list, so you would be correct to have both wclk and we in the list, as both have an influence on the output. Normally you would ensure that lwe is assigned in all branches of your if statement to avoid inferring a latch; however, a latch appears to be your intention in this case.
Latches in general should be avoided, so if you find yourself needing one you should perhaps pause and consider your approach. Doulos have a couple of articles on latches here and here that you might find useful.
You stated that all you want to achieve is to sample we on the rising edge of wclk and keep it stable until the next rising edge. The process below will accomplish this:
store : process(wclk)
begin
if rising_edge(wclk) then
lwe <= we;
end if;
end process;
With this process, lwe will be updated with the value of we upon every rising edge of wclk and it will remain valid for a single clock cycle.
Let me know if this clears things up for you.

Believe it or not, the issue is actually in your testbench. This has to do with how the VHDL simulation model works.
VHDL is usually used for synchronous hardware design -- that means, using flip-flops that sample on the rising edge and set outputs on the falling edge, so that there are no race conditions between reading and writing. But in VHDL this master/slave logic is not actually simulated using opposite clock edges.
Consider a process
process (clock) begin
if rising_edge(clock) then
a <= b;
end if;
end process;
At the start of a simulation timestep, if clock has just risen, the if will execute. Then the assignment a <= b will be executed, and this will not immediately cause an assignment to take place, but schedule the assignment for the end of the timestep.
After all processes have been run, then all scheduled assignments take place. This means that no process will "see" the new value of a until the next timestep.
Time           a    b    Actions
Start of ts 1  '0'  '1'  a <= '1' is scheduled
End of ts 1    '1'  '0'  a <= '1' is executed
Start of ts 2  '1'  '0'  a <= '0' is scheduled
End of ts 2    '0'  '1'  a <= '0' is executed
So when you look on the waveform viewer, what you will see is a apparently being set on the rising edge of the clock, and following b delayed by one clock cycle; you don't see the intermediate scheduling of assignments that causes this to happen.
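The scheduling described above can be observed with a small simulation-only sketch. Everything here (the entity name swap_demo, the 10 ns clock, the monitor process) is an assumption made for illustration. Two signals are exchanged inside one clocked process; because both assignments are only scheduled, each right-hand side still sees the old value.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity swap_demo is
end entity swap_demo;

architecture sim of swap_demo is
  signal clock : std_logic := '0';
  signal a     : std_logic := '0';
  signal b     : std_logic := '1';
begin
  -- 10 ns clock, stopped after four cycles so the simulation ends.
  clk_gen : process
  begin
    for i in 1 to 4 loop
      clock <= '0'; wait for 5 ns;
      clock <= '1'; wait for 5 ns;
    end loop;
    wait;
  end process clk_gen;

  swap : process (clock)
  begin
    if rising_edge(clock) then
      -- Both right-hand sides are read before either signal is updated,
      -- so the values are exchanged rather than duplicated.
      a <= b;
      b <= a;
    end if;
  end process swap;

  -- Prints the values whenever a or b changes.
  monitor : process (a, b)
  begin
    report "a = " & std_logic'image(a) & ", b = " & std_logic'image(b);
  end process monitor;
end architecture sim;
```

If the assignments took effect immediately, a and b would end up equal after the first edge; instead they swap on every rising edge, which matches the table.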
Of course, in real life, there is no "end of the timestep", and the actual changing of signal a happens when the slave part of the flip-flop triggers, i.e., on the negative edge. (Maybe it would have been less confusing for VHDL to just use the negative edge; but, oh well, this is how it works.)
Here are two testbenches for your latch code:
Test bench 1, using rising edges
Test bench 2, using falling edges
In the first, if you look in the waveform viewer you will see exactly what you describe -- lwe appears to be delayed by 1 clock cycle -- but really, the delay is happening in the non-blocking assignment that sets counter -- so when the rising edge happens, we does not actually have its new value yet. And in the second, you see no such delay; lwe is set exactly on the rising edge to the value of we at that time.
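The two linked testbenches are not reproduced here. As a rough stand-in for the second one, the sketch below drives we on falling edges, so its value is stable well before the rising edge that samples it. The entity name latch_tb, the clock period and the check process are my own assumptions, not the original testbench.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity latch_tb is
end entity latch_tb;

architecture sim of latch_tb is
  signal wclk : std_logic := '0';
  signal we   : std_logic := '0';
  signal lwe  : std_logic := '0';
begin
  -- 10 ns clock, stopped after eight cycles.
  clk_gen : process
  begin
    for i in 1 to 8 loop
      wclk <= '0'; wait for 5 ns;
      wclk <= '1'; wait for 5 ns;
    end loop;
    wait;
  end process clk_gen;

  -- Stimulus changes on the falling edge, half a period before it is sampled.
  stimulus : process (wclk)
  begin
    if falling_edge(wclk) then
      we <= not we;
    end if;
  end process stimulus;

  -- Process under test, as in the question.
  store : process (wclk)
  begin
    if rising_edge(wclk) then
      lwe <= we;
    end if;
  end process store;

  -- On each falling edge, we still holds the value that was just sampled
  -- (its next change is only scheduled), so lwe should match it.
  check : process (wclk)
  begin
    if falling_edge(wclk) then
      assert lwe = we report "lwe did not capture we" severity error;
    end if;
  end process check;
end architecture sim;
```

Changing the stimulus process to use rising_edge(wclk) instead should reproduce the apparent one-cycle delay from the question.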
For a related topic in Verilog, see Nonblocking Assignments in Verilog Synthesis, Coding Styles That Kill!

The process you have is what you want according to your description, although 'we' should be removed from the sensitivity list. If this doesn't work as you believe it should it is almost certainly a problem with your test bench/simulation. (See Owen's answer.) Specifically you are probably changing the value of 'we' too late, so that the flip-flop latches the previous value instead of the new one.
I'm interested to know what the source of this signal is though, if it's an asynchronous signal that can change at any time you will have to add some logic to protect against metastability.
To answer your second question about latches, it is correct that omitting wclk'event will result in a latch. This process will not do what you want, however, because it will propagate changes to 'we' to 'lwe' during the whole positive half-period of the clock. The short answer to your question is that implementing this type of behavior requires a latch, while the behavior described by the original process requires a flip-flop.
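For the metastability protection mentioned above, a common approach is a two-flip-flop synchroniser. The following is a minimal sketch, assuming we comes from outside the wclk domain; the entity and signal names (sync_2ff, d_async, q_sync, meta) are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity sync_2ff is
  port (
    clk     : in  std_logic;
    d_async : in  std_logic;  -- may change at any time
    q_sync  : out std_logic   -- intended for use inside the clk domain
  );
end entity sync_2ff;

architecture rtl of sync_2ff is
  signal meta : std_logic := '0';
begin
  process (clk)
  begin
    if rising_edge(clk) then
      meta   <= d_async;  -- first stage may go metastable
      q_sync <= meta;     -- second stage gives it a full cycle to settle
    end if;
  end process;
end architecture rtl;
```

The cost is two clock cycles of latency before a change on d_async shows up on q_sync.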

Related

Why not a two-process state machine in VHDL?

When I learnt how to express finite state machines in VHDL, it was with a two-process architecture. One process handles the clock/reset signals, and another handles the combinatorial logic of updating the state and output. An example is below.
I've seen this style criticised (see the comments and answer to this question for example), but never in any detail. I'd like to know whether there are objective(ish) reasons behind this.
Are there technical reasons to avoid this style? Xilinx' synthesiser seems to detect it as a state machine (you can see it in the output, and verify the transitions), but do others struggle with it, or generate poor quality implementations?
Is it just not idiomatic VHDL? Remember to avoid opinion-based answers; if it's not idiomatic, is there a widely used teaching resource or reference that uses a different style? Idiomatic styles can also exist because, eg. there are classes of mistakes that are easy to catch with the right style, or because the code structure can better express the problem domain, or for other reasons.
(Please note that I'm not asking for a definition or demonstration of the different styles, I want to know if there are objective reasons to specifically avoid the two-process implementation.)
Example
Some examples can be found in Free Range VHDL (p89). Here's a super simple example:
library ieee;
use ieee.std_logic_1164.all;

-- Moore state machine that transitions from IDLE to WAITING, WAITING
-- to READY, and then READY back to WAITING each time the input is
-- detected as on.
entity fsm is
  port(
    clk    : in  std_logic;
    rst    : in  std_logic;
    input  : in  std_logic;
    output : out std_logic
  );
end entity fsm;

architecture fsm_arc of fsm is
  type state is (idle, waiting, ready);
  signal prev_state, next_state : state;
begin
  -- Synchronous/reset process: update state on clock edge and handle
  -- reset action.
  sync_proc: process(clk, rst)
  begin
    if (rst = '1') then
      prev_state <= idle;
    elsif (rising_edge(clk)) then
      prev_state <= next_state;
    end if;
  end process sync_proc;

  -- Combinatorial process: compute next state and output.
  comb_proc: process(prev_state, input)
  begin
    case prev_state is
      when idle =>
        output <= '0';
        if input = '1' then
          next_state <= waiting;
        else
          next_state <= idle;
        end if;
      when waiting =>
        output <= '1';
        if input = '1' then
          next_state <= ready;
        else
          next_state <= waiting;
        end if;
      when ready =>
        output <= '0';
        if input = '1' then
          next_state <= waiting;
        else
          next_state <= ready;
        end if;
    end case;
  end process comb_proc;
end fsm_arc;
(Note that I don't have access to a synthesiser right now, so there might be some errors in it.)
I always recommend one-process state machines because it avoids two classes of basic errors that are exceedingly common with beginners:
Missing items in the combinational process's sensitivity list cause the simulation to misbehave. Worse, the design still works in the lab, since most synthesizers don't care about the sensitivity list, so simulation and hardware no longer match.
Using one of the combinational results as an input instead of the registered version, causing unclocked loops or just long paths/skipped states.
Less importantly, the combinational process reduces simulation efficiency.
Less objectively, I find them easier to read and maintain; they require less boilerplate and I don't have to keep the sensitivity list in sync with the logic.
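For comparison, here is a sketch of how the example from the question might look with a single clocked process. The names fsm_one_proc, state_t and state are my own. The Moore output is decoded with a concurrent assignment outside the process so that it keeps the same timing as the original; assigning output inside the clocked process would register it and delay it by one cycle.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity fsm_one_proc is
  port (
    clk    : in  std_logic;
    rst    : in  std_logic;
    input  : in  std_logic;
    output : out std_logic
  );
end entity fsm_one_proc;

architecture rtl of fsm_one_proc is
  type state_t is (idle, waiting, ready);
  signal state : state_t;
begin
  fsm_proc : process (clk, rst)
  begin
    if rst = '1' then
      state <= idle;
    elsif rising_edge(clk) then
      -- Without an assignment the register holds its value,
      -- so only the actual transitions need to be listed.
      if input = '1' then
        case state is
          when idle    => state <= waiting;
          when waiting => state <= ready;
          when ready   => state <= waiting;
        end case;
      end if;
    end if;
  end process fsm_proc;

  -- Moore output decoded from the registered state.
  output <= '1' when state = waiting else '0';
end architecture rtl;
```

There is no next_state signal to read by mistake and no combinational sensitivity list to maintain, which are the two error classes listed above.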
The only two objective reasons I see are about readability and simulation efficiency in the case of Moore state machines (where primary outputs depend only on the current state, not the primary inputs).
Readability: if you merge together in a single process outputs and next state computations it might be more difficult to read / understand / maintain than with separate combinatorial processes for separate concerns.
Simulation efficiency: in the 2-processes solution your combinatorial process will be triggered on every primary input and / or current state change. This makes sense for the part of the process that computes the next state but not for the part that computes the outputs. The latter shall be triggered only on current state changes.
There is a detailed description about this in ref. [1] below. First, FSMs are classified in 3 categories (first time this is done), then each is examined thoroughly, with many complete examples. You can find the exact answer to your question on pages 107-115 for category 1 (regular) finite state machines; on pages 185-190 for category 2 (timed) machines; and on pages 245-248 for category 3 (recursive) state machines. The templates are described in detail for both Moore and Mealy version in each of the three categories.
[1] V. Pedroni, Finite State Machines in Hardware: Theory and Design (with VHDL and SystemVerilog), MIT Press, Dec. 2013.

VHDL: Mealy machine and button press detection

Hi I'm a bit confused about the implementation of Mealy state machine using VHDL. My current work is like this:
process(clk, rst)
begin
if rst = '1' then
state <= s1;
elsif (clk'event and clk = '1') then
state <= next_state;
end if;
end process;
and another process like this:
process(state, op)
begin
case state is
when s1 =>
...some implementation
end process;
And now the problem is: I need to detect the press of the button from the user, but I'm not sure where to put it. Should it be inside the first process or the second process? Besides, I also looked through the following guide: implement state machine in FPGA, is it okay to use just one process for the Mealy machine as shown on the webpage? If it is so then I think the work will be easier. Thanks!
You should put it in the second process. The first process is only used to change states and the next_state is also calculated in the second.
There are several ways to write FSMs and people tend to favour one or the other for various reasons. Pick the one that works for you.
You cannot design a Mealy state machine with only one process. Even Moore state machines, in most cases, cannot be modelled with only one process.
A state machine always has a state register, which must be modelled with a synchronous process. That is, a process whose sensitivity list contains only the clock (and the set or reset signals if they are asynchronous).
Every output of a synchronous process will synthesize as the output of a register because its value changes only on an edge of the clock (plus states of asynchronous set or reset if any). So, you cannot describe the outputs of a Mealy state machine in the same synchronous process as the state register. If you were doing so, it would not be a Mealy machine any more because its outputs would not combinationally depend on the inputs.
For Moore machines, things are a bit more subtle but, except in very exceptional cases, you also need at least two processes. When I write "process", I include process shorthands like concurrent signal assignments, concurrent procedure calls or component/entity instantiations.
To make it simple: VHDL modelling for synthesis is straightforward if you have a clear view of the hardware you want.
Draw a block diagram of your hardware with registers and combinatorial parts clearly identified.
Draw bubbles enclosing hardware elements, one bubble per process, respecting the rule that if a bubble contains a register, all its outputs must be register outputs.
The synchronous processes are those enclosing registers. Their code is exactly:
process(clk)
begin
if rising_edge(clk) then
<your code>
end if;
end process;
Put your code in <your code>, never put code elsewhere. If you have asynchronous set or reset the code must be something like:
process(clk, reset)
begin
if reset = '1' then
<initialize outputs>
elsif rising_edge(clk) then
<your code>
end if;
end process;
The other processes are combinatorial processes. List all their entering signals (INPUTS) and output signals (OUTPUTS). The code must be:
process(INPUTS)
begin
<your code>
end process;
with the constraint that each OUTPUT signal must be assigned a value in every execution of the process. The best way to guarantee this is to start the process with a default assignment of all OUTPUTS.
That's all. Draw and code what you see. Bonus: every arrow crossing the border of one of your process-bubbles is a signal that you will have to declare unless it is already a primary input or output of your design.
Exercise: draw the block diagram of a Mealy state machine and understand why it cannot be modelled with one single process. Understand also why it can always be modelled with two processes, even if it is not necessarily desirable. Finally, try to identify the rare cases where a Moore state machine can be modelled with one process only.
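As one possible worked instance of the templates above, here is a sketch of a two-process Mealy machine that detects a button press. It assumes the button input has already been synchronised to clk and debounced; the names press_detect, button, pressed, released and held are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity press_detect is
  port (
    clk     : in  std_logic;
    rst     : in  std_logic;
    button  : in  std_logic;  -- assumed synchronised and debounced
    pressed : out std_logic   -- high while a new press is first seen
  );
end entity press_detect;

architecture rtl of press_detect is
  type state_t is (released, held);
  signal state, next_state : state_t;
begin
  -- Synchronous process: the state register with asynchronous reset.
  state_reg : process (clk, rst)
  begin
    if rst = '1' then
      state <= released;
    elsif rising_edge(clk) then
      state <= next_state;
    end if;
  end process state_reg;

  -- Combinatorial process: next state and Mealy output.
  comb : process (state, button)
  begin
    -- Default assignments so every output is driven on every execution.
    next_state <= state;
    pressed    <= '0';
    case state is
      when released =>
        if button = '1' then
          pressed    <= '1';  -- depends on both the state and the input
          next_state <= held;
        end if;
      when held =>
        if button = '0' then
          next_state <= released;
        end if;
    end case;
  end process comb;
end architecture rtl;
```

Because pressed depends combinationally on button, it rises in the same cycle as the press and lasts until the next clock edge moves the machine to held.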

Creating a tachometer in VHDL

I have been assigned the task of creating a tachometer using VHDL to program a device. I have been provided with the pin to which an input signal will be connected, and from that I need to display the number of ones occurring per second (the frequency). Having only programmed in VHDL a couple of times previously, I am having difficulty figuring out how to implement the code:
So far I have constructed the following steps that the device needs to take
Count the logical ones in the input signal by creating a process depending on it
I did this by creating a process which is dependent on the input_signal and increments a variable when a high is present in the input_signal
counthigh:process(input_signal) -- CountHigh process
begin
  if (input_signal = '1') then
    current_count := current_count+1;
  end if;
end process; -- End process
Stop counting after a set amount of time and update the display with the frequency of the input_signal
I am unsure how to accomplish this using VHDL. I have provided a process from previous code which I used to implement a state machine. c_clk is a clock that operates at 5MHz/1024 (the timer div constant used) meaning that the period is equal to 2.048*10^-4 seconds. So the time between every rising edge is equal to that.
What I would like to do is wait for a set amount of rising_edges (I suppose I could define another variable and wait for a multiple of it to update the display and reset the current_count variable).
statereset:process -- StateReset process
begin
wait until rising_edge(c_clk); -- On each rising edge
if (reset='0') then
current_s <= s0; -- Default state on reset.
else
current_s <= next_s; -- Update the current state
end if;
end process; -- End process
From previous code I already have a entity called SevenSeg which I am able to manipulate to display the current frequency of the signal using basic mathematics.
I would just like to check that by making the counthigh process dependent on the input signal the process will 'wait' until the next std_logic_vector is available and read that instead of counting a high from the input_signal numerous times. Am I able to wait until there is a rising_edge(input_signal) in one process while making another process dependent on the clock rate?
If anyone has any ideas or feedback it would be greatly appreciated. I know I am asking an extremely broad and open-ended question but I am trying to figure out how to accomplish this task.
Cheers, NZBRU.
counthigh:process(input_signal) -- CountHigh process
begin
  if (input_signal = '1') then
    current_count := current_count+1;
  end if;
end process; -- End process
I understand what you are trying to achieve, but it won't work. In simulation, the process runs on every change of input_signal and increments the count each time the signal goes high, which is good, but this code won't synthesize.
A counter needs a clock, and a process with a clock needs a rising_edge. I expect your input to be of lower frequency than your operating clock, so I suggest you use an edge detector running on your clock. I will leave it as an exercise, but here's a good reference.
To wait 1 second or whatever else, use a counter. If your clock is 5MHz, use a signal to count from 0 to 4_999_999. When the counter is 4_999_999, reset the counter, the edge detector and update your display.
BTW, since you're a beginner, try to use signals instead of variables. Variables behave much like variables in programming languages, but there are a lot of pitfalls when they are used in synthesis. For a beginner, I suggest sticking to signals; once you're used to them and understand a little better how VHDL works, you can go back to using variables. In my own designs for synthesis, I have something like 95% signals, which is standard for FPGA designers.
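Putting the suggestions above together (edge detector, one-second counter, signals only), a rough sketch might look as follows. It assumes a 5 MHz clock as in the answer and a 24-bit result; the names tacho, CLK_HZ, freq, sync, rising, gate_cnt and edge_cnt are all hypothetical, and driving the display is left out.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity tacho is
  generic (
    CLK_HZ : positive := 5_000_000  -- assumed clock frequency in Hz
  );
  port (
    clk          : in  std_logic;
    input_signal : in  std_logic;
    freq         : out unsigned(23 downto 0)  -- edges counted in the last second
  );
end entity tacho;

architecture rtl of tacho is
  -- Bit 0 takes the raw input, bit 1 is the synchronised sample,
  -- bit 2 is the previous synchronised sample.
  signal sync     : std_logic_vector(2 downto 0) := (others => '0');
  signal rising   : std_logic;
  signal gate_cnt : natural range 0 to CLK_HZ - 1 := 0;
  signal edge_cnt : unsigned(23 downto 0) := (others => '0');
begin
  -- Edge detector: previous sample low, current sample high.
  rising <= '1' when sync(2) = '0' and sync(1) = '1' else '0';

  process (clk)
  begin
    if rising_edge(clk) then
      sync <= sync(1 downto 0) & input_signal;

      if gate_cnt = CLK_HZ - 1 then
        -- One second has elapsed: publish the count and start over.
        gate_cnt <= 0;
        freq     <= edge_cnt;
        if rising = '1' then
          edge_cnt <= to_unsigned(1, edge_cnt'length);  -- keep this edge
        else
          edge_cnt <= (others => '0');
        end if;
      else
        gate_cnt <= gate_cnt + 1;
        if rising = '1' then
          edge_cnt <= edge_cnt + 1;
        end if;
      end if;
    end if;
  end process;
end architecture rtl;
```

For the slower c_clk from the question, CLK_HZ would have to be set to the number of c_clk cycles per second instead, and the input would need to be slower than that clock for the edge detector to see every pulse.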

How are processes evaluated in practice?

I have two processes like below.
If say A=1, B=2 and C=3, what happens in simulation is on rising_edge B=1 and C=2, which is the result I want.
But am I guaranteed that this is also true when the design is implemented into an fpga?
What worries me is the delay associated with the extra if-statement in process BC.
AB : process(A,clk)
begin
if rising_edge(clk) then
B <= A;
end if;
end process;
BC : process(B,clk)
begin
if rising_edge(clk) then
if (some_statement) then
C <= B;
end if;
end if;
end process;
B will take on the value of A (B=1) and C will take on the value of B (C=2).
However, I guess you are not actually describing what you want to. The problem is that you have A and B in the sensitivity lists of the two processes. This means that process AB is also woken up each time A changes (and process BC each time B changes), although nothing is assigned then because rising_edge(clk) is false. A clocked process should only be sensitive to its clock. Assuming you want to describe two registers in series, your code should be
AB : process(clk)
begin
if rising_edge(clk) then
B <= A;
end if;
end process;
BC : process(clk)
begin
if rising_edge(clk) then
if (some_statement) then
C <= B;
end if;
end if;
end process;
In this case, if you synthesize this code onto an FPGA, you will infer two registers. The register in process BC will use the register's enable signal, which is connected to the boolean output of some_statement. If some_statement is already a single std_logic signal, this will not introduce additional delay but will require some routing resources, so you should still avoid using the enable signal where you don't really need it.
I think Simon answered the question perfectly, just to clarify the issue a bit further:
If initial values of your data is A=1, B=2 and C=3 then you will have the following during the simulation:
During start A=1, B=2 and C=3
After first rising edge of the clock A=1, B=1 and C=2
After second rising edge of the clock A=1, B=1 and C=1
After that, all signals will be 1.
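The sequence above can be checked with a small self-checking simulation. This is only a sketch: the entity name chain_tb is made up, A, B and C are modelled as integers, and some_statement is assumed to be true so the enable is left out.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity chain_tb is
end entity chain_tb;

architecture sim of chain_tb is
  signal clk : std_logic := '0';
  signal A   : integer := 1;
  signal B   : integer := 2;
  signal C   : integer := 3;
begin
  AB : process (clk)
  begin
    if rising_edge(clk) then
      B <= A;
    end if;
  end process AB;

  BC : process (clk)
  begin
    if rising_edge(clk) then
      C <= B;  -- reads the value B had before this edge
    end if;
  end process BC;

  stimulus : process
  begin
    wait for 5 ns;
    clk <= '1';  -- first rising edge
    wait for 1 ns;
    assert B = 1 and C = 2
      report "unexpected values after first edge" severity error;
    wait for 4 ns;
    clk <= '0';
    wait for 5 ns;
    clk <= '1';  -- second rising edge
    wait for 1 ns;
    assert B = 1 and C = 1
      report "unexpected values after second edge" severity error;
    wait;
  end process stimulus;
end architecture sim;
```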
The delay of the if statement must be more than 'clock period' - 'setup time necessary for internal registers' to cause any problem for you. Unless you have extremely complicated logic with signals from multiple clock domains there, there is little risk of running into problems with your code (the more accurate code is the one sent by Simon).
But am I guaranteed that this is also true when the design is implemented into an fpga?
The synthesizer ought to produce an FPGA which matches the behaviour of your VHDL in the simulator. If not, it's a bug!
Note that there are some "accepted" deviations - for example, if you miss a right-hand side signal off the sensitivity list, the synthesizer will assume you meant to put it there, but the simulator will assume you know what you're doing, and there will be a mismatch. Personally, I regard that behaviour as a bug, but it is too firmly entrenched by too many tools, I don't see it ever changing.
What worries me is the delay associated with the extra if-statement in process BC.
Everything in a clocked process like yours "executes" within a single clock tick. If there is too much logic (for example, each nested if introduces a new layer of logic), you may find that a clock tick has to last longer than you desire.
(Not like software on most modern micros, where everything "takes as long as it takes", and is often unpredictable, depending on the state of caches, TLBs etc.)

In VHDL when is the right time to use a Process statement?

I'm going through the phases of learning VHDL for the second or third time now (this time armed with a very good and free e-book), and I'm finally starting to "get" quite a bit of it. Now I'm learning about behavioral styles and the process statement, and most of it makes sense. However, I've read in many places that processes are to be avoided except in certain cases. I mean, in theory, can't everything be implemented in data-flow instead of behavioral?
When exactly should it be obvious that a process statement should be used?
The process statement is extremely useful, in what situations have you been told not to use them?
There are many different cases where you would use a process statement, I'll outline a few of these below:
One of the most common usages of the process statement (for synthesis) is to describe logic which is synchronous to a clock signal, for example a simple counter that increments every clock cycle when not in reset could be described as:
DATA_REGISTER : process(CLOCK)
begin
if rising_edge(CLOCK) then
if RESET = '1' then
COUNTER <= (others => '0');
else
COUNTER <= COUNTER + 1; --COUNTER is assumed to be of type 'unsigned'
end if;
end if;
end process;
As your designs grow more complex you will inevitably implement a state machine at some point, this will employ one or more processes depending on the style of state machine you choose to implement.
For behavioral code you can use processes in conjunction with wait statements to generate test vectors or to model the behaviour of a real system. Here's a really basic example of a 100MHz clock generator taken from one of my testbenches:
architecture BEH of ethernet_receive_tb is
signal s_clock : std_logic := '0'; --Initial assignment to clock kicks off the process.
begin
CLOCKGEN : process(s_clock)
begin
s_clock <= not s_clock after 5 NS;
end process CLOCKGEN;
...
You can also describe asynchronous logic with processes, in this case you need to include all signals which are read in the process in the sensitivity list and you need to make sure that any outputs are always defined to avoid inferred latches.
IF_ELSE: process (SEL, A, B)
begin
F <= B; -- Default assignment
if SEL = '1' then
F <= A;
end if;
end process;
Hopefully you can see that the process statement is very useful and that you will use it in many different situations. I hope this answered your question!
Process blocks are your friend.
They provide a way of saying "This block of code is related. Its inputs are X,Y,Z and it drives A,B,C". The inputs are documented by the sensitivity list (unless it's a clocked process, in which case they should be in your comments). If anything else drives the same signals then you'll get warnings, errors or X's in simulation (depending on your tools). Whatever you get, it's pretty obvious.
Personally I would be quite happy writing multiple processes in a single entity, but everyone has their styles. For example, if I have multiple pipe-line stages, each stage is a process. If I have parallel non-interfering paths each will be in a separate process. By doing it this way the code is structured in small, easy to read blocks. Small simple logic synthesizes into small fast blocks (in general).
You could view my style as using them as lightweight entities.
In synthesisable code, processes are required any time you need to keep information from one clock cycle to another. "To store state" in the jargon.
(Note that a process can be implied by code such as
d <= q when rising_edge(clk);
)
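To show what "implied" means here, the sketch below spells out a process that should behave the same way as that one-line concurrent assignment. The entity name implied_reg is hypothetical, and the port names d and q simply follow the snippet (so q is the input and d the registered output).

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity implied_reg is
  port (
    clk : in  std_logic;
    q   : in  std_logic;
    d   : out std_logic
  );
end entity implied_reg;

architecture rtl of implied_reg is
begin
  -- Explicit form of: d <= q when rising_edge(clk);
  -- The concurrent statement is sensitive to both clk and q, but
  -- nothing is assigned unless there is a rising edge on clk.
  reg : process (clk, q)
  begin
    if rising_edge(clk) then
      d <= q;
    end if;
  end process reg;
end architecture rtl;
```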
In non-synthesisable code, processes are useful for getting events to happen in a particular order:
p1: process
begin
data <= "--------";
WE <= '0';
wait until reset = '1';
wait until processor_initialised = '1';
assert ACK = '0' report "ACK should be low!" severity error;
data <= X"16";
WE <= '1';
wait until ACK = '1';
end process;
Most of my code has a single process per entity. Each entity does some useful, well-defined and small-enough-to-be-testable task.

Resources