VHDL: Mealy machine and button press detection - vhdl

Hi I'm a bit confused about the implementation of Mealy state machine using VHDL. My current work is like this:
process(clk, rst)
begin
if rst = '1' then
state <= s1;
elsif (clk'event and clk = '1') then
state <= next_state;
end if;
end process;
and another process like this:
process(state, op)
begin
case state is
when s1 =>
...some implementation
end process;
And now the problem is: I need to detect the press of the button from the user, but I'm not sure where to put it. Should it be inside the first process or the second process? Besides, I also looked through the following guide: implement state machine in FPGA, is it okay to use just one process for the Mealy machine as shown on the webpage? If it is so then I think the work will be easier. Thanks!

You should put it in the second process. The first process is only used to change states and the next_state is also calculated in the second.
There are several ways to write FSMs and people tend to favour one or the other for various reasons. Pick the one that works for you.

You cannot design a Mealy state machine with only one process. Even Moore state machines, in most cases, cannot be modelled with only one process.
A state machine always has a state register which must be modelled with a synchronous process. That is, a process which sensitivity list contains only the clock (and set or reset signals if they are asynchronous).
Every output of a synchronous process will synthesize as the output of a register because its value changes only on an edge of the clock (plus states of asynchronous set or reset if any). So, you cannot describe the outputs of a Mealy state machine in the same synchronous process as the state register. If you were doing so, it would not be a Mealy machine any more because its outputs would not combinationally depend on the inputs.
For Moore machines, things are a bit more subtle but, except in very exceptional cases, you also need at least two processes. When I write "process", I include processes short-hands like concurrent signal assignments, concurrent procedure calls or component/entity instantiations.
To make it simple: VHDL modelling for synthesis is straightforward if you have a clear view of the hardware you want.
Draw a block diagram of your hardware with registers and combinatorial parts clearly identified.
Draw bubbles enclosing hardware elements, one bubble per process, respecting the rule that if a bubble contains a register, all its outputs must be register outputs.
The synchronous processes are those enclosing registers. Their code is exactly:
process(clk)
begin
if rising_edge(clk) then
<your code>
end if;
end process;
Put your code in <your code>, never put code elsewhere. If you have asynchronous set or reset the code must be something like:
process(clk, reset)
begin
if reset = '1' then
<initialize outputs>
elsif rising_edge(clk) then
<your code>
end if;
end process;
The other processes are combinatorial processes. List all their entering signals (INPUTS) and output signals (OUTPUTS). The code must be:
process(INPUTS)
begin
<your code>
end process;
with the constraint that each OUTPUT signal must be assigned a value in every execution of the process. The best way to guarantee this is to start the process with a default assignment of all OUTPUTS.
That's all. Draw and code what you see. Bonus: every arrow crossing the border of one of your process-bubbles is a signal that you will have to declare unless it is already a primary input or output of your design.
Exercise: draw the block diagram of a Mealy state machine and understand why it cannot be modelled with one single process. Understand also why it can always be modelled with two processes, even if it is not necessarily desirable. Finally, try to identify the rare cases where a Moore state machine can be modelled with one process only.

Related

Why not a two-process state machine in VHDL?

When I learnt how to express finite state machines in VHDL, it was with a two-process architecture. One process handles the clock/reset signals, and another handles the combinatorial logic of updating the state and output. An example is below.
I've seen this style criticised (see the comments and answer to this question for example), but never in any detail. I'd like to know whether there are objective(ish) reasons behind this.
Are there technical reasons to avoid this style? Xilinx' synthesiser seems to detect it as a state machine (you can see it in the output, and verify the transitions), but do others struggle with it, or generate poor quality implementations?
Is it just not idiomatic VHDL? Remember to avoid opinion-based answers; if it's not idiomatic, is there a widely used teaching resource or reference that uses a different style? Idiomatic styles can also exist because, eg. there are classes of mistakes that are easy to catch with the right style, or because the code structure can better express the problem domain, or for other reasons.
(Please note that I'm not asking for a definition or demonstration of the different styles, I want to know if there are objective reasons to specifically avoid the two-process implementation.)
Example
Some examples can be found in Free Range VHDL (p89). Here's a super simple example:
library ieee;
use ieee.std_logic_1164.all;
-- Moore state machine that transitions from IDLE to WAITING, WAITING
-- to READY, and then READY back to WAITING each time the input is
-- detected as on.
entity fsm is
port(
clk : in std_logic;
rst : in std_logic;
input : in std_logic;
output : out std_logic
);
end entity fsm;
architecture fsm_arc of fsm is
type state is (idle, waiting, ready);
signal prev_state, next_state : state;
begin
-- Synchronous/reset process: update state on clock edge and handle
-- reset action.
sync_proc: process(clk, rst)
begin
if (rst = '1') then
prev_state <= idle;
elsif (rising_edge(clk)) then
prev_state <= next_state;
end if;
end process sync_proc;
-- Combinatorial process: compute next state and output.
comb_proc: process(prev_state, input)
begin
case prev_state is
when idle =>
output <= '0';
if input = '1' then
next_state <= waiting;
else
next_state <= idle;
end if;
when waiting =>
output <= '1';
if input = '1' then
next_state <= ready;
else
next_state <= waiting;
end if;
when ready =>
output <= '0';
if input = '1' then
next_state <= waiting;
else
next_state <= ready;
end if;
end case;
end process comb_proc;
end fsm_arc;
(Note that I don't have access to a synthesiser right now, so there might be some errors in it.)
I always recommend one-process state machines because it avoids two classes of basic errors that are exceedingly common with beginners:
Missing items in the combinational process's sensitivity list cause the simulation to misbehave. It even works in the lab since most synthesizers don't care about the sensitivity list.
Using one of the combinational results as an input instead of the registered version, causing unclocked loops or just long paths/skipped states.
Less importantly, the combinational process reduces simulation efficiency.
Less objectively, I find them easier to read and maintnain; they require less boiler plate and I don't have to keep the sensitivity list in sync with the logic.
The only two objective reasons I see are about readability and simulation efficiency in the case of Moore state machines (where primary outputs depend only on the current state, not the primary inputs).
Readability: if you merge together in a single process outputs and next state computations it might be more difficult to read / understand / maintain than with separate combinatorial processes for separate concerns.
Simulation efficiency: in the 2-processes solution your combinatorial process will be triggered on every primary input and / or current state change. This makes sense for the part of the process that computes the next state but not for the part that computes the outputs. The latter shall be triggered only on current state changes.
There is a detailed description about this in ref. [1] below. First, FSMs are classified in 3 categories (first time this is done), then each is examined thoroughly, with many complete examples. You can find the exact answer to your question on pages 107-115 for category 1 (regular) finite state machines; on pages 185-190 for category 2 (timed) machines; and on pages 245-248 for category 3 (recursive) state machines. The templates are described in detail for both Moore and Mealy version in each of the three categories.
[1] V. Pedroni, Finite State Machines in Hardware: Theory and Design (with VHDL and SystemVerilog), MIT Press, Dec. 2013.

Creating a tachometer in VDHL

I have been assigned the task of creating a tachometer using VDHL to program a device. I have been provided with the pin in which an input signal will be connected and from that need to display the frequency of ones occurring per second (the frequency). Having only programmed in VHDL a couple of times previously I am having difficulty figuring out how to implement the code:
So far I have constructed the following steps that the device needs to take
Count the logical ones in the input signal by creating a process depending on it
I did this by creating a process which is dependent on the input_singal and increments a variable when a high is present in the input_signal
counthigh:process(input_signal) -- CountHigh process
begin
if (input signal = '1') then
current_count := current_count+1;
end if;
end process; -- End process
Stop counting after a set amount of time and update the display with the frequency of the input_signal
I am unsure how to accomplish this using VHDL. I have provided a process from previous code which I used to implement a state machine. c_clk is a clock that operates at 5MHz/1024 (the timer div constant used) meaning that the period is equal to 2.048*10^-4 seconds. So the time between every rising edge is equal to that.
What I would like to do is wait for a set amount of rising_edges (I suppose I could define another variable and wait for a multiple of it to update the display and reset the current_count variable).
statereset:process -- StateReset process
begin
wait until rising_edge(c_clk); -- On each rising edge
if (reset='0') then
current_s <= s0; -- Default state on reset.
else
current_s <= next_s; -- Update the current state
end if;
end process; -- End process
From previous code I already have a entity called SevenSeg which I am able to manipulate to display the current frequency of the signal using basic mathematics.
I would just like to check that by making the counthigh process dependent on the input signal the process will 'wait' until the next std_logic_vector is available and read that instead of counting a high from the input_signal numerous times. Am I able to wait until there is a rising_edge(input_singal) in one process while making another process dependent on the clock rate?
If anyone has any ideas or feedback it would be greatly appreciated. I know I am asking an extremely broad and open-ended question but I am trying to figure out how to accomplish this task.
Cheers, NZBRU.
counthigh:process(input_signal) -- CountHigh process
begin
if (input signal = '1') then
current_count := current_count+1;
end if;
end process; -- End process
I understand what you are trying to achieve, but it won't work. In simulation, it will count each time input_signal goes high or low, which is good, but this code won't synthesize.
A counter needs a clock, and a process with a clock need a rising_edge. I expect your input to be of lower frequency than your operating clock, so I suggest you use an edge detector running using your clock. I will leave it as an exercise, but here's a good reference.
To wait 1 second or whatever else, use a counter. If your clock is 5MHz, use a signal to count from 0 to 4_999_999. When the counter is 4_999_999, reset the counter, the edge detector and update your display.
BTW, since your a beginner, try to use signals instead of variables. Variables have a similar behavior to programming languages, but they are a lot of pitfalls when used in synthesis. For a beginner, I suggest to stick to signals, once you're used to them and understand a little better how VHDL works, you can go back to using variables. In my own design for synthesis, I have something like 95% signals, which is standard for FPGA designers.

VHDL state machine differences (for synthesization)

I am taking a class on embedded system design and one of my class mates, that has taken another course, claims that the lecturer of the other course would not let them implement state machines like this:
architecture behavioral of sm is
type state_t is (s1, s2, s3);
signal state : state_t;
begin
oneproc: process(Rst, Clk)
begin
if (Rst = '1') then
-- Reset
elsif (rising_edge(Clk)) then
case state is
when s1 =>
if (input = '1') then
state <= s2;
else
state <= s1;
end if;
...
...
...
end case;
end if;
end process;
end architecture;
But instead they had to do like this:
architecture behavioral of sm is
type state_t is (s1, s2, s3);
signal state, next_state : state_t;
begin
syncproc: process(Rst, Clk)
begin
if (Rst = '1') then
--Reset
elsif (rising_edge(Clk)) then
state <= next_state;
end if;
end process;
combproc: process(state)
begin
case state is
when s1 =>
if (input = '1') then
next_state <= s2;
else
next_state <= s1;
end if;
...
...
...
end case;
end process;
end architecture;
To me, who is very inexperienced, the first method looks more fool proof since everything is clocked and there is less (no?) risk of introducing latches.
My class mate can't give me any reason for why his lecturer would not let them use the other way of implementing it so I'm trying to find the pros and cons of each.
Is any of them prefered in industry? Why would I want to avoid one or the other?
The single process form is simpler and shorter. This alone reduces the chance that it contains errors.
However the fact that it also eliminates the "incomplete sensitivity list" problem that plagues the other's combinational process should make it the clear winner regardless of any other considerations.
And yet there are so many texts and tutorials advising the reverse, without properly justifying that advice or (in at least one case I can't find atm) introducing a silly mistake into the single process form and rejecting the entire idea on the grounds of that mistake.
The only thing (AFAIK) the single-process form doesn't do well is un-clocked outputs. These are (IMO) poor practice anyway as they can be races at the best of times, and could be handled by a separate combinational process for that output only if you really had to.
I'm guessing there was originally some practical reason behind it; maybe a mid-1990s synthesis tool that couldn't reliably handle the single process form, and that made it into the original documentation that the lecturers learned from. Like those blasted non-standard std_logic_arith libraries. And so the myth has been perpetuated...
Those same lecturers would probably have a fit if they saw what can pass through a modern synthesis tool : integers, enumerations, record types, loops, functions and procedures updating signals (Xilinx ISE is now fine with these. Some versions of Synplicity have trouble with functions, but accept an identical procedure with an Out parameter).
One other comment : I prefer if Rst = '1' then over if (Rst = '1') then. It looks less like line noise (or C).
I agree with Brian on this. The only issue with the one process state machine is you cannot have un-clocked outputs, which is an issue if you need 0 latency on input to output. Otherwise the one process model helps to minimize bugs as it clearly relates the outputs to the state.
I was taught the two process model in school, but have discovered that the one process model is what is generally accepted in industry. I believe the reasoning for using the two process model in school is it gives students an understanding of how the placement of combinational logic relative to registers changes based on how the code is written (which IMO is very important when starting out) and what it means for their design. However simply forcing you to use the two process model with no explanation does not accomplish this.
the first method looks more fool proof since everything is clocked and there is less (no?) risk of introducing latches.
Yes, the first method where everything is clocked has no chance of introducing latches. It may introduce flipflops, but that's fine.
The 2nd method can introduce true asynchronous latches, which even in the best case are not very well handled by the back end FPGA tools I've used, and are not supported at all in some architectures, so would have to be built out of gates or lookup-tables.
In addition, if you get your sensitivity list wrong in the second process, your simulation can differ from your synthesis result! This is because synthesisers (for reasons I've given up trying to understand) treat the sensitivity list as if it were populated with all the signals you read (completely ignoring the VHDL language spec in the process) whereas the simulator will do exactly what you said.
Ugh. I hate the dual process state machine thing personally. He is probably an old guy and this was the most reliable way to do it 20 years ago. The tools understand your way and I personally like that approach better.
Your classmate is absolutely right. The problem here is that your question is not complete. The reason for your colleague's code to be better than yours is that people normally define the output values and the next state values in the same process, as shown below (this is the same as your own code, just with the output values added to it, which results in a "bad"code):
elsif (rising_edge(clk)) then
case state is
when s1 =>
--define outputs:
outp1 <= ...;
outp2 <= ...;
...
--define next state:
if (input = '1') then
state <= s2;
else
state <= s1;
end if;
when s2 =>
...
...
end case;
end if;
Recall that in an FSM the output is produced by the combinational logic section, therefore it is memoryless. However, in the code above, the output values get registered, which is not what the FSM must produce. Indeed, registering the outputs is a case-by-case decision EXTERNAL to the FSM (the outputs could be registered, for example, for glitch removal, which is a PARTICULAR, PLANNED decision, not a FORCED situation, as in the code above).

Latch signal without delay

I would like to latch a signal, however when I try to do so, I get a delay of one cycle, how can I avoid this?
myLatch: process(wclk, we) -- Can I ommit the we in the sensitivity list?
begin
if wclk'event and wclk = '1' then
lwe <= we;
end if;
end process;
However if I try this and look into the waves during simulation lwe is delayed by one cycle of wclk. All I want tp achieve is to sample we on the rising edge of wclk and keep it stable till the next rising edge. I then assign the latched signal to another entities port map which is defined in the architecture.
==============================================
Well I figured out that I have to omit the wclk'event to get a latch instead of a flip flop. This seems rather unintuitive to me. By simply shortening the time where I sample the signal to be latched I go from latch to flip flop. Can anyone explain why this is and where my perception is wrong. (I am a vhdl beginner)
First off, a few observations on the process you pasted above:
myLatch: process(wclk, we)
begin
if wclk'event and wclk = '1' then
lwe <= we;
end if;
end process;
The signal we can be omitted from the sensitivity list because you have described a clocked process. The only signals required in the sensitivity list of a process like this are the clock and the asynchronous reset if you choose to use one (a synchronous reset would not need to be added to the sensitivity list).
Instead of using if wclk'event and wclk = '1' then you should instead use if rising_edge(wclk) then or if falling_edge(wclk) then, there's a good blog post on the reasons why here.
By omitting the wclk'event you changed the process from a clocked process to a combinatorial process, like so:
myLatch: process(wclk, we)
begin
if wclk = '1' then
lwe <= we;
end if;
end process;
In a combinatorial process all inputs should be present in the sensitivity list, so you would be correct to have both wclk and we in the list as they had an influence on the output. Normally you would ensure that lwe is assigned in all cases of your if statement to avoid inferring a latch, however this appears to be your intention in this case.
Latches in general should be avoided, so if you find yourself needing one you should perhaps pause and consider your approach. Doulos have a couple of articles on latches here and here that you might find useful.
You stated that all you want to achieve is to sample we on the rising edge of wclk and keep it stable until the next rising edge. The process below will accomplish this:
store : process(wclk)
begin
if rising_edge(wclk) then
lwe <= we;
end if;
end process;
With this process, lwe will be updated with the value of we upon every rising edge of wclk and it will remain valid for a single clock cycle.
Let me know if this clears things up for you.
Believe it or not, the issue is actually in your testbench. This has to do with how the VHDL simulation model works.
VHDL is usually used for synchronous hardware design -- that means, using flip-flops that sample on the rising edge and set outputs on the falling edge, so that there are no race conditions between reading and writing. But in VHDL this master/slave logic is not actually simulated using opposite clock edges.
Consider a process
process (clock) begin
if rising_edge(clock) then
a <= b;
end if;
end process;
At the start of a simulation timestep, if clock has just risen, the if will execute. Then the assignment a <= b will be executed, and this will not immediately cause an assignment to take place, but schedule the assignment for the end of the timestep.
After all processes have been run, then all scheduled assignments take place. This means that no process will "see" the new value of a until the next timestep.
Time a b Actions
Start of ts 1 '0' '1' a <= '1' is scheduled
End of ts 1 '1' '0' a <= '1' is executed
Start of ts 2 '1' '0' a <= '0' is scheduled
End of ts 2 '0' '1' a <= '0' is executed
So when you look on the waveform viewer, what you will see is a apparently being set on the rising edge of the clock, and following b delayed by one clock cycle; you don't see the intermediate scheduling of assignments that causes this to happen.
Of course, in real life, there is no "end of the timestep", and the actual changing of signal a happens when the slave part of the flip-flop triggers, ie, on the negative edge. (Maybe it would have been less confusing for VHDL to just use the negative edge; but, oh well, this is how it works).
Here are two testbenches for your latch code:
Test bench 1, using rising edges
Test bench 2, using falling edges
In the first, if you look in the waveform viewer you will see exactly what you describe -- lwe appears to be delayed by 1 clock cycle -- but really, the delay is happening in the non-blocking assignment that sets counter -- so when the rising edge happens, we does not actually have its new value yet. And in the second, you see no such delay; lwe is set exactly on the rising edge to the value of we at that time.
For a related topic in Verilog, see Nonblocking Assignments in Verilog Synthesis, Coding Styles That Kill .
The process you have is what you want according to your description, although 'we' should be removed from the sensitivity list. If this doesn't work as you believe it should it is almost certainly a problem with your test bench/simulation. (See Owen's answer.) Specifically you are probably changing the value of 'we' too late, so that the flip-flop latches the previous value instead of the new one.
I'm interested to know what the source of this signal is though, if it's an asynchronous signal that can change at any time you will have to add some logic to protect against metastability.
To answer your second question about latches, it is correct that omitting wclk'event will result in a latch. This process will not do what you want, however, because it will propagate changes to 'we' to 'lwe' during the whole positive half-period of the clock. The short answer to your question is that implementing this type of behavior requires a latch, while the behavior described by the original process requires a flip-flop.

In VHDL when is the right time to use a Process statement?

I'm going through the phases of learning VHDL for the second or third time now. (this time armed with a very good and free e-book ) and I'm finally starting to "get" quite a bit of it. Now I'm learning about behavioral styles and the process statement and most of it makes sense. However, I've read in many places that processes are to be avoided except for in certain cases. I mean, in theory can't everything be implemented in data-flow instead of behavioral?
When exactly should it be obvious that a process statement should be used?
The process statement is extremely useful, in what situations have you been told not to use them?
There are many different cases where you would use a process statement, I'll outline a few of these below:
One of the most common usages of the process statement (for synthesis) is to describe logic which is synchronous to a clock signal, for example a simple counter that increments every clock cycle when not in reset could be described as:
DATA_REGISTER : process(CLOCK)
begin
if rising_edge(CLOCK) then
if RESET = '1' then
COUNTER <= (others => '0');
else
COUNTER <= COUNTER + 1; --COUNTER is assumed to be of type 'unsigned'
end if;
end if;
end process;
As your designs grow more complex you will inevitably implement a state machine at some point, this will employ one or more processes depending on the style of state machine you choose to implement.
For behavorial code you can use processes in conjunction with wait statements to generate test vectors or to model the behaviour of a real system. Here's a really basic example of a 100MHz clock generator taken from one of my testbenches:
architecture BEH of ethernet_receive_tb is
signal s_clock : std_logic := '0'; --Initial assignment to clock kicks off the process.
begin
CLOCKGEN : process(s_clock)
begin
s_clock <= not s_clock after 5 NS;
end process CLOCKGEN;
...
You can also describe asynchronous logic with processes, in this case you need to include all signals which are read in the process in the sensitivity list and you need to make sure that any outputs are always defined to avoid inferred latches.
IF_ELSE: process (SEL, A, B)
begin
F <= B; -- Default assignment
if SEL = '1' then
F <= A;
end if;
end process;
Hopefully you can see that the process statement is very useful and that you will use it in many different situations. I hope this answered your question!
Process blocks are your friend.
They provide a way of saying "This block of code is related. It's inputs are X,Y,Z and it drives A,B,C". The inputs are documented by the sensitivity list (unless it's a clocked process in which case it should be in your comments). If anything else drives the same signals then you'll get warnings, errors, X's in simulation (depending on your tools). Whatever you get it's pretty obvious.
Personally I would be quite happy writing multiple processes in a single entity, but everyone has their styles. For example, if I have multiple pipe-line stages, each stage is a process. If I have parallel non-interfering paths each will be in a separate process. By doing it this way the code is structured in small, easy to read blocks. Small simple logic synthesizes into small fast blocks (in general).
You could view my style as using them as lightweight entities.
In synthesisable code, processes are required any time you need to keep information from one clock cycle to another. "To store state" in the jargon.
(Note that a process can implied by code such as
d <= q when rising_edge(clk);
)
If non-synthesisable code, processes are useful for getting events to happen in a particular order:
p1: process
begin
data <= "--------";
WE <= '0';
wait until reset = '1';
wait until processor_initialised = '1';
assert ACK = '0' report "ACK should be low!" severity error;
data <= X"16";
WE <= '1';
wait until ACK = '1';
end process;
Most of my code has a single process per entity. Each entity does some useful, well-defined and small-enough-to-be-testable task

Resources