I'm trying to implement some logic that is either triggered on a rising or falling edge of the same clock, depending on a clock polarity signal. I tried the following but got an error message in Quartus 15.1 (error id: 10628): "Can't implement register for two clock edges combined with a binary operator".
process(clk)
begin
if (rising_edge(clk) and pol='0') or (falling_edge(clk) and pol='1') then
-- logic
end if;
end process;
My current workaround is to just write the whole logic block twice, once with each condition separately. But since the logic block isn't that small and still under development this really isn't optimal. Does anyone have a better solution to this?
Edit: One easy way to achieve this would be a gated clock, but from what I understand, this is generally a bad idea, so I tried to do it without a mux or xor in the clock path.
Edit: I misunderstood the question. The below only applies if pol is a constant:
Try the oldschool way of writing this, with separate event attribute and level:
if clk'event and clk = pol then
-- logic
end if;
If pol is '1', this synthesises to the same thing as rising_edge(clk).
Related
I am working on projects which requires synthesis of my RTL codes specifically for ASIC development. Given the case, how much important is it, to separate sequential logic from differential logic while designing my RTLs ? And if it is important, then what should be my approach while designing, as if how should I differentiate my design for sequential and combinational logic?
I generally will separate sequential and combinational as much as possible until it results in too much code (which is usually rare, and possibly an indication of a poor design), or when something just makes more sense (which again is rare, but happens).
This segregation usually aids in initial design, brain->rtl->synthesis (what you think you're making is actually what is synthesized), CDC evaluation for multiclock designs, verification, and other things as well. It's hard for me to give a great example of something bad versus something I would call good, but here is a stab at it.
Say I have a counter that I want to reset on a certain value. I could do this (which I tend to see from people who have a strong software and/or FPGA background)
always #(posedge clk or posedge reset) begin
if(reset) begin
count <= 0;
end else begin
if((count == reset_count_val) || (~enable)) count <= 0;
else count <= count + 1;
end
end
Or I would do this (which is what I would personally do):
//Combinational Path
assign count_in = enable ? ((count == reset_count_val) ? 4'd0 : count_in + 4'd1) : 4'd0;
//Sequential Path
always #(posedge clk or posedge reset) begin
if(reset) count <= 4'd0;
else count <= count_in;
end
The above I would agree is more typing, and to some, more difficult to read. It does however split up the circuit in a way that allows me to easier see what happens on each clock edge. I know that "count_in" is being setup prior to the posedge of clk. I can easily see (as well as anyone else looking at the code) that I would expect a MUX for the reset or addition of count based on the reset_count_val, with a final MUX for gating the count based on the enable signal. Now you could see the same with the first bit of code, however IMO it's not as clear. When you look at a sim, you can see what count_in looks like prior to the rising edge of the clk. This could aid you if the conditional statement for the count_in was rather complex.
Let's say you sent this through synthesis and place and route and you get a timing violation between the Q of the count reg and the D of the count reg (since you have a loopback based on the addition). It would generally be easier to see which path is causing the issue with the 2nd batch of code. This depends on the tool (Primetime more than likely). CDC could also be easier because let's say that reset_count_val is coming from a static registers in another clock domain. The tool may try to synthesize/elaborate the OR in the 1st batch of code thinking the reset_count_val and enable are somewhat related giving you a weird looking CDC violation. Again, sometimes it's hard to come up with an example that exercises all of the "why you shouldn't do this" type of cases.
As an example about splitting combinational and sequential, I inherited a design where someone had a state machine that was written with the combiational and sequential in the same block (always #(posedge clk) where the if/else's would get deep and have next state logic in it). I'm no genius, but after days of staring at that thing and running sims, I could just not figure out what it was doing. It was quite large as well. I simply redid the design, keeping the same algorithm, but splitting up the logic in the format I described here. Even with me adding some features, the size went down ~15%. Other engineers who had the same problem of being lost with the other design, could now understand what was going on with it. This won't always be the case, but more than often it is.
TLDR;
I try to be extremely descriptive when designing RTL that is to be used in an ASIC. The more abstract the code, the more likely it is to build something you don't want, or is more complex than needed. Verification is often much easier, particularly when you get to gate sims. While I am in the camp of "the less code the better" that is not always the case with verilog, especially on an ASIC.
In general, I would not hesitate to mix combinational with sequential logic. I come from an IC design background and have always mixed up combinational logic with sequential. I think that you are restricting yourself too much if you don't and are not fully using the power of your logic synthesiser.
For example, here is how I would design a simple, asynchronously-reset counter in VHDL:
process (Clock, Reset)
begin
if Reset = '1' then
Cnt <= (others => '0');
elsif Rising_edge(Clock) then
if Enable = '1' then
Cnt <= Cnt + 1;
end if;
end if;
end process;
This style of writing a counter in VHDL is ubiquitous. I personally can see no advantage to splitting the code up into two separate processes, one sequential the other combinational. I have just taught a room full of engineers to design a counter in exactly this way.
Here are some exceptions. I would split the combinational logic from the sequential logic if:
i) I were designing a state machine:
There is what I think is a really elegant way of coding a state machine, where you do split the combinational logic from the sequential:
Registers: process (Clock, Reset)
begin
if Reset = '1' then
State <= Idle;
elsif Rising_edge(Clock) then
State <= NextState;
end if;
end process Registers;
Combinational: process (State, Inputs)
begin
NextState <= State;
Output1 <= '0';
Output2 <= '0';
-- etc
case State is
when Idle =>
if Inputs(1) = '1' then
NextState <= State2;
end if;
when State2=>
Output1 <= '1';
if Inputs = "00" then
NextState <= State3;
end if;
-- etc
end case;
end process Combinational;
The advantage of coding a state machine like this is that the combinational process looks very like the state diagram. It is less error prone to write, less error prone to modify and less error prone to read.
ii) The combinational logic were complex:
For really big block of combinational logic, I would separate. The exact definition of "really big" is a matter of judgement.
iii) The combinational logic were on the Q output of a flip-flop:
Any signal driven in a sequential process infers a flip-flop. Therefore, if you wish to implement combinational logic that drives an output of an entity* then this combinational logic must be in a separate process.
*often not a good idea - be careful.
I would post it as comment if I could, as I am not writing full answer, but giving you a source, and also I don't fully refere to the question (I don't know anything about ASICs). But there is great pdf about this problem in general here. Generally, you don't have to completely separate sequential logic from differential logic, but it is helpful to write more readable and maintainable code.
Mixing sequential and combo condenses the code which almost always makes it easier to understand.
Separating makes ECOs easier.
Which you choose is a matter of personal style and organizational coding conventions and standards.
I am implementing a digital design in VHDL which has to be low power. The design has a lot of inputs that are declared as multiple standard logic vectors. The device is allowed to wake up if anything changes on any input. This has to be combinatorical logic because the device is in power down. The code of what I am trying to do says it all: (ToggleSTDBY is a signal so this is legal)
P_Wakeup: PROCESS (VEC1, VEC2, VEC3, Rst_N) IS
BEGIN
IF Rst_N = '0' THEN
ToggleSTDBY <= '0';
ELSIF VEC1'event OR VEC2'event OR VEC3'event THEN
ToggleSTDBY <= NOT(ToggleSTDBY);
END IF;
END PROCESS P_Wakeup;
This is legal in simulation, but upon synthesis it says "'event is only supported for single bit signals". How can I fix this? There are a total of 66 bits in the vectors all together and I really don't want to write 66 processes for waking the device up. A bitwise OR on all bits will not solve anything, since most signals will be high, so the OR on all bits will always result in a high. The following code:
P_Wakeup: PROCESS (VEC, Rst_N) IS
BEGIN
IF Rst_N = '0' THEN
ToggleSTDBY <= '0';
ELSE
FOR i IN VEC'RANGE LOOP
IF VEC(i)'EVENT THEN
ToggleSTDBY <= NOT(ToggleSTDBY);
END IF;
END LOOP;
END IF;
END PROCESS P_Wakeup;
gives error "The prefix of signal attribute 'EVENT must be a static signal name". How can I fix it AND keep the code readable?
The HDL part of VHDL is abbreviation for Hardware Description Language (HDL),
so the VHDL constructions you can use must be possible to map by the synthesis
tool to the target. The use of 'event is typically tied to sequential
(clocked) hardware elements like flip flops or RAMs, and the synthesis tool
typically requires that you write the VHDL in a specific way, so the tool can
identify that a particular hardware elements is to be used.
Even though you may write VHDL code for a simulator, for example ModelSim, that
can compile and simulate use of 'event as in your examples, the synthesis
tool will typically not be able to map this to any available target hardware
element, since there is probably no such hardware elements as an 'event
detector.
But an 'event actually indicates a change in signal value, so you can maybe
write the signal value change detector explicitly in VHDL like:
change <= '1' if (vec_now /= vec_previous) else '0';
Depending on the low-power hardware elements available, you may start the clock
when an '1' is detected asynchronously on change, maybe through
ToggleSTDBY, and then process the change. The last thing before going into
sleep mode is then to capture the current vec value in vec_previous, so
another change can be detected while in sleep mode.
The possibility for doing low-power design of the kind I assume you are doing
based on the description, depends entirely on the features provided in the
target FPGA/ASIC technology. So before trying to get the VHDL syntax right,
you may want to determine how the resulting hardware should look like, based on
the available low power features.
Even if it is possible to write a VHDL code that models your intended behavior, I believe it won't work as you expect. I suggest that before writing the code you try to sort out the details of how exactly your ToggleSTDBY would be set, tested, and reset (a circuit diagram could help).
If you decide to implement ToggleSTDBY as a vector, one solution for the "event is only supported for single bit signals" problem is to move the loop to outside the process, using a for-generate:
gen: for i in ToggleSTDBY'range generate
p_wakeup : process (vec, rst_n) is
begin
if rst_n = '0' then
ToggleSTDBY(i) <= '0';
else
if vec(i)'event then
ToggleSTDBY(i) <= not (ToggleSTDBY(i));
end if;
end if;
end process p_wakeup;
end generate;
I am taking a class on embedded system design and one of my class mates, that has taken another course, claims that the lecturer of the other course would not let them implement state machines like this:
architecture behavioral of sm is
type state_t is (s1, s2, s3);
signal state : state_t;
begin
oneproc: process(Rst, Clk)
begin
if (Rst = '1') then
-- Reset
elsif (rising_edge(Clk)) then
case state is
when s1 =>
if (input = '1') then
state <= s2;
else
state <= s1;
end if;
...
...
...
end case;
end if;
end process;
end architecture;
But instead they had to do like this:
architecture behavioral of sm is
type state_t is (s1, s2, s3);
signal state, next_state : state_t;
begin
syncproc: process(Rst, Clk)
begin
if (Rst = '1') then
--Reset
elsif (rising_edge(Clk)) then
state <= next_state;
end if;
end process;
combproc: process(state)
begin
case state is
when s1 =>
if (input = '1') then
next_state <= s2;
else
next_state <= s1;
end if;
...
...
...
end case;
end process;
end architecture;
To me, who is very inexperienced, the first method looks more fool proof since everything is clocked and there is less (no?) risk of introducing latches.
My class mate can't give me any reason for why his lecturer would not let them use the other way of implementing it so I'm trying to find the pros and cons of each.
Is any of them prefered in industry? Why would I want to avoid one or the other?
The single process form is simpler and shorter. This alone reduces the chance that it contains errors.
However the fact that it also eliminates the "incomplete sensitivity list" problem that plagues the other's combinational process should make it the clear winner regardless of any other considerations.
And yet there are so many texts and tutorials advising the reverse, without properly justifying that advice or (in at least one case I can't find atm) introducing a silly mistake into the single process form and rejecting the entire idea on the grounds of that mistake.
The only thing (AFAIK) the single-process form doesn't do well is un-clocked outputs. These are (IMO) poor practice anyway as they can be races at the best of times, and could be handled by a separate combinational process for that output only if you really had to.
I'm guessing there was originally some practical reason behind it; maybe a mid-1990s synthesis tool that couldn't reliably handle the single process form, and that made it into the original documentation that the lecturers learned from. Like those blasted non-standard std_logic_arith libraries. And so the myth has been perpetuated...
Those same lecturers would probably have a fit if they saw what can pass through a modern synthesis tool : integers, enumerations, record types, loops, functions and procedures updating signals (Xilinx ISE is now fine with these. Some versions of Synplicity have trouble with functions, but accept an identical procedure with an Out parameter).
One other comment : I prefer if Rst = '1' then over if (Rst = '1') then. It looks less like line noise (or C).
I agree with Brian on this. The only issue with the one process state machine is you cannot have un-clocked outputs, which is an issue if you need 0 latency on input to output. Otherwise the one process model helps to minimize bugs as it clearly relates the outputs to the state.
I was taught the two process model in school, but have discovered that the one process model is what is generally accepted in industry. I believe the reasoning for using the two process model in school is it gives students an understanding of how the placement of combinational logic relative to registers changes based on how the code is written (which IMO is very important when starting out) and what it means for their design. However simply forcing you to use the two process model with no explanation does not accomplish this.
the first method looks more fool proof since everything is clocked and there is less (no?) risk of introducing latches.
Yes, the first method where everything is clocked has no chance of introducing latches. It may introduce flipflops, but that's fine.
The 2nd method can introduce true asynchronous latches, which even in the best case are not very well handled by the back end FPGA tools I've used, and are not supported at all in some architectures, so would have to be built out of gates or lookup-tables.
In addition, if you get your sensitivity list wrong in the second process, your simulation can differ from your synthesis result! This is because synthesisers (for reasons I've given up trying to understand) treat the sensitivity list as if it were populated with all the signals you read (completely ignoring the VHDL language spec in the process) whereas the simulator will do exactly what you said.
Ugh. I hate the dual process state machine thing personally. He is probably an old guy and this was the most reliable way to do it 20 years ago. The tools understand your way and I personally like that approach better.
Your classmate is absolutely right. The problem here is that your question is not complete. The reason for your colleague's code to be better than yours is that people normally define the output values and the next state values in the same process, as shown below (this is the same as your own code, just with the output values added to it, which results in a "bad"code):
elsif (rising_edge(clk)) then
case state is
when s1 =>
--define outputs:
outp1 <= ...;
outp2 <= ...;
...
--define next state:
if (input = '1') then
state <= s2;
else
state <= s1;
end if;
when s2 =>
...
...
end case;
end if;
Recall that in an FSM the output is produced by the combinational logic section, therefore it is memoryless. However, in the code above, the output values get registered, which is not what the FSM must produce. Indeed, registering the outputs is a case-by-case decision EXTERNAL to the FSM (the outputs could be registered, for example, for glitch removal, which is a PARTICULAR, PLANNED decision, not a FORCED situation, as in the code above).
I would like to latch a signal, however when I try to do so, I get a delay of one cycle, how can I avoid this?
myLatch: process(wclk, we) -- Can I ommit the we in the sensitivity list?
begin
if wclk'event and wclk = '1' then
lwe <= we;
end if;
end process;
However if I try this and look into the waves during simulation lwe is delayed by one cycle of wclk. All I want tp achieve is to sample we on the rising edge of wclk and keep it stable till the next rising edge. I then assign the latched signal to another entities port map which is defined in the architecture.
==============================================
Well I figured out that I have to omit the wclk'event to get a latch instead of a flip flop. This seems rather unintuitive to me. By simply shortening the time where I sample the signal to be latched I go from latch to flip flop. Can anyone explain why this is and where my perception is wrong. (I am a vhdl beginner)
First off, a few observations on the process you pasted above:
myLatch: process(wclk, we)
begin
if wclk'event and wclk = '1' then
lwe <= we;
end if;
end process;
The signal we can be omitted from the sensitivity list because you have described a clocked process. The only signals required in the sensitivity list of a process like this are the clock and the asynchronous reset if you choose to use one (a synchronous reset would not need to be added to the sensitivity list).
Instead of using if wclk'event and wclk = '1' then you should instead use if rising_edge(wclk) then or if falling_edge(wclk) then, there's a good blog post on the reasons why here.
By omitting the wclk'event you changed the process from a clocked process to a combinatorial process, like so:
myLatch: process(wclk, we)
begin
if wclk = '1' then
lwe <= we;
end if;
end process;
In a combinatorial process all inputs should be present in the sensitivity list, so you would be correct to have both wclk and we in the list as they had an influence on the output. Normally you would ensure that lwe is assigned in all cases of your if statement to avoid inferring a latch, however this appears to be your intention in this case.
Latches in general should be avoided, so if you find yourself needing one you should perhaps pause and consider your approach. Doulos have a couple of articles on latches here and here that you might find useful.
You stated that all you want to achieve is to sample we on the rising edge of wclk and keep it stable until the next rising edge. The process below will accomplish this:
store : process(wclk)
begin
if rising_edge(wclk) then
lwe <= we;
end if;
end process;
With this process, lwe will be updated with the value of we upon every rising edge of wclk and it will remain valid for a single clock cycle.
Let me know if this clears things up for you.
Believe it or not, the issue is actually in your testbench. This has to do with how the VHDL simulation model works.
VHDL is usually used for synchronous hardware design -- that means, using flip-flops that sample on the rising edge and set outputs on the falling edge, so that there are no race conditions between reading and writing. But in VHDL this master/slave logic is not actually simulated using opposite clock edges.
Consider a process
process (clock) begin
if rising_edge(clock) then
a <= b;
end if;
end process;
At the start of a simulation timestep, if clock has just risen, the if will execute. Then the assignment a <= b will be executed, and this will not immediately cause an assignment to take place, but schedule the assignment for the end of the timestep.
After all processes have been run, then all scheduled assignments take place. This means that no process will "see" the new value of a until the next timestep.
Time a b Actions
Start of ts 1 '0' '1' a <= '1' is scheduled
End of ts 1 '1' '0' a <= '1' is executed
Start of ts 2 '1' '0' a <= '0' is scheduled
End of ts 2 '0' '1' a <= '0' is executed
So when you look on the waveform viewer, what you will see is a apparently being set on the rising edge of the clock, and following b delayed by one clock cycle; you don't see the intermediate scheduling of assignments that causes this to happen.
Of course, in real life, there is no "end of the timestep", and the actual changing of signal a happens when the slave part of the flip-flop triggers, ie, on the negative edge. (Maybe it would have been less confusing for VHDL to just use the negative edge; but, oh well, this is how it works).
Here are two testbenches for your latch code:
Test bench 1, using rising edges
Test bench 2, using falling edges
In the first, if you look in the waveform viewer you will see exactly what you describe -- lwe appears to be delayed by 1 clock cycle -- but really, the delay is happening in the non-blocking assignment that sets counter -- so when the rising edge happens, we does not actually have its new value yet. And in the second, you see no such delay; lwe is set exactly on the rising edge to the value of we at that time.
For a related topic in Verilog, see Nonblocking Assignments in Verilog Synthesis, Coding Styles That Kill .
The process you have is what you want according to your description, although 'we' should be removed from the sensitivity list. If this doesn't work as you believe it should it is almost certainly a problem with your test bench/simulation. (See Owen's answer.) Specifically you are probably changing the value of 'we' too late, so that the flip-flop latches the previous value instead of the new one.
I'm interested to know what the source of this signal is though, if it's an asynchronous signal that can change at any time you will have to add some logic to protect against metastability.
To answer your second question about latches, it is correct that omitting wclk'event will result in a latch. This process will not do what you want, however, because it will propagate changes to 'we' to 'lwe' during the whole positive half-period of the clock. The short answer to your question is that implementing this type of behavior requires a latch, while the behavior described by the original process requires a flip-flop.
I have used below statement, frequently. However, I wonder
if ( clock'event and clock = '1' ) then
[do something]
we really need to write clock'event in above statement ?
If yes, why?
You could get the simulation to work perfectly without the clock'event condition, but the synthesis will come out wrong.
The IEEE standard on synthesizable VHDL requires that you add clock'event.
It is commonly accepted good practice to write if rising_edge(clock) instead. This conveys your intention a lot better. Both rising_edge and falling_edge functions are allowed as synthesizable VHDL constructs.
For simulation:
process (clock) is
-- stuff
begin
if clock='1' then -- EVIL! don't do this
-- do things
end if;
end process;
Assuming that clock just switches from '0' to '1' and back (no meta-values), the behavior would be identical to what you'd get with a clock'event condition. Again, this will not synthesize to what you want! You'll probably get a latch, not a D flip-flop.
(Bonus points for whomever tries to synthesize this and gets back with the results!)
Yes, otherwise the following code executes the entire time your clock signal is high, not just at the rising edge of the clock.