Ensuring propagation is complete in VHDL without an explicit clock - vhdl

I am looking to build a VHDL circuit which responds to an input as fast as possible, meaning I don't have an explicit clock to clock signals in and out if I don't absolutely need one. However, I am also looking to avoid "bouncing" where one leg of a combinatorial block of logic finishes before another.
As an example, the expression B <= A xor not not A should clearly never assign true to B. However, in a real implementation, the two not gates introduce delays which permit the output of that expression to flicker after A changes but before the not gates have propagated that change. I'd like to "debounce" that circuit.
The usual, and obvious, solution is to clock the circuit, so that one never observes a transient value. However, for this circuit, I am looking to avoid a dependence on a clock signal, and only have a network of gates.
I'd like to have something like:
x <= not A;   -- line 1
y <= not x;   -- line 2
z <= A xor y; -- line 3
B <= z;       -- line 4
such that I guarantee that line 4 occurs after line 3.
The tricky part is that I am not doing this in one block, as the exposition above might suggest. The true behavior of the circuit is defined by two or more separate components which are using signals to communicate. Thus once the signal chain propagates into my sub-circuit, I see nothing until the output changes, or doesn't change!
In effect, the final product I'm looking for is a procedure which can be "armed" by the inputs changing, and "triggered" by the sub-circuit announcing its outputs are fully changed. I'd like the result to be synthesizable so that it responds to the implementation technology. If it's on an FPGA, it always has access to a clock, so it can use that to manage the debouncing logic. If it's being implemented as an ASIC, the signals would have to be delayed such that any procedure which sees the "triggered" signal is 100% confident that it is seeing updated outputs from that circuit.
I see very few synthesizable approaches to such a procedural "A happens-before B" behavior. wait seems to be the "right" tool for the job, but is typically only synthesizable for use with explicit clock signals.
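For reference, the transient in the B <= A xor not not A example can be reproduced in simulation. This is only a sketch: the entity name glitch_demo, the 1 ns gate delays and the stimulus times are assumptions picked for illustration, and the after clauses are simulation-only, so this shows the problem rather than a synthesizable fix.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Simulation-only sketch (hypothetical names and delays).
entity glitch_demo is
end entity glitch_demo;

architecture sim of glitch_demo is
  -- Initial values chosen so the network starts in a settled state.
  signal A : std_logic := '0';
  signal x : std_logic := '1';
  signal y : std_logic := '0';
  signal B : std_logic := '0';
begin
  -- Each gate is given an assumed 1 ns delay.
  x <= not A   after 1 ns;
  y <= not x   after 1 ns;
  B <= A xor y after 1 ns;

  stim : process
  begin
    wait for 10 ns;
    A <= '1';                 -- input changes at t = 10 ns
    wait for 2 ns;            -- t = 12 ns: y has not caught up with A yet
    assert B = '1'
      report "no transient observed on B" severity error;
    wait for 5 ns;            -- t = 17 ns: both legs have propagated
    assert B = '0'
      report "B did not settle back to '0'" severity error;
    wait;
  end process stim;
end architecture sim;
```

With these assumed delays, B is high for roughly 2 ns after A changes, which is exactly the flicker I would like downstream logic never to observe.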


Does a time delay in a sequential logic block influence synthesis or place-and-route results?

I use Xilinx ISE as my IDE.
Suppose I add a 100 ps delay to every assignment in an always block (Verilog) or process (VHDL) whose sensitivity list contains only clock and reset.
Like this:
always @(posedge clk)
  if (reset)            // reset condition assumed
    a <= #100 'd0;
  else
    a <= #100 b;
I think the delay only affects simulation, because every book and user guide says that delays are not synthesizable.
But I am still wondering whether the delay can actually affect the place-and-route results, such as the static timing or clock report.
For example, could it make the circuit's maximum frequency higher or lower?
No, the #delay in your code is not going to affect the timing of the design when it is loaded onto the FPGA.
It also does not affect the place and route results or the static timing analysis. Both of these steps use timing information that is provided by the manufacturer in the form of device models.
You are correct that there's nothing intrinsic about delay statements that makes them unsynthesizable, however it's wildly impractical to attempt to do so. The reason for this is that once on the FPGA you are dealing with a physical circuit whose performance varies with PVT (process, voltage, temperature) and can do so by a lot! The only hedge against this would be an analog circuit that attempts to sense all of the above and adjust itself accordingly. Such a beast will still be limited in what it can do, and would be physically large and power hungry depending on the range of delay and the variance in all of the above you want to support.
So with that in mind, and considering that there is very little (read: no) demand for this outside of special-purpose IO, FPGA vendors don't provide any such components, making the construct unsynthesizable.
Delay statements (#100) are usually ignored during synthesis in Verilog. So in synthesis it is the same as:
always @(posedge clk)
  if (reset)            // reset condition assumed
    a <= 0;
  else
    a <= b;
The Xilinx Synthesis and Simulation Design Guide states:
Delays in Synthesis Code
Do not use Wait for XX ns (VHDL) or the #XX (Verilog) statements in
your code. (...) This statement does not synthesize to a component.
In designs that include this construct, the functionality of the
simulated design does not always match the functionality of the
synthesized design.
Wait for XX ns Statement Verilog Coding Example
Do not use the After XX ns statement in your VHDL code or the Delay
assignment in your Verilog code
Delay Assignment Verilog Coding Example
assign #XX Q=0;
XX specifies the number of nanoseconds that must pass before a
condition is executed. This statement is usually ignored by the
synthesis tool. In this case, the functionality of the simulated
design does not match the functionality of the synthesized design.
"Usually" there is no impact on synthesis and P&R results.
Xilinx: This statement is usually ignored by the synthesis tool.
When does it have impact then?
Although the delay statement is ignored by the synthesis tool, the HDL code is a little bit different. That may change the seed of randomization in any stage (parsing, elaboration, synthesis etc.), so there is a possibility for different results. These results may be better or worse.
If a delay statement exists in the code, the following warning is expected from Xilinx ISE:
WARNING:Xst:916 - design.v line x: Delay is ignored for synthesis.

write a vhdl process to model a 4 by 2 encoder with registered output and reset

I am confused by what "registered output" means. I know how to code an encoder in VHDL, but I don't know what the question means by registered output.
Registered means stored, in a flipflop. Imagine combinatorial logic:
A = B and C
When B or C change, it takes a finite amount of time for A to reflect this change. A small amount of time indeed, which quickly increases as the complexity of this logic increases. If B and C themselves would depend on a bunch of other combinatorial (and, or, xor, whatever non-clocked) logic, they wouldn't change simultaneously, A might toggle a few times before reaching its final state and worst of all, it would get difficult to predict when A would reach that final state. Certainly when considering all possible effects altering the time required by the logic, e.g. temperature. The longer the combinatorial chain, the greater becomes the influence of temperature.
That is why we restrict the length of combinatorial chains and clock the result in a flipflop to resynchronize intermediate signals, so as to have a predictable, well-behaving system.
A registered output means that the output is driven by a flipflop and one does not need to worry about any combinatorial logic on that path. The result comes out within the delay specs of that flipflop after a clock edge, and the variation due to temperature/voltage/process will be as good as it gets.
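As a rough illustration of what that looks like for the encoder in the question, here is a minimal sketch. The entity name, the port names, the one-hot input encoding, the active-high asynchronous reset and the "00" default for invalid inputs are all assumptions, since the assignment text does not specify them.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 4-to-2 encoder whose output comes straight from flipflops.
entity encoder4to2_reg is
  port (
    clk   : in  std_logic;
    reset : in  std_logic;                      -- assumed active-high, asynchronous
    d     : in  std_logic_vector(3 downto 0);   -- assumed one-hot input
    q     : out std_logic_vector(1 downto 0)    -- registered encoded output
  );
end entity encoder4to2_reg;

architecture rtl of encoder4to2_reg is
begin
  process (clk, reset)
  begin
    if reset = '1' then
      q <= "00";
    elsif rising_edge(clk) then
      -- The encoding is combinatorial, but its result is only
      -- captured into q on the clock edge.
      case d is
        when "0001" => q <= "00";
        when "0010" => q <= "01";
        when "0100" => q <= "10";
        when "1000" => q <= "11";
        when others => q <= "00";   -- assumed default for non-one-hot inputs
      end case;
    end if;
  end process;
end architecture rtl;
```

Because q is only assigned under rising_edge(clk) (or reset), a synthesizer should infer two flipflops for it, so any glitching in the encoding logic never reaches the output.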

Match Simulation and Post-Synthesis Behavior in VHDL

This question is an extension of another one shown here, VHDL Process Confusion with Sensitivity Lists
However, having less than 50 Rep points, I was unable to comment for further explanation.
So, I ran into the same problem from the link and accept the answer shown. However, now I am interested in what the recommended approach is in order to match simulation with post-synthesis behavior. The accepted answer in the link states that Level Sensitive Latches are not recommended as a solution due to them causing more problems. So my question is what is the recommended approach? Is there one?
In other words, I want to achieve what that post was trying to achieve, but in a way that won't cause more problems. I need my sensitivity list to not be ignored by my synthesis tools.
Also, I am new to VHDL, so it may be that using processes is not the right way to achieve the result I want. I am using a DE2-115 with Quartus Prime 16.0. Any information will be greatly appreciated.
If you are using VHDL to program your FPGA-based prototyping board you are interested in the synthesis semantics of the language. It is rather different from the simulation semantics described in the Language Reference Manual (LRM). Even worse: it is not standardized and varies between synthesis tools. Anyway, synthesis means translation from the VHDL code to digital hardware. The only recommended approach here, for a beginner who does not yet clearly understand the synthesis semantics, is:
Think hardware first, code next.
In other words, draw a nice block diagram of the hardware you want on a sheet of paper. And use the following 10 rules. Strictly. No exceptions. Never. And do not forget to carefully check the last one, it is as essential as the others but a bit more difficult to verify.
1. Surround your drawing with a large rectangle. This is the boundary of your circuit. Everything that crosses this boundary is an input or output port. The VHDL entity will describe this boundary.
2. Clearly separate edge-triggered registers (e.g. square blocks) from combinatorial logic (e.g. round blocks).
3. Do not use level-triggered latches.
4. Use only rising-edge triggered registers and use the same single clock for all of them. Its name is clock. It comes from the outside and is an input of all square blocks and only them. Do not even represent the clock, it is the same for all square blocks and you can leave it implicit in your diagram.
5. Represent the communications between blocks with named and oriented arrows. For the block an arrow comes from, the arrow is an output signal. For the block an arrow goes to, the arrow is an input signal.
6. Arrows have one single origin but they can have several destinations. If an arrow has several destinations, fork the arrow as many times as needed.
7. Some arrows come from outside the large rectangle. These are the input ports of the entity. An input arrow cannot also be the output of any of your blocks.
8. Some arrows go outside. These are the output ports. An output arrow has one single origin and one single destination: the outside. No forks on output arrows. So, an output arrow cannot be also the input of one of your blocks. If you want to use an output arrow as an input for some of your blocks, insert a new round block to split it in two parts: the input of the new block, with as many forks as you wish, and the output arrow that comes from the new block and goes outside. The new block will become a simple continuous assignment in VHDL. A kind of transparent renaming.
9. All arrows that do not come or go from/to the outside are internal signals. You will declare them all in the architecture.
10. Every cycle in the diagram must comprise at least one square block.
If you cannot find a way to describe the function you want with this approach, the problem is with the function you want. Not with VHDL or the synthesizer. It means that the function you want is not digital hardware. Implement it using another technology.
The VHDL coding becomes a detail:
one synchronous process per square block,
one combinatorial process per round block.
A synchronous process looks like this:
process(clock)
begin
  if rising_edge(clock) then
    o1 <= i1;
    ...
    on <= in;
  end if;
end process;
where i1, i2,..., in are all arrows that enter the corresponding square block of your diagram and o1, ..., on are all arrows that output the corresponding square block of your diagram. Do not change anything, except the names of the signals. Nothing. Not even a single character. OK?
A combinatorial process looks like this:
process(i1, i2,... , in)
  <declarations>
begin
  o1 <= <default_value_for_o1>;
  ...
  om <= <default_value_for_om>;
  <statements>
end process;
where i1, i2,..., in are all arrows that enter the corresponding round block of your diagram. all and no more. Do not forget a single arrow and do not add anything else. There are no exceptions. Never. And where o1, ..., om are all arrows that output the corresponding round block of your diagram. all and no more. Do not change anything except <declarations>, the names of the inputs, the names of the outputs, the values of the <default_value_for_oi> and <statements>. Do not forget a single default value assignment. If you had to create a new round block to split a primary output arrow, the corresponding process is just:
process(i)
begin
  o <= i;
end process;
which you can simplify as:
o <= i;
without the enclosing process declaration. It is the equivalent concurrent signal assignment.
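To make the two templates concrete, here is a small sketch of a 4-bit counter with an enable input, written as one round block, one square block and one renaming assignment. The entity name, the signal names count_d and count_q, the width and the initial value on the register are all assumptions made for this illustration (the initial value is only there so that simulation does not start from 'U').

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example: one round block, one square block.
entity counter is
  port (
    clock  : in  std_logic;
    enable : in  std_logic;
    count  : out unsigned(3 downto 0)
  );
end entity counter;

architecture rtl of counter is
  -- Internal arrows of the diagram.
  signal count_d : unsigned(3 downto 0);                     -- round block output
  signal count_q : unsigned(3 downto 0) := (others => '0');  -- square block output
begin
  -- Round block: every input arrow is in the sensitivity list and
  -- the output gets a default value before anything else.
  process(enable, count_q)
  begin
    count_d <= count_q;
    if enable = '1' then
      count_d <= count_q + 1;
    end if;
  end process;

  -- Square block: exactly the synchronous template.
  process(clock)
  begin
    if rising_edge(clock) then
      count_q <= count_d;
    end if;
  end process;

  -- Transparent renaming, so the output port is never read internally.
  count <= count_q;
end architecture rtl;
```

The only cycle in this diagram (count_q to count_d and back) goes through the square block, so the requirement that every cycle contains a register is met.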
Once you are comfortable with this coding style, and only then, you will:
Skip the drawing for simple designs. But continue thinking hardware first. Draw in your head instead of on a sheet of paper but continue drawing.
Use asynchronous resets:
process(clock, reset)
begin
  if reset = '1' then
    o <= reset_value_for_o;
  elsif rising_edge(clock) then
    o <= i;
  end if;
end process;
Merge several combinatorial processes in one single combinatorial process. This is trivial and is just a simple reorganization of the block diagram.
Merge some combinatorial processes with synchronous processes. But in order to do this you must go back to your block diagram and add an eleventh rule:
Group several round blocks and at least one square block by drawing an enclosure around them. Also enclose the arrows that can be. Do not let an arrow cross the boundary of the enclosure if it does not come or go from/to outside the enclosure. Once this is done, look at all the output arrows of the enclosure. If any of them comes from a round block of the enclosure or is also an input of the enclosure, you cannot merge these processes in a synchronous process.
And later on you will also start using latches, falling clock edges, multiple clocks and resynchronizers between clock domains... But we will discuss these when the time comes.

Do all Flip Flops in a design need to be resettable (ASIC)?

I'm trying to understand clock-reset in a chip. In a design what criteria are used to decide whether a flop should be assigned to a value (typically to zero) during reset?
always_ff @(posedge clk or negedge reset) begin : process_w_reset
  if (~reset) begin
    flop1 <= '0;
  end else begin
    if (condition) begin
      flop1 <= something;
    end
  end
end

always_ff @(posedge clk) begin : process_wo_reset
  if (condition) begin
    flop1 <= something;
  end
end
Is it bad practice not to reset a flop which is used later as a control signal in comb logic? What if the design makes sure that the flop will have a valid value (0 or 1) assigned to it before it's used in a comb logic block (i.e. in an if statement or in FSM comb logic)?
I feel like it's better to always reset all the flops in the design. That way there won't be any Xs in the chip after reset. However, it seems that for datapath logic, not resetting a flop might not be a big deal, as these are just pipe stages. If a flop is in the control path (i.e. FSM next state comb logic), though, it should be reset to a default value. Is my understanding correct? I don't know much about DFT and am not sure whether it has any other implications.
Assuming that reset means asynchronous reset, as in the code examples.
The answer is partly opinion based, since a design can be made to work with reset of a minimum number of the Flip-Flops (FFs) and all of the FFs.
I suggest that a minimum number of FFs are reset, and typically that leads to reset of most FFs in the control path, and no reset of FFs in the data path. The advantages of this approach are outlined below.
Simulation is often conservative with respect to propagation of uninitialized values, both for Verilog and VHDL, so it is like simulation can check both 0 and 1 values at once when the value is uninitialized.
Bugs due to FFs that are not reset, are therefore likely to show earlier in verification with simulation, and the designer thereby gets valuable feedback about wrong design assumptions, which may lead to corrections in the design that fixes other bugs. Just resetting all the FFs is likely to hide such bugs.
It may seem like design and verification is just easier if all FFs are reset, both in control and data path, since it fixes all those "annoying" X propagation in the design. But it requires an increased number of tests in order to verify all value combinations when X propagation is suppressed through reset.
Implementation gives a smaller load on the reset signal, so it is easier to meet timing of the reset net throughout the chip.
For DFT (Design For Test) in general, adding reset to the FFs will not aid in finding nets stuck at the reset value. With a DFT scan chain approach, where all the FFs are loaded through the scan chain, the lack of reset on some FFs will not require more vectors.
Generally you need to think about where the 'X's will propagate in your simulation and which ones matter and which ones are don't care conditions. For example, if you have a block of logic which doesn't start operating until an enable bit is set, so long as the enable bit itself is reset and enough upstream logic is reset so reset values will propagate through to the enabled logic in time, you are most likely OK with not resetting the logic in between. However, you do want to reset any logic that feeds back into itself (for example state machines), otherwise the upstream resets will never be able to establish a known state in the feedback block.
I agree with Morten Zilmer that you should only reset flops that require resetting, although my background is more FPGA than ASIC.
It's worth pointing out there is a gotcha in Verilog / SystemVerilog - if you have a clocked process that drives registers that are reset and registers that aren't you will end up inferring a clock enable or an additional mux on the input of your flip-flop.
This is usually not what was intended.
There is a more detailed explanation in this answer. I also wrote a blog post outlining a mechanism for abstracting away synchronous/asynchronous and active high/low reset.
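As far as I know the same effect shows up in VHDL when a reset and a non-reset register share one if reset ... elsif rising_edge(clk) process, because the non-reset register has to hold its value while reset is asserted. A sketch of one way to sidestep it is to give the two kinds of flop their own processes. All names here (mixed_reset, reset_n, ctrl_q, data_q) are hypothetical, and the active-low asynchronous reset is an assumption taken from the question's code.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: reset and non-reset flops kept in separate processes.
entity mixed_reset is
  port (
    clk, reset_n     : in  std_logic;   -- reset_n assumed active-low, asynchronous
    ctrl_in, data_in : in  std_logic;
    ctrl_q, data_q   : out std_logic
  );
end entity mixed_reset;

architecture rtl of mixed_reset is
begin
  -- Control flop: has a reset.
  process(clk, reset_n)
  begin
    if reset_n = '0' then
      ctrl_q <= '0';
    elsif rising_edge(clk) then
      ctrl_q <= ctrl_in;
    end if;
  end process;

  -- Data flop: no reset. Being in its own process, it is not gated by
  -- the reset branch above, so no extra enable or mux should be inferred.
  process(clk)
  begin
    if rising_edge(clk) then
      data_q <= data_in;
    end if;
  end process;
end architecture rtl;
```

If data_q were instead assigned inside the first process under the elsif branch only, it would have to keep its old value whenever reset_n is '0', which is where the unintended clock enable comes from.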
As a general rule of thumb, you should probably always reset control signals.
For data flops, resetting can cost you area, so it really depends on whether you care about area.
In recent years simulators started to support X propagation modes that allow you to catch some of the X issues in RTL (instead of gate level simulation). It is a good practice to run these to make sure you don't have a reset problem with uninitialized sram or flops.

variable assignment and synthesizable code

Simply having code like this:
if rising_edge(clk) then
  temp(0) := temp(3) xor temp(5);
end if;
For the example above, all of these variable assignments would be done in one clock cycle, which seems pretty impractical. In behavioral simulation it works fine, but post-synthesis it's messed up. Can I add a delay or something like a wait (the wait statement is not synthesizable) to make it wait until the variable gets its value before moving on to the next line?
Doing all of those things in one clock cycle is simple. Hardware is extremely fast, and FPGA clock rates aren't that high relative to processors.
Since you are using variables, the intermediate results are used immediately. If you want a more explicit delay, you could use a signal. The above code with signals would use temp(3) from the previous rising edge.
For synthesis you cannot make delays with wait. Well-defined, controllable delays in synthesis can only be made with pipelining (clock cycles as delay units).
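To illustrate the variable versus signal difference mentioned above, here is a minimal sketch. The entity and all names in it (var_vs_sig, t_var, t_sig, y_var, y_sig) are made up for the example and are not taken from the question's code.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example comparing a variable and a signal in a clocked process.
entity var_vs_sig is
  port (
    clk   : in  std_logic;
    a, b  : in  std_logic;
    y_var : out std_logic;
    y_sig : out std_logic
  );
end entity var_vs_sig;

architecture rtl of var_vs_sig is
  signal t_sig : std_logic;
begin
  process(clk)
    variable t_var : std_logic;
  begin
    if rising_edge(clk) then
      -- Variable: updated immediately, so y_var sees this edge's a xor b.
      -- Result: one register stage between the inputs and y_var.
      t_var := a xor b;
      y_var <= t_var;

      -- Signal: the new value only takes effect after the process suspends,
      -- so y_sig sees t_sig from the previous edge.
      -- Result: two register stages, i.e. one extra clock cycle of delay.
      t_sig <= a xor b;
      y_sig <= t_sig;
    end if;
  end process;
end architecture rtl;
```

So if an explicit, well-defined delay is wanted, the signal version gives one clock cycle of it, which is the pipelining approach described above.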
