Here is the scenario:
I have a register with enable (call it RegA). The input of RegA is pulled high permanently.
Meanwhile, the enable line of RegA is connected to the output of RegB through some simple combinational logic.
Now in the scenario, on the next clock pulse the output of RegB will go high for just one clock cycle.
My question is, will I see the output of RegA go high in the same clock cycle that RegB goes high, OR will RegA go high on the next clock cycle, OR is it possible that it may never go high due to a race condition?
From experience, I feel like RegA will go high on the same clock cycle that RegB goes high; however, I'm wondering if this is bad practice and unreliable. I'm thinking there could be a race condition between the signal reaching the enable line and the clock edge arriving at RegA. Since the enable line goes through some combinational logic, it seems it would lose that race every time, and thus RegA wouldn't recognize that the enable line is high in the same clock cycle that RegB goes high.
I'm assuming that the enables you are talking about are clock enables? In this case you will get a one clock cycle delay before RegA goes high, if I understand you correctly. Explanation:
RegA will only react to a clock edge if its enable input is active when the edge arrives. However, since RegB has some internal clock-to-output delay, and since there is additional combinational delay from its output to RegA's enable, the active signal won't "make it" to RegA before RegA has already ignored that clock edge.
This works both ways though, so the active enable signal will also not have gone away when the second clock cycle arrives, thus making RegA see the clock cycle and react to it. During the next clock cycle, the enable will be inactive again.
Remember though that a deactivated clock enable simply causes the clock input to be ignored, and the register will thus hold its value when the clock enable input is inactive.
This is not a race condition (unless you have a poorly designed system with a lot of clock skew for instance, but then you have a lot of other problems too), and can be reliably used - otherwise a lot of the stuff FPGA designers take for granted would be impossible to do.
As long as your clock distribution is OK (for example, in an FPGA this will be managed for you by the tools) then you will get well-defined behaviour.
On the first clock pulse, the output of RegB will go high just after the clock-edge. RegA will therefore have "seen" a low on its enable at the point of the clock transition, so it will not change.
On the next clock cycle, RegB's output will go low just after the clock edge. However, this is too late for RegA as it has already "looked at" the enable signal (when the clock edge came) - it will see its enable signal is high, and will transfer the high input to the output (after a very short delay).
So, yes, you will get an extra cycle delay.
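As a minimal VHDL sketch of the scenario (the entity, port and signal names are invented for illustration; 'start' and 'other_en' stand in for whatever drives RegB and for the simple combinational logic):

library ieee;
use ieee.std_logic_1164.all;

entity enable_example is
    port (
        clk      : in  std_logic;
        start    : in  std_logic;   -- whatever causes RegB to pulse high for one cycle
        other_en : in  std_logic;   -- stand-in for the "simple combinational logic"
        reg_a    : out std_logic
    );
end entity;

architecture rtl of enable_example is
    signal reg_b    : std_logic := '0';
    signal reg_a_i  : std_logic := '0';
    signal enable_a : std_logic;
begin
    -- RegB: goes high for one clock cycle when 'start' is high
    process (clk)
    begin
        if rising_edge(clk) then
            reg_b <= start;
        end if;
    end process;

    -- Simple combinational logic between RegB's output and RegA's enable
    enable_a <= reg_b and other_en;

    -- RegA: clock-enabled register whose D input is tied high
    process (clk)
    begin
        if rising_edge(clk) then
            if enable_a = '1' then
                reg_a_i <= '1';   -- D input pulled high permanently
            end if;
        end if;
    end process;

    reg_a <= reg_a_i;
end architecture;

Because enable_a only goes high shortly after the edge on which reg_b is loaded, RegA ignores that edge and captures the '1' on the following edge, one cycle later - the extra cycle of delay described above.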
Related
I wrote some VHDL code with a simple sequential process that is sensitive to the clock. Whenever the clock rises from low to high I check another signal in the architecture and do things depending on its value. However, this signal transitions at the same instant as the rising edge of the clock.
In simulation, when the rising edge occurs, the system always samples the signal value from before its transition. My question is: how does this work out once the code is implemented on the corresponding FPGA? Does it produce unpredictable sampling of the signal value? Do you advise always avoiding this type of scenario within a VHDL architecture?
I am working on implementing a DDR SDRAM controller for a class and am not allowed to use the Xilinx MIG core.
After thrashing with the design, I am currently working synchronously to my 100 MHz system clock and creating a divided signal "clock" (generated using a counter) that is sent out on the I/O pins to the DDR SDRAM. I have some logic that gives me the "rising" edge strobes of this signal clock, as I am aware that I cannot use a signal to clock a process. However, this divided-clock method runs very slowly, and I have concerns that I am not meeting the minimum required frequency of the external DDR SDRAM. I am hoping to speed up my read/write bursts, but to do so, my Spartan-3E will struggle with anything higher than 100 MHz. After looking around online, I found this code on EDA Board:
process (Input_Clk, Reset_Control)
begin
    if (Reset_Control = '1') then
        Output_Data <= (others => '0');
    elsif rising_edge(Input_Clk) then
        Output_Data <= Input_Data1;
    elsif falling_edge(Input_Clk) then
        Output_Data <= Input_Data2;
    end if;
end process;
I have written a lot of VHDL, but have never seen something like this before. I'm sure this works fine in simulation, but it doesn't look synthesizable to me. The poster said this should be supported by 1076.6-2004. Does this infer two flip-flops, one clocked on the rising edge and one on the falling edge whose outputs both feed into a 2:1 mux? Does Xilinx support this? I want to avoid having to instantiate a DCM as crossing these clock domains will definitely slow me down and will add undesired complexity. Is there a way to safely generate my DDR data that is being sent to and received from DDR SDRAM without the Xilinx primitive for the MIG? How would I perform the receiving of DDR data in Verilog?
For reference, we have to code in Verilog, so I'm not too sure on how to translate that VHDL process to a Verilog always block if it is synthesizable. We are using the Micron MT46V32M16 if that is relevant.
Here are the timing diagrams for what I am trying to replicate:
I would say that implementing a DDR controller 'for class' is rather challenging. In the companies I worked for, that task was left to senior engineers.
First, about the VHDL code shown:
Yes, you are right: that cannot be synthesized.
The approach for double-clocked inputs is to have two data paths, one clocked on the rising edge and one on the falling edge. In a second stage the two are put in parallel, at double the data width. Thus a 32-bit-wide DDR interface produces 64 data bits per 'system' clock (a rough sketch follows).
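Something along these lines, as a minimal VHDL sketch of that first capture stage only; the entity, port names and 16-bit width are assumptions, and a real controller would use the phase-shifted capture clock discussed next rather than the raw system clock:

library ieee;
use ieee.std_logic_1164.all;

entity ddr_capture is
    port (
        clk       : in  std_logic;                       -- capture clock
        dq        : in  std_logic_vector(15 downto 0);   -- DDR data pins
        data_rise : out std_logic_vector(15 downto 0);   -- half captured on the rising edge
        data_fall : out std_logic_vector(15 downto 0)    -- half captured on the falling edge
    );
end entity;

architecture rtl of ddr_capture is
    signal q_rise, q_fall : std_logic_vector(15 downto 0);
begin
    -- Rising-edge data path
    process (clk)
    begin
        if rising_edge(clk) then
            q_rise <= dq;
        end if;
    end process;

    -- Falling-edge data path (its own process, hence its own flip-flops)
    process (clk)
    begin
        if falling_edge(clk) then
            q_fall <= dq;
        end if;
    end process;

    -- Second stage: re-register both halves on the rising edge so they are
    -- presented in parallel, in one clock domain, at double the data width.
    process (clk)
    begin
        if rising_edge(clk) then
            data_rise <= q_rise;
            data_fall <= q_fall;
        end if;
    end process;
end architecture;

Note that, unlike the EDA Board snippet, each edge has its own process (and thus its own flip-flops), which is what makes this synthesizable; the same structure maps to one always block per edge in Verilog.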
More difficult is clocking the arriving data at the right time. As your read diagram shows, the data arrives aligned with the clock edge, so to sample it in the middle of its valid window you need a delayed clock. In an ASIC that is done using a tune-able delay line which is calibrated at start-up and whose phase is checked regularly. In an FPGA that would require some esoteric logic.
I have not been close to DDR chips for a while, but I think all the modern ones (DDR2 and up?) output a clock/strobe themselves to help with capturing the read data.
Also, after you have clocked the read data in using that shifted clock, you have to get the data back into the system clock domain, which requires an asynchronous FIFO.
I hope that gets you started.
Can you tell me the difference between the TEST pin and the READY pin on the 8086 microprocessor, since both of them seem to deal with wait instructions?
TEST: input is examined by the "Wait" instruction. If the TEST input is LOW, execution continues; otherwise the processor waits in an "Idle" state. This input is synchronized internally during each clock cycle on the leading edge of CLK.
READY: is the acknowledgement from the addressed memory or I/O device that it will complete the data transfer. The READY signal from memory/IO is synchronized by the 8284A Clock Generator to form READY. This signal is active HIGH. The 8086 READY input is not synchronized. Correct operation is not guaranteed if the setup and hold times are not met.
If you read the description of the READY signal, the wait instruction is not mentioned.
The READY signal is sampled on each and every memory or I/O cycle. If a device is not capable of responding to the CPU's request in the standard bus cycle, the READY signal can be used to stretch out the cycle, giving it more time.
This is done by signalling to the CPU that the device is not READY. The CPU adds clock cycles to the bus transaction until the device is READY. These extra cycles are given the confusing name of "WAIT STATES", and have nothing to do with the WAIT instruction or the TEST line. Many years ago, makers of fast memory would brag "No wait states!"
The part about the 8284A refers to the details of ensuring that the READY input meets the timing requirements of the processor, namely the so-called setup and hold times, which are normally only of concern to the engineer designing the computer system.
In your question, you can see that the TEST input is explicitly sampled by the WAIT instruction. The TEST input is simply an input signal with a dedicated pin on the processor (TEST) sampled by a dedicated instruction (WAIT).
Most processors have signals similar to the READY line. The TEST line is rather more rare.
I am currently implementing a MUX, and to test it I've created a generator and a monitor to, well, generate input data and monitor the output.
The MUX uses Avalon Streaming interfaces for its input and output, and therefore also supports back pressure.
My question is this: my test bench runs on the falling edge, while my DUT and input data are driven on the rising edge. Both my input clock and my input data are generated at delta cycle 0. However, the back-pressure ready signal returning from the DUT, which controls the generator, is set at delta 3. This causes sampling problems, because the DUT must only load data when both the data from the generator (at delta 0) and the DUT ready (at delta 3) are valid.
Now, if I skew my DUT input clock by 1 ps it fixes the problem, but that feels like the wrong approach. What is the correct design principle here?
Skew the clock by 1 ps, or at least move it 4 deltas, so I make sure all my signals have been set before the rising edge?
or
Move the data I generate so it aligns with the DUT's ready output signal?
or
Is it just a decision made from test bench to test bench?
I've also thought that a clock in a test bench should be generated at delta 0 and everything else must come after.
I am simulating in Riviera-PRO.
You have various choices:
i) Make everything synchronous. In other words, drive the inputs and sample the outputs on the same edge of the clock as the DUT uses. After all, the DUT doesn't suffer any race problems, so if you just extend the clocking strategy to the testbench, everything will work fine at RTL, but not at gate level. So if you're doing gate-level sims (which you should be), this strategy is no good.
ii) Clock everything in your testbench off the opposite edge of the clock to the edge the DUT uses. Again, fine for RTL, but whether it is fine for gate level depends on the delays through your design.
iii) Drive the inputs to the DUT just after the clock edge and sample the DUT outputs just before it, the clock edge being the edge that the DUT uses. Again, this is fine for RTL, and it is the most robust approach for gate level, too (see the sketch after this list).
iv) Implement realistic timing for each DUT interface. That ought to work for RTL and gate level, and if it doesn't work at gate level then the fault is with the DUT, not the testbench.
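A minimal skeleton of option (iii), assuming a rising-edge DUT clock; the names, the 1 ns offsets and the 10 ns period are made up, and the DUT instantiation is omitted:

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity tb_mux is
end entity;

architecture sim of tb_mux is
    constant CLK_PERIOD : time := 10 ns;
    signal clk   : std_logic := '0';
    signal data  : std_logic_vector(7 downto 0) := (others => '0');
    signal valid : std_logic := '0';
    signal ready : std_logic;   -- back-pressure, driven by the DUT (instantiation omitted)
begin
    clk <= not clk after CLK_PERIOD / 2;

    -- Drive DUT inputs shortly AFTER the active (rising) edge.
    stimulus : process
    begin
        wait until rising_edge(clk);
        wait for 1 ns;   -- safely past the edge, long before the next one
        if ready = '1' then
            valid <= '1';
            data  <= std_logic_vector(unsigned(data) + 1);
        else
            valid <= '0';
        end if;
    end process;

    -- Sample DUT outputs shortly BEFORE the next active edge.
    checker : process
    begin
        wait until rising_edge(clk);
        wait for CLK_PERIOD - 1 ns;   -- just before the following rising edge
        -- compare the DUT outputs against expected values here
    end process;
end architecture;

Driving after the edge and sampling before it removes any dependence on delta-cycle ordering, so no clock skewing is needed.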
I'm still learning VHDL for synthesis purposes on a custom Xilinx Spartan-6 based board. My design includes a lot of FSMs, and I've just learned in a previous question that the single-process implementation is a lot better and much easier to use.
I also learned that initialization values for signals are actually synthesizable.
So here is the question: do I really need a reset signal to put the FSM in idle with default outputs, if I don't need to interrupt the FSM mid-flow, or if I already have another signal that stops it?
Let's see what the Xilinx approach to reset is:
Xilinx FPGAs include a "Global Set/Reset" (GSR) module which automatically sets all signals to their initialisation values at start-up. The initialisation value is declared as follows:
signal foo : std_logic := '0';
-- ^ initialisation value
When designing a new piece of code, you have to think twice, for each bit, about whether it needs to be reset by something other than the GSR, because using your own global reset is actually using a second global reset.
Your FSM has a startup state (IDLE) and will never be reset during the whole life of the bitstream, so at first glance the FSM does not need a reset. But if you leave it like that, you'll be exposed to metastability issues: the GSR is quite slow to deassert its reset, and it does so asynchronously. All the flip-flops won't be released at the same time, and your FSM can go into an illegal state.
So, use a local reset for your FSM (and counters as well).
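As a minimal sketch of what that might look like, assuming a single-process FSM with invented entity, state and signal names:

library ieee;
use ieee.std_logic_1164.all;

entity fsm_example is
    port (
        clk       : in  std_logic;
        local_rst : in  std_logic;   -- local reset, distinct from the GSR
        start     : in  std_logic;
        stop      : in  std_logic;
        busy      : out std_logic
    );
end entity;

architecture rtl of fsm_example is
    type state_t is (IDLE, RUN, DONE);
    signal state : state_t := IDLE;  -- initialisation value loaded at configuration (GSR)
begin
    process (clk)
    begin
        if rising_edge(clk) then
            if local_rst = '1' then  -- synchronous reset: all bits released on the same edge
                state <= IDLE;
            else
                case state is
                    when IDLE => if start = '1' then state <= RUN;  end if;
                    when RUN  => if stop  = '1' then state <= DONE; end if;
                    when DONE => state <= IDLE;
                end case;
            end if;
        end if;
    end process;

    busy <= '1' when state = RUN else '0';
end architecture;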
To complete the reset question:
Avoiding the use of a global reset gives better place-and-route results, which leads to fewer timing errors. A global reset uses the same routing network as other signals in the design, so it prevents some routing resources from being available for other signals.
If you really need a reset, prefer an active-high synchronous reset, or at least an active-high reset that is asserted asynchronously and deasserted synchronously (sketched below). Active high because Xilinx flip-flops use active-high SET and RESET; synchronous to avoid metastability problems.
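For the second variant, a common pattern is a small reset bridge; this is only a sketch with invented names, not a drop-in module:

library ieee;
use ieee.std_logic_1164.all;

entity reset_sync is
    port (
        clk       : in  std_logic;
        rst_async : in  std_logic;   -- raw, asynchronous, active-high reset
        rst_sync  : out std_logic    -- asserted at once, released on a clock edge
    );
end entity;

architecture rtl of reset_sync is
    signal ff : std_logic_vector(1 downto 0) := (others => '1');
begin
    process (clk, rst_async)
    begin
        if rst_async = '1' then
            ff <= (others => '1');   -- assert immediately, no clock needed
        elsif rising_edge(clk) then
            ff <= ff(0) & '0';       -- shift in '0': release takes two clean clock edges
        end if;
    end process;

    rst_sync <= ff(1);
end architecture;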
Workaround:
A solution to avoid the local reset on the FSM could be to use a BUFGCE module at the clock entry point. At startup, this module does not feed the design with the clock, and it waits for some clock cycles before enabling it. Only a local control signal is needed here, to manage the enable input of the BUFGCE, and the rest of the FPGA is reset-free.
I don't know how many clock cycles you would have to wait, but it can be done. The first approach is still the best for now.
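A rough sketch of that idea, assuming the Xilinx BUFGCE primitive from the UNISIM library; the counter width and wait duration are arbitrary placeholders:

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

library unisim;
use unisim.vcomponents.all;

entity clock_gate_startup is
    port (
        clk_in  : in  std_logic;    -- raw clock, before the global buffer
        clk_out : out std_logic     -- gated clock feeding the rest of the design
    );
end entity;

architecture rtl of clock_gate_startup is
    signal startup_cnt : unsigned(3 downto 0) := (others => '0');
    signal clk_en      : std_logic := '0';
begin
    -- Small counter clocked by the ungated clock; once it reaches its
    -- terminal count, the buffer is enabled for good.
    process (clk_in)
    begin
        if rising_edge(clk_in) then
            if startup_cnt /= "1111" then
                startup_cnt <= startup_cnt + 1;
            else
                clk_en <= '1';
            end if;
        end if;
    end process;

    bufgce_inst : BUFGCE
        port map (
            I  => clk_in,
            CE => clk_en,
            O  => clk_out
        );
end architecture;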
If your state variable is initialised to 'idle', then having a reset which forces it to 'idle' is only useful if you need it for some other reason. One major example would be a state machine which has states where, on noticing an erroneous input, it deliberately stops and waits for something to reset it before resuming normal operation.
The machine might also be running from a clock that is not guaranteed to be glitch free, or is for some reason not 100% reliable. In this case it can be sensible to include a reset, so that something like a host processor or other FPGA logic can somehow detect that the state machine is no longer working, and reset it.
Lots of people seem to have a reset signal in most processes they write, but it's perfectly valid to rely on signal and output initialisation values, if the machine then meets your design requirements. If all the reset does is assert itself briefly during startup, and never again, I would say there's not much point in it.
[EDIT] Per other answers, relying on initialisation values is normally only valid in SRAM-based FPGA designs.
There are a couple of white papers from Xilinx that address the issue, and they actually show up as the first two results when googling for Xilinx reset.
These are WP272 Get Smart About Reset: Think Local, Not Global and WP275 Get your Priorities Right – Make your Design Up to 50% Smaller.
The first paper does a fair job of pointing out where you should use a reset as opposed to where you can depend on configuration and default values.
The second paper also points out that the reasons why are vendor and technology dependent. You could also note the reason for eliminating 'unnecessary' resets is to preserve place and route resources.
Because you don't elaborate on the details of your finite state machine implementation while asking if the reset is really necessary, note the claim in WP272 that an asynchronous reset can be deleterious for a one-hot state machine, which would benefit instead from a configuration-loaded default value, a synchronous reset, or a clock-synchronized asynchronous reset.
Your VHDL code with (proper) resets is ultimately more portable, should your design ever be intended for an ASIC or some other solution that is not loaded from a bit image. For those soft-loaded designs, the ultimate reset is embodied in the configuration load.
Otherwise the purpose is to conserve place and route resources.