VHDL simulation: what is the correct delta?

I am currently implementing a MUX, and to test it I've created a generator and a monitor to generate input data and monitor the output.
The MUX uses the Avalon Streaming interface on its input and output, and therefore also supports back pressure.
My question is this. My test bench runs on the falling edge, while my DUT runs and my input data is generated on the rising edge. Both my input clock and my input data are generated at delta cycle 0. However, the back-pressure ready signal that returns from the DUT and controls the generator is set at delta 3. This causes sampling problems, because the DUT must only load data when the data from the generator (at delta 0) is valid and the DUT ready (the back-pressure signal at delta 3) is valid.
Now if I skew my DUT input clock by 1 ps, it fixes the problem. But that feels like the wrong approach. What is the correct design principle here?
Skew the clock by 1 ps, or at least move it 4 deltas, so I make sure all my signals have been set before rising_edge?
or
Move the data I generate so it aligns with the DUT output ready signal?
or
Is it just a decision made from test bench to test bench?
I've also thought that a clock in a test bench should be generated at delta 0 and everything else must come after.
I am simulating in Riviera-PRO.

You have various choices:
i) Make everything synchronous. In other words, drive the inputs and sample the outputs on the same edge of the clock as the DUT uses. After all, the DUT doesn't suffer any race problems, so if you just extend the clocking strategy to the testbench, everything will work fine at RTL, but not at gate level. So if you're doing gate-level sims (which you should be), this strategy is no good for them.
ii) Clock everything in your testbench off the opposite edge of the clock to the edge the DUT uses. Again, this is fine for RTL, but whether it is fine for gate-level depends on the delays through your design.
iii) Drive the inputs to the DUT just after the clock edge and sample the DUT outputs just before it, where the clock edge is the edge that the DUT uses. Again, this is fine for RTL, and is the most robust for gate-level, too.
iv) Implement realistic timing for each DUT interface. That ought to work for RTL and gate-level, and if it doesn't work for gate-level then the fault is with the DUT, not the testbench.
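
As a rough illustration of option iii), the testbench below drives its stimulus a fixed time after the DUT clock edge and samples the ready output a fixed time before the next one. It is only a sketch under stated assumptions: the entity name, the 10 ns period, the 1 ns offsets, and the one-register stand-in DUT are all hypothetical, and the stand-in is not a real Avalon-ST MUX.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity tb_timing_sketch is
end entity tb_timing_sketch;

architecture sim of tb_timing_sketch is
  -- All timing values are assumptions chosen for illustration.
  constant CLK_PERIOD : time := 10 ns;
  constant T_DRIVE    : time := 1 ns;  -- drive inputs this long after the DUT edge
  constant T_SAMPLE   : time := 1 ns;  -- sample outputs this long before the DUT edge

  signal clk   : std_logic := '0';
  signal valid : std_logic := '0';
  signal data  : std_logic_vector(7 downto 0) := (others => '0');
  signal ready : std_logic := '0';
  signal q     : std_logic_vector(7 downto 0) := (others => '0');
  signal done  : boolean := false;
begin
  -- Free-running clock that stops once the test has finished.
  clk <= not clk after CLK_PERIOD / 2 when not done else '0';

  -- Hypothetical stand-in DUT: becomes ready after the first edge and
  -- loads data on a rising edge when valid and ready are both high.
  dut : process (clk) is
  begin
    if rising_edge(clk) then
      ready <= '1';
      if valid = '1' and ready = '1' then
        q <= data;
      end if;
    end if;
  end process dut;

  -- Generator: drive just after the edge, sample ready just before the next.
  driver : process is
  begin
    wait until rising_edge(clk);
    for i in 1 to 4 loop
      wait for T_DRIVE;                           -- just after the DUT edge
      valid <= '1';
      data  <= std_logic_vector(to_unsigned(i, 8));
      loop
        wait for CLK_PERIOD - T_DRIVE - T_SAMPLE; -- just before the next edge
        exit when ready = '1';                    -- the coming edge accepts the word
        wait for T_SAMPLE + T_DRIVE;              -- not ready: step past the edge, retry
      end loop;
      wait until rising_edge(clk);                -- the transfer happens here
    end loop;
    wait for T_DRIVE;
    valid <= '0';
    wait for CLK_PERIOD;
    assert q = x"04" report "last word was not loaded" severity error;
    done <= true;
    wait;
  end process driver;
end architecture sim;
```

Because the stimulus and the sampling points are separated from the clock edge by real time rather than by delta cycles, the result does not depend on how many deltas the DUT needs to settle its ready output.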

Related

How does Vivado ensure that signals do not transition unpredictably when they are sampled by a clock edge and how does this relate to simulation?

I wrote some simple sequential VHDL code defining a clock-sensitive process. Whenever the clock rises from low to high, I check another signal in the architecture and I do stuff depending on its value. However, this signal transitions at the same time instant as the rising edge of the clock occurs.
In simulation, when the rising edge arrives, the system always samples the signal value before its transition. My question is: how does this work out once the code is implemented on the corresponding FPGA? Does it produce unpredictable sampling of the signal value? Do you advise always avoiding this type of scenario within a VHDL architecture?

Best Route For Input Clocks on Kintex7 FPGA

I'm looking for advice on a less than ideal situation.
I've inherited a project where we have a hardware design issue. We generate a clock to a chip which feeds the clock back in over a non-clock-capable input. This works at up to 160 MHz, but we are looking to increase the clock frequency, so I'm researching IO options. This is used to clock 8 parallel data inputs.
Right now the data inputs go through a delay and an IDDR block. The output is fed to a FIFO. Our clock is still routed to a BUFG, so we have:
Data - IDELAY - IDDR - FIFO
Clock - BUFG ----^------^
I read somewhere that routing to a BUFG has a large delay so a BUFR-BUFIO is better. Is this the case? Have I missed a better option?
When you say generating a clock to "a chip", I will assume that you mean the Kintex7 chip.
The delay is not a problem. The issue is setting up your timing constraints properly so that the static timing analysis can validate whether you violate any setup or hold time in all boundary corners of the board.
If you look at the DS182 document, the AC switching characteristics section will give you a rough idea of how well the chip can perform.
However, the best approach is to let the timing analyzer inside Vivado calculate whether your desired clock frequency will be able to close timing.
You just need to:
1. Make sure the data input is synchronous to your final clock. If it isn't, then clock that data input across two stages of registers with respect to the final clock.
2. Specify your timing constraints.
3. Run through synthesis and implementation.
4. Check the timing to see that there are no violations.
Or maybe I did not understand something about what you are trying to do.

FPGA Will pausing entities (by pausing their clock input) reduce the overall power consumption?

I'm currently creating a multiple entity project where all of the entities have clock synchronous architecture (no behaviourals) and most of the entities work on derived clocks.
I'm using DE0 Nano, so my source clock is 50MHz and I have 4 derived clocks: 1 MHz, 500 kHz, 10 Hz and 1 Hz.
Disclaimer: While I am aware that doing things this way is much less power-efficient, I've been wondering if there is something I could do to remedy this at least a little (I'm open to ideas).
Now, in the top-level entity I have an "event handler", which can decide which entities should work at any given moment.
Therefore I came up with an idea to wire an on/off clock switch for the derived clock input signals to the lower level entities and disable some of them (the clock inputs) when I don't need a given entity to work for a while (as I understand, this should stop their processes from firing for that time).
Since I don't have an easy way to test that idea (I estimate it will take a moderate amount of work and time, especially setting up the power consumption measurement) I wanted to ask whether anyone tried something similar and/or knows if it's worth a shot?
For your information, currently when the entities are in "sleep" mode, their processes fire on each rising clock edge, check an internal state or flag variable/signal and stop, e.g.:
process (clk) is  -- clk is the derived 1 MHz clock
    variable ...
begin
    if rising_edge(clk) then
        if state = ready and enable = '1' then
            ...
        end if;
    end if;
end process;
Or maybe there is another, better way to do it?

How to work with DDR in synthesizeable Verilog/VHDL?

I am working on implementing a DDR SDRAM controller for a class and am not allowed to use the Xilinx MIG core.
After thrashing with the design, I am currently working synchronously to my system clock at 100MHz and creating a divided signal "clock" (generated using a counter) that is sent out on the IO pins to DDR SDRAM. I have some logic that feeds me the "rising" edge strobes of this signal clock as I am aware that I cannot use a signal to clock a process. However, this divided clock method runs very slow and I have concerns that I am not meeting the minimum required frequency of the external DDR SDRAM. I am hoping to speed up my read/write bursts, but to do so, my spartan3e will struggle with anything higher than 100MHz. After looking around online, I found this code from EDA Board:
process(Input_Clk, Reset_Control)
begin
    if (Reset_Control = '1') then
        Output_Data <= (others => '0');
    elsif rising_edge(Input_Clk) then
        Output_Data <= Input_Data1;
    elsif falling_edge(Input_Clk) then
        Output_Data <= Input_Data2;
    end if;
end process;
I have written a lot of VHDL, but have never seen something like this before. I'm sure this works fine in simulation, but it doesn't look synthesizable to me. The poster said this should be supported by 1076.6-2004. Does this infer two flip-flops, one clocked on the rising edge and one on the falling edge whose outputs both feed into a 2:1 mux? Does Xilinx support this? I want to avoid having to instantiate a DCM as crossing these clock domains will definitely slow me down and will add undesired complexity. Is there a way to safely generate my DDR data that is being sent to and received from DDR SDRAM without the Xilinx primitive for the MIG? How would I perform the receiving of DDR data in Verilog?
For reference, we have to code in Verilog, so I'm not too sure on how to translate that VHDL process to a Verilog always block if it is synthesizable. We are using the Micron MT46V32M16 if that is relevant.
Here are the timing diagrams for what I am trying to replicate:
I would say that implementing a DDR controller 'for class' is rather challenging. In the companies I worked for, they were left for senior engineers to build.
First, about the VHDL code shown:
Yes, you are right that it cannot be synthesized.
The approach to double-clocking inputs is to have two data paths: one on the rising edge and one on the falling edge. In a second stage the two are put in parallel, but with double the data width. Thus a 32-bit-wide DDR interface produces 64 data bits per 'system' clock.
More difficult is clocking the arriving data at the right time. As your read diagram shows, the data arrives in the middle of the clock edge. For that you need a delayed clock. In an ASIC that is done using a tunable delay line which is calibrated at start-up and regularly checked for phase. In an FPGA that would require some esoteric logic.
I have not been close to DDR chips for a while, but I think all the modern ones (DDR2 and up?) output a clock themselves to help with the read data.
Also after you have clocked the read data in, using that shifted clock, you have to get the data back to the system clock which requires an asynchronous FIFO.
I hope that gets you started.
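
The two-data-path approach described above might look roughly like the following sketch. The entity name, port names, and the WIDTH generic are hypothetical, and the sketch ignores the clock-phase and FIFO issues mentioned in the answer. On a real FPGA the edge-capture registers would normally sit in the I/O cells, often through a vendor DDR input primitive, rather than in plain fabric logic.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity ddr_capture_sketch is
  generic (WIDTH : positive := 8);  -- assumed bus width
  port (
    clk   : in  std_logic;
    d_ddr : in  std_logic_vector(WIDTH - 1 downto 0);     -- double-data-rate input
    q_sdr : out std_logic_vector(2 * WIDTH - 1 downto 0)  -- one double-width word per clock
  );
end entity ddr_capture_sketch;

architecture rtl of ddr_capture_sketch is
  signal d_rise : std_logic_vector(WIDTH - 1 downto 0) := (others => '0');
  signal d_fall : std_logic_vector(WIDTH - 1 downto 0) := (others => '0');
begin
  -- Path 1: capture the word presented at the rising edge.
  rise_path : process (clk) is
  begin
    if rising_edge(clk) then
      d_rise <= d_ddr;
    end if;
  end process rise_path;

  -- Path 2: capture the word presented at the falling edge.
  fall_path : process (clk) is
  begin
    if falling_edge(clk) then
      d_fall <= d_ddr;
    end if;
  end process fall_path;

  -- Second stage: on the next rising edge, place both words side by side.
  -- d_rise still holds the value from the previous rising edge here,
  -- so the pair comes from the same clock period.
  combine : process (clk) is
  begin
    if rising_edge(clk) then
      q_sdr <= d_fall & d_rise;
    end if;
  end process combine;
end architecture rtl;
```

Each process is sensitive to only one edge, which avoids the rising_edge/falling_edge pair on a single register that the question asks about. Which half of q_sdr holds the rising-edge word is an arbitrary choice in this sketch.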

Register with enable

Here is the scenario:
I have a register with enable (call it RegA). The input of RegA is pulled high permanently.
Meanwhile, the enable line of RegA is connected to the output of RegB through some simple combinational logic.
Now in the scenario, on the next clock pulse the output of RegB will go high for just one clock cycle.
My question is, will I see the output of RegA go high in the same clock cycle that RegB goes high, OR will RegA go high on the next clock cycle, OR is it possible that it may never go high due to a race condition?
From experience, I feel like RegA will go high on the same clock cycle that RegB goes high; however, I'm wondering if this is bad practice and unreliable. I'm thinking there could be a race condition between the signal getting to the enable line and the clock edge arriving at RegA. Since the enable line goes through some combinational logic, it would seem it would lose that race every time, and thus RegA wouldn't recognize that the enable line is high in the same clock cycle that RegB goes high.
I'm assuming that the enables you are talking about are clock enables? In this case you will get a one clock cycle delay before RegA goes high, if I understand you correctly. Explanation:
RegA will only react to clock cycles if its enable input is active when the clock arrives. However, since RegB has some internal delay, and since there even is some additional combinatorial delay from its output until it reaches the RegA enable, the active signal won't "make it" to RegA before RegA has already ignored the clock cycle.
This works both ways though, so the active enable signal will also not have gone away when the second clock cycle arrives, thus making RegA see the clock cycle and react to it. During the next clock cycle, the enable will be inactive again.
Remember though that a deactivated clock enable simply causes the clock input to be ignored, and the register will thus hold its value when the clock enable input is inactive.
This is not a race condition (unless you have a poorly designed system with a lot of clock skew for instance, but then you have a lot of other problems too), and can be reliably used - otherwise a lot of the stuff FPGA designers take for granted would be impossible to do.
As long as your clock distribution is OK (for example, in an FPGA this will be managed for you by the tools) then you will get well-defined behaviour.
On the first clock pulse, the output of RegB will go high just after the clock-edge. RegA will therefore have "seen" a low on its enable at the point of the clock transition, so it will not change.
On the next clock cycle, RegB's output will go low just after the clock edge. However, this is too late for RegA as it has already "looked at" the enable signal (when the clock edge came) - it will see its enable signal is high, and will transfer the high input to the output (after a very short delay).
So, yes, you will get an extra cycle delay.
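
The one-cycle delay described in these answers can be checked in simulation with a small self-checking testbench such as the sketch below. All names are hypothetical, the "simple combinational logic" is reduced to a plain wire, and the 10 ns clock period is an assumption.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity tb_reg_enable_sketch is
end entity tb_reg_enable_sketch;

architecture sim of tb_reg_enable_sketch is
  constant CLK_PERIOD : time := 10 ns;  -- assumed

  signal clk   : std_logic := '0';
  signal pulse : std_logic := '0';  -- input that makes RegB go high for one cycle
  signal reg_b : std_logic := '0';
  signal en_a  : std_logic;
  signal reg_a : std_logic := '0';
  signal done  : boolean := false;
begin
  clk <= not clk after CLK_PERIOD / 2 when not done else '0';

  -- RegB: plain register.
  regb : process (clk) is
  begin
    if rising_edge(clk) then
      reg_b <= pulse;
    end if;
  end process regb;

  -- Stand-in for the combinational logic between RegB and RegA's enable.
  en_a <= reg_b;

  -- RegA: data input tied high, loaded only when the enable is high.
  rega : process (clk) is
  begin
    if rising_edge(clk) then
      if en_a = '1' then
        reg_a <= '1';
      end if;
    end if;
  end process rega;

  -- Checks are made on falling edges, well away from the active edge.
  check : process is
  begin
    wait until falling_edge(clk);
    pulse <= '1';
    wait until falling_edge(clk);  -- RegB went high at the edge in between
    pulse <= '0';
    assert reg_b = '1' and reg_a = '0'
      report "RegA must not go high in the same cycle as RegB" severity error;
    wait until falling_edge(clk);  -- RegA saw its enable high at the edge in between
    assert reg_b = '0' and reg_a = '1'
      report "RegA should go high one cycle after RegB" severity error;
    done <= true;
    wait;
  end process check;
end architecture sim;
```

In the simulator the ordering comes from delta cycles: reg_b and en_a only update after every process triggered by the clock edge has already read their old values, which mirrors the clock-to-output delay of the real register.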

Resources