Synthesizable delayed buffer in VHDL


I am trying to build a synthesizable delay buffer in VHDL for a time-to-digital converter project on an FPGA.
I have been looking around but cannot find any existing design.
I have been told that Stack Overflow has very good answers.
Could you please give me some tips for this coursework? I would be very grateful for any approach you might come up with.
Thank you very much in advance!
Regards

Building time-to-digital converters (TDCs) in an FPGA is somewhat hard right now.
Basically, it boils down to having HDL that describes multiple registers all reading the same signal. You then need to apply a directive that stops the tools from merging those registers, e.g. equivalent_register_removal (set to "no") for Xilinx. You will possibly also need a timing ignore constraint on the signal you are sampling.
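As a rough illustration of "multiple registers all reading the same signal", here is a minimal sketch. The entity name tdc_taps, the generic N_TAPS and the port names are made up for this example, and the attribute shown is the one understood by Xilinx XST; other tools need their own equivalent. Placement and routing constraints are not shown.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity tdc_taps is
  generic (
    N_TAPS : positive := 8  -- hypothetical number of sampling flip-flops
  );
  port (
    clk  : in  std_logic;                             -- system clock
    hit  : in  std_logic;                             -- asynchronous signal to be timed
    taps : out std_logic_vector(N_TAPS - 1 downto 0)  -- one sample per flip-flop
  );
end entity tdc_taps;

architecture rtl of tdc_taps is
  signal taps_r : std_logic_vector(N_TAPS - 1 downto 0) := (others => '0');

  -- XST attribute (assumed tool): do not merge these logically identical registers.
  attribute equivalent_register_removal : string;
  attribute equivalent_register_removal of taps_r : signal is "no";
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Every flip-flop samples the same net. The different routing delays
      -- to each flip-flop are what give the sub-clock resolution.
      taps_r <= (others => hit);
    end if;
  end process;

  taps <= taps_r;
end architecture rtl;
```

Without the attribute, synthesis would be free to collapse all N_TAPS registers into one, since they are functionally identical.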
You then need to carefully examine the fabric of your FPGA and make sure your flip-flops are placed in the same slice across multiple sites that can all be connected through the same kind of wire (check FPGA Editor), i.e. will have the same time delay.
You can build a minimal test design for Xilinx in FPGA Editor. Once you have the routing down, you can then formulate appropriate constraints for your UCF file and build much bigger, more complex TDCs.
I'm only familiar with Altera from a few years ago. But Altera doesn't give you an interface like Xilinx's FPGA editor, so you're on your own determining the placement of your flops. I saw a presentation once about a university work group doing TDCs with Altera and ultimately it boiled down to measuring the resolution by using input stimuli to check whether the design was routed according to their wishes. If it was not, they would adjust some timing parameters out of sensible bounds, rinse and repeat.
The last step of course is to sample your signal in the synchronous part of your design (where the counter is) and read the counter plus flip flop contents when the event you wanted occurs (i.e. rising edge, falling edge). Then you have major time units in your counter and minor time units as a bitfield in the flip flop state.
If you want even spread among your flip flop delays, you will need to carefully examine the delay length of paths between the flip flops and adjust for your overall clock period.
So basically, counter * clock_period + index_of_highest_set_bit_in_flip_flop_state * path_delay is then your delay time.
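The formula above can be written down as a small helper package for use in a testbench or in post-processing. This is only a sketch: the package name tdc_math and both function names are invented here, and it uses the VHDL time type, so it is meant for simulation rather than synthesis.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package tdc_math is
  -- Index of the highest '1' bit in the tap vector, or -1 if no bit is set.
  function highest_set_bit(taps : std_logic_vector) return integer;

  -- counter * clock_period + highest_set_bit(taps) * path_delay
  function tdc_time(counter      : natural;
                    taps         : std_logic_vector;
                    clock_period : time;
                    path_delay   : time) return time;
end package tdc_math;

package body tdc_math is
  function highest_set_bit(taps : std_logic_vector) return integer is
    -- Normalise the range so that index 0 is always the rightmost bit.
    alias t : std_logic_vector(taps'length - 1 downto 0) is taps;
  begin
    for i in t'range loop  -- runs from the top index down to 0
      if t(i) = '1' then
        return i;
      end if;
    end loop;
    return -1;
  end function highest_set_bit;

  function tdc_time(counter      : natural;
                    taps         : std_logic_vector;
                    clock_period : time;
                    path_delay   : time) return time is
    variable idx : integer := highest_set_bit(taps);
  begin
    if idx < 0 then
      idx := 0;  -- no tap set: only the coarse counter contributes
    end if;
    return counter * clock_period + idx * path_delay;
  end function tdc_time;
end package body tdc_math;
```

With made-up numbers: counter = 5, a 10 ns clock period, taps = "00001111" and a 100 ps path delay, the highest set bit is index 3, so the result is 5 * 10 ns + 3 * 100 ps = 50.3 ns.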
You will also need to check the FPGA datasheet to know your minimal timings, i.e. the fastest toggle time the input buffer can achieve, the minimal setup and hold time of your flops etc.

Related

Is clock usage recommended in VHDL design?

I am doing a small task where I have to count the pulses coming from two inputs. The requirement doesn't specify a clock. Currently I have a process that is triggered when either of the inputs changes, and then the corresponding count is increased.
My question is: should I use a clock for this design, make the process clock sensitive, and then check whether the inputs have changed? Is it good practice to use clocks in VHDL design?
Sub-question: I have to double buffer the input data. Does this mean I have to use a clock and pass the inputs through two flip-flops? Or is there a way to double buffer data without using a clock?

It is possible to design an asynchronous circuit with VHDL. The design rules are a bit different from synchronous design (using request and acknowledge signals).
Your need is not very complex and could be designed without a clock, but you have to be careful with your storage elements. This is especially true if you work with an FPGA, as these devices are not meant to work asynchronously. So look carefully at the synthesizer results.
(If it's school homework, use a clock ;) In digital design, using a clock is the default case. Asynchronous logic is an advanced concept.)
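For the clocked route, a minimal sketch of what "double buffer the inputs and count pulses" could look like is shown below. All names (pulse_counter, pulse_a, count_a and so on) are invented for this example, and it assumes each input pulse is wider than one clock period so that it is not missed.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity pulse_counter is
  generic (
    WIDTH : positive := 16  -- hypothetical counter width
  );
  port (
    clk     : in  std_logic;
    rst     : in  std_logic;  -- synchronous, active high
    pulse_a : in  std_logic;  -- asynchronous pulse inputs
    pulse_b : in  std_logic;
    count_a : out std_logic_vector(WIDTH - 1 downto 0);
    count_b : out std_logic_vector(WIDTH - 1 downto 0)
  );
end entity pulse_counter;

architecture rtl of pulse_counter is
  -- Bit 0 and bit 1 are the two synchronising flip-flops (the double buffer);
  -- bit 2 holds the previous synchronised value for edge detection.
  signal a_sync : std_logic_vector(2 downto 0) := (others => '0');
  signal b_sync : std_logic_vector(2 downto 0) := (others => '0');
  signal cnt_a  : unsigned(WIDTH - 1 downto 0) := (others => '0');
  signal cnt_b  : unsigned(WIDTH - 1 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        a_sync <= (others => '0');
        b_sync <= (others => '0');
        cnt_a  <= (others => '0');
        cnt_b  <= (others => '0');
      else
        a_sync <= a_sync(1 downto 0) & pulse_a;
        b_sync <= b_sync(1 downto 0) & pulse_b;

        -- Rising edge: previous value '0', current synchronised value '1'.
        if a_sync(2 downto 1) = "01" then
          cnt_a <= cnt_a + 1;
        end if;
        if b_sync(2 downto 1) = "01" then
          cnt_b <= cnt_b + 1;
        end if;
      end if;
    end if;
  end process;

  count_a <= std_logic_vector(cnt_a);
  count_b <= std_logic_vector(cnt_b);
end architecture rtl;
```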

Is it possible to design a latch based FIFO instead of FF?

A latch based fifo (i.e. level sensitive latch) might be cheaper in terms of area than FF based FIFO. I'm looking for a latch based FIFO design code or architecture. So far I didn't come across any. Is it possible to design one? I'm looking for some papers or idea to get started...

You can use pulse latches, which retain the advantages of both latches and flip-flops, offering higher performance and lower power consumption, but they are not often "fully" supported by common CAD tools.
Alternatively, you can convert your flops into two level-sensitive master/slave latches. A flip-flop can be implemented by two opposite-phase latches. This is usually done to enable time borrowing and does not necessarily result in a smaller/faster circuit. This way your FIFO structure is very similar to the flop-based design, except that each flop is replaced by two latches.
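To make the "two opposite-phase latches" idea concrete, here is a behavioural sketch of a single storage bit. The entity name ms_flop is invented, and whether a given tool maps this onto real latches depends on the target library, so treat it as an illustration only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity ms_flop is
  port (
    clk : in  std_logic;
    d   : in  std_logic;
    q   : out std_logic
  );
end entity ms_flop;

architecture latches of ms_flop is
  signal m : std_logic := '0';  -- output of the master latch
begin
  -- Master latch: transparent while clk is low, holds while clk is high.
  master : process (clk, d)
  begin
    if clk = '0' then
      m <= d;
    end if;
  end process master;

  -- Slave latch: transparent while clk is high, holds while clk is low.
  slave : process (clk, m)
  begin
    if clk = '1' then
      q <= m;
    end if;
  end process slave;
end architecture latches;
```

Seen from the outside, q only changes shortly after a rising edge of clk, which is the behaviour of a positive-edge-triggered flip-flop.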

It is possible to use latches for fifos, though I don't have any code handy to show how. Typically, I have seen fifos implemented as a 'sram' for the storage with a wrapper for the fifo logic around it. This structure can also handle different read/write clocks relatively naturally.
I don't know the exact heuristics, but I think:
Small SRAM cells are implemented using flops.
Medium SRAM cells are implemented using latches.
Large SRAM cells are implemented using actual RAM cells.
There is some crossover point between using flops and latches, where the extra overhead of control logic and routing for the latches becomes worth the area saving in the actual storage.

Vhdl with no clk

I have a clock in my VHDL code but I don't use it. My process simply depends on handshaking: when one component finishes and produces an output, this output is in the sensitivity list of my FSM and then becomes an input to the next component, whose output is of course also in the sensitivity list of my FSM (so that I know when that component finishes its computation), and so on.
Is this method wrong? It works in simulation and also in post-route simulation, but it gives me warnings like this: warning :HOLD High VIOLATION ON I WITH RESPECT TO CLK; and
warning :HOLD Low VIOLATION ON I WITH RESPECT TO CLK;
Are these warnings unimportant, or will my code damage my FPGA because it doesn't depend on a clock?

The warnings you are getting are timing violations. You get these because the tools detect that your design does not obey the necessary timing restrictions for the internal primitives.
For instance, inputs to lookup tables (one of the main building blocks inside an FPGA) need to be held for a specific time for the output to stabilize. This is very hard to guarantee when your entire timing relies only on the latencies and delays of the components themselves, and everything switches on a completely asynchronous basis.
Depending on your actual design (mostly the size and complexity of it), I'll wager the guess that you'll end up with a lot of very-hard-to-debug errors once you get it inside an FPGA. You'll have a much, much, much easier time using a clock. This will allow you to have a clear idea of when signals arrive where, and it will allow you to use the internal tools to check your timing. You'll also find it much easier to interface to other devices, and your system will be less susceptible to noisy inputs.
So all in all, use a clock. You (probably) won't damage your FPGA by not doing it, but a clock will save you from tons of trouble.

Your code will most probably not damage your FPGA just because it doesn't depend on a clock. However, for synthesis you should always use registered (clocked) logic. Without a clock your design will not be controllable because of timing, delay, routing, fan-out and so on. This will make your FSM behave "mysteriously" when synthesized (even if it worked in simulation).
You'll find plenty of examples of good FSM implementation style with Google's help (search for Moore or Mealy FSM).
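As one possible starting point, here is a sketch of a clocked Moore FSM that sequences two components through "finished" flags instead of relying on a sensitivity list. The entity, the state names and the ports (run_a, done_a and so on) are all hypothetical; the real handshake signals of the components in the question are not known.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity handshake_fsm is
  port (
    clk      : in  std_logic;
    rst      : in  std_logic;  -- synchronous, active high
    start    : in  std_logic;  -- request to run the whole sequence
    done_a   : in  std_logic;  -- hypothetical "finished" flag of component A
    done_b   : in  std_logic;  -- hypothetical "finished" flag of component B
    run_a    : out std_logic;  -- high while component A should be working
    run_b    : out std_logic;  -- high while component B should be working
    finished : out std_logic   -- one-cycle pulse when the sequence is complete
  );
end entity handshake_fsm;

architecture rtl of handshake_fsm is
  type state_t is (S_IDLE, S_RUN_A, S_RUN_B, S_FINISHED);
  signal state : state_t := S_IDLE;
begin
  -- State register: the only place where the state changes, and only on a clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        state <= S_IDLE;
      else
        case state is
          when S_IDLE =>
            if start = '1' then
              state <= S_RUN_A;
            end if;
          when S_RUN_A =>
            if done_a = '1' then
              state <= S_RUN_B;
            end if;
          when S_RUN_B =>
            if done_b = '1' then
              state <= S_FINISHED;
            end if;
          when S_FINISHED =>
            state <= S_IDLE;
        end case;
      end if;
    end if;
  end process;

  -- Moore outputs: they depend on the current state only.
  run_a    <= '1' when state = S_RUN_A    else '0';
  run_b    <= '1' when state = S_RUN_B    else '0';
  finished <= '1' when state = S_FINISHED else '0';
end architecture rtl;
```

Because every transition happens on a clock edge, the timing tools can check the design with an ordinary clock period constraint.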

Definitely use a clock. And only one clock throughout the design. This is the easiest way - the tools support this design style very well. You can often get away with a single timing constraint, especially if your inputs are slow and synchronous to the same clock.
When you have gained experience designing this way, you can move outside of this, but be ready for more analysis, timing constraints and potentially build iterations while you learn the pitfalls of crossing clock-domains and asynchronous signals.

What are tsetup and thold in VHDL?

I am learning VHDL. When I tried to write a testbench I ran into these terms. What do they mean? I couldn't find any simple explanation on Google.
Thanks in advance.

To my knowledge, tSetup and tHold aren't VHDL keywords; they are the minimum setup and hold times that the device being simulated needs in order to operate correctly.
tSetup - The amount of time the data/control needs to be valid before the clock edge.
tHold - The amount of time the data/control needs to be valid after the clock edge.
A simple graphic explaining this:
http://en.wikipedia.org/wiki/Flip-flop_%28electronics%29#Setup.2C_hold.2C_recovery.2C_removal_times

As TOTA says, setup and hold times are digital logic design terms, not VHDL terms.
The vast majority of the time, you do not need to concern yourself with them in testbenches as you are almost always testing internal blocks within your chip and the tools will manage all the timing for you.
When you are working at the device pin level, you can set your models up to check the setup and hold times for violations. When simulating RTL, there are (usually) no delays modelled, so your timing should be fine. You can later simulate a back-annotated netlist which has all the real chip delays included and check that you are still going to meet all the timing requirements of your external devices.
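If you do want a model that checks setup and hold times, a simulation-only sketch could look like the following. The entity name setup_hold_check and the default values of T_SETUP and T_HOLD are placeholders; the real numbers have to come from the datasheet of the external device. It also assumes T_HOLD is shorter than the clock period.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity setup_hold_check is
  generic (
    T_SETUP : time := 2 ns;  -- placeholder value, take the real one from the datasheet
    T_HOLD  : time := 1 ns   -- placeholder value, take the real one from the datasheet
  );
  port (
    clk : in std_logic;
    d   : in std_logic  -- data input whose timing is checked against clk
  );
end entity setup_hold_check;

architecture sim of setup_hold_check is
begin
  check : process
  begin
    wait until rising_edge(clk);

    -- Setup: d must not have changed during the T_SETUP before the edge.
    assert d'stable(T_SETUP)
      report "setup violation on d" severity warning;

    wait for T_HOLD;

    -- Hold: d must not have changed during the T_HOLD after the edge.
    assert d'stable(T_HOLD)
      report "hold violation on d" severity warning;
  end process check;
end architecture sim;
```

Instantiated in a testbench next to the device under test, this only reports violations; it does not change the simulated behaviour.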

Is it necessary to register both inputs and outputs of every hardware core?

I am aware of the need to synchronize all inputs to an FPGA before using those inputs in order to avoid metastability. I'm also aware of the need to synchronize signals that cross clock domains within a single FPGA. This question isn't about crossing clock domains.
My question is whether it is a good idea to routinely register all of the inputs and outputs of every internal hardware module in an FPGA design. The rationale is that we want to break up long chains of combinational logic in order to improve the clock rate so that we can meet the timing constraints for a chosen clock rate. This will add additional cycles of latency proportional to the number of modules that a signal must cross. Is this a good idea or a bad idea? Should one register only inputs and not outputs?
Answer Summary
Rule of thumb: register all outputs of internal FPGA cores; no need to register inputs. If an output already comes from a register, such as the state register of a state machine, then there is no need to register again.

It is difficult to give a hard and fast rule. It really depends on many factors.
It could:
Increase Fmax by breaking up combinatorial paths
Make place and route easier by allowing the tools to spread logic out in the part
Make partitioning your design easier, allowing for partial rebuilds.
It will not magically solve critical path timing issues. If there is a critical path inside one of your major "blocks", then it will still remain your critical path.
Additionally, you may encounter more problems, depending on how full your design is on the target part.
These things said, I lean to the side of registering outputs only.
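As an illustration of the "register outputs only" style, here is a small made-up block (the name adder_core and its ports are invented): the inputs go straight into the combinational logic and only the result is registered.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity adder_core is
  generic (
    WIDTH : positive := 8  -- hypothetical operand width
  );
  port (
    clk : in  std_logic;
    a   : in  std_logic_vector(WIDTH - 1 downto 0);  -- unregistered inputs
    b   : in  std_logic_vector(WIDTH - 1 downto 0);
    sum : out std_logic_vector(WIDTH downto 0)       -- registered output
  );
end entity adder_core;

architecture rtl of adder_core is
  signal sum_r : unsigned(WIDTH downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- The adder is the combinational logic; its result is captured in the
      -- output register, so downstream blocks see a clean registered signal.
      sum_r <= resize(unsigned(a), WIDTH + 1) + resize(unsigned(b), WIDTH + 1);
    end if;
  end process;

  sum <= std_logic_vector(sum_r);
end architecture rtl;
```

If every block follows this convention, any path between two blocks starts at a register, so a separate input register in the receiving block would add latency without shortening the path.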

Registering all of the inputs and outputs of every internal hardware module in an FPGA design is a bit of overkill. If an output register feeds an input register with no logic between them, then 2x the required registers are consumed. Unless, of course, you're doing logic path balancing.
Registering only inputs and not outputs of every internal hardware module in an FPGA design is a conservative design approach. If the design meets its performance and resource utilization requirements, then this is a valid approach.
If the design is not meeting its performance/utilization requirements, then you've got to do the extra timing analysis in order to reduce the registers in a given logic path within the FPGA.

"My question is whether it is a good idea to routinely register all of the inputs and outputs of every internal hardware module in an FPGA design."
No, it's not a good idea to routinely introduce registers like this.
Doing both inputs and outputs is redundant. There'll be no logic between the output register and the next input register.
If my block contains a single AND gate, it's overkill. It depends on the timing and design complexity.
Register stages need to be properly thought about and designed. What happens when an output FIFO fills, or under other stall conditions? Do all signals have the right register delay so that they appear at the right stage in the right cycle? Adding registers isn't necessarily as simple as it seems.
"The rationale is that we want to break up long chains of combinational logic in order to improve the clock rate so that we can meet the timing constraints for a chosen clock rate. This will add additional cycles of latency proportional to the number of modules that a signal must cross. Is this a good idea or a bad idea?"
In this case it sounds like you must introduce registers, and you shouldn't read the previous points as "don't do it". Just don't do it blindly. Think about the control logic around the registers and the (now) multi-cycle nature of the logic. You are now building a "Pipeline". Being able to stall a pipeline properly when the output can't write is a huge source of bugs.
Think of cars moving on a road. If one car applies its brakes and stops, all cars behind need to as well. If the first car's brake lights aren't working, the next car won't get the signal to brake, and it'll crash. Similarly, each stage in a pipeline needs to tell the previous stage it's stopping for a moment.
What you can find is that instead of having long timing paths along your computation paths going from input to output, you end up with long timing paths on your enable controlling all these register stages from output to input.
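A sketch of one stallable pipeline stage shows where that enable path comes from. The valid/ready port names and the entity name pipe_stage are assumptions made for this example, not something from the question.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity pipe_stage is
  generic (
    WIDTH : positive := 8  -- hypothetical data width
  );
  port (
    clk       : in  std_logic;
    rst       : in  std_logic;  -- synchronous, active high
    in_data   : in  std_logic_vector(WIDTH - 1 downto 0);
    in_valid  : in  std_logic;  -- previous stage offers data
    in_ready  : out std_logic;  -- this stage can take data (the "brake light")
    out_data  : out std_logic_vector(WIDTH - 1 downto 0);
    out_valid : out std_logic;  -- this stage offers data
    out_ready : in  std_logic   -- next stage can take data
  );
end entity pipe_stage;

architecture rtl of pipe_stage is
  signal data_r  : std_logic_vector(WIDTH - 1 downto 0) := (others => '0');
  signal valid_r : std_logic := '0';
  signal ready_i : std_logic;
begin
  -- Accept new data when the register is empty or is being drained this cycle.
  -- Note that ready is combinational: chaining many stages creates a long
  -- path running backwards from the last out_ready to the first in_ready.
  ready_i  <= out_ready or (not valid_r);
  in_ready <= ready_i;

  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        valid_r <= '0';
      elsif ready_i = '1' then
        valid_r <= in_valid;
        data_r  <= in_data;
      end if;
      -- When ready_i is '0' the stage is stalled and holds its contents.
    end if;
  end process;

  out_data  <= data_r;
  out_valid <= valid_r;
end architecture rtl;
```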

Another option you have is to let the tools work for you. Add a bunch of registers at the end of your complete system (more if you want deeper pipelining) and enable retiming in your synthesis tool. This will (hopefully) move the registers into the logic to where they are most useful.
