My design uses a Xilinx FPGA.
The synthesis report shows the following results:
Timing Summary:
---------------
Speed Grade: -3
Minimum period: No path found
Minimum input arrival time before clock: 1.903ns
Maximum output required time after clock: 150.906ns
Maximum combinational path delay: 97.819ns
I do not know whether I should use 150.906 ns or 97.819 ns to calculate throughput.
What is the maximum clock delay?
I'm not sure what you mean by 'throughput' with respect to circuit timing here, but maybe my explanation will give you the right hint.
First of all, the maximum clock delay can be found in the Static Timing Report after Place & Route. However, this figure is mostly meaningless on its own, because one must also take into account the maximum data delay from any input or to any output. That result is already provided by the synthesis report. Please note that this report only provides estimated results; real results are only available from the Static Timing Report.
If you are looking for the maximum clock frequency (the inverse of the minimum clock period), then your synthesis report states that your design does not include a path from one FF to another driven by the same clock ("Minimum period: No path found").
If you want to communicate synchronously with another IC on your PCB, then the other three numbers are relevant. For example, the line "maximum output required time after clock" states that all output signals are valid 151 ns after the clock signal toggles at the input pin (rising or falling edge, depending on your design). If any of these outputs drive the inputs of another IC, and if that IC is driven by the same clock source, then you must add the "minimum input arrival time" of this second IC (found in its data sheet). If this time is, for example, 49 ns, then the minimum period of your shared clock would be (your) 151 ns + 49 ns = 200 ns, which corresponds to 5 MHz.
The same applies to the "minimum input arrival time before clock" of your FPGA design, which must be added to the "maximum output required time" of the driving IC. If this time is, for example, 31 ns, then the minimum period of your shared clock would be 31 ns + (your) 2 ns = 33 ns, which corresponds to about 30 MHz.
In the same way, the "maximum combinational path delay" must be added to the "maximum output required time" of the IC that drives your inputs plus the "minimum input arrival time" of the IC your FPGA is driving. Given the same example figures from above, the minimum period of your shared clock would be 31 ns + (your) 98 ns + 49 ns = 178 ns, which corresponds to about 5.6 MHz.
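As a compact summary of the three cases (the 31 ns and 49 ns figures for the other IC are the example assumptions from the paragraphs above, not values from your report):

FPGA outputs -> other IC:                        150.906 ns + 49 ns         -> ~200 ns -> ~5 MHz
other IC -> FPGA inputs:                         31 ns + 1.903 ns           -> ~33 ns  -> ~30 MHz
other IC -> FPGA combinational path -> other IC: 31 ns + 97.819 ns + 49 ns  -> ~178 ns -> ~5.6 MHz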
More details are explained in the Xilinx Timing Constraints User Guide. Above, I explained the system-synchronous mode.
A more compact representation for Xilinx Vivado is given in Vivado Design Suite User Guide - Using Constraints.
There was also a presentation on this available on the internet earlier, but I can no longer find the source PDF.
Timing Summary:
Speed Grade: -1
Minimum period: 4.979ns (Maximum Frequency: 200.844MHz)
Minimum input arrival time before clock: 1.459ns
Maximum output required time after clock: 0.833ns
Maximum combinational path delay: No path found
This report is obtained from Xilinx ISE 14.7; the same summary cannot be found in Xilinx Vivado. There, after adding constraints with the constraints wizard and computing maximum frequency as 1/(T - WNS), the result is much lower than what ISE reports, even though it is the same design on the same hardware in both tools.
I have a large design for a Spartan-6, simulated in ISim, which uses about six of the Spartan-6 FPGA IP cores. It needs to run for a simulation time of 13 seconds, but at present it takes 40 seconds of wall-clock time to simulate 1 ms. During the 13 seconds it will also write 480,000 24-bit std_logic_vectors to a text file.
This equates to a running time of about 144 hours for the entire simulation (almost a week!).
Is there a way, for example, of increasing the step size or turning off waveform plotting, or are there any other settings I can use to increase the simulation speed?
So far I have tried not plotting the waveform, but it doesn't seem to actually increase the speed.
Thanks very much
Yes, adding signals to the waveform slows every simulator down... but running such long simulations always creates GiB of data and takes hours or days.
You could check your code and:
improve sensitivity lists to reduce unnecessary evaluation cycles;
check whether your IP cores have a fast simulation mode which can be enabled by a generic parameter (see the sketch below).
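For illustration, here is a minimal sketch of both points: a clocked process with a complete (and minimal) sensitivity list, and a SIM_FAST generic that shortens a long start-up counter during simulation. The entity, the generic name and the counter values are assumptions for illustration only; check the documentation of your actual IP core for the name of its simulation-speedup generic.

library ieee;
use ieee.std_logic_1164.all;

entity slow_core_wrapper is
  generic (
    SIM_FAST : boolean := false          -- set to true in the test bench only (assumed name)
  );
  port (
    clk   : in  std_logic;
    start : in  std_logic;
    done  : out std_logic
  );
end entity;

architecture rtl of slow_core_wrapper is
  -- Pick a short counter value for simulation and the real one for hardware.
  function startup_cycles(fast : boolean) return natural is
  begin
    if fast then
      return 16;                         -- finishes almost immediately in simulation
    else
      return 100_000;                    -- real start-up / calibration time
    end if;
  end function;

  constant STARTUP_CYCLES : natural := startup_cycles(SIM_FAST);
  signal   count          : natural range 0 to STARTUP_CYCLES := 0;
begin
  process (clk)                          -- complete sensitivity list: clk only
  begin
    if rising_edge(clk) then
      if start = '1' then
        count <= 0;
        done  <= '0';
      elsif count < STARTUP_CYCLES then
        count <= count + 1;
      else
        done  <= '1';
      end if;
    end if;
  end process;
end architecture;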
But in general there is only one real solution: use another simulator, especially one with optimization (this can be disabled or restricted in free editions). For example:
GHDL - open source and quite fast
QuestaSim / ModelSim
ModelSim, for example, is included for free in Altera Quartus Prime (WebPack) as the Starter Edition.
Active-HDL
The Active-HDL Student Edition is free to use. Alternatively, it is included in Lattice Diamond.
P.S.: 40 seconds for 1 ms (25 µs of simulated time per second) is actually very fast. My integration simulations usually compute about 20 ns per second, so you are more than 1000x faster.
I have designed the SHA-3 algorithm in two ways: combinational and sequential.
The sequential design, i.e. the one with a clock, when synthesized gives a design summary of:
Minimum clock period 1.275 ns and maximum frequency 784.129 MHz.
The combinational one, which is designed without a clock and has been placed between input and output registers, gives a synthesis report of:
Minimum clock period 1701.691 ns and maximum frequency 0.588 MHz.
So I want to ask: is it correct that the combinational design has a lower maximum frequency than the sequential one?
As far as theory is concerned, a combinational design should be faster than a sequential one. But the simulation results I am getting for the sequential design appear only after 30 clock cycles, whereas for the combinational design there is no delay in the output, since there is no clock. In that sense the combinational design is faster, as we get the output instantly, so why is its frequency of operation lower than that of the sequential one? Why is this design slow? Can anyone explain?
The design has been simulated in Xilinx ISE.
Now I have applied pipelining to the combinational logic by inserting registers between the 5 main blocks which do the computation. These registers are clocked, so the pipelined design now gives a design summary of:
clock period 1.575 ns and frequency 634.924 MHz;
minimum period 1.718 ns and frequency 581.937 MHz.
So this 1.575 ns is now the delay between any two adjacent registers; it is not the propagation delay of the entire algorithm. How can I calculate the propagation delay of the entire pipelined algorithm?
What you are seeing is pipelining and its performance benefits. In the combinational circuit, each input has to pass through the propagation delays of the entire algorithm, which can take up to 1701.691 ns on the FPGA you are working with, because the slowest critical path in the combinational circuitry needed to calculate the result takes that long. Your simulator is not telling you everything: a behavioral simulation does not show gate propagation delays, so you just see the instant calculation of your combinational function in the simulation.
In the sequential design, you have multiple smaller steps, the slowest of which takes 1.275ns in the worst case. Each of those steps might be easier to place-and-route efficiently, meaning that you get overall better performance because of the improved routing of each step. However, you will need to wait 30 cycles for a result, simply because the steps are part of a synchronous pipeline. With the correct design, you could improve this and get one output per clock cycle, with a 30-cycle delay, by having a full pipeline and passing data through it at every clock cycle.
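Regarding the follow-up question about the pipelined version: as a first-order estimate (ignoring routing details and the I/O registers, so treat these numbers as assumptions), the end-to-end latency of a pipeline is roughly the number of stages times the clock period. With the 5 stages and the 1.718 ns minimum period reported above, that is about 5 × 1.718 ns ≈ 8.6 ns from the input registers to the output registers, while the throughput is one result every 1.718 ns once the pipeline is full. For comparison, the 30-cycle sequential design needs about 30 × 1.275 ns ≈ 38.3 ns per result.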
I'm trying to synthesize a simple project in ISE for a Spartan-6.
When I use the Clocking Wizard to generate a 40 MHz clock (from a 100 MHz external oscillator), XST says:
Timing Summary:
Speed Grade: -3
Minimum period: 9.482ns (Maximum Frequency: 105.458MHz)
Minimum input arrival time before clock: 2.623ns
Maximum output required time after clock: 3.597ns
Maximum combinational path delay: 5.194ns
OK, but when I change the clock frequency in the core generator to 100 MHz, the reported maximum frequency is only about 47 MHz...
What is wrong?
What is the right way to determine max frequency?
The maximum frequency reported by synthesis is only a rough estimate based on fanout, LUT levels, I/O buffers, and so on.
The real timing analysis is done after Place & Route.
I have a project which already uses synthesis timing constraints (an additional XCF file), where XST reports f_max = 82 MHz. After P&R the design achieves 152 MHz :)
I have a clock of 100 MHz. I want to use a DCM to create a clock of 78 MHz.
I think I should use two DCMs, with the output of the first DCM feeding the second, but I don't know if this will work.
Best Regards
Rather than using a DCM directly you can investigate using a Direct Digital Frequency Synthesizer (DDFS). It amounts to an accumulator that is incremented by a constant count value. You can control the precision by the size of the accumulator.
It is helpful if there is as much disparity between the accumulator clock and the generated frequency as possible. Consider using a DCM to scale the 100 MHz up to the highest speed at which you can run a counter of the necessary width and still meet timing for your target device. There will be some jitter, equal to one period of whatever clock is driving the accumulator, but the average frequency can be made very close to 78 MHz.
accum_freq = 100 MHz * DCM_MULTIPLIER
accum_size = ceil(log2(accum_freq / (78 MHz * tolerance)))
increment = 78 MHz / accum_freq * 2**accum_size
accum = accum + increment
You then tap off the MSB of the accumulator to get your synthesized 78 MHz clock.
You can either manually compute these constants for use as magic numbers or do the arithmetic natively in VHDL to define the size and increment as machine-computed constants. By reducing the tolerance you will increase the required size of the accumulator. Start off with 0.01% (0.0001) and see if it is satisfactory.
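Here is a minimal VHDL sketch of this accumulator, with the arithmetic done as machine-computed constants. The generic values are assumptions for illustration (a 400 MHz accumulator clock, i.e. the 100 MHz input scaled up by 4 in a DCM, a 78 MHz target and a 0.01% tolerance); adapt them to whatever accumulator clock your device can actually meet timing at. Note that some older tools may not accept ieee.math_real in synthesized code; in that case, compute the two constants by hand. Also remember that the MSB output is a generated clock with one accumulator-clock period of jitter, as described above.

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use ieee.math_real.all;                  -- ceil and log2 for the constant calculation

entity ddfs is
  generic (
    ACCUM_FREQ  : real := 400.0e6;       -- accumulator clock in Hz (assumed: 100 MHz * 4)
    TARGET_FREQ : real := 78.0e6;        -- synthesized frequency in Hz
    TOLERANCE   : real := 0.0001         -- 0.01 % frequency tolerance
  );
  port (
    clk     : in  std_logic;             -- the ACCUM_FREQ clock from the DCM
    clk_out : out std_logic              -- average frequency close to TARGET_FREQ
  );
end entity;

architecture rtl of ddfs is
  -- accum_size = ceil(log2(accum_freq / (target_freq * tolerance)))
  constant ACCUM_SIZE : natural :=
    natural(ceil(log2(ACCUM_FREQ / (TARGET_FREQ * TOLERANCE))));
  -- increment = target_freq / accum_freq * 2**accum_size
  constant INCREMENT  : natural :=
    natural(TARGET_FREQ / ACCUM_FREQ * 2.0 ** ACCUM_SIZE);

  signal accum : unsigned(ACCUM_SIZE - 1 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      accum <= accum + INCREMENT;        -- wraps around naturally
    end if;
  end process;

  clk_out <= accum(ACCUM_SIZE - 1);      -- tap the MSB as the synthesized clock
end architecture;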
What device are you targeting? On a Spartan-6, the DCM_CLKGEN primitive allows a multiplier of 39 and a divider of 50, which gets you exactly your 78 MHz (100 MHz × 39/50).
If you set your multiplier to 7 and your divider to 9, you'll be able to get about 77.78 MHz (100 MHz × 7/9). Will that work for you?
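For reference, here is a minimal sketch of instantiating the Spartan-6 DCM_CLKGEN primitive with the 39/50 ratio suggested above. The generic and port names are taken from the UNISIM library as I remember them; verify them against the Spartan-6 Libraries Guide, and note that the wrapper entity and signal names are just placeholders for illustration.

library ieee;
use ieee.std_logic_1164.all;
library unisim;
use unisim.vcomponents.all;

entity clk78_gen is
  port (
    clk100_in : in  std_logic;           -- 100 MHz input clock
    rst       : in  std_logic;
    clk78_out : out std_logic;           -- ~78 MHz generated clock
    locked    : out std_logic
  );
end entity;

architecture rtl of clk78_gen is
  signal clkfx_unbuf : std_logic;
begin
  dcm_clkgen_inst : DCM_CLKGEN
    generic map (
      CLKIN_PERIOD    => 10.0,           -- 100 MHz input -> 10 ns period
      CLKFX_MULTIPLY  => 39,             -- M
      CLKFX_DIVIDE    => 50,             -- D: 100 MHz * 39 / 50 = 78 MHz
      SPREAD_SPECTRUM => "NONE"
    )
    port map (
      CLKIN     => clk100_in,
      CLKFX     => clkfx_unbuf,
      CLKFX180  => open,
      CLKFXDV   => open,
      LOCKED    => locked,
      PROGDONE  => open,
      STATUS    => open,
      FREEZEDCM => '0',                  -- dynamic reprogramming unused here
      PROGCLK   => '0',
      PROGDATA  => '0',
      PROGEN    => '0',
      RST       => rst
    );

  -- Place the generated clock on a global clock net before using it.
  bufg_inst : BUFG
    port map (
      I => clkfx_unbuf,
      O => clk78_out
    );
end architecture;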