I am trying to learn VHDL and am struggling with some of its basics. My question is as follows:
A process statement is described as containing code that runs sequentially (one line after the other). Why can't one run concurrent code in a process statement (meaning all lines execute in parallel)? Secondly, if a process statement contains sequential code, how can it model, for example, three flip-flops concurrently, e.g.:
--inside process statement
Q1 <= D1;
Q2 <= Q1;
Q3 <= Q2;
Sequential relates to the order the statements are evaluated, not when the assignment takes effect.
The VHDL Simulation Cycle
Signal assignments don't take effect immediately. They are scheduled for the current or a future simulation time, and all processes that resume in the current simulation cycle run until they suspend before any of those assignments takes effect. (In VHDL, everything devolves into an equivalent block hierarchy, processes, and function calls for simulation.)
When all currently active processes have completed, simulation time advances to the next time at which a signal is active in any signal's projected output waveform (a queue). If there are events at the current simulation time, time does not advance, and the next simulation cycle is called a delta cycle.
Each process that is sensitive to a signal's transactions is executed and any further signal assignments are made to the respective projected output waveform. There is only one 'slot' in the queue for the current simulation time for each signal.
In this way, no process hits a moving target. Only one process executes at a time, and no signal assignment takes effect until all processes have completed execution. This emulates concurrency: processes containing sequential statements are executed one after another, yet the result mimics parallel execution.
An assignment such as Q1 <= D1; is equivalent to Q1 <= D1 after 0 ns; meaning the current simulation time (one delta cycle later). If a series of sequential statements in a process contains a subsequent assignment to the same signal at the current simulation time and the assigned value is different, the second assignment will replace the first one in the projected output waveform.
When there are no more events scheduled for signals at the current simulation time, simulation time advances to the earliest transaction time found in any projected output waveform queue.
When there are no further queue events simulation time will advance to Time'HIGH (the highest possible simulation time) and simulation will cease.
Also simulation can be stopped by an implementation controlling how long to allow the simulation to run or by execution of an assertion statement with a SEVERITY LEVEL of FAILURE or an implementation defined severity level threshold for stopping simulation.
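To tie this back to the three flip-flops in the question, here is a minimal sketch of a complete design unit. The entity name shift3 and the port names clk and Q3 as an output are my own assumptions; only D1, Q1, Q2 and Q3 come from the question. Because every right-hand side is read before any of the scheduled updates happens, the three assignments describe three flip-flops in series, and their textual order does not matter.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity; only D1, Q1, Q2, Q3 come from the question.
entity shift3 is
  port (
    clk : in  std_logic;
    D1  : in  std_logic;
    Q3  : out std_logic
  );
end entity shift3;

architecture rtl of shift3 is
  signal Q1, Q2 : std_logic := '0';
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Each right-hand side is the value the signal had when the
      -- process resumed. The updates are applied only after the
      -- process has suspended, so Q2 gets the OLD value of Q1.
      Q1 <= D1;
      Q2 <= Q1;
      Q3 <= Q2;
    end if;
  end process;
end architecture rtl;
```

Writing the three assignments in the reverse order should simulate and synthesize to the same thing, which is a quick way to convince yourself that "sequential" here is about evaluation order, not about when the signals change.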
Related
I have a piece of code like this in a process:
A <= '1';
A <= '0' after 5 sec;
Does it set A to 1 at first and then set A to 0 after 5 seconds? If not, what should I tweak?
No. It does this:
i) Schedule the setting of A to '1' on the next simulation (or delta) cycle.
ii) No. On second thoughts, don't do that. Instead, schedule the setting of A to '0' in 5 seconds time.
When a signal assignment operator is executed in VHDL, it does not drive the signal immediately. Instead it schedules a change on the signal (called an event) to be actioned some time in the future. If you do not specify a delay, then the event will be actioned on the next simulation (or delta) cycle. If VHDL encounters another signal assignment to the same signal before it has actioned any previous ones, the corresponding events (usually) get deleted and replaced with the new one.
That might sound daft, but it is for good reasons. Replace your code with:
A <= '1';
wait for 5 sec;
A <= '0';
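For completeness, here is how that could look as a whole simulation-only process. This is a sketch: the entity name pulse_tb and the label stimulus are made up, and I am assuming A is a std_logic signal. Note that a process containing wait statements must not have a sensitivity list, and "wait for" is meant for testbenches, not for synthesis.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical testbench wrapper around the three statements above.
entity pulse_tb is
end entity pulse_tb;

architecture sim of pulse_tb is
  signal A : std_logic := '0';
begin
  stimulus : process  -- no sensitivity list, because we use wait
  begin
    A <= '1';         -- takes effect one delta cycle from now
    wait for 5 sec;   -- process suspends; simulation time advances
    A <= '0';         -- takes effect at 5 sec (plus one delta)
    wait;             -- suspend forever so the process does not loop
  end process stimulus;
end architecture sim;
```

If you would rather keep it as a single statement, a waveform with two elements, A <= '1', '0' after 5 sec;, schedules both transactions in one assignment, so neither one deletes the other.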
I have designed an algorithm (SHA-3) in two ways: combinational and sequential.
The sequential design, which uses a clock, gives the following design summary when synthesized:
Minimum clock period 1.275 ns and Maximum frequency 784.129 MHz.
While the combinational one which is designed without clock and has been put between input and output registers is giving synthesis report as
Minimum clock period 1701.691 ns and Maximum frequency 0.588 MHz.
So I want to ask: is it correct that the combinational design has a lower frequency than the sequential one?
As far as theory is concerned, a combinational design should be faster than a sequential one. In simulation, the sequential design produces its result after 30 clock cycles, whereas the combinational design shows no delay in the output, as there is no clock. In that sense the combinational design is faster, since we get the output instantly, so why is its frequency of operation lower than that of the sequential design? Can anyone explain why this design is slow?
The design has been simulated in Xilinx ISE
Now I have applied pipe-lining to the combinational logic by inserting the registers in between the 5 main blocks which are doing the computation. And these registers are controlled by clock so now this pipelined design is giving design summary as
clock period 1.575 ns and freq 634.924 MHz
Min period 1.718 ns and freq 581.937.
So now this 1.575 ns is the delay between any two registers, not the propagation delay of the entire algorithm. How can I calculate the propagation delay of the entire pipelined algorithm?
What you are seeing is pipelining and its performance benefits. The combinational circuit makes each input go through the propagation delays of the entire algorithm, which takes up to 1701.691 ns on the FPGA you are working with, because the critical (slowest) path through the combinational circuitry needed to calculate the result can take that long. Your simulator is not telling you everything, since a behavioral simulation does not show gate propagation delays. You'll just see the instant calculation of your combinational function in your simulation.
In the sequential design, you have multiple smaller steps, the slowest of which takes 1.275ns in the worst case. Each of those steps might be easier to place-and-route efficiently, meaning that you get overall better performance because of the improved routing of each step. However, you will need to wait 30 cycles for a result, simply because the steps are part of a synchronous pipeline. With the correct design, you could improve this and get one output per clock cycle, with a 30-cycle delay, by having a full pipeline and passing data through it at every clock cycle.
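As a rough worked example using only the numbers from the question, and assuming the sequential design is actually clocked at its reported minimum period, the time from input to result is the cycle count times the clock period:

```latex
t_{\text{latency}} = N_{\text{cycles}} \times T_{\text{clk}}
                   = 30 \times 1.275\,\text{ns}
                   = 38.25\,\text{ns}
```

That is far less than the 1701.691 ns reported for the purely combinational version. The same formula applies to the pipelined variant: multiply the number of register stages an input passes through (not stated in the question, so I won't guess it) by the clock period you actually run at, which can be no shorter than the reported minimum period.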
How can I access four elements of a 2D array (or array of arrays) in one process at the same time?
In this sample I am trying to access intg1 in several places at the same time, and the synthesis is taking forever.
type img_whole is array (78 downto 0, 130 downto 0) of std_logic_vector(7 downto 0);
signal img1 : img_whole;
signal i1_1 : integer range 0 to 79 := 0;
signal j1_1 : integer range 0 to 131 := 0;
type intg is array (78 downto 0, 130 downto 0) of integer range 0 to 1751998; --no double??
signal intg1 : intg;

integral : process (clka, finished, finished1)
  variable tempo : integer range 0 to 1751998;
begin
  if clka'event and clka = '1' then
    if finished = "1" and finished1 = "0" then
      if i1_1 < 78 and j1_1 < 130 then
        j1_1 <= j1_1 + 1;
      elsif j1_1 = 130 and i1_1 < 78 then
        j1_1 <= 0;
        i1_1 <= i1_1 + 1;
      elsif j1_1 < 130 and i1_1 = 78 then
        j1_1 <= j1_1 + 1;
      elsif j1_1 = 130 and i1_1 = 78 then
        finished1 <= "1";
      end if;
      tempo := to_integer(unsigned('0' & img1(i1_1, j1_1)));
      if i1_1 - 1 >= 0 then
        tempo := intg1(i1_1 - 1, j1_1) + tempo;
      end if;
      if j1_1 - 1 >= 0 then
        tempo := intg1(i1_1, j1_1 - 1) + tempo;
      end if;
      if i1_1 - 1 >= 0 and j1_1 - 1 >= 0 then
        tempo := tempo - intg1(i1_1 - 1, j1_1 - 1);
      end if;
      intg1(i1_1, j1_1) <= tempo;
    end if;
  end if;
end process;
This code computes an integral image from a 2D array.
There are both functional and synthesis issues in the code.
Functional issues:
finished1 is only driven to '1' in the process, but never to '0', so if the initial value is '0' then the operation in the process can only be done once after power up, since the finished1 value of '1' will then inhibit further updates due to the process enable condition.
i1_1 and j1_1 are signals that are driven at the start of the process and then used later in the process, but since they are signals, the value assigned with <= is not available until the next process evaluation. Is that intentional?
Use a simulator to ensure correct functionality, which can be done before synthesis.
Synthesis issues:
intg1 is a table with 79 * 131 = 10349 entries (more than 10 K), each needing ceil(log2(1751999)) = 21 bits, thus a pretty large table. The design requires asynchronous lookup in the table, since there is no extra cycle (clock edge) available between a new value of an index, e.g. i1_1, and the point where the output of the process is generated based on the table lookup. An asynchronous lookup in a large table requires a huge mux network, which is probably the reason for the long synthesis time. And this lookup is even done multiple times based on different index values.
Minor: finished, and finished1 are not needed in the sensitivity list of the process, since this is a process clocked by the clka.
The above list of issues may not be complete.
To fix the table lookup problem (first synthesis issue), make a pipe-lined design with cycles e.g.:
1. Index values i1_1 etc. are generated.
2. The intg1 table is looked up synchronously.
3. Intermediate tempo is generated, and intg1 is updated.
The current design does steps 2 and 3 in a single cycle, so it is not possible to make a synchronous lookup in the table: there is only one clock edge in the cycle, and it is used for writing back to the intg1 table. By splitting the lookup and the write-back into two cycles, there is a clock edge available both for reading the table (synchronous read) and for writing it. A synchronous read on a clock edge maps much better onto the hardware resources of typical FPGAs, since these contain large synchronous RAMs similar to the intg1 table, so the implementation will be smaller and faster. The synchronous intg1 lookup is made by simply adding a clocked process in which signals are driven directly by the intg1 output based on the required index values. All the required reads must be made, and the subsequent process can then determine which of the read values are actually used.
The specific pipeline implementation must be adapted to the design requirements.
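As a minimal sketch of what a synchronous table could look like, here is a single-clock memory with one write port and one registered read port. Everything in it is an assumption for illustration, not the asker's design: the entity name intg_ram, the port names, the flattening of the 2D index into one address (for example i * 131 + j), and the 21-bit data width chosen to hold 0 to 1751998. Whether a tool maps this onto block RAM depends on the tool and device, so check the synthesis report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical RAM wrapper; names and sizes are illustrative only.
entity intg_ram is
  port (
    clka    : in  std_logic;
    we      : in  std_logic;
    wr_addr : in  integer range 0 to 79 * 131 - 1;
    wr_data : in  unsigned(20 downto 0);
    rd_addr : in  integer range 0 to 79 * 131 - 1;
    rd_data : out unsigned(20 downto 0)
  );
end entity intg_ram;

architecture rtl of intg_ram is
  type ram_t is array (0 to 79 * 131 - 1) of unsigned(20 downto 0);
  signal ram : ram_t;
begin
  process (clka)
  begin
    if rising_edge(clka) then
      if we = '1' then
        ram(wr_addr) <= wr_data;  -- write-back (step 3)
      end if;
      -- Registered read (step 2): rd_data is valid one clock after
      -- rd_addr is presented, so no large asynchronous mux is needed.
      rd_data <= ram(rd_addr);
    end if;
  end process;
end architecture rtl;
```

This sketch has only one read port, while the integral image needs up to three previously computed neighbours per pixel. Those reads would have to be spread over several cycles, served from several such memories, or replaced by small registers that remember recently written values; which option fits depends on the design requirements.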
I am confused about how values are assigned to signals in VHDL.
I have read that values are assigned to signals at the end of the process.
Does the value get assigned right when the process finishes, or when the process is triggered the next time?
If it is assigned at the end of the process, then consider this scenario: three flip-flops in series, i.e. the output of one flip-flop is the input to the next. If D1 is 1 at time 0, will the output Q3 not be 1 at the same time?
(1) Right when the process finishes. More precisely, right after this and ALL processes running alongside this process have finished, and before any processes are subsequently started. So when any signal assignment happens, no process is running.
(2) Q3 will become the value on D1 three clock cycles earlier. Whether that value was '1' or not I can't tell from your question!
The signal assignment is done only at the end of the process. After signal assignment, there may exist signal updates and because of the signal updates, the process itself or maybe other processes which are sensitive to some of the updated signals will be triggered. This is the concept of delta-cycle. It happens in a zero simulation time.
signal updates -> triggers process -> at the end of the process, signals are updated
----------------------------------    ----------------------------------------------
this is one delta cycle               starting of the second delta cycle
When there are no more signal updates, the process finishes and the simulation time increments.
From what I understand, all statements inside a PROCESS are executed sequentially. So what happens to a concurrent signal assignment (<=)? Does it work the same way as a sequential assignment (:=), or does it execute after a delta delay?
If it executes after a delta delay, then how can all the statements inside PROCESS be called sequential?
If it executes immediately, then is there any difference between := and <= in a process?
The signal assignment (<=) takes effect after all the sequential code in the processes has finished executing, that is, when all the active processes for that timestep are done.
As an example why this is:
Suppose you have an event that triggers 2 processes. These 2 processes
use the same signal, but one of them changes the value of that
signal. The simulator is only able to perform one process at a
time due to a sequential simulation model (not to be confused with the
concurrent model of VHDL). So if process A is simulated first and A
changes the signal, B would have the wrong signal value. Therefore the
signal can only be changed after all the triggered processes are done.
The variable assignment (:=) executes immediately and can be used, e.g., to temporarily store some data inside a process.
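A small sketch of the two-process situation described above. The names swap_demo, proc_a, proc_b, x, y, x_o and y_o are all made up for illustration. Both processes are triggered by the same clock edge, and each one reads the signal that the other one assigns. Because the updates are deferred until both processes have suspended, the result should not depend on which process the simulator happens to run first.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: two processes exchanging values every clock.
entity swap_demo is
  port (
    clk : in  std_logic;
    x_o : out std_logic;
    y_o : out std_logic
  );
end entity swap_demo;

architecture rtl of swap_demo is
  signal x : std_logic := '0';
  signal y : std_logic := '1';
begin
  proc_a : process (clk)
  begin
    if rising_edge(clk) then
      x <= y;  -- reads y as it was before this clock edge
    end if;
  end process proc_a;

  proc_b : process (clk)
  begin
    if rising_edge(clk) then
      y <= x;  -- reads x as it was before this clock edge,
               -- even if proc_a has already executed
    end if;
  end process proc_b;

  x_o <= x;
  y_o <= y;
end architecture rtl;
```

If <= updated x immediately, proc_b would see the new x whenever proc_a ran first, and both signals would end up with the same value. With deferred updates, x and y simply swap on every rising edge.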
Sequential signal assignment (<=), as opposed to sequential variable assignment (:=), sequentially schedules an event one delta delay later for the value of the signal to be updated. You can change the scheduled event by using a sequential signal assignment on the same signal in the same process. Only the last update scheduled on a particular signal will occur. For example:
signal a : std_logic := '1'; --initial value is 1
process(clk)
  variable b : std_logic;
begin
  --note that the variable assignment operator, :=, can only be used to assign the value of variables, never signals
  --Likewise, the signal assignment operator, <=, can only be used to assign the value of signals.
  if (clk'event and clk='1') then
    b := '0'; --b is made '0' right now.
    a <= b; --a will be made the current value of b ('0') at time t+delta
    a <= '0'; --a will be made '0' at time t+delta (overwrites previous event scheduling for a)
    b := '1'; --b is made '1' right now. Any future uses of b will be equivalent to replacing b with '1'
    a <= b; --a will be made the current value of b ('1') at time t+delta
    a <= not(a); --at time t+delta, a will be inverted. None of the previous assignments to a matter, their scheduled events have been overwritten
    --after the end of the process, b cannot be used outside of the process; it keeps its value until the process runs again, where it is overwritten by the first assignment above
  end if;
end process;
It is also important to note that while sequential processes operate sequentially from a logical perspective in the VHDL, when synthesized, they are really turned into complex concurrent statements connecting flip-flops. The entire process runs concurrently as a unit between every clock cycle (processes that don't operate on a clock become pure combinational logic). Signals are the values that are actually stored in the flip-flops. Variables are mostly just aliases that make processes easier to read. In typical usage, where a variable is written before it is read, they are absorbed into combinational logic after synthesis.