How many processes can there be in a behavioural VHDL architecture? - vhdl

I want to slow down clk and take input.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;  -- provides unsigned and "+"

entity q1 is
    port (
        clk : in std_logic;
        a0,a1,a2,a3,a4,a5,a6,a7,a8,a9 : in std_logic_vector(3 downto 0);
        b0,b1,b2,b3,b4,b5,b6,b7,b8,b9 : in std_logic_vector(3 downto 0);
        y0,y1,y2,y3,y4,y5,y6,y7,y8,y9 : out std_logic_vector(6 downto 0)
    );
end q1;

architecture Behavioral of q1 is
    signal counter : unsigned(9 downto 0) := (others => '0');
    signal clk_en  : std_logic := '0';
begin
    -- clk_en is high for one clk period every 1024 clk cycles
    process (clk)
    begin
        if rising_edge(clk) then
            counter <= counter + 1;
            if counter = 0 then
                clk_en <= '1';
            else
                clk_en <= '0';
            end if;
        end if;
    end process;
end Behavioral;

An elaborated VHDL design executes processes
IEEE Std 1076-2008:
11. Concurrent statements, 11.1 General, para 1:
...Concurrent statements are used to define interconnected blocks and processes that jointly describe the overall behavior or structure of a design. ...
And
Elaboration and execution 14.1 General:
The process by which a declaration achieves its effect is called the elaboration of the declaration. After its elaboration, a declaration is said to be elaborated. Prior to the completion of its elaboration (including before the elaboration), the declaration is not yet elaborated.
Elaboration is also defined for design hierarchies, declarative parts, statement parts (containing concurrent statements), and concurrent statements. Elaboration of such constructs is necessary in order ultimately to elaborate declarative items that are declared within those constructs.
In order to execute a model, the design hierarchy defining the model shall first be elaborated. Initialization of nets (see 14.7.3.4) in the model then occurs. Finally, simulation of the model proceeds. Simulation consists of the repetitive execution of the simulation cycle, during which processes are executed and nets updated.
14.2 Elaboration of a design hierarchy, para 1:
The elaboration of a design hierarchy creates a collection of processes interconnected by nets; this collection of processes and nets can then be executed to simulate the behavior of the design.
Every concurrent statement is elaborated as a process (process statements, concurrent procedure calls, concurrent assertion statements, concurrent signal assignments) or hierarchy of blocks and processes (generate statements, component instantiations and block statements).
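As a minimal sketch of what "elaborated as a process" means for a concurrent signal assignment, the two statements below describe the same behavior. The entity name equiv_demo and its ports are made up for illustration only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity, used only to illustrate the equivalence.
entity equiv_demo is
    port (
        a, b   : in  std_logic;
        y1, y2 : out std_logic
    );
end entity equiv_demo;

architecture rtl of equiv_demo is
begin
    -- Concurrent signal assignment: elaborated as an implicit process
    -- that is sensitive to the signals read on the right-hand side.
    y1 <= a and b;

    -- Equivalent explicit process: executes once at initialization,
    -- then suspends at the wait statement until an event on a or b.
    equivalent : process
    begin
        y2 <= a and b;
        wait on a, b;
    end process equivalent;
end architecture rtl;
```

After elaboration this architecture contains two processes, even though only one process statement was written.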
A process is not a routine; it isn't called. Rather, it is suspended and resumed. After its last statement it wraps around to the first (a goto or jump, not a call).
How many processes a model can hold isn't a matter of whether it is structural or behavioral. All VHDL models are behavioral; the distinction between the two is style, not execution.
Along with a resumption address, process resumption is controlled by sensitivity to events or by simulation time. When a process suspends is controlled by the algorithm it implements.
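A sketch of an architecture with more than one process, assuming the goal in the question is to act on an input at a slowed-down rate. The entity two_procs and the ports d and q are hypothetical. The first process suspends implicitly through its sensitivity list, the second suspends explicitly at a wait statement.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: d and q stand in for the real inputs and outputs.
entity two_procs is
    port (
        clk : in  std_logic;
        d   : in  std_logic_vector(3 downto 0);
        q   : out std_logic_vector(3 downto 0)
    );
end entity two_procs;

architecture behavioral of two_procs is
    signal counter : unsigned(9 downto 0) := (others => '0');
    signal clk_en  : std_logic := '0';
begin
    -- Process 1: resumed by any event on clk (sensitivity list).
    -- It runs to the last statement, suspends, and restarts from the top.
    divider : process (clk)
    begin
        if rising_edge(clk) then
            counter <= counter + 1;
            if counter = 0 then
                clk_en <= '1';   -- one pulse every 1024 clk cycles
            else
                clk_en <= '0';
            end if;
        end if;
    end process divider;

    -- Process 2: no sensitivity list; it suspends at the wait statement
    -- and resumes on each rising edge of clk.
    sampler : process
    begin
        wait until rising_edge(clk);
        if clk_en = '1' then
            q <= d;              -- take the input only when enabled
        end if;
    end process sampler;
end architecture behavioral;
```

Both processes use the same clock, and the slow rate comes from the enable rather than from a second clock signal.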
simulation
14.7 Execution of a model
14.7.1 General, para 1:
The elaboration of a design hierarchy produces a model that can be executed in order to simulate the design represented by the model. Simulation involves the execution of user-defined processes that interact with each other and with the environment. ...
Execution of a process's statements can occur concurrently with that of any other process in a model. There is no guaranteed execution order, and if processes aren't actually executed concurrently the simulation cycle treats them as if they were.
No signal is updated while any process is executing. When all processes have suspended, the signal assignments scheduled in projected output waveforms are evaluated and the next time at which a signal update is scheduled is determined. Simulation time is advanced to that time. A signal assignment scheduled to occur after 0 simulation time units occurs in the next delta cycle. When there are no further updates scheduled, simulation time is advanced to the maximum time value and simulation ends.
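The delta-cycle behavior described above can be observed with a small self-checking testbench. This is a sketch with made-up names (delta_demo, s, stim), not part of the original design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical testbench: no ports, simulation only.
entity delta_demo is
end entity delta_demo;

architecture sim of delta_demo is
    signal s : std_logic := '0';
begin
    stim : process
    begin
        s <= '1';        -- schedules an update; s itself has not changed yet
        assert s = '0'
            report "s changed while the process was still executing"
            severity failure;

        wait for 0 ns;   -- suspend; resume in the next delta cycle
        assert s = '1'
            report "s was not updated after the delta cycle"
            severity failure;

        wait;            -- suspend forever; nothing else is scheduled
    end process stim;
end architecture sim;
```

Neither assertion should fire: the new value only becomes visible after the process has suspended and the signal update has taken place.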
how many processes
How many processes a model can contain is an implementation limitation, based mostly on the CPU architecture (address space, size of address pointers). It depends on the total number of addressable things (model code and its size) versus the size of pointers, on the amount of executable code needed to support the simulation (the host operating system routines in the simulation model's memory space), and on the host's ability to deal with virtual memory.
Address space is commonly associated with a CPU's word size. It's implementation dependent and not a VHDL language definition limitation.
Implementation limitations affect portability.
portability
Annex D (informative), Potentially nonportable constructs, para 1:
This annex lists those VHDL constructs whose use may result in nonportable descriptions.
A description is considered portable if it
    a) Compiles, elaborates, initializes, and simulates to termination of the simulation cycle on all conformant implementations, and
    b) The time-variant state of all signals and variables in the description are the same at all times during the simulation,
under the condition that the same stimuli are applied at the same times to the description. The stimuli applied to a model include the values supplied to generics and ports at the root of the design hierarchy of the model, if any.
VHDL design specifications can be non-portable between implementations because of implementation limitations. Annex D enumerates the non-portable constructs that are defined in the language definition, which does not include implementation limitations.

Related

There are 2 controls whether/when a signal is handled, is that right?

In Linux, after a signal is generated asynchronously, there are a couple of controls that determine whether/when the signal is handled:
1. signal mask
This mask can be controlled by user programming.
2. task_struct's state can be TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE which allows or blocks signal
My impression is this is controlled by kernel, so user programming cannot control.
And there is no 3rd control whether/when a signal is handled.
Is my above understanding correct?

How does Vivado ensure that signals do not transition unpredictably when they are sampled by a clock edge and how does this relate to simulation?

I wrote some VHDL code and I wrote a simple sequential code defining a clock sensitive process. Whenever the clock rises from low to high I check another signal in the architecture and I do stuff depending on its value. Nevertheless, this signal transitions at the same time instant as the rising edge of the clock occurs.
In simulation, when the rising edge occurs, the system always samples the signal value before its transition. My question is: how does this work out once the code is implemented on the corresponding FPGA? Does it produce unpredictable sampling of the signal value? Do you advise always avoiding this type of scenario within a VHDL architecture?

FPGA Will pausing entities (by pausing their clock input) reduce the overall power consumption?

I'm currently creating a multiple entity project where all of the entities have clock synchronous architecture (no behaviourals) and most of the entities work on derived clocks.
I'm using DE0 Nano, so my source clock is 50MHz and I have 4 derived clocks: 1 MHz, 500 kHz, 10 Hz and 1 Hz.
Disclaimer: While I am aware that doing things this way is much less power-efficient, I've been wondering if there is something I could do to remedy this at least a little (I'm open to ideas).
Now, in the top-level entity I have an "event handler", which can decide which entities should work at any given moment.
Therefore I came up with an idea to wire an on/off clock switch for the derived clock input signals to the lower level entities and disable some of them (the clock inputs) when I don't need a given entity to work for a while (as I understand, this should stop their processes from firing for that time).
Since I don't have an easy way to test that idea (I estimate it will take a moderate amount of work and time, especially setting up the power consumption measurement) I wanted to ask whether anyone tried something similar and/or knows if it's worth a shot?
For your information, currently when the entities are in "sleep" mode, their processes fire on each rising clock edge, check an internal state or flag variable/signal and stop e.g.:
process (clk) is  -- clk is the 1 MHz derived clock
    variable ...
begin
    if rising_edge(clk) then
        if state = ready and enable = '1' then
            ...
        end if;
    end if;
end process;
Or maybe there is another, better way to do it?

How to work with DDR in synthesizeable Verilog/VHDL?

I am working on implementing a DDR SDRAM controller for a class and am not allowed to use the Xilinx MIG core.
After thrashing with the design, I am currently working synchronously to my system clock at 100MHz and creating a divided signal "clock" (generated using a counter) that is sent out on the IO pins to DDR SDRAM. I have some logic that feeds me the "rising" edge strobes of this signal clock as I am aware that I cannot use a signal to clock a process. However, this divided clock method runs very slow and I have concerns that I am not meeting the minimum required frequency of the external DDR SDRAM. I am hoping to speed up my read/write bursts, but to do so, my spartan3e will struggle with anything higher than 100MHz. After looking around online, I found this code from EDA Board:
process(Input_Clk, Reset_Control)
begin
    if (Reset_Control = '1') then
        Output_Data <= (others => '0');
    elsif rising_edge(Input_Clk) then
        Output_Data <= Input_Data1;
    elsif falling_edge(Input_Clk) then
        Output_Data <= Input_Data2;
    end if;
end process;
I have written a lot of VHDL, but have never seen something like this before. I'm sure this works fine in simulation, but it doesn't look synthesizable to me. The poster said this should be supported by 1076.6-2004. Does this infer two flip-flops, one clocked on the rising edge and one on the falling edge whose outputs both feed into a 2:1 mux? Does Xilinx support this? I want to avoid having to instantiate a DCM as crossing these clock domains will definitely slow me down and will add undesired complexity. Is there a way to safely generate my DDR data that is being sent to and received from DDR SDRAM without the Xilinx primitive for the MIG? How would I perform the receiving of DDR data in Verilog?
For reference, we have to code in Verilog, so I'm not too sure on how to translate that VHDL process to a Verilog always block if it is synthesizable. We are using the Micron MT46V32M16 if that is relevant.
Here are the timing diagrams for what I am trying to replicate:
I would say that implementing a DDR controller 'for class' is rather challenging. In the companies I worked for, they were left for senior engineers to build.
First, about the VHDL code shown:
Yes, you are right that it cannot be synthesized.
The approach to double-clocking inputs is to have two data paths. One on the rising edge and one on the falling edge. In a second stage the two are put in parallel but with double the data width. Thus a 32-bit wide DDR produces 64 data bits per 'system' clock.
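A rough sketch of the two-path capture described above, assuming a single capture clock clk that is already aligned to the incoming data. The entity ddr_capture, the WIDTH generic, the port names, and the choice of which half goes into the upper bits are all assumptions for illustration. Real designs typically use the vendor's I/O DDR registers for the first stage.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: captures DDR data into a double-width SDR word.
entity ddr_capture is
    generic (WIDTH : positive := 16);
    port (
        clk     : in  std_logic;
        ddr_in  : in  std_logic_vector(WIDTH-1 downto 0);
        sdr_out : out std_logic_vector(2*WIDTH-1 downto 0)
    );
end entity ddr_capture;

architecture rtl of ddr_capture is
    signal rise_data, fall_data : std_logic_vector(WIDTH-1 downto 0);
begin
    -- First stage, path 1: capture on the rising edge.
    rise_path : process (clk)
    begin
        if rising_edge(clk) then
            rise_data <= ddr_in;
        end if;
    end process rise_path;

    -- First stage, path 2: capture on the falling edge.
    fall_path : process (clk)
    begin
        if falling_edge(clk) then
            fall_data <= ddr_in;
        end if;
    end process fall_path;

    -- Second stage: put both halves in parallel on the rising edge,
    -- giving 2*WIDTH bits per clock. The ordering is an assumption.
    combine : process (clk)
    begin
        if rising_edge(clk) then
            sdr_out <= rise_data & fall_data;
        end if;
    end process combine;
end architecture rtl;
```

Each register here has exactly one clock edge, unlike the process in the question. The path from fall_data to sdr_out has only half a clock period to settle.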
More difficult is to clock the arriving data at the right time. As your read diagram shows, the data arrives in the middle of the clock edge. For that you need a delayed clock. In an ASIC that is done using a 'tune-able' delay line which is calibrated at start-up and regularly checked for the phase. In an FPGA that would require some esoteric logic.
I have not been close to DDR chips for a while, but I think all the modern ones (DDR2 and up?) output a clock themselves to help with the read data.
Also after you have clocked the read data in, using that shifted clock, you have to get the data back to the system clock which requires an asynchronous FIFO.
I hope that gets you started.

VHDL simulation what is the correct delta?

I am currently implementing a MUX, and to test it I've created a generator and a monitor to generate data as input and monitor the output.
The MUX takes an Avalon Streaming interface as input and output and therefore also supports back pressure.
My question is this: my test bench runs on the falling edge, while my DUT runs and my input data is generated on the rising edge. Both my input clock and my input data are generated at delta cycle 0. However, the back-pressure ready signal returned from the DUT, which controls the generator, is set at delta 3. This causes sampling problems, because the DUT must only load data when the data from the generator (at delta 0) is valid and the DUT ready (the back-pressure signal at delta 3) is valid.
If I skew my DUT input clock by 1 ps, the problem goes away, but that feels like the wrong approach. What is the correct design principle here?
Skew the clock by 1 ps, or at least move it 4 deltas, so I make sure all my signals have been set before rising_edge?
or
Move the data I generate so it aligns with the DUT output ready signal ?
or
Is it just a decision made from test bench to test bench ?
I've also thought that a clock in a test bench should be generated at delta 0 and everything else must come after.
I am simulating in Riviera-pro
You have various choices:
i) Make everything synchronous. In other words, drive the inputs and sample the outputs on the same edge of the clock as the DUT uses. After all, the DUT doesn't suffer any race problems, so if you just extend the clocking strategy to the testbench, everything will work fine. At RTL, but not at gate-level. So if you're doing gate-level sims (which you should be), then this strategy is no good for that.
ii) Clock everything in your testbench off the opposite edge of the clock to the edge the DUT uses. Again, fine for RTL, but whether fine for gate-level depends on the delays through your design.
iii) Drive the inputs to the DUT just after the clock edge and sample the DUT outputs just before it. The clock edge being the edge that the DUT uses. Again, this is fine for RTL, and is the most robust for gate-level, too.
iv) Implement realistic timing for each DUT interface. That ought to work for RTL and gate-level and if it doesn't work for gate-level then the fault is with the DUT not the testbench.
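Option iii) could look roughly like the testbench below. It is only a sketch: the single register standing in for the DUT, the constants T_DRIVE and T_SAMPLE, and all signal names are assumptions, and the delays would be chosen to suit the real interface.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical testbench illustrating option iii).
entity tb_timing is
end entity tb_timing;

architecture sim of tb_timing is
    constant CLK_PERIOD : time := 10 ns;
    constant T_DRIVE    : time := 1 ns;  -- drive this long after the edge
    constant T_SAMPLE   : time := 1 ns;  -- sample this long before the edge
    signal clk  : std_logic := '0';
    signal d, q : std_logic := '0';
    signal done : boolean := false;
begin
    clock_gen : process
    begin
        while not done loop
            clk <= '0'; wait for CLK_PERIOD / 2;
            clk <= '1'; wait for CLK_PERIOD / 2;
        end loop;
        wait;
    end process clock_gen;

    -- Stand-in for the DUT: a single register on the rising edge.
    dut : process (clk)
    begin
        if rising_edge(clk) then
            q <= d;
        end if;
    end process dut;

    stim : process
    begin
        wait until rising_edge(clk);
        for i in 1 to 4 loop
            wait for T_DRIVE;                          -- just after the edge
            d <= not d;
            wait for CLK_PERIOD - T_DRIVE - T_SAMPLE;  -- just before the next edge
            -- q still holds the value captured at the previous edge (the old d).
            assert q = (not d)
                report "q changed before the next active edge"
                severity failure;
            wait until rising_edge(clk);               -- DUT captures the new d
        end loop;
        done <= true;
        wait;
    end process stim;
end architecture sim;
```

Because stimulus and sampling are separated from the active edge by real time rather than by delta cycles, the result does not depend on how many deltas the DUT needs to produce its outputs.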

Resources