This is a question for those who have a good understanding of VHDL. I'm a newbie, but so far I have been writing VHDL using a behavioral description. For me it is much easier to think about, as it is similar to writing software. I am aware that a possible drawback is that behavioral code 'executes' sequentially within a process, while structural designs execute concurrently.
So I'm just curious: if I have an architecture that uses a process for, say, an 8-bit shift register (SISO) and I want to create 4 instances of it (4 x 8-bit shift registers), would I create a component and 4 instances of the process?
Or would I generate 4 processes (executing in parallel to one another) and just call each process by a different name?
Also, just a general question to get a consensus on what good practices people use out there: which do you prefer, structural or behavioral? When would be a good time to choose one over the other? I'm guessing there could be some benefit in 'faster' execution using components that allow internal concurrency vs. sequential execution in processes. It sure does seem to me, though, that one could reduce design time with behavioral designs.
Thanks!
~doddy
For my money, the role of structural HDL nowadays is restricted to interconnecting tested working behavioral blocks (or connecting untested ones to their testbenches!) - I would agree with you about the superiority of behavioural VHDL in terms of design creation and test time.
I would also agree that writing a behavioural process is in some ways similar to writing software (over the screams of some that it isn't)
HOWEVER
do not fall into the trap of equating behavioural with sequential or slow!
Given your shift register, say
type reg_type is array(7 downto 0) of bit;
signal s_in, s_out : bit;
process(clk) is
    variable reg : reg_type;
begin
    if rising_edge(clk) then
        s_out <= reg(7);
        reg   := reg(6 downto 0) & s_in;
    end if;
end process;
I can trivially parallelise it as follows:
type reg_array is array(1 to 4) of reg_type;
type bit_vec4  is array(1 to 4) of bit;

signal p_in, p_out : bit_vec4;

process(clk) is
    variable reg : reg_array;
begin
    if rising_edge(clk) then
        for i in reg'range loop
            p_out(i) <= reg(i)(7);
            reg(i)   := reg(i)(6 downto 0) & p_in(i);
        end loop;
    end if;
end process;
(And yes there are simpler ways to write this!)
It is important to note that the loop does not take any longer to run : it just generates more hardware (in software terms, it is unrolled completely). Each iteration is completely independent of the others; if they weren't, things would be more complex.
Don't worry about the academic differences between structural and behavioral.
Worry about the difference between signal assignment scheduling and variable assignment scheduling in a process (learn what delta cycles and postponed assignment are - this is one of the key diffs from software, and arises because software only has variables, not VHDL's signals). That will explain why I implemented the pipeline upside down here (output first) - because I used a variable.
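As a hedged sketch of that point (the names stage, s_out_sig and s_out_var here are mine, purely for illustration, reusing the bit signals from above): the signal version gives a two-stage pipeline in either statement order, while the variable version only keeps its second stage because the output is read before the variable is written.

signal d_in, stage, s_out_sig, s_out_var : bit;   -- clk as before

-- Signal version : 'stage' only takes its new value when the process
-- suspends, so 's_out_sig' always reads the old value of 'stage'.
process(clk) is
begin
    if rising_edge(clk) then
        stage     <= d_in;
        s_out_sig <= stage;
    end if;
end process;

-- Variable version : 'v_stage' updates immediately, so the output must be
-- read before the variable is written ("upside down") to keep both stages.
process(clk) is
    variable v_stage : bit;
begin
    if rising_edge(clk) then
        s_out_var <= v_stage;   -- old value : second register stage
        v_stage   := d_in;      -- new value takes effect at once
    end if;
end process;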
Worry about why so many people teach the idiotic 2-process state machine when the 1-process SM is simpler and safer.
Find and understand Mike Treseler's pages about single-process models (I hope they are still online)
I have a VHDL BCD counter whose output is an integer value (digit).
But when I simulate the code in Xilinx ISE, the waveform shows the output as a binary value. The code works, but the output should be an integer and it isn't. I have tested the same code in ModelSim and there the output is correct, displayed as an integer value. The problem also appears after synthesis: the value is binary.
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity bcdcnt is
    Port ( clk   : in  STD_LOGIC;
           digit : out INTEGER RANGE 0 TO 9);
end bcdcnt;

architecture Behavioral of bcdcnt is
begin

    count: PROCESS(clk)
        VARIABLE temp : INTEGER RANGE 0 TO 10;
    BEGIN
        IF (clk'EVENT AND clk = '1') THEN
            temp := temp + 1;
            IF (temp = 10) THEN
                temp := 0;
            END IF;
        END IF;
        digit <= temp;
    END PROCESS count;

end Behavioral;
That's what synthesis does.
That's what synthesis MUST do : it translates your high level design to the resources in your FPGA or ASIC, which are binary. So, what's the problem here?
If you need to simulate the post-synth result, the usual approach is to create a wrapper entity that takes the correct port types, and translates between those and the post-synthesis netlist component.
Then, simulation should work with either the original entity, or this wrapper entity, both of which have integer ports.
Better still, you can re-use the same entity, and add the wrapper as a second architecture, thus guaranteeing that it uses the same interface (ports).
(You can even instantiate both the original and post-synth in its wrapper in the testbench, in parallel with a comparator on their outputs, to see they both do the same thing. But note there will be gate-level delays between them; usually you check outputs only on clock edges so these do not matter.)
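For instance, a minimal sketch of that wrapper-as-second-architecture idea, using the bcdcnt entity above (the netlist component name bcdcnt_netlist and its 4-bit digit port are assumptions; use whatever your synthesis tool actually writes out):

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Second architecture of the original entity, so it is guaranteed to have
-- the same (integer) ports; it just wraps the post-synthesis netlist.
architecture PostSynth of bcdcnt is

    component bcdcnt_netlist is       -- hypothetical name for the synthesised netlist
        port ( clk   : in  STD_LOGIC;
               digit : out STD_LOGIC_VECTOR(3 downto 0) );
    end component;

    signal digit_slv : STD_LOGIC_VECTOR(3 downto 0);

begin

    netlist : bcdcnt_netlist
        port map ( clk => clk, digit => digit_slv );

    -- translate the netlist's binary output back to the entity's integer port
    digit <= to_integer(unsigned(digit_slv));

end architecture PostSynth;

The testbench can then instantiate either architecture (for example with a direct instantiation such as entity work.bcdcnt(PostSynth)) without changing its port map.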
Another approach is to restrict port types on the top level of the design to binary types like std_logic_vector. This plays nicer with badly designed tools like ISE, where the automatically generated testbench will have binary port types (I generally edit them back to the correct ones; it's almost easier to write TBs from scratch).
But it restricts you to using an obscure and complex design style instead of higher level abstractions like Integers.
It's bad - really bad - that this approach is taught and encouraged so widely. But it is, and sometimes you'll just have to live with it. (Even in this approach, there's no reason to avoid decent abstractions internal to the FPGA, as long as the synthesis tool understands them).
A third approach - roughly, "trust, but verify" - is to trust that synthesis tools are competently written - which is usually true - and forget about post-synthesis simulation.
Just verify the design thoroughly at the behavioural level in simulation, then synthesise it, and test in live FPGA.
99% of the time (unless you're writing really weird VHDL), synth and P&R have done the right thing, and any differences you see are due to the aforementioned I/O timings (gate delays at the I/O pins). Then model these in the testbench and/or wrapper until you see the same behaviour in both (fix anything that needs fixing, and re-synthesise).
In this approach you only need to bother with the (MUCH slower) post-synth and post-PAR simulations if you need to track down a suspected synthesis tool bug.
This does happen : I've seen two in a quarter century.
Most of the time I just use the third approach.
I am working on projects which require synthesis of my RTL code, specifically for ASIC development. Given that, how important is it to separate sequential logic from combinational logic when designing my RTL? And if it is important, what should my approach be: how should I split my design between sequential and combinational logic?
I generally separate sequential and combinational logic as much as possible, unless it results in too much code (which is rare, and possibly an indication of a poor design) or mixing them just makes more sense (which again is rare, but happens).
This segregation usually aids in initial design, brain->rtl->synthesis (what you think you're making is actually what is synthesized), CDC evaluation for multiclock designs, verification, and other things as well. It's hard for me to give a great example of something bad versus something I would call good, but here is a stab at it.
Say I have a counter that I want to reset on a certain value. I could do this (which I tend to see from people who have a strong software and/or FPGA background)
always @(posedge clk or posedge reset) begin
    if (reset) begin
        count <= 0;
    end else begin
        if ((count == reset_count_val) || (~enable)) count <= 0;
        else count <= count + 1;
    end
end
Or I would do this (which is what I would personally do):
//Combinational Path
assign count_in = enable ? ((count == reset_count_val) ? 4'd0 : count + 4'd1) : 4'd0;

//Sequential Path
always @(posedge clk or posedge reset) begin
    if (reset) count <= 4'd0;
    else count <= count_in;
end
The above I would agree is more typing, and to some, more difficult to read. It does however split up the circuit in a way that allows me to see more easily what happens on each clock edge. I know that "count_in" is being set up prior to the posedge of clk. I can easily see (as can anyone else looking at the code) that I would expect a MUX for the reset or addition of count based on reset_count_val, with a final MUX gating the count based on the enable signal. Now you could see the same in the first bit of code, however IMO it's not as clear. When you look at a sim, you can see what count_in looks like prior to the rising edge of the clk. This can help you if the conditional statement for count_in is rather complex.
Let's say you send this through synthesis and place and route and you get a timing violation between the Q of the count reg and the D of the count reg (since you have a loopback based on the addition). It would generally be easier to see which path is causing the issue with the 2nd batch of code. This depends on the tool (PrimeTime more than likely). CDC analysis could also be easier: let's say that reset_count_val comes from static registers in another clock domain. The tool may try to synthesize/elaborate the OR in the 1st batch of code thinking that reset_count_val and enable are somehow related, giving you a weird-looking CDC violation. Again, sometimes it's hard to come up with an example that exercises all of the "why you shouldn't do this" type of cases.
As an example about splitting combinational and sequential logic, I inherited a design where someone had written a state machine with the combinational and sequential logic in the same block (always @(posedge clk), where the if/elses would get deep and have next-state logic in them). I'm no genius, but after days of staring at that thing and running sims, I could just not figure out what it was doing. It was quite large as well. I simply redid the design, keeping the same algorithm, but splitting up the logic in the format I described here. Even with me adding some features, the size went down ~15%. Other engineers who had been just as lost with the old design could now understand what was going on with it. This won't always be the case, but more often than not it is.
TLDR;
I try to be extremely descriptive when designing RTL that is to be used in an ASIC. The more abstract the code, the more likely it is to build something you don't want, or something more complex than needed. Verification is often much easier, particularly when you get to gate sims. While I am in the camp of "the less code the better", that is not always the case with Verilog, especially on an ASIC.
In general, I would not hesitate to mix combinational with sequential logic. I come from an IC design background and have always mixed up combinational logic with sequential. I think that you are restricting yourself too much if you don't and are not fully using the power of your logic synthesiser.
For example, here is how I would design a simple, asynchronously-reset counter in VHDL:
process (Clock, Reset)
begin
    if Reset = '1' then
        Cnt <= (others => '0');
    elsif Rising_edge(Clock) then
        if Enable = '1' then
            Cnt <= Cnt + 1;
        end if;
    end if;
end process;
This style of writing a counter in VHDL is ubiquitous. I personally can see no advantage to splitting the code up into two separate processes, one sequential the other combinational. I have just taught a room full of engineers to design a counter in exactly this way.
Here are some exceptions. I would split the combinational logic from the sequential logic if:
i) I were designing a state machine:
There is what I think is a really elegant way of coding a state machine, where you do split the combinational logic from the sequential:
Registers: process (Clock, Reset)
begin
    if Reset = '1' then
        State <= Idle;
    elsif Rising_edge(Clock) then
        State <= NextState;
    end if;
end process Registers;

Combinational: process (State, Inputs)
begin
    NextState <= State;
    Output1   <= '0';
    Output2   <= '0';
    -- etc
    case State is
        when Idle =>
            if Inputs(1) = '1' then
                NextState <= State2;
            end if;
        when State2 =>
            Output1 <= '1';
            if Inputs = "00" then
                NextState <= State3;
            end if;
        -- etc
    end case;
end process Combinational;
The advantage of coding a state machine like this is that the combinational process looks very like the state diagram. It is less error prone to write, less error prone to modify and less error prone to read.
ii) The combinational logic were complex:
For a really big block of combinational logic, I would separate it out. The exact definition of "really big" is a matter of judgement.
iii) The combinational logic were on the Q output of a flip-flop:
Any signal driven in a sequential process infers a flip-flop. Therefore, if you wish to implement combinational logic that drives an output of an entity*, then this combinational logic must be in a separate process (or concurrent statement), as sketched below.
*often not a good idea - be careful.
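As a sketch of case iii), reusing the counter above and assuming Cnt is an unsigned from numeric_std and TerminalCount is an output port of the entity (both names are mine, for illustration only):

-- The clocked process above infers the Cnt register; this concurrent
-- assignment (equivalent to a small combinational process) puts the decode
-- logic on the flip-flop's Q output, driving the entity output with no
-- extra register in the path:
TerminalCount <= '1' when Cnt = to_unsigned(15, Cnt'length) else '0';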
I would post this as a comment if I could, as I am not writing a full answer but giving you a source, and I'm also not fully addressing the question (I don't know anything about ASICs). But there is a great PDF about this problem in general here. Generally, you don't have to completely separate sequential logic from combinational logic, but doing so helps you write more readable and maintainable code.
Mixing sequential and combinational logic condenses the code, which almost always makes it easier to understand.
Separating makes ECOs easier.
Which you choose is a matter of personal style and organizational coding conventions and standards.
I am starting out with VHDL, and while trying to implement a circuit I found that I need basic elements like leading-one/zero detectors or leading-one/zero counters. Since I consider these basic, I would like to know if there is a library to include.
If I have to put together my own implementations is there a collection where one can copy from or do I need to reinvent the wheel?
(I can google them one by one and find implementations in forums; however, some of them seem quite bad or not implemented generally enough (e.g. this multiplexer).)
One reason why the approach you suggest (a library of simple re-usable elements) doesn't catch on : you are often better off without them.
The example multiplexer you linked to illustrates the point quite well:
ENTITY multiplexer IS
    PORT (
        a, b, c, d, e, f, g, h : IN  STD_LOGIC;
        sel                    : IN  STD_LOGIC_VECTOR(2 DOWNTO 0);
        y                      : OUT STD_LOGIC
    );
END ENTITY multiplexer;
given slightly better design, using the type system you could write (perhaps in a re-usable package)
subtype sel_type is natural range 0 to 7;
and then in your design, given the signal declarations
signal sel_inputs : std_logic_vector(sel_type);
signal sel : sel_type;
signal y : std_logic;
you could simply write
sel_inputs <= a & b & c & d & e & f & g & h;
y <= sel_inputs(sel);
with much less fuss than actually instantiating the multiplexer component.
Leading zero detectors look harder at first glance : they are generally implemented as clocked processes, or as part of a larger clocked process.
Recent trends in VHDL are moving towards a semi-behavioural style of synthesisable VHDL, where quite a lot is written in a sequential style within a single clocked process. This is best seen in the "single process state machine"; smaller, easier to understand and definitely more reliable * than the "two process" style still commonly taught.
(* unless your tools support the "process(all)" sensitivity list of VHDL-2008 for the combinational process.)
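For illustration, a minimal single-process state machine sketch (start, done and the state names are placeholders, not from any of the code in this question):

process(clk) is
    type state_t is (Idle, Run);
    variable state : state_t := Idle;
begin
    if rising_edge(clk) then
        done <= '0';                 -- default output
        case state is
            when Idle =>
                if start = '1' then
                    state := Run;
                end if;
            when Run =>
                done  <= '1';
                state := Idle;
        end case;
    end if;
end process;

Everything - next-state logic, outputs and the state register itself - lives in the one clocked process, so there is no combinational process and no sensitivity list to get wrong.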
Within a clocked process (i.e. assuming the following is inside if rising_edge(clk) then), simple tasks like counting the leading zeroes in an array can be expressed using for loops; for example:
for i in sel_type loop
    if sel_inputs(i) = '1' then
        zero_count <= sel_type'high - i;
    end if;
end loop;
The last assignment will win; if h = '1' the count will be 0, otherwise if g = '1' the count will be 1, and so on.
Now it's not a one-liner unless you write a procedure for it (also synthesisable with most tools!) but compare it with the size of an entity instantiation (and interconnections).
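For instance, one possible way to package that loop as a synthesisable function (count_leading_zeros is just an illustrative name; it would live in a package with IEEE.std_logic_1164 in scope):

-- Counts zeros from the high-index ('h') end, matching the loop above:
-- the highest index holding a '1' wins; an all-zero input returns the
-- vector length (a choice the loop above leaves open).
function count_leading_zeros (v : std_logic_vector) return natural is
    variable count : natural := v'length;
begin
    for i in v'low to v'high loop
        if v(i) = '1' then
            count := v'high - i;
        end if;
    end loop;
    return count;
end function;

and inside the clocked process the whole thing becomes zero_count <= count_leading_zeros(sel_inputs);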
Of course a library of procedures and functions can be useful; and (for less trivial tasks) the same is true for entities also, but in my opinion, making them general enough to be useful to all, without making them heavyweight, would be quite an achievement.
Some IDEs come with an example library directly; e.g. Xilinx ISE has the "Language Templates" section with basic synthesis constructs, coding examples, ...
In VHDL, all the steps inside a process are executed sequentially, but I wonder how an FPGA can execute steps sequentially. I am very confused about how sequential assignments, functions and the like end up being implemented in an FPGA, so can anyone shed some light on this topic?
process(d, clk)
begin
    if (rising_edge(clk)) then
        q <= d;
    else
        q <= q;
    end if;
end process;
This is just code for a simple D-Latch, but how will this be implemented in an FPGA?
It is not "executed" sequentially as such - but the synthesizer interprets the code sequentially, and creates the hardware design to fit such an interpretation.
For instance, if you assign a value to a signal twice during a clocked process, the first assignment is simply ignored, while the second takes effect (remember that a signal is only assigned at the end of a process statement, not immediately):
signal a : UNSIGNED(3 downto 0) := (others => '0');

(...)

process(clk)
begin
    if (rising_edge(clk)) then
        a <= a - 1;
        a <= a + 1;
    end if;
end process;
The above process will always increment a by 1. Similarly, if you have the second assignment inside an if statement, the synthesizer will simply create two paths for a - a decrement for when the if condition is not fulfilled, and an increment for when it is.
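As a sketch of that, adding a hypothetical control signal up to the fragment above:

process(clk)
begin
    if (rising_edge(clk)) then
        a <= a - 1;         -- "default" assignment : decrement
        if (up = '1') then
            a <= a + 1;     -- overrides the default when up = '1'
        end if;
    end if;
end process;

Synthesis sees a single register for a, fed by a multiplexer that selects between a - 1 and a + 1 under the control of up.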
If you use variables, the idea is the same - although intermediate values are used, as variables take on their new value immediately.
But it all boils down to that the synthesizer does all the "magic" of interpreting your process in a sequential way, then generating hardware that does what you have described.
Your example basically describes a d-flip-flop (the Xilinx FPGA tools iirc distinguish latches and flip-flops in that flip-flops are edge-sensitive, and latches are level-sensitive), although in a different way than typically recommended.
You can basically write the same code as:
process(clk)
begin
    if (rising_edge(clk)) then
        q <= d;
    end if;
end process;
It will automatically keep its value in the other cases. This will be implemented as a flip-flop inside the FPGA. Most FPGAs consist of blocks of look-up tables and flip-flops, to which quite a lot of different hardware can be mapped. The above code will simply by-pass the look-up table, and just use the flip-flop of one of the blocks.
You can learn more about the internal workings by having a look at the datasheet for your particular FPGA. For Spartan3-series FPGAs for instance, have a look at page 24 of the Xilinx Spartan3 FPGA Family Data Sheet
Question about best VHDL design practice.
When designing state machines, should I use a signal within the architecture or a variable? I have until now used variables, since they are "kinda" private to the process, which IMHO makes good sense because they should not be accessed outside the process. But is this good design practice?
type state_type is (s0, s1);
signal state : state_type := s0;

A : process(clk)
begin
    if rising_edge(clk) then
        case state is
            .....
        end case;
    end if;
end process;

--This process uses a variable
B : process(clk)
    type state_type is (s0, s1);
    variable state : state_type := s0;
begin
    if rising_edge(clk) then
        case state is
            .....
        end case;
    end if;
end process;
I prefer to use signals. The reason is that it allows the design to be split between multiple processes. One process can worry about how the state machine moves from state to state, and other processes can contain logic that is dependent on the state.
Doing it this way means you can have multiple simple processes that do one thing each. Using variables, everything has to go into the one process and that can become unwieldy.
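For example (a sketch reusing the signal-based process A from the question; busy is just an illustrative output):

-- Process A above handles the state transitions; because 'state' is an
-- architecture-level signal, other logic can read it without touching A:
busy <= '1' when state = s1 else '0';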
It's a stylistic choice though.
I always use variables unless I need to communicate some value to another process. Because that's what signals are for.
Another potential benefit (or alternatively, something to watch as a violation of your coding standards!) of "localising the state variable" is that you can call the type and the variable state_type and state in two state machines in the same entity without them clashing...
For synthesis, I almost exclusively use signals.
Why?
Signals seem to be more universally understood.
Code clarity. Signals are all updated once, at the same time, when the process suspends. Mixing variables and signals can be (potentially) confusing because you must remember that a variable assignment takes effect immediately, whereas for signals a later assignment simply overrides an earlier one; the value of a signal will never change during the same execution of a process.
Lower risk of deep logic paths. Inexperienced designers may be tempted to use variables as they would in a C program, which can result in logic that will not meet timing goals (see the sketch below).
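A hedged sketch of that last point, assuming d0 to d3 and acc are unsigned signals of the same width: each extra variable update adds logic in series within the same clock cycle.

process(clk)
    variable v : unsigned(15 downto 0);
begin
    if rising_edge(clk) then
        v := d0;
        v := v + d1;
        v := v + d2;
        v := v + d3;   -- three adders end up in series before the register
        acc <= v;      -- one register stage, one long combinational path
    end if;
end process;

Registering the intermediate sums in signals instead would spread the adders across clock cycles, at the cost of latency.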
Why Not?
Large designs end up with very long lists of signals, and it can be difficult to find a signal's type/size/etc. Typically this only occurs in wrappers of the largest blocks in a design, and/or the design top level.
For testbenches, I use both variables and signals, depending on the use.
If the value in this case is not needed as an input or output of your process, it is better to use a variable. When you are building the architecture you need signals to connect to the outputs of modules; if a value won't be used that way, it's not necessary to create a signal and spend more memory on something that will never connect to another module of your architecture.