Same design in VHDL and Verilog. But different speed and resource usages? - vhdl

I have two pieces of code, one in Verilog and one in VHDL, which count the number of ones in a 16-bit binary number. Both do the same thing, but after synthesising with Xilinx ISE, I get different synthesis reports.
Verilog code:
module num_ones_for(
input [15:0] A,
output reg [4:0] ones
);
integer i;
always @(A)
begin
ones = 0; //initialize count variable.
for(i=0;i<16;i=i+1) //for all the bits.
ones = ones + A[i]; //Add the bit to the count.
end
endmodule
VHDL code:
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;
entity num_ones_for is
Port ( A : in STD_LOGIC_VECTOR (15 downto 0);
ones : out STD_LOGIC_VECTOR (4 downto 0));
end num_ones_for;
architecture Behavioral of num_ones_for is
begin
process(A)
variable count : unsigned(4 downto 0) := "00000";
begin
count := "00000"; --initialize count variable.
for i in 0 to 15 loop --for all the bits.
count := count + ("0000" & A(i)); --Add the bit to the count.
end loop;
ones <= std_logic_vector(count); --assign the count to output.
end process;
end Behavioral;
Number of LUTs used in VHDL and Verilog: 25 and 20.
Combinational delay of the circuit: 3.330 ns and 2.597 ns.
As you can see, the Verilog code looks much more efficient. Why is that?
The only difference I can see is that 4 zeros are prepended on the MSB side in the VHDL code. I did this because otherwise VHDL throws an error.
Is this because of the tool I am using, the HDL language, or the way I wrote the code?

You will need to try a number of different experiments before coming to any conclusions. But my observation is that Verilog is used more frequently in the most critical capacity/area/performance designs. Therefore the majority of research effort goes into handling Verilog language tools first.


Scaling down a 128 bit Xorshift. - PRNG in vhdl

I'm trying to figure out a way of generating random values (pseudo-random will do) in VHDL using Vivado (meaning that I can't use the math_real library).
These random values will determine the number of counts a prescaler will run for which will then in turn generate random timing used for the application.
This means that the values generated do not need to have a very specific value as I can always tweak the speed the prescaler runs at. Generally speaking I am looking for values between 1000 - 10,000, but a bit larger might do as well.
I found following code online which implements a 128 bit xorshift and does seem to work very well. The only problem is that the values are way too large and converting to an integer is pointless as the max value for an unsigned integer is 2^32.
This is the code:
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity XORSHIFT_128 is
port (
CLK : in std_logic;
RESET : in std_logic;
OUTPUT : out std_logic_vector(127 downto 0)
);
end XORSHIFT_128;
architecture Behavioral of XORSHIFT_128 is
signal STATE : unsigned(127 downto 0) := to_unsigned(1, 128);
begin
OUTPUT <= std_logic_vector(STATE);
Update : process(CLK) is
variable tmp : unsigned(31 downto 0);
begin
if(rising_edge(CLK)) then
if(RESET = '1') then
STATE <= (others => '0');
end if;
tmp := (STATE(127 downto 96) xor (STATE(127 downto 96) sll 11));
STATE <= STATE(95 downto 0) &
((STATE(31 downto 0) xor (STATE(31 downto 0) srl 19)) xor (tmp xor (tmp srl 8)));
end if;
end process;
end Behavioral;
For the past couple of hours I have been trying to scale this 128-bit xorshift PRNG down to an 8-bit, 16-bit or even 32-bit PRNG, but every time I get either no output or my simulation (testbench) freezes after one cycle.
I've tried just dividing the value which does work in a way, but the size of the output of the 128 bit xorshift is so large that it makes it a very unwieldy way of going about the situation.
Any ideas or pointers would be very welcome.
To reduce the range of your RNG to a smaller power of two range, simply ignore some of the bits. I guess that's something like OUTPUT(15 downto 0) but I don't know VHDL at all.
The remaining bits represent working state for the generator and cannot be eliminated from the design even if you don't use them.
If you mean that the generator uses too many gates, then you'll need to find a different algorithm. Wikipedia gives an example 32-bit xorshift generator in C which you might be able to adapt.
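As a rough sketch of what adapting the 32-bit xorshift to VHDL might look like: the shift constants 13, 17 and 5 are the commonly cited triple for this generator, while the entity name xorshift32, the port name rnd and the constant SEED are hypothetical names chosen for illustration. Unlike the 128-bit code in the question, the reset here loads a non-zero seed, because an all-zero state maps to itself and the generator would never leave it.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity xorshift32 is
  port (
    clk   : in  std_logic;
    reset : in  std_logic;                      -- synchronous, active high
    rnd   : out std_logic_vector(15 downto 0)   -- only the low 16 state bits
  );
end entity xorshift32;

architecture rtl of xorshift32 is
  -- Hypothetical seed; any non-zero value should do.
  constant SEED : unsigned(31 downto 0) := x"00000001";
  signal state  : unsigned(31 downto 0) := SEED;
begin
  -- Narrowing the range is just a matter of ignoring the upper bits.
  rnd <= std_logic_vector(state(15 downto 0));

  update : process (clk)
    variable x : unsigned(31 downto 0);
  begin
    if rising_edge(clk) then
      if reset = '1' then
        state <= SEED;                 -- never reset to all zeros
      else
        x := state;
        x := x xor shift_left(x, 13);
        x := x xor shift_right(x, 17);
        x := x xor shift_left(x, 5);
        state <= x;
      end if;
    end if;
  end process update;
end architecture rtl;
```

For a prescaler count in roughly the range the question mentions, one option would be to take 13 bits (0 to 8191) and add a fixed offset of 1000.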
Table 3 in the old Xilinx application note below has the information you need to build such a random generator circuit for 8 bits, as you mention.
https://www.xilinx.com/support/documentation/application_notes/xapp052.pdf
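A minimal sketch of an 8-bit LFSR in the style of that application note, assuming the 8-bit row of the table lists taps 8, 6, 5 and 4 with XNOR feedback (worth checking against the note itself). The entity name lfsr8 and the port names are hypothetical. With XNOR feedback the all-zeros state is valid and the all-ones state is the lock-up state, so resetting to zero is safe here.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity lfsr8 is
  port (
    clk   : in  std_logic;
    reset : in  std_logic;                     -- synchronous, active high
    q     : out std_logic_vector(7 downto 0)
  );
end entity lfsr8;

architecture rtl of lfsr8 is
  -- Stages are numbered 1 to 8 to match the tap numbering assumed above.
  signal sr : std_logic_vector(8 downto 1) := (others => '0');
  signal fb : std_logic;
begin
  -- XNOR of the assumed taps 8, 6, 5, 4.
  fb <= not (sr(8) xor sr(6) xor sr(5) xor sr(4));
  q  <= sr;

  shift : process (clk)
  begin
    if rising_edge(clk) then
      if reset = '1' then
        sr <= (others => '0');
      else
        sr <= sr(7 downto 1) & fb;   -- shift towards stage 8, feedback into stage 1
      end if;
    end if;
  end process shift;
end architecture rtl;
```

Successive values of q are strongly correlated because only one bit changes position per clock, which may or may not matter for a prescaler count.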

Why does my implementation of an XOR reduction have multiple drivers?

I have a binary string like "11000110".
I'm trying to XOR all bits together.
I have this code:
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.NUMERIC_STD.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
entity e_circuit is
generic(n : INTEGER:=8);
port(
d1:in std_logic_vector(n-1 downto 0);
result:out std_logic
);
end e_circuit;
architecture structure of e_circuit is
signal temp:std_logic;
begin
temp<=d1(0);
loop1:for i in 1 to n-1 generate
temp<= d1(i) or d1(i-1);
end generate;
result<=temp;
end structure;
But, when I try to compile it, I get the error below:
ERROR:Xst:528 - Multi-source in Unit <e_circuit> on signal <result>;
this signal is connected to multiple drivers.
What does it mean? How can I fix it?
This means that you are assigning to a signal (temp) multiple times (tying their outputs together directly) without any combinational circuit or if clause. When the synthesizer synthesizes the resulting circuit, all of the statements in the for...generate statement get executed simultaneously. Here are some solutions you can try:
First, you can use the VHDL-2008 reduction operator:
result <= xor d1;
Or, if your synthesizer doesn't support that, create a function to do it for you:
function xor_reduct(slv : in std_logic_vector) return std_logic is
variable res_v : std_logic := '0'; -- Start at '0' so the result is a true XOR; a null slv vector returns '0'
begin
for i in slv'range loop
res_v := res_v xor slv(i);
end loop;
return res_v;
end function;
And call it in your corresponding architecture:
result <= xor_reduct(d1);
Or build the circuit manually (with temp being a std_logic_vector of the same size as d1):
temp(0) <= d1(0);
gen: for i in 1 to n-1 generate
temp(i) <= temp(i-1) xor d1(i);
end generate;
result <= temp(n-1);
What does it mean?
You have concurrently assigned the signal temp at several locations in your code (and none of them is masked by an if ... generate). Each concurrent assignment makes up a driver.
The for generate repeats the concurrent statements inside the block for each value of i within the given range. Thus, for n = 8, your code is similar to:
temp <= d1(0);
temp <= d1(1) or d1(1-1);
temp <= d1(2) or d1(2-1);
temp <= d1(3) or d1(3-1);
temp <= d1(4) or d1(4-1);
temp <= d1(5) or d1(5-1);
temp <= d1(6) or d1(6-1);
temp <= d1(7) or d1(7-1);
Thus, you have connected 8 drivers to temp. The first is the input d1(0) and the others are the outputs of 7 OR gates.
The posted error message is reported only with the old parser of ISE for Spartan-3 like devices. You should switch to the new one. Right click on Synthesize -> Properties -> Synthesis Options. Then set the option "Other XST Command Line Options" to
-use_new_parser YES
Then, you get a more meaningful error message:
ERROR:HDLCompiler:636 - "/home/zabel/tmp/xor_reduction/e_circuit.vhdl" Line 23: Net is already driven by input port .
Line 23 is the one within the for ... generate statement.
How can I fix it?
First, your code has to use XOR instead of OR. ISE does not support the XOR reduction operator from VHDL-2008, so you have to describe it manually. One solution is to use sequential statements within a process, which has also been suggested by @user1155120 in the comments:
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
entity e_circuit is
generic(n : INTEGER := 8);
port(d1 : in std_logic_vector(n-1 downto 0);
result : out std_logic);
end e_circuit;
architecture structure of e_circuit is
begin
process(d1)
variable temp : std_logic;
begin
temp := d1(0);
for i in 1 to n-1 loop
temp := temp xor d1(i);
end loop;
result <= temp;
end process;
end structure;
Here, the variable temp is updated sequentially during the execution of the process (like in an imperative software programming language). The final value is then assigned to the signal result.
I have also omitted all VHDL packages which are not required.

Process or not to Process?

I have the VHDL code below that I use in a project. I have been using a process within the architecture and wanted to know if there are other means of accomplishing the same goal, which I'm sure there are. In essence, it takes one number, compares it to another, and if there is a difference of +/- 2, reflects this in the output. I am using the following:
LIBRARY IEEE;
USE IEEE.std_logic_1164.all, IEEE.std_logic_arith.all, IEEE.std_logic_signed;
ENTITY thermo IS
PORT (
CLK : in std_logic;
Tset, Tact : in std_logic_vector (6 DOWNTO 0);
Heaton : out std_logic
);
END ENTITY thermo;
ARCHITECTURE behavioral OF thermo IS
SIGNAL TsetINT, TactINT : integer RANGE 63 Downto -64; --INT range so no 32bit usage
BEGIN
Heat_on_off: PROCESS
VARIABLE ONOFF: std_logic;
BEGIN
TsetINT <= conv_integer (signed (Tset));--converts vector to Int
TactINT <= conv_integer (signed (Tact));--converts vector to Int
--If you read this why is it conv_integer not to_integer?? thx
ONOFF := '0'; --so variable does not hang on start
WAIT UNTIL CLK'EVENT and CLK = '1';
IF TactINT <= (TsetINT - 2) then
ONOFF := '1';
ELSIF TactINT >= (TsetINT + 2) then
ONOFF := '0';
END IF;
Heaton <= ONOFF;
END PROCESS;
END ARCHITECTURE behavioral;
I'm just after a comparison really and to know if there are any better ways of doing what I have already done.
Why convert Tact and Tset to an integer?
Why have the variable ONOFF? The variable initialization appears to remove any sense of hysteresis, is that what you intended? Based on your other code, I bet not. I recommend that you assign directly to the signal Heaton instead of using the variable ONOFF.
If I were to create TsetINT and TactINT, these would be good candidates to be variables. However, there is no need to do the integer conversion, as you can simply do the following:
if signed(Tact) <= signed(Tset) - 2 then
...
elsif signed(Tact) >= signed(Tset) + 2 then
Please use numeric_std. Please ask your professor why they are teaching you old methodologies that are not current industry practice. Numeric_std is an IEEE standard and is updated with the standard, std_logic_arith is not an IEEE standard.
use ieee.numeric_std.all ;
In response to Jim's comment I wrote a simple thermal model test bench to test your design.
I only changed your design to use package numeric_std instead of the Synopsys packages. The rest is just prettifying and eliminating comments not germane to the question of whether or not Tact ever reaches Tset.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity thermo is
port (
CLK: in std_logic;
Tset, Tact: in std_logic_vector (6 downto 0);
Heaton: out std_logic
);
end entity thermo;
architecture behavioral of thermo is
signal TsetINT, TactINT: integer range 63 downto -64;
begin
HEAT_ON_OFF:
process
variable ONOFF: std_logic;
begin
TsetINT <= to_integer (signed (Tset)); -- package numeric_std
TactINT <= to_integer (signed (Tact)); -- instead of conv_integer
ONOFF := '0'; -- AT ISSUE -- so variable does not hang on start
wait until CLK'event and CLK = '1';
if TactINT <= TsetINT - 2 then -- operator precedence needs no parens
ONOFF := '1';
elsif TactINT >= TsetINT + 2 then
ONOFF := '0';
end if;
Heaton <= ONOFF;
end process;
end architecture behavioral;
You have a comment in your process asking why conv_integer was required instead of to_integer. That prompted the change.
I removed superfluous parentheses based on operator order precedence (adding operators being higher precedence than relational operators), notice Jim's answer did the same.
The simple thermal model runs with a clock set to a 1 second period and has two coefficients, relating to the temperature change when Heaton is '1' or not. I arbitrarily set the heating coefficient to 1 every 4 clocks and the temperature decay coefficient to 1 every 10 clocks. I also set the ambient temperature (tout) to 10 and tset to 22. The numbers selected are severe to keep the model run time short, enhancing portability without relying on setting a simulator resolution limit.
The thermal model was implemented using fixed signed arithmetic without using fixed_generic_pkg, allowing portability to -1993 tools without math packages and includes a fractional part, responsible for the different widths of Heaton true after reaching normal operating temperature. The model could just as easily have been implemented with two different precursor counters used to tell when to increment or decrement Tact.
Using REAL types is possible, not desirable because converting REAL to INTEGER (then to SIGNED) isn't portable (IEEE Std 1076-2008 Annex D).
The idea here is to demonstrate the lack of hysteresis and demonstrate the model doesn't reach Tset:
The failure to reach Tset + 2 (22 + 2) is due to the lack of hysteresis. Hysteresis is desirable for reducing the number of heat on and off cycles. The idea is that once you start the heater you leave it on for a while, and once you stop it you want to leave it off for a while too.
Using Jim's modification:
-- signal TsetINT, TactINT: integer range 63 downto -64;
begin
HEAT_ON_OFF:
process (CLK)
begin
if rising_edge(CLK) then
if signed(Tact) <= signed(Tset) - 2 then
Heaton <= '1';
elsif signed(Tact) >= signed(Tset) + 2 then
Heaton <= '0';
end if;
end if;
end process;
gives us longer Heaton on and off cycles, decreasing how many times the heater starts and stops:
This actually allows us to see the temperature reach Tset + 2 as well as Tset - 2, where these thresholds provide the hysteresis, which is characterized as a minimum on or minimum off time, depending on the efficiency of the heater and the heat loss rate when the heater is off.
So what changed in the execution of the thermo model process? Look at the difference in the synthesis results for the two versions.

Why can't I synthesize this VHDL program?

I am new to VHDL, and I am trying to build a binary to BCD converter. I have searched the Internet, and now I am trying to write my own to understand it and VHDL. Here is my program:
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
-- Uncomment the following library declaration if using
-- arithmetic functions with Signed or Unsigned values
use IEEE.NUMERIC_STD.ALL;
-- Uncomment the following library declaration if instantiating
-- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;
entity Binary_to_BCD is
--generic(n: integer := 2);
Port ( data : in unsigned (7 downto 0);
bcdout : out unsigned (11 downto 0));
end Binary_to_BCD;
architecture Behavioral of Binary_to_BCD is
-- Start of the conversion process
begin
convert : process (data) is
variable i : integer := 0;
variable bin : unsigned (7 downto 0) := data;
variable bcd : unsigned (11 downto 0) := to_unsigned(0, 12);
begin
-- Repeat for the number of bits
for i in 0 to 7 loop
bcd := bcd sll 1; -- Shift the BCD one place to the left
bcd(0) := bin(7); -- Feed the new bit into the BCD
bin := bin sll 1; -- Shift out to the left the bit I just took
-- Check each 4-bit group of the BCD; if it exceeds 4, add 3
if(bcd(11 downto 8) > "0101") then
bcd(11 downto 8) := bcd(11 downto 8) + "0011";
end if;
if(bcd(7 downto 4) > "0101") then
bcd(7 downto 4) := bcd(7 downto 4) + "0011";
end if;
if(bcd(3 downto 0) > "0101") then
bcd(3 downto 0) := bcd(3 downto 0) + "0011";
end if;
end loop;
bcdout := bcd;
end process convert;
end Behavioral;
I get this error on line 66 which is bcdout := bcd;:
Signal 'bcdout' bcdout is at left hand side of variable assignment statement.
After reading on the web and books I used unsigned instead of std_logic_vector because I need to rotate bits and arithmetic operations but still it doesn't synthesize.
Tried changing unsigned to integer and := to <= but nothing works. It should be something very stupid but I don't realize. Thank you very much in advance.
The immediate problem is the incorrect use of variable assignment := instead of signal assignment <= for the bcdout signal - exactly as the error message and other answers point out.
However there is an underlying confusion about where you are in a VHDL process, that is not unusual when starting out - as revealed in the comments about functions.
A common approach to this confusion is to point out that "VHDL is used for hardware design and not programming". While useful in some ways, that can lead to artificially primitive and painfully low-level uses of VHDL that are really holding it back.
Writing VHDL in a "software way" CAN work - and very well - however it does require a wider perspective on software AND hardware engineering than you can pick up through merely learning C.
The above code is probably synthesisable and will probably work - but it will almost certainly NOT do what you think it does. However a few small changes are in order rather than a completely different approach.
A couple of pointers may help :
the VHDL equivalent of a C function is a VHDL function.
the C equivalent of VHDL procedure is a void function.
(yes, C has procedures : it just calls them void functions to be contrary! :-)
the C equivalent of a VHDL process is ... a process. In other words, an entire C program as long as it doesn't use pthreads or fork/join.
And now you can see that VHDL is designed for parallel computation in a vastly more streamlined way than any dialect of C - processes are just building blocks, and signals are reliable forms of message passing or shared storage between processes.
So, within a process, you can (to a certain extent) think in software terms - but it is a HUGE mistake to think about "calling" a process as if it were a function.
Apologies if you've seen this Q&A before but it will help understand the semantics of a VHDL process, and the use of signals between processes.
Now, as to the specific problems with your code:
1) It is asynchronous, i.e. unclocked. That means, guaranteeing how it responds to glitches on the input is ... difficult ... and knowing when the result is valid is harder than you need. Like uncontrolled use of global variables in C - not best practice!
So move to a clocked process for a safer, more analyzable design. This is also a step towards increasing its speed later. But for now, think of a VHDL clocked process as an event loop or perhaps an interrupt handler in C. It wakes up when told to, executes in (effectively) zero time, and sleep()s until next time.
convert : process (clk) is
variable bin : unsigned (7 downto 0);
...
begin
if rising_edge(clk) then
bin := data;
for i in 0 to 7 loop
...
end loop;
bcdout <= bcd;
end if;
end process convert;
2) the loops will be unrolled and generate a lot of hardware. This may not be a problem : it will deliver a result reasonably quickly (unlike the software equivalent!) There are ways to reduce the hardware use (state machines) or increase its speed (pipelining, link above) but they can wait for now...
3) This is actually the biggest problem with your original : your assignment of data to bin is actually a process variable initialisation not an assignment! It is only executed once, at t=0... And this is the most likely cause of any mis-operation you have seen.
The modified clocked example above assigns the latest data value every time the process is woken : i.e. every clock cycle, and is thus more likely to do what you want.
4) Minor niggle : your declaration of "i" is redundant and actually hidden by a new implicit "i" created by the loop statement. This implicit declaration is both safer and better than an explicit one because it takes its type explicitly from the loop bounds. Imagine what might happen with for(int i; i<= 100000; i++) when int is a 16-bit type...
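Putting the points above together, a complete clocked converter might look roughly like the sketch below. It is an illustration rather than the original poster's intended design: the entity name binary_to_bcd_clocked and the clk port are hypothetical, and the digit correction follows the usual shift-and-add-3 ordering (add 3 to any digit greater than 4 before each shift), which differs from the comparison against "0101" after the shift in the question's code.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity binary_to_bcd_clocked is
  port (
    clk    : in  std_logic;
    data   : in  unsigned(7 downto 0);
    bcdout : out unsigned(11 downto 0)
  );
end entity binary_to_bcd_clocked;

architecture rtl of binary_to_bcd_clocked is
begin
  convert : process (clk) is
    variable bin : unsigned(7 downto 0);
    variable bcd : unsigned(11 downto 0);
  begin
    if rising_edge(clk) then
      -- Real assignments, executed on every clock edge (not initialisations).
      bin := data;
      bcd := (others => '0');
      for i in 0 to 7 loop
        -- Correct each digit before the shift: add 3 if it is greater than 4.
        if bcd(3 downto 0) > 4 then
          bcd(3 downto 0) := bcd(3 downto 0) + 3;
        end if;
        if bcd(7 downto 4) > 4 then
          bcd(7 downto 4) := bcd(7 downto 4) + 3;
        end if;
        if bcd(11 downto 8) > 4 then
          bcd(11 downto 8) := bcd(11 downto 8) + 3;
        end if;
        -- Shift the next binary bit (MSB first) into the BCD register.
        bcd := bcd(10 downto 0) & bin(7);
        bin := bin(6 downto 0) & '0';
      end loop;
      bcdout <= bcd;   -- result is registered, valid one clock after data
    end if;
  end process convert;
end architecture rtl;
```

The loop is still unrolled into combinational logic between the input and the output register, as described in point 2.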
Huh, strange. Have you tried making bcd a signal instead of a variable?
However, I think your main problem here is that you are trying to write VHDL in a "software" way, using a for loop and sequential logic. That is generally not the way you should write hardware descriptions. You should either use combinational logic, which involves concurrent assignment, or sequential logic, which involves doing things on the rising edge of the clock. It seems that what you are trying to implement is a combinational circuit. In that case, you should write separate concurrent assignments for each of your decimal digits. Take a look at http://www.csee.umbc.edu/portal/help/VHDL/concurrent.html for some examples of concurrent signal assignments. You will probably want to use either selected or conditional signal assignment.
bcdout is a signal, and you are using the variable assignment operator := with it
replace line
bcdout := bcd;
with
bcdout <= bcd;
I've not tried to compile to see if there are any other problems, but that should answer your question.

How to make a simple 4 bit parity checker in VHDL?

I am trying to learn VHDL and I'm trying to make a 4-bit parity checker. The idea is that the bits come from one input line (one bit per clock pulse) and the checker should find out if there is an odd number of 1s in the 4-bit sequence (e.g. 1011, 0100, etc.) and send an error output (e.g. error flag: error <= '1') if there is.
Would someone give me an example of how it's done, so that I can study it?
I have tried searching the web, but all the discussions I found were related to something way more complicated and I could not understand them.
The VHDL-2008 standard offers a new unary xor reduction operator to perform this operation. It is much simpler than the traditional solution offered by Aaron.
signal Data : std_logic_vector(3 downto 0) ;
signal Parity : std_logic ;
. . .
Parity <= xor Data ;
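Since the question describes the bits arriving serially, one per clock pulse, a running parity bit plus a small counter may be closer to what is being asked for than a parallel XOR. The following is only a sketch under that reading; the entity name serial_parity4 and the port names reset, din and err are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity serial_parity4 is
  port (
    clk   : in  std_logic;
    reset : in  std_logic;   -- synchronous, active high
    din   : in  std_logic;   -- serial input, one bit per clock
    err   : out std_logic    -- '1' if the last 4-bit group had an odd number of 1s
  );
end entity serial_parity4;

architecture rtl of serial_parity4 is
  signal count  : integer range 0 to 3 := 0;   -- bits already seen in the current group
  signal parity : std_logic := '0';            -- XOR of those bits
begin
  check : process (clk)
  begin
    if rising_edge(clk) then
      if reset = '1' then
        count  <= 0;
        parity <= '0';
        err    <= '0';
      elsif count = 3 then
        -- Fourth bit of the group: publish the parity of all four bits
        -- and start over for the next group.
        err    <= parity xor din;
        parity <= '0';
        count  <= 0;
      else
        parity <= parity xor din;
        count  <= count + 1;
      end if;
    end if;
  end process check;
end architecture rtl;
```

The err output holds its value until the next group of four bits has been received.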
This assumes "invec" is your input std_logic_vector:
parity <= invec(3) xor invec(2) xor invec(1) xor invec(0);
If it got any larger than 4 inputs, a loop would probably be best:
variable parity_v : std_logic := '0';
for i in invec'range loop
parity_v := parity_v xor invec(i);
end loop;
parity <= parity_v;
That loop would be converted into the proper LUT values at synthesis time.
(I did this from memory; may be slight syntax issues.)
There is a small syntax error in the code: the ":" after loop should be removed.
library ieee;
use ieee.std_logic_1164.all;
entity bus_parity is
generic(
WPARIN : integer := 8
);
port(
parity_in : in std_logic_vector(WPARIN-1 downto 0);
parity_out : out std_logic
);
end entity;
architecture rtl of bus_parity is
begin
process(parity_in)
variable i : integer;
variable result: std_logic;
begin
result := '0';
for i in parity_in'range loop
result := result xor parity_in(i);
end loop;
parity_out <= result;
end process;
end architecture;
Or in Verilog:
`timescale 1ns/10ps
`default_nettype none
module bus_parity #(
parameter WPARIN = 8
) (
input wire [WPARIN-1:0] parity_in,
output reg parity_out
);
always @* begin : parity
integer i;
reg result;
result = 1'b0;
for(i=0; i < WPARIN; i=i+1) begin // include the top bit, matching the VHDL loop over the full range
result = result ^ parity_in[i];
end
parity_out = result;
end
endmodule
`default_nettype wire
