Binary serial adder - VHDL - vhdl

I'm trying to design a 32-bit binary serial adder in VHDL, using a structural description. The adder should make use of a full adder and a D latch. The way I see it is:
Full adder:
architecture Behavioral of FullAdder is
begin
s <= (x xor y) xor cin;
cout <= (x and y) or (y and cin) or (x and cin);
end Behavioral;
D-Latch:
architecture Behavioral of dLatch is
begin
state: process(clk)
begin
if(clk'event and clk = '1') then
q <= d;
end if;
end process;
end Behavioral;
Serial adder:
add: process ( clk )
variable count : integer range 0 to 31;
variable aux : STD_LOGIC;
variable aux2 : STD_LOGIC;
begin
if(clk'event and clk = '1') then
fa: FullAdder port map(x(count), y(count), aux, s(count), aux2);
dl: dLatch port map(clock, aux2, aux);
count := count + 1;
end if;
end process;
However, it doesn't seem to work.
Also, what would be the simplest way to pipeline the serial adder?

"It doesn't seem to work" is pretty general, but one problem I see is that you are trying to instantiate the component fa: FullAdder within a process. Think about what component instantiation means in hardware, and you will realize that it makes no sense to instantiate the module on the rising_edge of clk...
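Move the instantiation out of the process (component instantiation is a concurrent statement, so it belongs directly in the architecture body), and that should at least remove the syntax error you should be seeing ("Illegal sequential statement." in ModelSim).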
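As a rough sketch of what "outside the process" could look like, here is one possible structural version. The entity declarations and port names (x, y, cin, s, cout and clk, d, q) are assumptions inferred from the positional port maps in the question, and the SerialAdder entity with one bit per clock (x_bit, y_bit, s_bit) is hypothetical. The shift registers that feed the bits LSB first are left out.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity FullAdder is
  port (x, y, cin : in std_logic; s, cout : out std_logic);
end entity;

architecture Behavioral of FullAdder is
begin
  s    <= (x xor y) xor cin;
  cout <= (x and y) or (y and cin) or (x and cin);
end architecture;

library ieee;
use ieee.std_logic_1164.all;

entity dLatch is
  port (clk, d : in std_logic; q : out std_logic);
end entity;

architecture Behavioral of dLatch is
begin
  process (clk)
  begin
    if rising_edge(clk) then  -- edge triggered, as in the question
      q <= d;
    end if;
  end process;
end architecture;

library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical top level: one operand bit pair per clock, LSB first.
entity SerialAdder is
  port (clk   : in  std_logic;
        x_bit : in  std_logic;
        y_bit : in  std_logic;
        s_bit : out std_logic);
end entity;

architecture Structural of SerialAdder is
  signal carry_in, carry_out : std_logic;
begin
  -- Both instantiations are concurrent statements in the architecture body.
  fa : entity work.FullAdder
    port map (x => x_bit, y => y_bit, cin => carry_in,
              s => s_bit, cout => carry_out);

  -- The carry produced for bit i is stored and reused for bit i+1.
  dl : entity work.dLatch
    port map (clk => clk, d => carry_out, q => carry_in);
end architecture;
```

Note that the stored carry has no reset here, so it starts as 'U' in simulation. A real design would need a way to clear it to '0' before each new addition.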

For pipelining the serial adder, the best way is to connect the adders and d flip-flops one after the other. So, you would have the cout of the first adder be the input of a flip-flop. The output of that flip-flop will be the cin of the next adder and so on. Be careful though, because you will also have to pipeline the s of each adder, as well as each bit of the input, by essentially putting several d flip-flops in a row to copy them through the various pipeline stages.
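A minimal sketch of that idea, written behaviorally in a single clocked process rather than structurally to keep it short. The entity name PipelinedAdder, the generic N and all signal names are made up for illustration. Stage i adds bit i and copies the operands, the partial sum and the carry into its own registers, so a result appears N clock cycles after its operands.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity PipelinedAdder is
  generic (N : positive := 4);
  port (clk  : in  std_logic;
        x, y : in  std_logic_vector(N-1 downto 0);
        s    : out std_logic_vector(N-1 downto 0));
end entity;

architecture rtl of PipelinedAdder is
  type stage_array is array (0 to N-1) of std_logic_vector(N-1 downto 0);
  -- xs(i), ys(i), ss(i): operands and partial sum registered by stage i
  signal xs, ys, ss : stage_array;
  -- c(i): carry out of bit i, registered by stage i
  signal c : std_logic_vector(0 to N-1);
begin
  process (clk)
    variable xv, yv, sv : std_logic_vector(N-1 downto 0);
    variable cv : std_logic;
  begin
    if rising_edge(clk) then
      for i in 0 to N-1 loop
        if i = 0 then
          -- First stage takes the inputs directly, with carry-in 0.
          xv := x; yv := y; sv := (others => '0'); cv := '0';
        else
          -- Later stages read the previous stage's registers
          -- (old signal values, so each stage is one clock apart).
          xv := xs(i-1); yv := ys(i-1); sv := ss(i-1); cv := c(i-1);
        end if;
        sv(i) := xv(i) xor yv(i) xor cv;  -- full adder sum for bit i
        xs(i) <= xv;
        ys(i) <= yv;
        ss(i) <= sv;
        c(i)  <= (xv(i) and yv(i)) or (yv(i) and cv) or (xv(i) and cv);
      end loop;
    end if;
  end process;

  s <= ss(N-1);  -- complete sum leaves the last stage
end architecture;
```

This copies every bit through every stage, which is more registers than strictly needed (stage i only needs operand bits above i and sum bits up to i), but it keeps the sketch easy to follow.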

Related

VHDL - synthesis results is not the same as behavioral

I have to write a program in VHDL that calculates a square root using Newton's method. I wrote code that looks fine to me, but it does not work.
Behavioral simulation gives the proper output value, but post-synthesis simulation (and the design running on hardware) does not.
The program is implemented as a state machine. The input value is an integer (in std_logic_vector format), and the output is fixed point (for calculation purposes the input value is multiplied by 64^2, so the 6 LSBs of the output are the fractional part).
I used a division function for VHDL from the vhdlguru blogspot.
In behavioral simulation, calculating the square root takes about 350 ns (Tclk = 10 ns), but in post-synthesis simulation it takes only 50 ns.
Used code:
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
use ieee.std_logic_unsigned.all;
entity moore_sqrt is
port (clk : in std_logic;
enable : in std_logic;
input : in std_logic_vector (15 downto 0);
data_ready : out std_logic;
output : out std_logic_vector (31 downto 0)
);
end moore_sqrt;
architecture behavioral of moore_sqrt is
------------------------------------------------------------
function division (x : std_logic_vector; y : std_logic_vector) return std_logic_vector is
variable a1 : std_logic_vector(x'length-1 downto 0):=x;
variable b1 : std_logic_vector(y'length-1 downto 0):=y;
variable p1 : std_logic_vector(y'length downto 0):= (others => '0');
variable i : integer:=0;
begin
for i in 0 to y'length-1 loop
p1(y'length-1 downto 1) := p1(y'length-2 downto 0);
p1(0) := a1(x'length-1);
a1(x'length-1 downto 1) := a1(x'length-2 downto 0);
p1 := p1-b1;
if(p1(y'length-1) ='1') then
a1(0) :='0';
p1 := p1+b1;
else
a1(0) :='1';
end if;
end loop;
return a1;
end division;
--------------------------------------------------------------
type state_type is (s0, s1, s2, s3, s4, s5, s6); --type of state machine
signal current_state,next_state: state_type; --current and next state declaration
signal xk : std_logic_vector (31 downto 0);
signal temp : std_logic_vector (31 downto 0);
signal latched_input : std_logic_vector (15 downto 0);
signal iterations : integer := 0;
signal max_iterations : integer := 10; --corresponds with accuracy
begin
process (clk,enable)
begin
if enable = '0' then
current_state <= s0;
elsif clk'event and clk = '1' then
current_state <= next_state; --state change
end if;
end process;
--state machine
process (current_state)
begin
case current_state is
when s0 => -- reset
output <= "00000000000000000000000000000000";
data_ready <= '0';
next_state <= s1;
when s1 => -- latching input data
latched_input <= input;
next_state <= s2;
when s2 => -- start calculating
-- initial value is set as a half of input data
output <= "00000000000000000000000000000000";
data_ready <= '0';
xk <= "0000000000000000" & division(latched_input, "0000000000000010");
next_state <= s3;
iterations <= 0;
when s3 => -- division
temp <= division ("0000" & latched_input & "000000000000", xk);
next_state <= s4;
when s4 => -- calculating
if(iterations < max_iterations) then
xk <= xk + temp;
next_state <= s5;
iterations <= iterations + 1;
else
next_state <= s6;
end if;
when s5 => -- shift logic right by 1
xk <= division(xk, "00000000000000000000000000000010");
next_state <= s3;
when s6 => -- stop - proper data
-- output <= division(xk, "00000000000000000000000001000000"); --the nearest integer value
output <= xk; -- fixed point 24.6, sqrt = output/64;
data_ready <= '1';
end case;
end process;
end behavioral;
Below are screenshots of the behavioral and post-synthesis simulation results:
[Figure: behavioral simulation]
[Figure: post-synthesis simulation]
I have only a little experience with VHDL and I have no idea what I can do to fix the problem. I tried to move the calculation into a separate process, but that did not work either.
I hope you can help me.
Platform: Zynq ZedBoard
IDE: Vivado 2014.4
Regards,
Michal
A lot of the problems can be eliminated if you rewrite the state machine in single process form, in a pattern similar to this. That will eliminate both the unwanted latches, and the simulation /synthesis mismatches arising from sensitivity list errors.
I believe you are also going to have to rewrite the division function with its loop in the form of a state machine - either a separate state machine, handshaking with the main one to start a divide and signal its completion, or as part of a single hierarchical state machine as described in this Q&A.
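To illustrate the single-process form, here is a bare skeleton. It is not the sqrt design itself: the entity fsm_sketch, the start input and the state names are placeholders, and the arithmetic is omitted. Everything is assigned inside one clocked process, so every signal becomes a register and the sensitivity list only needs clk.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity fsm_sketch is
  port (clk, reset, start : in  std_logic;
        data_ready        : out std_logic);
end entity;

architecture rtl of fsm_sketch is
  type state_type is (idle, iterate, done);
  constant MAX_ITERATIONS : natural := 10;
  signal state      : state_type;
  signal iterations : natural range 0 to MAX_ITERATIONS;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if reset = '1' then            -- synchronous reset
        state      <= idle;
        iterations <= 0;
        data_ready <= '0';
      else
        case state is
          when idle =>
            data_ready <= '0';
            if start = '1' then
              iterations <= 0;
              state      <= iterate;
            end if;
          when iterate =>
            -- One Newton step per visit would go here.
            if iterations < MAX_ITERATIONS then
              iterations <= iterations + 1;
            else
              state <= done;
            end if;
          when done =>
            data_ready <= '1';       -- one-cycle pulse, cleared in idle
            state      <= idle;
        end case;
      end if;
    end if;
  end process;
end architecture;
```

Because nothing is assigned outside the clocked branch, no latches can be inferred, and simulation and synthesis see the same behavior.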
This code is neither correct for simulation nor for synthesis.
Simulation issues:
Your sensitivity list is not complete, so the simulation does not show the correct behavior of the synthesized hardware. All right-hand-side signals should be included if the process is not clocked.
Synthesis issues:
Your code produces masses of latches. There is only one register called current_state. Latches should be avoided unless you know exactly what you are doing.
You can't divide numbers in the way you are using the function, if you want to keep a proper frequency of your circuit.
=> So check your Fmax report and
=> the RTL schematic or synthesis report for resource utilization.
Don't use the division to shift bits. Even in software, a compiler does not implement a division when a value is divided by a power of two; it emits a shift. Use a shift operation to shift a value.
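For the halving step, a shift could look like the sketch below. It assumes ieee.numeric_std instead of the std_logic_arith / std_logic_unsigned packages used in the question, and the entity name halve is only there to make the example self-contained.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity halve is
  port (xk   : in  std_logic_vector(31 downto 0);
        half : out std_logic_vector(31 downto 0));
end entity;

architecture rtl of halve is
begin
  -- Logical shift right by one bit: unsigned division by 2, no divider needed.
  half <= std_logic_vector(shift_right(unsigned(xk), 1));
  -- Equivalent without numeric_std:
  -- half <= '0' & xk(31 downto 1);
end architecture;
```

In hardware this is just wiring, so it costs no logic at all.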
Other things to rethink:
enable is a low active asynchronous reset. Synchronous resets are better for FPGA implementations.
VHDL code may be synthesizable or not, and the synthesis result may behave as the simulation, or not. This depends on the code, the synthesizer, and the target platform, and is very normal.
Behavioral code is good for test-benches, but - in general - cannot be synthesized.
Here I see the most obvious issue with your code:
process (current_state)
begin
[...]
iterations <= iterations + 1;
[...]
end process;
You are iterating over a signal which does not appear in the sensitivity list of the process. This might be OK for the simulator, which executes the process blocks just like software. On the other hand, the synthesis result is totally unpredictable. But adding iterations to the sensitivity list is not enough. You would just end up with an asynchronous design. Your target platform is a clocked device. State changes may only occur at the trigger edge of the clock.
You need to tell the synthesizer how to map the iterations required to perform this calculation over the clock cycles. The safest way to do that is to break down the behavioural code into RTL code (https://en.wikipedia.org/wiki/Register-transfer_level#RTL_in_the_circuit_design_cycle).

Variable or signal in vhdl for shared value between different process

I need to share a value (a real) between two processes, but when I try to run my code, Quartus gives me an error.
library IEEE;
USE ieee.std_logic_1164.all;
USE ieee.std_logic_arith.all;
USE ieee.std_logic_unsigned.all;
use IEEE.MATH_REAL.ALL;
entity de0nano is
port (
CLOCK_50 : in std_logic;
KEY : in std_logic_vector(1 downto 0);
SW : in std_logic_vector(3 downto 0);
LED : out std_logic_vector(7 downto 0);
GPIO : inout std_logic_vector(35 downto 0)
);
end de0nano;
architecture struct of de0nano is
--declarations
signal PN : real :=0.0 ;
signal PR : real :=0.0 ;
signal RC : integer :=1;
signal NC : integer :=1;
signal BET : integer :=1;
begin
count : process (CLOCK_50, GPIO)
begin
--A <= KEY(0);
GPIO(24) <= '1';
--functional coding
LED <= "00011000";
if (pn > pr) then
GPIO(26) <= '1';
LED <= "00000001";
else
GPIO(26) <= '0';
end if;
if (pn = pr) then
GPIO(26) <= '1';
LED <= "00000010";
else
GPIO(26) <= '0';
end if;
if (pn < pr) then
GPIO(26) <= '1';
LED <= "00000011";
else
GPIO(26) <= '0';
end if;
end process;
probabilityController : process (CLOCK_50, KEY)
begin
--stato iniziale
if((RC + NC + BET)=1) then
pr <= 0.5;
pn <= 0.5;
end if;
--sequenza rossi consecutivi
if(RC>0) then
pr <= (5)**RC;
pn <= 1- (5)**RC;
end if;
--sequenza neri consecutivi
if(NC>0) then
pr <= (5)**NC;
pn <= 1- (5)**NC;
end if;
end process;
betController : process (CLOCK_50)
begin
end process;
colorController : process (CLOCK_50, KEY)
begin
if(KEY(0)='1') then
NC<=0;
RC <= RC+1;
end if;
if(KEY(1)='1') then
RC<=0;
NC <= NC+1;
end if;
end process;
end str
How can I operate in the same signal/variable from two different processes?
VHDL is a hardware description language. A VHDL description can be simulated (executed a bit like you do with most programming languages) or synthesized (transformed in a network of interconnected simple hardware elements). Some tools are pure simulators (Mentor Graphics Modelsim, Cadence ncsim...), others are pure synthesizers (Mentor Graphics Precision RTL, Cadence RTL compiler...) and others can do both. Quartus pertains to the last category. So, the first thing to do is to decide whether you want to simulate, synthesize or both.
In case you want to simulate you must fix three errors:
the position of your signal declaration,
the way you assign it (:=) which is the variable assignment operator, not the signal assignment (<=)
and the fact that you drive it from two processes while it is of an unresolved type (real). See this other answer for resolved / unresolved VHDL types.
Your code could then look like this (but as I do not know what you are trying to do, it is probably not what you want):
architecture V1 of AOI is
Signal foobar : real := 0.0;
begin
OneTwo : process (clk)
Begin
Foobar <= foobar + 2.0;
End process;
end V1;
If you want to synthesize you will have to fix a few more problems:
You are using the real type which is the floating point VHDL type. This is not synthesizable by the synthesizers I know. Indeed, what would you expect the synthesizer to do? Instantiate a complete floating point unit? What brand? So, you will have to replace real by some other type (integers, bit vectors...).
You are assigning your signal on both edges of what I believe is your clock (clk). This is probably not what you want.
You are initializing the signal at declaration time. This is usually not synthesizable by the synthesizers I know. In fact this initialization time has a clear meaning for simulation: it is the beginning of the simulation. But what about hardware? What is the beginning of a piece of hardware? Manufacturing? Power up? So, if you want the signal to be initialized at some point you will have to add a hardware reset, driven by a reset input.
All in all you could have something like:
architecture V1 of AOI is
Signal foobar : natural range 0 to 255;
begin
OneTwo : process (clk)
Begin
if rising_edge(clk) then
if reset = '1' then
foobar <= 0;
else
foobar <= foobar + 2;
end if;
end if;
End process;
end V1;
Notes:
VHDL is case insensitive but you should try to be consistent, it will help you.
You should probably take a VHDL course or read a VHDL primer before trying to use the language. It is radically different from the programming languages you already know. Hardware and software are pretty different worlds, even if they are strongly connected at the end.

Why it is necessary to use internal signal for process?

I'm learning VHDL from the root, and everything is OK except this. I found this from Internet. This is the code for a left shift register.
library ieee;
use ieee.std_logic_1164.all;
entity lsr_4 is
port(CLK, RESET, SI : in std_logic;
Q : out std_logic_vector(3 downto 0);
SO : out std_logic);
end lsr_4;
architecture sequential of lsr_4 is
signal shift : std_logic_vector(3 downto 0);
begin
process (RESET, CLK)
begin
if (RESET = '1') then
shift <= "0000";
elsif (CLK'event and (CLK = '1')) then
shift <= shift(2 downto 0) & SI;
end if;
end process;
Q <= shift;
SO <= shift(3);
end sequential;
My problem is the third line from the bottom. Why do we need to pass the internal signal value to the output? In other words, what would be the problem if I wrote Q <= shift(2 downto 0) & SI?
In the case of the shown code, the Q output of the lsr_4 entity comes from a register (shift representing a register stage and being connected to Q). If you write the code as you proposed, the SI input is connected directly (i.e. combinationally) to the Q output. This can also work (assuming you leave the rest of the code in place); it performs the same operation logically, except that it eliminates one clock cycle of latency. However, it's (generally) considered good design practice to have an entity's outputs registered in order not to introduce long "hidden" combinational paths which are not visible unless you look inside the entity. It usually makes designing easier and avoids running into timing problems.
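To make the difference concrete, the sketch below puts both variants side by side in one entity. The entity name lsr_4_compare and the two output names Q_reg and Q_comb are invented for the comparison; the register itself is the one from the question.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity lsr_4_compare is
  port (CLK, RESET, SI : in  std_logic;
        Q_reg          : out std_logic_vector(3 downto 0);
        Q_comb         : out std_logic_vector(3 downto 0));
end entity;

architecture sequential of lsr_4_compare is
  signal shift : std_logic_vector(3 downto 0);
begin
  process (RESET, CLK)
  begin
    if RESET = '1' then
      shift <= "0000";
    elsif rising_edge(CLK) then
      shift <= shift(2 downto 0) & SI;
    end if;
  end process;

  -- Registered output: changes only after a clock edge (or reset).
  Q_reg <= shift;

  -- Combinational output: bit 0 follows SI immediately, so this shows
  -- the value the register will hold after the next clock edge.
  Q_comb <= shift(2 downto 0) & SI;
end architecture;
```

In a simulation, Q_comb(0) toggles whenever SI toggles, even between clock edges, which is exactly the hidden combinational path from input to output mentioned above.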
First, this is just a shift register, so no combinational blocks should be inferred (except for input and output buffers, which are I/O related, not related to the circuit proper).
Second, the signal called "shift" can be eliminated altogether by specifying Q as "buffer" instead of "out" (this is needed because Q would appear on both sides of the expression; "buffer" has no side effects on the inferred circuit). A suggestion for your code follows.
Note: After compiling your code, check in the Netlist Viewers / Technology Map Viewer tool what was actually implemented.
library ieee;
use ieee.std_logic_1164.all;
entity generic_shift_register is
generic (
N: integer := 4);
port(
CLK, RESET, SI: in std_logic;
Q: buffer std_logic_vector(N-1 downto 0);
SO: out std_logic);
end entity;
architecture sequential of generic_shift_register is
begin
process (RESET, CLK)
begin
if (RESET = '1') then
Q <= (others => '0');
elsif rising_edge(CLK) then
Q <= Q(N-2 downto 0) & SI;
end if;
end process;
SO <= Q(N-1);
end architecture;

8 bit serial adder with accumulator

I am writing VHDL code to implement an 8-bit serial adder with accumulator.
When I simulate it, the output is always zero! And sometimes it gives me the same number, but shifted!
I don't know what the problem is. I tried to declare A and B as inout, but that didn't work either. Can anybody help, please?
This is the code:
entity SA is
Port ( st : in std_logic;
A,B: inout std_logic_vector ( 7 downto 0);
clk : in std_logic;
acc : out bit_vector(7 downto 0)); end SA;
architecture Behavioral of SA is
signal ps,ns: integer range 0 to 7;
signal C,D: bit_vector (7 downto 0);
signal ci,ciplus,si,sh:bit;
begin
si<=A(0) xor B(0) xor ci ;
ciplus <=(A(0) and B(0)) or (A(0) and ci ) or ( B(0) and ci );
process(ps,st)
begin
case ps is
when 0=> if(st='0')then
ns<=0;
else
ns<=1;
sh<='1';
end if;
when 1 to 6 => sh<='1';
ns<= ps+1;
when 7=> sh<='1';
ns <=0;
end case;
end process;
process(clk)
begin
if(clk 'event and clk ='1')then
ps <= ns;
ci<= ciplus;
end if;
if(sh='1') then
C<=si & A(7 downto 1) ;
D<=B(0) & B(7 downto 1);
end if;
end process;
acc<= C;
end Behavioral;
Your second process is written incorrectly. Prior to writing a process, you should always decide whether the process is sequential or combinatorial, and then write the process accordingly.
To help you write your code, especially when starting out with hardware description languages, please please please always draw a block diagram first, and then describe that block diagram using VHDL.
As it is, your second process:
Mixes combinatorial and sequential logic.
Is missing signals in the process sensitivity list.
Generates a latch because C and D are not assigned in all paths through the process.
Your first process has similar problems.
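As one possible shape for a cleaned-up version, here is a sketch with a purely combinatorial full adder and a single purely sequential process. It is not a drop-in fix of the original: the entity name SA_sketch, the internal registers ra and rb, the bit counter and the done output are all assumptions, and A and B are plain inputs that get loaded into the registers when st is high.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity SA_sketch is
  port (clk, st : in  std_logic;
        A, B    : in  std_logic_vector(7 downto 0);
        acc     : out std_logic_vector(7 downto 0);
        done    : out std_logic);
end entity;

architecture rtl of SA_sketch is
  signal ra, rb : std_logic_vector(7 downto 0) := (others => '0');
  signal ci     : std_logic := '0';
  signal count  : integer range 0 to 8 := 0;  -- 0 = idle, else bits left
  signal si, co : std_logic;
begin
  -- Combinatorial: full adder on the two LSBs and the stored carry.
  si <= ra(0) xor rb(0) xor ci;
  co <= (ra(0) and rb(0)) or (ra(0) and ci) or (rb(0) and ci);

  -- Sequential: everything here changes only on the rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      if count = 0 then
        if st = '1' then             -- load operands, clear carry
          ra    <= A;
          rb    <= B;
          ci    <= '0';
          count <= 8;
        end if;
      else
        ra    <= si & ra(7 downto 1);     -- sum bit shifts into accumulator
        rb    <= rb(0) & rb(7 downto 1);  -- addend rotates
        ci    <= co;
        count <= count - 1;
      end if;
    end if;
  end process;

  acc  <= ra;
  done <= '1' when count = 0 else '0';
end architecture;
```

After a load and eight shifts, ra holds the low 8 bits of A + B. If st is still high when the count reaches zero, the operands are loaded again on the next edge.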
Try initializing ps and ns and see if that does the trick. I am on my phone now, so I can't simulate it to help, but usually my problems in VHDL designs come from uninitialized integers:
signal ps,ns: integer range 0 to 7:=0;
You might also want to check your warnings list to see if that helps.

PRBS Generator module in VHDL

Here I am posting a snapshot of the PRBS generator.
My code for prbs module is
-- Module Name: prbs - Behavioral
-- Project Name: modulator
-- Description:
--To make it of N bit replace existing value of N with desired value of N
----------------------------------------------------------------------------------
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;
entity prbs is
Port ( pclock : in STD_LOGIC;
preset : IN std_logic := '0';
prbsout : out STD_LOGIC);
end prbs;
architecture Behavioral of prbs is
COMPONENT dff is
PORT(
dclock : IN std_logic;
dreset : IN std_logic;
din : IN std_logic ;
dout : OUT std_logic
);
END COMPONENT;
signal dintern : std_logic_vector (4 downto 1); --Change value of N to change size of shift register
signal feedback : std_logic := '0';
begin
instdff : dff port map (pclock , preset , feedback , dintern(1));
genreg : for i in 2 to 4 generate --Change Value of N Here to generate that many instance of d flip flop
begin
instdff : dff port map ( pclock , preset , dintern(i-1) , dintern(i));
end generate genreg;
main : process(pclock)
begin
if pclock'event and pclock = '1' then
if preset = '0' then
if dintern /= "0" then
feedback <= dintern(1) xor dintern(3); -- For N equals four;
--feedback <= dintern(4) xor dintern(5) xor dintern(6) xor dintern(8); -- For N equals eight;
--feedback <= dintern(11) xor dintern(13) xor dintern(14) xor dintern(16); -- For N equals sixteen;
--feedback <= dintern(1) xor dintern(2) xor dintern(22) xor dintern(32); -- For N equals thirty two
else
feedback <= '1';
end if;
end if;
end if;
end process main;
prbsout <= dintern(4) ; --Change Value of N Here to take output to top entity
end Behavioral;
In it i am instantiating a d flip flop module
d ff module code
----------------------------------------------------------------------------------
-- Module Name: dff - Behavioral
-- Project Name:
-- Description:
----------------------------------------------------------------------------------
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;
entity dff is
Port ( dclock : in STD_LOGIC ;
dreset : in STD_LOGIC ;
din : in STD_LOGIC;
dout : out STD_LOGIC);
end dff;
architecture Behavioral of dff is
begin
process(dclock)
begin
if dclock'event and dclock = '1' then
if dreset = '0' then
dout <= din;
else
dout <= '1';
end if;
end if;
end process;
end Behavioral;
But I am not getting the desired output.
In the top-level entity I always get 1 on the prbsout signal.
When I try to simulate, the prbsout signal becomes undefined.
What am I missing?
The prbs module reset at preset does not apply to the feedback signal,
probably because the intention was to use the initial value of 0 assigned in
the declaration of the feedback signal. However, since the dff modules
use synchronous reset, the dintern signal will be undriven (U) at start,
and since the next value for feedback is calculated using dintern(1) in an
xor, feedback becomes undefined right after start and can't recover,
even if a lengthy reset is applied. See waveform from ModelSim below.
An immediate fix for the reset issue is to apply reset for feedback also in
the main process:
...
else -- preset /= '0'
feedback <= '0';
...
Now at least reset works, and can make the prbs generate a sequence. See
waveform below.
Just a few additional comments to the code, while at it:
Instead of dclock'event and dclock = '1' you can use rising_edge(dclock),
which I think most readers will find easier to understand, and it is less
error prone.
For most tools, it is unnecessary to make a separate module just for a
flip-flop, like the dff module, since the tools can infer flip-flops
directly from the process, even when advanced expressions are used for signal
assignments.
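As a small illustration of flip-flop inference, the four dff instances could be replaced by one clocked process, roughly like this. The entity name shift4 and its ports are hypothetical; the synchronous reset to '1' mirrors what the dff module does.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity shift4 is
  port (clk, rst, din : in  std_logic;
        dout          : out std_logic);
end entity;

architecture rtl of shift4 is
  signal sr : std_logic_vector(4 downto 1);
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        sr <= (others => '1');        -- synchronous reset, like dff
      else
        sr <= sr(3 downto 1) & din;   -- din enters at bit 1, shifts toward 4
      end if;
    end if;
  end process;

  dout <= sr(4);
end architecture;
```

A synthesis tool should infer four flip-flops from the sr assignment without any explicit flip-flop component.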
But, I don't think the output is what you actually want. Based on your
design, and the selected taps for the LFSR, it looks like you want to generate
maximum length LFSR sequences, that is sequences with a length of 2 ** N - 1
for a LFSR register being N bits long.
The principles of LFSRs and the taps for feedback generation are described
on Wikipedia: Linear feedback shift register.
However, since the feedback signal is generated as a flip-flop, it becomes
part of the LFSR shift register and thus adds a bit to the length. Because the tap
values are based on the dintern part of the LFSR only, the taps will be
wrong. Selecting the wrong bits will result in an LFSR sequence that is shorter
than the maximum sequence, and you can also see that in the simulation output,
where the sequence is only 6 cycles long, even though the dintern(4 downto
1) + feedback together make a 5-bit register.
So a more thorough rewrite of the prbs module is required, if what you want
is to generate maximum length PRBS sequences, and below is an example of how
the prbs module can be written:
library ieee;
use ieee.std_logic_1164.all;
entity prbs_new is
generic(
BITS : natural);
port(
clk_i : in std_logic;
rst_i : in std_logic;
prbs_o : out std_logic);
end entity;
library ieee;
use ieee.numeric_std.all;
architecture syn of prbs_new is
signal lfsr : std_logic_vector(BITS downto 1); -- Flip-flops with LFSR state
function feedback(slv : std_logic_vector) return std_logic is -- For maximum length LFSR generation
begin
case slv'length is
when 3 => return slv( 3) xor slv( 2);
when 4 => return slv( 4) xor slv( 3);
when 8 => return slv( 8) xor slv( 6) xor slv( 5) xor slv(4);
when 16 => return slv(16) xor slv(15) xor slv(13) xor slv(4);
when 32 => return slv(32) xor slv(22) xor slv( 2) xor slv(1);
when others => report "feedback function not defined for slv'length as " & integer'image(slv'length)
severity FAILURE;
return 'X';
end case;
end function;
begin
process (clk_i, rst_i) is
begin
if rising_edge(clk_i) then
if unsigned(lfsr) /= 0 then
lfsr <= lfsr(lfsr'left - 1 downto lfsr'right) & feedback(lfsr); -- Left shift with feedback in
end if;
end if;
if rst_i = '1' then -- Asynchronous reset
lfsr <= std_logic_vector(to_unsigned(1, BITS)); -- Reset assigns 1 to lfsr signal
end if;
end process;
prbs_o <= lfsr(BITS); -- Drive output
end architecture;
Comments on the prbs_new module:
Generic BITS is added so that different LFSR lengths can be made from the same code.
Ports are named with "_i" for inputs and "_o" for outputs, since this naming
convention is very useful when tracing signals at a top level with multiple
modules.
The VHDL standard package ieee.numeric_std is used instead of
the non-standard package ieee.std_logic_unsigned.
Asynchronous reset is used instead of synchronous reset and initial value in
the signal declaration.
The advantage over synchronous reset is that asynchronous reset typically
applies to a dedicated input on the flip-flops in FPGA and ASIC
technology, and not in the potentially timing critical data path, whereby
the design can be faster.
The advantage over an initial value in the signal declaration is that FPGA and
ASIC technologies are more likely to be able to implement this; there are
cases where initial values are not supported. Also, a functional reset
makes restart possible in a test bench without having to reload the
simulator.
There is no check for an all-0 value of the lfsr signal in the process,
since the lfsr will never get an all-0 value if proper maximum length taps
are used, and the lfsr signal is reset to a non-0 value.
It looks like you are never setting your internal state (dintern) to a known value. Since all subsequent states are calculated from your initial dintern value, they are unknown as well. Try assigning an initial state to dintern, or fixing your preset code to actually do something when preset is high (and then assert it at the start of your testbench).
