Array of 1-bit-wide memory - vhdl

I'm using ISE 14.7 and i'm trying to create design with some 1-bit-wide distributed RAM blocks.
My memory declaration:
type tyMemory is array (0 to MEMORY_NUM - 1) of std_logic_vector(MEMORY_SIZE - 1 downto 0) ;
signal Memory: tyMemory := ( others => (others => '0')) ;
attribute ram_style : string ;
attribute ram_style of Memory : signal is "distributed" ;
My code:
MemoryGen : for i in 0 to MEMORY_NUM - 1 generate
process( CLK )
begin
if rising_edge(CLK) then
if CE = '1' then
DataOut(i) <= Memory(i)(Addr(i)) ;
Memory(i)(Addr(i)) <= DataIn(i) ;
end if ;
end if ;
end process ;
end generate ;
After synthesis i get this warning:
WARNING:Xst:3012 - Available block RAM resources offer a maximum of two write ports.
You are apparently describing a RAM with 16 separate write ports for signal <Memory>.
The RAM will be expanded on registers.
How can i forse xst to use distributed memory blocks with size=ARRAY_LENGTH and width=1?
I can create and use separate memory component(and is works), but i need more elegant solution.

You need to create an entity that describes a variable length 1-bit wide memory, then use a generate statement to create an array of these. While what you have done would provide the functionality you are asking for in a simulator, most FPGA tools will only extract memory elements if your code is written in certain ways.
You can find documentation on what code Xilinx ISE tools will understand as a memory element by selecting the appropriate document for your device here http://www.xilinx.com/support/documentation/sw_manuals/xilinx14_7/ise_n_xst_user_guide_v6s6.htm . Look under 'HDL Coding techniques'.
Note that if your memory length is large, you will not be able to get maximum performance from it without adding manual pipelining. I think you will get a synthesis message if your memory exceeds the intended useful maximum length for distributed memories. Assuming you are using a Spartan 6 device, you can find information on what the useful supported distributed memory sizes are here: http://www.xilinx.com/support/documentation/user_guides/ug384.pdf page 52.

This should inferre 16 one bit wide BlockRAMs:
architecture ...
attribute ram_style : string;
subtype tyMemory is std_logic_vector(MEMORY_SIZE - 1 downto 0) ;
begin
genMem : for i in 0 to MEMORY_NUM - 1 generate
signal Memory : tyMemory := (others => '0');
attribute ram_style of Memory : signal is "block";
begin
process(clk)
begin
if rising_edge(clk) then
if CE = '1' then
Memory(Addr(i)) <= DataIn(i) ;
DataOut(i) <= Memory(Addr(i)) ;
end if ;
end if ;
end process ;
end generate ;
end architecture;

Related

How generate sine wave with vhdl?

I am a beginner in vhdl, I am trying to generate a sinus and square singal with a frequency of 50 Mhz, but first i'm trying to generate the sinus wave. I saw a lot of tutorials but it was quite complicated to understand. Here is the code I made. Thank you in advance for your help :)
Indications
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
entity sinus is
port(clk : in std_logic;
clear : in std_logic;
sel : in std_logic_vector(1 downto 0);
Dataout : out std_logic_vector(7 downto 0));
end sinus;
architecture Behavioral of sinus is
signal in_data : std_logic_vector(Dataout'range);
signal i : integer range 0 to 77:=0;
TYPE mem_data IS ARRAY (0 TO 255) OF integer range -128 to 127;
constant sin : mem_data := (
( 0),( 3),( 6),( 9),( 12),( 15),( 18),( 21),( 24),( 28),( 31),( 34),( 37),( 40),( 43),( 46), ( 48),( 51),( 54),( 57),( 60),( 63),( 65),( 68),( 71),( 73),( 76),( 78),( 81),( 83),( 85),( 88), ( 90),( 92),( 94),( 96),( 98),( 100),( 102),( 104),( 106),( 108),( 109),( 111),( 112),( 114),( 115),( 117), ( 118),( 119),( 120),( 121),( 122),( 123),( 124),( 124),( 125),( 126),( 126),( 127),( 127),( 127),( 127),( 127), ( 127),( 127),( 127),( 127),( 127),( 127),( 126),( 126),( 125),( 124),( 124),( 123),( 122),( 121),( 120),( 119), ( 118),( 117),( 115),( 114),( 112),( 111),( 109),( 108),( 106),( 104),( 102),( 100),( 98),( 96),( 94),( 92), ( 90),( 88),( 85),( 83),( 81),( 78),( 76),( 73),( 71),( 68),( 65),( 63),( 60),( 57),( 54),( 51), ( 48),( 46),( 43),( 40),( 37),( 34),( 31),( 28),( 24),( 21),( 18),( 15),( 12),( 9),( 6),( 3), ( 0),( -3),( -6),( -9),( -12),( -15),( -18),( -21),( -24),( -28),( -31),( -34),( -37),( -40),( -43),( -46), ( -48),( -51),( -54),( -57),( -60),( -63),( -65),( -68),( -71),( -73),( -76),( -78),( -81),( -83),( -85),( -88), ( -90),( -92),( -94),( -96),( -98),(-100),(-102),(-104),(-106),(-108),(-109),(-111),(-112),(-114),(-115),(-117), (-118),(-119),(-120),(-121),(-122),(-123),(-124),(-124),(-125),(-126),(-126),(-127),(-127),(-127),(-127),(-127), (-127),(-127),(-127),(-127),(-127),(-127),(-126),(-126),(-125),(-124),(-124),(-123),(-122),(-121),(-120),(-119), (-118),(-117),(-115),(-114),(-112),(-111),(-109),(-108),(-106),(-104),(-102),(-100),( -98),( -96),( -94),( -92), ( -90),( -88),( -85),( -83),( -81),( -78),( -76),( -73),( -71),( -68),( -65),( -63),( -60),( -57),( -54),( -51), ( -48),( -46),( -43),( -40),( -37),( -34),( -31),( -28),( -24),( -21),( -18),( -15),( -12),( -9),( -6),( -3));
begin
process(clk, clear) begin
if (clear='1') then
in_data <= (others => '0');
elsif (clk'event and clk='1') then
in_data <= in_data +1;
end if;
end process;
process (in_data(3))
begin
if (in_data(3)'event and in_data(3)='1') then
in_data <=conv_std_logic_vector(sin(i).8);
i<=i+1;
if (i=77) then
i<=0;
end if;
end if;
end process;
process(in_data, sel) begin
case sel is
when "00" =>Dataout<=in_data;
when others =>Dataout<= "00000000";
end case;
end process;
end Behavioral;
Thanks for the diagram. Because VHDL is hardware description language, the question you should ask yourself is : what do I want to implement ? where are the entity ports, the internal signals on this block diagram ?
The main issue you face is that you've no real correspondence between your design and your illustration. Moreover your VHDL coding style needs improvement both regarding the VHDL language itself and how you implement things.
Start by replacing
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
by
use IEEE.NUMERIC_STD.ALL;
This is the official vendor-agnostic VHDL package for signed and unsigned types. IEEE.STD_LOGIC_UNSIGNED and STD_LOGIC_ARITH.ALL are outdated vendor-dependent VHDL packages provided at a time where vendors couldn't agree on a common definition. This led to a lot of portability issues when having a design flow with tools from different vendors (Cadence, Mentor, Synopsys). The IEEE specification body solved once and for all this issue.
If you're designing real hardware, the following code can lead to implementation problems
process(clk, clear) begin
if (clear='1') then
in_data <= (others => '0');
elsif (clk'event and clk='1') then
in_data <= in_data +1;
end if;
end process;
Your in_data signal is asynchronously cleared but synchronously set. As you don't control when the clear signal will be asserted relative to the clk rising edge, there is a risk of in_data metastability or even that some bits of the in_data remains in reset while the others capture a new value.
For a description of the various ways to implement the reset Xilinx has a well documented white paper (WP272) which provides some guidelines useful for any synchronous design be it ASIC or FPGA, Xilinx or Altera. By the way if you have a look, for example, to the widely popular AMBA specifications (AHB, AXI, AXI-Stream, ...), their reset is asserted asynchronously but deasserted synchronously.
Personnally, following Xilinx guidelines, I distribute an asynchronous reset through the entire design and generate locally a synchronous reset with the following piece of code (being locally avoid the fanout issue of a system-wide synchronous reset)
if (p_reset_n = '0') then
s_reset_on_clock <= (others => '1');
else if rising_edge(p_clock) then
s_reset_on_clock <= '0' & s_reset_on_clock(3 downto 1);
end if;
end if;
s_reset_on_clock(0) is now your local synchronous reset signal that you can use like any other signal within the synchronous code block.
Please replace the old fashioned
elsif (clk'event and clk='1') then
with
elsif rising_edge(clk) then
it will be more obvious what's going on here.
You say that you want to generate a 50MHz signal but you don't say what the system frequency (or maybe I should understand it the other way around). The ratio system clock frequency / 50MHz will give you the sequence length.
Anyway, you will need to declare a free running counter as signal, let's call it counter (in your original code you have two signals, i and in_data, for this same purpose), and reset/increment it using your locally synchronous reset and your clock signal.
In case of a square signal (let's say sel = '0'), your output will be solely determined by the value of your counter. Below a given counter value, you'll have a predetermined dataout value (your 'low' state), while above this value, you'll have another predetermined dataout value (your 'high' state).
In case of a sine signal (let's say sel = '1'), the counter will represent the phase and will be used as input to your sine lookup table (that you, by the way, could initialize with a VHDL function calculating the lookup table content instead of providing precalculated literals). The output of the sine lookup table is your dataout output.
Let me know if you need more help.

block ram (BRAM) read and write using different clocks

I am relatively new to some advanced VHDL programming and have a problem i have been facing for a while.
I will try to be very thorough in my problem description.
I am using a Digilent Nexys-3 board with SPARTAN - 6 FPGA
Here is what i am trying to implement:
Read data from ADC SPI bus and store it into the BRAM.
For this i instantiate a block RAM using a memory IP core generator
ADC_BRAM: bram
port map( ..
..
addra => addra,
dina => ADC_DATA,
..
..
);
The addra, ADC_DATA (data on SPI bus stored into a std_logic_vector) is assigned during the "read_adc" state of the FSM in the ADC module.
After the ADC has sampled 100 points I stop writing into the BRAM
main: process(clk)
begin
if(rising_edge(clk)) then
case state is
when read_adc =>
.....
.....
.....
if(addra < "1100100") then
.....
<some code>
addra <= addra + 1;
.....
elsif (addra = "1100100") then
state <= endofconversion;
end if;
when endofconversion =>
addra <= "00000000";
wea <= "0"; -- write enable for BRAM
state <= read_bram;
Next, i want to access the data from the BRAM and send it using USB-UART.
From what i understand all the BRAM's are synchronous and thus have to be implemented with process(clk,reset).
Currently i have implemented the state " read_bram" under the same process as mentioned above
main: process(clk)
begin
if(rising_edge(clk)) then
case state is
<some code>
when endofconversion =>
addra <= "00000000";
wea <= "0"; -- write enable for BRAM
state <= read_bram;
when read_bram =>
....
<some code>
....
end case;
Now the UART clock needs to be much slower than the rate at which ADC has been read.
My question is " How do i implement the read operation of the BRAM using a much slower clock, which will ensure proper functioning of the UART" ?
I also tried another approach where the BRAM module is defined in the top level file and i send the " addra, ADC_DATA" generated from the ADC module into the port map of the BRAM module.
Now, on the top level if i create a slower clock to read the data ("douta") from the BRAM and send it to UART, i need to generate "addra" in this process and send it to BRAM port "addra"
This causes issues of " multiple drivers " connected to signal "addra", as its values changes in the "process()" of both the ADC module (faster clock) and the top level module (slower clock).
Please help me out. I can provide more info if the problem description is not clear.
TLDR; How can i use 2 separate clocks while implementing synchronous memory BRAM
- "faster clock => write into BRAM" -- based on ADC sampling rate
- " slower clock => read from BRAM" -- based on UART clk (slower BAUD rate than ADC sampling speed)

VHDL LFSR Output through FPGA board SMA connector

I recently started working on an FPGA project for school, I have never worked with VHDL before so I tried my best to piece my program together. Overall, my goal is to make a prbs or LFSR to generate randomly. My vhdl code checks out in xilinx ISE software and runs in testbench fine but I need to flash the project to the board and connect an oscilloscope to one of the SMA connectors on the board, My question is how can I i forward my outputs to a single SMA connector on the Spartan 6 board
library IEEE;
use IEEE.std_logic_1164.all;
entity LFSR is
port (
clock : std_logic;
reset : std_logic;
data_out : out std_logic_vector(9 downto 0)
);
end LFSR;
architecture Behavioral of LFSR is
signal lfsr_reg : std_logic_vector(9 downto 0);
begin
process (clock)
variable lfsr_tap : std_logic;
begin
if clock'EVENT and clock='1' then
if reset = '1' then
lfsr_reg <= (others => '1');
else
lfsr_tap := lfsr_reg(6) xor lfsr_reg(9);
lfsr_reg <= lfsr_reg(8 downto 0) & lfsr_tap;
end if;
end if;
end process;
data_out <= lfsr_reg;
end Behavioral;
Now I just want to forward the output/outputs to an SMA connector so I can get the results on the oscilloscope, any help would be great
You just need to map your I/Os to actual pins on your FPGA chip. This is done in a constraints file (typically a .ucf), which you can either hand-edit (it's just text), or let a tool handle for you.
In the newer ISE tools PlanAhead is responsible for this - you can open it from the ISE Processes Pane (select User Constraints -> I/O Pin Planning (PlanAhead) - Post-synthesis).
This opens PlanAhead and gives you a list of the I/Os in your design (your clock, reset and data_out). Now you just need to map these to the correct FPGA pins. Have a look in your board documentation to find the locations of your clock-input, push-buttons (for reset) and SMA connector.
PlanAhead should create the .ucf file for you, and add it to your project. Afterwards you can edit it in the ISE editor - it's pretty self-explanatory once you have some initial content in it.
Also, check out this Xilinx guide (from page 100 and onwards) for a step-by-step guide.
Your SMA connector can only hold a single output, not a bus.
To see the MSB of your LFSR just add the following lines to your .ucf file:
NET clock LOC = $PIN;
NET reset LOC = $PIN;
NET dataout<9> LOC = $PIN; # your SMA output
NET dataout<8> LOC = $PIN;
NET dataout<7> LOC = $PIN;
NET dataout<6> LOC = $PIN;
NET dataout<5> LOC = $PIN;
NET dataout<4> LOC = $PIN;
NET dataout<3> LOC = $PIN;
NET dataout<2> LOC = $PIN;
NET dataout<1> LOC = $PIN;
NET dataout<0> LOC = $PIN;
See in your board documentation (or schematic) for the right pins and add the right pin names in your .ucf file.
I suggest to use some LEDs for the remaining outputs of dataout.

Warning "has no load", but I can't see why

I got these warnings from Lattice Diamond for each instance of any uart (currently 11)
WARNING - ngdbuild: logical net 'UartGenerator_0_Uart_i/Uart/rxCounter_cry_14' has no load
WARNING - ngdbuild: logical net 'UartGenerator_0_Uart_i/Uart/rxCounter_cry_0_COUT1_9_14' has no load
WARNING - ngdbuild: logical net 'UartGenerator_0_Uart_i/Uart/rxCounter_cry_12' has no load
WARNING - ngdbuild: logical net 'UartGenerator_0_Uart_i/Uart/rxCounter_cry_10' has no load
WARNING - ngdbuild: logical net 'UartGenerator_0_Uart_i/Uart/rxCounter_cry_8' has no load
WARNING - ngdbuild: logical net 'UartGenerator_0_Uart_i/Uart/rxCounter_cry_6' has no load
WARNING - ngdbuild: logical net 'UartGenerator_0_Uart_i/Uart/rxCounter_cry_4' has no load
WARNING - ngdbuild: logical net 'UartGenerator_0_Uart_i/Uart/rxCounter_cry_2' has no load
WARNING - ngdbuild: logical net 'UartGenerator_0_Uart_i/Uart/rxCounter_cry_0' has no load
The VHDL-code is
entity UART is
generic (
dividerCounterBits: integer := 16
);
port (
Clk : in std_logic; -- Clock signal
Reset : in std_logic; -- Reset input
ClockDivider: in std_logic_vector(15 downto 0);
ParityMode : in std_logic_vector(1 downto 0); -- b00=No, b01=Even, b10=Odd, b11=UserBit
[...]
architecture Behaviour of UART is
constant oversampleExponent : integer := 4;
subtype TxCounterType is integer range 0 to (2**(dividerCounterBits+oversampleExponent))-1;
subtype RxCounterType is integer range 0 to (2**dividerCounterBits)-1;
signal rxCounter: RxCounterType;
signal txCounter: TxCounterType;
signal rxClockEn: std_logic; -- clock enable signal for receiver
signal txClockEn: std_logic; -- clock enable signal for transmitter
begin
rxClockdivider:process (Clk, Reset)
begin
if Reset='1' then
rxCounter <= 0;
rxClockEn <= '0';
elsif Rising_Edge(Clk) then
-- RX counter (oversampled)
if rxCounter = 0 then
rxClockEn <= '1';
rxCounter <= to_integer(unsigned(ClockDivider));
else
rxClockEn <= '0';
rxCounter <= rxCounter - 1;
end if;
end if;
end process;
txClockDivider: process (Clk, Reset)
[...]
rx: entity work.RxUnit
generic map (oversampleFactor=>2**oversampleExponent)
port map (Clk=>Clk, Reset=>Reset, ClockEnable=>rxClockEn, ParityMode=>ParityMode,
ReadA=>ReadA, DataO=>DataO, RxD=>RxD, RxAv=>RxAv, ParityBit=>ParityBit,
debugout=>debugout
);
end Behaviour;
This is a single Uart, to create them all (currently 11 uarts) I use this
-- UARTs
UartGenerator: For i IN 0 to uarts-1 generate
begin
Uart_i : entity work.UartBusInterface
port map (Clk=>r_qclk, Reset=>r_reset,
cs=>uartChipSelect(i), nWriteStrobe=>wr_strobe, nReadStrobe=>rd_strobe,
address=>AdrBus(1 downto 0), Databus=>DataBus,
TxD=>TxD_PAD_O(i), RxD=>RxD_PAD_I(i),
txInterrupt=>TxIRQ(i), rxInterrupt=>RxIRQ(i), debugout=>rxdebug(i));
uartChipSelect(i) <= '1' when to_integer(unsigned(adrbus(5 downto 2)))=i+4 and r_cs0='0' else '0';
end generate;
I can syntesis it and the uarts work, but why I got the warning?
IMHO the rxCounter should use each single possible value, but why each second bit creates the warning "has no load"?
I read somewhere that this mean that these net's aren't used and will be removed.
But to count from 0 to 2^n-1, I need no less than n-bits.
This warning means that nobody is "listening" to those nets.
It is OK to have signals that will be removed in synthesis. Warnings are not Errors! You just need to be aware of them.
We cannot assess what is happening from your partial code.
Is there a signal named rxCounter_cry?
What is the datatype of ClockDivider?
What is the value of dividerCounterBits?
What happens in the other process? If it is irrelevant, please try to run your synthesis without that process. If it is relevant, we need to see it.
Lattice ngdbuild is particularly spammy for the job it is doing, I pipe ngdbuild output through grep in my makefile to remove exactly these messages:
ngdbuild ... | grep -v "ngdbuild: logical net '.*' has no load"
There's more than 2500 of these otherwise, eliminating them helps concentrate on real issues.
Second worst toolchain spammer is edif2ngd complaining about Verilog parameters it does not have explicit handling for. This one is a two line message (over 300 of these) so I remove it with:
edif2ngd ... | sed '/Unsupported property/{N;d;}'
Just be aware that sometimes it implements things with adders. The highest order bit will not use the carry output of that adder, and the lowest order bit will not use the sign input. So you get a warning like:
WARNING - synthesis: logical net 'clock_chain/dcmachine/count_171_add_4_1/S0' has no load
WARNING - synthesis: logical net 'clock_chain/dcmachine/count_171_add_4_19/CO' has no load
No problem, bit 19 is the highest, so it will not carry anywhere, and bit 1 is the lowest, so it does not get a sign bit from anywhere. If, however, you get this warning on any of the bits in between highest and lowest, it normally means something is wrong, but not an error, so it will build something that "works" when you test it, but not in an error case. If you simulate it with error cases it will normally show undesirable results.

How to detect compiler

I have to load ROM from file. Quartus can use .mif files directly and for the simulator I have written (a quick and dirty) .mif file parser with the help of textio. Is there a way to detect Synthesizer tools (Quartus in my case) and to generate textio file loader process only if it is not being compiled by the Synthesizer tool?
(the source question)
You can detect the synthesis tool by using a pragma that is only accepted by your specific synthesis tool. For example, you can write:
constant isQuartus : std_logic := '1'
-- altera translate_off
and '0'
-- altera translate_on
;
This constant will be '1' only if you synthesize this using Altera's Quartus.
More about the various VHDL metacomment pragma's at: http://www.sigasi.com/content/list-known-vhdl-metacomment-pragmas
Are you trying to infer ROMs from your code? If you're using the LPM functions, everything should "just work".
I use ROMs in Quartus and ModelSim without issue. Just create a custom VHDL file for your ROM using the MegaWizard plugin and direct it to the appropriate initialization file, or directly instantiate an altsyncram component and include the init_file generic.
Then compile or simulate as normal. I have had no problems running simulations using ModelSim (Altera version or stand-alone version) with initialized ROMs, and have not had to write any initialization code to read *.mif or *.hex files.
For reference, here is a directly instantiated ROM megafunction from a bit of my code that simulates properly in ModelSim (note the init_file generic):
-- Sine lookup table
sine_lut : altsyncram
generic map (
clock_enable_input_a => "BYPASS",
clock_enable_output_a => "BYPASS",
init_file => "./video/sine_rom-512x8.mif",
intended_device_family => "Arria GX ",
lpm_hint => "ENABLE_RUNTIME_MOD=NO",
lpm_type => "altsyncram",
numwords_a => 512,
operation_mode => "ROM",
outdata_aclr_a => "NONE",
outdata_reg_a => "UNREGISTERED",
widthad_a => 9,
width_a => 8,
width_byteena_a => 1
)
port map (
-- Read port
clock0 => clk,
address_a => sine_addr,
q_a => sine_do
);
If you really need to do something different when simulating vs. synthesizing, there are a several ways to go about it. For simple things, you can use something like the following directives:
--synthesis translate-off
<code for simulation only>
--synthesis translate-on
http://quartushelp.altera.com/11.1/master.htm#mergedProjects/hdl/vhdl/vhdl_file_dir_translate.htm?GSA_pos=5&WT.oss_r=1&WT.oss=--%20synthesis%20translate_off
You should be able to find a lot of real-world examples of these directives via internet search, and I've included one example below (power-on reset is shorter when simulating vs. when running in real hardware):
-- Async. Power-on Reset, with de-assertion delay
process (clk)
begin
if rising_edge (clk) then
-- Create a delay at power-up
if rst_PowerOn='1' then
rst_pora <= '1';
rst_por_cnt <= (others=>'0');
-- synthesis translate_off
elsif rst_por_cnt(5)='1' then -- 256 nS in simulation
rst_pora <= '0';
-- synthesis translate_on
elsif rst_por_cnt(19)='1' then -- 4ms in real hardware
rst_pora <= '0';
else
rst_por_cnt <= rst_por_cnt + 1;
end if;
end if;
end process;
For more complex situations, use your imagination. Some folks use the C pre-processor, the M4 macro language, or something similar as a preliminary build step prior to synthesis/simulation. I have makefiles and scripts that convert FileName.sim.vhdl (for simulation) into FileName.vhdl (for synthesis), processing it with some text utilities to comment/un-comment various blocks of code based on rules I crafted so I could maintain both versions in a single file (think something like an enhanced C pre-processor tweaked for use with VHDL). Just do something that fits with your workflow and team dynamics/culture.

Resources