DMA PCIe read transfer from PC to FPGA

I'm trying to get DMA transfer working between an FPGA and an x86_64 Linux machine.
On the PC side I'm doing this initialization:
//driver probe
...
pci_set_master(dev); //set endpoint as master
result = pci_set_dma_mask(dev, 0xffffffffffffffff); //set as 64bit capable
...
//read
pagePointer = __get_free_page(__GFP_HIGHMEM); //get 1 page
temp_addr = dma_map_page(&myPCIDev->dev,pagePointer,0,PAGE_SIZE,DMA_TO_DEVICE);
printk(KERN_WARNING "[%s]Page address: 0x%lx Bus address: 0x%lx\n",DEVICE_NAME,pagePointer,temp_addr);
writeq(cpu_to_be64(temp_addr),bar0Addr); //send address to FPGA
wmb();
writeq(cpu_to_be64(1),bar1Addr); //start transfer
wmb();
The bus address is a 64-bit address.
On the FPGA side the TLP I'm sending out for the read of 1 DW:
Fmt: "001"
Type: "00000"
R|TC|R|Attr|R|TH : "00000000"
TD|EP|Attr|AT : "000000"
Length : "0000000001"
Requester ID
Tag : "00000000"
Byte Enable : "00001111";
Address : (address from dma map page)
The completion that I get back from the PC is :
Fmt: "000"
Type: "01010"
R|TC|R|Attr|R|TH : "00000000"
TD|EP|Attr|AT : "000000"
Length : "0000000000"
Completer ID
Compl Status|BCM : "0010"
Length : "0000000000";
Requester ID
Tag : "00000000"
R|Lower address : "00000000"
so basically a completion without data and with the status Unsupported Request.
I don't think there is anything wrong with the construction of the TLP, but I cannot see any problem on the driver side either.
The kernel I'm using has the PCIe error reporting enabled but I see nothing in the dmesg output.
What's wrong? Or is there a way to find out why I get that Unsupported Request completion?
Marco

This is an extract from one of my designs (that works!). It's VHDL and slightly different but hopefully it will help you:
-- First dword of TLP Header
tlp_header_0(31 downto 30) <= "01"; -- Format = MemWr
tlp_header_0(29) <= '0' when pcie_addr(63 downto 32) = 0 else '1'; -- 3DW header or 4DW header
tlp_header_0(28 downto 24) <= "00000"; -- Type
tlp_header_0(23) <= '0'; -- Reserved
tlp_header_0(22 downto 20) <= "000"; -- Default traffic class
tlp_header_0(19) <= '0'; -- Reserved
tlp_header_0(18) <= '0'; -- No ID-based ordering
tlp_header_0(17) <= '0'; -- Reserved
tlp_header_0(16) <= '0'; -- No TLP processing hint
tlp_header_0(15) <= '0'; -- No TLP Digest
tlp_header_0(14) <= '0'; -- Not poisoned
tlp_header_0(13 downto 12) <= "00"; -- No PCI-X relaxed ordering, no snooping
tlp_header_0(11 downto 10) <= "00"; -- No address translation
tlp_header_0( 9 downto 0) <= "00" & X"20"; -- Length = 32 dwords
-- Second dword of TLP Header
-- Bits 31 downto 16 are Requester ID, set by hardware PCIe core
tlp_header_1(15 downto 8) <= X"00"; -- Tag, it may have to increment
tlp_header_1( 7 downto 4) <= "1111"; -- Last dword byte enable
tlp_header_1( 3 downto 0) <= "1111"; -- First dword byte enable
-- Third and fourth dwords of TLP Header, fourth is *not* sent when pcie_addr is 32 bits
tlp_header_2 <= std_logic_vector(pcie_addr(31 downto 0)) when pcie_addr(63 downto 32) = 0 else std_logic_vector(pcie_addr(63 downto 32));
tlp_header_3 <= std_logic_vector(pcie_addr(31 downto 0));
Let's ignore the obvious difference that I was performing a MemWr of 32 dwords instead of reading a single dword. The other difference, which caused me trouble the first time I did this, is that you have to use the 3DW header if the address is below 4 GB.
That means you have to check the address you get from the host and determine whether you need the 3DW header (with only the lower 32 bits of the address) or the full 4DW header.
Unless you need to transfer an ungodly amount of data, you can set the DMA address mask to 32 bits so that you are always in the 3DW case. Linux should reserve plenty of memory below 4 GB by default.
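The header selection described above can be sketched in plain C as a reference model, for example to generate expected header words for a testbench. This is only a sketch: build_mrd_header and the TLP_* constants are hypothetical names, the field positions follow the VHDL extract above, and the byte order in which the dwords reach your PCIe core depends on the core.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fmt field (DW0 bits 31:29) for a request without data payload. */
#define TLP_FMT_3DW_NO_DATA 0x0u /* 32-bit address, 3DW header */
#define TLP_FMT_4DW_NO_DATA 0x1u /* 64-bit address, 4DW header */
#define TLP_TYPE_MEM        0x00u

/*
 * Hypothetical helper (not part of any vendor API): fills hdr[] with a
 * Memory Read request header and returns the header length in dwords.
 */
size_t build_mrd_header(uint32_t hdr[4], uint64_t addr,
                        uint16_t requester_id, uint8_t tag,
                        uint16_t length_dw)
{
    assert(length_dw >= 1 && length_dw <= 1023);
    assert((addr & 0x3u) == 0); /* address must be dword aligned */

    uint32_t upper = (uint32_t)(addr >> 32);
    uint32_t lower = (uint32_t)(addr & 0xFFFFFFFFu);
    uint32_t fmt = (upper != 0) ? TLP_FMT_4DW_NO_DATA : TLP_FMT_3DW_NO_DATA;
    /* A single-dword request must carry a Last DW BE of 0000b. */
    uint32_t last_be = (length_dw == 1) ? 0x0u : 0xFu;
    uint32_t first_be = 0xFu;

    hdr[0] = (fmt << 29) | (TLP_TYPE_MEM << 24) | length_dw;
    hdr[1] = ((uint32_t)requester_id << 16) | ((uint32_t)tag << 8)
           | (last_be << 4) | first_be;

    if (upper != 0) {
        hdr[2] = upper; /* Address[63:32] */
        hdr[3] = lower; /* Address[31:2], low two bits zero */
        return 4;
    }
    hdr[2] = lower;     /* Address[31:2], low two bits zero */
    hdr[3] = 0;         /* unused, not transmitted */
    return 3;
}
```

With a bus address below 4 GB the function returns 3 and leaves Fmt at "000"; only an address with non-zero upper bits produces the "001" (4DW) format used in the question.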

Related

VHDL wrapper for 1-wire core for DS18B20 temperature sensor

Currently I am trying to write a VHDL wrapper for this OpenCores Verilog module (1-wire master) so that I can send to and receive from this temperature sensor (DS18B20).
However I am struggling to understand the usage. Namely the read/write enable vs. the cyc bit in the control/status register of the 1-wire master module.
The code I have so far sets the cyc bit to 1 and the read/write enable to one simultaneously but does not cycle them during each bit. Is this correct or am I misunderstanding it? I'm new to VHDL/ reading a datasheet so I have been struggling over this for a few days. Any help would be appreciated.
I found this site that I have been using as a reference but it does not deal with the Verilog module that I am using.
I am also looking for tips on my code style, and VHDL tips in general.
My current code:
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL; --may need to remove if signed not used
ENTITY one_wire_temp_probe_control IS
GENERIC (
one_us_divider_g : integer range 0 to 50 := 50 -- clock divider for one micro second
);
PORT (
i_clk_50mhz : IN STD_LOGIC;
i_read_enable : IN std_logic;
io_temp_probe : INOUT STD_LOGIC; --how do i register an inout
o_temperature : OUT signed(6 DOWNTO 0);
o_temp_ready : OUT std_logic
);
END one_wire_temp_probe_control;
ARCHITECTURE rtl of one_wire_temp_probe_control IS
----temp commands----
CONSTANT skip_rom_c : std_logic_vector(7 DOWNTO 0) := x"CC"; --command to skip ROM identity of temperature sensor
CONSTANT convert_temp_c : std_logic_vector(7 DOWNTO 0) := x"44"; --command to start temperature conversion
CONSTANT read_scratchpad_c : std_logic_vector(7 DOWNTO 0) := x"BE"; --command to read the scratchpad i.e. get temperature data
CONSTANT command_bits_c : integer RANGE 0 TO 8 := 8; --number of bits in the above commands (note: range used to limit number of bits to minimum needed)
CONSTANT data_bits_c : integer RANGE 0 to 12 := 12; --number of bits in received data
----1-wire commands----
CONSTANT send_reset_pulse : std_logic_vector(7 DOWNTO 0) := "00001010"; --command to send reset pulse
CONSTANT write_command_structure_c : std_logic_vector(6 DOWNTO 0) := "0000000"; --structure of the command that must be passed to the 1-wire controller (----EDIT----)
----timing constants----
CONSTANT delay_65us_c : integer := one_us_divider_g * 65; --65 micro-second delay
CONSTANT delay_960us_c : integer := one_us_divider_g * 960; --960 micro-second delay
CONSTANT delay_750ms : integer := one_us_divider_g * 1000 * 750; --750 milli-second delay
----state machine----
TYPE state_type IS (idle, presence_pulse, wait_presence_pulse, skip_rom, temp_conversion, wait_for_conversion,
read_scratchpad, data_read, convert_data, wait_65us);
SIGNAL state : state_type := idle;
SIGNAL previous_state : state_type := idle;
----1-wire----
SIGNAL read_enable_s, write_enable_s, reset_s, owr_e_s : std_logic := '0';
SIGNAL write_data_s, read_data_s : std_logic_vector(7 DOWNTO 0):= (OTHERS => '0'); --8 bit mode chosen in sockit_owm
SIGNAL address_s : std_logic_vector(1 DOWNTO 0) := "00";
SIGNAL timer_s : integer := 0;
----commands---
SIGNAL bit_counter_command_s : integer RANGE 0 TO command_bits_c := 0; --counter for bits in commands (note: not -1 due to using 9th bit as state change)
SIGNAL bit_counter_data_s : integer RANGE 0 TO data_bits_c := 0; --counter for bits in data received
----temperature----
SIGNAL temperature_raw_data : std_logic_vector(11 DOWNTO 0) := (OTHERS => '0');
----one wire control----
COMPONENT sockit_owm IS
PORT (
----control interface----
clk : IN std_logic;
rst : IN std_logic;
bus_ren : IN std_logic;
bus_wen : IN std_logic;
bus_adr : IN std_logic_vector(7 DOWNTO 0);
bus_wdt : IN std_logic_vector(7 DOWNTO 0);
bus_rdt : OUT std_logic_vector(7 DOWNTO 0);
bus_irq : OUT std_logic;
----1-wire interface----
owr_p : OUT std_logic; --verilog code is a one bit wide vector
owr_e : OUT std_logic;
owr_i : IN std_logic
);
END COMPONENT;
BEGIN
address_s <= "00"; --for the temp probe control we're not interested in other address spaces
PROCESS(i_clk_50mhz) BEGIN --state change
IF rising_edge(i_clk_50mhz) THEN
CASE state is
WHEN idle =>
o_temp_ready <= '0';
IF (i_read_enable = '1') THEN
state <= presence_pulse;
ELSE
state <= idle;
END IF;
WHEN presence_pulse =>
----send reset/presence pulse----
write_enable_s <= '1';
write_data_s <= send_reset_pulse;
timer_s <= delay_960us_c;
state <= wait_presence_pulse;
WHEN wait_presence_pulse =>
----wait for 960 micro seconds----
read_enable_s <= '1';
IF (timer_s = 0) THEN
IF (read_data_s(0) = '0') THEN
state <= skip_rom;
ELSIF (read_data_s(0) = '1') THEN
--presence not detected
ELSE
state <= wait_presence_pulse;
END IF;
ELSE
timer_s <= timer_s - 1;
state <= wait_presence_pulse;
END IF;
WHEN skip_rom =>
----send skip rom command----
previous_state <= skip_rom;
write_enable_s <= '1';
IF (bit_counter_command_s = command_bits_c) THEN
bit_counter_command_s <= 0;
state <= temp_conversion;
ELSE
write_data_s <= write_command_structure_c & skip_rom_c(bit_counter_command_s); ---command structure concatenated with 1 bit from command
bit_counter_command_s <= bit_counter_command_s + 1;
timer_s <= delay_65us_c;
state <= wait_65us;
END IF;
WHEN temp_conversion =>
----send temp conversion command to probe----
previous_state <= temp_conversion;
IF (bit_counter_command_s = command_bits_c) THEN
bit_counter_command_s <= 0;
timer_s <= delay_750ms;
state <= wait_for_conversion;
ELSE
write_data_s <= write_command_structure_c & convert_temp_c(bit_counter_command_s); ---command structure concatenated with 1 bit from command
bit_counter_command_s <= bit_counter_command_s + 1;
timer_s <= delay_65us_c;
state <= wait_65us;
END IF;
WHEN wait_for_conversion =>
----wait for temperature conversion to finish----
IF (timer_s = 0) then
state <= read_scratchpad;
ELSE
timer_s <= timer_s - 1;
END IF;
WHEN read_scratchpad =>
----send read scratchpad command----
previous_state <= read_scratchpad;
IF (bit_counter_command_s = command_bits_c) THEN
state <= data_read;
bit_counter_command_s <= 0;
ELSE
write_data_s <= write_command_structure_c & read_scratchpad_c(bit_counter_command_s); ---command structure concatenated with 1 bit from command
bit_counter_command_s <= bit_counter_command_s + 1;
timer_s <= delay_65us_c;
state <= wait_65us;
END IF;
WHEN data_read =>
----read incoming data----
previous_state <= data_read;
read_enable_s <= '1';
IF (bit_counter_data_s = data_bits_c) THEN
bit_counter_data_s <= 0; --may need to invert this
state <= convert_data;
ELSE
temperature_raw_data(bit_counter_data_s) <= read_data_s(0);
bit_counter_data_s <= bit_counter_data_s + 1;
timer_s <= delay_65us_c;
state <= wait_65us;
END IF;
WHEN convert_data =>
----convert raw data into temperature----
o_temp_ready <= '1';
WHEN wait_65us =>
----wait for read/write cycle to finish----
IF (timer_s = 0) THEN
state <= previous_state;
ELSE
timer_s <= timer_s - 1;
state <= wait_65us;
END IF;
END CASE;
END IF;
END PROCESS;
----one wire component instantiation----
one_wire_control : sockit_owm
PORT MAP(
----control interface----
clk => i_clk_50mhz,
rst => reset_s,
bus_ren => read_enable_s,
bus_wen => write_enable_s,
bus_adr => address_s,
bus_wdt => write_data_s,
bus_rdt => read_data_s,
bus_irq => OPEN,
----1-wire interface----
owr_p => OPEN,
owr_e => owr_e_s,
owr_i => io_temp_probe
);
io_temp_probe <= owr_e_s ? '0' : 'Z'; --I also need help converting this line to VHDL
END rtl;
Thank you in advance.
Best
Tom
I am also looking for tips on my code style, and VHDL tips in general.
OK.
First thing: don't make the lines so long. So don't put comments at the end of a line. Put them a line before.
use IEEE.NUMERIC_STD.ALL; --may need to remove if signed not used
then remove, as I don't see any signed
one_us_divider_g : integer range 0 to 50 := 50 -- clock divider for one micro second
So... what happens if one_us_divider_g is set to 0? That seems an illegal value. Are you using it for simulation?
io_temp_probe : INOUT STD_LOGIC; --how do i register an inout
One option is to use a tristate IOBUFFER. This is a special FPGA edge element which splits the input and output into separate signals. You can tristate the output by setting a control port.
Alternatively you could just do it the way you do in your code (this is also explained in for instance the Xilinx synthesis user guide). Which leads me to another question in your code.
io_temp_probe <= owr_e_s ? '0' : 'Z'; --I also need help converting this line to VHDL
io_temp_probe <= '0' when owr_e_s = '1' else 'Z';
CONSTANT command_bits_c : integer RANGE 0 TO 8 := 8;
No need for an integer range if it is a constant.
CONSTANT send_reset_pulse : ...
CONSTANT delay_750ms : ...
Missing the "_c" you put behind all your other constants. But I would not add this "_s", "_c" or "_g" anyhow. A lot of work for little gain.
COMPONENT sockit_owm IS
PORT (
[...]
);
END COMPONENT;
Component declarations have not been required for some time now (direct entity instantiation was introduced in VHDL-93). You can remove the declaration and change your instantiation:
one_wire_control : entity work.sockit_owm
PORT MAP(
[...]
WHEN idle =>
[...]
ELSE
state <= idle;
END IF;
not required. If you don't change state, it stays at idle.
WHEN wait_presence_pulse =>
IF (timer_s = 0) THEN
IF (read_data_s(0) = '0') THEN
[...]
ELSIF (read_data_s(0) = '1') THEN
[...]
ELSE
state <= wait_presence_pulse;
END IF;
Both '0' and '1' are covered for read_data_s(0). Do you expect any other value? That can only happen in simulation, not in implementation, so the code in the final else branch is unreachable.
[...]
timer_s <= delay_65us_c;
state <= wait_65us;
[...]
WHEN wait_65us =>
IF (timer_s = 0) THEN
[...]
ELSE
timer_s <= timer_s - 1;
END IF;
Let's say, for the sake of argument, that the delay lasts 10 clock cycles (delay_65us_c = 10). At t=0, timer_s is set to 10. At t=1 (state is now wait_65us) timer_s is set to 9, and so on: at t=10, timer_s is set to 0, but state is still wait_65us. At t=11, timer_s = 0 is detected and state is changed back to the previous one, which it will enter at t=12.
So instead of a 10 clock cycle delay, you get a 12 clock cycle delay.
Is this a problem? If yes, you should reconsider your code.
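To check the cycle count, here is a small C model of the wait logic. It is only a sketch of the behaviour described above (one loop iteration per rising clock edge), and wait_latency_cycles is a hypothetical name, not something from the original design.

```c
#include <assert.h>

/*
 * Counts the clock edges between loading timer_s and the calling state
 * being executed again, following the wait_65us logic quoted above.
 */
unsigned wait_latency_cycles(unsigned load_value)
{
    enum { CALLER, WAIT } state = CALLER;
    unsigned timer = 0;
    unsigned edge = 0;
    int loaded = 0;

    assert(load_value <= 100000000u); /* keep the simulation bounded */

    for (;;) {
        if (state == CALLER) {
            if (loaded)
                return edge;      /* caller runs again on this edge */
            timer = load_value;   /* timer_s <= delay */
            state = WAIT;         /* state <= wait_65us */
            loaded = 1;
        } else if (timer == 0) {
            state = CALLER;       /* state <= previous_state */
        } else {
            timer--;              /* timer_s <= timer_s - 1 */
        }
        edge++;
    }
}
```

In this model the latency is always the loaded value plus two, so loading the delay minus two would be one way to get the nominal count if the exact timing matters.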
SIGNAL read_enable_s, write_enable_s, reset_s, owr_e_s : std_logic := '0';
[... not used anywhere else... ]
one_wire_control : sockit_owm
PORT MAP(
[...]
rst => reset_s,
Are you sure this is correct? A lot of components need to be properly reset before they operate correctly.
If you're working with Quartus, you can mix VHDL code with Verilog and even schematic elements. In the link below, I use a verilog driver for the same chip (DS18B20).
See here for details:
https://physnoct.wordpress.com/2016/12/14/altera-quartus-combining-verilog-and-vhdl/

During implementing FIFO buffer code for serial communication taking too much time

I am a newbie in VHDL coding. I am currently working on a Spartan-3E starter kit. I have written code for transmitting 5 bytes to a PC and receiving 4 bytes. Now I have to add a FIFO buffer before transmitting and after receiving bytes. I have also written code for that (taken from Pong P. Chu), but it is not working: synthesis takes too much time. Please tell me where I am going wrong.
Thanks in advance.
entity fifo is
generic (
B : natural :=32; --------------------------------------------------- number of bits
W : natural := 16 ----------------------------------------------------number of address bits
);
port ( ck : in std_logic; ------------ clock
reset : in std_logic;
rd : in std_logic; -------- control signal for read
wr : in std_logic; -------- control signal for write
-- btn0 : in std_logic;
write_data : in std_logic_vector ( B-1 downto 0); ----------------------------data to be written in FIFO
read_data : out std_logic_vector ( B-1 downto 0) :=( others=> '0');------------------------------ data read from FIFO
empty : out std_logic; ------------ shows FIFO is empty, cannot be read
full : out std_logic ------------ shows FIFO is full, cannot be written
);
end fifo;
architecture arch4 of fifo is
--------------state machines declared----------------------
type reg_file_type is array (2**W - 1 downto 0) of std_logic_vector (B-1 downto 0); --------------------array of 32 cross 32 for data
signal array_reg : reg_file_type;
--------------------------------------------------------------------
----variables-------------------------------------------------------
--signal read_data : std_logic_vector ( B-1 downto 0); ------------------------------ data read from FIFO
signal write_ptr_reg : std_logic_vector (W-1 downto 0); ----- addressing the data in fifo to write
signal write_ptr_next: std_logic_vector (W-1 downto 0); ----- addressing next data in fifo to write
signal write_ptr_succ : std_logic_vector (W-1 downto 0); ---- addressing next to next data in fifo to write
signal read_ptr_reg : std_logic_vector (W-1 downto 0); ----- addressing the data in fifo to read
signal read_ptr_next: std_logic_vector (W-1 downto 0); ----- addressing next data in fifo to read
signal read_ptr_succ : std_logic_vector (W-1 downto 0); ---- addressing next to next data in fifo to read
signal write_operation : std_logic_vector (1 downto 0); ---- 00,01,10,11 only 00,01 and 10 are valid
signal write_enable : std_logic ; ---------------- write enable
-- signal rd : std_logic; -------- control signal for read
-- signal wr : std_logic; -------- control signal for write
--signal empty : std_logic; ------------ shows FIFO is empty, cannot be read
--signal full : std_logic ;------------ shows FIFO is full, cannot be written
signal full_reg : std_logic ;
signal empty_reg : std_logic ;
signal full_next : std_logic ;
signal empty_next : std_logic ;
begin
------------------------------------------initialising register ------------------------------------------------------
process (ck, reset)
begin
if (reset = '1') then
array_reg <= ( others => (others=> '0')); ---------------------initialising
else
if rising_edge(ck) then
if write_enable ='1' then
--array_reg (std_logic_vector(write_ptr_reg)) <= write_data;
array_reg (CONV_INTEGER( unsigned(write_ptr_reg))) <= write_data; -------------------------- directed towards address for writing data (first position)
end if;
end if;
end if;
end process;
--------------------------------------------------------------------------------------------------------------------------------------
--read_data <= array_reg (std_logic_vector(read_ptr_reg)) ;
read_data <= array_reg (CONV_INTEGER( unsigned(read_ptr_reg))) ; ------ directed towards address for reading data (first position)
write_enable <= wr and ( not full_reg ); ------ write enabled only when FIFO is not full
---============================ control logic for fifo==================================================
--------------------reading and writing process with address pointers
--======================================================================================================
process (ck, reset)
begin
if (reset = '1') then
write_ptr_reg <= (others => '0');
read_ptr_reg <= (others => '0');
full_reg <= '0';
empty_reg <='1';
else
if rising_edge(ck) then
write_ptr_reg <= write_ptr_next; ---- control of pointers
read_ptr_reg <= read_ptr_next;
full_reg <= full_next;
empty_reg <= empty_next;
end if;
end if;
end process;
------======================= successive pointer values update--=====================================
write_ptr_succ <= std_logic_vector(( write_ptr_reg)+1);
read_ptr_succ <= std_logic_vector ((read_ptr_reg)+1);
---==========main process for read write operation, shifting pointers and checking status of fifo ====================
write_operation <= wr & rd ; ----- concatenating two signals so 10 or 01 is valid states
process (write_ptr_reg, write_ptr_succ, read_ptr_reg, read_ptr_succ,write_operation, empty_reg, full_reg)
begin
write_ptr_next <= write_ptr_reg;
read_ptr_next <= read_ptr_reg;
full_next <= full_reg;
empty_next <= empty_reg;
case write_operation is
when "00" =>
------------------------ wr =0 and read = 0 , no operation
when "01" =>
----------------------------- wr =0 and read = 1 , read operation
--if state_button = transit_pressed then
if (empty_reg /= '1') then ----------------not empty
read_ptr_next <= read_ptr_succ; ---- updating the address pointers
full_next <= '0'; ------ clearing full status
if (read_ptr_succ = write_ptr_reg) then ---- checking the pointer positions whether equal
empty_next <= '1';
end if;
end if;
-- end if;
when "10" =>
------------------------ wr =1 and read = 0, write operation
if (full_reg /= '1') then ---------- not full
write_ptr_next <= write_ptr_succ;---- updating the address pointers
empty_next <= '0'; --- clearing empty status
if ( write_ptr_succ = read_ptr_reg) then ------------- checking the pointer positions of successors to read pointer whether it is same or not
full_next <= '1'; -------------- fifo full only above condition is true
end if;
end if;
when others =>
------------------------------write and read i.e for 11
write_ptr_next <= write_ptr_succ;
read_ptr_next <= read_ptr_succ;
end case;
end process;
----- updating the flag
full <= full_reg;
empty <= empty_reg;
end architecture arch4;
The design is too big: it can't fit in any Spartan-3E device, even if all the memory is mapped to Block RAMs. Indeed, the array_reg signal alone requires 2097152 flip-flops:
2**W = 2**16 = 65536
65536*B = 65536*32 = 2097152
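The same arithmetic can be wrapped in a small C helper to try out other values of the B and W generics before synthesis. This is only a sketch: fifo_storage_bits is a hypothetical name, and the result still has to be compared against the memory resources listed in the datasheet of your device.

```c
#include <assert.h>
#include <stdint.h>

/* Storage bits needed by a FIFO with 2**addr_bits words of data_bits each. */
uint64_t fifo_storage_bits(unsigned data_bits, unsigned addr_bits)
{
    assert(addr_bits < 32); /* keep the shift well defined */
    return ((uint64_t)1 << addr_bits) * data_bits;
}
```

For example, fifo_storage_bits(32, 16) gives the 2097152 bits computed above, while W = 4 with the same word width needs only 512 bits.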
The size of the FIFO should be reduced and it's better to use Block RAMs instead of flip-flops. Here is a RAM module in VHDL, which can be mapped to Block RAMs by Xilinx tools.

VHDL - extra latch with array register

I know the model below is by no means the most efficient way of describing an 8-bit register, but I made it to learn about the use of arrays. When the model is simulated, a 9th latch appears that stores the MSB before passing it on to the output s_out. Let me demonstrate.
Model
ENTITY Q1 IS
PORT (clk,s_enable,s_right,s_in : IN std_logic := '0';
s_out : OUT std_logic := '0');
END ENTITY Q1;
ARCHITECTURE behavioural OF Q1 IS
TYPE reg1 IS ARRAY (7 DOWNTO 0) OF std_logic; -- array declaration
SIGNAL mem : reg1 := "00000000"; -- assignment of array values
SIGNAL sel : std_logic_vector(1 DOWNTO 0) := "00";
BEGIN
sel <= (s_enable & s_right); -- concatenation of s_enable and s_right
-->Additional multiplexer <--
--s_out <= mem(0) when sel = "11" ELSE
-- mem(7) when sel = "10" ELSE
-- s_out;
REG : PROCESS (clk) -- shift process
BEGIN
IF clk'EVENT AND clk = '1' THEN
IF sel = "11" THEN
s_out <= mem(0);--< remove with new mux
mem <= s_in & mem (7 DOWNTO 1);-- shifting of mem by concatenation of s_in as the MSB to mem
ELSIF sel = "10" THEN
s_out <= mem(7);--< remove with new mux
mem <= mem(6 DOWNTO 0) & s_in;
END IF;
END IF;
END PROCESS REG;
END ARCHITECTURE behavioural;
Simulation
You can see that an extra latch is added, whereby the final bit (which one depends on whether it is shifting left or right) only appears on the output s_out after another clock pulse. It is also visible in the RTL viewer:
While keeping the use of arrays for the register, is it possible to remove this extra latch? I tried changing the signals to variables inside the process but still got the same result.
Many Thanks.

How to correctly storage registers in an FPGA

I need to write a VHDL program that initializes a sensor's registers over I2C. My problem is writing an efficient program that doesn't waste all the FPGA space. I need to store 400 registers, each composed of an 8-bit address and 8 bits of data.
The program I wrote is:
entity i2cReg is
port (
RegSel : in std_logic;
Address : out std_logic_vector (15 downto 0);
Data : out std_logic_vector (7 downto 0);
RegStop : out std_logic;
ModuleEN : in std_logic
);
end i2cReg;
architecture i2cReg_archi of i2cReg is
signal counter :integer := 0;
begin
process(RegSel, ModuleEN)
begin
if ModuleEN = '0' then
Address <= x"10";
Data <= x"10";
RegStop <= '0';
counter <= 0;
elsif rising_edge(RegSel) then
counter <= counter + 1;
case counter is
when 0 =>
Address <= x"10";
Data <= x"10";
when 1 =>
Address <= x"10";
Data <= x"10";
when 2 =>
Address <= x"10";
Data <= x"10";
when 3 =>
Address <= x"10";
Data <= x"10";
when 4 =>
Address <= x"10";
Data <= x"10";
when 5 =>
Address <= x"10";
Data <= x"10";
when 400 =>
RegStop <= '1';
when others =>
end case;
end if;
end process;
end i2cReg_archi;
Is there a way to optimize this code? Or would you advise me to use an external EEPROM?
Yaro - you have not mentioned the FPGA vendor or the device but the answer is: Yes, you can initialize ROM in an FPGA so that the values you need are present after configuration. Both Altera and Xilinx allow you to provide a file with the initial values during synthesis.
Kevin.
Initialized BlockRAM is in general the correct solution if you are on Xilinx or Altera.
But there are exceptions where a logic implementation can also work:
For example, if the contents of your 400 registers have repeating patterns, or many registers share the same value (as in your example code). In this case, if you implement it as logic, your synthesis tool will optimize it heavily. You may actually end up with a very small amount of logic if the register contents are highly repetitive. It is sometimes also possible to improve the optimization by cleverly reordering the registers.
100-200 logic cells is often considered "cheaper" than a BlockRAM. But it depends mostly on which resource is most scarce in your particular application.
Regardless if you go for initialized BlockRAM or logic, I would suggest that you model it as an array of std_logic_vector instead of using case/when.
The "array of std_logic_vector" approach is platform independent, and can be synthesized to either BlockRAM or logic. Your synthesis tool will usually try to select the best implementation automatically. But you can also force the synthesis tool to use either logic or BlockRAM by using vendor-specific attributes. (I can't tell you which attributes to use, since I don't know which platform you are using.)
Example:
type REG_TYPE is array (0 to 3) of std_logic_vector(15 downto 0);
constant REGISTERS : REG_TYPE :=
(x"0000",
x"0001",
x"0010",
x"0100");
And in your process, something like:
if rising_edge(RegSel) then
Address <= REGISTERS( counter )(15 downto 8);
Data <= REGISTERS( counter )( 7 downto 0);
end if;
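If you want to check the table contents on the host side, for example to generate expected values for a testbench, the same lookup can be modelled in a few lines of C. This is only a sketch: register_at and struct reg_write are hypothetical names, and the table simply repeats the four placeholder entries from the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder contents, copied from the four-entry example above. */
static const uint16_t REGISTERS[] = { 0x0000, 0x0001, 0x0010, 0x0100 };
#define NUM_REGISTERS (sizeof REGISTERS / sizeof REGISTERS[0])

struct reg_write {
    uint8_t address; /* upper byte of the table entry */
    uint8_t data;    /* lower byte of the table entry */
};

/* Returns the address/data pair selected by counter. */
struct reg_write register_at(size_t counter)
{
    struct reg_write w;

    assert(counter < NUM_REGISTERS);
    w.address = (uint8_t)(REGISTERS[counter] >> 8);
    w.data = (uint8_t)(REGISTERS[counter] & 0xFFu);
    return w;
}
```

Keeping address and data packed in one table entry mirrors the slicing in the VHDL snippet, so the two versions can be compared entry by entry.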

State Machine with VHDL for UA(R)T

I am trying to create a state machine in vhdl for UA(R)T (Only the sending portion).
I am having an issue with the flow of the program. I know the baud rate portion does not work at the moment. I am trying to get it working with just a clock for now, and will then implement the baud rate divider.
When I run it through my test bench (nothing complicated, just assign a couple of initial values reset = 1 for x time, din = z, baud = y, etc), nothing happens. My output txd stays at the initial '1' value that is set in the reset stage and if I set it to '0' it will stay like that for the cycles.
The issue I had when designing the state machine is that it has two values that it will transition on, BUT not in every state.
Basically, what it is supposed to do is:
reset: txd = 1, count = 1, busy = 0, we = 0
idle: when busy = 1 set shift = init values
wait: transition on next clock signal
trans: if count < 9, txd = shift(0), and shift shift
if count = 9, busy = 0, count = 0
and back to idle
I think my issue is somehow related to the busy signal not being properly set.
-- Universal Asynch Receiver Transmitter
---------------------
library ieee;
use ieee.std_logic_1164.all;
entity eds_uart is
generic (width : positive := 16);
port ( clk,reset: in std_logic ;
din_wen: buffer std_logic; -- state machine sets value thus buffer needed
brd : in std_logic_vector(23 downto 0); -- baud rate divisor
din : in std_logic_vector(7 downto 0); -- input value
txd: out std_logic; -- sent data bit
tx_busy : buffer std_logic -- sent data bit active
);
end entity eds_uart;
architecture behaviour of eds_uart is
type state_type is (idle_s, wait_s, transmit_s); -- three possible states of uat
signal current_s: state_type;
signal tick: std_logic; -- baud rate clock
signal count: integer := 0; -- count number of characters sent
signal shift: std_logic_vector(9 downto 0); -- intermediate vector to be shifted
begin
-- assign tick value based on baud rate
-- need to implement divisor
process(clk, brd) begin
tick <= clk;
end process;
process(tick, reset, din) begin
if (reset = '1') then
current_s <= idle_s; -- default state
count <= 0; -- reset character counter
txd <= '1';
tx_busy <= '0';
din_wen <= '0'; -- able to start sending
elsif (current_s = idle_s and din_wen = '1') then -- transition when write enable is high
current_s <= wait_s; -- transition
tx_busy <= '1';
shift <= '1' & din & '0'; -- init shift value
elsif (current_s = wait_s and rising_edge(tick)) then -- transition on clock signal
current_s <= transmit_s;
elsif (current_s = transmit_s and rising_edge(tick)) then -- test transition on clock signal
if (count < 9) then
txd <= shift(0); -- output value
shift <= '0' & shift(9 downto 1); -- shift to next value
count <= count + 1; -- increment counter
current_s <= transmit_s; -- dont change state
elsif (count = 9) then
txd <= shift(0); -- send last element
count <= 0;
tx_busy <= '0'; -- reset busy signal
current_s <= idle_s; -- start process again
end if;
end if;
end process;
end architecture behaviour ;
The comments:
-- state machine sets value thus buffer needed
and
-- transition when write enable is high
suggest that you may be expecting to have an additional external driver for din_wen. If that is the case the buffer mode is not doing you any good as it only exposes the value of the internal driver of din_wen which is only ever driving '0'. Post VHDL-2002, buffer is effectively a fancy, readable version of out without the limitations from earlier standards. It does not implement an input port. More significantly, it does not let you see the external resolved value if you have additional signal driver(s) outside this entity.
It isn't clear why you even need to drive din_wen internally since it is intended to be a control input that causes the transition into the wait_s state. Consider changing it to an in port mode and removing the reset assignment.
Style note: You are courting danger with the mixture of synchronous and asynchronous logic described here. You should stick to the pattern of having a single call to rising_edge() in a top level if block that wraps all of your synchronous logic.
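To illustrate the "one clocked block" idea, here is a behavioural C model of the transmitter in which everything happens inside a single function that is called once per rising edge of the baud tick. It is only a sketch under simplifying assumptions: the wait state is left out, din_wen is treated as a plain input, and uart_tx_reset and uart_tx_tick are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum tx_state { TX_IDLE, TX_SEND };

struct uart_tx {
    enum tx_state state;
    uint16_t shift;  /* 10-bit frame: stop bit, din(7..0), start bit */
    unsigned count;  /* number of frame bits already sent */
    int txd;
    int busy;
};

void uart_tx_reset(struct uart_tx *u)
{
    assert(u != NULL);
    u->state = TX_IDLE;
    u->shift = 0;
    u->count = 0;
    u->txd = 1;      /* line idles high */
    u->busy = 0;
}

/* One call corresponds to one rising edge of the baud tick. */
void uart_tx_tick(struct uart_tx *u, int din_wen, uint8_t din)
{
    assert(u != NULL);
    switch (u->state) {
    case TX_IDLE:
        u->txd = 1;
        if (din_wen) {
            /* Same framing as shift <= '1' & din & '0'. */
            u->shift = (uint16_t)((1u << 9) | ((unsigned)din << 1));
            u->count = 0;
            u->busy = 1;
            u->state = TX_SEND;
        }
        break;
    case TX_SEND:
        u->txd = (int)(u->shift & 1u); /* LSB goes out first */
        u->shift >>= 1;
        if (u->count == 9) {
            u->busy = 0;
            u->state = TX_IDLE;
        } else {
            u->count++;
        }
        break;
    }
}
```

Because every state change happens inside the one tick function, there is no path that reacts to din_wen or din between clock edges, which is the property the single rising_edge() wrapper gives you in VHDL.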

Resources