Timing between 7-segment display and enable - vhdl

I am working through the Altera University labs, but I am using a board of a slightly different design, so I have to mimic the way the lab boards drive their 7-segment LEDs.
I have sorted it out with the code below:
LIBRARY ieee;
USE ieee.std_logic_1164.all;
ENTITY DE1_disp IS
PORT ( HEX0, HEX1, HEX2, HEX3: IN STD_LOGIC_VECTOR(6 DOWNTO 0);
clk : IN STD_LOGIC;
HEX : OUT STD_LOGIC_VECTOR(6 DOWNTO 0);
DISPn: OUT STD_LOGIC_VECTOR(3 DOWNTO 0));
END DE1_disp;
ARCHITECTURE Behavior OF DE1_disp IS
COMPONENT sweep
Port ( mclk : in STD_LOGIC;
sweep_out : out std_logic_vector(1 downto 0));
END COMPONENT;
SIGNAL M : STD_LOGIC_VECTOR(1 DOWNTO 0);
BEGIN -- Behavior
S0: sweep PORT MAP (clk,M);
DISPProcess: process (clk,M) is
begin
CASE M IS
WHEN "00" => HEX <= HEX0; DISPn <= "1110";
WHEN "01" => HEX <= HEX1; DISPn <= "1101";
WHEN "10" => HEX <= HEX2; DISPn <= "1011";
WHEN "11" => HEX <= HEX3; DISPn <= "0111";
END CASE;
end process DISPProcess;
END Behavior;
The gist is that my board has one set of segment drivers and you have to scan the LED enables, whilst the lab boards simply have n sets of segment drivers.
The code above works except for a pesky "ghost" character. What appears to be happening is that the enable is held low whilst a character change is occurring, so the following display is lit for a poofteenth of the enable time.
As you can see from the code, I am taking four 7-segment display inputs and generating a scanned output. The ghost is always on the digit following the last enable, so it also wraps from the 4th to the 1st display. Obviously, this is most apparent when a display is blanked.
For the purposes of the labs this code is fine. However, I would love to better understand what I have done to incur the ghost as understanding that would help me understand VHDL design a tad more.
Can anyone please suggest what principle I need to grasp here, or at least how to code up the enable so it falls after the digit change?
Note I have tried a default case (both using NULL and setting DISPn to "1111"). I suspect a way to do it is to expand case statement and alternatively set HEX and then set DISPn on successive case statements. But are there any other VHDL tricks that might work?
Cheers,
A

It is possible that your diagnosis is slightly wrong.
Check the schematic for your board: it is likely that the Enables (called Disp) drive the bases of bipolar transistors into saturation. Then, even though Hex and Disp change in the same delta cycle, charge storage in the external transistors maintains the Enable for long enough to see the ghost.
The fix is to provide a dead time, turning the Enables off for a short while until the enable transistors are fully off (probably tens of microseconds). Then you can change the digit and re-enable at the same time.
Your solution accomplishes this elegantly and simply, but at the cost of half the potential brightness.
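The dead time described above could be sketched roughly as follows. This is only an illustration, not the asker's design: the entity name de1_disp_blank, the generics SLOT_BITS and BLANK_TICKS, and the internal counter (which stands in for the sweep component) are all assumptions. With a 50 MHz clock, which is also an assumption, the defaults give a digit slot of about 328 us with the last 20 us or so blanked.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity de1_disp_blank is
  generic (
    SLOT_BITS   : positive := 14;   -- assumed: each digit slot lasts 2**SLOT_BITS clocks
    BLANK_TICKS : natural  := 1024  -- assumed: dead time (in clocks) at the end of each slot
  );
  port (
    clk                    : in  std_logic;
    HEX0, HEX1, HEX2, HEX3 : in  std_logic_vector(6 downto 0);
    HEX                    : out std_logic_vector(6 downto 0);
    DISPn                  : out std_logic_vector(3 downto 0)
  );
end entity de1_disp_blank;

architecture rtl of de1_disp_blank is
  -- Number of clocks in each slot during which the digit is actually lit.
  constant ACTIVE_TICKS : natural := 2**SLOT_BITS - BLANK_TICKS;
  -- Top two bits select the digit, the low bits count time within the slot.
  signal cnt : unsigned(SLOT_BITS + 1 downto 0) := (others => '0');
begin
  scan : process (clk) is
    variable digit : unsigned(1 downto 0);
    variable tick  : unsigned(SLOT_BITS - 1 downto 0);
    variable en    : std_logic_vector(3 downto 0);
  begin
    if rising_edge(clk) then
      cnt   <= cnt + 1;
      digit := cnt(SLOT_BITS + 1 downto SLOT_BITS);
      tick  := cnt(SLOT_BITS - 1 downto 0);

      -- Segment data only changes when the digit select changes,
      -- i.e. at the same edge on which the next enable is asserted.
      case digit is
        when "00"   => HEX <= HEX0; en := "1110";
        when "01"   => HEX <= HEX1; en := "1101";
        when "10"   => HEX <= HEX2; en := "1011";
        when others => HEX <= HEX3; en := "0111";
      end case;

      -- Dead time: all enables off for the last BLANK_TICKS clocks of the slot,
      -- while HEX still holds the old digit's segments.
      if tick >= ACTIVE_TICKS then
        DISPn <= "1111";
      else
        DISPn <= en;
      end if;
    end if;
  end process scan;
end architecture rtl;
```

Because HEX and DISPn are both registered from the same counter, the segment pattern switches on exactly the clock edge that re-enables the next digit, after the previous enable has been off for BLANK_TICKS clocks. The brightness cost is BLANK_TICKS / 2**SLOT_BITS, about 6% with these defaults.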

Related

VHDL: Correct way to infer a single port RAM with synchronous read

I've been having this debate for years... What's the correct way to infer a single port RAM with synchronous read?
Let's suppose the interface for my inferred memory in VHDL is:
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity sram1 is
generic(
aw :integer := 8; --address width of memory
dw :integer := 8 --data width of memory
);
port(
--arm clock
aclk :in std_logic;
aclear :in std_logic;
waddr :in std_logic_vector(aw-1 downto 0);
wdata :in std_logic_vector(dw-1 downto 0);
wen :in std_logic;
raddr :in std_logic_vector(aw-1 downto 0);
rdata :out std_logic_vector(dw-1 downto 0)
);
end entity;
Is it this way: Door #1
-- I LIKE THIS ONE
architecture rtl of sram1 is
constant mem_len :integer := 2**aw;
type mem_type is array (0 to mem_len-1) of std_logic_vector(dw-1 downto 0);
signal block_ram : mem_type := (others => (others => '0'));
begin
process(aclk)
begin
if (rising_edge(aclk)) then
if (wen = '1') then
block_ram(to_integer(unsigned(waddr))) <= wdata(dw-1 downto 0);
end if;
-- QUESTION: REGISTERING THE READ DATA (ALL OUTPUT REGISTERED)?
rdata <= block_ram(to_integer(unsigned(raddr)));
end if;
end process;
end architecture;
Or this way: Door #2
-- TEXTBOOKS LIKE THIS ONE
architecture rtl of sram1 is
constant mem_len :integer := 2**aw;
type mem_type is array (0 to mem_len-1) of std_logic_vector(dw-1 downto 0);
signal block_ram : mem_type := (others => (others => '0'));
signal raddr_dff : std_logic_vector(aw-1 downto 0);
begin
process(aclk)
begin
if (rising_edge(aclk)) then
if (wen = '1') then
block_ram(to_integer(unsigned(waddr))) <= wdata(dw-1 downto 0);
end if;
-- QUESTION: REGISTERING THE READ ADDRESS?
raddr_dff <= raddr;
end if;
end process;
-- QUESTION: HOT ADDRESS SELECTION OF DATA
rdata <= block_ram(to_integer(unsigned(raddr_dff)));
end architecture;
I'm a fan of the first version because I think it's good practice to register all of the outputs of your VHDL module. However, many textbooks list the latter version as the correct way to infer a single port RAM with synchronous read.
Does it really matter from a Xilinx or Altera synthesis point of view, as long as you have already taken into account the difference between delaying the data versus the address (and determined it doesn't matter for your application)?
I mean...they both still give you block rams in the FPGA? right?
or does one give you LUTS and the other Block rams?
Which would infer a better timing and better capacity in an FPGA, door #1 or door #2?
Unfortunately, the synthesis tool vendors have made the RAM inference functions so that they typically recognize both styles, regardless of the physical implementation of the RAM in the FPGA in question.
So even if you specify a registered output, the synthesis tool may silently ignore that and infer a RAM with registered inputs instead. This is not functionally equivalent, so it may actually lead to undesired behaviour, particularly in the case of dual port RAMs.
To avoid this pitfall, you can add vendor-specific attributes telling the synthesis tool exactly which kind of RAM you need.
In general, most FPGAs have mandatory registered inputs on the physical RAM, and can add an additional optional register on the output.
So using the code style with registered inputs will probably make simulation match reality, which is typically a good thing.
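As a rough sketch of what such attributes look like, here is an alternative architecture for the sram1 entity from the question with the registered read address style. The attribute names ram_style (Xilinx) and ramstyle (Altera/Intel) come from the vendors' synthesis guides; the architecture name rtl_attr is made up, and the value "M9K" is an assumption that only applies to families with M9K blocks, so check the guide for your device.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Alternative architecture for the sram1 entity declared in the question.
architecture rtl_attr of sram1 is
  constant mem_len : integer := 2**aw;
  type mem_type is array (0 to mem_len-1) of std_logic_vector(dw-1 downto 0);
  signal block_ram : mem_type := (others => (others => '0'));
  signal raddr_dff : std_logic_vector(aw-1 downto 0) := (others => '0');

  -- Xilinx: ask for block RAM rather than distributed (LUT) RAM.
  attribute ram_style : string;
  attribute ram_style of block_ram : signal is "block";

  -- Altera/Intel: equivalent request; the value is family-specific (assumed here).
  attribute ramstyle : string;
  attribute ramstyle of block_ram : signal is "M9K";
begin
  process (aclk)
  begin
    if rising_edge(aclk) then
      if wen = '1' then
        block_ram(to_integer(unsigned(waddr))) <= wdata;
      end if;
      raddr_dff <= raddr;  -- registered read address, matching the physical RAM inputs
    end if;
  end process;

  rdata <= block_ram(to_integer(unsigned(raddr_dff)));
end architecture rtl_attr;
```

An attribute a given tool does not recognise is normally just ignored, so carrying both declarations is a common way to keep one source file for two vendors. Note that these attributes select the memory resource; they do not by themselves fix which side of the RAM the registers end up on, so the synthesis report is still worth checking.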
The differences can matter, and it really depends on the specific family you are targeting. Most modern FPGAs have options for the block ram that allow them to function either way, but I wouldn't count on that in practice.
If I am inferring RAM, I typically start with the example design provided with the tools (there's almost always a "how to infer ram" section of the user guide). If targeting cross-platform (eg: Altera + Xilinx) I'd stick with a "minimal common supported" set of features, merging the two example designs.
All that said, I typically register BOTH the address and the data. It's one more clock, but it helps close timings and I'm usually more concerned with throughput vs. overall latency. I also typically use wrapper functions (eg: My_Simple_Dual_Port_RAM) and directly instantiate the low-level block rams using primitives which makes it easy to switch between FPGA vendors (or swap out the inferred logic if/when needed). I just drop the modules in a directory (eg: Altera, Lattice, Xilinx) and include the appropriate directory in the project file. I also do the same thing with dual clock FIFOs, where you're typically a LOT better off using the library parts vs. trying to build your own.
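Registering both the address and the data, as described above, might look like the following sketch for the sram1 entity from the question. The architecture name rtl_both is made up for illustration, and whether the second register is absorbed into the block RAM's optional output register depends on the tool and family.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Alternative architecture for the sram1 entity declared in the question.
architecture rtl_both of sram1 is
  constant mem_len : integer := 2**aw;
  type mem_type is array (0 to mem_len-1) of std_logic_vector(dw-1 downto 0);
  signal block_ram : mem_type := (others => (others => '0'));
  signal raddr_dff : std_logic_vector(aw-1 downto 0) := (others => '0');
begin
  process (aclk)
  begin
    if rising_edge(aclk) then
      if wen = '1' then
        block_ram(to_integer(unsigned(waddr))) <= wdata;
      end if;
      -- Stage 1: register the read address.
      raddr_dff <= raddr;
      -- Stage 2: register the read data.
      rdata <= block_ram(to_integer(unsigned(raddr_dff)));
    end if;
  end process;
end architecture rtl_both;
```

Read data appears two clock edges after raddr is presented, one cycle later than either Door #1 or Door #2.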
You can take a look at the results of the synthesis. My Vivado gives me the following reports after synthesizing your solutions (default settings).
First solution:
BRAM: 0.5 (from 60 Blocks)
IO: 34
BUFG: 1
And the schematic looks like this
Second solution:
BRAM: 0.5 (from 60 Blocks)
IO: 34
BUFG: 1
With the following result:
So you see that the synthesis will generate the same output for both variants. It is up to you which one you want to use. I prefer the first variant because the second is slightly more code.

Synchronous counter in VHDL with compare match and load

I created the following Counter with a compare match functionality:
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use ieee.numeric_std.all;
entity Counter is
generic (
N : natural := 24
);
port (
-- Input counter clock
clk : in std_logic := '0';
-- Enable the counter
enable : in std_logic := '0';
-- Preload value loaded when clk is rising and load is 1
load_value : in std_logic_vector((N-1) downto 0) := (others => '0');
-- Set to 1 to load a value
load : in std_logic := '0';
-- Compare match input is compared with the counter value
compare_match_value : in std_logic_vector((N-1) downto 0) := (others => '0');
-- Is 1 when compare_match_value = counter_value
compare_match : out std_logic := '0';
output_value : out std_logic_vector((N-1) downto 0) := (others => '0')
);
end Counter;
architecture Behavioral of Counter is
signal counter_value : unsigned((N - 1) downto 0) := to_unsigned(0, N);
begin
output_value <= std_logic_vector(counter_value);
process (clk) is
begin
if rising_edge(clk) then
if enable = '1' then
if load = '1' then
counter_value <= unsigned(load_value);
else
counter_value <= counter_value + 1;
end if;
else
if load = '1' then
counter_value <= unsigned(load_value);
end if;
end if;
end if;
end process;
process (counter_value) is
begin
if unsigned(compare_match_value) = counter_value then
compare_match <= '1';
else
compare_match <= '0';
end if;
end process;
end Behavioral;
The behavior of my counter is to be fully synchronous with the input clk signal. Disabling the counter is always possible and the value is held at the current count value. A load value can be assigned with the load and load_value signal. Whenever the load signal is high and a rising edge is detected, the counter value is updated to the load_value.
Another feature is the compare unit which outputs high on compare_match output. The simulation works as expected but I have a few questions when synthesizing this design on spartan 3 fpga.
Is this considered a good design of my counter because I'm still not much experienced in VHDL.
Are there any undefined states when using the compare unit in further logic in my design? As I see it compare_match is calculated whenever the counter_value is updated.
When using a large number for N, is there anything special about the delay I need to consider?
In general it seems to me a quite good description.
However, I would like to point out some minor things (that might be some answers to your 1st question).
1) As far as I can see, your counter does not contain any reset (neither asynchronous nor synchronous). In general you cannot predict the starting point of your counting (even if, probably, it will be all '0's at start-up).
In my opinion, it would be a neater design if you could have a reset signal.
I also noticed that the loading is activated regardless of whether the counter is enabled or not. I have no comment about this since it could be a specification for your design. Maybe you can compact the code by moving the "if load" part outside of the "if enable" (i.e. changing the order of the comparisons).
To improve the readability (especially when the designs will be more complex), I advise you to label the process. This will help you to identify the different part of the design.
You can skip lot of the extra-typing if you use the VHDL-mode of emacs. It has built in templates that would take care of the "boring" part related to coding.
I also see that you have default values for your input ports. In my opinion, this is not a very good practice; they would be ignored by the synthesizer, leading to an IP that might behave differently from what you expect. In general, do not make assumptions (apart from those that are specified) about the external signals.
Finally, I have a comment about the compare part.
This goes for both question 1) & 2)
1-2)
In the compare process, you have just listed counter_value in the sensitivity-list.
This means that the process would be activated only when counter_value changes.
Since you compare it with a signal (compare_match_value) that is an input to the block (hence it can change values) it would be better to have it too in the sensitivity-list. Otherwise, the comparison would not apply (i.e. the process would not be activated) when you change compare_match_value.
Linting tools and synthesizers might complain about it (with a warning such as "incomplete sensitivity list"). As a matter of fact, it is good practice to list every signal the process reads in the sensitivity list of a combinatorial process.
Regarding the comparison itself, the way you described it is absolutely fine and you would not have uncovered states. Basically you have specified all possible conditions so no surprises there.
3)
Regarding your 3rd question, since you are targeting an FPGA, you could "relax" about it. FPGAs have dedicated structure for fast arithmetic operations and (as long as you do not use all of them) the synthesizer would use them to close time.
Also in ASIC, synthesizer would probably select an appropriate arithmetic structure to close time.
If you want to be on the safe side, you can add a register at the output of the comparison block. This will prevent creating a long combinatorial path, especially if your IP has to be integrated with other blocks. Of course this extra register would add one clock cycle of latency, but it will improve your overall timing.
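Pulling these suggestions together, a revised counter could look something like the sketch below. The entity name counter_rev and the rst port are additions for illustration, not part of the original design. Load still takes priority over enable, as in the question's code, and the compare output is registered, so it lags the counter value by one clock.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity counter_rev is
  generic (
    N : natural := 24
  );
  port (
    clk                 : in  std_logic;
    rst                 : in  std_logic;  -- synchronous reset (added for this sketch)
    enable              : in  std_logic;
    load                : in  std_logic;
    load_value          : in  std_logic_vector(N-1 downto 0);
    compare_match_value : in  std_logic_vector(N-1 downto 0);
    compare_match       : out std_logic;
    output_value        : out std_logic_vector(N-1 downto 0)
  );
end entity counter_rev;

architecture rtl of counter_rev is
  signal counter_value : unsigned(N-1 downto 0);
begin
  output_value <= std_logic_vector(counter_value);

  -- Labelled counting process: reset, then load, then enable, in priority order.
  count_proc : process (clk) is
  begin
    if rising_edge(clk) then
      if rst = '1' then
        counter_value <= (others => '0');
      elsif load = '1' then
        counter_value <= unsigned(load_value);
      elsif enable = '1' then
        counter_value <= counter_value + 1;
      end if;
    end if;
  end process count_proc;

  -- Registered compare: a clocked process needs only clk in its sensitivity list,
  -- and the N-bit comparator no longer feeds downstream logic directly.
  compare_proc : process (clk) is
  begin
    if rising_edge(clk) then
      if rst = '1' then
        compare_match <= '0';
      elsif unsigned(compare_match_value) = counter_value then
        compare_match <= '1';
      else
        compare_match <= '0';
      end if;
    end if;
  end process compare_proc;
end architecture rtl;
```

If the one-cycle delay on compare_match is not acceptable, a combinatorial version with both counter_value and compare_match_value in the sensitivity list (or a plain concurrent assignment) keeps the original timing.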
I hope these suggestions could be useful to you and cover (at least partially) your doubts.
Keep on coding :)

Design of MAC unit (dsp processors) using VHDL

My project is the design of a 32-bit MAC (Multiply and Accumulate) unit using reversible logic. For the project, I have designed a 32-bit multiplier and a 64-bit adder using reversible logic. As the next step I want to design a 64-bit accumulator which takes the value from the adder, stores it, and adds it to the previous value held in it. I have no idea how to design the accumulator.
Please help in completion of my project.
A basic VHDL accumulator can be implemented in only a few lines of code. How you decide to implement it, and any additional features necessary are going to depend on your specific requirements.
For example:
Are the inputs signed or unsigned?
What is the type of the inputs?
Does the accumulator saturate, or will it roll over?
Here is a sample unsigned accumulator to give you an idea of what you need to implement (based on this source):
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity accumulator is
port (
DIN: in std_logic_vector(3 downto 0);
CLK: in std_logic;
RST: in std_logic;
DOUT: out std_logic_vector(3 downto 0)
);
end entity accumulator;
architecture behave of accumulator is
signal acc_value : std_logic_vector(3 downto 0);
begin
process(CLK)
begin
if rising_edge(CLK) then
if RST='1' then
acc_value <= (others => '0'); -- reset accumulated value to 0
else
acc_value <= std_logic_vector( unsigned(acc_value) + unsigned(DIN) );
end if;
end if;
end process;
-- Assign output
DOUT <= acc_value;
end behave;
To describe what this design does in words: Every clock cycle on the rising edge, the data input DIN is interpreted as an unsigned value, and added to the currently accumulated value acc_value. If the RST input is asserted, instead of accumulating the DIN input, the accumulated value is cleared back to 0. The value of the accumulator is always presented on the output of the block, DOUT.
Based on what you are interfacing with, you might want to consider the following changes/modifications:
Perhaps DIN should be signed or unsigned types instead of std_logic_vector. I actually recommend this, but it depends on how you are representing your values in other places of your design.
DOUT could also be a signed or unsigned value instead of std_logic_vector; it depends on your requirements.
In this case, acc_value, the accumulated value register, will rollover if the values accumulated get too high. Maybe you want to generate an error condition when this happens, or perform a check to ensure that you saturate at the maximum value of acc_value instead.
acc_value need not be the same width as DIN -- it could be twice as wide (or whatever your requirements are). The wider it is, the more you can accumulate before the rollover condition occurs.
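A few of these modifications can be combined in one sketch: unsigned ports, an accumulator wider than the input, and saturation instead of rollover. The entity name accumulator_sat, the generics DIN_WIDTH and ACC_WIDTH, and the OVF flag are assumptions made for illustration; ACC_WIDTH is assumed to be at least DIN_WIDTH.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity accumulator_sat is
  generic (
    DIN_WIDTH : positive := 4;
    ACC_WIDTH : positive := 8   -- assumed to be >= DIN_WIDTH
  );
  port (
    CLK  : in  std_logic;
    RST  : in  std_logic;
    DIN  : in  unsigned(DIN_WIDTH-1 downto 0);
    DOUT : out unsigned(ACC_WIDTH-1 downto 0);
    OVF  : out std_logic  -- '1' when the most recent addition saturated
  );
end entity accumulator_sat;

architecture rtl of accumulator_sat is
  signal acc_value : unsigned(ACC_WIDTH-1 downto 0) := (others => '0');
begin
  process (CLK)
    -- One extra bit so the carry out of the addition is visible.
    variable sum : unsigned(ACC_WIDTH downto 0);
  begin
    if rising_edge(CLK) then
      if RST = '1' then
        acc_value <= (others => '0');
        OVF       <= '0';
      else
        sum := resize(acc_value, ACC_WIDTH + 1) + resize(DIN, ACC_WIDTH + 1);
        if sum(ACC_WIDTH) = '1' then
          acc_value <= (others => '1');  -- clamp at the maximum value
          OVF       <= '1';
        else
          acc_value <= sum(ACC_WIDTH-1 downto 0);
          OVF       <= '0';
        end if;
      end if;
    end if;
  end process;

  DOUT <= acc_value;
end architecture rtl;
```

For the 64-bit MAC in the question, DIN_WIDTH and ACC_WIDTH would both be set to 64, or ACC_WIDTH made larger to leave headroom. This is ordinary behavioural VHDL; a reversible-logic implementation would need the adder replaced by the asker's own component.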

VHDL modify one signal with multiple clocks

I ran into a problem using three clocks in one process.
HC1 and HC2 may be active at the same time and are much slower than H. H is the base clock, which runs at 16 MHz.
If I write the process like this:
entity fifo is
Port ( H : in STD_LOGIC;
HC1 : in STD_LOGIC;
HC2 : in STD_LOGIC;
C1data : in STD_LOGIC_VECTOR (2 downto 0);
C2data : in STD_LOGIC_VECTOR (2 downto 0);
Buffer1 : out STD_LOGIC_VECTOR (3 downto 0);
Buffer2 : out STD_LOGIC_VECTOR (3 downto 0));
end fifo;
architecture Behavioral of fifo is
signal Full1,Full2 : STD_LOGIC;
begin
process(H,HC1,HC2)
begin
if(rising_edge(H)) then
Full1 <= '0';
Full2 <= '0';
else
if(rising_edge(HC1)) then
Buffer1(3 downto 1) <= C1data;
Buffer1(0) <= C1data(2) xor C1data(1) xor C1data(0);
Full1 <= '1';
end if;
if(rising_edge(HC2)) then
Buffer2(3 downto 1) <= C2data;
Buffer2(0) <= C2data(2) xor C2data(1) xor C2data(0);
Full2 <= '1';
end if;
end if;
end process;
and it says:
ERROR:Xst:827 - "C:/Users/Administrator/Desktop/test/justatest/fifo.vhd" line 45: Signal Buffer1<0> cannot be synthesized, bad synchronous description. The description style you are using to describe a synchronous element (register, memory, etc.) is not supported in the current software release.
why? Many thanks !
Not all valid VHDL is synthesizable. What is considered synthesizable varies between tools and the target architecture. The Xilinx hardware architectures have no way to represent the logic described by your code (without resorting to gated clocks). Synthesizers only support a subset of the language and expect hardware primitives to be described using certain "set" templates. Modern tools are more forgiving in what they will accept for a high level description but there is a limit to what they can accomplish.
Digital logic synthesis tools make certain assumptions about the types of circuits they will support. Your circuit description applies the rising_edge() function to three different signals in the same process. Complex clocking arrangements like this are generally not supported. The usual expectation is that a circuit consists of isolated clock domains activated by a single clock edge. They will not automatically create gated clocks to suit atypical code like your example because this introduces potential hazards into the circuit that may not be detected with timing constraints and static timing analysis.
In the case of FPGAs, the clocking architecture is baked in and no amount of fiddling with the input description can change that. Feeding clocks into the logic fabric to be gated upsets the default expectations of the synthesizer and is best avoided if at all possible.
If HC1 and HC2 are actually control signals and not clocks then you shouldn't be using the rising_edge() function to detect changes in their state. Instead you should create delayed versions registered by the common clock H. A change from '0' to '1' is then detected by the expression HC1 = '1' and HC1_prev = '0'.
The else condition to the top level if statement is not supported by XST as it doesn't conform to XSTs expectations for describing synchronous logic. You should instead eliminate the else and move the initialization of Full1 and Full2 to a separate reset/clear section. This can be done synchronously or asynchronously. Refer to the XST synthesis guide for examples on how to accomplish that.
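A single-clock version along these lines might look like the sketch below. It assumes HC1 and HC2 are slow strobes that are asynchronous to H, so each passes through a two-stage synchronizer before the edge detect, and it assumes C1data and C2data stay stable for a few H cycles after their strobe rises. The entity name fifo_sync and the internal signal names are made up. Treating Full1 and Full2 as one-cycle pulses is only one reading of the original intent.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity fifo_sync is
  port (
    H       : in  std_logic;  -- the only clock
    HC1     : in  std_logic;  -- treated as data strobes, not clocks
    HC2     : in  std_logic;
    C1data  : in  std_logic_vector(2 downto 0);
    C2data  : in  std_logic_vector(2 downto 0);
    Buffer1 : out std_logic_vector(3 downto 0);
    Buffer2 : out std_logic_vector(3 downto 0)
  );
end entity fifo_sync;

architecture rtl of fifo_sync is
  signal Full1, Full2                 : std_logic := '0';
  signal hc1_meta, hc1_sync, hc1_prev : std_logic := '0';
  signal hc2_meta, hc2_sync, hc2_prev : std_logic := '0';
begin
  process (H)
  begin
    if rising_edge(H) then
      -- Synchronize the strobes into the H domain and keep one delayed copy.
      hc1_meta <= HC1;  hc1_sync <= hc1_meta;  hc1_prev <= hc1_sync;
      hc2_meta <= HC2;  hc2_sync <= hc2_meta;  hc2_prev <= hc2_sync;

      -- Default: flags are cleared unless a new word arrives this cycle.
      Full1 <= '0';
      Full2 <= '0';

      -- Rising edge of HC1 as seen in the H domain.
      if hc1_sync = '1' and hc1_prev = '0' then
        Buffer1(3 downto 1) <= C1data;
        Buffer1(0)          <= C1data(2) xor C1data(1) xor C1data(0);
        Full1               <= '1';
      end if;

      -- Rising edge of HC2 as seen in the H domain.
      if hc2_sync = '1' and hc2_prev = '0' then
        Buffer2(3 downto 1) <= C2data;
        Buffer2(0)          <= C2data(2) xor C2data(1) xor C2data(0);
        Full2               <= '1';
      end if;
    end if;
  end process;
end architecture rtl;
```

Every register is now clocked by H alone, which is the template synthesis tools expect, and both strobes can be handled in the same cycle without conflict.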

issue with vhdl structural coding

The code below is a simple vhdl structural architecture, however, the
concurrent assignment to the signal, comb1, is upsetting the simulation
with the outputs (tb_lfsr_out) and comb1 becoming undefined. Please, please help,
thank you, Louise.
library IEEE;
use IEEE.STD_LOGIC_1164.all;
entity testbench is
end testbench;
architecture behavioural of testbench is
CONSTANT clock_frequency : REAL := 1.0e9;
CONSTANT clock_period : REAL := (1.0/clock_frequency)/2.0;
signal tb_master_clk, comb1: STD_LOGIC := '0';
signal tb_lfsr_out : std_logic_vector(2 DOWNTO 0) := "111";
component dff
port
(
q: out STD_LOGIC;
d, clk: in STD_LOGIC
);
end component;
begin
-- Clock/Start Conversion Generator
tb_master_clk <= (NOT tb_master_clk) AFTER (1 SEC * clock_period);
comb1 <= tb_lfsr_out(0) xor tb_lfsr_out(2);
dff6: dff port map (tb_lfsr_out(2), tb_lfsr_out(1), tb_master_clk);
dff7: dff port map (tb_lfsr_out(1), tb_lfsr_out(0), tb_master_clk);
dff8: dff port map (tb_lfsr_out(0), comb1, tb_master_clk);
end behavioural;
It's just a little more complex than Radix Ciano(1) says. All tb_lfsr_out elements are showing 'U' from Now = 0 ns. The reason why is that all of the D flip flops aren't initialized.
If you reset all the flip flops the result will always be '0' without a '1' to cause a flip in the XOR gate.
Preset the D flip flops (which can come for free in an FPGA implementation):
This was done by simply adding a default value:
q: out std_logic := '1';
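The question never shows the dff entity itself, so the following is only a guess at what it looks like with the default value added. For an out port, the default expression becomes the initial value of the driver inside the component, which is why the flip-flop output starts at '1' instead of 'U'.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical dff matching the component declaration in the testbench.
entity dff is
  port (
    q      : out std_logic := '1';  -- preset: output is '1' until the first clock edge
    d, clk : in  std_logic
  );
end entity dff;

architecture rtl of dff is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      q <= d;
    end if;
  end process;
end architecture rtl;
```

The := "111" initial value on tb_lfsr_out in the testbench has no effect on its own, because the signal's value is determined by the component drivers. With every flip-flop preset to '1', the first clock edge shifts in comb1 = '1' xor '1' = '0', so the register moves to "110" and the LFSR never starts from the all-zeros state.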
(1) Yes it's a minor change, and to all appearances someone changed their user name and if asked I would have changed Radix to Ciano. Making changes simply to cross a threshold is ridiculous.
The entire purpose of this answer was to avoid stepping on the other answerer's rights of authorship and now he's done the very thing. The issue with his answer being that the complementary property of XOR prevented the LFSR from working when all inputs were '0's or any inputs were metavalues.
And while you're at it no one noticed the error in the waveform now corrected, apparently too self absorbed in playing games with answer edits. (The signals after the label dff8 were actually from dff7).
There's a message here which is in the form of a question. What's the purpose in answering questions on stackoverflow? See Why I no longer contribute to StackOverflow. And note Mr. Richter's reputation has continued to eke upward, including for the example post on goto he cites as likely to induce severe ire. (And the message there is have patience all you petty editors, sooner or later your 'reputation' will reach self sustaining levels unless the system is altered to prevent it).
Also note the question's author has to date, and after an impassioned plea closing his question, not shown acceptance of nor use for any answer.
In the mean time quit spoiling why I answer questions on VHDL by changing the words I write, although I have to admit the edit voting history was entertaining.
