VHDL Frequency Meter that reads NCO - vhdl

Hi guys, I'm trying to make an auto-ranging 4-digit frequency meter for measuring signals with frequencies from 10 Hz to 10 MHz. An external 1 MHz generator provides the clock signal to the meter. To produce a test frequency I have made an NCO, but I don't know how to build the frequency meter itself. Any clues or examples you can show me? From what I understand, you can use a "1 sec gate" and count the rising edges of the NCO pulses against another clock; maybe I'm wrong. Help please.
NCO code
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;
ENTITY PA IS
    -- N: width of address (2^N entries); M: width of accumulator
    GENERIC(N: natural := 12;
            M: natural := 26);
    PORT(
        clk: IN STD_LOGIC; -- 50 MHz
        nrst: IN STD_LOGIC;
        -- Fres = Fclk/(2^M), about 0.745 Hz for Fclk = 50 MHz and M = 26
        phase_inc: IN STD_LOGIC_VECTOR(M-1 DOWNTO 0);
        address: OUT STD_LOGIC_VECTOR(N-1 DOWNTO 0)
    );
END PA;
ARCHITECTURE PA_IMPL OF PA IS
    -- unsigned so that "+" and ">=" come from numeric_std
    SIGNAL accum: UNSIGNED(M-1 DOWNTO 0) := (OTHERS => '0');
BEGIN
    PROCESS(clk, nrst)
    BEGIN
        IF (nrst = '0') THEN
            accum <= (OTHERS => '0');
        ELSIF (rising_edge(clk)) THEN
            IF (accum >= "10111110101111000010000000") THEN -- capping it at 50,000,000
                accum <= unsigned(phase_inc);
            ELSE
                accum <= unsigned(phase_inc) + accum; -- accum into phase freq in?
            END IF;
        END IF;
    END PROCESS;
    address <= std_logic_vector(accum(M-1 DOWNTO M-N));
    --msb_o <= count(count'left);
END ARCHITECTURE;

Building an actual frequency meter is not that easy.
The simplest way I see is to perform a fast Fourier transform (FFT) on your sampled signal (IP cores and VHDL code for this are available on the internet), then find the peak of the FFT to obtain the frequency. To do this, you can simply search the entire FFT output for the highest value.
Also, be sure to use a sampling frequency of at least twice the maximum frequency your detector has to measure.
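For comparison, the "1 sec gate" idea from the question can be sketched as a plain edge counter. This is a minimal, untested sketch and not part of the answer above: the entity name gate_counter, its ports and the 50 MHz clock are assumptions. It only works when the measured signal is slower than half the sampling clock, so a 10 MHz input would need a faster clock than the 1 MHz reference (or a counter clocked by the input itself). Auto-ranging and the 4-digit display are left out.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity gate_counter is
  generic (
    -- clk cycles per gate: 1 s at an assumed 50 MHz clock
    GATE_CYCLES : positive := 50_000_000
  );
  port (
    clk     : in  std_logic;
    sig_in  : in  std_logic;  -- asynchronous signal to measure
    -- rising edges counted during the last complete gate
    freq_hz : out unsigned(31 downto 0) := (others => '0')
  );
end entity gate_counter;

architecture rtl of gate_counter is
  -- sync(0), sync(1): two-flop synchroniser; sync(2): previous value
  signal sync     : std_logic_vector(2 downto 0) := (others => '0');
  signal gate_cnt : natural range 0 to GATE_CYCLES - 1 := 0;
  signal edges    : unsigned(31 downto 0) := (others => '0');
  signal rise     : std_logic;
begin
  -- one-cycle pulse for every rising edge of the synchronised input
  rise <= '1' when sync(2) = '0' and sync(1) = '1' else '0';

  process (clk)
  begin
    if rising_edge(clk) then
      sync <= sync(1 downto 0) & sig_in;

      if gate_cnt = GATE_CYCLES - 1 then
        -- gate closes: publish the count and start a new gate
        gate_cnt <= 0;
        freq_hz  <= edges;
        if rise = '1' then
          edges <= to_unsigned(1, edges'length);  -- edge belongs to the new gate
        else
          edges <= (others => '0');
        end if;
      else
        gate_cnt <= gate_cnt + 1;
        if rise = '1' then
          edges <= edges + 1;
        end if;
      end if;
    end if;
  end process;
end architecture rtl;
```

With a 1 s gate the count is the frequency in Hz directly; a shorter gate trades resolution for update rate, which is one way an auto-ranging meter could switch ranges.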

Related

Is There Any Limit to How Wide 2 VHDL Numbers Can Be To Add Them In 1 Clock Cycle?

I am considering adding two 1024-bit numbers in VHDL.
Ideally, I would like to hit a 100 MHz clock frequency.
Target is a Xilinx 7-series.
When you add 2 numbers together, there are inevitably carry bits. Since carry bits on the left cannot be computed until bits on the right have been calculated, to me it seems there should be a limit on how wide a register can be and still be added in 1 clock cycle.
Here are my questions:
1.) Do FPGAs add numbers in this way? Or do they have some way of performing addition that does not suffer from the carry problem?
2.) Is there a limit to the width? If so, is 1024 within the realm of reason for a 100 MHz clock, or is that asking for trouble?
No. You just need to choose a suitably long clock cycle.
Practically, though there is no fundamental limit, for any given cycle time, there will be some limit which depends on the FPGA technology.
At 1024 bits, I'd look at breaking the addition and pipelining it.
Implemented as a single cycle, I would expect a 1024 bit addition to have a speed somewhere around 5, maybe 10 MHz. (This would be easy to check : synthesise one and look at the timing reports!)
Pipelining is not the only approach to overcoming that limit.
There are also "fast adder" architectures like carry look-ahead, carry-save (details via the usual sources) ... these pretty much fell out of fashion when FPGAs built fast carry chains into the LUT fabric, but they may have niche uses such as yours. However they may not be optimally supported by synthesis since (for most purposes) the fast carry chain is adequate.
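The suggestion to break the addition up and pipeline it can be sketched as follows. This is a minimal two-stage illustration, not a tuned design: the entity name add_pipe2 and its ports are hypothetical, W is assumed to be at least 2, and whether two stages are enough for 100 MHz at 1024 bits would have to be confirmed from the timing report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity add_pipe2 is
  generic (W : positive := 1024);  -- operand width, assumed >= 2
  port (
    clk : in  std_logic;
    a   : in  unsigned(W - 1 downto 0);
    b   : in  unsigned(W - 1 downto 0);
    sum : out unsigned(W downto 0)  -- valid two clock cycles after a and b
  );
end entity add_pipe2;

architecture rtl of add_pipe2 is
  constant H : positive := W / 2;               -- width of the low part
  signal lo_s : unsigned(H downto 0);           -- low sum, bit H is the carry
  signal a_hi : unsigned(W - H - 1 downto 0);   -- high parts delayed one cycle
  signal b_hi : unsigned(W - H - 1 downto 0);
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- stage 1: add the low halves, keep the carry, delay the high halves
      lo_s <= resize(a(H - 1 downto 0), H + 1) + resize(b(H - 1 downto 0), H + 1);
      a_hi <= a(W - 1 downto H);
      b_hi <= b(W - 1 downto H);

      -- stage 2: add the high halves plus the registered carry
      sum(H - 1 downto 0) <= lo_s(H - 1 downto 0);
      sum(W downto H)     <= resize(a_hi, W - H + 1) + resize(b_hi, W - H + 1)
                             + lo_s(H downto H);
    end if;
  end process;
end architecture rtl;
```

Each stage only has to propagate a carry through about half the bits, at the cost of one extra cycle of latency; more stages follow the same pattern.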
Maybe this works, have not tried it:
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity Calculator is
    generic(
        num_length : integer := 1024
    );
    port(
        EN : in std_logic;
        clk : in std_logic;
        number1 : in std_logic_vector((num_length) - 1 downto 0);
        number2 : in std_logic_vector((num_length) - 1 downto 0);
        CTRL : in std_logic_vector(1 downto 0); -- two bits select one of four operations
        result : out std_logic_vector(((num_length * 2) - 1) downto 0));
end Calculator;
architecture Beh of Calculator is
    signal temp : unsigned(((num_length * 2) - 1) downto 0) := (others => '0');
begin
    result <= std_logic_vector(temp);
    process(EN, clk)
    begin
        if EN = '0' then
            temp <= (others => '0');
        elsif (rising_edge(clk)) then
            case CTRL is
                -- operands or results are widened to match the width of temp
                when "00" => temp <= resize(unsigned(number1), temp'length) + resize(unsigned(number2), temp'length);
                when "01" => temp <= resize(unsigned(number1) - unsigned(number2), temp'length);
                when "10" => temp <= unsigned(number1) * unsigned(number2);
                when "11" => temp <= resize(unsigned(number1) / unsigned(number2), temp'length);
                when others => null; -- metavalues such as 'X' or 'U'
            end case;
        end if;
    end process;
end Beh;

How to calculate the RPM of a hometrainer with VHDL

I've got a problem: I need to calculate/measure the RPM of a hometrainer using a hall sensor and a magnet on the wheel. The hardware needs to be described in VHDL. My current method is this:
If the hall sensor detects a pulse, reset a counter.
Increment the counter every clock cycle.
On the next pulse, store the previous value, reset, and repeat.
The code:
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
entity teller is
port(
hallsens : in std_logic;
counter : out std_logic_vector(15 downto 0);
areset : in std_logic;
clk : in std_logic
);
end entity teller;
architecture rtl of teller is
signal counttemp : std_logic_vector(15 downto 0);
signal timerval2 : std_logic_vector(15 downto 0);
signal lastcount : std_logic_vector(15 downto 0);
begin
process(clk, areset)
begin
if areset = '1' then
counter <= "0000000000000000";
counttemp <= "0000000000000000";
timerval2 <= "0000001111101000";
elsif hallsens = '1' then
counter <= lastcount + "1";
timerval2 <= "0000001111101000";
counttemp <= "0000000000000000";
elsif rising_edge(clk) then
timerval2 <= timerval2 - "1";
if timerval2 = "0000000000000000" then
lastcount <= counttemp;
counttemp <= counttemp + "1";
timerval2 <= "0000001111101000";
end if;
end if;
end process;
end rtl;
But to calculate the RPM from this I have to divide the counter by the clock speed and multiply by 60. This takes up a lot of hardware on the FPGA (Altera Cyclone II).
Is there a more efficient way to do this?
TIA
I don't have a computer at hand right now, but I'll try to point out the different things I see:
Don't mix numerical libraries (preferably use only numeric_std), as tricky suggests.
If you handle numerical values and include libraries for that, you should use numerical types for the signals (integer, unsigned, signed, ...). It makes things clearer and helps to distinguish numeric signals from non-numeric ones.
hallsens is read as a pseudo-reset but is not in the sensitivity list of the process; this could cause mismatches between simulation and hardware. In any case this is not a good approach: stick with a simple reset and clock pair.
I would detect hallsens within the clocked region of the process and increment the event counter there. It should be simpler.
I'm assuming the hallsens pulse is wide enough to be captured by the clock.
Once the timer signal has reached zero (I'm assuming this gives you a known time based on your clk frequency), you can reload the timer (as you do), output the count value and reset the counter, starting again.
For the math operations 1/Freq and *60, you could use some numerical tricks if needed, based on the frequency value. For example, you could:
multiply by the inverse of the frequency instead of dividing.
approximate it as a sum of powers of 2 (60 = 64 - 4).
make Freq a multiple of 60 to simplify the calculations.
PS: to be less error prone, you can initialize your vectors (as their lengths are multiples of 4) in hex format, like signal <= X"0003", avoiding long binary numbers.
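The points above can be combined into a rough sketch: hallsens is synchronised and edge-detected inside the clocked region, pulses are counted over a fixed window, and the multiplication by 60 is done as 64*n - 4*n with shifts. The entity name rpm_meter, the 50 MHz clock and the 1 s window are assumptions for illustration, and the code is untested.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity rpm_meter is
  generic (
    -- clk cycles per measurement window: 1 s at an assumed 50 MHz clock
    WINDOW_CYCLES : positive := 50_000_000
  );
  port (
    clk      : in  std_logic;
    areset   : in  std_logic;
    hallsens : in  std_logic;
    rpm      : out unsigned(15 downto 0)
  );
end entity rpm_meter;

architecture rtl of rpm_meter is
  signal hall_sync : std_logic_vector(2 downto 0);  -- synchroniser + previous value
  signal timer     : natural range 0 to WINDOW_CYCLES - 1;
  signal pulses    : unsigned(15 downto 0);
begin
  process (clk, areset)
    variable total : unsigned(15 downto 0);
  begin
    if areset = '1' then
      hall_sync <= (others => '0');
      timer     <= 0;
      pulses    <= (others => '0');
      rpm       <= (others => '0');
    elsif rising_edge(clk) then
      hall_sync <= hall_sync(1 downto 0) & hallsens;

      -- count a rising edge of the synchronised sensor signal
      total := pulses;
      if hall_sync(2) = '0' and hall_sync(1) = '1' then
        total := total + 1;
      end if;

      if timer = WINDOW_CYCLES - 1 then
        -- window of 1 s: RPM = 60 * n = 64 * n - 4 * n, no multiplier needed
        rpm    <= shift_left(total, 6) - shift_left(total, 2);
        pulses <= (others => '0');
        timer  <= 0;
      else
        pulses <= total;
        timer  <= timer + 1;
      end if;
    end if;
  end process;
end architecture rtl;
```

With one magnet and a 1 s window the result moves in steps of 60 RPM; a longer window or more magnets would give finer resolution, with the constant adjusted to match.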

Clock divider in vhdl from 100MHz to 1Hz code

I wrote this code to divide the clock on a Nexys4 FPGA board, which has an integrated 100 MHz clock by default, and I need to divide it down to 1 Hz. Can someone tell me whether it is correct and, if not, what needs to be changed?
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;
entity digi_clk is
port (clk1 : in std_logic;
clk : out std_logic
);
end digi_clk;
architecture Behavioral of digi_clk is
signal count : integer :=0;
signal b : std_logic :='0';
begin
--clk generation.For 100 MHz clock this generates 1 Hz clock.
process(clk1)
begin
if(rising_edge(clk1)) then
count <=count+1;
if(count = 50000000) then
b <= not b;
count <=0;
end if;
end if;
clk<=b;
end process;
end;
The code looks OK. However the existing code will produce an output frequency that is just below 1 Hz. To get a precise 100000000:1 ratio, you will want to change the conditional statement from:
if(count = 50000000) then
... to:
if(count = 50000000-1) then
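Put together, the corrected divider might look like the sketch below. The entity name digi_clk_fixed and the HALF_PERIOD generic are made up for illustration; the counter is given a bounded range so that synthesis does not have to infer a full 32-bit integer.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity digi_clk_fixed is
  generic (
    -- input clock cycles per half period of the output: 100 MHz / 2 / 1 Hz
    HALF_PERIOD : positive := 50_000_000
  );
  port (
    clk1 : in  std_logic;
    clk  : out std_logic
  );
end entity digi_clk_fixed;

architecture rtl of digi_clk_fixed is
  signal count : natural range 0 to HALF_PERIOD - 1 := 0;
  signal b     : std_logic := '0';
begin
  process (clk1)
  begin
    if rising_edge(clk1) then
      if count = HALF_PERIOD - 1 then
        -- count runs 0 .. HALF_PERIOD-1, i.e. exactly HALF_PERIOD cycles
        b     <= not b;
        count <= 0;
      else
        count <= count + 1;
      end if;
    end if;
  end process;

  clk <= b;  -- driven outside the process, so it follows b without extra delay
end architecture rtl;
```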
The program seems correct. The internal signal (count) is declared as an integer, so the code should compile successfully, but you may get some warnings and run into problems in testbench simulation. To avoid that, you can declare count as: signal count : std_logic_vector (25 downto 0); because the terminal count of 50,000,000 fits in 26 bits. I prefer to convert 50000000 to hexadecimal format, and it should then work without any problem.

VHDL beginner - what's going wrong wrt to timing in this circuit?

I'm very new to VHDL and hardware design and was wondering if someone could tell me if my understanding of the following problem I ran into is right.
I've been working on a simple BCD-to-7 segment display driver for the Nexys4 board - this is my VHDL code (with the headers stripped).
entity BCDTo7SegDriver is
Port ( CLK : in STD_LOGIC;
VAL : in STD_LOGIC_VECTOR (31 downto 0);
ANODE : out STD_LOGIC_VECTOR (7 downto 0);
SEGMENT : out STD_LOGIC_VECTOR (6 downto 0));
function BCD_TO_DEC7(bcd : std_logic_vector(3 downto 0))
return std_logic_vector is
begin
case bcd is
when "0000" => return "1000000";
when "0001" => return "1111001";
when "0010" => return "0100100";
when "0011" => return "0110000";
when others => return "1111111";
end case;
end BCD_TO_DEC7;
end BCDTo7SegDriver;
architecture Behavioral of BCDTo7SegDriver is
signal cur_val : std_logic_vector(31 downto 0);
signal cur_anode : unsigned(7 downto 0) := "11111101";
signal cur_seg : std_logic_vector(6 downto 0) := "0000001";
begin
process (CLK, VAL, cur_anode, cur_seg)
begin
if rising_edge(CLK) then
cur_val <= VAL;
cur_anode <= cur_anode rol 1;
ANODE <= std_logic_vector(cur_anode);
SEGMENT <= cur_seg;
end if;
-- Decode segments
case cur_anode is
when "11111110" => cur_seg <= BCD_TO_DEC7(cur_val(3 downto 0));
when "11111101" => cur_seg <= BCD_TO_DEC7(cur_val(7 downto 4));
when "11111011" => cur_seg <= BCD_TO_DEC7(cur_val(11 downto 8));
when "11110111" => cur_seg <= BCD_TO_DEC7(cur_val(15 downto 12));
when "11101111" => cur_seg <= BCD_TO_DEC7(cur_val(19 downto 16));
when "11011111" => cur_seg <= BCD_TO_DEC7(cur_val(23 downto 20));
when "10111111" => cur_seg <= BCD_TO_DEC7(cur_val(27 downto 24));
when "01111111" => cur_seg <= BCD_TO_DEC7(cur_val(31 downto 28));
when others => cur_seg <= "0011111";
end case;
end process;
end Behavioral;
Now, at first I tried to naively drive this circuit from the board clock defined in the constraints file:
## Clock signal
##Bank = 35, Pin name = IO_L12P_T1_MRCC_35, Sch name = CLK100MHZ
set_property PACKAGE_PIN E3 [get_ports clk]
set_property IOSTANDARD LVCMOS33 [get_ports clk]
create_clock -add -name sys_clk_pin -period 10.00 -waveform {0 5} [get_ports clk]
This gave me what looked like almost garbage output on the seven-segment displays - it looked like every decoded digit was being superimposed onto every digit place. Basically if bits 3 downto 0 of the value being decoded were "0001", the display was showing 8 1s in a row instead of 00000001 (but not quite - the other segments were lit but appeared dimmer).
Slowing down the clock to something more reasonable did the trick and the circuit works how I expected it to.
When I look at what elaboration gives me (I'm using Vivado 2014.1), it gives me a circuit with VAL connected to 8 RTL_ROMs in parallel (each one decoding 4 bits of the input). The outputs from these ROMs are fed into an RTL_MUX and the value of cur_anode is being used as the selector. The output of the RTL_MUX feeds the cur_val register; the cur_val and cur_anode registers are then linked to the outputs.
So, with that in mind, which part of the circuit couldn't handle the clock rate? From what I've read I feel like this is related to timing constraints that I may need to add; am I thinking along the right track?
Did your timing report indicate that you had a timing problem? It looks to me like you were just rolling through the segment values extremely fast. No matter how well you design for higher clock speeds, you're rotating cur_anode every clock cycle, and therefore your display will change accordingly. If your clock is too fast, the display will change much faster than a human would be able to read it.
Some other suggestions:
You should split your single process into separate clocked and unclocked processes. It's not that what you're doing won't end up synthesizing (obviously), but it's unconventional, and may lead to unexpected results.
Your initialization on cur_seg won't really do anything, as it's always driven (combinationally) by your process. It's not a problem - just wanted to make sure you were aware.
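One way to slow the digit rotation without creating a second clock is a periodic clock enable. The sketch below is an assumption-laden illustration rather than part of the answer: the entity name refresh_tick and the DIVIDE value (1 kHz digit rate from the 100 MHz board clock) are chosen for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity refresh_tick is
  generic (
    -- 100 MHz / 100_000 = one tick per millisecond (assumed values)
    DIVIDE : positive := 100_000
  );
  port (
    clk  : in  std_logic;
    tick : out std_logic := '0'  -- high for one clk cycle every DIVIDE cycles
  );
end entity refresh_tick;

architecture rtl of refresh_tick is
  signal count : natural range 0 to DIVIDE - 1 := 0;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if count = DIVIDE - 1 then
        count <= 0;
        tick  <= '1';
      else
        count <= count + 1;
        tick  <= '0';
      end if;
    end if;
  end process;
end architecture rtl;
```

In the driver, the rotation would then be wrapped in "if tick = '1' then ... end if;" inside the existing rising_edge(CLK) branch, so each digit stays lit for 1 ms and all eight digits are refreshed 125 times per second.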
Well there are two parts to this.
Your segments appeared so dim because you are basically running them at a 1/8 duty cycle at a faster rate than the segments have time to react (on every clock pulse you change which segment is lit up, and then you stop driving it on the next pulse).
By increasing the period, your segments got brighter because you switched from a transient current (segments need time to ramp up) to a steady-state current (a longer period lets the current reach the desired level when you drive the segments slower than their inherent driving frequency). Hence the brightness increase.
One other thing about your code. You may be aware of this, but when you latch with your clock there, the signal labeled cur_anode is advanced and actually represents the NEXT anode. You also latch ANODE and SEGMENT to the current anode and segment respectively. Just pointing out that cur_anode may be a misnomer (and is confusing because it's usually the NEXT one).
Keeping in mind Paul Seeb's and fru1bat's answers on clock speed, Paul's comment on NEXT anode, and fru1bat's suggestion on separating clocked and un-clocked processes as well as your noting that you had 8 ROMs, there are alternative architectures.
Your architecture with a ring counter for ANODE and multiple ROMs happens to be optimal for speed, which as both Paul and fru1bat note isn't needed. Instead you can optimize for area.
Because the clock speed is either external or controlled by the addition of an enable supplied periodically it isn't addressed in area optimization:
architecture foo of BCDTo7SegDriver is
signal digit: natural range 0 to 7; -- 3 bit binary counter
signal bcd: std_logic_vector (3 downto 0); -- input to ROM
begin
UNLABELED:
process (CLK)
begin
if rising_edge(CLK) then
if digit = 7 then -- integer/unsigned "+" result range
digit <= 0; -- not tied to digit range in simulation
else
digit <= digit + 1;
end if;
SEGMENT_REG:
SEGMENT <= BCD_TO_DEC7(bcd); -- single ROM look up
ANODE_REG:
for i in ANODE'range loop
if digit = i then
ANODE(i) <= '0';
else
ANODE(i) <= '1';
end if;
end loop;
end if;
end process;
BCD_MUX:
with digit select
bcd <= VAL(3 downto 0) when 0,
VAL(7 downto 4) when 1,
VAL(11 downto 8) when 2,
VAL(15 downto 12) when 3,
VAL(19 downto 16) when 4,
VAL(23 downto 20) when 5,
VAL(27 downto 24) when 6,
VAL(31 downto 28) when 7;
end architecture;
This trades off a 32 bit register (cur_val), an 8 bit ring counter (cur_anode) and seven copies of the ROM implied by function BCD_TO_DEC7 for a three bit binary counter.
In truth, the argument over whether or not you should be using separate sequential (clocked) and combinatorial (non-clocked) processes is somewhat reminiscent of Lilliput and Blefuscu going to war over Endian-ness.
Separate processes generally execute a little more efficiently due to not sharing sensitivity lists. You could also note that all concurrent statements have process or block statement equivalents. There's also nothing in this design that can take particular advantage of using variables which can result in more efficient simulation while implying a single process. (Shared variables aren't supported by XST).
I haven't verified this will synthesize but after reading through the 14.1 version of the XST user guide think it should. If not you can convert digit to a std_logic_vector with a length of 3.
The + 1 for digit will get optimized, an incrementer is smaller than a full adder.

How to take samples using fpga?

I want to take samples of digital data coming into a Spartan-3 FPGA from an external source.
I want to take 1000 samples/sec initially. How do I select a clock frequency in VHDL code?
Thanks.
Do not use a counter to generate a lower frequency clock signal.
Multiple clock frequencies in an FPGA cause a variety of design problems, some of which come under the heading of "advanced topics" and, while they can (if necessary) all be dealt with and solved, learning how to use a single fast clock is both simpler and generally better practice (synchronous design).
Instead, use whatever fast clock your FPGA board provides, and generate lower frequency timing signals from it, and - crucially - use them as clock enables, not clock signals.
DLLs, DCMs, PLLs and other clock managers do have their uses, but generating 1 kHz clock signals is generally not a good use, even if their limitations permit it. This application is just crying out for a clock enable...
Also, don't mess around with magic numbers, let the VHDL compiler do the work! I have put the timing requirements in a package, so you can share them with the testbench and anything else that needs to use them.
package timing is
-- Change the first two constants to match your system requirements...
constant Clock_Freq : real := 40.0E6;
constant Sample_Rate : real := 1000.0;
-- These are calculated from the above, so stay correct when you make changes
constant Divide : natural := natural(Clock_Freq / Sample_Rate);
-- sometimes you also need a period, e.g. in a testbench.
constant clock_period : time := 1 sec / Clock_Freq;
end package timing;
And we can write the sampler as follows:
(I have split the clock enable out into a separate process to clarify the use of clock enables, but the two processes could be easily rolled into one for some further simplification; the "sample" signal would then be unnecessary)
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.numeric_std.all;
use work.timing.all;
entity sampler is
Port (
Clock : in std_logic;
Reset : in std_logic;
ADC_In : in signed(7 downto 0);
-- signed for audio, or unsigned, depending on your app
Sampled : out signed(7 downto 0)
);
end sampler;
architecture Behavioral of Sampler is
signal Sample : std_logic;
begin
Gen_Sample : process (Clock,Reset)
variable Count : natural;
begin
if reset = '1' then
Sample <= '0';
Count := 0;
elsif rising_edge(Clock) then
Sample <= '0';
Count := Count + 1;
if Count = Divide then
Sample <= '1';
Count := 0;
end if;
end if;
end process;
Sample_Data : process (Clock)
begin
if rising_edge(Clock) then
if Sample = '1' then
Sampled <= ADC_In;
end if;
end if;
end process;
end Behavioral;
The base clock must be based on an external clock, and can't be generated just through internal resources in a Spartan-3 FPGA. If required, you can use the Spartan-3 FPGA Digital Clock Manager (DCM) resources to scale the external clock. Synthesized VHDL code in itself can't generate a clock.
Once you have some base clock at a higher frequency, for example 100 MHz, you can easily divide this down to generate an indication at 1 kHz for sampling of the external input.
It depends on what clock frequency you have available. If you have a 20 MHz clock source, you need to divide it by 20000 in order to get 1 kHz; you can do this in VHDL or use a DCM.
This is from an example on how to create a 1 kHz clock from a 20 MHz input:
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
entity clk1kHz is
    Port (
        clk_in : in STD_LOGIC;
        reset : in STD_LOGIC;
        clk_out: out STD_LOGIC
    );
end clk1kHz;
architecture Behavioral of clk1kHz is
    signal temporal: STD_LOGIC;
    -- toggle every 10000 input cycles: 20 MHz / (2 * 10000) = 1 kHz
    signal counter : integer range 0 to 9999 := 0;
begin
    frequency_divider: process (reset, clk_in) begin
        if (reset = '1') then
            temporal <= '0';
            counter <= 0;
        elsif rising_edge(clk_in) then
            if (counter = 9999) then
                temporal <= NOT(temporal);
                counter <= 0;
            else
                counter <= counter + 1;
            end if;
        end if;
    end process;
    clk_out <= temporal;
end Behavioral;

Resources