Round Robin gate-level diagram - fpga

May I know why the 'priority' signal is fed back to the AND gate input?
What is the purpose of the AND gate in the picture below?
From googling, I also found this article, but I am not sure whether the AND gate in that article serves a similar purpose.
The article's implementation uses a mask vector, which seems a bit strange and also complicated in terms of hardware resources.

The 'priority' signal is fed back so that the given priority stays on for multiple cycles, since the registers are not conditionally clocked.
So, if priority 1 is high and all the grant inputs are low, it will stay high forever.
Better wording would be: it is looped back into the AND gate so that it stays on indefinitely, and the AND gate is there to cut it off when a grant input becomes high.
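To make the hold-and-clear behaviour concrete, here is a minimal behavioural sketch in Python. It is not taken from the diagram in the question: the function names and the optional `load` input (a hypothetical path that sets the bit) are assumptions for illustration only.

```python
def next_priority(priority, grants, load=False):
    """Value of one priority register after a clock edge.

    priority: current register output (the fed-back signal)
    grants:   iterable of grant inputs
    load:     hypothetical set path, not taken from the original diagram
    """
    # The AND gate: the fed-back bit survives only while no grant is high.
    hold = priority and not any(grants)
    return hold or load


def simulate(initial, grant_sequence):
    """Clock the register once per entry in grant_sequence and record it."""
    trace = []
    priority = initial
    for grants in grant_sequence:
        priority = next_priority(priority, grants)
        trace.append(priority)
    return trace
```

With the register initially high, two idle cycles followed by a grant keep the bit high, high, then low, and it stays low afterwards because nothing sets it again.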

Related

What are the differences in delay times of the basic AND, OR, NOT, NAND, NOR, XOR, XNOR gates?

1-1 What are the differences in delay times of the basic logic gates?
I found that NAND and NOR gates are preferred in digital circuit design for their shorter delay times, and that AND and OR gates might even be implemented with NOT and NAND/NOR gates.
1-2 Are there set or known differences in delay time between AND, OR, and NOT gates?
For a typical FPGA (LUT-based logic elements) there's no difference at all.
A single cell can implement a complex function based on its resulting truth table, and multiple expressions might be folded into a single cell, so you wouldn't even find individual and/or/not "gates".
It might be different for an ASIC, I don't know. But in a typical FPGA you don't have gates; there are RAM-based lookup tables implementing complex functions of their inputs (4-6 inputs, not just 2).
You'll find that in a big enough design the routing delays are much higher than the delays in a single logic cell.
If you look at how these different gates are constructed you can see some of the reasons for differences. An inverter consists of one pull-up transistor and one pull down transistor. This is the simplest gate and is therefore potentially the fastest. A NAND has two pull-down devices in series and two pull-up transistors in parallel. The NOR is basically the opposite of the NAND. And yes: AND is usually just NAND + inverter.
The on-resistance of a path will be higher with two transistors in series (making it slower), and the number of transistors connected to a single node will increase the capacitive load (making it slower). You can make things faster by using larger transistors (with lower on-resistance), but that increases the load on whatever cell is driving it, which slows that cell down.
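As a rough illustration of the series-resistance argument, here is a first-order RC estimate in Python. It uses the standard approximation that an RC node reaches 50% of its swing after ln(2)·R·C. The resistance and capacitance values are made-up placeholders, and the model ignores the extra node capacitance that additional transistors add, so treat it as a sketch rather than a timing model.

```python
import math


def rc_delay(r_on_ohms, series_count, c_load_farads):
    """First-order 50% delay of an RC discharge: ln(2) * R_total * C."""
    r_total = r_on_ohms * series_count  # series transistors add resistance
    return math.log(2) * r_total * c_load_farads


R_ON = 10e3     # assumed on-resistance of one transistor, in ohms
C_LOAD = 2e-15  # assumed load capacitance, in farads

inverter_delay = rc_delay(R_ON, 1, C_LOAD)  # one pull-down device
nand2_delay = rc_delay(R_ON, 2, C_LOAD)     # two pull-down devices in series
```

Under these assumptions the NAND pull-down path is twice as slow as the inverter's. Halving `R_ON` (a larger transistor) restores the delay, at the cost of presenting a larger load to the previous stage.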
It is a big optimization problem which you probably shouldn't try to solve yourself. That is what the EDA tools are for.
Like most answers in life, it depends. There are many ways to build each type of logic gate, and different types of transistors can be used to make each type of gate. You can build all gates from multiple universal gates like NAND and NOR, so the other gates would have a larger delay time. BJT transistors will have a larger delay than MOSFET transistors. You can also use Schottky transistors to reduce delays compared to BJT. If you use an IC there are lots of components within the chip, some of which may reduce delays and some of which may increase them. So you really have to compare what you are working with. Here is a video that shows the design of logic gates at the transistor level. https://youtu.be/nB6724G3b3E

Racing/ S-R Circuits?

The following truth table resulted from the circuit below. An SR (NOR) latch is used. I have tried several times to trace through the circuit to see how the truth table values are produced, but it's not working. Can someone explain to me what is going on? This circuit was introduced in conjunction with racing, although I am not sure if it has anything to do with it.
NOTE: "CLOCK" appears as a straight line to show how it is connected to everything. It is a normal clock that oscillates between 1 and 0. (This is how my instructor drew it.)
Strictly, this does belong on EE. The other questions you've found are likely to be old - before EE was established.
You should look at the 1-to-0 transitions of the clock. When that occurs and only when that occurs, the value currently on S is transferred to Q.
The race condition appears when the clock signal is delayed, even by the tiny amount of copper track between real components. The actual waveform is not an instantaneous 1-0 or 0-1 step; it ramps between the two values. A tiny variation between two components, one seeing the transition at, say, 2.7V and the other at 2.5V, would mean that the first component moves the value from S to Q fractionally before the second, so when the second component decides to transfer the value, it may see the value after the transfer has occurred on the prior component. You therefore may have a race between the two. These delays can also be affected by supply-rail stability and temperature, so the whole arrangement can become unreliable if not carefully designed. The condition is often overcome by deliberately routing the clock so that it will arrive at the last component in the chain first, giving that end of the chain a head start.
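The race described above can be sketched with a very simplified two-stage model in Python. Everything here is an assumption made for illustration: the function name, the abstract time units, and the idea of reducing each component to a single clock-to-output delay. It ignores setup and hold times and the analogue ramp entirely.

```python
def shift_once(d_in, a, b, skew_b, clk_to_q):
    """Return the new (a, b) contents of a two-stage chain after one clock edge.

    a, b:     current outputs of the first and second stage
    skew_b:   how much later stage B sees the clock edge than stage A
              (negative means B sees the edge first)
    clk_to_q: time stage A needs to update its output after its edge
    """
    new_a = d_in
    # Stage B samples A's output when its own edge arrives. If that is after
    # A has already updated, B captures the NEW value instead of the old one.
    a_seen_by_b = new_a if skew_b > clk_to_q else a
    return new_a, a_seen_by_b


# No skew: data moves one stage per clock, as intended.
ok = shift_once(1, 0, 0, skew_b=0.0, clk_to_q=1.0)
# Clock reaches B too late: the new bit races through both stages at once.
raced = shift_once(1, 0, 0, skew_b=2.0, clk_to_q=1.0)
# Clock routed to reach B first (the "head start"): still correct.
head_start = shift_once(1, 0, 0, skew_b=-0.5, clk_to_q=1.0)
```

In this model a faster first stage (smaller `clk_to_q`) makes the race more likely for the same skew, which matches the experience of a faster replacement part breaking a working circuit.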
I've worked on systems where replacing a component with a faster version caused the circuit to stop working. The new component was working too fast for the remainder of the circuit - and you needed to deliberately select (or use factory-selected) slower versions.
On a related note, before hard drives became cheap, and floppy drives before them (you may need to google that), it was common to use cassette tapes (even more likely you'd need to google those). Cheap and cheerful was best. If you used a professional-quality recorder/player, you'd often get unusable results.

Arrays as buffer VHDL

I need to create a FIFO buffer in VHDL. I need to use a 2-dimensional array to store data, like (number of data)(n-bit data).
Suppose I create a single "big" array that stores, for example, 1000 entries. On every input data clock I store one slot, and on every output data clock I output one item. What happens if these two clocks occur at nearly the same time?
For example:
    if rising_edge(INPUT_DATA) then
        Register_Array(Counter_IN) <= DataIN;
        Counter_IN <= Counter_IN + 1;
    end if;
    if rising_edge(OUTPUT_DATA) then
        DataOUT <= Register_Array(Counter_OUT);
        Counter_OUT <= Counter_OUT + 1;
    end if;
If it's possible to create a process like this, what happens if the two clocks occur at nearly the same time?
Consider I can't lose any data.
What you are asking about here is a clock domain crossing FIFO, or CDC FIFO.
Clock domain crossing FIFOs are surprisingly difficult to design. There are many pitfalls, and most of them cannot be checked by simulation.
As for your arrays, you should use arrays of std_logic_vector, like in the answer linked to by @Nicolas Roudel.
But this is still far from a functioning CDC FIFO. You also need read and write pointers in gray format, gray to bin pointer conversion, clock domain crossings for the two gray pointers, empty and full indications, read and write signals, proper attributes to prevent the synthesizer from breaking the clock domain crossings, and timing constraints.
All this is needed to properly protect against exactly the thing you ask about: "What happens when two clocks occur at almost the same time?"
The thing that happens when two clocks occur at almost the same time is called "metastability", and it will cause all kinds of bad and unpredictable things in your design.
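One reason the pointers are kept in Gray format is that only one bit changes per increment, so a pointer sampled in the other clock domain mid-change is either the old or the new value, never an unrelated one. Here is a small Python sketch of the two conversions, just to show the arithmetic. The function names are my own, and this models only the encoding, not the synchroniser registers, attributes, or timing constraints that a real CDC FIFO still needs.

```python
def bin_to_gray(n):
    """Convert a binary count to its Gray-code equivalent."""
    return n ^ (n >> 1)


def gray_to_bin(g):
    """Convert a Gray-code value back to binary by XOR-ing all right shifts."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


DEPTH = 16  # assumed FIFO depth; a power of two so the wrap-around also works

# Number of bits that differ between each pointer value and the next one.
bit_changes = [
    bin(bin_to_gray(i) ^ bin_to_gray((i + 1) % DEPTH)).count("1")
    for i in range(DEPTH)
]
```

Every entry in `bit_changes` is 1, including the wrap from the last address back to zero, which is why depths that are powers of two are the usual choice for this scheme.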
If you get only one thing in the design of the CDC FIFO wrong, your design will likely work fine in simulation, and even in hardware. Most of the time........ :-)
All FPGA vendors have ready-made CDC FIFOs which you can use. I would highly recommend that beginners consider using the ready-made FIFOs for production designs.
But at the same time, designing a CDC FIFO is a nice challenge for learning about clock domain crossings and metastability.
This is one of many pages where you can find information about how to handle clock domain crossings: https://filebox.ece.vt.edu/~athanas/4514/ledadoc/html/pol_cdc.html
There is also a related stackexchange answer here: https://electronics.stackexchange.com/questions/97280/trying-to-understand-fifo-in-hardware-context

Design tips for synchronising signals through a VHDL pipeline

I am designing a video pixel data processing pipeline in VHDL which involves several steps including multiply and divide.
I want to keep signals synchronised so that I can e.g. maintain a sync signal and output it correctly at the end of the pipeline along with manipulated pixel data which has been through several processing stages.
I assume I want to use shift registers or something to delay signals by the right number of cycles so that the output is correct, but I'm looking for advice about good ways to design this, particularly as the number of pipeline stages for different signals may vary as I evolve the design.
Good question.
I'm not aware of a complete solution but here are two partial strategies...
Interconnecting components... It would be really nice if a component could export a generic whose value was its pipeline depth. Unfortunately you can't, and dedicating a port to this seems silly (though it's probably workable; as it would be an integer constant, it would disappear in synthesis)
Failing that, pass IN a generic indicating the budget for this module. Inside the module, assert (severity FAILURE) if the budget can't be met... (this assert is checkable at synth time and at least Xilinx XST handles similar asserts)
Make the budget a hard number, and either assert if not equal to actual pipeline depth, or add pipe stages inside the module if the budget is too large, and only assert if the budget is too small.
That way you are connecting predictable modules, and the top level can perform pipeline arithmetic to balance things (e.g. passing a computed constant value to a programmable delay line)
Within a component... I use a single process, with registers represented as internal signals whose names reflect their pipe stage, exponent_1, exponent_2, exponent_3 and so on. Within the process, the first section describes all the actions for the first cycle, the second section describes the second cycle, and so on. Typically the "easier" paths may be copied verbatim to the next pipe stage, just to sync them with the critical path. The process is fairly organised and easy to maintain.
I might break a 32-bit multiply down into 16*16 chunks and pipeline the partial product additions. The control this gives, USED to give better results than XST gave alone...
I know some people prefer variables within a process, and I use them for intermediate results in a pipe stage, but using signals I can describe the pipeline in its natural order (thanks to postponed assignment) whereas using variables, I would have to describe it backwards!
I create a package for each of my major processing blocks, one of the constants in there is the processing delay of that block. I can then connect that up to my general-purpose "delay-line" block which has a generic for the number of cycles.
Keeping that constant in "sync" with the actual implementation is best done by a self-checking testbench.
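The idea of a per-block delay constant, a generic delay line, and a self-checking test can be sketched as a behavioural model in Python. This is not VHDL and not anyone's actual package: `DelayLine`, `Doubler`, `PROCESSING_DELAY` and `measure_latency` are hypothetical names, and the "processing block" is a stand-in that just doubles its input after three cycles.

```python
from collections import deque


class DelayLine:
    """Back-to-back registers: output is the input from `cycles` clocks ago."""

    def __init__(self, cycles, fill=0):
        self.regs = deque([fill] * cycles)

    def clock(self, value):
        self.regs.append(value)
        return self.regs.popleft()  # with cycles == 0 this is a pass-through


PROCESSING_DELAY = 3  # hypothetical constant published alongside the block


class Doubler:
    """Stand-in processing block with PROCESSING_DELAY cycles of latency."""

    def __init__(self):
        self._pipe = DelayLine(PROCESSING_DELAY)

    def clock(self, x):
        return 2 * self._pipe.clock(x)


def measure_latency(clock_fn, max_cycles=64):
    """Feed an impulse and return the cycle on which it reaches the output."""
    for cycle in range(max_cycles):
        if clock_fn(1 if cycle == 0 else 0):
            return cycle
    raise RuntimeError("impulse never reached the output")


# Self-check: the published constant must match the measured latency.
assert measure_latency(Doubler().clock) == PROCESSING_DELAY
```

A sync flag pushed through `DelayLine(PROCESSING_DELAY)` then comes out on the same cycle as the processed pixel, and the assertion fails as soon as someone adds a pipe stage without updating the constant.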
Something to consider is delay lines (i.e. back to back registers) vs FIFOs.
Consider a module X with a pipeline delay N. FIFOs work well when N is variable. The trick is remembering that you can only request new work when both the module and the FIFO can accept it. Ideally you size the FIFO so that it can contain the maximum number of items that X can work on concurrently, but sometimes that's not practical, for example if your calculation includes accesses to a distant memory.
Another option is integrating the side channel (i.e. the path that your sync flag is taking) into the module X rather than it going outside. If you do this then if any part of the calculation has to stall, you can also stall the side channel and the two stay in sync. You can do this because you're in a scope that has all the necessary signals in it. Then all signals, whether used in the calculation or not, appear at the output at the same time.

How to detect thrashing in accelerometer?

I'm writing an application controlled by an accelerometer built into a wrist watch. I want one of the commands to be "wildly swinging your forehand". How do I detect it and measure for how long it goes?
To supplement John Fisher's suggestion, I would add: Look at analyzing this with spectral/Fourier transform techniques. I would expect to see a strong signal characteristic at low frequencies, but it could easily vary from user to user.
If the characteristic is there, signal processing techniques can help you isolate it and detect it.
Write something to record the measurements while you flail your arm around. Do it several times in different ways. Analyze the measurements for a pattern you can use.
Use a Kalman filter to track a moving average of the absolute value of the current acceleration? Or, if you can, the current acceleration minus gravity?
If that goes over a threshold, that indicates a lot of acceleration has been happening recently. That suggests thrashing.
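Here is a minimal Python sketch of that threshold idea, which also measures how long each episode lasts. To keep it short it swaps in a plain exponential moving average rather than a Kalman filter, and it approximates "acceleration minus gravity" as the vector magnitude minus 9.81 m/s², which is crude. The function name, threshold, and smoothing factor are all assumptions you would need to tune against recorded data.

```python
import math

GRAVITY = 9.81  # m/s^2


def detect_thrashing(samples, dt, threshold=5.0, alpha=0.2):
    """Return a list of (start_seconds, duration_seconds) thrashing episodes.

    samples:   list of (ax, ay, az) readings in m/s^2
    dt:        time between samples, in seconds
    threshold: assumed level the smoothed excess acceleration must exceed
    alpha:     assumed smoothing factor of the exponential moving average
    """
    avg = 0.0
    start = None
    episodes = []
    for i, (ax, ay, az) in enumerate(samples):
        # Crude gravity removal: magnitude of the reading minus 1 g.
        excess = abs(math.sqrt(ax * ax + ay * ay + az * az) - GRAVITY)
        avg += alpha * (excess - avg)  # exponential moving average
        t = i * dt
        if avg > threshold and start is None:
            start = t                      # episode begins
        elif avg <= threshold and start is not None:
            episodes.append((start, t - start))  # episode ends
            start = None
    if start is not None:  # still thrashing when the recording ends
        episodes.append((start, len(samples) * dt - start))
    return episodes
```

Because of the smoothing, both the detected start and the detected end lag the real motion by a few samples; a larger `alpha` reacts faster but is easier to trigger with a single jolt.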
