I am working with 8-bit pixel values. For ease of coding I want to use conv_integer to convert this 8-bit std_logic_vector. Will this cause any synthesis problems? Will it reduce the speed of the hardware?
No, integers synthesise just fine. Don't use conv_integer though - that's from an old non-standard library.
You want to use ieee.numeric_std.all; and then to_integer(unsigned(some_vector)).
If you still want to access the bits, and treat the vector as a number, then use the signed or unsigned type - they define vectors of bits (which can still have -, Z etc.) which behave as numbers, so you can write unsigned_vector <= unsigned_vector + 1.
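As a minimal sketch of both approaches, assuming an 8-bit pixel input (the entity and port names here are made up for illustration), the conversion and the unsigned arithmetic could look like this:

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: names and widths are illustrative only.
entity pixel_convert is
  port (
    pixel_slv : in  std_logic_vector(7 downto 0);
    pixel_int : out integer range 0 to 255;
    pixel_inc : out unsigned(7 downto 0)
  );
end entity pixel_convert;

architecture rtl of pixel_convert is
begin
  -- Treat the vector as an unsigned number, then convert to integer.
  pixel_int <= to_integer(unsigned(pixel_slv));

  -- Arithmetic directly on the unsigned type; the bits stay accessible.
  -- The result has the same width as the operand, so 255 + 1 wraps to 0.
  pixel_inc <= unsigned(pixel_slv) + 1;
end architecture rtl;
```

Constraining the integer with a range, as above, tells the synthesis tool how many bits it needs rather than leaving it to assume a full 32-bit integer.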
You will lose a lot of the functionality that comes with the standard logic vector, such as having the value 'Z' or 'X'. If you need access to the bits, leave it as std_logic_vector, or cast it to one of the numeric_std types (signed or unsigned). If you don't, and you need to do some fancy arithmetic, maybe it's better to have it as an integer. At the end of the day it's all bits. It's normally best to keep to a vector type (std_logic_vector, unsigned, signed etc.) at the top level so you can map each bit to a specific pin, but otherwise you can use whatever types you want. Don't forget you are designing hardware now, not software, and there is a difference.
I want to be able to have a shift register that does an XOR against another register loaded with some value. The issue is that I wish to do this with a large scale vector, something on the order of thousands of bits wide.
The obvious way to do this in VHDL would be something like
generic( length : integer := 15);
signal shiftreg : std_logic_vector(length downto 0);
process(clk)
begin
  if rising_edge(clk) then
    -- shift left by one, bringing input in at bit 0
    shiftreg <= shiftreg(length-1 downto 0) & input;
  end if;
end process;
However, if length here is set to some very high number, attempting to synthesize this becomes a massive undertaking. Since this is a relatively simple structure I imagine it is taking so long because the length is far beyond the number of registers in a single block.
My question is if there is some way to implement a large vector like this in a way that would be quicker to synthesize. For example, is it quicker to use something like
array(length downto 0) of std_logic;
or does a synthesis tool recognize those are equivalent?
Synthesis time is not typically relevant in FPGA design, although area utilization and timing usually are. If your shift register takes most of the resources that your target FPGA has, synthesis will take a long time trying to figure out a way to make it work, and likewise builds take longer as you fill up larger parts. For some ballpark, an 80% full design with tight timing in a modern midrange FPGA usually takes about 30 minutes to synthesize and 3 hours to place and route. This will not be significantly affected by coding style if you're still describing the same functionality.
If you describe a shift register (with the same functional features) in VHDL using std_logic_vector, a type you defined as an array of std_logic, or anything else, it will synthesize into the same thing.
In recent-ish Xilinx parts at least, a single LUT can be used as a 32-deep shift register as long as you haven't described a reset (synchronous or asynchronous). You can likewise produce a 1000-deep shift register with a few dozen LUTs.
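A rough sketch of a delay line written so that LUT-based shift registers can be inferred: there is no reset and only the last bit is read. The entity name and the DEPTH generic are assumptions for illustration, and whether a given tool actually maps this to SRLs depends on the tool, its settings, and the target part.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical delay line; assumes DEPTH >= 2.
entity delay_line is
  generic (DEPTH : positive := 1000);
  port (
    clk  : in  std_logic;
    din  : in  std_logic;
    dout : out std_logic
  );
end entity delay_line;

architecture rtl of delay_line is
  signal sr : std_logic_vector(DEPTH - 1 downto 0);
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- No reset branch: describing one would rule out SRL-style mapping.
      sr <= sr(DEPTH - 2 downto 0) & din;
    end if;
  end process;

  -- Only the final stage is observed.
  dout <= sr(DEPTH - 1);
end architecture rtl;
```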
Now if you're looking to use the whole thousand+ bits of this shift register to xor against some other register, you can't use SRLs (LUT used as a shift register) because only the final bit is accessible as an output. This makes it put the whole thing in registers which may be rather large, and could require more registers than your part has. The key thing here is that you have to think about the scale of the hardware you describe, and whether that's feasible in your target part.
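For contrast, here is a sketch of the case described above, where every bit of the shift register is used in the XOR. The names (shift_xor, mask, WIDTH) are invented for the example. Because all WIDTH bits are read every cycle, each stage has to be a real flip-flop.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity; assumes WIDTH >= 2.
entity shift_xor is
  generic (WIDTH : positive := 1024);
  port (
    clk    : in  std_logic;
    din    : in  std_logic;
    mask   : in  std_logic_vector(WIDTH - 1 downto 0);
    result : out std_logic_vector(WIDTH - 1 downto 0)
  );
end entity shift_xor;

architecture rtl of shift_xor is
  signal sr : std_logic_vector(WIDTH - 1 downto 0);
begin
  process (clk)
  begin
    if rising_edge(clk) then
      sr <= sr(WIDTH - 2 downto 0) & din;
    end if;
  end process;

  -- Every stage is tapped here, so the register costs WIDTH flip-flops
  -- plus WIDTH XOR gates.
  result <= sr xor mask;
end architecture rtl;
```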
If you want a really deep shift register, block rams can be used to act like shift registers at depths exceeding 100,000 but these have the same issue where you only access the final output.
I have a problem which is more easily solved with an HLS tool than by writing down the raw VHDL / Verilog. Currently I'm using a Xilinx Virtex-7 as I think this has been solved already by some other vendors.
I can use VHDL 2008.
So imagine in VHDL you have many calculations such as:
p1 <= a * b - c;
p2 <= p1 * d - e;
p3 <= p2 * f - g;
p4 <= p2 * p1 - p3;
Currently if I were to write this with IP Cores, it would be four DSP IP cores, and because of the different port widths, I'd have to generate this IP core 4 times. Anytime I make a change to some of these external signals, all the widths would change again. Keeping track of all this resizing is a pain, especially when resizing signed vectors down.
I have a lot of maths and thus a lot of DSP logic. It would be easier to write this block with a HLS tool. Ideally I would like it to handle the widths and bitshift the data accordingly.
Does such a tool exist? Which one would you recommend?
Bonus points:
Do any of these tools handle floating point maths and let you control precision?
There are lots of ways to accomplish your goal. But first to address your points.
Currently if I were to write this with IP Cores, it would be four DSP IP cores, and because of the different port widths, I'd have to generate this IP core 4 times.
Not necessarily. If your inputs a through g are all fixed point, you can use ieee.numeric_std or in VHDL-2008 you can use ieee.fixed_pkg. These will infer DSP cores (such as the DSP48 on Xilinx). For example:
-- Assume a, b, and c are all signed integers (or implicit fixed point)
signal a : signed(7 downto 0);
signal b : signed(7 downto 0);
signal c : signed(7 downto 0);
signal p1 : signed(a'length+b'length downto 0); -- a*b is a'length + b'length bits wide; subtracting c can add one more bit, so p1 is a'length + b'length + 1 bits.
...
p1 <= a*b - resize(c, p1'length);
This will imply multipliers and adders.
And this can be similarly done with UFIXED or SFIXED. But you do need to track the bit widths.
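A sketch of the same operation with the VHDL-2008 fixed point package, assuming a, b and c each have 4 integer bits and 4 fractional bits (the entity name and formats are chosen only for illustration). The output range follows the package's sizing rules: a multiply gives a'high + b'high + 1 downto a'low + b'low, and a subtract gives max(high) + 1 downto min(low).

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.fixed_pkg.all;  -- VHDL-2008

-- Hypothetical entity; the Q4.4 input format is an assumption.
entity fixed_mul_sub is
  port (
    a, b, c : in  sfixed(3 downto -4);
    p1      : out sfixed(8 downto -8)
  );
end entity fixed_mul_sub;

architecture rtl of fixed_mul_sub is
begin
  -- a * b      : sfixed(3+3+1 downto -4-4)          = sfixed(7 downto -8)
  -- (a*b) - c  : sfixed(max(7,3)+1 downto min(-8,-4)) = sfixed(8 downto -8)
  p1 <= a * b - c;
end architecture rtl;
```

The binary point is carried in the index range, so the package aligns c with the product for you, but the target signal still has to be declared with the right range.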
Also, there is a floating point package (ieee.float_pkg), but I would NOT recommend that for hardware. You are better off timing and resource-wise to implement it in fixed point.
Anytime I make a change to some of these external signals, all the widths would change again. Keeping track of all this resizing is a pain.
You can do this automatically. Look at my example above. You can easily determine widths based on the operations. Multiplications sum the number of bits. Additions add a single bit. So, if I have:
y <= a * b;
Then I can derive the length of y as simply a'length + b'length. It can be done. The issue, however, is bit growth. The chain of operations you describe will grow significantly if you keep full precision. At certain points you will need to truncate or round to reduce the number of bits. This is the hard part: how much error you can tolerate depends on the algorithm and the expected input data.
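One way to sketch this with numeric_std is to derive the full-precision width from the operands' 'length attributes and then keep only the top bits. The entity name and the generics (A_W, B_W, OUT_W) are assumptions for the example, and plain truncation is used here only because it is the simplest option; rounding would need extra logic.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity; assumes OUT_W <= A_W + B_W.
entity mul_trunc is
  generic (
    A_W   : positive := 8;
    B_W   : positive := 8;
    OUT_W : positive := 12
  );
  port (
    a : in  signed(A_W - 1 downto 0);
    b : in  signed(B_W - 1 downto 0);
    y : out signed(OUT_W - 1 downto 0)
  );
end entity mul_trunc;

architecture rtl of mul_trunc is
  -- Width follows the operands automatically if A_W or B_W change.
  signal full : signed(a'length + b'length - 1 downto 0);
begin
  full <= a * b;

  -- Truncate: keep the OUT_W most significant bits, drop the LSBs.
  y <= full(full'high downto full'high - OUT_W + 1);
end architecture rtl;
```

Changing A_W or B_W updates the intermediate width without touching the architecture; deciding what OUT_W should be is still the algorithm designer's call.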
I have a lot of maths and thus a lot of DSP logic. It would be easier to write this block with a HLS tool. Ideally I would like it to handle the widths and bitshift the data accordingly.
Automatic handling is the hard part. In VHDL this will not happen (nor in Verilog for that matter). But you can track it fairly well and have bit widths update as necessary. It will not, however, automatically handle things like rounding, truncation, and managing error bounds. A DSP engineer should be handling those issues and directing the RTL developer on the appropriate widths and when to round or truncate.
Does such a tool exist? Which one would you recommend?
There are a variety of options to do this at a higher level. None of these are particularly frugal with respect to resources. Matlab has a code generation tool that will convert Matlab models (suitably constructed) into RTL. It will even analyze issues such as rounding, truncation, and determine appropriate bit widths. You can control the precision, but it is fixed point. We've played with it, and found it very far from producing efficient, high-speed code.
Alternatively, Xilinx does have an HLS suite (see Vivado). I'm not all that well versed in the methodology, but as I understand it, it allows writing C code to implement algorithms. The C code is then "synthesized" to something that executes in some sort of execution engine. You still have to interface that C code to RTL infrastructure, and that's a challenge in its own right. The reason we have so far not pursued it heavily (even though we do DSP heavy designs) is that it is a big challenge to simulate both the HLS and RTL together as a system.
In the past I found FloPoCo, which generates arbitrary math functions in hardware. If I recall correctly, it supports many types of functions. For instance, it could generate an arithmetic core to compute something like a = 3*sin²(x + pi/3). For these calculations it allows you to specify the overall precision of the inputs/outputs (for floating point/fixed point) or the width of the inputs (integer). Execution frequency and whether or not to pipeline the function can also be specified.
Here is an old tutorial I found on how to use it: tutorial
I'm new to VHDL and am trying to find a way to take an n-bit (n stored as a generic) signed number and truncate it to a form that requires the minimum number of bits.
For example, if I have 5 as its 8 bit signed number (stored in a std_logic_vector of length 8) 00000101, I'd like to make a function to return 0101 as a std_logic_vector. Any ideas on how I can accomplish this?
Since you have specified that you're using a signed value, you may want to use the signed type (from the numeric_std library) instead of the more generic std_logic_vector.
If your number is a compile time constant, you can write a function starting from the leftmost bit (in a for loop for example) that counts how many identical bits it sees, then returns signed_input(8-result downto 0). The issue with this is that as a compile time constant, there isn't much advantage in removing the redundant bits. The whole vector will be optimized away in synthesis.
You might want to include special cases to make the result at least 1 bit (0 technically doesn't need any bits to represent) or 2 bits (-1 only needs the sign bit to distinguish it from 0) depending on how you want to use your signed type value.
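A possible sketch of such a function, written as a package so it can be reused. The package and function names are made up for illustration. It assumes a non-null input, returns at least 1 bit, and is mainly meaningful for constants, since the returned length depends on the value.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical package; names are illustrative.
package signed_trim_pkg is
  function min_signed_width (x : signed) return positive;
  function trim_signed (x : signed) return signed;
end package signed_trim_pkg;

package body signed_trim_pkg is

  -- Number of bits needed to hold x, sign bit included.
  function min_signed_width (x : signed) return positive is
    alias xv : signed(x'length - 1 downto 0) is x;  -- normalise the range
    variable width : positive := x'length;
  begin
    -- Walk down from just below the sign bit; each bit equal to the
    -- sign bit is redundant and can be dropped.
    for i in xv'high - 1 downto 0 loop
      exit when xv(i) /= xv(xv'high);
      width := width - 1;
    end loop;
    return width;
  end function min_signed_width;

  -- Return x with the redundant sign-extension bits removed.
  function trim_signed (x : signed) return signed is
    alias xv : signed(x'length - 1 downto 0) is x;
  begin
    return xv(min_signed_width(x) - 1 downto 0);
  end function trim_signed;

end package body signed_trim_pkg;
```

With the example from the question, trim_signed("00000101") drops the four redundant zeros below the sign bit and returns "0101". Both 0 and -1 come back as a single bit, so add the special cases mentioned above if that is not what you want.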
If your number is a real signal (the value changes during operation), you can still count the number of identical bits from the left, but variable location slicing of the vector will be iffy. Are you trying to pack the most of several numbers into a fixed bit width? Doing that will synthesize into multiplexers for each bit as well as the LUTs used for calculating the number of redundant bits for each of the numbers.
I have two numbers A and B, both of different sizes, and I need to multiply them using VHDL. I don't know the exact logic to multiply them.
If you are trying to multiply two std_logic_vector values, then * will fail,
since std_logic_vector is just an array of std_logic elements and does not
have an inherent numerical representation.
So take a look at the ieee.numeric_std VHDL package. This defines unsigned
and signed types that assume a typical numerical representation of an array,
along with operators on these types, including *. Using this package you can do:
use ieee.numeric_std.all;
...
c <= std_logic_vector(unsigned(a) * unsigned(b));
Note that for * the c'length is a'length + b'length.
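Put together as a complete design unit, a sketch might look like the following. The entity name and the widths (8 and 5 bits) are assumptions chosen to show operands of different sizes, and the inputs are treated as unsigned; use signed(...) instead if they are two's complement.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity; widths are illustrative.
entity multiply is
  generic (
    A_W : positive := 8;
    B_W : positive := 5
  );
  port (
    a : in  std_logic_vector(A_W - 1 downto 0);
    b : in  std_logic_vector(B_W - 1 downto 0);
    c : out std_logic_vector(A_W + B_W - 1 downto 0)
  );
end entity multiply;

architecture rtl of multiply is
begin
  -- The product of two unsigned operands is a'length + b'length bits wide,
  -- which matches the declared width of c.
  c <= std_logic_vector(unsigned(a) * unsigned(b));
end architecture rtl;
```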
Btw. welcome to Stack Overflow, and please spend some time in the Stack Overflow
Help Center, so you can get better answers in
the future, and avoid being voted down or having your question closed.
I'm creating a full adder with a variable number of bits. I've got a component that is a half-adder which takes in three inputs (the two bits to add, and a carry in bit) and gives 2 outputs (one bit output and a carry out bit).
I need to tie the carry out of one half-adder to the carry in of another. And I need to do this a variable number of times (if I'm adding 4 digit numbers, I'll need 4 half adders. If I'm doing 32 bit numbers, I'll need 32 half adders).
I was going to tie the carry outs of one half-adder to the carry in of another using signals, but I don't know how to create a variable number of signals.
I can instantiate a variable number of half-adders using a for-loop in a process, but since signals are defined outside of processes, I can't use a for loop for it. I don't know how I should tie the half-adders together.
The easiest way to write an adder in VHDL is not to worry about full adders and half adders, but just type:
a <= b + c;
where a,b and c are signed or unsigned
95% of the time, the synthesis tools will do a better job than you would.
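A sketch of what that looks like as a parameterised adder, assuming unsigned operands and a generic WIDTH (both names and the extra carry bit on the output are choices made for this example):

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity; the sum is one bit wider to keep the carry out.
entity adder is
  generic (WIDTH : positive := 32);
  port (
    a   : in  unsigned(WIDTH - 1 downto 0);
    b   : in  unsigned(WIDTH - 1 downto 0);
    sum : out unsigned(WIDTH downto 0)
  );
end entity adder;

architecture rtl of adder is
begin
  -- numeric_std "+" returns the width of the wider operand, so both
  -- operands are extended by one bit first to preserve the carry.
  sum <= resize(a, WIDTH + 1) + resize(b, WIDTH + 1);
end architecture rtl;
```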
I think you want variable-width signals not variable numbers of signals
Your signals need to be std_logic_vector(31 downto 0) for example - and then you wire up the bits of those signals to your half-adders appropriately.
Of course, as those signals are numbers, don't use std_logic_vector; use signed or unsigned (and the ieee.numeric_std package).
And (as Philippe rightly points out) unless this is a learning exercise, just use the + operator.
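If it is a learning exercise, a sketch of the structural version could use one carry vector that is one bit longer than the operands, with a for-generate statement (not a loop in a process) creating the cells. The adder_cell entity below is a stand-in for the asker's three-input component; all names are invented for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Stand-in for the existing one-bit cell (a, b, carry in -> sum, carry out).
entity adder_cell is
  port (
    a, b, cin : in  std_logic;
    s, cout   : out std_logic
  );
end entity adder_cell;

architecture rtl of adder_cell is
begin
  s    <= a xor b xor cin;
  cout <= (a and b) or (cin and (a xor b));
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

entity ripple_adder is
  generic (WIDTH : positive := 4);
  port (
    a, b : in  std_logic_vector(WIDTH - 1 downto 0);
    cin  : in  std_logic;
    sum  : out std_logic_vector(WIDTH - 1 downto 0);
    cout : out std_logic
  );
end entity ripple_adder;

architecture structural of ripple_adder is
  -- One signal, WIDTH + 1 bits: carry(i) feeds cell i, carry(i+1) leaves it.
  signal carry : std_logic_vector(WIDTH downto 0);
begin
  carry(0) <= cin;

  gen_cells : for i in 0 to WIDTH - 1 generate
    cell : entity work.adder_cell
      port map (
        a    => a(i),
        b    => b(i),
        cin  => carry(i),
        s    => sum(i),
        cout => carry(i + 1)
      );
  end generate gen_cells;

  cout <= carry(WIDTH);
end architecture structural;
```

The variable number of carry signals becomes a single vector whose width comes from the generic, which is the variable-width signal described above.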