Storing a number bigger than the integer limit in VHDL

Let me explain my problem with an example.
I have two variables
a=74686 and b=20930625.
I want to store
c = (a x 2^16) + b.
This exceeds the integer limit (32 bits) in VHDL.
It is okay for me to store c in two separate registers, say c1 and c2, and tell users to concatenate the bits of c1 and c2 to get the actual result, i.e. I want to store the lower 32 bits of c in c1 and the remaining bits in c2.
How can I do this?
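The arithmetic being described is easy to check in plain Python, whose integers are unbounded (a sketch, with c1 and c2 named as in the question):

a = 74686
b = 20930625
c = (a << 16) + b            # a * 2**16 + b; wider than 32 bits

c1 = c & 0xFFFFFFFF          # lower 32 bits -> register c1
c2 = c >> 32                 # remaining upper bits -> register c2

assert (c2 << 32) | c1 == c  # concatenating c2 and c1 recovers c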


Understanding Modified Baugh-Wooley multiplication algorithm

For the Modified Baugh-Wooley multiplication algorithm, why is it !(A0*B5) instead of just (A0*B5)?
Same question for !(A1*B5), !(A2*B5), !(A3*B5), !(A4*B5), !(A5*B4), !(A5*B3), !(A5*B2), !(A5*B1) and !(A5*B0).
Besides, why are there two extra '1's?
In signed 6-bit 2's-complement notation, the place values of the bits are:
-32 16 8 4 2 1
Notice that the top bit has a negative value. When addition, subtraction, and multiplication are performed mod 64, however, that minus sign makes absolutely no difference to how those operations work, because 32 = -32 mod 64.
Your multiplication is not being performed mod 64, though, so that sign must be taken into account.
One way to think of your multiplication is that the 6-bit numbers are extended to 12 bits, and multiplication is then performed mod 4096. When extending a signed number, the top bit is replicated, so -32 becomes -2048 + 1024 + 512 ... +32, which all together has the same value of -32. So extend the signed numbers and multiply. I'll do it with 3 bits, multiplying mod 64:
Given:      Sign-extended:
A2 A1 A0    A2 A2 A2 A2 A1 A0
B2 B1 B0    B2 B2 B2 B2 B1 B0
Multiply:
A0B2 A0B2 A0B2 A0B2 A0B1 A0B0
A1B2 A1B2 A1B2 A1B1 A1B0
A2B2 A2B2 A2B1 A2B0
A2B2 A2B1 A2B0
A2B1 A2B0
A2B0
Since we replicated the same bits in multiple positions, you'll see the same bit products at multiple positions.
A0B2 appears 4 times with total place value 60 or 15<<2, and so on. Let's write the multipliers in:
                A0B2*15  A0B1     A0B0
        A1B2*7  A1B1     A1B0
A2B2*5  A2B1*7  A2B0*15
Again, because of modular arithmetic, the *15s and *7s are the same as *-1, and the *5 is the same as *1:
                -A0B2   A0B1    A0B0
        -A1B2   A1B1    A1B0
A2B2    -A2B1   -A2B0
That pattern is starting to look familiar. Now, of course -1 is not a bit value, but ~A0B2 = 1-A0B2, so we can translate -A0B2 into ~A0B2 and then subtract the extra 1 we added. If we do this for all the subtracted products:
                ~A0B2   A0B1    A0B0
        ~A1B2   A1B1    A1B0
A2B2    ~A2B1   ~A2B0
        -2      -2
If we add up the place values of those -2s and expand them into the equivalent bits, we discover the source of the additional 1s in your diagram:
                        ~A0B2   A0B1    A0B0
                ~A1B2   A1B1    A1B0
        A2B2    ~A2B1   ~A2B0
1               1
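As a sanity check, that final array can be brute-forced over all 3-bit operands. Here is a Python sketch (bw3 is just an illustrative name) that evaluates the array, with its complemented bits and the two constant 1s at bit positions 5 and 3, and compares against the true product mod 64:

def bw3(a, b):
    # a, b are 3-bit two's-complement values given as integers 0..7.
    A = [(a >> i) & 1 for i in range(3)]
    B = [(b >> i) & 1 for i in range(3)]
    inv = lambda x: 1 - x                # one-bit complement (~)
    s = ((A[0] & B[0])
         + ((A[0] & B[1]) + (A[1] & B[0])) * 2
         + (inv(A[0] & B[2]) + (A[1] & B[1]) + inv(A[2] & B[0])) * 4
         + (inv(A[1] & B[2]) + inv(A[2] & B[1]) + 1) * 8   # extra 1 at bit 3
         + (A[2] & B[2]) * 16
         + 32)                                             # extra 1 at bit 5
    return s % 64

for a in range(-4, 4):
    for b in range(-4, 4):
        assert bw3(a & 7, b & 7) == (a * b) % 64           # holds for all pairs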
Why two extra '1's?
See the explanation in Matt Timmermans's answer.
Note: '-2' in two's complement is 110, and this contributes to the carries, hence the two extra '1's.
Why flip the values of some of the partial-product bits?
It is due to the sign bit in the MSB (A5 and B5).
Besides, please see below the countermeasure for the modified Baugh-Wooley algorithm in the case of A_WIDTH != B_WIDTH, worked out with the help of others.
I have written hardware Verilog code for this algorithm.
Hopefully, this post helps some readers.
The short answer is that's because of how 2's-complement representation works: the top bit is effectively a sign bit, so a 1 there means minus. In other words you have to subtract
A5*(B4 B3 B2 B1 B0) << 5
and
B5*(A4 A3 A2 A1 A0) << 5
from the sum (note that A5*B5 is added again because both have the same minus sign). And those two 1s are the result of substituting those two subtractions with additions of -X.
If you need more details, then you probably just need to re-read how 2's complement works and then the whole math behind the Baugh-Wooley multiplication algorithm. It is not that complicated.

Check if a vector lies in the span a subset of columns of a matrix in Sage

I'm new to programming with Sage. I have a rectangular R*C matrix M (R rows and C columns), and the rank of M is (possibly) smaller than both R and C. I want to check if a target vector T is in the span of a subset of columns of M. I have written the following code in Sage (I haven't included the whole code because the way I get M and T is rather cumbersome). I just want to check if the code does what I want.
Briefly, this is what my code is trying to do: M is my given matrix. I first check that T is indeed in the span of the columns of M (the first if condition). If it is, I proceed to trim down M (which had C columns) to a matrix M1 which has exactly rank(M) many columns (this is what the first while loop does). After that, I keep removing columns one by one to check if the remaining columns still contain T in their span (this is the second while loop). In the second while loop, I first remove a column from M2 (which is essentially a copy of M1) and call this matrix M3. To M3, I augment the vector T and check if the rank decreases. Since T was already in the span of M2, rank([M2 T]) should be the same as rank(M2). Now, if removing a column c and augmenting T doesn't decrease the rank, then I know that c is not necessary to generate T. This way I only keep those columns that are necessary to generate T.
It does return correct answers for the examples I tried, but I am going to run this code on a matrix with entries which vary a lot in magnitude (say the maximum is as large as 20^20 and the minimum is 1), and typically the matrix dimensions could go up to 300. So I am planning to run it over a set of a few hundred test cases over the weekend. It'll be really helpful if you can tell me if something looks fishy/wrong -- e.g., will I run into precision errors? How should I modify my code so that it works for all values/ranges as mentioned above? Also, if there is any way to speed up my code (or write the same thing in a shorter/nicer way), I'd like to know.
R = 155
C = 167
T = vector(QQ, R)
M1 = matrix(ZZ, R, C)
M1 = M
C1 = C
i2 = 0
if rank(M.augment(T)) == rank(M):
    print("The rank of M is")
    print(rank(M))
    while i2 < C1:
        if rank(M1.delete_columns([i2])) == rank(M1):
            M1 = M1.delete_columns([i2])
            C1 = C1 - 1
        else:
            i2 = i2 + 1
    C2 = M1.ncols()
    print("The number of columns in the trimmed down matrix M1 is")
    print(C2)
    i3 = 0
    M2 = M1
    print("The rank of M1 which is now also the rank of M2 is")
    print(rank(M2))
    while i3 < C2:
        M3 = M2.delete_columns([i3])
        if rank(M3.augment(T)) < rank(M2):
            M2 = M3
            C2 = C2 - 1
        else:
            i3 = i3 + 1
print("Rank of matrix M is")
print(M.rank())
If I wanted to use Sage to decide whether a vector T was in the image of a matrix M1 constructed from some subset of columns of another matrix M, I would do this:
M1 = M.matrix_from_columns([list of indices of the columns to use])
T in M1.column_space()
or use a while loop to modify M1 each time, as you do. (But I think T in M1.column_space() should work better than testing equality of ranks.)
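For instance, the greedy trimming loop could be condensed along those lines (an untested sketch; trim_columns is an illustrative name; it drops any column whose removal still leaves T in the span):

def trim_columns(M, T):
    # If M is over ZZ, consider M = M.change_ring(QQ) first, so that the
    # span is taken over the rationals rather than as a ZZ-module.
    cols = list(range(M.ncols()))
    i = 0
    while i < len(cols):
        rest = cols[:i] + cols[i+1:]
        if T in M.matrix_from_columns(rest).column_space():
            cols = rest      # column cols[i] is not needed to generate T
        else:
            i += 1           # column is needed; keep it and move on
    return cols              # indices of the columns that were kept

Either way, since the entries live in ZZ or QQ, Sage does these rank and span computations in exact rational arithmetic, so the wide range of magnitudes you describe costs time, not precision.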

Pre-processing data for existing compressor

I have an existing "compression" algorithm, which compresses an association into ranges. Something like
type Assoc = [(P,C)]
rangeCompress :: Assoc -> [(P, [(C,C)])]
Read P as "product" and C as "code". The result is a list of products, each associated with a list of code-ranges. To find the product associated with a given code, one traverses the compressed data until one finds a range the given code falls into.
This mechanism works well if consecutive codes are likely to belong to the same product. However, if the codes of different products interleave, they no longer form compact ranges and I end up with lots of ranges where the upper bound is equal to the lower bound, and nil compression.
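For concreteness, the lookup over the compressed form might look like this (a Python sketch; the names are illustrative):

def lookup(compressed, code):
    # compressed: [(product, [(lo, hi), ...])], as produced by rangeCompress
    for product, ranges in compressed:
        if any(lo <= code <= hi for lo, hi in ranges):
            return product
    return None                  # code has no associated product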
What I am looking for is a pre-compressor, which looks at the original association and determines a "good enough" transformation of codes, such that the association expressed in terms of transformed codes can be range-compressed into compact ranges. Something like
preCompress :: Assoc -> (C->C)
or more fine-grained (by Product)
preCompress :: Assoc -> P -> (C->C)
In that case a product lookup would have to first transform the code in question and then do the lookup as before. Therefore the transformation must be expressible by a handful of parameters, which would have to be attached to the compressed data, either once for the entire association or by product.
I checked some compression algorithms, but they all seem to focus on reconstructing the original data (which is not strictly needed here), while being totally free about the way they store the compressed data. In my case, however, the compressed data must be ranges, only enriched by the parameters of the pre-compression.
Is this a known problem?
Does it have a solution?
Where to look next?
Please note:
I am not primarily interested in restoring the original data
I am primarily interested in the product lookup
The number of codes is approx. 7,000,000
The number of products is approx. 200
Assuming that for each code there is only one product, you can encode the data as a list (string) of products. Let's pretend we have the following list of products and codes randomly generated from 9 products (or no product) and 10 codes.
[(P6,C2),
(P1,C4),
(P2,C10),
(P3,C9),
(P3,C1),
(P4,C7),
(P6,C8),
(P5,C3),
(P1,C5)]
If we sort them by code we have
[(P3,C1),
(P6,C2),
(P5,C3),
(P1,C4),
(P1,C5),
-- C6 didn't have a product
(P4,C7),
(P6,C8),
(P3,C9),
(P2,C10)]
We can convert this into a string of products + nothing (N). The position in the string determines the code.
{- C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 -}
[ P3, P6, P5, P1, P1, N , P4, P6, P3, P2 ]
A string of symbols in some alphabet (in this case products+nothing) puts us squarely in the realm of well-studied string compression problems.
If we run-length encode this list, we have an encoding similar to the encoding you originally presented. For each range of associations we store a single product+nothing and the run length. We only need a single small integer for the run length instead of two (possibly large) codes for the interval.
{- P3 , P6, P5, P1, P1, N , P4, P6, P3, P2 -}
[ (P3, 1), (P6, 1), (P5, 1), (P1, 2), (N, 1), (P4, 1), (P6, 1), (P3, 1), (P2, 1) ]
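That run-length step is a one-liner in, say, Python (a sketch using itertools.groupby):

from itertools import groupby

def run_length_encode(products):
    # Collapse consecutive equal symbols into (symbol, run_length) pairs.
    return [(p, len(list(g))) for p, g in groupby(products)]

run_length_encode(["P3","P6","P5","P1","P1","N","P4","P6","P3","P2"])
# [("P3",1), ("P6",1), ("P5",1), ("P1",2), ("N",1),
#  ("P4",1), ("P6",1), ("P3",1), ("P2",1)]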
We can serialize this to a string of bytes and use any of the existing compression libraries on bytes to perform the actual compression. Some compression libraries, such as the frequently used zlib, are found in the codec category.
Sorted the other way
We'll take the same data from before and sort it by product instead of by code
[(P1,C4),
(P1,C5),
(P2,C10),
(P3,C1),
(P3,C9),
(P4,C7),
(P5,C3),
(P6,C2),
(P6,C8)]
We'd like to allocate new codes to each product so that the codes for a product are always consecutive. We'll just allocate these codes in order.
-- v-- New code --v v -- Old code
[(P1,C1), [(C1,C4)
(P1,C2), (C2,C5),
(P2,C3), (C3,C10),
(P3,C4), (C4,C1),
(P3,C5), (C5,C9),
(P4,C6), (C6,C7),
(P5,C7), (C7,C3),
(P6,C8), (C8,C2),
(P6,C9), (C9,C8)]
We now have two pieces of data to save. We have a (now) one-to-one mapping between products and code ranges and a new one-to-one mapping between new codes and old codes. If we follow the steps from the previous section, we can convert the mapping between new codes and old codes into a list of old codes. The position in the list determines the new code.
{- C1 C2 C3 C4 C5 C6 C7 C8 C9 -} -- new codes
[ C4, C5, C10, C1, C9, C7, C3, C2, C8 ] -- old codes
We can throw any of our existing compression algorithms at this string. Each symbol only occurs at most once in this list so traditional compression mechanisms will not compress this list any more. This isn't significantly different from grouping by the product and storing the list of codes as an array with the product; the new intermediate codes are just (larger) pointers to the start of the array and the length of the array.
[(P1,[C4,C5]),
(P2,[C10]),
(P3,[C1,C9]),
(P4,[C7]),
(P5,[C3]),
(P6,[C2,C8])]
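In code, the allocation might look like this (a Python sketch with products and codes as plain integers; allocate_codes is an illustrative name; it returns both the product-to-range map and the new-to-old code list):

from itertools import groupby

def allocate_codes(assoc):
    # assoc: list of (product, old_code) pairs.
    assoc = sorted(assoc)                 # sort by product, then old code
    old_codes = [c for _, c in assoc]     # position i holds new code i+1
    ranges, pos = [], 1
    for p, grp in groupby(assoc, key=lambda pc: pc[0]):
        n = len(list(grp))
        ranges.append((p, (pos, pos + n - 1)))   # one compact range per product
        pos += n
    return ranges, old_codes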
A better representation for the list of old codes is probably the difference between old codes. If the codes already tend to be in consecutive ranges these consecutive ranges can be run-length encoded away.
{- C4, C5, C10, C1, C9, C7, C3, C2, C8 -} -- old codes
[ +4, +1, +5, -9, +8, -2, -4, -1, +6 ] -- old code difference
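Computing the difference list is straightforward (the first delta is taken from 0, matching the table above):

def deltas(old_codes):
    prev, out = 0, []
    for c in old_codes:
        out.append(c - prev)   # store the step from the previous old code
        prev = c
    return out

deltas([4, 5, 10, 1, 9, 7, 3, 2, 8])
# [4, 1, 5, -9, 8, -2, -4, -1, 6]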
I might be tempted to store the difference in product and the difference in codes with the products. This should increase the opportunity for common sub-strings for the final compression algorithm to compress away.
[(+1,[+4,+1]),
(+1,[+5]),
(+1,[-9,+8]),
(+1,[-2]),
(+1,[-4]),
(+1,[-1,+6])]

Weighting Different Outcomes when Pseudorandomly Choosing from an Arbitrarily Large Sample

So, I was sitting in my backyard thinking about Pokemon, as we're all wont to do, and it got me thinking: when you encounter a 'random' Pokemon, some specimens appear a lot more often than others, which means that they're weighted differently from the ones that appear less often.
Now, were I to approach the problem of getting the different Pokemon to appear with a certain probability, I would most likely do so by simply increasing the number of entries that certain Pokemon have in the pool of choices (like so),
Pool:
C1 C1 C1 C1
C2 C2
C3 C3 C3 C3 C3
C4
so C1 has a 1/3 chance of being pulled, C2 has a 1/6 chance, etc., but I understand that this may be a very simple and naive approach, and is unlikely to scale well with a large number of choices.
So, my question is this, S/O: given an arbitrarily large sample size, how would you go about weighting the chance of one outcome as greater than another? And, as a follow-up question, what if you want the probabilities of certain options to stand in a ratio with floating-point precision, as opposed to whole-number ratios?
If you know the probability of each event happening you need to map these probabilities to the range 0-100 (or 0 to 1 if you want to use real numbers and probabilities.)
So in the example above there are 12 Cs. C1 is 4/12 or ~33%,
C2 is 2/12 or ~17%, C3 is 5/12 or ~42%, and C4 is 1/12 or ~8%.
Notice that these all add up to 100%. So if we choose a random number between 0 and 100, we can map C1 to 0-33, C2 to 33-50 (17 more than C1's value), C3 to 50-92, and C4 to 92-100.
An if statement could make the choice:
r = rand() # between 0-100
if (r < 33)
return "C1"
elsif (r < 50)
return "C2"
elsif (r < 92)
return "C3"
elsif (r < 100)
return "C4"
If you wanted more accuracy than 1 in 100 just go from 1-1000 or whatever range you want. It's probably better form to use integers and scale them rather than use floating point numbers as floating point can have odd behavior if the spread between values gets large.
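The same idea scales past a hand-written if-chain by storing the cumulative weights and binary-searching them; a Python sketch using the standard bisect module:

import bisect, random

choices = ["C1", "C2", "C3", "C4"]
weights = [4, 2, 5, 1]                  # out of 12 total
cumulative, total = [], 0
for w in weights:
    total += w
    cumulative.append(total)            # [4, 6, 11, 12]

r = random.randrange(total)             # uniform in 0..11
print(choices[bisect.bisect_right(cumulative, r)])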
If you wanted to go the binning route like you show above you could try something like so (in ruby though the idea is more general):
a = ["C1"]*4 + ["C2"]*2 + ["C3"]*5 + ["C4"]
# ["C1", "C1", "C1", "C1", "C2", "C2",
# "C3", "C3", "C3", "C3", "C3", "C4"]
a[rand(a.length)] # => "C1" w/ probability 4/12
Binning would be slower as you need to create the array, but easier to add alternatives as you wouldn't need to recalculate the probabilities each time.
You could also generate the above if code from the array representation so you'd just take the pre-processing hit once when the code was generated and then get a fast answer from the created code.

Number base conversion as a stream operation

Is there a way, in constant working space, to do arbitrary-size and arbitrary-base conversions? That is, to convert a sequence of n numbers in the range [1,m] to a sequence of ceiling(n*log(m)/log(p)) numbers in the range [1,p], using a 1-to-1 mapping that (preferably but not necessarily) preserves lexicographical order and gives sequential results?
I'm particularly interested in solutions that are viable as a pipe function, i.e. are able to handle larger datasets than can be stored in RAM.
I have found a number of solutions that require "working space" proportional to the size of the input but none yet that can get away with constant "working space".
Does dropping the sequential constraint make any difference? That is: allowing lexicographically sequential inputs to result in non-lexicographically-sequential outputs:
F(1,2,6,4,3,7,8) -> (5,6,3,2,1,3,5,2,4,3)
F(1,2,6,4,3,7,9) -> (5,6,3,2,1,3,5,2,4,5)
Some thoughts: might this work?
streamBase_n -> convert(n, lcm(n,p)) -> convert(lcm(n,p), p) -> streamBase_p
(where lcm is the least common multiple)
I don't think it's possible in the general case. If m is a power of p (or vice versa), or if they're both powers of a common base, you can do it, since each group of log_m(p) digits is then independent. However, in the general case, suppose you're converting the number a_1 a_2 a_3 ... a_n. The equivalent number in base p is
sum(a_i * m^(i-1) for i in 1..n)
If we've processed the first i digits, then we have the i-th partial sum. To compute the (i+1)-th partial sum, we need to add a_(i+1) * m^i. In the general case, this number is going to have non-zero digits in most places, so we'll need to modify all of the digits we've processed so far. In other words, we'll have to process all of the input digits before we'll know what the final output digits will be.
In the special case where m and p are both powers of a common base, or equivalently if log_m(p) is a rational number, then m^i will only have a few non-zero digits in base p near the front, so we can safely output most of the digits we've computed so far.
I think there is a way of doing radix conversion in a stream-oriented fashion in lexicographic order. However, what I've come up with isn't sufficient for actually doing it, and it has a couple of assumptions:
The lengths of the positional numbers are already known.
The numbers described are integers. I've not considered what happens with the maths and negative indices.
We have a sequence of values a of length p, where each value is in the range [0,m-1]. We want a sequence of values b of length q, where each value is in the range [0,n-1]. We can work out the k-th digit of our output sequence b from a as follows:
b_k = floor[ sum(a_i * m^i for i in 0..p-1) / n^k ] mod n
Let's rearrange that sum into two parts, splitting it at an arbitrary point z:
b_k = floor[ ( sum(a_i * m^i for i in z..p-1) + sum(a_i * m^i for i in 0..z-1) ) / n^k ] mod n
Suppose that we don't yet know the values of a between [0,z-1] and can't compute the second sum term. We're left with having to deal with ranges. But that still gives us information about b_k.
The minimum value b_k can be is:
b_k >= floor[ sum(a_i * m^i for i in z..p-1) / n^k ] mod n
and the maximum value b_k can be is:
b_k <= floor[ ( sum(a_i * m^i for i in z..p-1) + m^z - 1 ) / n^k ] mod n
We should be able to do a process like this:
1. Initialise z to p. We will count down from p as we receive each value of a.
2. Initialise k to the index of the most significant value in b: if my brain is still working, ceil[ log_n(m^p) ].
3. Read a value of a. Decrement z.
4. Compute the min and max values for b_k.
5. If the min and max are the same, output b_k and decrement k. Go to 4. (It may be possible that we already have enough values for several consecutive values of b_k.)
6. If z != 0 then we expect more values of a. Go to 3.
7. Hopefully, at this point we're done.
I've not considered how to efficiently compute the range values as yet, but I'm reasonably confident that computing the sum from the incoming values of a can be done much more reasonably than storing all of a. Without doing the maths, though, I won't make any hard claims about it!
Yes, it is possible.
For every I character(s) you read in, you will write out O character(s), based on Ceiling(Length * log(In) / log(Out)).
Allocate enough space
Set x to 1
Loop over digits from end to beginning  # Horner's method
    Set a to x * digit
    Set t to O - 1
    Loop while a > 0 and t >= 0
        Set a to a + out digit at position t
        Set out digit at position t to a mod to-base
        Set a to a / to-base
        Set t to t - 1
    Set x to x * from-base
Return converted digit(s)
Thus, for base 16 to 2 (which is easy), using "192FE" we read '1' and convert it, then repeat on '9', then '2', and so on, giving us '0001', '1001', '0010', '1111', and '1110'.
Note that for bases that are not common powers, converting, say, base 17 to base 2 would mean reading 1 character and writing 5.
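Here is a direct Python rendering of that pseudocode (a sketch; convert is an illustrative name, and the float-based output-length estimate deserves care for very long inputs):

from math import ceil, log

def convert(digits, from_base, to_base):
    # digits: most-significant first, each in range(from_base).
    out_len = ceil(len(digits) * log(from_base) / log(to_base))
    out = [0] * out_len
    x = 1                                # place value of the current digit
    for d in reversed(digits):           # end to beginning (Horner's method)
        a = x * d
        t = out_len - 1
        while a > 0 and t >= 0:
            a += out[t]
            out[t] = a % to_base         # store the low output digit
            a //= to_base                # carry the rest upward
            t -= 1
        x *= from_base
    return out

convert([0x1, 0x9, 0x2, 0xF, 0xE], 16, 2)
# [0,0,0,1, 1,0,0,1, 0,0,1,0, 1,1,1,1, 1,1,1,0]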
