MIPS Roman numeral converter - converters

So I am in the beginning stages of planning my Roman numeral converter (Roman numerals to integers) in MIPS. I am stuck on how to convert numerals in which a smaller symbol precedes a larger one (e.g. IV [4]). I was thinking about doing the below, but it still doesn't solve my problem.
Put this somewhere in the data section:
all_numerals: .asciiz "IVXLCDMivxlcdm"
all_values: .word 1, 5, 10, 50, 100, 500, 1000, 1, 5, 10, 50, 100, 500, 1000   # .word, because 500 and 1000 do not fit in a .byte
I would then read each character, find its index in all_numerals, load the value at the same index in all_values, add that to a register, and repeat.
Does anyone know an efficient way to do this? As it stands, I am going to write a loop that compares the current character (I [1]) to the next character (V [5]) and, if the current one is smaller, subtracts it from the next (5 - 1), but I think this would be inefficient. Any help is much appreciated.
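The compare-with-the-next-character loop described above is a single pass over the string, so it is already linear in the length of the input. Here is a minimal Python sketch of that logic, which could then be translated to MIPS. The names VALUES and roman_to_int are made up for illustration, and the sketch assumes the input is a well-formed numeral:

```python
# Illustrative sketch only; VALUES and roman_to_int are assumed names,
# and the input is assumed to be a well-formed Roman numeral.
VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

def roman_to_int(numeral):
    chars = numeral.upper()  # accept both "iv" and "IV"
    total = 0
    for pos, ch in enumerate(chars):
        value = VALUES[ch]
        # A symbol that precedes a larger one is subtracted (the I in IV).
        if pos + 1 < len(chars) and value < VALUES[chars[pos + 1]]:
            total -= value
        else:
            total += value
    return total
```

Each character is looked at once and compared only with its right-hand neighbour, so no backtracking is needed.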

Related

How to find the xth decibinary number?

Hackerrank has a problem called Decibinary numbers. These are numbers whose digits range from 0 to 9 but whose place values are powers of 2. The question asks us to display the xth decibinary number. There is another twist to the problem: multiple decibinary numbers can equal the same decimal number. For example, 4 in decimal can be 100, 20, 12, and 4 in decibinary.
At first, I thought that finding how many decibinary numbers for a given decimal number would be helpful.
I consulted this post for a bit of help ( https://math.stackexchange.com/questions/3540243/whats-the-number-of-decibinary-numbers-that-evaluate-to-given-decimal-number ). The post was a bit too hard to understand, but I also realized that even if we know how many decibinary numbers a decimal number has, this doesn't help with FINDING them (at least to my knowledge), which is the original goal of the question.
I do realize that for any decimal number, the largest decibinary number for it will simply be its binary representation. For ex, for 4 it is 100. So the brute force approach would be to check all numbers in this range for each decimal number and see if their decibinary representation evaluates to the given decimal number, but it is clearly evident that this approach will never pass since the input constraints define x to be from 1 to 10^16. Not only that, we have to find the xth decibinary number for a q amount of queries where q is from 1 to 10^5.
This question falls under the DP section, but I am confused about how DP would be used, or whether it is even possible. To calculate the xth decibinary number q times (as in the brute-force method above), it would be better to use a table (like the problem suggests). But for that, we would need to calculate and store 10^16 integers, since that is how big x can be. Assuming an integer is 4 bytes, that is 4 B * 10^16 = 4 * 10^16 bytes, which is roughly 2^55 bytes (about 40 petabytes).
Can someone please explain how this problem is solved optimally. I am still new to CP so if I have made an error in something, please let me know.
(see link below for full problem statement):
https://www.hackerrank.com/challenges/decibinary-numbers/problem
This is solvable with about 80 MB of data. I won't give code, but I will explain the strategy.
Build a lookup count[n][i] that gives you the number of ways to get the decimal number n using the first i digits. You start by inserting 0 everywhere, and then put a 1 in count[0][0]. Now start filling in using the rule:
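count[n][i] = count[n][i-1] + count[n - 2**(i-1)][i-1] + count[n - 2*2**(i-1)][i-1] + ... + count[n - 9*2**(i-1)][i-1]
(The i-th digit has place value 2**(i-1), and any term whose first index would be negative counts as 0.)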
It turns out that you only need the first 19 digits, and you only need counts of n up to 2**19-1. And the counts all fit in 8 byte longs.
Once you have that, create a second data structure count_below[n] which is the count of how many decibinary numbers will give a value less than n. Use the same range of n as before.
And now a lookup proceeds as follows. First you do a binary search on count_below to find the last value n that has fewer than your target number of decibinaries below it. Subtracting count_below[n] from your query tells you which decibinary number of that value you want.
Next, search through count[n][i] to find the i such that you get your target query with i digits, and not with fewer. This will be the position of the leading digit of your answer. You then subtract off count[n][i-1] from your query (all the decibinaries with fewer digits). Then subtract off count[n-2**(i-1)][i-1], count[n-2*2**(i-1)][i-1], ... count[n-8*2**(i-1)][i-1] until you find what that leading digit is. Now you subtract the contribution of that digit from the value, and repeat the logic for finding the correct decibinary for that smaller value with fewer digits.
Here is a worked example to clarify. First the data structures for the first 3 digits and up to 2**3 - 1:
count = [
[1, 1, 1, 1], # sum 0
[0, 1, 1, 1], # sum 1
[0, 1, 2, 2], # sum 2
[0, 1, 2, 2], # sum 3
[0, 1, 3, 4], # sum 4
[0, 1, 3, 4], # sum 5
[0, 1, 4, 6], # sum 6
[0, 1, 4, 6], # sum 7
]
count_below = [
0, 1, 2, 4, 6, 10, 14, 20, 26, ...
]
Let's find the 20th.
count_below[6] is 14 and count_below[7] is 20 so our decimal sum is 6.
We want the 20 - count_below[6] = 6th decibinary with decimal sum 6.
count[6][2] is 4 while count[6][3] is 6 so we have a non-zero third digit.
We want the 6 - count[6][2] = 2nd with a non-zero third digit.
count[6 - 2**2][2] is 2, so 2 have 3rd digit 1.
The third digit is 1
We are now looking for the second decibinary whose decimal sum is 2.
count[2][1] is 1 and count[2][2] is 2 so it has a non-zero second digit.
We want the 2 - count[2][1] = 1st with a non-zero second digit.
The second digit is 1
The rest is 0 because 2 - 2**1 = 0.
And thus you find that the answer is 110.
Now for such a small number, this was a lot of work. But even for your hardest lookup you'll only need about 20 steps of a binary search to find your decimal sum, another 20 steps to find the position of the first non-zero digit, and for each of those digits, you'll have to do 1-9 different calculations to find what that digit is. That means only hundreds of calculations to find the number.
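For anyone who wants to experiment with the strategy, here is a small Python sketch of it. It is a variant rather than a literal transcription: instead of first locating the leading non-zero digit, it walks every digit position from the top with leading zeros allowed, which visits the candidates in the same ascending order. The names build_tables and decibinary are made up, the table sizes are passed in as parameters, and x is assumed not to exceed count_below[-1]:

```python
from bisect import bisect_left
from itertools import accumulate

def build_tables(max_value, num_digits):
    """count[n][i] = ways to write n using only the lowest i digit positions.

    Assumes num_digits is at least the bit length of max_value, so that
    count[n][num_digits] is the full count for every n in the table.
    """
    count = [[0] * (num_digits + 1) for _ in range(max_value + 1)]
    count[0][0] = 1
    for i in range(1, num_digits + 1):
        place = 1 << (i - 1)  # the i-th digit is worth 2**(i-1)
        for n in range(max_value + 1):
            count[n][i] = sum(
                count[n - d * place][i - 1]
                for d in range(10)
                if n - d * place >= 0
            )
    # count_below[n] = how many decibinary numbers have a value less than n
    totals = [row[num_digits] for row in count]
    count_below = [0] + list(accumulate(totals))
    return count, count_below

def decibinary(x, count, count_below, num_digits):
    """Return the x-th (1-indexed) decibinary number as an int."""
    n = bisect_left(count_below, x) - 1  # last n with count_below[n] < x
    rank = x - count_below[n]            # 1-indexed rank among value-n numbers
    digits = []
    for i in range(num_digits, 0, -1):
        place = 1 << (i - 1)
        for d in range(10):              # smaller digit first = ascending order
            rest = n - d * place
            ways = count[rest][i - 1] if rest >= 0 else 0
            if rank <= ways:
                digits.append(str(d))
                n = rest
                break
            rank -= ways
    return int("".join(digits))

count, count_below = build_tables(max_value=7, num_digits=3)
print(decibinary(20, count, count_below, 3))  # 110, as in the worked example
```

With max_value=7 and num_digits=3 this reproduces the small tables shown above; the full problem would use the larger sizes mentioned in the answer.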

1000 Digit Fibonacci - Error on Euler?

Below is my code. It runs. It works.
The problem is, the INDEX of the first 1000 digit fibonacci number isn't 4782...it's 4781. 4782 is the POSITION, not the INDEX. Is Euler accepting the wrong answer, or did they use the word index when they should have used position?
def fib_of_a_certain_digit(num)
  fibs = [1, 1]
  idx = 1
  while true
    fib = fibs[idx] + fibs[idx-1]
    fibs << fib
    idx += 1
    digilength = fib.to_s.split("").length
    return "The first #{num} digit Fibonacci number is at index #{idx}, the fibonacci array is #{fibs.length} long" if digilength == num
  end
end
puts fib_of_a_certain_digit(3)
puts fib_of_a_certain_digit(1000)
Here is the output.
The first 3 digit Fibonacci number is at index 11, the fibonacci array is 12 long
The first 1000 digit Fibonacci number is at index 4781, the fibonacci array is 4782 long
As you can see, the control case matches the known data.
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144]
The last number in the array is 144. It is at index 11, but is the 12th number in the array.
The same principle applies to the larger number (it's just too big to paste here). It winds up in the last position of the array (4782), which has the index of 4781.
Why has nobody else noticed this?
No, that's not an error. Project Euler says:
Hence the first 12 terms will be:
F1 = 1
F2 = 1
F3 = 2
...
F11 = 89
F12 = 144
Note the little subscript numbers bottom right of each "F". Those are the indexes. So they start indexing with 1, and thus "position" and "index" are equivalent here. In particular, we can see that the first Fibonacci number with three digits is at index 12.
Your choice of programming language and data type and that language's choice of indexing doesn't override what's in the problem statement. And if it did, there'd be a problem because there are programming languages that start indexing with 1.
In the comments below you talk about "common terms" and what they "usually mean". I'm sure you realized that Project Euler is very mathematical, and in mathematics, those subscripts are the indexes. See for example Index notation in mathematics. Btw, all the examples there start indexing with 1 (not 0), because that's a common/usual way in mathematics as well.
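To make the 1-based convention concrete, here is a short Python sketch that tracks the index the way the problem statement does, starting from F1 = 1, without relying on any array's indexing. The function name first_index_with_digits is made up for illustration:

```python
def first_index_with_digits(num_digits):
    """1-based index of the first Fibonacci number with num_digits digits."""
    index, prev, curr = 1, 0, 1  # curr holds F(index), starting at F1 = 1
    while len(str(curr)) < num_digits:
        prev, curr = curr, prev + curr
        index += 1
    return index

print(first_index_with_digits(3))     # 12, since F12 = 144
print(first_index_with_digits(1000))  # 4782
```

Because the counter starts at 1 alongside F1, the result matches the subscripts in the problem statement directly.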

Unique pair of two integers [duplicate]

Possible Duplicate:
Mapping two integers to one, in a unique and deterministic way
I'm trying to create a unique identifier for a pair of two integers (Ruby):
f(i1,i2) = f(i2, i1) = some_unique_value
i1+i2, i1*i2 and i1^i2 are not unique, and neither is (i1>i2) ? "i1" + "i2" : "i2" + "i1".
I think the following solution will be OK:
(i1>i2) ? "i1" + "_" + "i2" : "i2" + "_" + "i1"
but:
I have to save the result in a DB and index it, so I'd prefer it to be an integer and as small as possible.
Can Zlib.crc32(f(i1,i2)) guarantee uniqueness?
Thanks.
UPD:
Actually, I'm not sure the result MUST be integer. Maybe I can convert it to decimal:
(i1>i2) ? i1.i2 : i2.i1
?
What you're looking for is called a Pairing function.
The following illustration from the German wikipedia page clearly shows how it works:
Implemented in Ruby:
def cantor_pairing(n, m)
  (n + m) * (n + m + 1) / 2 + m
end
(0..5).map do |n|
  (0..5).map do |m|
    cantor_pairing(n, m)
  end
end
=> [[ 0, 2, 5, 9, 14, 20],
[ 1, 4, 8, 13, 19, 26],
[ 3, 7, 12, 18, 25, 33],
[ 6, 11, 17, 24, 32, 41],
[10, 16, 23, 31, 40, 50],
[15, 22, 30, 39, 49, 60]]
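Note that you will need to store the result of this pairing in a datatype with at least as many bits as both your input numbers put together. (If both input numbers are 32-bit, there are 2^64 possible combinations, so nothing smaller than a 64-bit datatype can distinguish them all. The Cantor value itself can even exceed 64 bits for the largest inputs, which is not a problem for Ruby's arbitrary-precision integers.)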
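One thing to watch: the pairing function is order-sensitive (cantor_pairing(0, 1) is 2 but cantor_pairing(1, 0) is 1), while the question asks for f(i1, i2) = f(i2, i1). A simple way around that, sketched here in Python, is to sort the two inputs first. The name unordered_pair_id is made up, and the sketch assumes both inputs are non-negative integers:

```python
def cantor_pairing(n, m):
    # Same formula as the Ruby version above; // keeps the result an integer.
    return (n + m) * (n + m + 1) // 2 + m

def unordered_pair_id(i1, i2):
    """Symmetric ID: the pair is put in a fixed order before pairing."""
    lo, hi = sorted((i1, i2))
    return cantor_pairing(lo, hi)

print(unordered_pair_id(3, 5) == unordered_pair_id(5, 3))  # True
```

Since sorting maps (i1, i2) and (i2, i1) to the same ordered pair, and the pairing function is one-to-one on ordered pairs, distinct unordered pairs still get distinct IDs.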
No, Zlib.crc32(f(i1,i2)) is not unique for all integer values of i1 and i2.
If i1 and i2 are also 32bit numbers then there are many more combinations of them than can be stored in a 32bit number, which is returned by CRC32.
CRC32 is not unique, and wouldn't be good to use as a key. Assuming you know the maximum value of your integers i1 and i2:
unique_id = (max_i2+1)*i1 + i2
If your integers can be negative, or will never be below a certain positive integer, you'll need the max and min values:
(max_i2-min_i2+1) * (i1-min_i1) + (i2-min_i2)
This will give you the absolute smallest number possible to identify both integers.
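A quick Python sketch of the min/max formula, together with the inverse that recovers both integers from the ID. The function names encode_pair and decode_pair are made up for illustration, and the bounds are assumed to be known in advance:

```python
def encode_pair(i1, i2, min_i1, min_i2, max_i2):
    """(max_i2 - min_i2 + 1) * (i1 - min_i1) + (i2 - min_i2)"""
    width = max_i2 - min_i2 + 1  # number of distinct i2 values
    return width * (i1 - min_i1) + (i2 - min_i2)

def decode_pair(uid, min_i1, min_i2, max_i2):
    """Inverse of encode_pair: split the ID back into (i1, i2)."""
    width = max_i2 - min_i2 + 1
    q, r = divmod(uid, width)
    return q + min_i1, r + min_i2

uid = encode_pair(-2, 7, min_i1=-5, min_i2=-10, max_i2=10)
print(uid, decode_pair(uid, min_i1=-5, min_i2=-10, max_i2=10))  # 80 (-2, 7)
```

Note that this mapping is order-sensitive, so to get f(i1, i2) = f(i2, i1) you would put the pair in a fixed order (for example smaller first) before encoding.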
Well, no 4-byte hash will be unique when its input is an arbitrary binary string of more than 4 bytes. Your strings are from a highly restricted symbol set, so collisions will be fewer, but "no, not unique".
There are two ways to use a smaller integer than the possible range of values for both of your integers:
Have a system that works despite occasional collisions
Check for collisions and use some sort of rehash
The obvious way to solve your problem with a 1:1 mapping requires that you know the maximum value of one of the integers. Just multiply one by the maximum value and add the other, or determine a power of two ceiling, shift one value accordingly, then OR in the other. Either way, every bit is reserved for one or the other of the integers. This may or may not meet your "as small as possible" requirement.
Your ###_### string is unique per pair; if you could just store that as a string you win.

algorithm to find number of integers with given digits within a given range

If I am given the full set of digits in the form of a list (call it list) and I want to know how many (valid) integers they can form within a given range [A, B], what algorithm can I use to do it efficiently?
For example, given a list of digits (containing duplicates and zeros) list={5, 3, 3, 2, 0, 0}, I want to know how many integers can be formed in the range [A, B]=[20, 400] inclusive. For example, in this case, 20, 23, 25, 30, 32, 33, 35, 50, 52, 53, 200, 203, 205, 230, 233, 235, 250, 253, 300, 302, 303, 305, 320, 323, 325, 330, 332, 335, 350, 352, 353 are all valid.
Step 1: Find the number of digits your answers can have. In your example it is 2 or 3.
Step 2: For a given number size (number of digits):
Step 2a: Pick the possibilities for the first (most significant) digit. Find the min and max number starting with that digit (by putting the rest of the digits in ascending or descending order). If both of them fall into the range:
Step 2ai: Count the numbers starting with that first digit and add that to the count.
Step 2b: Else, if both max and min are out of range, ignore.
Step 2c: Otherwise, add each possible digit as the second most significant digit and repeat the same step.
Solving by example of your case:
For number size of 2 i.e. __:
0_ : Ignore since it starts with 0
2_ : Minimum=20, Max=25. Both are in range. So update count by 3 (second digit might be 0,3,5)
3_ : Minimum=30, Max=35. Both are in range. So update count by 4 (second digit might be 0,2,3,5)
5_ : Minimum=50, Max=53. Both are in range. So update count by 3 (second digit might be 0,2,3)
For size 3:
0__ : Ignore since it starts with 0
2__ : Minimum=200, max=253. Both are in range. Find the number of ways you can choose 2 numbers from a set of {0,0,3,3,5}, and update the count.
3__ : Minimum=300, max=353. Both are in range. Find the number of ways you can choose 2 numbers from a set of {0,0,2,3,5}, and update the count.
5__ : Minimum=500, max=533. Both are out of range. Ignore.
A more interesting case is when max limit is 522 (instead of 400):
5__ : Minimum=500, max=533. Max out of range.
50_: Minimum=500, Max=503. Both in range. Add number of ways you can choose one digit from {0,2,3,3}
52_: Minimum=520, Max=523. Max out of range.
520: In range. Add 1 to count.
523: Out of range. Ignore.
53_: Minimum=530, Max=533. Both are out of range. Ignore.
def countComb(currentVal, digSize, maxVal, minVal, remSet):
    # Smallest and largest numbers reachable from the current prefix.
    minPosVal, maxPosVal = calculateMinMax(currentVal, digSize, remSet)
    if currentVal and minVal <= minPosVal and maxPosVal <= maxVal:
        # Every completion is in range: count them all at once.
        return numberPermutations(remSet, digSize, currentVal)
    elif maxPosVal < minVal or minPosVal > maxVal:
        # Every completion is out of range on the same side.
        return 0
    else:
        count = 0
        for k in set(remSet):
            if not currentVal and k == '0':
                continue  # a number may not start with 0
            tmpRemSet = list(remSet)
            tmpRemSet.remove(k)
            count += countComb(currentVal + k, digSize, maxVal, minVal, tmpRemSet)
        return count
In your case: countComb('',2,400,20,['0','0','2','3','3','5']) +
countComb('',3,400,20,['0','0','2','3','3','5']) will give the answer.
def calculateMinMax(currentVal, digSize, remSet):
    # Complete the prefix with the smallest / largest remaining digits.
    numRemain = digSize - len(currentVal)
    minPosVal = int(currentVal + ''.join(sorted(remSet)[:numRemain]))
    maxPosVal = int(currentVal + ''.join(sorted(remSet, reverse=True)[:numRemain]))
    return minPosVal, maxPosVal
numberPermutations(remSet, digSize, currentVal): Basically the number of distinct ordered ways you can pick (digSize - len(currentVal)) digits from remSet. See permutations with repeats.
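The answer leaves numberPermutations undefined, so here is one possible Python sketch of it, keeping the same name and argument order as the call in countComb. It counts distinct ordered selections from a multiset by recursing on how many copies of each digit are left. This is only one way to do it, not necessarily what the answerer had in mind:

```python
from collections import Counter
from functools import lru_cache

def numberPermutations(remSet, digSize, currentVal):
    """Distinct ordered ways to fill the remaining positions from remSet."""
    numRemain = digSize - len(currentVal)

    @lru_cache(maxsize=None)
    def arrangements(counts, length):
        if length == 0:
            return 1
        total = 0
        # Each digit kind that still has copies left gives one distinct
        # choice for the next position.
        for idx, c in enumerate(counts):
            if c > 0:
                rest = counts[:idx] + (c - 1,) + counts[idx + 1:]
                total += arrangements(rest, length - 1)
        return total

    return arrangements(tuple(Counter(remSet).values()), numRemain)

print(numberPermutations(['0', '0', '3', '3', '5'], 3, '2'))  # 8
```

The printed 8 corresponds to the 2__ case in the example above: 200, 203, 205, 230, 233, 235, 250, 253.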
If the range is small but the list is big, the easy solution is to just loop over the range and check whether each number can be generated from the list. The check can be made fast by using a hash table or an array that stores how many times each digit in the list can still be used.
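A minimal Python sketch of that brute-force check, using collections.Counter as the hash table of available digits. The function name count_in_range is made up for illustration:

```python
from collections import Counter

def count_in_range(digits, lo, hi):
    """Count integers in [lo, hi] that can be built from the given digits."""
    available = Counter(str(d) for d in digits)
    total = 0
    for value in range(lo, hi + 1):
        needed = Counter(str(value))
        # The number is formable if no digit is needed more often
        # than it occurs in the list.
        if all(available[ch] >= cnt for ch, cnt in needed.items()):
            total += 1
    return total

print(count_in_range([5, 3, 3, 2, 0, 0], 20, 400))  # 31
```

The result 31 matches the number of valid integers listed in the question. The cost is proportional to the size of the range, so this only makes sense when B - A is small.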
For a list of n digits, z of which are zero, a lower bound l, and an upper bound u...
Step 1: The Easy Stuff
Consider a situation in which you have a 2-digit lower bound and a 4-digit upper bound. While it might be tricky to determine how many 2- and 4-digit numbers are within the bounds, we at least know that all 3-digit numbers are. And if the bounds were a 2-digit number and a 5-digit number, you know that all 3- and 4-digit numbers are fair game.
So let's generalize this to a lower bound with a digits and an upper bound with b digits. For every k between a and b (not including a and b themselves), all k-digit numbers are within the range.
How many such numbers are there? Consider how you'd pick them: the first digit must be one of the n numbers which is non-zero (so one of (n - z) numbers), and the rest are picked from the yet-unpicked list, i.e. (n-1) choices for the second digit, (n-2) for the third, etc. So this is looking like a factorial, but with a weird first term. How many numbers of the n are picked? Why, k of them, which means we have to divide by (n - k)! to ensure we only pick k digits in total. So the equation for each k looks something like: (n - z)(n - 1)!/(n - k)! Plug in every k in the range (a, b), and you have the number of (a+1)- to (b-1)-digit numbers possible, all of which must be valid.
Step 2: The Edge Cases
Things are a little bit trickier when you consider a- and b-digit numbers. I don't think you can avoid starting a depth-first search through all possible combinations of digits, but you can at least abort on an entire branch if it exceeds the boundary.
For example, if your list contained { 7, 5, 2, 3, 0 } and you had an upper bound of 520, your search might go something like the following:
Pick the 7: does 7 work in the hundreds place? No, because 700 > 520;
abort this branch entirely (i.e. don't consider 752, 753, 750, 725, etc.)
Pick the 5: does 5 work in the hundreds place? Yes, because 500 <= 520.
Pick the 7: does 7 work in the tens place? No, because 570 > 520.
Abort this branch (i.e. don't consider 573, 570, etc.)
Pick the 2: does 2 work in the tens place? Yes, because 520 <= 520.
Pick the 7: does 7 work in the ones place? No, because 527 > 520.
Pick the 3: does 3 work in the ones place? No, because 523 > 520.
Pick the 0: does 0 work in the ones place? Yes, because 520 <= 520.
Oh hey, we found a number. Make sure to count it.
Pick the 3: does 3 work in the tens place? No; abort this branch.
Pick the 0: does 0 work in the tens place? Yes.
...and so on.
...and then you'd do the same for the lower bound, but flipping the comparators. It's not nearly as efficient as the k-digit combinations in the (a, b) interval (i.e. O(1)), but at least you can avoid a good deal by pruning branches that must be impossible early on. In any case, this strategy ensures you only have to actually enumerate the two edge cases that are the boundaries, regardless of how wide your (a, b) interval is (or if you have 0 as your lower bound, only one edge case).
EDIT:
Something I forgot to mention (sorry, I typed all of the above on the bus home):
When doing the depth-first search, you actually only have to recurse when your first digit equals the first digit of the bound. That is, if your bound is 520 and you've just picked 3 as your first digit, you can just add (n-1)!/(n-3)! immediately and skip the entire branch, because all 3-digit numbers beginning with 3 are certainly below 520.
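Here is a rough Python sketch of the pruned depth-first search for the upper-bound edge case. It counts the numbers that have as many digits as the bound and do not exceed it. The name count_up_to is made up, and for simplicity the sketch recurses into every surviving branch rather than applying the factorial shortcut from the edit above:

```python
def count_up_to(digits, bound):
    """Count distinct numbers with len(str(bound)) digits that are <= bound."""
    width = len(str(bound))

    def dfs(prefix, remaining):
        if len(prefix) == width:
            return 1  # a full number that passed every check on the way down
        total = 0
        for d in sorted(set(remaining)):  # set() skips duplicate digits
            if not prefix and d == 0:
                continue  # no leading zero
            candidate = prefix + [d]
            # "Does d work in this place?": pad the prefix with zeros and
            # compare against the bound, as in the walkthrough above.
            padded = "".join(map(str, candidate)) + "0" * (width - len(candidate))
            if int(padded) > bound:
                continue  # abort this branch entirely
            rest = list(remaining)
            rest.remove(d)
            total += dfs(candidate, rest)
        return total

    return dfs([], list(digits))

print(count_up_to([7, 5, 2, 3, 0], 520))  # 28
```

For the example list, the 28 breaks down as 12 numbers starting with 2, 12 starting with 3, and 4 starting with 5 (502, 503, 507, 520); everything starting with 7 is pruned at the first step.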

Algorithm to find the next number in a sequence

Ever since I started programming this has been something I have been curious about. But seems too complicated for me to even attempt.
I'd love to see a solution.
1, 2, 3, 4, 5 // returns 6 (n + 1)
10, 20, 30, 40, 50 //returns 60 (n + 10)
10, 17, 31, 59, 115 //returns 227 ((n * 2) - 3)
What you want to do is called polynomial interpolation. There are many methods (see http://en.wikipedia.org/wiki/Polynomial_interpolation ), but you have to have an upper bound U on the degree of the polynomial and at least U + 1 values.
If you have sequential values, then there is a simple algorithm.
Given a sequence x1, x2, x3, ..., let Delta(x) be the sequence of differences x2 - x1, x3 - x2, x4 - x3, ... . If you have consecutive values of a degree n polynomial, then the nth iterate of Delta is a constant sequence.
For example, the polynomial n^3:
1, 8, 27, 64, 125, 216, ...
7, 19, 37, 61, 91, ...
12, 18, 24, 30, ...
6, 6, 6, ...
To get the next value, fill in another 6 and then work backward.
6, 6, 6, 6 = 6, ...
12, 18, 24, 30, 36 = 30 + 6, ...
7, 19, 37, 61, 91, 127 = 91 + 36, ...
1, 8, 27, 64, 125, 216, 343 = 216 + 127, ...
The restriction on the number of values above ensures that your sequence never becomes empty while performing the differences.
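The difference-table procedure above can be sketched in a few lines of Python. The function name next_term is made up, and the sketch assumes the input is a non-empty list of consecutive values of a polynomial with enough terms for a constant row to appear:

```python
def next_term(seq):
    """Extend a sequence by one term using iterated differences."""
    rows = [list(seq)]
    # Keep taking differences until a row is constant (or has one element).
    while len(rows[-1]) > 1 and any(x != rows[-1][0] for x in rows[-1]):
        prev = rows[-1]
        rows.append([b - a for a, b in zip(prev, prev[1:])])
    # Repeat the value of the last row, then work backward through the table.
    nxt = rows[-1][-1]
    for row in reversed(rows[:-1]):
        nxt = row[-1] + nxt
    return nxt

print(next_term([1, 8, 27, 64, 125, 216]))  # 343
```

This handles the first two sequences in the question (returning 6 and 60). For 10, 17, 31, 59, 115 it returns 220 rather than 227, because that sequence doubles rather than following a polynomial, so the method just fits a degree-4 polynomial through the five points.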
Sorry to disappoint, but this isn't quite possible (in general), as there are an infinite number of sequences that start with any given k values. Maybe with certain constraints...
You can take a look at this Everything2 post, which points to Lagrange polynomial.
Formally there is no unique next value to a partial sequence. The problem as usually understood can be clearly stated as:
Assume that the partial sequence exhibited is just sufficient to constrain some generating rule, deduce the simplest possible rule and exhibit the next value generated.
The problem turns on the meaning of "simplest", and is thus not really good for algorithmic solutions. It can be done if you confine the problem to a certain class of functional forms for the generating rule, but the details depend on what forms you are willing to accept.
The book Numerical Recipes has pages and pages of real practical algorithms to do this kind of stuff. It's well worth the read!
The first two cases are easy:
>>> seq1 = [1, 2, 3, 4, 5]
>>> seq2 = [10, 20, 30, 40, 50]
>>> def next_term(seq):
...     m = (seq[1] - seq[0]) // (1 - 0)  # slope from the first two terms
...     b = seq[0] - m * 0                # value at position 0
...     return m * len(seq) + b
...
>>> next_term(seq1)
6
>>> next_term(seq2)
60
The third case would require solving for a non-linear function.
You can try to use extrapolation. It will help you to find formulas to describe a given sequence.
I am sorry, I can't tell you much more, since my mathematical education happened quite a while ago. But you should find more information in good books.
That kind of number series is often part of "intelligence tests", which leads me to think of such an algorithm as something that passes (at least part of) a Turing Test, which is quite hard to accomplish.
I like the idea, and sequences one and two suggest to me that this is possible, but then again you cannot generalize, as the sequence could go totally off base. The answer is probably that you cannot generalize; what you can do is write an algorithm to continue a specific sequence when you know the rule, (n+1) or (2n+2) etc...
One thing you may be able to do is take a difference between element i and element i+1 and element i+2.
for example, in your third example:
10 17 31 59 115
The difference between 17 and 10 is 7, the difference between 31 and 17 is 14, the difference between 59 and 31 is 28, and the difference between 115 and 59 is 56.
So you can see that element n+1 = element n + (7*2^n), counting n from 0.
So 17 = 10 + (7*2^0)
And 31 = 17 + (7*2^1)
And so on...
For an arbitrary function it can't be done, but for a linear function like in each of your examples it's simple enough.
You have f(n+1) = a*f(n) + b, and the problem amounts to finding a and b.
Given at least three terms of the sequence, you can do this (you need three because you have three unknowns -- the starting point, a, and b). For instance, suppose you have f(0), f(1) and f(2).
We can solve the equations:
f(1) = a*f(0) + b
f(2) = a*f(1) + b
The solution is:
a = (f(2)-f(1))/(f(1)-f(0))
b = f(1) - f(0)*(f(2)-f(1))/(f(1)-f(0))
(You'll want to separately solve the case where f(0) = f(1) to avoid division by zero.)
Once you have a and b, you can repeatedly apply the formula to your starting value to generate any term in the sequence.
One could also write a more general procedure that works when given any three points in the sequence (e.g. 4th, 7th, 23rd, or whatever) . . . this is just a simple example.
Again, though, we had to make some assumptions about what form our solution would have . . . in this case taking it to be linear as in your example. One could take it to be a more general polynomial, for instance, but in that case you need more terms of the sequence to find the solution, depending on the degree of the polynomial.
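As a concrete version of the a and b formulas above, here is a short Python sketch. The function names fit_linear_recurrence and next_value are made up, and Fraction is used only so that a non-integer a does not suffer rounding error:

```python
from fractions import Fraction

def fit_linear_recurrence(f0, f1, f2):
    """Solve f(1) = a*f(0) + b and f(2) = a*f(1) + b for a and b."""
    if f1 == f0:
        # The special case mentioned above: the division would be by zero.
        raise ValueError("f(0) == f(1): a is not determined by these terms")
    a = Fraction(f2 - f1, f1 - f0)
    b = f1 - a * f0
    return a, b

def next_value(seq):
    """Fit the rule to the first three terms and apply it to the last term."""
    a, b = fit_linear_recurrence(*seq[:3])
    return a * seq[-1] + b

print(next_value([10, 17, 31, 59, 115]))  # 227
```

For the third sequence in the question this recovers a = 2 and b = -3, matching the (n * 2) - 3 rule; the first two sequences give a = 1 with b = 1 and b = 10 respectively.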
See also the chapter "To Seek Whence Comes a Sequence" from the book "Fluid concepts and creative analogies: computer models of the fundamental mechanisms of thought" by Douglas Hofstadter
http://portal.acm.org/citation.cfm?id=218753.218755&coll=GUIDE&dl=GUIDE&CFID=80584820&CFTOKEN=18842417

Resources