Calculating a weighted similarity - algorithm

I have 2 data rows, and each of them has 4 fields,
something like this:
field1 field2 field3 field4
Row 1
Row 2
Now I have to compare these two records and calculate the similarity. I calculate the similarity for each field by deriving the cosine similarity.
So I end up with similarities something like this:
(0 signifying a weak similarity and 1 signifying a strong similarity)
field1: 0.12
field2: 0.67
field3: 1.00
field4: 0.93
I can now find the total similarity by averaging the values, but the problem is:
I want to add weights to the fields,
so if field2 has a higher weight than field1, the similarity of field2 will make a larger contribution to the average similarity.
Can you suggest a formula or algorithm to satisfy such a requirement?

Simple:
multiply each of the 4 values by its weight
add the results together
divide by the sum of the weights
Examples
In the example each of the fields can be thought of as having an equal weight of 1
((0.12 * 1) + (0.67 * 1) + (1.00 * 1) + (0.93 * 1)) / 4 = 0.68
Now if we want to make field2 worth 2x more than the other fields
// Weights are (1 + 2 + 1 + 1) = 5
((0.12 * 1) + (0.67 * 2) + (1.00 * 1) + (0.93 * 1)) / 5 = 0.678
If we want field 3 to have 100 times the weight (field 2 is still 2x)
// Weights are (1 + 2 + 100 + 1) = 104
((0.12 * 1) + (0.67 * 2) + (1.00 * 100) + (0.93 * 1)) / 104 = 0.9845192307692308
Formula
((field1 * field1_weight) + (field2 * field2_weight) + ... + (fieldn * fieldn_weight)) / (field1_weight + field2_weight + ... + fieldn_weight) = weighted_average
Fractional weights
The formula works just the same if you give fractions as weights. For example, if you would like the 4th field to be weighted at 150% of the other fields, you can assign it a weight of 1.5
// Weights are (1 + 1 + 1 + 1.5) = 4.5
((0.12 * 1) + (0.67 * 1) + (1.00 * 1) + (0.93 * 1.5)) / 4.5 = 0.7077777777777778
Weights are relative
You don't need to start with each of the weights set to 1, you can use 100 or 1000 if you like.
For example if the weights for all 4 fields were 100 the final average would be the same if they were all 1.
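To make the examples above concrete, here is a minimal Python sketch of the formula (the function name is mine):

```python
def weighted_average(values, weights):
    # Multiply each value by its weight, sum, divide by the weight total.
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

sims = [0.12, 0.67, 1.00, 0.93]
print(weighted_average(sims, [1, 1, 1, 1]))          # equal weights: plain average
print(weighted_average(sims, [1, 2, 1, 1]))          # field2 counts double
print(weighted_average(sims, [100, 100, 100, 100]))  # same result as all-1 weights
```

The last call illustrates the "weights are relative" point: scaling every weight by the same factor leaves the result unchanged.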
Further reading
wikipedia: Weighted arithmetic mean

You just want to find the weighted average. Multiply each similarity by its weight, add the products together, then divide by the sum of the weights:
total, totalw = 0, 0
for w, s in weighted_sims:
    total += w * s
    totalw += w
result = total / totalw


How to solve this in an efficient way with optimum time complexity?

Given a set of N numbers in an array. Given Q queries. Each Query contains 1 number x.
For each query, you need to add x to each element of the array and then report the sum of absolute values in the array.
Note : Changes to the array are permanent. See Sample for more clarification.
Input Format
First line contains N , number of elements in the array.
Next line contains N space separated integers of the array.
Next line contains Q(number of queries).
Next line contains Q space separated integers(the number x).
Output Format
For each query , output the sum in a newline.
Constraints
1 ≤ N ≤ 500000
1 ≤ Q ≤ 500000
-2000 ≤ number in each Query ≤ 2000
-2000 ≤ value of the array element ≤ 2000
Sample Input
3
-1 2 -3
3
1 -2 3
Sample Output
5
7
6
Explanation
After Query 1 : [ 0 , 3 , -2 ] => sum = 0 + 3 + 2 = 5
After Query 2 : [ -2 , 1 , -4 ] => sum = 2 + 1 + 4 = 7
After Query 3 : [ 1 , 4 , -1 ] => sum = 1 + 4 + 1 = 6
#include <stdio.h>
#include <stdlib.h>

int main()
{
    int n, *a, q, *aq;
    long long sum = 0;
    scanf("%d", &n);
    a = (int *)malloc(sizeof(int) * n);
    for (int i = 0; i < n; i++)
        scanf("%d", &a[i]);
    scanf("%d", &q);
    aq = (int *)malloc(sizeof(int) * q);
    for (int i = 0; i < q; i++)
        scanf("%d", &aq[i]);
    for (int i = 0; i < q; i++)
    {
        for (int j = 0; j < n; j++)
        {
            a[j] = aq[i] + a[j];
            sum += abs(a[j]);
        }
        printf("%lld\n", sum);
        sum = 0;
    }
}
Some test cases are timing out.
Your solution performs N·Q operations, which is huge.
First notice that the range of the data is moderate, so you can represent the N numbers using a histogram of 4001 entries. This histogram is computed in N operations (plus initializing the bins).
Then the requested sum is obtained as the sum over the bins of the absolute value of each bin value plus the running offset, weighted by the bin counts. This lowers the workload from N·Q to B·Q (B is the number of bins).
If I am right, we can do much better by decomposing the sum into a subsum for the negative values and another for the positive ones. These sums can be obtained from prefix sums. This should lead to a solution in Q operations, after preprocessing the histogram in B operations.
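A Python sketch of the histogram idea (names are mine; this is the B·Q version, before the prefix-sum refinement):

```python
def answer_queries(arr, queries):
    LO, HI = -2000, 2000           # value range given in the constraints
    hist = [0] * (HI - LO + 1)     # hist[i] = count of value (i + LO)
    for v in arr:
        hist[v - LO] += 1
    delta, out = 0, []
    for x in queries:
        delta += x                 # changes are permanent: keep a running shift
        out.append(sum(c * abs(i + LO + delta)
                       for i, c in enumerate(hist) if c))
    return out

print(answer_queries([-1, 2, -3], [1, -2, 3]))  # [5, 7, 6]
```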
Here's an outline of an algorithm:
Sample Input
3
-1 2 -3
Sort the data and compute prefix sums:
-3, -1, 2
-3, -4, -2 (prefix sums)
(Using a histogram as Yves Daoust suggested would eliminate the initial sort and any binary search to find the three sections below, which would significantly optimise complexity.)
Maintain a running delta:
delta = 0
For each query of
1 -2 3
Query 1:
* update delta:
delta = 0 + 1 = 1
* identify three sections:
[negative unaffected] [switches sign] [positive unaffected]
-3, -1, 2
* Add for each section abs(num_elements * delta + prefix_sum):
abs(2 * 1 + (-4 - 0)) + abs(1 * 1 + (-2 -(-4)))
= abs(2 - 4) + abs(1 + 2)
= 5
Query -2:
* update delta:
delta = 1 - 2 = -1
* identify three sections:
[negative unaffected] [switches sign] [positive unaffected]
-3, -1, 2
* Add for each section abs(num_elements * delta + prefix_sum):
abs(2 * (-1) + (-4 - 0)) + abs(1 * (-1) + (-2 -(-4)))
= abs(-2 - 4) + abs(-1 + 2)
= 7
Query 3:
* update delta:
delta = -1 + 3 = 2
* identify three sections:
[negative unaffected] [switches sign] [positive unaffected]
-3, -1, 2
* Add for each section abs(num_elements * delta + prefix_sum):
abs(1 * 2 + (-3 - 0)) + abs(1 * 2 + (-4 - (-3))) + abs(1 * 2 + (-2 -(-4)))
= abs(2 - 3) + abs(2 - 1) + abs(2 + 2)
= 6
Sample Output
5
7
6
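The outline above can be sketched in Python with a sort, prefix sums, and one binary search per query (identifier names are mine):

```python
import bisect
from itertools import accumulate

def answer_queries(arr, queries):
    a = sorted(arr)
    prefix = [0] + list(accumulate(a))  # prefix[k] = sum of the k smallest
    n, delta, out = len(a), 0, []
    for x in queries:
        delta += x
        # k = number of elements with a[i] + delta < 0, i.e. a[i] < -delta
        k = bisect.bisect_left(a, -delta)
        neg = -(prefix[k] + k * delta)                   # contribute -(a[i] + delta)
        pos = (prefix[n] - prefix[k]) + (n - k) * delta  # contribute +(a[i] + delta)
        out.append(neg + pos)
    return out

print(answer_queries([-1, 2, -3], [1, -2, 3]))  # [5, 7, 6]
```

This is O(N log N) preprocessing and O(log N) per query; with the histogram trick the binary search could be replaced by a direct lookup.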

Scoring results based on an ideal solution

I am searching through a large number of possible outcomes and, while I may not find the perfect outcome, I would like to score the various outcomes to see how close they come to ideal. (I think I'm talking about some kind of weighted scoring, but don't let that influence your answer in case I'm completely off base.)
For some context, I'm generating a variety of work schedules and would like to have each result scored such that I don't have to look at them individually (it's a brute force approach, and there are literally billions of solutions) to determine if one is better or worse than any other one.
Input-wise, for each generated schedule, I have a 3x14 array that holds the total number of people that are scheduled to work each shift on any given day (i.e. for each day in a two-week period, the number of people working days, swings, and mids on that day).
So far, I have tried:
A) summing the values in each row, then multiplying each sum (row) by a weight (e.g. row 0 sum * 1, row 1 sum * 2, row 2 sum * 3, etc.), and finally adding together the weighted sums
function calcScore(a)
    dim iCol, iTotalD, iTotalM, iTotalS
    for iCol = 0 to 13
        iTotalD = iTotalD + a(0)(iCol)
        iTotalS = iTotalS + a(1)(iCol)
        iTotalM = iTotalM + a(2)(iCol)
    next
    calcScore = iTotalD + iTotalS * 2 + iTotalM * 3
end function
And
B) multiplying each value in each row by a weight (e.g. row 0(0) * 1, row 0(1) * 2, row 0(2) * 3, etc.), and then summing the weighted values of each row
function calcScore(a)
    dim iCol, iTotalD, iTotalM, iTotalS
    for iCol = 0 to 13
        iTotalD = iTotalD + a(0)(iCol) * (iCol + 1)
        iTotalS = iTotalS + a(1)(iCol) * (iCol + 1)
        iTotalM = iTotalM + a(2)(iCol) * (iCol + 1)
    next
    calcScore = iTotalD + iTotalS + iTotalM
end function
Below are some sample inputs (schedules), both ideal and non-ideal. Note that in my ideal example, each row is the same all the way across (e.g. all 4's, or all 3's), but that will not necessarily be the case in real-world usage. My plan is to score my ideal schedule, and compare the score of other schedules to it.
Ideal:
Su Mo Tu We ...
Day: 4 4 4 4 ...
Swing: 3 3 3 3 ...
Mid: 2 2 2 2 ...
Not Ideal:
Su Mo Tu We ...
Day: 3 4 4 4 [D(0) is not 4]
Swing: 3 3 3 3
Mid: 2 2 2 2
Not Ideal:
Su Mo Tu We ...
Day: 4 4 4 4
Swing: 3 3 4 3 [S(2) is not 3]
Mid: 0 2 2 2 [M(0) is not 2]
Summarizing my comments into an answer.
So you have an optimal/ideal/perfect solution and want to compare other solutions to it. In this case you could, for example, compute the sum of (squared) errors. If you need a score, you can invert the error.
Specifically, you would have to calculate the sum of (squared) differences between a solution and the optimal by looking at each entry of your matrix and calculating the difference. Sum these (squared) differences up and you get the error.
For the examples you gave the sum of errors are as follows:
E(Ideal, Not Ideal 1) = 1
E(Ideal, Not Ideal 2) = 3
The sum of squared errors would yield the following:
SQE(Ideal, Not Ideal 1) = 1
SQE(Ideal, Not Ideal 2) = 5
Usually, the sum of squared errors is used in order to penalize larger errors more than several small errors.
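A short Python sketch of the sum-of-squared-errors comparison (the function name and sample matrices are mine, built from the examples above):

```python
def sum_squared_error(ideal, actual):
    # Sum of squared per-entry differences between two 3x14 schedules.
    return sum((x - y) ** 2
               for row_i, row_a in zip(ideal, actual)
               for x, y in zip(row_i, row_a))

ideal = [[4] * 14, [3] * 14, [2] * 14]
not_ideal_2 = [[4] * 14,
               [3, 3, 4] + [3] * 11,   # S(2) is not 3
               [0] + [2] * 13]         # M(0) is not 2
print(sum_squared_error(ideal, not_ideal_2))  # 5
```

A lower value means a schedule closer to the ideal; 0 means an exact match.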

Efficient algorithm to find the n-th digit in the string 112123123412345

What is an efficient algorithm for finding the digit in nth position in the following string
112123123412345123456 ... 123456789101112 ...
Storing the entire string in memory is not feasible for very large n, so I am looking for an algorithm that can find the nth digit in the above string which works if n is very large (i.e. an alternative to just generating the first n digits of the string).
There are several levels here: the digit is part of a number x, the number x is part of a sequence 1,2,3...x...y and that sequence is part of a block of sequences that lead up to numbers like y that have z digits. We'll tackle these levels one by one.
There are 9 numbers with 1 digit:
first: 1 (sequence length: 1 * 1)
last: 9 (sequence length: 9 * 1)
average sequence length: (1 + 9) / 2 = 5
1-digit block length: 9 * 5 = 45
There are 90 numbers with 2 digits:
first: 10 (sequence length: 9 * 1 + 1 * 2)
last: 99 (sequence length: 9 * 1 + 90 * 2)
average sequence length: 9 + (2 + 180) / 2 = 100
2-digit block length: 90 * 100 = 9000
There are 900 numbers with 3 digits:
first: 100 (sequence length: 9 * 1 + 90 * 2 + 1 * 3)
last: 999 (sequence length: 9 * 1 + 90 * 2 + 900 * 3)
average sequence length: 9 + 180 + (3 + 2,700) / 2 = 1,540.5
3-digit block length: 900 * 1,540.5 = 1,386,450
If you continue to calculate these values, you'll find which block (of sequences up to how many digits) the digit you're looking for is in, and you'll know the start and end point of this block.
Say you want the millionth digit. You find that it's in the 3-digit block, and that this block is located in the total sequence at:
start of 3-digit block: 45 + 9,000 = 9,045
start of 4-digit block: 45 + 9,000 + 1,386,450 = 1,395,495
So in this block we're looking for digit number:
1,000,000 - 9,045 = 990,955
Now you can use e.g. a binary search to find which sequence the 990,955th digit is in; you start with the 3-digit number halfway in the 3-digit block:
first: 100 (sequence length: 9 + 180 + 1 * 3)
number: 550 (sequence length: 9 + 180 + 550 * 3)
average sequence length: 9 + 180 + (3 + 1650) / 2 = 1,015.5
total sequence length: 550 * 1,015.5 = 558,525
Which is too small; so we try 550 * 3/2 = 825, see if that is too small or too large, and go up or down in increasingly smaller steps until we know which sequence the 990,955th digit is in.
Say it's in the sequence for the number n; then we calculate the total length of all 3-digit sequences up to n-1, and this will give us the location of the digit we're looking for in the sequence for the number n. Then we can use the numbers 9*1, 90*2, 900*3 ... to find which number the digit is in, and then what the digit is.
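The block lengths used above can be reproduced with a small Python sketch (my own naming):

```python
def block_length(d):
    # Total length of all sequences that end in a d-digit number.
    count = 9 * 10 ** (d - 1)                             # how many d-digit numbers
    shorter = sum(9 * 10 ** (k - 1) * k for k in range(1, d))  # digits from numbers below 10^(d-1)
    first = shorter + 1 * d      # length of the sequence ending at the first d-digit number
    last = shorter + count * d   # length of the sequence ending at the last one
    return count * (first + last) // 2                    # count * average length

print([block_length(d) for d in (1, 2, 3)])  # [45, 9000, 1386450]
```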
We have three types of structures that we would like to be able to search on, (1) the sequence of concatenating d-digit numbers, for example, single digit:
123456...
or 3-digit:
100101102103
(2) the rows in a section,
where each section builds on the previous section added to a prefix. For example, section 1:
1
12
123
...
or section 3:
1234...10111213...100
1234...10111213...100101
1234...10111213...100101102
<----- prefix ----->
and (3) the full sections, although the latter we can just enumerate since they grow exponentially and help build our section prefixes. For (1), we can use simple division if we know the digit count; for (2), we can binary search.
Here's Python code that also answers the big ones:
def getGreatest(n, d, prefix):
    rows = 9 * 10**(d - 1)
    triangle = rows * (d + rows * d) // 2
    l = 0
    r = triangle
    while l < r:
        mid = l + ((r - l) >> 1)
        triangle = mid * prefix + mid * (d + mid * d) // 2
        prevTriangle = (mid-1) * prefix + (mid-1) * (d + (mid-1) * d) // 2
        nextTriangle = (mid+1) * prefix + (mid+1) * (d + (mid+1) * d) // 2
        if triangle >= n:
            if prevTriangle < n:
                return prevTriangle
            else:
                r = mid - 1
        else:
            if nextTriangle >= n:
                return triangle
            else:
                l = mid
    return l * prefix + l * (d + l * d) // 2

def solve(n):
    debug = 1
    d = 0
    p = 0.1
    prefixes = [0]
    sections = [0]
    while sections[d] < n:
        d += 1
        p *= 10
        rows = int(9 * p)
        triangle = rows * (d + rows * d) // 2
        section = rows * prefixes[d-1] + triangle
        sections.append(sections[d-1] + section)
        prefixes.append(prefixes[d-1] + rows * d)
    section = sections[d - 1]
    if debug:
        print("section: %s" % section)
    n = n - section
    rows = getGreatest(n, d, prefixes[d - 1])
    if debug:
        print("rows: %s" % rows)
    n = n - rows
    d = 1
    while prefixes[d] < n:
        d += 1
    if prefixes[d] == n:
        return 9
    prefix = prefixes[d - 1]
    if debug:
        print("prefix: %s" % prefix)
    n -= prefix
    if debug:
        print((n, d, prefixes, sections))
    countDDigitNums = n // d
    remainder = n % d
    prev = 10**(d - 1) - 1
    num = prev + countDDigitNums
    if debug:
        print("num: %s" % num)
    if remainder:
        return int(str(num + 1)[remainder - 1])
    else:
        s = str(num)
        return int(s[len(s) - 1])

ns = [
    1,                    # 1
    2,                    # 1
    3,                    # 2
    100,                  # 1
    2100,                 # 2
    31000,                # 2
    999999999999999999,   # 4
    1000000000000000000,  # 1
    999999999999999993,   # 7
]

for n in ns:
    print(n)
    print(solve(n))
    print('')
Well, you have a series of sequences each increasing by a single number.
If you have "x" of them, then the sequences up to that point occupy x * (x + 1) / 2 character positions. Or, another way of saying this is that the "x"s sequence starts at x * (x - 1) / 2 (assuming zero-based indexing). These are called triangular numbers.
So, all you need to do is to find the "x" value where the cumulative amount is closest to a given "n". Here are three ways:
Search for a closed-form solution. This exists, but the formula is rather complicated. (Here is one reference for the sum of triangular numbers.)
Pre-calculate a table in memory with values up to, say, 1,000,000. That will get you to sizes around 10^10.
Use a "binary" search and the formula. So, generate the sequence of values for 1, 2, 4, 8, and so on and then do a binary search to find the exact sequence.
Once you know the sequence where the value lies, determining the value is simply a matter of arithmetic.
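Here is a slow but straightforward Python sketch of the walk the answers above speed up: subtract whole sequence lengths first, then walk the numbers inside the final sequence (names are mine; this is linear in the sequence index, not the binary-search version):

```python
def seq_len(x):
    # Digits in "123...x" (the numbers 1..x concatenated).
    total, p, d = 0, 1, 1
    while p * 10 <= x:
        total += 9 * p * d      # all the d-digit numbers are fully included
        p *= 10
        d += 1
    return total + (x - p + 1) * d

def nth_digit(n):
    x = 1
    while n > seq_len(x):       # skip whole sequences "1", "12", "123", ...
        n -= seq_len(x)
        x += 1
    num = 1
    while True:                 # walk the numbers inside the sequence "1..x"
        s = str(num)
        if n <= len(s):
            return int(s[n - 1])
        n -= len(s)
        num += 1

print([nth_digit(n) for n in (1, 2, 3, 100)])  # [1, 1, 2, 1]
```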

Finding number representation in different bases

I was recently solving a problem when I encountered this one: APAC Round E Q2
Basically the question asks to find the smallest base (>1) in which the number (input), when written out, consists only of 1s. For example, 3 represented in base 2 becomes 11 (consisting of only 1s).
Now, I tried to solve this the brute force way trying out all bases from 2 till the number to find such a base. But the constraints required a more efficient one.
Can anyone provide some help on how to approach this?
Here is one suggestion: A number x that can be represented as all 1s in a base b can be written as x = b^n + b^(n-1) + b^(n-2) + ... + b^1 + 1
If you subtract 1 from this number you end up with a number divisible by b:
b^n + b^(n-1) + b^(n-2) + ... + b^1 which has the representation 111...110. Dividing by b means shifting it right once so the resulting number is now b^(n-1) + b^(n-2) + ... + b^1 or 111...111 with one digit less than before. Now you can repeat the process until you reach 0.
For example 13 which is 111 in base 3:
13 - 1 = 12 --> 110
12 / 3 = 4 --> 11
4 - 1 = 3 --> 10
3 / 3 = 1 --> 1
1 - 1 = 0 --> 0
Done => 13 can be represented as all 1s in base 3
So in order to check if a given number can be written with all 1s in a base b, you can check whether that number, after subtracting 1, is divisible by b. If not, you can immediately move on to the next base.
This is also pretty brute-forcey, but it doesn't do any base conversions, only one subtraction, one division and one mod operation per iteration.
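This check can be sketched in Python (function names are mine):

```python
def all_ones_in_base(x, b):
    # Strip trailing 1s: x must leave remainder 1 at every step.
    while x > 0:
        if x % b != 1:
            return False
        x //= b          # same as (x - 1) // b, since x % b == 1
    return True

def smallest_base(x):
    # For x >= 3 this terminates: x is always "11" in base x - 1.
    b = 2
    while not all_ones_in_base(x, b):
        b += 1
    return b

print(smallest_base(13))  # 3, since 13 is 111 in base 3
```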
We can solve this in O( (log2 n)^2 ) complexity by recognizing that the highest power attainable in the sequence would correspond with the smallest base, 2, and using the formula for geometric sum:
1 + r + r^2 + r^3 ... + r^(n-1) = (1 - r^n) / (1 - r)
Renaming the variables, we get:
n = (1 - base^power) / (1 - base)
Now we only need to check powers from (floor(log2 n) + 1) down to 2, and for each given power, use a binary search for the base. For example:
n = 13:
p = floor(log2 13) + 1 = 4:
Binary search for base:
(1 - 13^4) / (1 - 13) = 2380
...
No match for power = 4.
Try power = 3:
(1 - 13^3) / (1 - 13) = 183
(1 - 6^3) / (1 - 6) = 43
(1 - 3^3) / (1 - 3) = 13 # match
For n around 10^18 we may need up to (floor(log2 (10^18)) + 1)^2 = 3600 iterations.
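A Python sketch of this power-then-binary-search idea (my naming):

```python
def repunit(b, length):
    # 1 + b + b^2 + ... + b^(length-1), via Horner's scheme.
    total = 0
    for _ in range(length):
        total = total * b + 1
    return total

def smallest_base(n):
    # Longer repunits mean smaller bases, so try lengths from
    # floor(log2 n) + 1 downward, binary-searching the base each time.
    for length in range(n.bit_length(), 1, -1):
        lo, hi = 2, n
        while lo <= hi:
            mid = (lo + hi) // 2
            v = repunit(mid, length)
            if v == n:
                return mid
            lo, hi = (mid + 1, hi) if v < n else (lo, mid - 1)
    return n - 1    # fallback: n is always "11" in base n - 1

print(smallest_base(13))  # 3
```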

Algorithm for converting decimal fractions to negadecimal?

I would like to know how to convert fractional values (say, -.06) into negadecimal or another negative base. I know -.06 is .14 in negadecimal, because I can do the conversion the other way around, but the regular algorithm used for converting fractions into other bases doesn't work with a negative base. Don't give a code example; just explain the steps required.
The regular algorithm works like this:
You multiply the value by the base you're converting into, record the whole-number part, then keep going with the remaining fraction part until there is no more fraction:
0.337 in binary:
0.337*2 = 0.674 "0"
0.674*2 = 1.348 "1"
0.348*2 = 0.696 "0"
0.696*2 = 1.392 "1"
0.392*2 = 0.784 "0"
0.784*2 = 1.568 "1"
0.568*2 = 1.136 "1"
Approximately .0101011
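In Python, the repeated-multiply algorithm described above looks like this (a sketch; the naming is mine):

```python
def frac_to_base(f, base, ndigits=7):
    # Repeatedly multiply by the base; the integer parts are the digits.
    out = []
    for _ in range(ndigits):
        f *= base
        d = int(f)
        out.append(str(d))
        f -= d
    return "0." + "".join(out)

print(frac_to_base(0.337, 2))  # 0.0101011
```

As the question notes, this breaks down for a negative base, because the "digit" would have to come out of int(f) in a range the method never produces.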
I have a two-step algorithm for doing the conversion. I'm not sure if this is the optimal algorithm, but it works pretty well.
The basic idea is to start off by getting a decimal representation of the number, then converting that decimal representation into a negadecimal representation by handling the even powers and odd powers separately.
Here's an example that motivates the idea behind the algorithm. This is going to go into a lot of detail, but ultimately will arrive at the algorithm and at the same time show where it comes from.
Suppose we want to convert the number 0.523598734 to negadecimal (notice that I'm presupposing you can convert to decimal). Notice that
0.523598734 = 5 * 10^-1
+ 2 * 10^-2
+ 3 * 10^-3
+ 5 * 10^-4
+ 9 * 10^-5
+ 8 * 10^-6
+ 7 * 10^-7
+ 3 * 10^-8
+ 4 * 10^-9
Since 10^-n = (-10)^-n when n is even, we can rewrite this as
0.523598734 = 5 * 10^-1
+ 2 * (-10)^-2
+ 3 * 10^-3
+ 5 * (-10)^-4
+ 9 * 10^-5
+ 8 * (-10)^-6
+ 7 * 10^-7
+ 3 * (-10)^-8
+ 4 * 10^-9
Rearranging and regrouping terms gives us this:
0.523598734 = 2 * (-10)^-2
+ 5 * (-10)^-4
+ 8 * (-10)^-6
+ 3 * (-10)^-8
+ 5 * 10^-1
+ 3 * 10^-3
+ 9 * 10^-5
+ 7 * 10^-7
+ 4 * 10^-9
If we could rewrite those negative terms as powers of -10 rather than powers of 10, we'd be done. Fortunately, we can make a nice observation: if d is a nonzero digit (1, 2, ..., or 9), then
d * 10^-n + (10 - d) * 10^-n
= 10^-n (d + 10 - d)
= 10^-n (10)
= 10^{-n+1}
Restated in a different way:
d * 10^-n + (10 - d) * 10^-n = 10^{-n+1}
Therefore, we get this useful fact:
d * 10^-n = 10^{-n+1} - (10 - d) * 10^-n
If we assume that n is odd, then -10^-n = (-10)^-n and 10^{-n+1} = (-10)^{-n+1}. Therefore, for odd n, we see that
d * 10^-n = 10^{-n+1} - (10 - d) * 10^-n
= (-10)^{-n+1} + (10 - d) * (-10)^-n
Think about what this means in a negadecimal setting. We've turned a power of ten into a sum of two powers of minus ten.
Applying this to our summation gives this:
0.523598734 = 2 * (-10)^-2
+ 5 * (-10)^-4
+ 8 * (-10)^-6
+ 3 * (-10)^-8
+ 5 * 10^-1
+ 3 * 10^-3
+ 9 * 10^-5
+ 7 * 10^-7
+ 4 * 10^-9
= 2 * (-10)^-2
+ 5 * (-10)^-4
+ 8 * (-10)^-6
+ 3 * (-10)^-8
+ (-10)^0 + 5 * (-10)^-1
+ (-10)^-2 + 7 * (-10)^-3
+ (-10)^-4 + 1 * (-10)^-5
+ (-10)^-6 + 3 * (-10)^-7
+ (-10)^-8 + 6 * (-10)^-9
Regrouping gives this:
0.523598734 = (-10)^0
+ 5 * (-10)^-1
+ 2 * (-10)^-2 + (-10)^-2
+ 7 * (-10)^-3
+ 5 * (-10)^-4 + (-10)^-4
+ 1 * (-10)^-5
+ 8 * (-10)^-6 + (-10)^-6
+ 3 * (-10)^-7
+ 3 * (-10)^-8 + (-10)^-8
+ 6 * (-10)^-9
Overall, this gives a negadecimal representation of 1.537619346ND
Now, let's think about this at a negadigit level. Notice that
Digits in even-numbered positions are mostly preserved.
Digits in odd-numbered positions are flipped: any nonzero, odd-numbered digit is replaced by 10 minus that digit.
Each time an odd-numbered digit is flipped, the preceding digit is incremented.
Let's look at 0.523598734 and apply this algorithm directly. We start by flipping all of the odd-numbered digits to give their 10's complement:
0.523598734 --> 0.527518336
Next, we increment the even-numbered digits preceding all flipped odd-numbered digits:
0.523598734 --> 0.527518336 --> 1.537619346ND
This matches our earlier number, so it looks like we have the makings of an algorithm!
Things get a bit trickier, unfortunately, when we start working with decimal values involving the number 9. For example, let's take the number 0.999. Applying our algorithm, we start by flipping all the odd-numbered digits:
0.999 --> 0.191
Now, we increment all the even-numbered digits preceding a column that had a value flipped:
0.999 --> 0.191 --> 1.1(10)1
Here, the (10) indicates that the column containing a 9 overflowed to a 10. Clearly this isn't allowed, so we have to fix it.
To figure out how to fix this, it's instructive to look at how to count in negadecimal. Here's how to count from 0 to 110:
000
001
002
003
...
008
009
190
191
192
193
194
...
198
199
180
181
...
188
189
170
...
118
119
100
101
102
...
108
109
290
Fortunately, there's a really nice pattern here. The basic mechanism works like normal base-10 incrementing: increment the last digit, and if it overflows, carry a 1 into the next column, continuing to carry until everything stabilizes. The difference here is that the odd-numbered columns work in reverse. If you increment the -10s digit, for example, you actually subtract one rather than adding one, since increasing the value in that column by 10 corresponds to having one fewer -10 included in your sum. If that number underflows at 0, you reset it back to 9 (subtracting 90), then increment the next column (adding 100).
In other words, the general algorithm for incrementing a negadecimal number works like this:
Start at the 1's column.
If the current column is at an even-numbered position:
Add one.
If the value reaches 10, set it to zero, then apply this procedure to the preceding column.
If the current column is at an odd-numbered position:
Subtract one.
If the value reaches -1, set it to 9, then apply this procedure to the preceding column.
You can confirm that this math works by generalizing the above reasoning about -10s digits and 100s digits: overflowing an even-numbered column corresponding to 10^k means that you need to add in 10^(k+1), which means that you need to decrement the previous column by one, and underflowing an odd-numbered column works by subtracting out 9 · 10^k, then adding in 10^(k+1).
Let's go back to our example at hand. We're trying to convert 0.999 into negadecimal, and we've gotten to
0.999 --> 0.191 --> 1.1(10)1
To fix this, we'll take the 10's column and reset it back to 0, then carry the 1 into the previous column. That's an odd-numbered column, so we decrement it. This gives the final result:
0.999 --> 0.191 --> 1.1(10)1 --> 1.001ND
Overall, for positive numbers, we have the following algorithm for doing the conversion:
Processing digits from left to right:
If you're at an odd-numbered digit that isn't zero:
Replace the digit d with the digit 10 - d.
Using the standard negadecimal addition algorithm, increment the value in the previous column.
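Here is a Python sketch of this positive-number algorithm (my naming; digits[0] is the units place, digits[i] the 10^-i place, and I assume carries never run past the units column):

```python
def to_negadecimal_fraction(digits):
    # Convert a positive decimal fraction 0.d1 d2 d3 ... to negadecimal
    # by flipping odd-position digits and carrying; modifies in place.
    def increment_col(d, i):
        # Negadecimal increment, following the counting rules above.
        if i < 0:
            raise ValueError("carry ran past the units place (sketch limit)")
        if i % 2 == 0:              # even column: add one, carry on 10
            d[i] += 1
            if d[i] == 10:
                d[i] = 0
                increment_col(d, i - 1)
        else:                       # odd column: subtract one, carry on -1
            d[i] -= 1
            if d[i] == -1:
                d[i] = 9
                increment_col(d, i - 1)

    for i in range(1, len(digits)):
        if i % 2 == 1 and digits[i] != 0:
            digits[i] = 10 - digits[i]    # flip the odd-position digit
            increment_col(digits, i - 1)  # ...and bump the previous column
    return digits

print(to_negadecimal_fraction([0, 5, 2, 3, 5, 9, 8, 7, 3, 4]))
# [1, 5, 3, 7, 6, 1, 9, 3, 4, 6], i.e. 1.537619346ND
print(to_negadecimal_fraction([0, 9, 9, 9]))  # [1, 0, 0, 1], i.e. 1.001ND
```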
Of course, negative numbers are a whole other story. With negative numbers, the odd columns are correct and the even columns need to be flipped, since the parity of the (-10)^k terms in the summation flips. Consequently, for negative numbers, you apply the above algorithm, but preserve the odd columns and flip the even columns. Similarly, instead of incrementing the preceding digit when doing a flip, you decrement the preceding digit.
As an example, suppose we want to convert -0.523598734 into negadecimal. Applying the algorithm gives this:
-0.523598734 --> 0.583592774 --> 0.6845(10)2874 --> 0.684402874ND
This is indeed the correct representation.
Hope this helps!
For your question I thought about this object-oriented code, though I am not sure about it. This class takes two negadecimal numbers with an operator, creates an equation, and converts those numbers to decimals.
public class NegadecimalNumber {
    private int number1;
    private char operator;
    private int number2;

    public NegadecimalNumber(int a, char op, int b) {
        this.number1 = a;
        this.operator = op;
        this.number2 = b;
    }

    // Interpret the decimal digits of a as negadecimal digits and
    // return the decimal value, e.g. 14 -> 1 * (-10) + 4 = -6.
    public int ConvertNumber1(int a) {
        int value = 0;
        int place = 1; // (-10)^0, (-10)^1, ...
        while (a > 0) {
            value += (a % 10) * place;
            a /= 10;
            place *= -10;
        }
        return value;
    }

    public int ConvertNumber2(int b) {
        int value = 0;
        int place = 1;
        while (b > 0) {
            value += (b % 10) * place;
            b /= 10;
            place *= -10;
        }
        return value;
    }

    public double Equation() {
        double ans = 0;
        if (this.operator == '+') {
            ans = this.number1 + this.number2;
        } else if (this.operator == '-') {
            ans = this.number1 - this.number2;
        } else if (this.operator == '*') {
            ans = this.number1 * this.number2;
        } else if (this.operator == '/') {
            ans = (double) this.number1 / this.number2;
        }
        return ans;
    }
}
Note that https://en.wikipedia.org/wiki/Negative_base#To_Negative_Base tells you how to convert whole numbers to a negative base. So one way to solve the problem is simply to multiply the fraction by a high enough power of 100 to turn it into a whole number, convert, and then divide again: -0.06 = -6 / 100 => 14/100 = 0.14.
Another way is to realise that you are trying to create a sum of the form -a/10 + b/100 - c/1000 + d/10000 ... to approximate the target number, so you want to reduce the error as much as possible at each stage, but you need to leave an error in a direction that you can correct at the next stage. Note that this also means a converted fraction might not start with 0.; for example 0.5 => 1.5 = 1 - 5/10.
So, to convert -0.06: it is negative, and the first digit after the decimal point covers the range [0.0, -0.1 .. -0.9], so we start with 0., leaving -0.06 to convert. If the first digit after the decimal point were 0, I would have -0.06 left, which is in the wrong direction to correct with the following digits, so I need to choose the first digit after the decimal point to produce an approximation below my target -0.06. So I choose 0.1, which is actually -0.1 and leaves me with an error of +0.04, which I can convert exactly, giving the conversion 0.14.
So at each point output the digit which gives you either
1) The exact result, in which case you are finished
2) An approximation which is slightly larger than the target number, if the next digit will be negative.
3) An approximation which is slightly smaller than the target number, if the next digit will be positive.
And if you start off trying to approximate a number in the range (-1.0, 0.0] at each point you can choose a digit which keeps the remaining error small enough and in the right direction, so this always works.
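The first suggestion (scale by a power of 100, convert the whole number, divide again) can be sketched in Python; the whole-number step follows the standard negative-base division algorithm referenced above, and the names are mine:

```python
def int_to_negadecimal(n):
    # Whole-number conversion to base -10: repeated division,
    # forcing a non-negative remainder at each step.
    if n == 0:
        return "0"
    digits = []
    while n != 0:
        n, r = divmod(n, -10)
        if r < 0:           # push the remainder into 0..9
            r += 10
            n += 1
        digits.append(str(r))
    return "".join(reversed(digits))

def frac_to_negadecimal(num, k):
    # Convert num / 10**k; make k even so that 10**k == (-10)**k.
    if k % 2:
        num *= 10
        k += 1
    s = int_to_negadecimal(num).rjust(k + 1, "0")
    return s[:-k] + "." + s[-k:]

print(frac_to_negadecimal(-6, 2))  # 0.14  (i.e. -0.06)
print(frac_to_negadecimal(5, 1))   # 1.50  (i.e. 0.5)
```

This only handles fractions that terminate in decimal, which is exactly the scale-and-divide trick's scope.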
