I am trying to write an algorithm to establish the correlation between n-bit integers for the value "1".
Here is an example of a 5-bit integer: 0,1,0,0,1
I want to establish the percentage of correlation between this integer and a set of N other integers.
For example, integer A (0,1,0,0,1) and integer B (0,1,0,0,0) have a correlation of 0.5 for the value "1", since only one of A's two "1" bits (the second one) is also set in B.
In my Firebase database, I have one n-bit integer attached to each user_ID, which I want to match against the n-bit integer of every other user of my application to get a kind of correlation between each pair of users.
The distribution of the total correlations between users will follow a Gaussian curve, which I want to use in the future to match users with each other.
For example, I want user A to be matched with every other user, with these matches sorted in decreasing order of affinity (from high to low correlation between their n-bit integers).
Does anyone have an idea how I could write an algorithm to establish the correlations between the N users and then another one to sort these correlations from high to low?
Any help would be greatly appreciated.
Thank you for your time,
Maxime
You can use the AND operation to get the result R.
Example:
A = 9 = 01001
B = 8 = 01000
C = 7 = 00111
D = 31 = 11111
R = A & B gives 8 = 01000; counting the ones, the correlation is R/A = 1/2 = 0.5.
R = A & C gives 1 = 00001; the correlation is R/A = 1/2 = 0.5.
R = A & D gives 9 = 01001; R/A = 2/2 = 1.
Here we have a problem. You can solve it by dividing by the larger count of ones occurring in the two operands, i.e. R/max(A, D).
I believe it is better to use the total bit count (here 5).
The results would then be:
corr AB = 1/5 = 0.2
corr AC = 1/5 = 0.2
corr AD = 2/5 = 0.4
corr CD = 3/5 = 0.6
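To tie this back to the matching and sorting question, here is a minimal Python sketch; the patterns dict and the helper names are my own illustration (fetching the integers from Firebase is left out), and it normalizes by the total bit count as suggested above:

def correlation(a, b, n_bits):
    # Fraction of the n bit positions where both integers have a 1.
    return bin(a & b).count("1") / n_bits

def ranked_matches(user_id, patterns, n_bits):
    # patterns is a plain dict {user_id: n-bit integer}; returns
    # (other_id, correlation) pairs sorted from high to low.
    a = patterns[user_id]
    scores = [(other, correlation(a, b, n_bits))
              for other, b in patterns.items() if other != user_id]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Example with the 5-bit integers above: A=01001, B=01000, C=00111, D=11111.
users = {"A": 0b01001, "B": 0b01000, "C": 0b00111, "D": 0b11111}
print(ranked_matches("A", users, 5))   # D (0.4) first, then B and C (0.2 each)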
This question is taken from an ACM-ICPC Romanian archive.
You are given T tuples of the form (N, P); for every tuple, find the smallest number X such that X % P == N. If this is not possible, print -1. X can only be formed using digits from the set {2, 3, 5, 7}.
Example :
3
52 100
11 100
51 1123
Output for given example :
52
-1
322352
Restrictions :
1 ≤ P ≤ 5 * 10^6
1 ≤ N ≤ P - 1
I attempted solving this problem by using a recursive function that would build numbers with digits from the given set and check if the condition is met, but that is way too slow because I have no idea when to stop searching (i.e. when there's no solution for the given tuple).
The author hints at using BFS somehow, but I really don't see any way to construct a meaningful graph using the input data of this problem.
How would you approach solving this problem?
You can solve this with a BFS, starting from 0, where adjacent vertices to a number n are 10n+2, 10n+3, 10n+5 and 10n+7. By keeping a record of all numbers mod p already queued, one can reduce the size of the search space, but more importantly know when the whole space has been searched.
Here's a simple Python implementation:
import collections

def ns(n, p):
    # BFS over numbers built from the digits 2, 3, 5, 7, smallest first.
    q = collections.deque([0])
    done = set()          # residues mod p already enqueued
    while q:
        x = q.popleft()
        for d in [2, 3, 5, 7]:
            nn = 10 * x + d
            if nn % p in done:
                continue
            if nn % p == n:
                return nn
            q.append(nn)
            done.add(nn % p)
    return -1
assert ns(52, 100) == 52
assert ns(11, 100) == -1
assert ns(51, 1123) == 322352
assert ns(0, 55) == 55
I am given an array and I have to find the combinations that add up to a targeted sum.
For example:
A[] = {1, 2, 3};
S = 5;
Total combinations = {1,1,1,1,1}, {2,3}, {3,2}, {1,1,3}, {1,3,1}, {3,1,1}, and other possible sequences.
I know it sounds like the coin change problem, but the difficulty is how to count the combinations when order matters, i.e. {2,3} and {3,2} are 2 different solutions.
In the original coin change problem, you "choose" an arbitrary coin and "guess" whether or not it is in the solution; this works because the order is not important.
Here, you will have to iterate over all possibilities for "which coin comes first", until you are done:
D(0) = 1
D(x) = 0                                                            if x < 0
D(x) = sum { D(x-coins[0]), D(x-coins[1]), ..., D(x-coins[n-1]) }   otherwise
Note that at each step you try every possibility for the next coin and move on. In the end, you sum up the solutions over all possibilities for the coin placed at the head of the solution.
The complexity of this solution using DP is O(n*S), where n is the number of coins and S is the desired sum.
Matlab code (written in imperative style; Matlab happens to be my currently open IDE, sorry it's not a more common language like Java or C):
function [ n ] = make_change( coins, x )
    % D(k) = number of ordered ways (compositions) to form the sum k
    D = zeros(x,1);
    for k = 1:x
        for t = 1:length(coins)
            curr = k - coins(t);
            if curr > 0
                D(k) = D(k) + D(curr);
            elseif curr == 0
                D(k) = D(k) + 1;
            end
        end
    end
    n = D(x);
end
Invoking it yields:
>> make_change([1,2,3],5)
ans =
13
which is correct, since all the possibilities are [1,1,1,1,1], [1,1,1,2]*4, [1,1,3]*3, [1,2,2]*3, [2,3]*2 = 13.
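For readers who would rather not read Matlab, here is the same bottom-up DP as a minimal Python sketch (the function name and structure are mine, mirroring the Matlab above):

def make_change(coins, s):
    # D[k] = number of ordered ways (compositions) to form the sum k.
    D = [0] * (s + 1)
    D[0] = 1                      # one way to make 0: use no coins
    for k in range(1, s + 1):
        for c in coins:
            if k - c >= 0:
                D[k] += D[k - c]  # c is the first coin, D[k-c] ways for the rest
    return D[s]

print(make_change([1, 2, 3], 5))  # 13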
I've got another interesting programming/mathematical problem.
For a given natural number q from the interval [2; 10000], find the number n
which is equal to the sum of the q-th powers of its digits modulo 2^64.
For example: for q=3, n=153; for q=5, n=4150.
I wasn't sure whether this problem fits better on math.se or Stack Overflow, but this was a programming task which my friend told me about quite a long time ago. Now I remembered it and would like to know how such things can be done. How should I approach this?
There are two key points:
the range of possible solutions is bounded,
any group of numbers whose digits are the same up to permutation can contain at most one solution.
Let us take a closer look at the case q = 2. If a d-digit number n is equal to the sum of the squares of its digits, then
n >= 10^(d-1) // because it's a d-digit number
n <= d*9^2 // because each digit is at most 9
and the condition 10^(d-1) <= d*81 is easily translated to d <= 3, or n < 1000. That's not many numbers to check; a brute force over those is fast. For q = 3, the condition 10^(d-1) <= d*729 yields d <= 4, still not many numbers to check.

We could find smaller bounds by analysing further: for q = 2, the sum of the squares of at most three digits is at most 243, so a solution must be less than 244. The maximal sum of squares of digits in that range is reached for 199: 1² + 9² + 9² = 163; continuing, one can easily find that a solution must be less than 100. (The only solution for q = 2 is 1.) For q = 3, the maximal sum of four cubes of digits is 4*729 = 2916; continuing, we can see that all solutions for q = 3 are less than 1000. But that sort of improvement of the bound is only useful for small exponents; once the sum of the powers of the digits can exceed the modulus, it breaks down. Therefore I stop at finding the maximal possible number of digits.
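As a quick illustration of this digit-count bound, here is a small Python sketch (my own addition, separate from the full solution below) that brute-forces all solutions below 10^d for a given small exponent:

def small_solutions(q, max_digits):
    # All n < 10**max_digits equal to the sum of the q-th powers of their digits.
    return [n for n in range(1, 10 ** max_digits)
            if n == sum(int(c) ** q for c in str(n))]

print(small_solutions(2, 3))   # [1]                        d <= 3 suffices for q = 2
print(small_solutions(3, 4))   # [1, 153, 370, 371, 407]    d <= 4 suffices for q = 3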
Now, without the modulus, the bound on the number of digits, for the sum of the q-th powers of the digits, would be approximately
q*log10(9) + 1, i.e. roughly q - (q/20) + 1
so for larger q, the range of possible solutions obtained from that is huge.
But two points come to the rescue here, first the modulus, which limits the solution space to 2 <= n < 2^64, at most 20 digits, and second, the permutation-invariance of the (modular) digital power sum.
The permutation invariance means that we only need to construct monotonous sequences of d digits, calculate the sum of the q-th powers and check whether the number thus obtained has the correct digits.
Since the number of monotonous d-digit sequences is comparably small, a brute-force using that becomes feasible. In particular if we ignore digits not contributing to the sum (0 for all exponents, 8 for q >= 22, also 4 for q >= 32, all even digits for q >= 64).
The number of monotonous sequences of length d using s symbols is
binom(s+d-1, d)
For us, s is at most 9 and d <= 20; summing from d = 1 to d = 20, there are at most 10015004 sequences to consider for each exponent. That's not too much.
Still, doing that for all q under consideration takes a long time, but if we take into account that for q >= 64 we have x^q % 2^64 == 0 for all even digits x, we need only consider sequences composed of odd digits, and the total number of monotonous sequences of length at most 20 using 5 symbols is binom(20+5,20) - 1 = 53129. Now, that looks good.
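These counts are easy to verify numerically with the formula above (a short Python check):

from math import comb

# Non-decreasing sequences of length d over s symbols: C(s + d - 1, d).
total_9 = sum(comb(9 + d - 1, d) for d in range(1, 21))   # s = 9: all non-zero digits
total_5 = sum(comb(5 + d - 1, d) for d in range(1, 21))   # s = 5: odd digits only
print(total_9, total_5)   # 10015004 53129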
Summary
We consider a function f mapping digits to natural numbers and are looking for solutions of the equation
n == (sum [f(d) | d <- digits(n)] `mod` 2^64)
where digits maps n to the list of its digits.
From f, we build a function F from lists of digits to natural numbers,
F(list) = sum [f(d) | d <- list] `mod` 2^64
Then we are looking for fixed points of G = F ∘ digits. Now n is a fixed point of G if and only if digits(n) is a fixed point of H = digits ∘ F. Hence we may equivalently look for fixed points of H.
But F is permutation-invariant, so we can restrict ourselves to sorted lists and consider K = sort ∘ digits ∘ F.
Fixed points of H and of K are in one-to-one correspondence. If list is a fixed point of H, then sort(list) is a fixed point of K, and if sortedList is a fixed point of K, then H(sortedList) is a permutation of sortedList, hence H(H(sortedList)) = H(sortedList), in other words, H(sortedList) is a fixed point of K, and sort resp. H are bijections between the set of fixed points of H and K.
A further improvement is possible if some f(d) are 0 (modulo 2^64). Let compress be a function that removes digits with f(d) mod 2^64 == 0 from a list of digits and consider the function L = compress ∘ K.
Since F ∘ compress = F, if list is a fixed point of K, then compress(list) is a fixed point of L. Conversely, if clist is a fixed point of L, then K(clist) is a fixed point of K, and compress resp. K are bijections between the sets of fixed points of L resp. K. (And H(clist) is a fixed point of H, and compress ∘ sort resp. H are bijections between the sets of fixed points of L resp. H.)
The space of compressed sorted lists of at most d digits is small enough to brute-force for the functions f under consideration, namely power functions.
So the strategy is:
Find the maximal number d of digits to consider (bounded by 20 due to the modulus, smaller for small q).
Generate the compressed monotonic sequences of up to d digits.
Check whether the sequence is a fixed point of L, if it is, F(sequence) is a fixed point of G, i.e. a solution of the problem.
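Before the full Haskell program in the Code section below, here is a minimal Python sketch of the same strategy; the names, the default 20-digit cap, and the per-exponent digit limit passed in by hand are my own simplifications:

from itertools import combinations_with_replacement

MOD = 2 ** 64

def fixed_points(q, max_len=20):
    # All n with n == (sum of d**q over the digits d of n) mod 2**64.
    # Digits whose q-th power is 0 mod 2^64 are "compressed" away.
    contributing = [d for d in range(1, 10) if pow(d, q, MOD) != 0]
    power = {d: pow(d, q, MOD) for d in contributing}
    solutions = set()
    for length in range(1, max_len + 1):
        # combinations_with_replacement yields exactly the non-decreasing sequences.
        for seq in combinations_with_replacement(contributing, length):
            n = sum(power[d] for d in seq) % MOD
            # Fixed point of L: the compressed, sorted digits of n reproduce the sequence.
            if sorted(d for d in map(int, str(n)) if d in power) == list(seq):
                solutions.add(n)
    return sorted(solutions)

# 6 digits suffice for q = 5 by the digit-count bound discussed above.
print(fixed_points(5, max_len=6))   # [1, 4150, 4151, 54748, 92727, 93084, 194979]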
Code
Fortunately, you haven't specified a language, so I went for the option of simplest code, i.e. Haskell:
{-# LANGUAGE CPP #-}
module Main (main) where

import Data.List
import Data.Array.Unboxed
import Data.Word
import Text.Printf

#include "MachDeps.h"

#if WORD_SIZE_IN_BITS == 64
type UINT64 = Word
#else
type UINT64 = Word64
#endif

-- Maximal number of digits a solution can have, capped at 20 by the modulus.
maxDigits :: UINT64 -> Int
maxDigits mx = min 20 $ go d0 (10^(d0-1)) start
  where
    d0 = floor (log (fromIntegral mx) / log 10) + 1
    mxi :: Integer
    mxi = fromIntegral mx
    start = mxi * fromIntegral d0
    go d p10 mmx
      | p10 > mmx = d-1
      | otherwise = go (d+1) (p10*10) (mmx+mxi)

sortedDigits :: UINT64 -> [UINT64]
sortedDigits = sort . digs
  where
    digs 0 = []
    digs n = case n `quotRem` 10 of
               (q,r) -> r : digs q

-- All non-decreasing sequences of length d over the given (sorted) symbols.
generateSequences :: Int -> [a] -> [[a]]
generateSequences 0 _ = [[]]
generateSequences d [x] = [replicate d x]
generateSequences d (x:xs) =
    [replicate k x ++ tl | k <- [d,d-1 .. 0], tl <- generateSequences (d-k) xs]
generateSequences _ _ = []

fixedPoints :: (UINT64 -> UINT64) -> [UINT64]
fixedPoints digFun = sort . map listNum . filter okSeq $
    [ds | d <- [1 .. mxdigs], ds <- generateSequences d contDigs]
  where
    funArr :: UArray UINT64 UINT64
    funArr = array (0,9) [(i,digFun i) | i <- [0 .. 9]]
    mxval = maximum (elems funArr)
    contDigs = filter ((/= 0) . (funArr !)) [0 .. 9]   -- contributing digits
    mxdigs = maxDigits mxval
    listNum = sum . map (funArr !)
    numFun = listNum . sortedDigits
    listFun = inter . sortedDigits . listNum
    inter = go contDigs
      where
        go cds@(c:cs) dds@(d:ds)
          | c < d     = go cs dds
          | c == d    = c : go cds ds
          | otherwise = go cds ds
        go _ _ = []
    okSeq ds = ds == listFun ds

solve :: Int -> IO ()
solve q = do
    printf "%d:\n " q
    print (fixedPoints (^q))

main :: IO ()
main = mapM_ solve [2 .. 10000]
It's not optimised, but as is, it finds all solutions for 2 <= q <= 10000 in a little below 50 minutes on my box, starting with
2:
[1]
3:
[1,153,370,371,407]
4:
[1,1634,8208,9474]
5:
[1,4150,4151,54748,92727,93084,194979]
6:
[1,548834]
7:
[1,1741725,4210818,9800817,9926315,14459929]
8:
[1,24678050,24678051,88593477]
9:
[1,146511208,472335975,534494836,912985153]
10:
[1,4679307774]
11:
[1,32164049650,32164049651,40028394225,42678290603,44708635679,49388550606,82693916578,94204591914]
And ending with
9990:
[1,12937422361297403387,15382453639294074274]
9991:
[1,16950879977792502812]
9992:
[1,2034101383512968938]
9993:
[1]
9994:
[1,9204092726570951194,10131851145684339988]
9995:
[1]
9996:
[1,10606560191089577674,17895866689572679819]
9997:
[1,8809232686506786849]
9998:
[1]
9999:
[1]
10000:
[1,11792005616768216715]
The exponents from about 10 to 63 take longest (individually, not cumulatively); there's a remarkable speedup from exponent 64 on due to the reduced search space.
Here is a brute force solution that will find all such n, including 1 and any other n greater than the first, within whatever range you choose (in this case I chose base^q as my range limit). You could modify it to ignore the special case of 1 and also to return after the first result. It's in C#, but it might look nicer in a language with a ** exponentiation operator. You could also pass in your q and base as parameters.
int q = 5;
int radix = 10;

for (int input = 1; input < (int)Math.Pow(radix, q); input++)
{
    int sum = 0;
    for (int i = 1; i < (int)Math.Pow(radix, q); i *= radix)
    {
        int x = input / i % radix;     // get current digit
        sum += (int)Math.Pow(x, q);    // x**q
    }
    if (sum == input)
    {
        Console.WriteLine("Hooray: {0}", input);
    }
}
So, for q = 5 the results are:
Hooray: 1
Hooray: 4150
Hooray: 4151
Hooray: 54748
Hooray: 92727
Hooray: 93084
The following is text from Data Structures and Algorithm Analysis by Mark Allen Weiss.
In the following, x(i+1) should be read as x subscript i+1, and x(i) should be read as x subscript i.
x(i + 1) = (a*x(i))mod m.
It is also common to return a random real number in the open interval
(0, 1) (0 and 1 are not possible values); this can be done by
dividing by m. From this, a random number in any closed interval [a,
b] can be computed by normalizing.
The problem with this routine is that the multiplication could
overflow; although this is not an error, it affects the result and
thus the pseudo-randomness. Schrage gave a procedure in which all of
the calculations can be done on a 32-bit machine without overflow. We
compute the quotient and remainder of m/a and define these as q and
r, respectively.
In our case for M=2,147,483,647 A =48,271, q = 127,773, r = 2,836, and r < q.
We have
x(i+1) = (a * x(i)) mod m                       ---> Eq 1
       = a*x(i) - m * floor(a*x(i) / m)         ---> Eq 2
The author also mentions:
x(i) = q * floor(x(i) / q) + (x(i) mod q)       ---> Eq 3
My questions:
1. What does the author mean by a random number being "computed by normalizing"?
2. How did the author get Eq 2 from Eq 1?
3. How did the author arrive at Eq 3?
Normalizing means that if you have X ∈ [0,1] and you need to get Y ∈ [a, b], you can compute
Y = a + X * (b - a)
EDIT:
2. Let's suppose
a = 3, x = 5, m = 9
Then we have a*x = 15 and (a*x) mod m = 15 mod 9 = 6, which is what Eq 1 produces.
We can also write 15 = [ax/m]*m + 6, where [ax/m] means the integer part (here [15/9] = 1).
We need to get the 6: 15 - [ax/m]*m = 6 => ax - [ax/m]*m = ax mod m => x(i+1) = a*x(i) - [a*x(i)/m]*m, which is Eq 2.
If you have a random number in the range [0,1], you can get a number in the range [2,5] (for example) by multiplying by 3 and adding 2.
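A small Python check of Eq 1 against Eq 2 and of the normalization step; the seed and the target interval are arbitrary choices of mine, while m and a are the values quoted above:

m = 2147483647            # 2^31 - 1
a = 48271
x = 12345                 # arbitrary seed

eq1 = (a * x) % m                      # Eq 1
eq2 = a * x - m * ((a * x) // m)       # Eq 2: a*x - m*floor(a*x/m)
assert eq1 == eq2

u = eq1 / m                            # random real in the open interval (0, 1)
lo, hi = 2.0, 5.0
y = lo + u * (hi - lo)                 # normalizing: Y = a + X*(b - a)
print(eq1, round(y, 4))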