Linear time complexity ranking algorithm when the orders are precomputed - algorithm

I am trying to write an efficient ranking algorithm in C++ but I will present my case in R as it is far easier to understand this way.
> samples_x <- c(4, 10, 9, 2, NA, 3, 7, 1, NA, 8)
> samples_y <- c(5, 7, 9, NA, 1, 4, NA, 8, 2, 10)
> orders_x <- order(samples_x)
> orders_y <- order(samples_y)
> cbind(samples_x, orders_x, samples_y, orders_y)
      samples_x orders_x samples_y orders_y
 [1,]         4        8         5        5
 [2,]        10        4         7        9
 [3,]         9        6         9        6
 [4,]         2        1        NA        1
 [5,]        NA        7         1        2
 [6,]         3       10         4        8
 [7,]         7        3        NA        3
 [8,]         1        2         8       10
 [9,]        NA        5         2        4
[10,]         8        9        10        7
Suppose the above is already precomputed. Performing a simple ranking on each of the sample sets takes linear time complexity (the result is much like the rank function):
> ranks_x <- rep(0, length(samples_x))
> for (i in 1:length(samples_x)) ranks_x[orders_x[i]] <- i
For a work project I am working on, it would be useful for me to emulate the following behaviour in linear time complexity:
> cc <- complete.cases(samples_x, samples_y)
> ranks_x <- rank(samples_x[cc])
> ranks_y <- rank(samples_y[cc])
The complete.cases function, when given n sets of the same length, returns a logical vector marking the positions at which none of the sets contain NAs. The order function returns the permutation of indices corresponding to the sorted sample set. The rank function returns the ranks of the sample set.
How to do this? Let me know if I have provided sufficient information as to the problem in question.
More specifically, I am trying to build a correlation matrix based on Spearman's rank correlation coefficient in such a way that NAs are handled properly. The presence of NAs requires that the rankings be recalculated for every pair of sample sets (s^2 n log n for s sample sets of length n); I am trying to avoid that by calculating the orders once for every sample set (s n log n) and then using a linear-time step for every pairwise comparison. Is this even doable?
Thanks in advance.

It looks like, when you work out the rank correlation of two arrays, you want to delete from both arrays elements in positions where either has NA.
You have
for (i in 1:length(samples_x)) ranks_x[orders_x[i]] <- i
Could you change this to something like
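wp <- 0
for (i in seq_along(samples_x)) {
  idx <- orders_x[i]
  # is.na() is required here: comparing with == NA always yields NA
  if (is.na(samples_x[idx]) || is.na(samples_y[idx])) {
    ranks_x[idx] <- NA    # incomplete pair: no rank
  } else {
    wp <- wp + 1          # next rank among the complete cases
    ranks_x[idx] <- wp
  }
}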
Then you could either go along later and compress out the NAs, or hope the correlation subroutine just ignores them.
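The idea above can be sketched in Python (used here only for illustration; the question itself targets C++). This is a minimal sketch under a few assumptions: missing values are represented as None, the values within a sample set are distinct (ties are not averaged the way R's rank does), and the order arrays list missing values last, as R's order does. The names order and pairwise_ranks are hypothetical.

```python
def order(values):
    """0-based indices that sort values ascending, with None (missing) last."""
    return sorted(
        range(len(values)),
        key=lambda i: (values[i] is None, 0 if values[i] is None else values[i]),
    )


def pairwise_ranks(x, y, order_x, order_y):
    """Ranks of x and y restricted to positions where both are present.

    Runs in O(n) given the precomputed orders.
    """
    complete = [a is not None and b is not None for a, b in zip(x, y)]

    def ranks_from(order_):
        ranks = [None] * len(complete)
        next_rank = 0
        for idx in order_:            # walk the indices in sorted order
            if complete[idx]:
                next_rank += 1        # next rank among the complete cases
                ranks[idx] = next_rank
        # compress out the incomplete positions
        return [r for r in ranks if r is not None]

    return ranks_from(order_x), ranks_from(order_y)


samples_x = [4, 10, 9, 2, None, 3, 7, 1, None, 8]
samples_y = [5, 7, 9, None, 1, 4, None, 8, 2, 10]
# The orders are computed once per sample set and reused for every pair.
ranks_x, ranks_y = pairwise_ranks(
    samples_x, samples_y, order(samples_x), order(samples_y)
)
```

With s sample sets, the sorting cost is paid s times and each of the pairwise comparisons then only needs the linear pass.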


Getting minimum possible number after performing operations on array elements

Question: Given an integer n denoting the initial number of particles,
and an array of the sizes of these particles.
These particles can go through any number of simulations (possibly none).
In one simulation, two particles combine to give another particle whose size is the difference between their sizes (possibly 0).
Find the smallest particle that can be formed.
constraints
n<=1000
size<=1e9
Example 1
3
30 10 8
Output
2
Explanation: 10 - 8 = 2 is the smallest we can achieve
Example 2
4
1 2 4 8
output
1
explanation
We cannot make another 1 so as to get 0 so smallest without any simulation is 1
example 3
5
30 27 26 10 6
output
0
30-26=4
10-6 =4
4-4 =0
My thinking: I can only think of the brute force solution which will obviously time out. Can anyone help me out here with just the approach? I think it's related to dynamic programming
I think this can be solved in O(n^2log(n))
Consider your third example: 30 27 26 10 6
Sort the input to make it : 6 10 26 27 30
Build a list of differences for each (i,j) combination.
For:
i = 1 -> 4 20 21 24
i = 2 -> 16, 17, 20
i = 3 -> 1, 4
i = 4 -> 3
There is no list for i = 5. Why? Because it has already been considered in combination with the other particles before.
Now consider the below cases:
Case 1
The particle i is not combined with any other particle yet. This means some other particle should have been combined with a particle other than i.
This suggests us that we need to search for A[i] in the lists j = 1 to N except for j = i.
Get the nearest value. This can be done using binary search. Because your difference lists are sorted! Then your result for now is |A[i] - NearestValueFound|
Case 2
The particle i is combined with some other particle.
Take the example i = 1 above and let's consider that it is combined with particle 2. The result is 4.
So search for 4 in all the lists except list 2, because we consider that particle 2 is already combined with particle 1 and we shouldn't search list 2.
Do we have a best match? It seems we have a match: 4 is found in list 3. The difference needn't be 0, but in this case it is 0, so just return 0.
Repeat Case 1, 2 for all particles. Time complexity is O(n^2log(n)), because you are doing a binary search on all lists for each i except the list i.
import itertools as it
N = int(input())
nums = list()
for i in range(N):
    nums.append(int(input()))
_min = min(nums)
def go(li):
    global _min
    if len(li)>1:
        for i in it.combinations(li, 2):
            temp = abs(i[0] - i[1])
            if _min > temp:
                _min = temp
            k = li.copy()
            k.remove(i[0])
            k.remove(i[1])
            k.append(temp)
            go(k)
go(nums)
print(_min)

MATLAB: Fast creation of random symmetric Matrix with fixed degree (sum of rows)

I am searching for a fast method to create a random matrix A with the following properties:
A = transpose(A)
A(i,i) = 0 for all i
A(i,j) >= 0 for all i, j
sum(A) =~ degree; the sum of rows are randomly distributed by a distribution I want to specify (here =~ means approximate equality).
The distribution degree comes from a matrix orig, specifically degree=sum(orig), thus I know that matrices with this distribution exist.
For example: orig=[0 12 7 5; 12 0 1 9; 7 1 0 3; 5 9 3 0]
orig =
0 12 7 5
12 0 1 9
7 1 0 3
5 9 3 0
sum(orig)=[24 22 11 17];
Now one possible matrix A=[0 11 5 8; 11 0 4 7; 5 4 0 2; 8 7 2 0] is
A =
0 11 5 8
11 0 4 7
5 4 0 2
8 7 2 0
with sum(A)=[24 22 11 17].
I have been trying this for quite some time, but unfortunately my two ideas didn't work:
version 1:
I switch two random elements Nswitch times: A(k1,k3)--; A(k1,k4)++; A(k2,k3)++; A(k2,k4)--; (and the transposed elements as well).
Unfortunately, I need Nswitch = log(E)*E switches (with E=sum(sum(nn))) for the matrices to become sufficiently uncorrelated. As my E > 5,000,000, this is not feasible (in particular, as I need at least 10 such matrices).
version 2:
I create the matrix according to the distribution from scratch. The idea is, to fill every row i with degree(i) numbers, based on the distribution of degree:
nn=orig;
nnR=zeros(size(nn));
for i=1:length(nn)
    degree=sum(nn);
    howmany=degree(i);
    degree(i)=0;
    full=rld_cumsum(degree,1:length(degree));
    rr=randi(length(full),[1,howmany]);
    ff=full(rr);
    xx=i*ones([1,length(ff)]);
    nnR = nnR + accumarray([xx(:),ff(:)],1,size(nnR));
end
A=nnR;
However, while sum(A')=degree, sum(A) systematically deviates from degree, and I am not able to find the reason for that.
Small deviations from degree are fine of course, but there seem to be systematic deviations, in particular if the matrices contain large numbers in some places.
I would be very happy if somebody could show me a fast method for version 1, a reason for the systematic deviation of the distribution in version 2, or a method to create such matrices in a different way. Thank you!
Edit:
This is the problem in matsmath's proposed solution:
Imagine you have the matrix:
orig =
0 12 3 1
12 0 1 9
3 1 0 3
1 9 3 0
with r(i)=[16 22 7 13].
Step 1: r(1)=16, my random integer partition is p(i)=[0 7 3 6].
Step 2: Check that all p(i)<=r(i), which is the case.
Step 3:
My random matrix starts to look like
A =
0 7 3 6
7 0 . .
3 . 0 .
6 . . 0
with the new row sum vector rnew=[r(2)-p(2),...,r(n)-p(n)]=[15 4 7]
Second iteration (here the problem occurs):
Step 1: rnew(1)=15, my random integer partition is p(i)=[0 A B]: rnew(1)=15=A+B.
Step 2: Check that all p(i)<=rnew(i), which gives A<=4, B<=7. So A+B<=11, but A+B has to be 15. contradiction :-/
Edit2:
This is the code representing (to the best of my knowledge) the solution posted by David Eisenstat:
orig=[0 12 3 1; 12 0 1 9; 3 1 0 3; 1 9 3 0];
w=[2.2406 4.6334 0.8174 1.6902];
xfull=zeros(4);
for ii=1:1000
    rndmat=[poissrnd(w(1),1,4); poissrnd(w(2),1,4); poissrnd(w(3),1,4); poissrnd(w(4),1,4)];
    kkk=rndmat.*(ones(4)-eye(4)); % remove diagonal
    hhh=sum(sum(orig))/sum(sum(kkk))*kkk; % normalisation
    xfull=xfull+hhh;
end
xf=xfull/ii;
disp(sum(orig)); % gives [16 22 7 13]
disp(sum(xf)); % gives [14.8337 9.6171 18.0627 15.4865] (obvious systematic problem)
disp(sum(xf')) % gives [13.5230 28.8452 4.9635 10.6683] (which is also systematically different from [16, 22, 7, 13])
Since it's enough to approximately preserve the degree sequence, let me propose a random distribution where each entry above the diagonal is chosen according to a Poisson distribution. My intuition is that we want to find weights w_i such that the i,j entry for i != j has mean w_i*w_j (all of the diagonal entries are zero). This gives us a nonlinear system of equations:
for all i, (sum_{j != i} w_i*w_j) = d_i,
where d_i is the degree of i. Equivalently,
for all i, w_i * (sum_j w_j) - w_i^2 = d_i.
The latter can be solved by applying Newton's method as described below from a starting solution of w_i = d_i / sqrt(sum_j d_j).
Once we have the w_is, we can sample repeatedly using poissrnd to generate samples of multiple Poisson distributions at once.
(If I have time, I'll try implementing this in numpy.)
The Jacobian matrix of the equation system for a 4 by 4 problem is
(w_2 + w_3 + w_4)   w_1                 w_1                 w_1
w_2                 (w_1 + w_3 + w_4)   w_2                 w_2
w_3                 w_3                 (w_1 + w_2 + w_4)   w_3
w_4                 w_4                 w_4                 (w_1 + w_2 + w_3).
In general, let A be a diagonal matrix where A_{i,i} = sum_j w_j - 2*w_i. Let u = [w_1, ..., w_n]' and v = [1, ..., 1]'. The Jacobian can be written J = A + u*v'. The inverse is given by the Sherman--Morrison formula
J^-1 = (A + u*v')^-1 = A^-1 - (A^-1*u*v'*A^-1) / (1 + v'*A^-1*u).
For the Newton step, we need to compute J^-1*y for some given y. This can be done straightforwardly in time O(n) using the above equation. I'll add more detail when I get the chance.
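The O(n) Newton step can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function names apply_jacobian_inverse and solve_weights are hypothetical, no diagonal entry of A is assumed to be zero, and convergence from the suggested starting point is not guaranteed for every degree sequence. Sampling from the Poisson distributions is left out.

```python
import math


def apply_jacobian_inverse(w, y):
    """Return J^-1 * y in O(n), with J = A + u*v' as defined above."""
    s = sum(w)
    a = [s - 2.0 * wi for wi in w]                # diagonal of A
    ainv_y = [yi / ai for yi, ai in zip(y, a)]    # A^-1 * y
    ainv_u = [wi / ai for wi, ai in zip(w, a)]    # A^-1 * u, with u = w
    # Sherman-Morrison: A^-1*y - A^-1*u * (v'*A^-1*y) / (1 + v'*A^-1*u)
    scale = sum(ainv_y) / (1.0 + sum(ainv_u))
    return [p - q * scale for p, q in zip(ainv_y, ainv_u)]


def solve_weights(d, max_iter=100, tol=1e-10):
    """Newton iteration for w_i * (sum_j w_j) - w_i^2 = d_i."""
    total = sum(d)
    w = [di / math.sqrt(total) for di in d]       # starting solution from the text
    for _ in range(max_iter):
        s = sum(w)
        residual = [wi * s - wi * wi - di for wi, di in zip(w, d)]
        if max(abs(r) for r in residual) < tol:
            break
        step = apply_jacobian_inverse(w, residual)
        w = [wi - si for wi, si in zip(w, step)]
    return w
```

Each iteration touches every weight a constant number of times, so the cost per Newton step is linear in n, without ever forming the n by n Jacobian.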
First approach (based on version2)
Let the row sum vector given by the matrix orig be [r(1),r(2),...,r(n)].
Step 1. Take a random integer partition of the integer r(1) into exactly n-1 parts, say p(2), p(3), ..., p(n)
Step 2. Check if p(i)<=r(i) for all i=2...n. If not, go to Step 1.
Step 3. Fill out the first row and column of your random matrix with the entries 0, p(2), ... , p(n), and consider the new row sum vector [r(2)-p(2),...,r(n)-p(n)].
Repeat these steps with a matrix of order n-1.
The point is, that you randomize one row at a time, and reduce the problem to searching for a matrix of size one less.
As pointed out by OP in the comment, this naive algorithm fails. The reason is that the matrices in question have a further necessary condition on their entries as follows:
FACT:
If A is an orig matrix with row sums [r(1), r(2), ..., r(n)] then necessarily for every i=1..n it holds that r(i)<=-r(i)+sum(r(j),j=1..n).
That is, any row sum, say the ith, r(i), is necessarily at most as big as the sum of the other row sums (not including r(i)).
In light of this, a revised algorithm is possible. Note that in Step 2b. we check if the new row sum vector has the property discussed above.
Step 1. Take a random integer partition of the integer r(1) into exactly n-1 parts, say p(2), p(3), ..., p(n)
Step 2a. Check if p(i)<=r(i) for all i=2...n. If not, go to Step 1.
Step 2b. Check if r(i)-p(i)<=-r(i)+p(i)+sum(r(j)-p(j),j=2..n) for all i=2..n. If not, go to Step 1.
Step 3. Fill out the first row and column of your random matrix with the entries 0, p(2), ... , p(n), and consider the new row sum vector [r(2)-p(2),...,r(n)-p(n)].
Second approach (based on version1)
I am not sure if this approach gives you random matrices, but it certainly gives you different matrices.
The idea here is to change some parts of your orig matrix locally, in a way which maintains all of its properties.
You should look for a random 2x2 submatrix below the main diagonal which contains strictly positive entries, like [[a,b],[c,d]], and perturb its contents by a random value r to [[a+r,b-r],[c-r,d+r]]. You make the same change above the main diagonal too, to keep your new matrix symmetric. Here the point is that the changes within the entries "cancel" each other out.
Of course, r should be chosen in a way such that b-r>=0 and c-r>=0.
You can pursue this idea to modify larger submatrices too. For example, you might choose 3 random row coordinates r1, r2, and r3 and 3 random column coordinates c1, c2, and c3 and then make changes in your orig matrix at the 9 positions (ri,cj) as follows: you change your 3x3 submatrix [[a b c],[d e f], [g h i]] to [[a-r b+r c] [d+r e f-r], [g h-r i+r]]. You do the same at the transposed places. Again, the random value r must be chosen in a way so that a-r>=0 and f-r>=0 and h-r>=0. Moreover, c1 and r1, and c3 and r3 must be distinct as you can't change the 0 entries in the main diagonal of the matrix orig.
You can repeat such things over and over again, say 100 times, until you find something which looks random. Note that this idea uses the fact that you have existing knowledge of a solution, this is the matrix orig, while the first approach does not use such knowledge at all.
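The 2x2 perturbation described above can be sketched in Python as follows. This is a minimal illustration, not a tuned implementation: the name random_switch is hypothetical, the matrix is a list of lists with integer entries and at least 4 rows, and the four indices are drawn as distinct values so that no diagonal entry is touched. As an additional assumption, r may also be negative as long as every changed entry stays non-negative.

```python
import random


def random_switch(A, rng):
    """Apply one symmetric 2x2 perturbation in place; all row sums are preserved."""
    n = len(A)
    # rows i, j and columns k, l; four distinct indices never hit the diagonal
    i, j, k, l = rng.sample(range(n), 4)
    lo = -min(A[i][k], A[j][l])   # most negative r keeping a+r and d+r >= 0
    hi = min(A[i][l], A[j][k])    # largest r keeping b-r and c-r >= 0
    r = rng.randint(lo, hi)
    for p, q, delta in ((i, k, r), (i, l, -r), (j, k, -r), (j, l, r)):
        A[p][q] += delta
        A[q][p] += delta          # mirrored change keeps the matrix symmetric


rng = random.Random(0)
A = [[0, 12, 7, 5], [12, 0, 1, 9], [7, 1, 0, 3], [5, 9, 3, 0]]
row_sums_before = [sum(row) for row in A]
for _ in range(1000):
    random_switch(A, rng)
row_sums_after = [sum(row) for row in A]
```

How many switches are needed before the result is sufficiently decorrelated from orig is a separate question that this sketch does not address.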

Strategy with regard to how to approach this algorithm?

I was asked this question in a test and I need help with regards to how I should approach the solution, not the actual answer. The question is
You have been given a 7 digit number(with each digit being distinct and 0-9). The number has this property
product of first 3 digits = product of last 3 digits = product of central 3 digits
Identify the middle digit.
Now, I can do this on paper by brute force(trial and error), the product is 72 and digits being
8,1,9,2,4,3,6
Now how do I approach the problem in a no brute force way?
Let the number be: a b c d e f g
So as per the rule(1):
axbxc = cxdxe = exfxg
more over we have(2):
axb = dxe and
cxd = fxg
This question can be solved with factorization and a little bit of trial and error.
Out of the digits from 1 to 9, 5 and 7 can be rejected straight away, since these are prime numbers with no other multiple among the digits, and so would not fit in the above two equations.
The digits 1 to 9 can be factored as:
1 = 1, 2 = 2, 3 = 3, 4 = 2X2, 6 = 2X3, 8 = 2X2X2, 9 = 3X3
After factorization we are now left with total 7 - 2's, 4 - 3's and the number 1.
As for rule 2 we are left with only 4 possibilities, these 4 equations can be computed by factorization logic since we know we have overall 7 2's and 4 3's with us.
1: 1X8(2x2x2) = 2X4(2x2)
2: 1X6(3x2) = 3X2
3: 4(2x2)X3 = 6(3x2)X2
4: 9(3x3)X2 = 6(3x2)X3
Skipping 5 and 7 we are left with 7 digits.
With above equations we have 4 digits with us and are left with remaining 3 digits which can be tested through hit and trial. For example, if we consider the first case we have:
1X8 = 2X4 and are left with 3,6,9.
Since axbxc = cxdxe, we can try each of these 3 options for c; the products would then be 24, 48 and 72.
24 can't be correct, since the last three digits we are left with are 6,9,4 (=216)
48 can't be correct, since the last three digits we are left with are 3,9,4 (=108)
72 could be a valid option, since the last three digits in that case would be 3,6,4 (=72)
This question is good to solve with Relational Programming. I think it very clearly lets the programmer see what's going on and how the problem is solved. While it may not be the most efficient way to solve problems, it can still bring desired clarity and handle problems up to a certain size. Consider this small example from Oz:
fun {FindDigits}
   D1 = {Digit}
   D2 = {Digit}
   D3 = {Digit}
   D4 = {Digit}
   D5 = {Digit}
   D6 = {Digit}
   D7 = {Digit}
   L = [D1 D2 D3] M = [D3 D4 D5] E = [D5 D6 D7] TotL in
   TotL = [D1 D2 D3 D4 D5 D6 D7]
   {Unique TotL} = true
   {ProductList L} = {ProductList M} = {ProductList E}
   TotL
end
(This could be parameterized further; it is deliberately left non-optimized to illustrate the point.)
Here you first pick 7 digits with a function Digit/0. Then you create three lists, L, M and E consisting of the segments, as well as a total list to return (you could also return the concatenation, but I found this better for illustration).
Then comes the point, you specify relations that have to be intact. First, that the TotL is unique (distinct in your tasks wording). Then the next one, that the segment products have to be equal.
What now happens is that a search is conducted for your answers. This is a depth-first search strategy, but could also be breadth-first, and a solver is called to bring out all solutions. The search strategy is found inside the SolveAll/1 function.
{Browse {SolveAll FindDigits}}
Which in turns returns this list of answers:
[[1 8 9 2 4 3 6] [1 8 9 2 4 6 3] [3 6 4 2 9 1 8]
[3 6 4 2 9 8 1] [6 3 4 2 9 1 8] [6 3 4 2 9 8 1]
[8 1 9 2 4 3 6] [8 1 9 2 4 6 3]]
At least this way forward is not using brute force. Essentially you are searching for answers here. There might be heuristics that let you find the correct answer sooner (some mathematical magic, perhaps), or you can use genetic algorithms to search the space or other well-known strategies.
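For comparison, the same two constraints (distinct digits, equal segment products) can be written as a plain exhaustive enumeration in Python. This is a brute-force cross-check of the answer list rather than relational programming, and the name find_numbers is hypothetical.

```python
from itertools import permutations


def find_numbers():
    """All 7-tuples of distinct digits whose three segment products are equal."""
    solutions = []
    for a, b, c, d, e, f, g in permutations(range(10), 7):
        # first three, central three and last three digits share c and e
        if a * b * c == c * d * e == e * f * g:
            solutions.append((a, b, c, d, e, f, g))
    return solutions


solutions = find_numbers()
middle_digits = {s[3] for s in solutions}   # the set of possible middle digits
```

The enumeration visits 604800 arrangements, which is small enough to run directly.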
Prime factor of distinct digit (if possible)
0 = 0
1 = 1
2 = 2
3 = 3
4 = 2 x 2
5 = 5
6 = 2 x 3
7 = 7
8 = 2 x 2 x 2
9 = 3 x 3
In total:
7 2's + 4 3's + 1 5's + 1 7's
Let A, B and C be the three products. Since A=B=C, the prime factorization of A must be the same as that of B and of C. The digits 0, 5 and 7 are excluded, since each would contribute a factor that no other digit can supply to the other products.
Hence, 7 2's + 4 3's are left and we have 7 digit (1,2,3,4,6,8,9). As there are 7 digits only, the number is formed by these digits only.
Recall the fact, A, B and C must have same composition of prime factors. This implies that A, B and C have same number of 2's and 3's in their composition. So, we should try to achieve (in total for A and B and C):
9 OR 12 2's AND
6 3's
(Must be product of 3, lower bound is total number of prime factor of all digits, upper bound is lower bound * 2)
Consider point 2 (as it has one possibility), A has 2 3's and same for B and C. To have more number of prime factor in total, we need to put digit in connection digit between two product (third or fifth digit). Extract digits with prime factor 3 into two groups {3,6} and {9} and put digit into connection digit. The only possible way is to put 9 in connection digit and 3,6 on unconnected product. That mean xx9xx36 or 36xx9xx (order of 3,6 is not important)
With this result, we get 9 x middle x connection digit = connection digit x 3 x 6. Thus, middle = (3 x 6) / 9 = 2
My answer actually extends @Ansh's answer.
Let abcdefg be the digits of the number. Then
ab=de
cd=fg
From these relations we can exclude 0, 5 and 7 because there are no other multiples of these numbers between 0 and 9. So we are left with seven numbers and each number is included once in each answer. We are going to examine how we can pair the numbers (ab, de, cd, fg).
What happens with 9? It can't be combined with 3 or 6, since then their product would contain the factor 3 three times and we have 4 factors of 3 in total. Similarly, 3 and 6 must be combined together at least once to match the two factors of 3 in 9. This gives a product of 18 and so 9 must be combined at least once with 2.
Now if 9x2 is in a corner then 3x6 must be in the middle, meaning that in the other corner there must be another multiple of 3. So 9 and 2 are in the middle.
Let's suppose ab=3x6 (the other case is symmetric). Then d must be 9 or 2. But if d is 9 then f or g must be a multiple of 3. So d is 2 and e is 9. We can stop here and answer that the middle digit is
2
Now we have 2c = fg and the remaining choices are 1, 4, 8. We see that the only solutions are c = 4, f = 1, g = 8 and c = 4, f = 8, g = 1.
So if 3x6 is in the left corner we have the following solutions:
3642918, 3642981, 6342918, 6342981
If 3x6 is in the right corner we have the following solutions which are the reverse of the above:
8192463, 1892463, 8192436, 1892436
Here is how you can consider the problem:
Let's denote the final solution N1 N2 N3 N4 N5 N6 N7, with the 3 products N1*N2*N3, N3*N4*N5 and N5*N6*N7.
0, 5 and 7 are excluded: no other digit is a multiple of 5 or 7, so if one of them divided one of the 3 products, it could not divide the others, and a 0 would force all three products to be 0.
So we get the 7 remaining digits: 1234689
where the product of the digits is 2^7*3^4
(N1*N2*N3) and (N5*N6*N7) are equal, so their product is a square number. That product is the product of all seven digits divided by N4, so removing N4 from 2^7*3^4 must leave a square number (i.e. even exponents on both primes).
N4 can't be 1, 3, 4, 6, 9.
We conclude N4 is 2 or 8
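The square-number argument above can be checked mechanically. The following Python sketch (variable and function names are hypothetical) divides the product of the seven remaining digits by each candidate for N4 and keeps the candidates that leave a perfect square.

```python
import math

digits = [1, 2, 3, 4, 6, 8, 9]
total = math.prod(digits)          # 2^7 * 3^4


def is_square(m):
    """True if m is a perfect square."""
    root = math.isqrt(m)
    return root * root == m


# N4 is a valid middle digit only if total / N4 is a perfect square
candidates = [d for d in digits if total % d == 0 and is_square(total // d)]
```

Only 2 and 8 survive this filter, which matches the conclusion drawn from the exponents.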
If N4 is 8 and it divides (N3*N4*N5), we can't use the remaining even numbers (2, 4, 6) to make both (N1*N2*N3) and (N5*N6*N7) divisible by 8. So N4 is 2 and 8 does not belong to the second group (let's put it in N1).
Now, we have: 1st grp: 8XX, 2nd group: X2X 3rd group: XXX
Note: at this point we know that the product is 72 because it is 2^3*3^2 (the square root of 2^6*3^4) but the result is not really important. We have made the difficult part knowing the 7 numbers and the middle position.
Then, we know that we have to distribute 2^3 on (N1*N2*N3), (N3*N4*N5), (N5*N6*N7) because 2^3*2*2^3=2^7
We already gave 8 to N1, 2 to N4 and we place 6 to N6, and 4 to N5 position, resulting in each of the 3 numbers being a multiple of 8.
Now, we have: 1st grp: 8XX, 2nd group: X24 3rd group: 46X
We have the same way of thinking considering the odd number, we distribute 3^2, on each part knowing that we already have a 6 in the last group.
Last group will then get the 3. And first and second ones the 9.
Now, we have: 1st grp: 8X9, 2nd group: 924 3rd group: 463
And, then 1 at N2, which is the remaining position.
This problem is pretty easy if you look at the number 72 more carefully.
We have our number with this form abcdefg
and abc = cde = efg, with those digits 8,1,9,2,4,3,6
So, first, we can conclude that 8,1,9 must be one of the triple, because, there is no way 1 can go with other two numbers to form 72.
We can also conclude that 1 must be in the start/end of the whole number or middle of the triple.
So now we have 819defg or 918defg ...
Using some calculations with the rest of those digits, we can see that only 819defg is possible, because, we need 72/9 = 8,so only 2,4 is valid, while we cannot create 72/8 = 9 from those 2,4,3,6 digits, so -> 81924fg or 81942fg and 819 must be the triple that start or end our number.
So the rest of the job is easy, we need either 72/4 = 18 or 72/2 = 36, now, we can have our answers: 8192436 or 8192463.
7 digits: 8,1,9,2,4,3,6
say XxYxZ = 72
1) pick any two from above 7 digits. say X,Y
2) divide 72 by X and then Y.. you will get the 3rd number i.e Z.
we found XYZ set of 3-digits which gives result 72.
now repeat 1) and 2) with remaining 4 digits.
this time we found ABC which multiplies to 72.
lets say, 7th digit left out is I.
3) divide 72 by I. result R
4) divide R by one of XYZ. check if result is in ABC.
if No, repeat the step 3)
if yes, found the third pair.(assume you divided R by Y and the result is B)
YIB is the third pair.
so... solution will be.
XZYIBAC
You have your 7 numbers - instead of looking at it in groups of 3 divide up the number as such:
AB | C | D | E | FG
Get the value of AB and use it to get the value of C like so: C = ABC/AB
Next you want to do the same thing with the trailing 2 digits to find E using FG. E = EFG/FG
Now that you have C & E you can solve for D
Since CDE = ABC then D = ABC/CE
Remember your formulas - instead of looking at numbers create a formula aka an algorithm that you know will work every time.
ABC = CDE = EFG However, you have to remember that your = signs have to balance. You can see that D = ABC/CE = EFG/CE Once you know that, you can figure out what you need in order to solve the problem.
Made a quick example in a fiddle of the code:
http://jsfiddle.net/4ykxx9ve/1/
var findMidNum = function() {
    var num = [8, 1, 9, 2, 4, 3, 6];
    var ab = num[0] * num[1];
    var fg = num[5] * num[6];
    var abc = num[0] * num[1] * num[2];
    var cde = num[2] * num[3] * num[4];
    var efg = num[4] * num[5] * num[6];
    var c = abc/ab;
    var e = efg/fg;
    var ce = c * e
    var d = abc/ce;
    console.log(d); //2
}();
use linq and substring functions
example var item = array.Skip(3).Take(3) in such a way that you have a loop
for(f =0;f<charlen.length;f++){
var xItemSum = charlen[f].Skip(f).Take(f).Sum(f => f.Value);
}
// untested code

How to find the units digit of a certain power in a simplest way

How to find out the units digit of a certain number (e.g. 3 power 2011). What logic should I use to find the answer to this problem?
For base 3:
3^1 = 3
3^2 = 9
3^3 = 27
3^4 = 81
3^5 = 243
3^6 = 729
3^7 = 2187
...
That is, the units digit has only 4 possibilities, and then it repeats in the same cycle forever.
It can be shown that this holds for any integer base: the units digit of its powers repeats with a period of at most 4 (Euler's theorem covers the bases coprime to 10; the remaining units digits can be checked directly). Looking only at the units digit of an arbitrary product is equivalent to taking the remainder of the multiplication modulo 10, for example:
2^7 % 10 = 128 % 10 = 8
It can also be shown (and is quite intuitive) that for an arbitrary base, the units digit of any power will only depend on the units digit of the base itself - that is 2013^2013 has the same units digit as 3^2013.
We can exploit both facts to come up with an extremely fast algorithm (thanks for the help - with kind permission I may present a much faster version).
The idea is this: As we know that for any number 0-9 there will be at most 4 different outcomes, we can as well store them in a lookup table:
{ 0,0,0,0, 1,1,1,1, 6,2,4,8, 1,3,9,7, 6,4,6,4,
5,5,5,5, 6,6,6,6, 1,7,9,3, 6,8,4,2, 1,9,1,9 }
That's the possible outcomes for 0-9 in that order, grouped in fours. The idea is now for an exponentiation n^a to
first take the base mod 10 => := i
go to index 4*i in our table (it's the starting offset of that particular digit)
take the exponent mod 4 => := off (as stated by Euler's theorem we only have four possible outcomes!)
add off to 4*i to get the result
Now to make this as efficient as possible, some tweaks are applied to the basic arithmetic operations:
Multiplying by 4 is equivalent to shifting two to the left ('<< 2')
Taking a number a % 4 is equivalent to saying a&3 (masking the 1 and 2 bit, which form the remainder % 4)
The algorithm in C:
static int table[] = {
    0, 0, 0, 0, 1, 1, 1, 1, 6, 2, 4, 8, 1, 3, 9, 7, 6, 4, 6, 4,
    5, 5, 5, 5, 6, 6, 6, 6, 1, 7, 9, 3, 6, 8, 4, 2, 1, 9, 1, 9
};
int /* assume n>=0, a>0 */
unit_digit(int n, int a)
{
    return table[((n%10)<<2)+(a&3)];
}
Proof for the initial claims
From observing we noticed that the units digit for 3^x repeats every fourth power. The claim was that this holds for any integer. But how is this actually proven? As it turns out that it's quite easy using modular arithmetic. If we are only interested in the units digit, we can perform our calculations modulo 10. It's equivalent to say the units digit cycles after 4 exponents or to say
a^4 congruent 1 mod 10
If this holds, then for example
a^5 mod 10 = a^4 * a^1 mod 10 = a^4 mod 10 * a^1 mod 10 = a^1 mod 10
that is, a^5 yields the same units digit as a^1 and so on.
From Euler's theorem we know that
a^phi(10) mod 10 = 1 mod 10
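where phi(10) is the number of integers between 1 and 10 that are co-prime to 10 (i.e. their gcd with 10 is equal to 1). The numbers < 10 co-prime to 10 are 1, 3, 7 and 9. So phi(10) = 4, and this proves that a^4 mod 10 = 1 mod 10 for every a co-prime to 10. Euler's theorem does not apply to the remaining units digits (0, 2, 4, 5, 6, 8), but a^5 mod 10 = a mod 10 can be checked for them directly, so the cycle of length at most 4 holds for every base.
The last claim to prove is that for exponentiations where the base is >= 10 it suffices to just look at the base's units digit. Let's say our base is x >= 10, so we can say that x = x_0 + 10*x_1 + 100*x_2 + ... (base 10 representation)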
Using modular representation it's easy to see that indeed
x ^ y mod 10
= (x_0 + 10*k) ^ y mod 10, where k = x_1 + 10*x_2 + ...
= (x_0^y + C(y,1) * x_0^(y-1) * (10*k) + C(y,2) * x_0^(y-2) * (10*k)^2 + ... + (10*k)^y) mod 10
= x_0^y mod 10
where C(y,i) are the binomial coefficients. They are not relevant in the end, since every term except x_0^y contains a factor (10*k)^i with i >= 1 and is therefore divisible by 10.
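Both claims (only the base's units digit matters, and the units digit repeats when the exponent grows by 4) can also be checked numerically. The following Python sketch does so for a range of small bases and positive exponents; the function name and the chosen ranges are arbitrary, and a numeric check of this kind is of course not a proof.

```python
def units_digit_claims_hold(max_base=200, max_exp=40):
    """Check both claims for 0 <= base < max_base and 1 <= exp < max_exp."""
    for base in range(max_base):
        for exp in range(1, max_exp):
            digit = pow(base, exp, 10)
            # claim 1: only the units digit of the base matters
            if digit != pow(base % 10, exp, 10):
                return False
            # claim 2: adding 4 to a positive exponent keeps the units digit
            if digit != pow(base, exp + 4, 10):
                return False
    return True


claims_hold = units_digit_claims_hold()
```

Note that the exponent starts at 1: for bases such as 2, the zeroth power (units digit 1) is not part of the cycle.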
You should look at modular exponentiation. What you want is the same as calculating n^e (mod m) with m = 10, i.e. the remainder of n^e divided by ten.
You are probably interested in the right-to-left binary method to calculate it, since it's the most time-efficient one and not too hard to implement. Here is the pseudocode, from Wikipedia:
function modular_pow(base, exponent, modulus)
    result := 1
    while exponent > 0
        if (exponent & 1) equals 1:
            result := (result * base) mod modulus
        exponent := exponent >> 1
        base := (base * base) mod modulus
    return result
After that, just call it with modulus = 10 for your desired base and exponent and there's your answer.
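As a runnable illustration, the right-to-left binary method can be written in Python like this. It is a sketch that assumes a non-negative exponent and a positive modulus; the initial reduction of the base and the modulus == 1 shortcut are small additions to the pseudocode. Python's built-in three-argument pow computes the same value and is used here only as a cross-check.

```python
def modular_pow(base, exponent, modulus):
    """Compute (base ** exponent) % modulus with O(log exponent) multiplications."""
    if modulus == 1:
        return 0
    result = 1
    base %= modulus                      # keep intermediate values small
    while exponent > 0:
        if exponent & 1:                 # lowest bit set: multiply this power in
            result = (result * base) % modulus
        exponent >>= 1                   # move on to the next bit
        base = (base * base) % modulus   # square for the next bit position
    return result


units_digit = modular_pow(3, 2011, 10)   # units digit of 3^2011
builtin_digit = pow(3, 2011, 10)         # built-in equivalent
```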
EDIT: for an even simpler method, less efficient CPU-wise but more efficient memory-wise, check out the Memory-efficient section of the article on Wikipedia. The logic is straightforward enough:
function modular_pow(base, exponent, modulus)
    c := 1
    for e_prime = 1 to exponent
        c := (c * base) mod modulus
    return c
I'm sure there's a proper mathematical way to solve this, but I would suggest that since you only care about the last digit and since in theory every number multiplied by itself repeatedly should generate a repeating pattern eventually (when looking only at the last digit), you could simply perform the multiplications until you detect the first repetition and then map your exponent into the appropriate position in the pattern that you built.
Note that because you only care about the last digit, you can further simplify things by truncating your input number down to its ones-digit before you start building your pattern mapping. This will let you to determine the last digit even for arbitrarily large inputs that would otherwise cause an overflow on the first or second multiplication.
Here's a basic example in JavaScript: http://jsfiddle.net/dtyuA/2/
function lastDigit(base, exponent) {
    if (exponent < 0) {
        alert("stupid user, negative values are not supported");
        return 0;
    }
    if (exponent == 0) {
        return 1;
    }
    var baseString = base + '';
    var lastBaseDigit = baseString.substring(baseString.length - 1);
    var lastDigit = lastBaseDigit;
    var pattern = [];
    do {
        pattern.push(lastDigit);
        var nextProduct = (lastDigit * lastBaseDigit) + '';
        lastDigit = nextProduct.substring(nextProduct.length - 1);
    } while (lastDigit != lastBaseDigit);
    return pattern[(exponent - 1) % pattern.length];
};
function doMath() {
    var base = parseInt(document.getElementById("base").value, 10);
    var exp = parseInt(document.getElementById("exp").value, 10);
    console.log(lastDigit(base, exp));
};
console.log(lastDigit(3003, 5));
Base: <input id="base" type="text" value="3" /> <br>
Exponent: <input id="exp" type="text" value="2011"><br>
<input type="button" value="Submit" onclick="doMath();" />
And the last digit in 3^2011 is 7, by the way.
We can start by inspecting the last digit of each result obtained by raising the base 10 digits to successive powers:
 d   d^2  d^3  d^4  d^5  d^6  d^7  d^8  d^9   (mod 10)
---  ---  ---  ---  ---  ---  ---  ---  ---
 0    0    0    0    0    0    0    0    0
 1    1    1    1    1    1    1    1    1
 2    4    8    6    2    4    8    6    2
 3    9    7    1    3    9    7    1    3
 4    6    4    6    4    6    4    6    4
 5    5    5    5    5    5    5    5    5
 6    6    6    6    6    6    6    6    6
 7    9    3    1    7    9    3    1    7
 8    4    2    6    8    4    2    6    8
 9    1    9    1    9    1    9    1    9
We can see that in all cases the last digit cycles through no more than four distinct values. Using this fact, and assuming that n is a non-negative integer and p is a positive integer, we can compute the result fairly directly (e.g. in Javascript):
function lastDigit(n, p) {
    var d = n % 10;
    return [d, (d*d)%10, (d*d*d)%10, (d*d*d*d)%10][(p-1) % 4];
}
... or even more simply:
function lastDigit(n, p) {
    return Math.pow(n % 10, (p-1) % 4 + 1) % 10;
}
lastDigit(3, 2011)
/* 7 */
The second function is equivalent to the first. Note that even though it uses exponentiation, it never works with a number larger than nine to the fourth power (6561).
The key to solving this type of question lies in Euler's theorem.
This theorem allows us to say that a^phi(m) mod m = 1 mod m whenever a and m are coprime, that is, whenever a and m share no common factor other than 1. If this is the case (and for your example it is), we can solve the problem on paper, without any programming whatsoever.
Let's solve for the unit digit of 3^2011, as in your example. This is equivalent to 3^2011 mod 10.
The first step is to check whether 3 and 10 are coprime. Their greatest common divisor is 1, so we can use Euler's theorem.
We also need to compute the totient, or phi value, of 10. For 10, it is 4. For 100, phi is 40; for 1000, it is 400; and so on.
Using Euler's theorem, we can see that 3^4 mod 10 = 1. We can then rewrite the original example as:
3^2011 mod 10 = 3^(4*502 + 3) mod 10 = ((3^4)^502 * 3^3) mod 10 = (1^502 * 3^3) mod 10 = 27 mod 10 = 7
Thus, the last digit of 3^2011 is 7.
As you saw, this required no programming whatsoever and I solved this example on a piece of scratch paper.
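For readers who would rather check the arithmetic mechanically, here is a minimal JavaScript sketch of the same reasoning. The helper names gcd, phi and powModCoprime are made up for this illustration, and phi is computed by brute-force counting, which is only sensible for small moduli.

```javascript
// Greatest common divisor by Euclid's algorithm.
function gcd(a, b) {
  while (b !== 0) {
    [a, b] = [b, a % b];
  }
  return a;
}

// Euler's totient by direct counting; fine for small moduli such as 10 or 1000.
function phi(m) {
  let count = 0;
  for (let k = 1; k <= m; k++) {
    if (gcd(k, m) === 1) count++;
  }
  return count;
}

// a^exponent mod m, reducing the exponent with Euler's theorem.
// Only valid when a and m are coprime (hypothetical helper name).
function powModCoprime(a, exponent, m) {
  if (gcd(a, m) !== 1) throw new Error("a and m must be coprime");
  const reduced = exponent % phi(m); // 2011 % 4 = 3 for m = 10
  let result = 1 % m;
  for (let k = 0; k < reduced; k++) {
    result = (result * (a % m)) % m;
  }
  return result;
}

console.log(phi(10), phi(100), phi(1000)); // 4 40 400
console.log(powModCoprime(3, 2011, 10));   // 7
```

The coprimality check matters: for a base such as 2 or 5 the reduction is not justified by Euler's theorem, so the sketch refuses those inputs rather than return a wrong digit.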
You are all making a simple thing complicated.
Suppose you want to find the unit digit of abc ^ xyz, where c is the last digit of the base and xyz is a positive exponent.
Divide the power xyz by 4:
if xyz % 4 = 1, the answer is c^1 = c;
if xyz % 4 = 2, the answer is the unit digit of c^2;
if xyz % 4 = 3, the answer is the unit digit of c^3;
if xyz % 4 = 0, the answer depends on c:
if c is 0, the answer is 0;
if c is 5, the answer is 5;
if c is even (and not 0), the answer is 6;
if c is odd (other than 5), the answer is 1.
Below is a table with the power and the unit digit of 3 to that power.
0 1
1 3
2 9
3 7
4 1
5 3
6 9
7 7
Using this table you can see that the unit digit can be 1, 3, 9, 7 and the sequence repeats in this order for higher powers of 3. Using this logic you can find that the unit digit of (3 power 2011) is 7. You can use the same algorithm for the general case.
Here's a trick that works for numbers that share no factor with the base (for base 10, the number can't be a multiple of 2 or 5). Let's use the number 3. What you're trying to find is 3^2011 mod 10. Find powers of 3, starting with 3^1, until you find one with the last digit 1. For 3, you get 3^4 = 81. Write the original power as (3^4)^502 * 3^3. Using modular arithmetic, (3^4)^502 * 3^3 is congruent to (has the same last digit as) 1^502 * 3^3. So 3^2011 and 3^3 have the same last digit, which is 7.
Here's some pseudocode to explain it in general. This finds the last digit of b^n in base B.
// Find the smallest power of b ending in 1.
i=1
while ((b^i % B) != 1) {
    i++
}
// b^i has the last digit 1
a=n % i
// For some value of j, b^n == (b^i)^j * b^a, which is congruent to b^a
return b^a % B
You'd need to be careful to prevent an infinite loop, if no power of b ends in 1 (in base 10, multiples of 2 or 5 don't work.)
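As a rough JavaScript sketch of the pseudocode with that guard added: the function name lastDigitInBase and the choice to return null are assumptions made for this example. The running power is kept reduced mod B so it never grows, and the search stops after B steps, because when some power of b does end in 1 the smallest such exponent is always below B.

```javascript
// Last digit of b^n in base B, following the pseudocode above.
// Returns null when no power of b ends in 1 (b shares a factor with B).
function lastDigitInBase(b, n, B) {
  const start = b % B;
  let power = start; // b^i mod B, kept reduced so it never grows
  let i = 1;
  while (power !== 1) {
    if (i >= B) return null; // no power of b ends in 1; give up
    power = (power * start) % B;
    i++;
  }
  // b^n is congruent to b^(n mod i) because b^i is congruent to 1.
  let result = 1 % B;
  for (let k = 0; k < n % i; k++) {
    result = (result * start) % B;
  }
  return result;
}

console.log(lastDigitInBase(3, 2011, 10)); // 7
console.log(lastDigitInBase(2, 2011, 10)); // null
```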
Find the repeating set. In this case it is 3, 9, 7, 1, and it repeats in the same order forever. So divide 2011 by 4, which gives a remainder of 3. That points to the 3rd element of the repeating set, 7. This is the easiest way to find the unit digit for any given power. For example, for 3^31, the remainder of 31/4 is 3, so 7 is the unit digit. For 3^9, the remainder of 9/4 is 1, so the unit digit is 3. For 3^100, the remainder is 0, which corresponds to the 4th element, so the unit digit is 1.
If you have the number and exponent separate it's easy.
Let n1 be the number and n2 be the power, and let ** represent exponentiation.
Assume n1 > 0.
% means the modulo operation.
The pseudocode looks like this:
def last_digit(n1, n2)
  if n2==0 then return 1 end
  last = n1%10
  mod = (n2%4).zero? ? 4 : (n2%4)
  last_digit = (last**mod)%10
end
Explanation:
We need to consider only the last digit of the number because that determines the last digit of the power.
It is a mathematical property that, for each digit (0-9), the last digits of its successive powers cycle through at most 4 distinct values.
1) If the exponent is zero, we know the last digit is 1.
2) Get the last digit of the number by taking n1 % 10.
3) Take n2 % 4 on the exponent. If the result is zero, treat it as 4: the exponent itself is not zero at this point, and a multiple of 4 corresponds to the fourth position in the cycle. If the result is nonzero, use it as is.
4) Now we have at most 9**4, which is easy for the computer to calculate.
Take % 10 of that number, and you have the last digit.

How can I take the modulus of two very large numbers?

I need an algorithm for A mod B with
A is a very big integer and it contains digit 1 only (ex: 1111, 1111111111111111)
B is a very big integer (ex: 1231, 1231231823127312918923)
By big, I mean 1000 digits.
To compute a number mod n, given a function to get the quotient and remainder when dividing by (n+1), start by adding one to the number. Then, as long as the number is bigger than n, iterate:
number = (number div (n+1)) + (number mod (n+1))
Finally, at the end, subtract one. An alternative to adding one at the beginning and subtracting one at the end is checking whether the result equals n and returning zero if so.
For example, given a function to divide by ten, one can compute 12345678 mod 9 thusly:
12345679 -> 1234567 + 9
1234576 -> 123457 + 6
123463 -> 12346 + 3
12349 -> 1234 + 9
1243 -> 124 + 3
127 -> 12 + 7
19 -> 1 + 9
10 -> 1
Subtract 1, and the result is zero.
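The same procedure can be sketched in JavaScript with BigInt, so that the inputs can be as large as needed. The function name modViaNextDivisor is made up for this example, and the sketch assumes x >= 0 and n >= 1.

```javascript
// Compute x mod n using only division by (n + 1), as in the steps above.
// BigInt is used so that inputs can be arbitrarily large.
function modViaNextDivisor(x, n) {
  const d = n + 1n;
  let number = x + 1n; // shift by one so the loop ends in the range 1..n
  while (number > n) {
    number = number / d + number % d; // quotient plus remainder
  }
  return number - 1n; // undo the shift: result is in 0..n-1
}

console.log(modViaNextDivisor(12345678n, 9n)); // 0n
```

This works because n + 1 is congruent to 1 mod n, so replacing q * (n + 1) + r with q + r never changes the value mod n.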
1000 digits isn't really big; use any big integer library to get rather fast results.
If you really worry about performance, A can be written as 1111...1 = (10^n - 1)/9 for some n, so computing A mod B can be reduced to computing ((10^n - 1) mod (9*B)) / 9, and you can do that faster.
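A minimal sketch of that reduction in JavaScript, using BigInt and square-and-multiply exponentiation. The names modPow and repunitMod are made up for this example; n is the number of 1 digits in A.

```javascript
// Square-and-multiply modular exponentiation on BigInt values.
function modPow(base, exponent, modulus) {
  let result = 1n % modulus;
  let b = base % modulus;
  let e = exponent;
  while (e > 0n) {
    if (e & 1n) result = (result * b) % modulus;
    b = (b * b) % modulus;
    e >>= 1n;
  }
  return result;
}

// A is the number made of n ones: A = (10^n - 1) / 9.
// A mod B equals ((10^n - 1) mod (9 * B)) / 9.
function repunitMod(n, B) {
  const m = 9n * B;
  const t = (modPow(10n, n, m) - 1n + m) % m; // kept non-negative
  return t / 9n; // exact: t is always a multiple of 9
}

console.log(repunitMod(4n, 1231n)); // 1111n
```

The division by 9 at the end is exact because 10^n - 1 = 9 * A, so its remainder mod 9 * B is 9 times (A mod B).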
Try Montgomery reduction for computing the modulus of large numbers: http://en.wikipedia.org/wiki/Montgomery_reduction
1) Just find a language or package that does arbitrary precision arithmetic - in my case I'd try java.math.BigDecimal.
2) If you are doing this yourself, you can avoid having to do division by using doubling and subtraction. E.g. 10 mod 3 = 10 - 3 - 3 - 3 = 1 (repeatedly subtracting 3 until you can't any more) - which is incredibly slow, so double 3 until it is just smaller than 10 (e.g. to 6), subtract to leave 4, and repeat.
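The doubling-and-subtraction idea from point 2 might look like this in JavaScript. BigInt stands in here for whatever big-number representation you write yourself, and the function name modByDoubling is made up for this example. The sketch assumes x >= 0 and m >= 1.

```javascript
// x mod m using only comparison, doubling and subtraction (no division).
function modByDoubling(x, m) {
  let remainder = x;
  while (remainder >= m) {
    let chunk = m;
    // Double the chunk for as long as the doubled value still fits.
    while (chunk + chunk <= remainder) {
      chunk += chunk;
    }
    remainder -= chunk;
  }
  return remainder;
}

console.log(modByDoubling(10n, 3n)); // 1n
```

For 10 mod 3 this subtracts 6 to leave 4, then subtracts 3 to leave 1, matching the walkthrough above.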

Resources