Strange order of output of linq query - linq

I have an array of int values.
int[] numbers= {5, 4, 1, 3, 9, 8, 6, 7, 2, 0};
I group these numbers by their remainder modulo 5:
var numberGroups=from n in numbers group n by n%5 into g select new { Remainder=g.Key, Numbers=g};
and I get this strange order of results:
// 0, 4, 1, 3, 2
In my humble opinion the order should be:
//0, 4, 3, 1, 2
As the first number which can be divided by 5 is 5. // 0
The second number which can be divided by 5 is 9. // 4
The third number which can be divided by 5 is 8. // 3
The fourth number which can be divided by 5 is 6. // 1
The fifth number which can be divided by 5 is 7. // 2
That is how I understand group by to work. Where am I wrong?
Questions:
Why does LINQ change the order?
Why do I get such a strange order?

The groups are added in the order in which they are first needed, which is the order of the input to the group clause. Given that 5 is the first value, the 0 group is created first. This can be seen with the following code examples.
With this code:
int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
var numberGroups = from n in numbers group n by n % 5 into g select new { Remainder = g.Key, Numbers = g };
foreach (var ng in numberGroups)
{
    Console.Write("{0} - ", ng.Remainder);
    foreach (var v in ng.Numbers)
    {
        Console.Write(" {0}", v);
    }
    Console.WriteLine();
}
I get
0 - 5 0
4 - 4 9
1 - 1 6
3 - 3 8
2 - 7 2
which makes me believe there is no sort in the group operation; rather, it collects the numbers in the order they are found. The first number, 5, has remainder 0, so the first output bucket is 0. We can test this by putting the 9 first, and we get
4, 0, 1, 3, 2
To prove they are added to the buckets in the order they are processed, using these numbers
int[] numbers = { 15, 24, 31, 43, 59, 68, 76, 87, 92, 100 };
we get
0 - 15 100
4 - 24 59
1 - 31 76
3 - 43 68
2 - 87 92
15 is added first as new slot 0, 24 is new slot 4, 31 is new slot 1, 43 is new slot 3, 59 is the next value in slot 4, 68 is the next value in slot 3, 76 is the next value in slot 1, 87 is new slot 2, 92 is the next value in slot 2, and 100 is the next value in slot 0.
To prove this further, use:
int[] numbers = { 15, 25, 35, 45, 55, 65, 75, 85, 95, 105 };
we get
0 - 15 25 35 45 55 65 75 85 95 105
It might not be what you want, but it makes 100% sense how and why it happens.
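The first-seen ordering described in this answer can be imitated outside LINQ. Below is a minimal Python sketch, not the LINQ implementation itself: the helper name group_in_first_seen_order is hypothetical, and it relies on Python dictionaries keeping insertion order (Python 3.7+).

```python
def group_in_first_seen_order(numbers, key):
    """Group values by key; groups appear in the order their key is first seen."""
    groups = {}  # dicts preserve insertion order in Python 3.7+
    for n in numbers:
        # A new bucket is created the first time a key shows up,
        # later values with the same key are appended to it.
        groups.setdefault(key(n), []).append(n)
    return groups


groups = group_in_first_seen_order([5, 4, 1, 3, 9, 8, 6, 7, 2, 0], lambda n: n % 5)
for remainder, members in groups.items():
    print(remainder, "-", *members)
```

The bucket order is 0, 4, 1, 3, 2 because that is the order in which the remainders first occur in the input, matching the output shown in the answer.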

Related

How to find the sum in a matrix with dynamic programming

Please, I would like to find the maximum sum with only one value per row. I already solved it by brute force, which is O(N^5). Now I would like to find a way to reduce the complexity, with dynamic programming or another approach.
For example:
Matrix:
100 5 4 3 1
90 80 70 60 50
70 69 65 20 10
60 20 10 5 1
50 45 15 6 1
Solution for 5 sets:
100 + 90 + 70 + 60 + 50 = 370
100 + 90 + 69 + 60 + 50 = 369
100 + 90 + 70 + 60 + 45 = 365
100 + 90 + 65 + 60 + 50 = 365
100 + 90 + 69 + 60 + 45 = 364
Sum: 1833
example for the sum with brute force:
for (int i = 0; i < matrix[0].size(); i++) {
    for (int j = 0; j < matrix[1].size(); j++) {
        for (int k = 0; k < matrix[2].size(); k++) {
            for (int l = 0; l < matrix[3].size(); l++) {
                for (int x = 0; x < matrix[4].size(); x++) {
                    sum.push_back(matrix[0][i] + matrix[1][j] + matrix[2][k] + matrix[3][l] + matrix[4][x]);
                }
            }
        }
    }
}
sort(sum.begin(), sum.end(), mySort);
Thanks!
You can solve it in O(k*log k) time with Dijkstra's algorithm. A node in a graph is represented by a list with 5 indexes of the numbers in the corresponding rows of the matrix.
For example in the matrix
100 5 4 3 1
90 80 70 60 50
70 69 65 20 10
60 20 10 5 1
50 45 15 6 1
the node [0, 0, 2, 0, 1] represents the numbers [100, 90, 65, 60, 45]
The initial node is [0, 0, 0, 0, 0]. Every node has up to 5 outgoing edges increasing 1 of the 5 indexes by 1, and the distance between nodes is the absolute difference in the sums of the indexed numbers.
So for that matrix the edges from the node [0, 0, 2, 0, 1] lead:
to [1, 0, 2, 0, 1] with distance 100 - 5 = 95
to [0, 1, 2, 0, 1] with distance 90 - 80 = 10
to [0, 0, 3, 0, 1] with distance 65 - 20 = 45
to [0, 0, 2, 1, 1] with distance 60 - 20 = 40
to [0, 0, 2, 0, 2] with distance 45 - 15 = 30
With this setup you can use Dijkstra's algorithm to find k - 1 closest nodes to the initial node.
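The graph search described above can be sketched with a priority queue. This is a minimal illustration, not the answerer's own code: the function name top_k_sums is hypothetical, and the sketch sorts each row in descending order itself so that every edge has a non-negative distance.

```python
import heapq


def top_k_sums(matrix, k):
    """Return the k largest sums that take exactly one value from each row."""
    rows = [sorted(row, reverse=True) for row in matrix]
    start = tuple(0 for _ in rows)           # node = one column index per row
    best = sum(row[0] for row in rows)
    heap = [(-best, start)]                  # max-heap via negated sums
    seen = {start}
    sums = []
    while heap and len(sums) < k:
        neg_total, node = heapq.heappop(heap)
        sums.append(-neg_total)
        # Each edge moves one row's index one column to the right.
        for r, c in enumerate(node):
            if c + 1 < len(rows[r]):
                nxt = node[:r] + (c + 1,) + node[r + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    drop = rows[r][c] - rows[r][c + 1]   # edge distance
                    heapq.heappush(heap, (neg_total + drop, nxt))
    return sums


matrix = [
    [100, 5, 4, 3, 1],
    [90, 80, 70, 60, 50],
    [70, 69, 65, 20, 10],
    [60, 20, 10, 5, 1],
    [50, 45, 15, 6, 1],
]
print(top_k_sums(matrix, 5))
```

For the example matrix this prints the five sums listed in the question, 370, 369, 365, 365 and 364. The seen set plays the role of the visited set in Dijkstra's algorithm, so each index combination is pushed at most once.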
Update: I previously used a greedy algorithm, which doesn't work for this problem. Here is a more general solution.
Suppose we've already found the combinations with the top m highest sums. The next highest combination (number m+1) must be 1 step away from one of these, where a step is defined as shifting focus one column to the right in one of the rows of the matrix. (Any combination that is more than one step away from all of the top m combinations cannot be the m+1 highest, because you can convert it to a higher one that is not in the top m by undoing one of those steps, i.e., moving back toward one of the existing combinations.)
For m = 1, we know that the "m highest combinations" just means the combination made by taking the first element of each row of the matrix (assuming each row is sorted from highest to lowest). So then we can work out from there:
1. Create a set of candidate combinations to consider for the next highest position. This will initially hold only the highest possible combination (first column of the matrix).
2. Identify the candidate with the highest sum and move that to the results.
3. Find all the combinations that are 1 step away from the one that was just added to the results. Add all of these to the set of candidate combinations. At most n of these will be added each round, where n is the number of rows in the matrix. Some may be duplicates of previously identified candidates, which should be ignored.
4. Go back to step 2. Repeat until there are 5 results.
Here is some Python code that does this:
m = [
    [100, 5, 4, 3, 1],
    [90, 80, 70, 60, 50],
    [70, 69, 65, 20, 10],
    [60, 20, 10, 5, 1],
    [50, 45, 15, 6, 1]
]
n_cols = len(m[0])  # matrix width

# helper function to calculate the sum for any combination,
# where a "combination" is a list of column indexes for each row
score = lambda combo: sum(m[r][c] for r, c in enumerate(combo))

# define candidate set, initially with single highest combination
# (this set could also store the score for each combination
# to avoid calculating it repeatedly)
candidates = {tuple(0 for row in m)}
results = set()

# get 5 highest-scoring combinations
for i in range(5):
    result = max(candidates, key=score)
    results.add(result)
    candidates.remove(result)  # don't test it again
    # find combinations one step away from latest result
    # and add them to the candidates set
    for j, c in enumerate(result):
        if c+1 >= n_cols:
            continue  # don't step past edge of matrix
        combo = result[:j] + (c+1,) + result[j+1:]
        if combo not in results:
            candidates.add(combo)  # drops dups

# convert from column indexes to actual values
final = [
    [m[r][c] for r, c in enumerate(combo)]
    for combo in results
]
final.sort(key=sum, reverse=True)
print(final)
# [
#     [100, 90, 70, 60, 50],
#     [100, 90, 69, 60, 50],
#     [100, 90, 70, 60, 45],
#     [100, 90, 65, 60, 50],
#     [100, 90, 69, 60, 45],
# ]
If you just want the maximum sum, then sum the maximum value of each row.
That is,
M = [[100, 5, 4, 3, 1],
[90, 80, 70, 60, 50],
[70, 69, 65, 20, 10],
[60, 20, 10, 5, 1],
[50, 45, 15, 6, 1]]
sum(max(row) for row in M)
Edit
It is not necessary to use dynamic programming, etc.
There is a simple rule: select the next number by considering the difference between it and the current number.
Here is code using numpy.
import numpy as np
M = np.array(M)
M = -np.sort(-M, axis = 1)
k = 3
answer = []
ind = np.zeros(M.shape[0], dtype = int)
for _ in range(k):
    answer.append(sum(M[list(range(M.shape[0])), ind]))
    min_ind = np.argmin(M[list(range(len(ind))), ind] - M[list(range(len(ind))), ind+1])
    ind[min_ind] += 1
Result is [370, 369, 365].

Splitting 61 into integer partitions

In a program I am writing, I have a vector a = [5, 6, 7] and I have to split the integer 61 into additive partitions using integers from this list. So one example would be
61 = 5 + 6 + 7 + 7 + 6 + 6 + 6 + 6 + 6 + 6
There are many ways to split this, and I have to do it programmatically. One approach I found is as follows, though I don't know if it will always give a result. First I check whether 61 is divisible by any number in the list. If it is, I can just add that number repeatedly (quotient times) to get 61. In this case 61 is a prime number, so this fails. The next step is to take the first number in the list (in our case, 5), subtract it from 61, and see whether the result is divisible by any member of the list. If it is, we have again found a way to do the addition. In this case, subtracting 5 from 61 gives 56, which is divisible by 7, and our solution would be
61 = 5 + 7 + 7 + 7 + 7 + 7 + 7 + 7 + 7
In this manner we continue down the list until we find some answer after subtraction which is divisible by a member in the list.
Now the list given to me, [5, 6, 7], is such that a partition of 61 using its members exists, so we don't have to worry about whether a solution exists. My approach seems very crude, though. I wonder if there is an efficient way to do this using some algorithm from combinatorics. My final answer should be a list of numbers from the integer partition. So one possible answer would be
[5, 6, 7, 7, 6, 6, 6, 6, 6, 6]
thanks
The digital root of 61 is 7.
The multiple of 7 nearest to 61 is 7 * 8 = 56, and subtracting gives 61 - 56 = 5.
Similarly, the multiple of 6 within a gap of 7 from 61 is 6 * 9 = 54, and subtracting gives 61 - 54 = 7.
The result modulo one of the remaining two numbers from {5, 7} should then be zero for the split to work.
Getting intermediate sums and finding their digital roots combined with the displacement can give the answer.
Hope this helps! Happy to help further.
I think I found a crude logic here. Let's say the given list is [5, 6, 7] and the number is 61. We need to find an additive list whose total is 61. One such example is
61 = 5 + 6 + 7 + 7 + 6 + 6 + 6 + 6 + 6 + 6
another would be
61 = 5 + 7 + 7 + 7 + 7 + 7 + 7 + 7 + 7
Our job is to get the numbers on the right side as a list. So one possible solution would be
[5, 6, 7, 7, 6, 6, 6, 6, 6, 6]
My algorithm is as follows. First subtract all list members from 61. So 61 - 5 - 6 - 7 = 43. So we get first three members from the list, which are 5, 6, 7. Now remainder here is 43. To get other members, we subtract each of 5, 6, 7 from 43 , one at a time and see if the answer is divisible by any of 5, 6, 7. So
43 - 5 = 38 -> not divisible by any [5, 6, 7]
43 - 6 = 37 -> not divisible by any [5, 6, 7]
43 - 7 = 36 -> divisible by 6 in [5, 6, 7]
and the quotient is 6, which means we have to use six 6's plus the 7 that we subtracted last from 43. So the list would be the original list plus 7 and a list of 6's of length 6. So one possible solution I found is
[5, 6, 7, 7, 6, 6, 6, 6, 6, 6]
And we can verify that the sum is 61. I wrote a program in the R programming language. Here it is:
get_me_list <- function(number, mylist){
  rest <- number - sum(mylist)
  flag = FALSE
  for(i in seq_along(mylist)){
    answer <- rest - mylist[i]
    for(j in seq_along(mylist)) {
      if( answer %% mylist[j] == 0){
        repeat_factor <- answer / mylist[j]
        number_to_repeat <- mylist[j]
        pivot <- mylist[i]
        flag = TRUE
        break
      }
    }
    if(flag){
      break
    }
  }
  final_list <- c(mylist, pivot, rep(number_to_repeat, repeat_factor))
  final_list
}
So the get_me_list function takes two inputs, number and mylist. In my case number = 61 and mylist = [5, 6, 7]. In R, a vector is written as c(5, 6, 7), or 5:7 if it's a sequence from 5 to 7. So the output I get is
c(5, 6, 7, 7, 6, 6, 6, 6, 6, 6)
which is a vector in R. I tried various values of mylist and compared the results with manual solutions that I calculated with pen and paper. I am getting correct answers with the above code, but I don't know if this approach is always valid. I am of course assuming that it is possible to get a sum equal to number using only the members of mylist.
Please comment on code.
thanks
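Whether the subtract-and-divide check always succeeds is left open above. One alternative that does not depend on that question is the standard coin-change style dynamic program, sketched here in Python. It is a swapped-in technique, not the R approach from this answer, and the function name partition_with is hypothetical.

```python
def partition_with(parts, target):
    """Return a list of values from parts that sums to target, or None."""
    # last[s] is the part used last to reach sum s, or None if s is unreachable.
    last = [None] * (target + 1)
    last[0] = 0  # the empty sum is reachable without using any part
    for s in range(1, target + 1):
        for p in parts:
            if p <= s and last[s - p] is not None:
                last[s] = p
                break
    if last[target] is None:
        return None
    # Walk back from target, peeling off the recorded part each time.
    result = []
    s = target
    while s > 0:
        result.append(last[s])
        s -= last[s]
    return result


print(partition_with([5, 6, 7], 61))
```

The table has target + 1 entries and each entry tries every part once, so the work is proportional to target times the number of parts. When no partition exists, for example for 8 with the same list, the function returns None instead of failing.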

Retrieving elements from array regarding to an accumulating parameter

Assume there are 2 arrays of elements and a function call that returns elements from them. Each time a retrieval is performed, 8 elements are retrieved from array 1 and 2 from array 2. The elements to retrieve are indicated by a provided number. Assume that list 1 has 35 elements and list 2 has 7; the situation will look like this:
Assume the 2 arrays are:
array 1: 0, 1, 2, 3, 4, ..., 35
array 2: 0, 1, 2, 3, 4, 5, 6
number provided   elements from array 1             elements from array 2
---------------   -------------------------------   ---------------------
1                 0, 1, 2, 3, 4, 5, 6, 7            0, 1
11                8, 9, 10, 11, 12, 13, 14, 15      2, 3
21                16, 17, 18, 19, 20, 21, 22, 23    4, 5
31                24, 25, 26, 27, 28, 29, 30, 31    6
40                32, 33, 34, 35                    0, 1
46                0, 1, 2, 3, 4, 5, 6, 7            2, 3
56                8, 9, 10, 11, 12, 13, 14, 15      4, 5
66                16, 17, 18, 19, 20, 21, 22, 23    6
75                24, 25, 26, 27, 28, 29, 30, 31    0, 1
85                32, 33, 34, 35                    2, 3
...
Each time a retrieval is done, the count of elements returned is added to the last provided number to become the next provided number. If one of the lists is exhausted (fewer elements remain than are requested), the remaining elements are retrieved from that list, and the next retrieval starts from index 0 again, as happens when the numbers 31 and 40 are passed.
The question is: is there any way to determine what position to start at in both arrays when a number is provided? E.g. when the number 40 is given, I should start at 32 in list 1 and at 0 in list 2. In the situation above, list 1 is exhausted on every 5th retrieval, while list 2 is exhausted on every 4th retrieval, but since the provided number is based on the accumulated count of elements retrieved, how can I determine where to start when a number is given?
I have been thinking about this for days and feel really frustrated about it. Thanks for any help!
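There is a cycle. One cycle yields total_num numbers in total, and we can get total_num from the code below:
from math import gcd

def get_one_cycle_numbers(a, b):
    n = -(-len(a) // 8)  # retrievals needed to exhaust a (ceiling division)
    m = -(-len(b) // 2)  # retrievals needed to exhaust b
    g = gcd(n, m)
    # one cycle is lcm(n, m) retrievals: a is traversed m/g times, b n/g times
    total_num = len(a) * (m // g) + len(b) * (n // g)
    return total_num
When we get the provided number num, we reduce it with (num - 1) % total_num, since the provided numbers start at 1, and simulate the cycle from the beginning.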
PS: Hope I got the right understanding of the question.
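The "reduce by the cycle, then simulate" idea can be sketched end to end. This is a minimal Python sketch under stated assumptions: the function name start_positions and its parameters are hypothetical, array 1 is taken to hold the 36 values 0 to 35 shown in the question's table, and num is assumed to be one of the numbers the retrieval sequence actually produces.

```python
from math import gcd


def start_positions(num, len_a, len_b, take_a=8, take_b=2):
    """Return the start indexes in both arrays for a provided number (1-based)."""
    n = -(-len_a // take_a)          # retrievals needed to exhaust array 1
    m = -(-len_b // take_b)          # retrievals needed to exhaust array 2
    g = gcd(n, m)
    # Elements handed out during lcm(n, m) retrievals, after which both
    # arrays are back at index 0 simultaneously.
    cycle_total = len_a * (m // g) + len_b * (n // g)
    remaining = (num - 1) % cycle_total
    pos_a = pos_b = 0
    # Replay retrievals until the elements consumed so far are accounted for.
    while remaining > 0:
        got_a = min(take_a, len_a - pos_a)
        got_b = min(take_b, len_b - pos_b)
        remaining -= got_a + got_b
        pos_a = (pos_a + got_a) % len_a
        pos_b = (pos_b + got_b) % len_b
    return pos_a, pos_b


for num in (1, 11, 31, 40, 46, 85):
    print(num, start_positions(num, 36, 7))
```

With these lengths one cycle hands out 179 elements over 20 retrievals, so the loop never runs more than 20 times regardless of how large num is.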

Equality Between Base 10 and Base 16

From my textbook:
What does it mean when it says 37 subscript(16) = 55 subscript(10)?
It means that 37 in base 16 (hexadecimal) equals 55 in base 10 (decimal). The 0x preceding a number denotes that it is base 16 hexadecimal.
To see how they are equal, let's first look at the place values of 55
5, 5 (digits)
10, 1 (place values)
Each place value is 10 raised to the number of places counted from the right, so 10^0 = 1 for the ones and 10^1 = 10 for the tens.
You have a 5 in the ones place, giving you 5, and a 5 in the tens place, giving you 50. When you add them together you get 55.
5 * 10 = 50
5 * 1 = 5
5 + 50 = 55
The 37 is in hexadecimal, which means its base is 16, so the place values are 16 raised to the number of places counted from the right, which gives you
3, 7 (digits)
16, 1 (place values)
3 * 16 = 48
7 * 1 = 7
48 + 7 = 55
Because the Hexadecimal system requires 16 unique numerals it uses the letters a-f as well
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
Also, because you might see these: 0b denotes base 2 (binary), and 0o denotes base 8 (octal).
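The place-value arithmetic above can be checked with a few lines of Python. The helper from_digits is a hypothetical name used only for illustration; int, hex, bin and oct are Python built-ins, and Python happens to use the same 0x, 0b and 0o prefixes.

```python
def from_digits(digits, base):
    """Combine digits (most significant first) using place values of base."""
    value = 0
    for d in digits:
        value = value * base + d  # shift one place to the left, then add the digit
    return value


print(from_digits([3, 7], 16))   # 3 * 16 + 7 * 1
print(from_digits([5, 5], 10))   # 5 * 10 + 5 * 1
print(int("37", 16) == 55)       # built-in parsing of a base 16 string
print(hex(55), bin(55), oct(55))
```

Both from_digits calls give 55, which is exactly the equality the textbook states.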

finding maximal subsets

For given n, find the subset S of {1,2,...,n} such that
all elements of S are pairwise coprime
the sum of the elements of S is as large as possible
Doing a brute force search takes too long and I can't find a pattern. I know that I can just take all the primes from 1 to n, but that's probably not the right answer. Thanks.
I would tackle this as a dynamic programming problem. Let me walk through it for 20. First take the primes in reverse order.
19, 17, 13, 11, 7, 5, 3, 2
Now we're going to walk up the best solutions which have used subsets of those primes of increasing size. We're going to do a variation of breadth first search, but with the trick that we always use the largest currently unused prime (plus possibly more). I will represent all of the data structures in the form size: {set} = (total, next_number). (I'm doing this by hand, so all mistakes are mine.) Here is how we build up the data structure. (In each step I consider all ways of growing all sets of one smaller size from the previous step, and take the best totals.)
Try to reproduce this listing and, modulo any mistakes I made, you should have an algorithm.
Step 0
0: {} => (1, 1)
Step 1
1: {19} => (20, 19)
Step 2
2: {19, 17} => (37, 17)
Step 3
3: {19, 17, 13} => (50, 13)
Step 4
4: {19, 17, 13, 11} => (61, 11)
Step 5
5: {19, 17, 13, 11, 7} => (68, 7)
6: {19, 17, 13, 11, 7, 2} => (75, 14)
Step 6
6: {19, 17, 13, 11, 7, 5} => (73, 5)
{19, 17, 13, 11, 7, 2} => (75, 14)
7: {19, 17, 13, 11, 7, 5, 2} => (88, 20)
{19, 17, 13, 11, 7, 5, 3} => (83, 15)
Step 7
7: {19, 17, 13, 11, 7, 5, 2} => (88, 20)
{19, 17, 13, 11, 7, 5, 3} => (83, 15)
8: {19, 17, 13, 11, 7, 5, 3, 2} => (91, 18)
Step 8
8: {19, 17, 13, 11, 7, 5, 3, 2} => (99, 16)
And now we just trace the data structures backwards to read off 16, 15, 7, 11, 13, 17, 19, 1 which we can sort to get 1, 7, 11, 13, 15, 16, 17, 19.
(Note there are a lot of details to get right to turn this into a solution. Good luck!)
You can do a little better by taking powers of primes, up to the bound you have. For example, suppose that n=30. Then you want to start with
1, 16, 27, 25, 7, 11, 13, 17, 19, 23, 29
Now look at where there are places to improve. Certainly you cannot increase any of the primes that are already at least n/2: 17, 19, 23, 29 (why?). Also, 3^3 and 5^2 are pretty close to 30, so they're also probably best left alone (why?).
But what about 2^4, 7, 11 and 13? We can take the 2's and combine them with 7, 11, or 13. This would give:
2 * 13 = 26 replaces 16 + 13 = 29 BAD
2 * 11 = 22 replaces 16 + 11 = 27 BAD
2^2 * 7 = 28 replaces 16 + 7 = 23 GOOD
So it looks like we should get the following list (now sorted):
1, 11, 13, 17, 19, 23, 25, 27, 28, 29
Try to prove that this cannot be improved, and that should give you some insight into the general case.
Good luck!
The following is quite practical.
Let N = {1, 2, 3, ..., n}.
Let p1 < p2 < p3 < ... < pk be the primes in N.
Let Ti be the natural numbers in N divisible by pi but not by any prime less than pi.
We can pick at most one number from each subset Ti.
Now recurse.
S = {1}.
Check if pi is a divisor of any of the numbers already in S. If it is, skip Ti.
Otherwise, pick a number xi from Ti coprime to the elements already in S, and add it to S.
Go to next i.
When we reach k + 1, calculate the sum of the elements in S. If new maximum, save S away.
Continue.
Take n = 30.
The primes are 2, 3, 5, 7, 11, 13, 17, 19, 23, and 29.
T1 = {2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30}
T2 = {3, 9, 15, 21, 27}
T3 = {5, 25}
T4 = {7}
T5 = {11}
T6 = {13}
T7 = {17}
T8 = {19}
T9 = {23}
T10 = {29}
So fewer than 15 * 5 * 2 = 150 possibilities.
Here is my original wrong result for n = 100.
1 17 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 88 89 91 95 97 99
Sum = 1374
It should be
1 17 23 29 31 37 41 43 47 53 59 61 67 71 73 79 81 83 88 89 91 95 97
Sum = 1356
Less than 2 seconds for n = 150. About 9 seconds for n = 200.
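The class-by-class search described in this answer can be sketched as a small backtracking routine. This is a minimal Python sketch, not the answerer's program: the function name best_coprime_subset is hypothetical and no pruning is attempted, so it is only meant for small n.

```python
from math import gcd


def best_coprime_subset(n):
    """Return (sum, subset) for a maximum-sum pairwise coprime subset of 1..n."""
    primes = [p for p in range(2, n + 1)
              if all(p % d for d in range(2, int(p ** 0.5) + 1))]
    # classes[i] holds the numbers whose smallest prime factor is primes[i] (the sets Ti).
    classes = [[x for x in range(p, n + 1, p) if all(x % q for q in primes[:i])]
               for i, p in enumerate(primes)]
    best = (0, [])

    def search(i, chosen, total):
        nonlocal best
        if i == len(primes):
            if total > best[0]:
                best = (total, sorted(chosen))
            return
        if any(x % primes[i] == 0 for x in chosen):
            search(i + 1, chosen, total)      # prime already used, skip Ti
            return
        for x in classes[i]:
            if all(gcd(x, y) == 1 for y in chosen):
                search(i + 1, chosen + [x], total + x)

    search(0, [1], 1)                         # 1 is coprime to everything
    return best


print(best_coprime_subset(20))
print(best_coprime_subset(30))
```

Skipping a class whose prime is still unused is never needed, because the prime itself is always an available choice and no later class contains a multiple of it. For n = 30 the sketch returns the sum 193 with the list 1, 11, 13, 17, 19, 23, 25, 27, 28, 29 given in the previous answer.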
I think that this is similar to the subset problem, which is NP-Complete.
First, break each number into its prime factors (or use a list of primes to generate the full list from 1 to n, same thing).
Solve the subset problem with recursive descent by finding a subset whose members share no common primes.
Run through all solutions and find the largest one.
I implemented a recursive solution in Prolog, based on taking the list of integers in descending order. On my fairly ancient Toshiba laptop, SWI-Prolog produces answers without hesitation for N < 90. Here are some timings for N = 100 to 150 by tens:
N Sum Time(s)
----- --------- -------
100 1356 1.9
110 1778 2.4
120 1962 4.2
130 2273 11.8
140 2692 16.3
150 2841 30.5
The timings reflect an implementation that starts from scratch for each value of N. A lot of the computation for N+1 can be skipped if the result for N is previously known, so if a range of values N are to be computed, it would make sense to take advantage of that.
Prolog source code follows.
/*
   Check if two positive integers are coprime
   recursively via Euclid's division algorithm
*/
coprime(0,Z) :- !, Z = 1.
coprime(A,B) :-
    C is B mod A,
    coprime(C,A).

/*
   Find the sublist of first argument that are
   integers coprime to the second argument
*/
listCoprime([ ],_,[ ]).
listCoprime([H|T],X,L) :-
    (   coprime(H,X)
    ->  L = [H|M]
    ;   L = M
    ),
    listCoprime(T,X,M).

/*
   Find the sublist of first argument of coprime
   integers having the maximum possible sum
*/
sublistCoprimeMaxSum([ ],S,[ ],S).
sublistCoprimeMaxSum([H|T],A,L,S) :-
    listCoprime(T,H,R),
    B is A+H,
    sublistCoprimeMaxSum(R,B,U,Z),
    (   T = R
    ->  ( L = [H|U], S = Z )
    ;   (   sublistCoprimeMaxSum(T,A,V,W),
            (   W < Z
            ->  ( L = [H|U], S = Z )
            ;   ( L = V, S = W )
            )
        )
    ).

/* Test utility to generate list N,..,1 */
list1toN(1,[1]).
list1toN(N,[N|L]) :-
    N > 1,
    M is N-1,
    list1toN(M,L).

/* Test calling sublistCoprimeMaxSum/4 */
testCoprimeMaxSum(N,CoList,Sum) :-
    list1toN(N,L),
    sublistCoprimeMaxSum(L,0,CoList,Sum).

Resources