A number N is given in the range 1 <= N <= 10^50. A function F(x) is defined as the sum of the digits of x. We have to count the special pairs (x, y) such that:
1. 0 <= x, y <= N
2. F(x) + F(y) is prime
(x, y) and (y, x) are counted only once.
Print the output modulo 1000000007 (10^9 + 7).
My approach:
The maximum digit sum in the given range is 450 (a 50-digit number made entirely of 9s gives 9*50 = 450). So we can create a 2-D array of size 451*451 and store, for every pair (i, j), whether i + j is prime.
The issue I am facing is counting all the pairs (x, y) for a given N efficiently (obviously, we cannot loop over 10^50 values). Can someone suggest an approach, or a formula if one exists, to count all the pairs in linear time?
You can create a 2-D array of size 451*451 and store, for every pair (i, j), whether i + j is prime. If you also know how many numbers up to N have F(x) = i and how many have F(x) = j, then whenever i + j is prime, the product of those two counts is the number of ordered pairs contributed by the state (i, j).
So what you need is the count of numbers that have F(x) = i.
You can easily get it using digit DP:
Digit DP for counting the numbers that have F(x) = i:
string given;           // N read as a decimal string
int i;                  // target digit sum, set by the caller's loop
int DP[51][2][452];     // memo table; every entry is reset to -1 (memset) before each run
int digitDp(int pos,int small,int sum)
{
    if(pos==given.size())
    {
        if(sum==i) return 1;
        else return 0;
    }
    if(DP[pos][small][sum]!=-1)return DP[pos][small][sum];
    int res=0;
    if(small)
    {
        for(int j=0; j<=9; j++)res=(res+digitDp(pos+1,small,sum+j))%1000000007;
    }
    else
    {
        int hi=given[pos]-'0';
        for(int j=0; j<=hi; j++)
        {
            if(j==hi)res=(res+digitDp(pos+1,small,sum+j))%1000000007;
            else res=(res+digitDp(pos+1,1,sum+j))%1000000007;
        }
    }
    return DP[pos][small][sum]=res;
}
This function returns how many numbers in the range 0..N have F(x) = i.
So we can call it for every i from 0 to 451 and store the results in an array.
int res[452];
for(i=0;i<=451;i++){
memset(DP,-1,sizeof DP);
res[i]=digitDp(0,0,0);
}
Now test each pair (k, j):
long long answer=0;
for(int k=0;k<=451;k++){
    for(int j=0;j<=451;j++){
        if(isprime(k+j)){
            answer=(answer+(long long)res[k]*res[j])%1000000007;
        }
    }
}
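Here answer counts ordered pairs, so every pair with x != y appears twice, as (x, y) and (y, x). Pairs with x = y appear only once, and they qualify only when F(x) = 1 (2*F(x) is prime only for F(x) = 1), so, assuming x = y is allowed, there are res[1] of them. The number of unordered pairs is therefore (answer + res[1]) / 2.
Because answer is already reduced modulo 1000000007, do not halve it with integer division; multiply by 500000004, the modular inverse of 2, instead.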
Here's the answer in Python, which makes the code a bit easier to read and understand.
primes = set([2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 541, 547, 557, 563, 569, 571, 577, 587, 593, 599, 601, 607, 613, 617, 619, 631, 641, 643, 647, 653, 659, 661, 673, 677, 683, 691, 701, 709, 719, 727, 733, 739, 743, 751, 757, 761, 769, 773, 787, 797, 809, 811, 821, 823, 827, 829, 839, 853, 857, 859, 863, 877, 881, 883, 887, 907, 911, 919, 929, 937, 941, 947, 953, 967, 971, 977, 983, 991, 997])
DP = []
given = ''
k = 0
def memset():
    global DP
    DP = [[[-1 for k in range(452)] for j in range(2)] for i in range(51)]
def digitDp(pos , small , final):
    global DP , k
    if pos == len(given):
        if final == k:
            return 1
        else:
            return 0
    if DP[pos][small][final] != -1:
        return DP[pos][small][final]
    res = 0
    if small:
        for i in range(10):
            res = (res + digitDp(pos + 1 , small , final + i)) % 1000000007
    else:
        hi = int(given[pos])
        for i in range(hi+1):
            if(i == hi):
                res = (res + digitDp(pos + 1 , small , final + i)) % 1000000007
            else:
                res = (res + digitDp(pos + 1 , 1 , final + i)) % 1000000007
    DP[pos][small][final] = res
    return DP[pos][small][final]
def main():
    result = [0] * 452
    global primes , k , given
    given = str(input())
    for k in range(452):
        memset()
        result[k] = digitDp(0 , 0 , 0)
    answer = 0
    for i in range(452):
        for j in range(452):
            if (i+j) in primes:
                answer = (answer + result[i] * result[j]) % 1000000007
    # answer counts ordered pairs; pairs with x == y (only possible when F(x) == 1) appear once.
    # Assuming x == y is allowed, halve via the modular inverse of 2 instead of integer division.
    print((answer + result[1]) * 500000004 % 1000000007)
main()
Thanks to #mahbubcseju for providing the solution to this problem.
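The solutions above rerun the digit DP once per target digit sum. A single pass can count every digit sum at once. The following is a minimal sketch of that idea, not the original author's code: the names digit_sum_counts, prime_table and count_special_pairs are made up for illustration, the primes come from a sieve instead of a hard-coded list, and it assumes that pairs with x = y are allowed. It keeps exact Python integers and reduces modulo 10^9 + 7 only at the end, so the final halving is a plain integer division.

```python
MOD = 1_000_000_007

def digit_sum_counts(n_str):
    """counts[s] = how many x in [0, N] have digit sum s (exact integers)."""
    max_sum = 9 * len(n_str)
    tight = [0] * (max_sum + 1)   # prefixes equal to N's prefix so far
    loose = [0] * (max_sum + 1)   # prefixes already smaller than N's prefix
    tight[0] = 1
    for ch in n_str:
        limit = int(ch)
        new_tight = [0] * (max_sum + 1)
        new_loose = [0] * (max_sum + 1)
        for s in range(max_sum + 1):
            if loose[s]:
                for d in range(10):          # any digit keeps the prefix smaller
                    new_loose[s + d] += loose[s]
            if tight[s]:
                for d in range(limit):       # a smaller digit makes the prefix smaller
                    new_loose[s + d] += tight[s]
                new_tight[s + limit] += tight[s]
        tight, loose = new_tight, new_loose
    return [t + l for t, l in zip(tight, loose)]

def prime_table(limit):
    """Sieve of Eratosthenes: is_prime[v] for 0 <= v <= limit (limit >= 2)."""
    is_prime = [False, False] + [True] * (limit - 1)
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for m in range(p * p, limit + 1, p):
                is_prime[m] = False
    return is_prime

def count_special_pairs(n_str):
    counts = digit_sum_counts(n_str)
    is_prime = prime_table(2 * (len(counts) - 1))
    ordered = 0
    for i, ci in enumerate(counts):
        for j, cj in enumerate(counts):
            if is_prime[i + j]:
                ordered += ci * cj
    # x == y qualifies only when 2*F(x) is prime, i.e. F(x) == 1 (assumed allowed)
    diagonal = counts[1]
    return ((ordered + diagonal) // 2) % MOD
```

Calling count_special_pairs(input().strip()) would print-ready the answer for N given as a decimal string; for small N the result can be checked against a brute-force double loop.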
I have some code that I parallelized using Rayon hoping to improve its performance, but the results, measured by the Bencher, were... most unimpressive. I suspected that it might be caused by the way I am performing the benchmarks (maybe they are run in parallel?), so I tested a simpler case.
Consider the following parallelized code, based on the quick_sort crate:
#![feature(test)]
extern crate rayon;
extern crate test;
use test::Bencher;
use std::cmp::Ordering;
pub fn sort<T>(arr: &mut [T])
where T: Send + std::cmp::PartialEq + Ord
{
qsort(arr, find_pivot, &|a, b| a.cmp(b))
}
pub fn sort_by<T, F>(arr: &mut [T], compare: &F)
where T: Send + std::cmp::PartialOrd,
F: Sync + Fn(&T, &T) -> Ordering
{
qsort(arr, find_pivot, compare);
}
fn qsort<T, F>(arr: &mut [T], pivot: fn(&[T], &F) -> usize, compare: &F)
where T: Send + std::cmp::PartialOrd,
F: Sync + Fn(&T, &T) -> Ordering
{
let len = arr.len();
if len <= 1 {
return;
}
let p = pivot(arr, compare);
let p = partition(arr, p, compare);
let (l, r) = arr.split_at_mut(p + 1);
if p > len / 2 {
rayon::join(
|| qsort(r, pivot, compare),
|| qsort(l, pivot, compare)
);
} else {
rayon::join(
|| qsort(l, pivot, compare),
|| qsort(r, pivot, compare)
);
}
}
fn find_pivot<T, F>(arr: &[T], compare: &F) -> usize
where T: Send + std::cmp::PartialOrd,
F: Sync + Fn(&T, &T) -> Ordering
{
let (l, r) = (0, arr.len() - 1);
let m = l + ((r - 1) / 2);
let (left, middle, right) = (&arr[l], &arr[m], &arr[r]);
if (compare(middle, left) != Ordering::Less) && (compare(middle, right) != Ordering::Greater) {
m
} else if (compare(left, middle) != Ordering::Less) &&
(compare(left, right) != Ordering::Greater) {
l
} else {
r
}
}
fn partition<T, F>(arr: &mut [T], p: usize, compare: &F) -> usize
where T: std::cmp::PartialOrd,
F: Sync + Fn(&T, &T) -> Ordering
{
if arr.len() <= 1 {
return p;
}
let last = arr.len() - 1;
let mut next_pivot = 0;
arr.swap(last, p);
for i in 0..last {
if compare(&arr[i], &arr[last]) == Ordering::Less {
arr.swap(i, next_pivot);
next_pivot += 1;
}
}
arr.swap(next_pivot, last);
next_pivot
}
#[bench]
fn bench_qsort(b: &mut Bencher) {
let mut vec = vec![ 3, 97, 50, 56, 58, 80, 91, 71, 83, 65,
92, 35, 11, 26, 69, 44, 42, 75, 40, 43,
63, 5, 62, 56, 35, 3, 51, 97, 100, 73,
42, 41, 79, 86, 93, 58, 65, 96, 66, 36,
17, 97, 6, 16, 52, 30, 38, 14, 39, 7,
48, 83, 37, 97, 21, 58, 41, 59, 97, 37,
97, 9, 24, 78, 77, 7, 78, 80, 11, 79,
42, 30, 39, 27, 71, 61, 12, 8, 49, 62,
69, 48, 8, 56, 89, 27, 1, 80, 31, 62,
7, 15, 30, 90, 75, 78, 22, 99, 97, 89];
b.iter(|| { sort(&mut vec); } );
}
Results of cargo bench:
running 1 test
test bench_qsort ... bench: 10,374 ns/iter (+/- 296) // WHAT
While the results for the sequential code (extern crate quick_sort) are:
running 1 test
test bench_qsort ... bench: 1,070 ns/iter (+/- 65)
I also tried benchmarking with longer vectors, but the results were consistent. In addition, I performed some tests using GNU time and it looks like the parallel code is faster with bigger vectors, as expected.
What am I doing wrong? Can I use Bencher to benchmark parallel code?
The array you use in the test is so small that the parallel code really is slower in that case.
There's some overhead to launching tasks in parallel, and the memory access will be slower when different threads access memory on the same cache line.
For iterators, with_min_len avoids this overhead on tiny inputs, but for join you probably need to implement the parallel/sequential decision yourself.
With 100 times larger array:
with rayon: 3,468,891 ns/iter (+/- 95,859)
without rayon: 4,227,220 ns/iter (+/- 635,260)
rayon if len > 1000: 3,166,570 ns/iter (+/- 66,329)
The relatively small speed-up is expected for this task, because it's memory-bound (there's no complex computation to parallelize).
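The "rayon if len > 1000" row corresponds to a sequential cutoff: split into concurrent tasks only while the slice is large, and recurse sequentially below the threshold. Here is a rough sketch of that decision structure, written in Python rather than Rust. The names qsort and SEQUENTIAL_CUTOFF are hypothetical, and because of CPython's global interpreter lock these threads will not give a real speed-up; the sketch only shows where the decision goes.

```python
import threading

# Assumed threshold, mirroring the "len > 1000" measurement above.
SEQUENTIAL_CUTOFF = 1000

def partition(arr, lo, hi):
    """Lomuto partition of arr[lo..hi] around arr[hi]; returns the pivot's final index."""
    pivot = arr[hi]
    store = lo
    for i in range(lo, hi):
        if arr[i] < pivot:
            arr[i], arr[store] = arr[store], arr[i]
            store += 1
    arr[store], arr[hi] = arr[hi], arr[store]
    return store

def qsort(arr, lo=0, hi=None):
    """In-place quicksort that only forks a second task for large ranges."""
    if hi is None:
        hi = len(arr) - 1
    if lo >= hi:
        return
    p = partition(arr, lo, hi)
    if hi - lo + 1 > SEQUENTIAL_CUTOFF:
        # Large range: sort the left half in another thread (like one arm of join).
        worker = threading.Thread(target=qsort, args=(arr, lo, p - 1))
        worker.start()
        qsort(arr, p + 1, hi)
        worker.join()
    else:
        # Small range: task overhead would dominate, so stay sequential.
        qsort(arr, lo, p - 1)
        qsort(arr, p + 1, hi)
```

The two halves never overlap, so no locking is needed. In the Rust version the same branch would choose between rayon::join and two plain recursive calls.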
There is the problem of trying to see if two unique strings are anagrams of each other. The first solution that I had considered would be to sort both strings and see if they were equal to each other.
I have been considering another solution and I would like to discuss if the same would be feasible.
The idea would be to assign a numerical value to each character and sum it up such that a unique set of characters would produce a unique value. As we are testing for anagrams, we do not mind if the checksum of "asdf" and "adsf" are the same -- in fact, we require it to be that way. However the checksum of strings "aa" and "b" shouldn't be equal.
I was considering assigning the first 52 prime numbers to the letters "a" through "z" and then "A" through "Z" (assume we only have letters).
The above scheme would break if the sum of any two or more primes in the set of 52 primes could result in another prime existing in the set.
My questions are:
Is there any numbering scheme that would satisfy my requirements?
I'm unsure about the math involved: is there any proof that the sum of two or more primes from the first 52 primes can equal another value in the same set?
Thanks.
Use multiplication instead of addition. Primes are "multiplicatively unique", but not "additively unique".
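A small sketch of the difference, assuming a hypothetical mapping a -> 2, b -> 3, c -> 5 and so on for lowercase letters only. With sums, "ab" and "c" both give 5; with products, unique factorization keeps them apart.

```python
from math import prod

# Hypothetical mapping for illustration: a -> 2, b -> 3, c -> 5, ...
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
          43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]

def letter_prime(ch):
    """Prime assigned to a lowercase letter."""
    return PRIMES[ord(ch) - ord('a')]

def prime_sum(word):
    """Additive checksum: different letter multisets can collide."""
    return sum(letter_prime(ch) for ch in word)

def prime_product(word):
    """Multiplicative checksum: equal only for equal letter multisets."""
    return prod(letter_prime(ch) for ch in word)
```

Python integers never overflow, so the product stays exact here; in a language with fixed-width integers it would overflow for long strings.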
A slightly more clunky way to do it would require the length of your longest string max_len (or the largest number of any specific character for slightly better performance). Given that, your hash could look like
number_of_a*max_len^51 + number_of_b*max_len^50 + ... + number_of_Z*max_len^0
If you preferred to use primes, multiplication will work better, as previously mentioned.
Of course, you could achieve the same effect by having an array of 52 values instead.
You are trying to compare two sorted strings for equality by comparing two n-bit numbers for equality. As soon as your strings are long enough that there are more than 2^n possible sorted strings, you will definitely have two different sorted strings that produce the same n-bit number. It is likely, by the birthday problem (http://en.wikipedia.org/wiki/Birthday_problem), that you will hit problems before this, unless (as with multiplication of primes) there is some theorem saying that two different strings cannot produce the same number.
In some cases you might save time by using this idea as a quick first check for equality, so that you only need to compare sorted strings if their numbers match.
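As a sketch of that quick first check (the function name and the choice of checksum are assumptions, not part of the answer above): a cheap additive checksum rejects most non-anagrams, and the sorted comparison runs only when the checksums agree.

```python
def is_anagram(a, b):
    """Cheap checksum first; fall back to the exact sorted comparison on a match."""
    if len(a) != len(b):
        return False
    # Sum of code points: equal for every pair of anagrams, but collisions are possible.
    if sum(map(ord, a)) != sum(map(ord, b)):
        return False
    return sorted(a) == sorted(b)
```

For example "ac" and "bb" have the same checksum (97 + 99 = 98 + 98), so they pass the first test and are only told apart by the sorted comparison.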
Don't use sums of prime numbers: the useful properties of primes relate to division (unique factorization), not to addition.
However, the idea is good. You could use bit sets, but you would hit another problem: duplicate letters (the same problem as with sums, e.g. 1+1+1=3).
So you can use an integer multiset instead: an array 1...26 holding the frequency of each letter.
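A minimal sketch of the frequency-array idea, assuming only the letters a to z matter and case is ignored (the helper names are made up for illustration):

```python
def letter_counts(word):
    """26-slot frequency array; non-letters are skipped."""
    counts = [0] * 26
    for ch in word.lower():
        if 'a' <= ch <= 'z':
            counts[ord(ch) - ord('a')] += 1
    return counts

def is_anagram_by_counts(a, b):
    """Two strings are anagrams exactly when their frequency arrays match."""
    return letter_counts(a) == letter_counts(b)
```

This runs in time linear in the string lengths and cannot collide, at the cost of comparing 26 integers instead of one.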
def primes(n):
    array = [i for i in range(2,n+1)]
    p = 2
    while p <= n:
        i = 2*p
        while i <= n:
            array[i-2] = 0
            i += p
        p += 1
    return [num for num in array if num > 0]
def anagram(a):
    # initialize a list
    anagram_list = []
    for i in a:
        for j in a:
            if i != j and (sorted(str(i))==sorted(str(j))):
                anagram_list.append(i)
    return anagram_list
if __name__ == '__main__':
    a = primes(1000)
    print("The Prime Numbers are:\n", a, "\n")
    # one block of output per hundred: 0-100, 101-200, ..., 901-1000
    for hi in range(100, 1001, 100):
        lo = 0 if hi == 100 else hi - 99
        chunk = [p for p in a if lo <= p <= hi]
        print("Prime Numbers between %d to %d:" % (lo, hi))
        print(chunk, "\n")
        print("The Anagram elements from %d to %d are listed :" % (lo, hi), anagram(chunk), "\n")
Output:
The Prime Numbers are:
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 541, 547, 557, 563, 569, 571, 577, 587, 593, 599, 601, 607, 613, 617, 619, 631, 641, 643, 647, 653, 659, 661, 673, 677, 683, 691, 701, 709, 719, 727, 733, 739, 743, 751, 757, 761, 769, 773, 787, 797, 809, 811, 821, 823, 827, 829, 839, 853, 857, 859, 863, 877, 881, 883, 887, 907, 911, 919, 929, 937, 941, 947, 953, 967, 971, 977, 983, 991, 997]
Prime Numbers between 0 to 100:
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]
The Anagram elements from 0 to 100 are listed : [13, 17, 31, 37, 71, 73, 79, 97]
Prime Numbers between 101 to 200:
[101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199]
The Anagram elements from 101 to 200 are listed : [113, 131, 137, 139, 173, 179, 193, 197]
Prime Numbers between 201 to 300:
[211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293]
The Anagram elements from 201 to 300 are listed : [239, 293]
Prime Numbers between 301 to 400:
[307, 311, 313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397]
The Anagram elements from 301 to 400 are listed : [313, 331, 337, 373, 379, 397]
Prime Numbers between 401 to 500:
[401, 409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499]
The Anagram elements from 401 to 500 are listed : [419, 491]
Prime Numbers between 501 to 600:
[503, 509, 521, 523, 541, 547, 557, 563, 569, 571, 577, 587, 593, 599]
The Anagram elements from 501 to 600 are listed : []
Prime Numbers between 601 to 700:
[601, 607, 613, 617, 619, 631, 641, 643, 647, 653, 659, 661, 673, 677, 683, 691]
The Anagram elements from 601 to 700 are listed : [613, 619, 631, 691]
Prime Numbers between 701 to 800:
[701, 709, 719, 727, 733, 739, 743, 751, 757, 761, 769, 773, 787, 797]
The Anagram elements from 701 to 800 are listed : []
Prime Numbers between 801 to 900:
[809, 811, 821, 823, 827, 829, 839, 853, 857, 859, 863, 877, 881, 883, 887]
The Anagram elements from 801 to 900 are listed : []
Prime Numbers between 901 to 1000:
[907, 911, 919, 929, 937, 941, 947, 953, 967, 971, 977, 983, 991, 997]
The Anagram elements from 901 to 1000 are listed : [919, 991]
You can also restructure it however you like if you are a Python developer.
If you are new to Python, please learn about lists first.
I don't understand the complexity of the attempted examples so far, so I wrote a simple Python 3 example:
from operator import mul
from functools import reduce
TO_PRIME = dict(
    a=2, b=3, c=5, d=7, e=11,
    f=13, g=17, h=19, i=23, j=29,
    k=31, l=37, m=41, n=43, o=47,
    p=53, q=59, r=61, s=67, t=71,
    u=73, v=79, w=83, x=89, y=97, z=101
)
def anagram_product(string):
    return reduce(mul, (TO_PRIME[char.lower()] for char in string if char.isalpha()), 1)
def anagram_check(string_1, string_2):
    return anagram_product(string_1) == anagram_product(string_2)
# True examples
print(repr('Astronomer'), repr('Moon starer'), anagram_check('Astronomer', 'Moon starer'))
print(repr('The Morse code'), repr('Here come dots'), anagram_check('The Morse code', 'Here come dots'))
# False examples (near misses)
print(repr('considerate'), repr('cure is noted'), anagram_check('considerate', 'cure is noted'))
print(repr('debit card'), repr('bed credit'), anagram_check('debit card', 'bed credit'))
OUTPUT
> python3 test.py
'Astronomer' 'Moon starer' True
'The Morse code' 'Here come dots' True
'considerate' 'cure is noted' False
'debit card' 'bed credit' False
>
The next step is to get this from a product to a sum. One approach I imagine is to map the letters to irrational numbers instead of primes. These irrational numbers would need to be of a type that don't become rational through any sort of addition. Here's a crude example:
from math import pi
ROUND = 4
TO_IRRATIONAL = {letter: pi ** n for n, letter in enumerate('abcdefghijklmnopqrstuvwxyz')}
def anagram_sum(string):
    return round(sum(TO_IRRATIONAL[char.lower()] for char in string if char.isalpha()), ROUND)
def anagram_check(string_1, string_2):
    return anagram_sum(string_1) == anagram_sum(string_2)
# True examples
print(repr('Astronomer'), repr('Moon starer'), anagram_check('Astronomer', 'Moon starer'))
print(repr('The Morse code'), repr('Here come dots'), anagram_check('The Morse code', 'Here come dots'))
# False examples (near misses)
print(repr('considerate'), repr('cure is noted'), anagram_check('considerate', 'cure is noted'))
print(repr('debit card'), repr('bed credit'), anagram_check('debit card', 'bed credit'))
OUTPUT
> python3 test2.py
'Astronomer' 'Moon starer' True
'The Morse code' 'Here come dots' True
'considerate' 'cure is noted' False
'debit card' 'bed credit' False
>
I'm not saying this is an optimal set of irrational numbers, just a rough demonstration of the concept. (Note my need to use round() to make this work which points to one design flaw -- the finite representation of irrational numbers.)
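One concrete consequence of the finite representation is that floating-point addition depends on the order of the terms, so two anagrams can produce slightly different sums before rounding. The sketch below illustrates this with ordinary decimals instead of powers of pi, and shows that math.fsum, which tracks exact partial sums, gives the same result for either order. The helper naive_sum is only there to make the left-to-right order explicit.

```python
import math

def naive_sum(values):
    """Plain left-to-right floating-point addition."""
    total = 0.0
    for v in values:
        total += v
    return total

forward = [0.1, 0.2, 0.3]
backward = [0.3, 0.2, 0.1]

# The same three terms, added in a different order, give different floats.
order_matters = naive_sum(forward) != naive_sum(backward)

# math.fsum rounds the exact sum once, so the order no longer matters.
fsum_agrees = math.fsum(forward) == math.fsum(backward)
```

Swapping sum() for math.fsum() in anagram_sum would remove the order dependence, but it would not make the scheme collision-free, since the inputs are still rounded approximations of irrational numbers.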
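Here is an implementation in C# using the prime-number approach. It checks whether the haystack contains an anagram of the needle. Note that the product is held in an int, so long needles can overflow and produce false positives: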
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace Anag
{
class Program
{
private static int[] primes100 = new int[]
{
3, 7, 11, 17, 23, 29, 37,
47, 59, 71, 89, 107, 131,
163, 197, 239, 293, 353,
431, 521, 631, 761, 919,
1103, 1327, 1597, 1931,
2333, 2801, 3371, 4049,
4861, 5839, 7013, 8419,
10103, 12143, 14591, 17519,
21023, 25229, 30293, 36353,
43627, 52361, 62851, 75431,
90523, 108631, 130363,
156437, 187751, 225307,
270371, 324449, 389357,
467237, 560689, 672827,
807403, 968897, 1162687,
1395263, 1674319, 2009191,
2411033, 2893249, 3471899,
4166287, 4999559, 5999471,
7199369
};
private static int[] getNPrimes(int _n)
{
int[] _primes;
if (_n <= primes100.Length)
_primes = primes100.Take(_n).ToArray();
else
{
_primes = new int[_n];
int number = 0;
int i = 2;
while (number < _n)
{
var isPrime = true;
for (int j = 2; j <= Math.Sqrt(i); j++)
{
if (i % j == 0 && i != 2)
isPrime = false;
}
if (isPrime)
{
_primes[number] = i;
number++;
}
i++;
}
}
return _primes;
}
private static bool anaStrStr(string needle, string haystack)
{
bool _output = false;
var needleDistinct = needle.ToCharArray().Distinct();
int[] arrayOfPrimes = getNPrimes(needleDistinct.Count());
Dictionary<char, int> primeByChar = new Dictionary<char, int>();
int i = 0;
int needlePrimeSignature = 1;
foreach (var c in needleDistinct)
{
if (!primeByChar.ContainsKey(c))
{
primeByChar.Add(c, arrayOfPrimes[i]);
i++;
}
}
foreach (var c in needle)
{
needlePrimeSignature *= primeByChar[c];
}
for (int j = 0; j <= (haystack.Length - needle.Length); j++)
{
var result = 1;
for (int k = j; k < needle.Length + j; k++)
{
var letter = haystack[k];
result *= primeByChar.ContainsKey(letter) ? primeByChar[haystack[k]] : 0;
}
_output = (result == needlePrimeSignature);
if (_output)
break;
}
return _output;
}
static void Main(string[] args)
{
Console.WriteLine("Enter needle");
var _needle = Console.ReadLine(); ;
Console.WriteLine("Enter haystack");
var _haystack = Console.ReadLine();
Console.WriteLine(anaStrStr(_needle, _haystack));
Console.ReadLine();
}
}
}
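For comparison, here is a sketch of the same needle-in-haystack anagram search using letter counts and a sliding window instead of prime products. It is written in Python rather than C#, the function name contains_anagram is an assumption, and it is not a translation of the code above; it sidesteps the overflow problem and updates the window in constant time per step.

```python
from collections import Counter

def contains_anagram(needle, haystack):
    """True if some substring of haystack is an anagram of needle."""
    n = len(needle)
    if n == 0:
        return True
    if n > len(haystack):
        return False
    need = Counter(needle)
    window = Counter(haystack[:n])      # counts for the first window
    if window == need:
        return True
    for i in range(n, len(haystack)):
        window[haystack[i]] += 1        # character entering the window
        old = haystack[i - n]           # character leaving the window
        window[old] -= 1
        if window[old] == 0:
            del window[old]             # keep the Counter free of zero entries
        if window == need:
            return True
    return False
```

For example, contains_anagram("abc", "xxbcaxx") finds "bca", while contains_anagram("abc", "aabbcc") finds nothing.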
I have a two-dimensional array of integers. I would like to write fast, optimized code to sum all the columns of the array.
Any thoughts on how I might do this using LINQ/PLINQ/Task parallelization?
Ex:
private int[,] m_indexes = new int[6,4] { {367, 40, 74, 15},
{535, 226, 74, 15},
{368, 313, 74, 15},
{197, 316, 74, 15},
{27, 226, 74, 15},
{194, 41, 74, 15} };
The simplest parallel implementation:
int[,] m_indexes = new int[6, 4] { {367, 40, 74, 15},
{535, 226, 74, 15},
{368, 313, 74, 15},
{197, 316, 74, 15},
{27, 226, 74, 15},
{194, 41, 74, 15} };
var columns = Enumerable.Range(0, 4);
int[] sums = new int[4];
Parallel.ForEach(columns, column => {
int sum = 0;
for (int i = 0; i < 6; i++) {
sum += m_indexes[i, column];
}
sums[column] = sum;
});
This code can obviously be "generalized" (use m_indexes.GetLength(0) and m_indexes.GetLength(1)).
LINQ:
var sums = columns.Select(
column => {
int sum = 0;
for (int i = 0; i < 6; i++) {
sum += m_indexes[i, column];
} return sum;
}
).ToArray();
Be sure to profile on real-world data if you truly need to optimize for performance.
Also, if you truly care about performance, try to lay out your array so that you are summing across rows. You'll get better cache locality that way.
Or maybe without for loops (note that this computes the grand total of all elements rather than one sum per column):
List<List<int>> m_indexes = new List<List<int>>() { new List<int>(){367, 40, 74, 15},
new List<int>(){535, 226, 74, 15},
new List<int>(){368, 313, 74, 15},
new List<int>(){197, 316, 74, 15},
new List<int>(){27, 226, 74, 15},
new List<int>(){194, 41, 74, 15} };
var res = m_indexes.Select(x => x.Sum()).Sum();
Straightforward LINQ way:
var columnSums = m_indexes.OfType<int>().Select((x,i) => new { x, col = i % m_indexes.GetLength(1) } )
.GroupBy(x => x.col)
.Select(x => new { Column = x.Key, Sum = x.Sum(g => g.x) });
It might not be worth it to parallelize. If you need to access the array by index, you spend some cycles on bounds checking, so, as always with performance, do measure it.
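As a language-neutral illustration of the locality advice, the column sums can be accumulated while walking the data row by row, so memory is read in storage order. This is a sketch in Python rather than C#, using the example data from the question; the function name column_sums is made up.

```python
m_indexes = [
    [367, 40, 74, 15],
    [535, 226, 74, 15],
    [368, 313, 74, 15],
    [197, 316, 74, 15],
    [27, 226, 74, 15],
    [194, 41, 74, 15],
]

def column_sums(rows):
    """One running total per column, filled in while scanning row by row."""
    sums = [0] * len(rows[0])
    for row in rows:                        # outer loop follows the storage order
        for col, value in enumerate(row):
            sums[col] += value
    return sums
```

For the sample data column_sums(m_indexes) gives [1688, 1162, 444, 90]. The same loop order applies to a C# int[,], where the last index varies fastest in memory.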
Due to the wonders of branch prediction, a binary search can be slower than a linear search through an array of integers. On a typical desktop processor, how big does that array have to get before it would be better to use a binary search? Assume the structure will be used for many lookups.
I've tried a little C++ benchmarking and I'm surprised - linear search seems to prevail up to several dozen items, and I haven't found a case where binary search is better for those sizes. Maybe gcc's STL is not well tuned? But then -- what would you use to implement either kind of search?-) So here's my code, so everybody can see if I've done something silly that would distort timing grossly...:
#include <vector>
#include <algorithm>
#include <iostream>
#include <stdlib.h>
int data[] = {98, 50, 54, 43, 39, 91, 17, 85, 42, 84, 23, 7, 70, 72, 74, 65, 66, 47, 20, 27, 61, 62, 22, 75, 24, 6, 2, 68, 45, 77, 82, 29, 59, 97, 95, 94, 40, 80, 86, 9, 78, 69, 15, 51, 14, 36, 76, 18, 48, 73, 79, 25, 11, 38, 71, 1, 57, 3, 26, 37, 19, 67, 35, 87, 60, 34, 5, 88, 52, 96, 31, 30, 81, 4, 92, 21, 33, 44, 63, 83, 56, 0, 12, 8, 93, 49, 41, 58, 89, 10, 28, 55, 46, 13, 64, 53, 32, 16, 90
};
int tosearch[] = {53, 5, 40, 71, 37, 14, 52, 28, 25, 11, 23, 13, 70, 81, 77, 10, 17, 26, 56, 15, 94, 42, 18, 39, 50, 78, 93, 19, 87, 43, 63, 67, 79, 4, 64, 6, 38, 45, 91, 86, 20, 30, 58, 68, 33, 12, 97, 95, 9, 89, 32, 72, 74, 1, 2, 34, 62, 57, 29, 21, 49, 69, 0, 31, 3, 27, 60, 59, 24, 41, 80, 7, 51, 8, 47, 54, 90, 36, 76, 22, 44, 84, 48, 73, 65, 96, 83, 66, 61, 16, 88, 92, 98, 85, 75, 82, 55, 35, 46
};
bool binsearch(int i, std::vector<int>::const_iterator begin,
std::vector<int>::const_iterator end) {
return std::binary_search(begin, end, i);
}
bool linsearch(int i, std::vector<int>::const_iterator begin,
std::vector<int>::const_iterator end) {
return std::find(begin, end, i) != end;
}
int main(int argc, char *argv[])
{
int n = 6;
if (argc < 2) {
std::cerr << "need at least 1 arg (l or b!)" << std::endl;
return 1;
}
char algo = argv[1][0];
if (algo != 'b' && algo != 'l') {
std::cerr << "algo must be l or b, not '" << algo << "'" << std::endl;
return 1;
}
if (argc > 2) {
n = atoi(argv[2]);
}
std::vector<int> vv;
for (int i=0; i<n; ++i) {
if(data[i]==-1) break;
vv.push_back(data[i]);
}
if (algo=='b') {
std::sort(vv.begin(), vv.end());
}
bool (*search)(int i, std::vector<int>::const_iterator begin,
std::vector<int>::const_iterator end);
if (algo=='b') search = binsearch;
else search = linsearch;
int nf = 0;
int ns = 0;
for(int k=0; k<10000; ++k) {
for (int j=0; tosearch[j] >= 0; ++j) {
++ns;
if (search(tosearch[j], vv.begin(), vv.end()))
++nf;
}
}
std::cout << nf <<'/'<< ns << std::endl;
return 0;
}
and a couple of my timings on a Core Duo:
AmAir:stko aleax$ time ./a.out b 93
1910000/2030000
real 0m0.230s
user 0m0.224s
sys 0m0.005s
AmAir:stko aleax$ time ./a.out l 93
1910000/2030000
real 0m0.169s
user 0m0.164s
sys 0m0.005s
They're pretty repeatable, anyway...
OP says: Alex, I edited your program to just fill the array with 1..n, not run std::sort, and do about 10 million (mod integer division) searches. Binary search starts to pull away from linear search at n=150 on a Pentium 4. Sorry about the chart colors.
binary vs linear search http://spreadsheets.google.com/pub?key=tzWXX9Qmmu3_COpTYkTqsOA&oid=1&output=image
I don't think branch prediction should matter, because a linear search also has branches. And to my knowledge there is no SIMD instruction that can do a linear search for you.
Having said that, a useful model is to assume that each step of the binary search costs C times as much as one step of the linear search. The break-even point is then where
C log2 n = n
So to reason about this without actually benchmarking, you would make a guess for C, solve for n, and round up to the next integer. For example, if you guess C=3, the break-even point is n ≈ 9.94, so binary search would be faster from n=10.
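The break-even point of this model has no closed form, but it is easy to find numerically. A small sketch, where break_even is a made-up helper and C remains a guess rather than a measured value:

```python
import math

def break_even(c):
    """Smallest integer n from which c * log2(n) < n holds for all larger n."""
    # n - c*log2(n) is increasing once n >= c / ln(2), so start the scan there.
    n = max(2, math.ceil(c / math.log(2)))
    while c * math.log2(n) >= n:
        n += 1
    return n
```

break_even(3) returns 10, since 3 * log2(10) ≈ 9.97 is just below 10 while 3 * log2(9) ≈ 9.51 is still above 9. A measured C from a real benchmark would be a better input than a guess.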
Not many - but hard to say exactly without benchmarking it.
Personally I'd tend to prefer the binary search, because in two years time, when someone else has quadrupled the size of your little array, you haven't lost much performance. Unless I knew very specifically that it's a bottleneck right now and I needed it to be as fast as possible, of course.
Having said that, remember that there are hash tables too; you could ask a similar question about them vs. binary search.