C# Linq Query to avoid reversed entries - linq

The LINQ query below returns each matching pair twice in nd, once in each order. I'm looking for a way to remove the reversed entry.
//unit = new List<string>{"F1","F2","F3","F4","F5","F6","F7","F8","F9"}
//v["F3"]="12" v["F6"]="12"
var nd = (from n1 in unit
from n2 in unit
where n1 != n2 && v[n1].Length == 2 && v[n1] == v[n2]
select new {n1,n2}).ToList();
The values in nd are shown below. How can I avoid the second entry?
Count = 2
[0]: { n1 = "F3", n2 = "F6" }
[1]: { n1 = "F6", n2 = "F3" }

The solution is simple. Instead of checking that the two entries are different, check that one is strictly greater than the other. C# does not define the > operator for strings, so compare them with string.CompareOrdinal:
var nd = (from n1 in unit
          from n2 in unit
          where string.CompareOrdinal(n1, n2) > 0 && v[n1].Length == 2 && v[n1] == v[n2]
          select new {n1,n2}).ToList();

You could select (n1, n2, min(n1, n2), max(n1, n2)), group by the min and max and select the first in each group.
var result = nd.GroupBy(item => new {item.min, item.max}).Select(grp => new {grp.First().n1, grp.First().n2});
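As a rough Python sketch of the same grouping idea (the helper name drop_reversed is made up for illustration), each pair can be keyed by its (min, max) form and only the first pair seen for each key kept:

```python
def drop_reversed(pairs):
    """Keep the first pair for each unordered {n1, n2} combination."""
    seen = set()
    kept = []
    for n1, n2 in pairs:
        key = (min(n1, n2), max(n1, n2))  # same key for (a, b) and (b, a)
        if key not in seen:
            seen.add(key)
            kept.append((n1, n2))
    return kept


print(drop_reversed([("F3", "F6"), ("F6", "F3")]))  # [('F3', 'F6')]
```

This does one pass over the pairs instead of building groups, which is a small deviation from the GroupBy formulation above.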

You can check whether one is less than (or greater than) the other instead of just checking for inequality. The trick is well known in SQL.
In LINQ on .NET, you can use the String.CompareOrdinal() method, because C# strings cannot be compared with the < or > operators:
var nd = (from n1 in unit
from n2 in unit
where String.CompareOrdinal(n1,n2) < 0 && v[n1].Length == 2 && v[n1] == v[n2]
select new { n1, n2 }).ToList();
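For comparison, here is a minimal Python sketch of the ordered-comparison filter. The question only gives v["F3"] and v["F6"], so the placeholder value used for the other keys is an assumption, and the function name is illustrative. Python compares strings with < directly, so no helper like CompareOrdinal is needed there.

```python
unit = ["F1", "F2", "F3", "F4", "F5", "F6", "F7", "F8", "F9"]

# Hypothetical data: only v["F3"] and v["F6"] come from the question.
v = {name: "123456789" for name in unit}
v["F3"] = "12"
v["F6"] = "12"


def matching_pairs(unit, v):
    """Return each matching pair once, with n1 ordered before n2."""
    return [
        (n1, n2)
        for n1 in unit
        for n2 in unit
        if n1 < n2 and len(v[n1]) == 2 and v[n1] == v[n2]
    ]


print(matching_pairs(unit, v))  # [('F3', 'F6')]
```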

Related

Efficient algorithm for converting a "pop list" into an "index list"

Suppose I have a list of items:
[ A, B, C, D ]
and a "pop list":
[ 2, 0, 1, 0 ]
Let f(x,p) = y be a function that pops the indices p from x into a new list, y.
Using this process, you can compute
f([ A, B, C, D ], [ 2, 0, 1, 0 ]) = [ C, A, D, B ]
However, the cost of f is impractical, because it pops from a list and joins the remaining elements repeatedly.
It would be desirable to have an algorithm, g, to convert the pop list into a list of indices, such that
g(p) = [ 2, 0, 3, 1]
This would allow the new list to be constructed efficiently.
Is there an efficient algorithm, perhaps O(N), which could be used to implement g?
The easy way is to apply f to a list of indexes:
g(p) = f([0, 1, ... length(p)-1], p)
(assuming length(p) is the range of indexes. Otherwise use the appropriate length, or dynamically grow it if necessary)
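A minimal Python sketch of this approach, with f and g named as in the question and assuming length(p) is the number of items:

```python
def f(x, p):
    """Pop the indices in p from a copy of x, collecting the popped items."""
    remaining = list(x)
    return [remaining.pop(i) for i in p]


def g(p):
    """Convert a pop list into an index list by popping from 0..len(p)-1."""
    return f(range(len(p)), p)


print(f(["A", "B", "C", "D"], [2, 0, 1, 0]))  # ['C', 'A', 'D', 'B']
print(g([2, 0, 1, 0]))  # [2, 0, 3, 1]
```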
This is O(n^2). You can make it O(n log n) by storing x in an order statistic tree instead of a list:
https://en.wikipedia.org/wiki/Order_statistic_tree
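Python has no built-in order statistic tree, so the sketch below swaps in a Fenwick (binary indexed) tree over 0/1 "still present" flags, which supports the same "find the k-th remaining index, then remove it" operation in O(log n). The function name is made up for illustration, and it assumes every entry of the pop list is a valid index at the time it is applied.

```python
def pop_list_to_indices(pops):
    """Convert a pop list to original indices in O(n log n)."""
    n = len(pops)
    # Fenwick tree (1-based) where every position starts with count 1.
    tree = [0] * (n + 1)
    for i in range(1, n + 1):
        tree[i] += 1
        parent = i + (i & -i)
        if parent <= n:
            tree[parent] += tree[i]

    # Largest power of two not exceeding n, used for the binary descent.
    top = 1
    while top * 2 <= n:
        top *= 2

    result = []
    for k in pops:
        # Descend to the largest position whose prefix count is <= k;
        # the next position holds the k-th remaining element (0-based).
        pos, remaining, step = 0, k + 1, top
        while step:
            nxt = pos + step
            if nxt <= n and tree[nxt] < remaining:
                pos = nxt
                remaining -= tree[nxt]
            step //= 2
        result.append(pos)  # pos is the 0-based original index

        # Mark that index as removed.
        i = pos + 1
        while i <= n:
            tree[i] -= 1
            i += i & -i
    return result


print(pop_list_to_indices([2, 0, 1, 0]))  # [2, 0, 3, 1]
```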
I wrote a small snippet in JS to explain my idea. The algorithm is not O(N); the nested loops make it O(n^2). It doesn't need to execute f, though. I am not 100% certain it works properly, but if not, it might serve as an idea to build upon.
You reverse your pop list and iterate it:
Add current element to new array
Check whether previously added elements are bigger or equal to current element.
If true, increment the respective values.
Return newly filled array (reversed).
Rough sequence:
0
0 1
0 1 0
1 2 0 (increment old values with >= 0)
1 2 0 2
1 3 0 2 (increment old values with >= 2)
const data = [2, 0, 1, 0];
function popToIdx(popList) {
  let arr = [],
      tmp = [...popList].reverse(); // 0 1 0 2
  for (let i = 0; i < tmp.length; i++) {
    arr[i] = tmp[i];
    for (let j = 0; j < i; j++) {
      if (arr[j] >= tmp[i]) {
        arr[j] += 1;
      }
    }
  }
  return arr.reverse();
}
console.log(popToIdx(data)) // 2 0 3 1
It works the same with lists since adding a new element to the end is still only O(1) and the partial iteration for lists and arrays is the same, too.

How to write an iterative algorithm to generate all subsets of a set?

I wrote recursive backtracking algorithm for finding all subsets of a given set.
void backtracke(int* a, int k, int n)
{
    if (k == n)
    {
        for(int i = 1; i <=k; ++i)
        {
            if (a[i] == true)
            {
                std::cout << i << " ";
            }
        }
        std::cout << std::endl;
        return;
    }
    bool c[2];
    c[0] = false;
    c[1] = true;
    ++k;
    for(int i = 0; i < 2; ++i)
    {
        a[k] = c[i];
        backtracke(a, k, n);
        a[k] = INT_MAX;
    }
}
Now we have to write the same algorithm in iterative form. How can we do it?
You can use the binary counter approach. Any unique binary string of length n represents a unique subset of a set of n elements. If you start with 0 and end with 2^n-1, you cover all possible subsets. The counter can be easily implemented in an iterative manner.
The code in Java:
public static void printAllSubsets(int[] arr) {
    byte[] counter = new byte[arr.length];
    while (true) {
        // Print combination
        for (int i = 0; i < counter.length; i++) {
            if (counter[i] != 0)
                System.out.print(arr[i] + " ");
        }
        System.out.println();
        // Increment counter
        int i = 0;
        while (i < counter.length && counter[i] == 1)
            counter[i++] = 0;
        if (i == counter.length)
            break;
        counter[i] = 1;
    }
}
Note that in Java one can use BitSet, which makes the code much shorter, but I used a byte array to illustrate the process better.
There are a few ways to write an iterative algorithm for this problem. The most commonly suggested would be to:
Count (i.e. a simple for-loop) from 0 to 2^numberOfElements - 1
If we look at the variable used above for counting in binary, the digit at each position can be thought of as a flag indicating whether or not the element at the corresponding index in the set should be included in this subset. Simply loop over each bit (by taking the remainder by 2, then dividing by 2), including the corresponding elements in our output.
Example:
Input: {1,2,3,4,5}.
We'd start counting at 0, which is 00000 in binary, which means no flags are set, so no elements are included (this would obviously be skipped if you don't want the empty subset) - output {}.
Then 1 = 00001, indicating that only the last element would be included - output {5}.
Then 2 = 00010, indicating that only the second last element would be included - output {4}.
Then 3 = 00011, indicating that the last two elements would be included - output {4,5}.
And so on, all the way up to 31 = 11111, indicating that all the elements would be included - output {1,2,3,4,5}.
* Actually code-wise, it would be simpler to turn this on its head - output {1} for 00001, considering that the first remainder by 2 will then correspond to the flag of the 0th element, the second remainder, the 1st element, etc., but the above is simpler for illustrative purposes.
More generally, any recursive algorithm could be changed to an iterative one as follows:
Create a loop consisting of parts (think switch-statement), with each part consisting of the code between any two recursive calls in your function
Create a stack where each element contains each necessary local variable in the function, and an indication of which part we're busy with
The loop would pop elements from the stack, executing the appropriate section of code
Each recursive call would be replaced by first adding its own state to the stack, and then the called state
Replace return with appropriate break statements
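A minimal Python sketch of this transformation applied to subset generation. It is a simplified instance: the recursive function has no work left after its recursive calls, so the state pushed on the stack is just the index of the next element to decide and the elements chosen so far, and no "which part" marker is needed. The function name is illustrative.

```python
def subsets_with_stack(items):
    """Generate all subsets using an explicit stack instead of recursion."""
    result = []
    stack = [(0, [])]  # (index of next element to decide, chosen so far)
    while stack:
        k, chosen = stack.pop()
        if k == len(items):
            # Corresponds to the base case of the recursive version.
            result.append(chosen)
            continue
        # Push the "include" branch first so that the "exclude" branch
        # is popped (and therefore explored) first.
        stack.append((k + 1, chosen + [items[k]]))
        stack.append((k + 1, chosen))
    return result


print(subsets_with_stack([1, 2]))  # [[], [2], [1], [1, 2]]
```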
A little Python implementation of George's algorithm. Perhaps it will help someone.
def subsets(S):
    l = len(S)
    for x in range(2**l):
        # bit i of x decides whether the i-th element is included
        yield {s for i, s in enumerate(S) if (x >> i) & 1}
Basically what you want is P(S) = S_0 U S_1 U ... U S_n where S_i is the set of all sets obtained by taking i elements from S. In other words, if S = {a, b, c} then S_0 = {{}}, S_1 = {{a},{b},{c}}, S_2 = {{a, b}, {a, c}, {b, c}} and S_3 = {{a, b, c}}.
The algorithm we have so far is
set P(set S) {
    PS = {}
    for i in [0..|S|]
        PS = PS U Combination(S, i)
    return PS
}
We know that |S_i| = nCi where |S| = n. So basically we know that we will be looping nCi times. You may use this information to optimize the algorithm later on. To generate combinations of size i the algorithm that I present is as follows:
Suppose S = {a, b, c}; then you can map 0 to a, 1 to b and 2 to c. The sequences of these digits are (if i=2) 0-0, 0-1, 0-2, 1-0, 1-1, 1-2, 2-0, 2-1, 2-2. To check whether a sequence is a combination, check that its digits are all unique and that no permutation of its digits appears elsewhere among the kept sequences. This filters the sequences above to just 0-1, 0-2 and 1-2, which are then mapped back to {a,b},{a,c},{b,c}. To generate the long sequence above you can follow this algorithm:
set Combination(set S, integer l) {
    CS = {}
    for x in [0..|S|^l - 1] { // every length-l digit sequence in base |S|
        n = {}
        for i in [0..l-1] {
            n = n U {floor(x / |S|^i) mod |S|} // get the i-th digit in x base |S|
        }
        CS = CS U {S[n]}
    }
    return filter(CS) // filtering described above
}
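In Python the standard library already provides combinations, so the size-by-size construction of P(S) can be sketched without the digit-sequence filter. Note that this swaps in itertools.combinations for the Combination routine above rather than implementing it; the function name is illustrative.

```python
from itertools import combinations


def power_set_by_size(items):
    """Build P(S) as S_0 U S_1 U ... U S_n, one subset size at a time."""
    items = list(items)
    return [
        set(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    ]


for subset in power_set_by_size("abc"):
    print(subset)
```

The subsets come out grouped by size: the empty set first, then the single-element sets, and so on.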

algorithm for checking close by numbers for similarities in a list

I have a list and I need to find all numbers that are in close proximity to each other and extract them to a new list.
For example, I have a list:
1,5,10,8,11,14,15,11,14,1,4,7,5,9
Say I want to extract all neighbouring numbers that are exactly 3 apart (the gap must be exactly 3, so 11,14 qualifies but 11,13 does not). How can I design this without hard-coding the whole thing?
the result should look like:
8,11,14,11,14,1,4,7
This doesn't look too hard, but I'm kind of stuck. All I can come up with is a loop that checks whether member n+1 is greater than member n by 3 and, if so, adds member n+1 to a new list. However, I don't know how to include member n without making it appear in the new list twice when there is a run of qualifying numbers.
Any ideas?
Just loop through the list, checking the next and previous element, and save the current one if it differs by 3 from either one. In Python, that's
>>> l = [1,5,10,8,11,14,15,11,14,1,4,7,5,9]
>>> # pad with infinities to ease the final loop
>>> l = [float('-inf')] + l + [float('inf')]
>>> [x for i, x in enumerate(l[1:-1], 1)
... if 3 in (abs(x - l[i-1]), abs(x - l[i+1]))]
[8, 11, 14, 11, 14, 1, 4, 7]
In Matlab
list = [1,5,10,8,11,14,15,11,14,1,4,7,5,9]
then
list(or([diff([0 diff(list)==3]) 0],[0 diff(list)==3]))
returns
8 11 14 11 14 1 4 7
For those who don't know Matlab: diff(list) returns the first (forward) differences of the elements in list. The expression [0 diff(list)] pads the first differences with a leading 0 to make the result the same length as the original list. The rest should be obvious.
In a nutshell: take forward differences and backward differences, select the elements where either difference is 3.
A simple C++ code below:
assuming ar is the array of the initial integers and mark is a boolean array
for(int i=1;i<N;i++){
    if(ar[i]-ar[i-1]==3){
        mark[i]=1;
        mark[i-1]=1;
    }
}
Now to print the interesting numbers,
for(int i=0;i<N;i++){
    if(mark[i]==1)cout<<ar[i]<<" ";
}
The idea behind the implementation is, we mark a number as interesting if the difference from it to its previous one is 3 or if the difference between it and its next number is 3.
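The same marking idea can be sketched in Python as follows. The function and parameter names are illustrative, and like the C++ loop it only counts a pair when the later number is larger by exactly the gap.

```python
def close_by(values, gap):
    """Return the values that sit next to a neighbour exactly `gap` apart."""
    mark = [False] * len(values)
    for i in range(1, len(values)):
        if values[i] - values[i - 1] == gap:
            # Both members of the pair are interesting.
            mark[i - 1] = True
            mark[i] = True
    return [value for value, keep in zip(values, mark) if keep]


numbers = [1, 5, 10, 8, 11, 14, 15, 11, 14, 1, 4, 7, 5, 9]
print(close_by(numbers, 3))  # [8, 11, 14, 11, 14, 1, 4, 7]
```

Because each position is marked at most once in the boolean list, a run such as 8, 11, 14 does not produce duplicates.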
that's a single loop:
public List<int> CloseByN(int n, List<int> oldL)
{
    bool first = true;
    int last = 0;
    bool isLstAdded = false;
    List<int> newL = new List<int>();
    foreach(int curr in oldL)
    {
        if(first)
        {
            first = false;
            last = curr;
            continue;
        }
        if(curr - last == n)
        {
            if(isLstAdded == false)
            {
                newL.Add(last);
                isLstAdded = true;
            }
            newL.Add(curr);
        }
        else
        {
            isLstAdded = false;
        }
        last = curr;
    }
    return newL;
}
tested on your input and got your output
And a Haskell version:
f g xs = dropWhile (null . drop 1) $ foldr comb [[last xs]] (init xs) where
  comb a bbs@(b:bs)
    | abs (a - head b) == g = (a:head bbs) : bs
    | otherwise =
        if null (drop 1 b) then [a] : bs else [a] : bbs
Output:
*Main> f 3 [5,10,8,11,14,15,11,14,1,4,7,5,9]
[[8,11,14],[11,14],[1,4,7]]
*Main> f 5 [5,10,8,11,14,15,11,14,1,4,7,5,9]
[[5,10]]

Easy way of bruteforcing numbers

So I have 10 numbers. Let's say each number represents the skill of an individual. If I were to create 2 teams of 5, how would I make the teams such that the difference between their sums is minimal?
With 10 numbers, the easiest way would be to go over all combinations and calculate the difference.
This is similar to the Knapsack problem: You try to put individuals in one of the teams so that this team's sum is the biggest value not larger than half of the total sum. It would be the same if team size was not restricted.
Here's a crazy idea I came up with.
Time Complexity : O(N log N)
Sort the numbers.
Find the target sum T that we would like the set to hit (sum of all values / 2).
Let Q = the set of the first 5 numbers in the sorted list. Q will be our final set, which we will iteratively improve.
for(each element q from last element to first element of Q)
{
    Find a number p that is not currently used
    which if swapped with the current element q
    makes the sum closer to T but not more than T.
    Remove q from Q
    Add p to Q
}
return Q as best set.
Though the for loop looks as though it's O(N^2), one can use binary search to find the number p, so it's O(N log N).
Disclaimer: I have only described the algorithm. I don't know how to formally prove it.
Generate all combinations of 5 elements. Those 5 form one team and the remaining 5 form the other team. Compare all results and choose the one with the smallest difference. You can create all those combinations with 5 nested for loops.
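A minimal Python sketch of this brute force, using itertools.combinations in place of five nested loops (the function name is illustrative, and it assumes an even number of players):

```python
from itertools import combinations


def best_split(skills):
    """Try every half-sized team and keep the split with the smallest gap."""
    total = sum(skills)
    half = len(skills) // 2
    best = None
    for idx in combinations(range(len(skills)), half):
        team_sum = sum(skills[i] for i in idx)
        diff = abs(total - 2 * team_sum)  # |team A sum - team B sum|
        if best is None or diff < best[0]:
            best = (diff, idx)
    diff, idx = best
    chosen = set(idx)
    team_a = [skills[i] for i in idx]
    team_b = [skills[i] for i in range(len(skills)) if i not in chosen]
    return diff, team_a, team_b


diff, team_a, team_b = best_split([3, 4, 5, 8, 2, 1, 1, 4, 9, 10])
print(diff, team_a, team_b)
```

Working with index combinations rather than value combinations keeps duplicate skill values (such as the two 1s and two 4s here) from being confused with each other.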
I just tried it out - unfortunately I had to program that permutation thing (function next) and call result.fit for every element.
Can be done nicer, but for demonstration it should be good enough.
var all = [ 3, 4, 5, 8 , 2, 1, 1, 4, 9, 10 ];
function sumArray(a) {
  var asum = 0;
  a.forEach(function(v){ asum += v });
  return asum;
}
// nbr is the number of picks still to make after the current one
var next = function(start, rest, nbr, result) {
  if (nbr < 0) {
    result.fit(start);
    return;
  }
  for (var i = 0; i < rest.length - nbr; ++i) {
    var clone = start.slice(0);
    clone.push(rest[i]);
    next(clone, rest.slice(i + 1), nbr - 1, result);
  }
};
var result = {
  target: sumArray(all) / 2,
  best: [],
  bestfit: Math.pow(2,63), // really big
  fit: function(a) {
    var asum = sumArray(a);
    var fit = Math.abs(asum - this.target);
    if (fit < this.bestfit) {
      this.bestfit = fit;
      this.best = a;
    }
  }
}
// start at length / 2 - 1 so that exactly half of the elements are picked
next([], all, all.length / 2 - 1, result);
console.log(JSON.stringify(result.best));
Same algorithm as most -- compare 126 combinations. Code in Haskell:
inv = [1,2,3,4,5,6,7,8,9,10]
best (x:xs) (a,b)
  | length a == 5 = [(abs (sum a - sum (x:xs ++ b)),(a,x:xs ++ b))]
  | length b == 5 = [(abs (sum (x:xs ++ a) - sum b),(x:xs ++ a,b))]
  | otherwise = let s = best xs (x:a,b)
                    s' = best xs (a,x:b)
                in if fst (head s) < fst (head s') then s
                   else if fst (head s') < fst (head s) then s'
                   else s ++ s'
main = print $ best (tail inv) ([head inv],[])
Output:
*Main> main
[(1,([9,10,5,2,1],[8,7,6,4,3])),(1,([10,8,6,2,1],[9,7,5,4,3]))
,(1,([9,10,6,2,1],[8,7,5,4,3])),(1,([9,8,7,2,1],[10,6,5,4,3]))
,(1,([10,8,7,2,1],[9,6,5,4,3])),(1,([9,10,4,3,1],[8,7,6,5,2]))
,(1,([10,8,5,3,1],[9,7,6,4,2])),(1,([9,10,5,3,1],[8,7,6,4,2]))
,(1,([10,7,6,3,1],[9,8,5,4,2])),(1,([9,8,6,3,1],[10,7,5,4,2]))
,(1,([10,8,6,3,1],[9,7,5,4,2])),(1,([9,8,7,3,1],[10,6,5,4,2]))
,(1,([10,7,5,4,1],[9,8,6,3,2])),(1,([9,8,5,4,1],[10,7,6,3,2]))
,(1,([10,8,5,4,1],[9,7,6,3,2])),(1,([9,7,6,4,1],[10,8,5,3,2]))
,(1,([10,7,6,4,1],[9,8,5,3,2])),(1,([9,8,6,4,1],[10,7,5,3,2]))
,(1,([8,7,6,5,1],[9,10,4,3,2])),(1,([9,7,6,5,1],[10,8,4,3,2]))]
This is an instance of the Partition problem, but for your tiny instance testing all combinations should be fast enough.

How can I efficiently determine if two lists contain elements ordered in the same way?

I have two ordered lists of the same element type, each list having at most one element of each value (say ints and unique numbers), but otherwise with no restrictions (one may be a subset of the other, they may be completely disjoint, or share some elements but not others).
How do I efficiently determine if A is ordering any two items in a different way than B is? For example, if A has the items 1, 2, 10 and B the items 2, 10, 1, the property would not hold, as A lists 1 before 10 but B lists it after 10. 1, 2, 10 vs 2, 10, 5 would be perfectly valid, however, as A never mentions 5 at all. I cannot rely on any given sorting rule shared by both lists.
You can get O(n) as follows. First, find the intersection of the two sets using hashing. Second, test whether A and B are identical if you only consider elements from the intersection.
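A minimal Python sketch of these two steps, assuming the elements are hashable and unique within each list as the question states (the function name is illustrative):

```python
def same_relative_order(a, b):
    """True if the elements shared by a and b appear in the same order."""
    common = set(a) & set(b)  # step 1: intersection via hashing
    # Step 2: restrict both lists to the shared elements and compare.
    return [x for x in a if x in common] == [x for x in b if x in common]


print(same_relative_order([1, 2, 10], [2, 10, 1]))  # False
print(same_relative_order([1, 2, 10], [2, 10, 5]))  # True
```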
My approach would be to first make sorted copies of A and B which also record the positions of elements in the original lists:
for i in 1 .. length(A):
    Apos[i] = (A[i], i)
sortedApos = sort(Apos[] by first element of each pair)
for i in 1 .. length(B):
    Bpos[i] = (B[i], i)
sortedBpos = sort(Bpos[] by first element of each pair)
Now find those elements in common using a standard list merge that records the positions in both A and B of the shared elements:
i = 1
j = 1
shared = []
while i <= length(A) && j <= length(B)
    if sortedApos[i][1] < sortedBpos[j][1]
        ++i
    else if sortedApos[i][1] > sortedBpos[j][1]
        ++j
    else // They're equal
        append(shared, (sortedApos[i][2], sortedBpos[j][2]))
        ++i
        ++j
Finally, sort shared by its first element (position in A) and check that all its second elements (positions in B) are increasing. This will be the case iff the elements common to A and B appear in the same order:
sortedShared = sort(shared[] by first element of each pair)
for i = 2 .. length(sortedShared)
    if sortedShared[i][2] < sortedShared[i-1][2]
        return DIFFERENT
return SAME
Time complexity: 2*(O(n) + O(nlog n)) + O(n) + O(nlog n) + O(n) = O(nlog n).
General approach: store all the values and their positions in B as keys and values in a HashMap. Iterate over the values in A and look them up in B's HashMap to get their position in B (or null). If this position is before the largest position value you've seen previously, then you know that something in B is in a different order than A. Runs in O(n) time.
Rough, totally untested code:
boolean valuesInSameOrder(int[] A, int[] B)
{
    Map<Integer, Integer> bMap = new HashMap<Integer, Integer>();
    for (int i = 0; i < B.length; i++)
    {
        bMap.put(B[i], i);
    }
    int maxPosInB = 0;
    for (int i = 0; i < A.length; i++)
    {
        if(bMap.containsKey(A[i]))
        {
            int currPosInB = bMap.get(A[i]);
            if (currPosInB < maxPosInB)
            {
                // B has something in a different order than A
                return false;
            }
            else
            {
                maxPosInB = currPosInB;
            }
        }
    }
    // All of B's values are in the same order as A
    return true;
}

Resources