Java/Kotlin - convert Set to Map<Long, Set<Long>> - algorithm

I have a Set of Long values:
Set<Long> ids = {1, 2, 3, 4}
What I'd like to achieve is a
Set<Map<Long, Set<Long>>>
and from this set of ids I need a Set with 4 elements, like:
Set: {
    Map -> key: 1, values: 2, 3, 4
    Map -> key: 2, values: 1, 3, 4
    Map -> key: 3, values: 1, 2, 4
    Map -> key: 4, values: 1, 2, 3
}
How can I get this with a stream, or maybe Kotlin's groupBy?
Has anyone ever needed a map like this? (I'm looking for a solution without a for or while loop.)

You can use the map method to transform every element into a Map, then collect the results into a set:
val set = setOf(1, 2, 3, 4)
val map = set.map { v -> mapOf(v to set.filter { it != v }.toSet()) }
    .toSet()
However, I don't believe this is much better than a simple forEach loop in terms of performance or readability.
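Since the question also asks about Java streams, here is a minimal sketch of the same idea with the Stream API (assuming the input really is a Set<Long>):

import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class IdsToMaps {
    public static void main(String[] args) {
        Set<Long> ids = Set.of(1L, 2L, 3L, 4L);
        // For each id, build a one-entry map from the id to all other ids.
        Set<Map<Long, Set<Long>>> result = ids.stream()
                .map(id -> Map.of(id, ids.stream()
                        .filter(other -> !other.equals(id))
                        .collect(Collectors.toSet())))
                .collect(Collectors.toSet());
        System.out.println(result);
    }
}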

Opinions on Kotlin's groupBy
Notice that groupBy can only split the original set into several disjoint sets, so it's impossible to construct the desired map directly with the groupBy function.
The first solution below takes advantage of groupBy to build the result, but result2 is much clearer to read and closer to intuition:
fun main() {
    val set = setOf(1, 2, 3, 4)
    val result = set
        .groupBy { it }
        .mapValues { (_, values) -> set.filter { it !in values } }
    println(result) // {1=[2, 3, 4], 2=[1, 3, 4], 3=[1, 2, 4], 4=[1, 2, 3]}

    val result2 = HashMap<Int, List<Int>>().apply {
        set.forEach { this[it] = (set - it).toList() }
    }
    println(result2) // {1=[2, 3, 4], 2=[1, 3, 4], 3=[1, 2, 4], 4=[1, 2, 3]}
}

Here is a possible solution with a for loop:
val ids: Set<Long> = setOf(1, 2, 3, 4)
val result: MutableSet<Map<Long, Set<Long>>> = mutableSetOf()
for (id in ids) {
    result.add(mapOf(id to ids.filter { it != id }.toSet()))
}
println(result)

Related

Why does my parallel merge algorithm produce the correct values in all positions of the output except the first?

I am writing a parallel merging algorithm in Rust using scoped-threadpool, but it seems to be producing the correct values in all positions of the output except the first.
I am attempting to adapt the pseudocode from the merge algorithm Wikipedia page:
fn parallel_merge(first: &[i32], second: &[i32], output: &mut [i32]) {
    let mut n = first.len();
    let mut m = second.len();
    let a;
    let b;
    // Make sure that 'first' is the largest of the two to be merged
    if m < n {
        a = first;
        b = second;
    } else {
        a = second;
        b = first;
        let tmp = n;
        n = m;
        m = tmp;
    }
    if m <= 0 {
        return;
    }
    let pivot = n / 2;
    let s = bisect(a[pivot], b);
    let t = pivot + s;
    output[t] = a[pivot];
    let mut pool = Pool::new(2);
    pool.scoped(|scoped| {
        let (left, right) = output.split_at_mut(t);
        scoped.execute(move || {
            parallel_merge(&a[..pivot], &b[..s], left);
        });
        scoped.execute(move || {
            parallel_merge(&a[pivot..], &b[s..], right);
        });
    });
}
When called with first as the slice [1, 3, 5, 7, 9], second as [2, 4, 6, 8, 10], and a slice of ten zeroes as the initial output, output is left as [0, 2, 3, 4, 5, 6, 7, 8, 9, 10].
What is going wrong? As far as I can see, it matches the pseudocode aside from the unnecessary tracking of indexes.
You've misread the algorithm. m is the length of A:
algorithm merge(A[i...j], B[k...ℓ], C[p...q]) is
    let m = j - i,
        n = ℓ - k
You have it as the length of B:
let mut m = second.len();
The complete example:
use scoped_threadpool::Pool; // 0.1.9

fn parallel_merge(a: &[i32], b: &[i32], output: &mut [i32]) {
    // Make sure that `a` is the longer of the two slices to be merged
    let (a, b) = if a.len() >= b.len() { (a, b) } else { (b, a) };
    if a.is_empty() {
        return;
    }

    // Pick the middle of `a` as the pivot and find where it would land in `b`
    let pivot = a.len() / 2;
    let s = match b.binary_search(&a[pivot]) {
        Ok(x) => x,
        Err(x) => x,
    };
    let t = pivot + s;

    let (a_left, a_tail) = a.split_at(pivot);
    let (a_mid, a_right) = a_tail.split_first().unwrap();
    let (b_left, b_right) = b.split_at(s);

    let (o_left, o_tail) = output.split_at_mut(t);
    let (o_mid, o_right) = o_tail.split_first_mut().unwrap();

    // Place the pivot, then merge the two halves in parallel
    *o_mid = *a_mid;

    let mut pool = Pool::new(2);
    pool.scoped(|scoped| {
        scoped.execute(move || parallel_merge(a_left, b_left, o_left));
        scoped.execute(move || parallel_merge(a_right, b_right, o_right));
    });
}

#[test]
fn exercise() {
    let first = [1, 3, 5, 7, 9];
    let second = [2, 4, 6, 8, 10];
    let mut output = [0; 10];
    parallel_merge(&first, &second, &mut output);
    assert_eq!(output, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]);
}

How to split a stream into multiple stream based on certain criteria?

I have a list of integers, and I want to return sub-lists of integers from this list using Java 8 streams, where each sub-list contains the non-negative integers that appear before the next negative integer.
For example, let's say my list has the elements
[1, 2, 0, -1, 5, 8, 9, -11, 7, 13]
Then I should return sub-lists containing
[1, 2, 0], [5, 8, 9], [7, 13]
I tried the following approach, but it's not working; your direction and input are much appreciated.
package Java8;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.stream.Collectors;

public class StreamsExamples {
    public static void main(String[] args) {
        ArrayList<Integer> arr = new ArrayList<Integer>();
        arr.add(1);
        arr.add(2);
        Integer[] al = new Integer[]{0, -1, 5, 8, 9, -11, 7, 13};
        arr.addAll(Arrays.asList(al));
        arr.stream().collect(Collectors.groupingBy(Functionality::getPositiveList));
        // arr.stream().collect(Collectors.toList()).;
        // arr.stream().sorted().forEach(System.out::print);
    }
}

class Functionality {
    public List<List<Integer>> getPositiveList(List<Integer> list) {
        List<List<Integer>> li = new ArrayList<List<Integer>>();
        List<Integer> a = new ArrayList<Integer>();
        for (int i = 0; i < list.size(); i++) {
            if (list.get(i) >= 0) {
                a.add(list.get(i));
            } else if (list.get(i) < 0) {
                li.add(a);
                a.clear();
            }
        }
        return li;
    }
}
This isn't so hard if you think about it a bit differently: find the indexes where the negative values are and just take subList views between them. There is one quirk to do with IntStream.of(-1) (I'll let you figure out why it's there: try replacing it with the more intuitive IntStream.of(0) and see what happens). So, having an input like:
List<Integer> arr = List.of(1, 2, 0, -1, 5, 8, 9, -11, 7, 13);
You first find out the indexes:
int[] indexes = IntStream.concat(
        IntStream.of(-1),
        IntStream.concat(
                IntStream.range(0, arr.size())
                         .filter(x -> arr.get(x) < 0),
                IntStream.of(arr.size())))
        .toArray();
System.out.println(Arrays.toString(indexes));
This will give a result like: [-1, 3, 7, 10].
Then just compute the subList between those indexes:
IntStream.range(0, indexes.length - 1)
        .mapToObj(x -> arr.subList(indexes[x] + 1, indexes[x + 1]))
        .collect(Collectors.toList())
        .forEach(System.out::println);
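Put together as a self-contained sketch (the class name is mine; the sentinels -1 and arr.size() bracket each run of non-negative values):

import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class SplitOnNegatives {
    public static void main(String[] args) {
        List<Integer> arr = List.of(1, 2, 0, -1, 5, 8, 9, -11, 7, 13);
        // Positions of the negative values, bracketed by the two sentinels.
        int[] indexes = IntStream.concat(
                IntStream.of(-1),
                IntStream.concat(
                        IntStream.range(0, arr.size()).filter(x -> arr.get(x) < 0),
                        IntStream.of(arr.size())))
                .toArray();
        // Each run is the slice strictly between two consecutive indexes.
        List<List<Integer>> runs = IntStream.range(0, indexes.length - 1)
                .mapToObj(x -> arr.subList(indexes[x] + 1, indexes[x + 1]))
                .collect(Collectors.toList());
        System.out.println(runs); // [[1, 2, 0], [5, 8, 9], [7, 13]]
    }
}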
It's a bit ugly, but this works (note that this reduce is only safe for sequential streams, since the identity and combiner mutate their lists):
int[] arr = {1, 2, 0, -1, 5, 8, 9, -11, 7, 13};
List<List<Integer>> lists = Arrays.stream(arr).boxed().reduce(
        new ArrayList<>(),
        (l, i) -> {
            // start a new group at the beginning and after each negative value
            if (l.isEmpty() || i < 0) {
                l.add(new ArrayList<>());
            }
            // non-negative values join the current (last) group
            if (i >= 0) {
                l.get(l.size() - 1).add(i);
            }
            return l;
        },
        (a, b) -> {
            a.addAll(b);
            return a;
        });
System.out.println(lists);
Output:
[[1, 2, 0], [5, 8, 9], [7, 13]]

Find optimal array intersections

I have an array of arrays and a matching array. Each array has unique id values.
MatchingArray = [1, 2, 3, 4, 5, 6]
A1 = [1, 4, 6]
A2 = [2, 3, 5]
A3 = [1, 5]
A4 = [4]
A5 = [1, 6]
I need to find "optimal matchings". An optimal matching is a selection of the arrays A1-A5 of minimal length whose union has the maximum possible intersection with MatchingArray.
For this example there are two matchings with the maximum intersection: M1 = [[2, 3, 5], [1, 4, 6]] and M2 = [[1, 5], [4], [1, 6]]. Since M1.length < M2.length, the algorithm should output M1.
You could use sets (or hashes, whatever your language calls them) to improve the time efficiency.
Convert the target array to a set, then subtract the selected source from it (i.e. remove the common values). Keep doing this recursively until the target set is empty, tracking the best result (the one using the fewest source arrays). Backtrack as soon as the number of source arrays in use reaches the length of the best solution found so far.
Here is the code in Python:
def find_optimal_coverage(target, sources):
    max_size = len(target)
    best = None

    def recurse(target, sources, selected):
        nonlocal max_size, best
        if len(target) == 0:
            best = selected
            max_size = len(best) - 1
            return True
        if len(selected) == max_size:
            return None
        for i, source in enumerate(sources):
            result = recurse(target - set(source), sources[i+1:],
                             selected + [list(source)])
            if result:
                return True

    target = set(target)  # convert to set for faster lookup
    # limit the source lists to elements that occur in the target
    sources = list(map(target.intersection, sources))
    # limit target to elements that occur in at least one source
    target = set.union(*sources)
    # sort sources by decreasing length to maximise probability of
    # finding optimal solution sooner
    sources.sort(key=len, reverse=True)
    if recurse(target, sources, []):
        return best

result = find_optimal_coverage(
    [1, 2, 3, 4, 5, 6, 8],
    [
        [1, 4, 6, 7],
        [2, 3, 5],
        [1, 5],
        [4],
        [1, 6]
    ]
)
print(result)
See it run on repl.it
In JavaScript:
function subtractArray(s, arr) {
    return arr.reduce( (s, v) => (s.delete(v), s), new Set(s) );
}

function findOptimalCoverage(target, sources) {
    var maxSize = target.size;
    var best = null;
    function recurse(target, sources, selected) {
        if (target.size == 0) {
            best = selected;
            maxSize = best.length - 1;
            return true;
        }
        if (selected.length == maxSize) return;
        return sources.some( (source, i) =>
            recurse(subtractArray(target, source), sources.slice(i+1),
                    selected.concat([source]))
        );
    }

    target = new Set(target); // convert to set for faster lookup
    // limit the source arrays to elements that occur in the target
    sources = sources.map( source => source.filter(target.has.bind(target)) );
    // limit target to elements that occur in at least one source
    target = new Set([].concat(...sources));
    // sort sources by decreasing length to maximise probability of
    // finding optimal solution sooner
    sources.sort( (a, b) => b.length - a.length );
    if (recurse(target, sources, [])) return best;
}

var result = findOptimalCoverage(
    [1, 2, 3, 4, 5, 6, 8],
    [
        [1, 4, 6, 7],
        [2, 3, 5],
        [1, 5],
        [4],
        [1, 6]
    ]
);
console.log(result);
Implemented algorithm in JavaScript:
var matchingArray = [1, 2, 3, 4, 5, 6];
var A1 = [1, 4, 6],
    A2 = [2, 3, 5],
    A3 = [1, 5],
    A4 = [4],
    A5 = [1, 6];
var M = [A1, A2, A3, A4, A5];

function compareArrays(M, matchingArray) {
    var intersections = [];
    M.forEach(function(A) {
        var partOfIntersections;
        if (A.length > 0) {
            var intersectionsCount = getIntersectionCount(A, matchingArray);
            partOfIntersections = intersectionsCount / A.length;
        } else {
            partOfIntersections = 0;
        }
        intersections.push({
            length: A.length,
            partOfIntersections: partOfIntersections
        });
    });
    //alert(JSON.stringify(intersections));
    var maxLength = 0,
        maxPartOfIntersections = 0,
        optimalArrays = [];
    intersections.forEach(function(arrayData, index) {
        var currentArr = M[index];
        var currentArrLength = currentArr.length;
        if (maxPartOfIntersections < arrayData.partOfIntersections) {
            setCurrentOptimalArr(arrayData.partOfIntersections, currentArr);
        } else if (maxPartOfIntersections === arrayData.partOfIntersections) {
            if (maxLength < currentArrLength) {
                setCurrentOptimalArr(arrayData.partOfIntersections, currentArr);
            } else if (maxLength === currentArrLength) {
                optimalArrays.push(currentArr);
            }
        }
    });
    //alert(JSON.stringify(optimalArrays));
    return optimalArrays;

    function setCurrentOptimalArr(intersectionsCount, currentArr) {
        optimalArrays = [currentArr];
        maxLength = currentArr.length;
        maxPartOfIntersections = intersectionsCount;
    }

    function getIntersectionCount(A, matchingArray) {
        var intersectionCount = 0;
        A.forEach(function(elem) {
            if (matchingArray.indexOf(elem) != -1) {
                intersectionCount++;
            }
        });
        return intersectionCount;
    }
}

alert(JSON.stringify(compareArrays(M, matchingArray)));
The idea: count the intersection of each array with the matching array separately, then return the arrays with the largest share of intersections.

LINQ - OrderBy explicitly

Is it possible to use LINQ's OrderBy like so:
.OrderBy(x=>(x.SourceID == 3), (x.SourceID == 2), (x=>x.SourceID == 4), (x.SourceID == 1)).ToList();
So it'll order them by 3, 2, 4, 1?
No, that is not a valid lambda expression. What you could do instead is something like:
var sortOrder = new List<int> {3, 2, 4, 1};
var result = bop.OrderBy(x=> sortOrder.IndexOf(x.SourceID)).ToList();
If you want to extend this to do special things with unknowns (they end up first now), you can write a method that determines the sort order and use that instead; see the sketch below.
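For comparison, a minimal sketch of the same explicit-order trick in Java, including one way to send unknown ids to the end instead of the front (the Source record and sample data are illustrative):

import java.util.Comparator;
import java.util.List;

public class ExplicitOrder {
    record Source(int sourceId) {}

    public static void main(String[] args) {
        List<Integer> sortOrder = List.of(3, 2, 4, 1);
        List<Source> sources = List.of(
                new Source(1), new Source(2), new Source(3),
                new Source(4), new Source(5)); // 5 is not in sortOrder
        List<Source> sorted = sources.stream()
                .sorted(Comparator.comparingInt((Source s) -> {
                    int i = sortOrder.indexOf(s.sourceId());
                    return i == -1 ? Integer.MAX_VALUE : i; // unknowns go last
                }))
                .toList();
        sorted.forEach(s -> System.out.println(s.sourceId())); // 3, 2, 4, 1, 5
    }
}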
Taking Joachim Isaksson's answer above, this could be wrapped in an extension method:
public static class ListExtensions
{
    public static List<Source> SortByCustomOrder(this List<Source> list, List<int> sortOrder)
    {
        return list.OrderBy(x => sortOrder.IndexOf(x.SourceId)).ToList();
    }
}
replacing Source with your class and x.SourceId with your property.
Usage:
// the sort order
var sortOrder = new List<int> { 3, 2, 4, 1, 6, 5 };
var results = sources.SortByCustomOrder(sortOrder);

How to sort a MongoDB query for a ratio of two fields in every document?

Suppose I have a lot of documents like {'a' : x, 'b' : y}, where x and y are integers.
How can I do something like find().sort({'a'/'b'})?
Since this question was asked in 2011, MongoDB has released the aggregation framework. It lets you sort by the combinations of fields, and you don't need to store (denormalize) any extra fields. Here's how it's done:
db.collection.aggregate([
    {
        $match: {
            // Optional criteria to select only some documents to process, such as...
            deleted: null
        }
    },
    {
        $project: {
            // Need to prefix fields with '$'
            ratio: { $divide: [ "$a", "$b" ] },
        }
    },
    {
        $sort: { ratio: -1 },
    }
]);
That's it. We use the $divide operator of the aggregation framework.
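If you are issuing the query from Java, here is a minimal sketch of the same pipeline using the official MongoDB Java driver (assuming mongodb-driver-sync; the connection string, database, and collection names are placeholders):

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Aggregates;
import com.mongodb.client.model.Projections;
import com.mongodb.client.model.Sorts;
import org.bson.Document;
import java.util.Arrays;

public class RatioSort {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> coll =
                    client.getDatabase("test").getCollection("collection");
            // $project a computed ratio field, then $sort on it.
            coll.aggregate(Arrays.asList(
                    Aggregates.project(Projections.computed(
                            "ratio", new Document("$divide", Arrays.asList("$a", "$b")))),
                    Aggregates.sort(Sorts.descending("ratio"))
            )).forEach(doc -> System.out.println(doc.toJson()));
        }
    }
}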
You can add a third field, the result of a/b, and sort by it.
Your document will look like:
{'a' : x, 'b' : y, 'c' : z} // z = x/y
And you will sort by 'c':
find().sort({c : 1})
I don't believe this is possible with a plain query, as you also can't run queries that compare two fields (without using $where to specify a JavaScript function, which would be slow). Instead, I suspect you need to store the ratio separately within the document and then sort on that new field.
db.collection.aggregate([
    { $addFields: { newField: { $divide: [ "$a", "$b" ] } } }, // Prefix fields with '$'
    { $sort: { newField: -1 } }
]);
Just like Bugai13 said, you need a third property in your collection in order to perform the sort. You can add the ratio property with a call to mapReduce (as follows), but this won't be terribly fast on large collections, and it will lock up your database while it runs. You really should keep the ratio property up to date manually; it shouldn't be very hard.
db.data.insert({a: 1, b: 1});
db.data.insert({a: 2, b: 2});
db.data.insert({a: 3, b: 3});
db.data.insert({a: 1, b: 4});
db.data.insert({a: 2, b: 1});
db.data.insert({a: 3, b: 2});
db.data.insert({a: 1, b: 3});
db.data.insert({a: 2, b: 4});
db.data.insert({a: 3, b: 1});
db.data.insert({a: 1, b: 2});
db.data.insert({a: 2, b: 3});
db.data.insert({a: 3, b: 4});
db.data.mapReduce(
    function() {
        emit(this._id, this);
    },
    function(k, vs) {
        v = vs[0];
        v.c = v.a / v.b;
        return v;
    },
    { out: 'data' }
);
db.data.find().sort({c:1});
