Nearest neighbor algorithm in GraphX - algorithm

I am trying to find and implement nearest neighbors algorithm. I tried using bfs algorithm but I don't think it is right approach.
Can anyone help me implementing it?
var g2 = g.mapVertices((vid,vd) => vid == origin)
.mapTriplets(et => (et.attr,false))
var nextVertexId = origin
var edgesAreAvailable = true
do {
type tripletType = EdgeTriplet[Boolean,Tuple2[Double,Boolean]]
val availableEdges =
g2.triplets
.filter(et => !et.attr._2
&& (et.srcId == nextVertexId && !et.dstAttr
|| et.dstId == nextVertexId && !et.srcAttr))
edgesAreAvailable = availableEdges.count > 0
if (edgesAreAvailable) {
val smallestEdge = availableEdges
.min()(new Ordering[tripletType]() {
override def compare(a:tripletType, b:tripletType) = {
Ordering[Double].compare(a.attr._1,b.attr._1)
}

Related

Nesting bowls for shipping

Problem statement:
A company sells bowls of various integer sized diameters (inches) and often customers buy a number of these bowls at once.
The company would like to reduce shipping costs by sending the minimum number of packages for an order of bowls to a given customer by finding an optimal nesting of the bowls.
The company has also decided to restrict the nestings with the following limitations:
No more than 3 bowls should be nested in one nesting.
A bowl can be nested inside another if it's smaller but not more than 3 inches smaller than the bowl it's directly nested within.
For example, a customer orders the following bowl sizes:
One 5" bowl
One 8" bowl
Two 11" bowls
One 12" bowl
Two 15" bowls
The follow is a possible (and optimal) nesting:
[15] [15,12,11] [11,8,5]
Is there an algorithm to always provide an optimal nesting?
I've looked through many similar questions here on stackoverflow and googled around, but can't find this exact problem, nor am I able to map any similar problems over to this problem space in a way that solves the problem.
This was actually posted in another forum by a real business owner. A number of the developers tried to help, ultimately finding a heuristic solution that provided an optimal solution most of the time but not always.
I can share the chosen algorithm one of the developers put forward as well as a few approaches I tried myself.
I'm just very curious about this problem and if there is an algorithm that can actually do this, or the best solution will be heuristic. If you can either give an idea of how to approach this, share an algorithm, or send a link to a similar problem that can be mapped to this one, that would be awesome.
This can be solved with dynamic programming in polynomial time.
The idea is that we ONLY care about how many boxes there are total, and how many boxes there are of different top bowl sizes. We don't care about the details beyond that. This is a polynomial amount of state, and so we can track through the calculation and enumerate one arrangement per possible state in a polynomial time. We then reconstruct the minimal packing of bowls into boxes from that arrangement.
class Arrangement:
def __init__(self, next_bowl, prev_arrangement=None):
self.prev_arrangement = prev_arrangement
self.add_rule = None
self.open1 = {}
self.open2 = {}
self.next_bowl = next_bowl
if prev_arrangement is None:
self.boxes = 0
for i in range(next_bowl, next_bowl + 4):
self.open1[i] = 0
self.open2[i] = 0
else:
self.boxes = prev_arrangement.boxes
for i in range(next_bowl, next_bowl + 4):
self.open1[i] = prev_arrangement.open1.get(i, 0)
self.open2[i] = prev_arrangement.open2.get(i, 0)
# This will be tuples of tuples.
def state(self):
open1 = (self.open1[i+self.next_bowl] for i in range(4))
open2 = (self.open2[i+self.next_bowl] for i in range(4))
return (open1, open2)
def next_arrangements(self, bowl):
base_arrangement = Arrangement(bowl, self)
base_arrangement.boxes += 1
base_arrangement.add_rule = ("new",)
old_count = self.open2.get(bowl, 0)
base_arrangement.open2[bowl] = old_count + 1
yield base_arrangement
for i in range(1, 4):
if 0 < self.open1.get(bowl+i, 0):
next_arrangement = Arrangement(bowl, self)
next_arrangement.open1[bowl+i] -= 1
next_arrangement.add_rule = ("open", 1, bowl+i)
yield next_arrangement
if 0 < self.open2.get(bowl+i, 0):
next_arrangement = Arrangement(bowl, self)
next_arrangement.open2[bowl+i] -= 1
next_arrangement.open1[bowl] += 1
next_arrangement.add_rule = ("open", 2, bowl+i)
yield next_arrangement
def find_boxes(self):
items = self._find_boxes()
boxes = items["full"]
for more_boxes in items["open1"].values():
boxes.extend(more_boxes)
for more_boxes in items["open2"].values():
boxes.extend(more_boxes)
return list(reversed(sorted(boxes)))
def _find_boxes(self):
if self.prev_arrangement is None:
return {
"full": [],
"open1": {},
"open2": {},
}
else:
items = self.prev_arrangement._find_boxes()
rule = self.add_rule
if rule[0] == "new":
if self.next_bowl not in items["open2"]:
items["open2"][self.next_bowl] = [[self.next_bowl]]
else:
items["open2"][self.next_bowl].append([self.next_bowl])
elif rule[0] == "open":
if rule[1] == 1:
box = items["open1"][rule[2]].pop()
box.append(self.next_bowl)
items["full"].append(box)
elif rule[1] == 2:
box = items["open2"][rule[2]].pop()
box.append(self.next_bowl)
if self.next_bowl not in items["open1"]:
items["open1"][self.next_bowl] = [box]
else:
items["open1"][self.next_bowl].append(box)
return items
def __str__ (self):
return str(self.boxes) + " open1:" + str(self.open1) + " open2:" + str(self.open2)
def bowl_nesting (bowls):
bowls = list(reversed(sorted(bowls))) # Largest to smallest.
start_arrangement = Arrangement(bowls[0])
arrange = {start_arrangement.state(): start_arrangement}
for bowl in bowls:
next_arrange = {}
for state, arrangement in arrange.items():
for next_arrangement in arrangement.next_arrangements(bowl):
state = next_arrangement.state()
if state in next_arrange and next_arrange[state].boxes <= next_arrangement.boxes:
pass # We are not an improvement.
else:
next_arrange[state] = next_arrangement
arrange = next_arrange
min_boxes = len(bowls)
min_box_list = None
for arrangement in arrange.values():
if arrangement.boxes <= min_boxes:
min_boxes = arrangement.boxes
min_box_list = arrangement.find_boxes()
return min_box_list
print(bowl_nesting([15, 15, 12, 11, 11,8,5]))
Now while the above solution works, it is inefficient. Suppose that we have up to k bowls of any given size. The number of combinations of open1[bowl] and open2[bowl] that allows is k choose 2 = k*(k-1)/2). When we consider that our state has 4 sizes in it, that's O(k^8 / 16 possible states. We do that for the number of bowls to get O(n k^8). This doesn't scale well.
We can do better by making the following notes:
In any arrangement with an open2[bowls+3] option, you do not do worse by moving the next bowl out of whatever box you were going to put it in, and putting it there instead.
If there is an open2[bowls+2] option and an open2[bowls+1] option, you never do worse by picking open2[bowls+2].
If there is an open1[bowls+i] option and an open1[bowls+j] option with 1 <= i < j <= 3 then you never do worse picking open1[bowls+i] instead.
This optimization means fewer choices, which speeds you up by a constant. But also you cannot have open2[bowls+3] and also have open2[bowls]. So that O(k^8) becomes O(k^7) states. And adding to the boxes with larger bowls will reduce how much of the potential state space we actually visit. This should lead to a better constant.
Here is this logic with a minor refactor to cleanup the code.
class Arrangement:
def __init__(self, next_bowl, prev_arrangement=None, choice=None, position=None):
self.prev_arrangement = prev_arrangement
self.add_rule = None
self.open1 = {}
self.open2 = {}
self.next_bowl = next_bowl
if prev_arrangement is None:
self.boxes = 0
for i in range(next_bowl, next_bowl + 4):
self.open1[i] = 0
self.open2[i] = 0
else:
self.boxes = prev_arrangement.boxes
for i in range(next_bowl, next_bowl + 4):
self.open1[i] = prev_arrangement.open1.get(i, 0)
self.open2[i] = prev_arrangement.open2.get(i, 0)
if choice is not None:
self.choice(choice, position)
# This will be tuples of tuples.
def state(self):
open1 = (self.open1[i+self.next_bowl] for i in range(4))
open2 = (self.open2[i+self.next_bowl] for i in range(4))
return (open1, open2)
def choice (self, rule, position=None):
self.add_rule = (rule, position)
if rule == "new":
self.boxes += 1
self.open2[self.next_bowl] += 1
elif rule == "open1":
self.open1[position] -= 1
elif rule == "open2":
self.open2[position] -= 1
self.open1[self.next_bowl] += 1
def next_arrangements(self, bowl):
if 0 < self.open2.get(bowl+3, 0):
yield Arrangement(bowl, self, "open2", bowl+3)
else:
yield Arrangement(bowl, self, "new")
for i in [3, 2, 1]:
if 0 < self.open1.get(bowl+i, 0):
yield Arrangement(bowl, self, "open1", bowl+i)
break
for i in [2, 1]:
if 0 < self.open2.get(bowl+i, 0):
yield Arrangement(bowl, self, "open2", bowl+i)
break
def find_boxes(self):
items = self._find_boxes()
boxes = items["full"]
for more_boxes in items["open1"].values():
boxes.extend(more_boxes)
for more_boxes in items["open2"].values():
boxes.extend(more_boxes)
return list(reversed(sorted(boxes)))
def _find_boxes(self):
if self.prev_arrangement is None:
return {
"full": [],
"open1": {},
"open2": {},
}
else:
items = self.prev_arrangement._find_boxes()
rule = self.add_rule
if rule[0] == "new":
if self.next_bowl not in items["open2"]:
items["open2"][self.next_bowl] = [[self.next_bowl]]
else:
items["open2"][self.next_bowl].append([self.next_bowl])
elif rule[0] == "open1":
box = items["open1"][rule[1]].pop()
box.append(self.next_bowl)
items["full"].append(box)
elif rule[0] == "open2":
box = items["open2"][rule[1]].pop()
box.append(self.next_bowl)
if self.next_bowl not in items["open1"]:
items["open1"][self.next_bowl] = [box]
else:
items["open1"][self.next_bowl].append(box)
return items
def bowl_nesting (bowls):
bowls = list(reversed(sorted(bowls))) # Largest to smallest.
start_arrangement = Arrangement(bowls[0])
arrange = {start_arrangement.state(): start_arrangement}
for bowl in bowls:
next_arrange = {}
for state, arrangement in arrange.items():
for next_arrangement in arrangement.next_arrangements(bowl):
state = next_arrangement.state()
if state in next_arrange and next_arrange[state].boxes <= next_arrangement.boxes:
pass # We are not an improvement.
else:
next_arrange[next_arrangement.state()] = next_arrangement
arrange = next_arrange
min_boxes = len(bowls)
min_box_list = None
for arrangement in arrange.values():
if arrangement.boxes <= min_boxes:
min_boxes = arrangement.boxes
min_box_list = arrangement.find_boxes()
return min_box_list
print(bowl_nesting([15, 15, 12, 11, 11,8,5]))
Yes, we can calculate an optimal nesting. As you presented, start with the bowls sorted in reverse order.
15,15,12,11,11,8,5
Assign the minimum number of starting bowls, corresponding to the count of the largest bowl.
[15] [15]
As we iterate element by element, the state we need to keep is the smallest bowl size and count in each container per index visited.
index 0, [(15, 1), (15, 1)]
(The state can be further refined to a multiset of those packages with identical count and smallest bowl size, which would add some complication.)
The choice for any element is which box (or set of boxes with similar state) to add it to or whether to start a new box with it.
index 1, [(15, 1), (12, 2)]
or
index 1, [(15, 1), (15, 1), (12, 1)]
We can explore these branches in an iterative or recursive breadth first search prioritised by the number of elements remaining plus the number of packages in the state, avoiding previously seen states.
We can further prune the search space by avoiding branches with the same or more count of packages than the best we've already seen.
This approach would amount to brute force in the sense of exploring all relevant branches. But hopefully the significant restrictions of package size and bowl size relationship would narrow the search space considerably.
This "Answer" is based on btilly's solution (the accepted answer).
Thank you #btilly for sticking with this and taking the time to revise the algorithm and fix bugs!
Since this was originally set within the context of Google Apps Script, I've rewritten this in Javascript and want to share the JS code with anyone else that might want it.
btilly's improved algorithm does indeed run much quicker than the first. Though the improvement factor depends on the bowls provided I've noticed it running up to 50 times faster in some of my sample sets.
Below is the JS code. Some caveats:
I've kept the same structure and same naming as much as possible in copying over btilly's solution.
There's no guarantee I did not introduce bugs while porting over btilly's code.
I'm not too familiar with many modern/proper JS conventions and also I don't know Python at all, so translating some of the concepts was tough and although I think my code is now bug free, if you spot any bugs, inefficiencies, bad programming ideas, please let me know and I'll update the below code.
I added a count to the state creation to make each state unique, since in my Apps Script implementation the JS runtime kept stringifying the arrays so that two states were sometimes considered the same even if they were not (e.g. the previous arrangement's bowl was the same size as another arrangement's bowl, but not the same bowl - the way two 10" bowls might appear to a 9" bowl for example). This was not needed in Python since the generators were unique based on their memory addresses. If you know a better way to do this in JS, please let me know. Seems a little sloppy the way I did it.
Improved/faster code (Javascript):
class Arrangement2{
constructor(next_bowl, prev_arrangement, choice, position){
this.prev_arrangement = prev_arrangement;
this.add_rule = null;
this.open1 = {};
this.open2 = {};
this.next_bowl = next_bowl;
if (prev_arrangement == null){
this.boxes = 0;
for (let i = next_bowl; i < next_bowl + 4; i++){
this.open1[i] = 0;
this.open2[i] = 0;
}
}
else{
this.boxes = prev_arrangement.boxes;
for (let i = next_bowl; i < next_bowl + 4; i++){
this.open1[i] = prev_arrangement.open1[i] != null ? prev_arrangement.open1[i] : 0;
this.open2[i] = prev_arrangement.open2[i] != null ? prev_arrangement.open2[i] : 0;
}
}
if(choice != null){
this.choice(choice,position);
}
}
state(){
let open1 = {};
let open2 = {};
for(let i = 0; i < 4; i++){
open1[i+this.next_bowl] = this.open1[i+this.next_bowl];
open2[i+this.next_bowl] = this.open2[i+this.next_bowl];
}
var toReturn = [];
//Used to make each state unique, without this the algorithm may not always find the best solution
Arrangement2.count++;
toReturn.push(Arrangement2.count);
toReturn.push(open1);
toReturn.push(open2);
return toReturn;
}
choice(rule, position){
this.add_rule = [rule, position];
if( rule == "new" ){
this.boxes += 1;
this.open2[this.next_bowl] += 1;
}
else if( rule == "open1" ){
this.open1[position] -= 1;
}
else if( rule == "open2" ){
this.open2[position] -= 1;
this.open1[this.next_bowl] += 1;
}
}
* next_arrangements (bowl){
if( 0 < (this.open2[bowl+3] != null ? this.open2[bowl+3] : 0)){
yield new Arrangement2(bowl, this, "open2", bowl + 3);
}
else{
yield new Arrangement2(bowl, this, "new", null);
for(let i = 3; i > 0; i--){
if (this.open1[bowl+i] != null ? this.open1[bowl+i] : 0){
yield new Arrangement2(bowl, this, "open1", bowl+i);
break ;
}
}
for(let i = 2; i > 0; i--){
if (this.open2[bowl+i] != null ? this.open2[bowl+i] : 0){
yield new Arrangement2(bowl, this, "open2", bowl+i);
break ;
}
}
}
}
find_boxes(){
let items = this._find_boxes();
let boxes = items["full"];
for (const [key, more_boxes] of Object.entries(items["open1"])) {
boxes = boxes.concat(more_boxes);
}
for (const [key, more_boxes] of Object.entries(items["open2"])) {
boxes = boxes.concat(more_boxes);
}
//Max --> Min (i.e [ 12, 12, 11, 11, 10, 7, 7, 7 ])
boxes.sort(function(a, b){return b - a});
return boxes; //boxes.sort().reverse(); //list(reversed(sorted(boxes)));
}
_find_boxes(){
if (this.prev_arrangement == null){
return {
"full": [],
"open1": {},
"open2": {},
}
}
else{
let items = this.prev_arrangement._find_boxes();
let rule = this.add_rule;
if (rule[0] == "new"){
if (!(this.next_bowl in items["open2"])){
items["open2"][this.next_bowl] = [[this.next_bowl]];
}
else{
items["open2"][this.next_bowl].push([this.next_bowl]);
}
}
else if( rule[0] == "open1"){
let box = items["open1"][rule[1]].pop();
box.push(this.next_bowl);
items["full"].push(box);
}
else if( rule[0] == "open2"){
let box = items["open2"][rule[1]].pop();
box.push(this.next_bowl);
if (!(this.next_bowl in items["open1"])){
items["open1"][this.next_bowl] = [box];
}
else{
items["open1"][this.next_bowl].push(box);
}
}
return items;
}
}
__str__(){
return this.next_bowl + " " + JSON.stringify(this.boxes) + " open1:" + JSON.stringify(this.open1) + " open2:" + JSON.stringify(this.open2);
}
}
allStates_nesting_improved = function (bowls){
//Used to make each state unique, without this the algorithm may not always find the best solution
Arrangement2.count = 0;
//Max --> Min (i.e [ 12, 12, 11, 11, 10, 7, 7, 7 ])
bowls.sort(function(a, b){return b - a});
let start_arrangement = new Arrangement2(bowls[0], null);
let returnObj = start_arrangement.state();
let arrange = {[returnObj]:start_arrangement};
for (const [key, bowl] of Object.entries(bowls) ) {
let next_arrange = {};
for (let [state, arrangement] of Object.entries(arrange) ) {
let next_arrangements = arrangement.next_arrangements(bowl);
let next_arrangement = next_arrangements.next();
while(next_arrangement.value != undefined){
next_arrangement = next_arrangement.value;
let state = next_arrangement.state();
let nextArrange_state = next_arrange[state];
if ( next_arrange[state] != undefined && (nextArrange_state === state) && next_arrange[state].boxes <= next_arrangement.boxes){
continue ; // # We are not an improvement.
}
else{
next_arrange[next_arrangement.state()] = next_arrangement;
}
next_arrangement = next_arrangements.next();
}
}
arrange = next_arrange;
}
let min_boxes = bowls.length;
let min_box_list = null;
for (const [key, arrangement] of Object.entries(arrange) ) {
if (arrangement.boxes <= min_boxes){
min_boxes = arrangement.boxes;
min_box_list = arrangement.find_boxes();
}
}
console.log(min_box_list);
return min_box_list;
}
Original code (Javascript):
class Arrangement1{
constructor(next_bowl, prev_arrangement){
this.prev_arrangement = prev_arrangement;
this.add_rule = null;
this.open1 = {};
this.open2 = {};
this.next_bowl = next_bowl;
if (prev_arrangement == null){
this.boxes = 0;
for (let i = next_bowl; i < next_bowl + 4; i++){
this.open1[i] = 0;
this.open2[i] = 0;
}
}
else{
this.boxes = prev_arrangement.boxes;
for (let i = next_bowl; i < next_bowl + 4; i++){
this.open1[i] = prev_arrangement.open1[i] != null ? prev_arrangement.open1[i] : 0;
this.open2[i] = prev_arrangement.open2[i] != null ? prev_arrangement.open2[i] : 0;
}
}
}
state(){
//Used to make each state unique, without this the algorithm may not always find the best solution
Arrangement1.count++;
let open1 = {};
let open2 = {};
for(let i = 0; i < 4; i++){
open1[i+this.next_bowl] = this.open1[i+this.next_bowl];
open2[i+this.next_bowl] = this.open2[i+this.next_bowl];
}
var toReturn = [];
toReturn.push(Arrangement1.count);
toReturn.push(open1);
toReturn.push(open2);
return toReturn;
}
* next_arrangements (bowl){
let base_arrangement = new Arrangement1(bowl, this);
base_arrangement.boxes += 1;
base_arrangement.add_rule = ["new"];
let old_count = this.open2[bowl] != null ? this.open2[bowl] : 0;
base_arrangement.open2[bowl] = old_count + 1;
yield base_arrangement;
for(let i = 1; i < 4; i++){
if (0 < (this.open1[bowl+i] != null ? this.open1[bowl+i] : 0)){
let next_arrangement = new Arrangement1(bowl, this);
next_arrangement.open1[bowl+i] -= 1;
next_arrangement.add_rule = ["open", 1, bowl+i];
yield next_arrangement;
}
if (0 < (this.open2[bowl+i] != null ? this.open2[bowl+i] : 0)){
let next_arrangement = new Arrangement1(bowl, this);
next_arrangement.open2[bowl+i] -= 1;
next_arrangement.open1[bowl] += 1;
next_arrangement.add_rule = ["open", 2, bowl+i];
yield next_arrangement;
}
}
}
find_boxes(){
let items = this._find_boxes();
let boxes = items["full"];
for (const [key, more_boxes] of Object.entries(items["open1"])) {
boxes = boxes.concat(more_boxes);
}
for (const [key, more_boxes] of Object.entries(items["open2"])) {
boxes = boxes.concat(more_boxes);
}
//Max --> Min (i.e [ 12, 12, 11, 11, 10, 7, 7, 7 ])
boxes.sort(function(a, b){return b - a});
return boxes;
}
_find_boxes(){
if (this.prev_arrangement == null){
return {
"full": [],
"open1": {},
"open2": {},
}
}
else{
let items = this.prev_arrangement._find_boxes();
let rule = this.add_rule;
if (rule[0] == "new"){
if (!(this.next_bowl in items["open2"])){
items["open2"][this.next_bowl] = [[this.next_bowl]];
}
else{
items["open2"][this.next_bowl].push([this.next_bowl]);
}
}
else if( rule[0] == "open"){
if (rule[1] == 1){
let box = items["open1"][rule[2]].pop();
box.push(this.next_bowl);
items["full"].push(box);
}
else if( rule[1] == 2){
let box = items["open2"][rule[2]].pop();
box.push(this.next_bowl);
if (!(this.next_bowl in items["open1"])){
items["open1"][this.next_bowl] = [box];
}
else{
items["open1"][this.next_bowl].push(box);
}
}
}
return items;
}
}
__str__(){
return this.next_bowl + " " + JSON.stringify(this.boxes) + " open1:" + JSON.stringify(this.open1) + " open2:" + JSON.stringify(this.open2);
}
}
allStates_nesting = function (bowls){
//Used to make each state unique, without this the algorithm may not always find the best solution
Arrangement1.count = 0;
//Max --> Min (i.e [ 12, 12, 11, 11, 10, 7, 7, 7 ])
bowls.sort(function(a, b){return b - a});
let start_arrangement = new Arrangement1(bowls[0], null);
let returnObj = start_arrangement.state();
let arrange = {[returnObj]:start_arrangement};
for (const [key, bowl] of Object.entries(bowls) ) {
let next_arrange = {};
for (let [state, arrangement] of Object.entries(arrange) ) {
let next_arrangements = arrangement.next_arrangements(bowl);
let next_arrangement = next_arrangements.next();
while(next_arrangement.value != undefined){
next_arrangement = next_arrangement.value;
let state = next_arrangement.state();
let nextArrange_state = next_arrange[state];
if ( next_arrange[state] != undefined && (nextArrange_state === state) && next_arrange[state].boxes <= next_arrangement.boxes){
continue ; // # We are not an improvement.
}
else{
next_arrange[state] = next_arrangement;
}
next_arrangement = next_arrangements.next();
}
}
arrange = next_arrange;
}
let min_boxes = bowls.length;
let min_box_list = null;
for (const [key, arrangement] of Object.entries(arrange) ) {
if (arrangement.boxes <= min_boxes){
min_boxes = arrangement.boxes;
min_box_list = arrangement.find_boxes();
}
}
return min_box_list;
}
See it in action
Here is a link to a spreadsheet testbed with 3 algorithms:
Algorithm 1: A heuristic algorithm another developer provided (runs fast but doesn't always find the optimal solution and ignores some of the requirements in some of its solutions for simplicity's sake)
Algorithm 2: btilly's revised algorithm (faster)
Algorithm 3: btilly's first attempt
Bowl Nesting Spreadsheet
Feel free to make a copy and modify the code and/or add your own algorithm to compare it with the others. (The orange "Run" button won't work since the spreadsheet is in "Viewer" mode. You'll need to make a copy to run it).
To make a copy go to
File -> Make a copy.
Once you have your own copy, you can click the "Run" button or go to the code by clicking
Extensions -> Apps Script
You can then modify and/or add your own algorithm to the mix.
You'll also have to authorize the script to run as with all Apps Script scripts.
If you're worried about authorizing it, of course check out the code before clicking run to make sure there isn't anything nefarious in there.

Guide to improve/optimize my terrible code for is graph bipartite in scala

I wrote this code for https://leetcode.com/problems/is-graph-bipartite/
It works but I think the code is terrible. Issues:
How do I break when I find isBipartite=false
How do I only dfs(find if graph is bipartite) for g.keys which have not be visted. (when there are 2 graphs not connected to each other )
import scala.collection.immutable.Queue
case class Step(q: Queue[Int]=Queue(),visited: List[Int]=List(),colors: List[Int]=List(),isBipartite: Boolean = true)
object Solution {
def isBipartite(graph: Array[Array[Int]]): Boolean = {
val g= graph.foldLeft(Map[Int,List[Int]]())((mp,arr)=>{
val i= graph.indexOf(arr)
if(arr.isEmpty){
mp + (i->List())
}else
arr.foldLeft(mp)((m,v)=>{
m + (i->(m.getOrElse(i,Nil) :+ v))
})
})
//println(g)
val colors = List.fill(graph.size)(-1)
//println(colors)
g.keys.foldLeft(Step())((s,k)=>{
if(!s.visited.contains(k) || s.isBipartite==true)
bfs(k,g,s.copy(q=Queue(k),visited=(s.visited:+ k),colors=colors.updated(k,1)))
else
s.copy(isBipartite=false)
}).isBipartite
}
def bfs(start: Int, g: Map[Int,List[Int]], step1: Step): Step = {
val s=Stream.iterate(step1) {
case (step) =>
//dequeue
val (vertex, rest)=step.q.dequeue
val newS=g.getOrElse(vertex,Nil).foldLeft(step)((s,v)=>{
//println("vertex color for vertex "+vertex+" "+s.colors(vertex))
//println("v color "+s.colors(v))
if(s.colors(vertex)==s.colors(v)){
//println("is not bipartite")
step.copy(isBipartite=false)
}else if(s.colors(v) == -1){
//println(s.colors)
s.copy(colors=s.colors.updated(v,(1-s.colors(vertex))))
}else{
s
}
})
//add neighbours to queue
val newQueue=rest.enqueue(g.getOrElse(vertex,Nil).filterNot(newS.visited.contains))
//mark neighbours visited
val newVisited: List[Int] = newS.visited ++ g.getOrElse(vertex,Nil)
newS.copy(q=newQueue,visited=newVisited)
}.takeWhile(t=> t.q.nonEmpty).filterNot(n=>n.isBipartite).headOption
if(!s.isEmpty)
s.get
else
Step()
}
}
I came up with a little better solution
import scala.collection.immutable.Queue
case class Step(q: Queue[Int]=Queue(),colors: List[Int]=List(),isBipartite: Boolean = true)
object Solution {
def isBipartite(graph: Array[Array[Int]]): Boolean = {
//create a g
val g= graph.foldLeft(Map[Int,List[Int]]())((mp,arr)=>{
val i= graph.indexOf(arr)
if(arr.isEmpty) mp + (i->List())
else arr.foldLeft(mp)((m,v)=>m + (i->(m.getOrElse(i,Nil) :+ v)))
})
//init colors array
val colors = List.fill(graph.size)(-1)
//iterate over all g keys which are not processed
g.keys.foldLeft(Step(Queue(),colors,true))((s,k)=>
if(s.colors(k) == -1 || s.isBipartite==true){
bfs(k,g,s.copy(q=Queue(k),colors=s.colors.updated(k,1)))
} else s
).isBipartite
}
//color of neighbors should be opposite of vertex to be bipartite
def bfs(start: Int, g: Map[Int,List[Int]], step: Step): Step = {
Stream.iterate(step) {
case (step) =>
val (vertex, rest)=step.q.dequeue
g.getOrElse(vertex,Nil).foldLeft(step.copy(q=rest))((s,v)=>{
if(!s.isBipartite) s
else if(s.colors(vertex)==s.colors(v)) s.copy(isBipartite=false)
else if(s.colors(v) == -1) s.copy(q=s.q.enqueue(v),colors=s.colors.updated(v,(1-s.colors(vertex))))
else s
})
}.takeWhile(t=> t.q.nonEmpty).last
}
}

Is there a way to find a path in a tree?

Let's say we have a tree like the one below. Is there an algorithm that given 2 nodes find the path that connects them. For example, given (A,E) it will return [A,B,E], or given (D,G) it will return [D,B,A,C,G]
A
/ \
B C
/ \ / \
D E F G
You will need to have a tree implementation where a child node has a link to its parent.
Then for both nodes you can build the path from the node to the root, just by following the parent link.
Then compare the two paths, starting from the ends of the paths (where the root is): as long as they are the same, remove that common node from both paths.
Finally you are left with two diverting paths. Reverse the second, and join the two paths, putting the last removed node in between the two.
Here is an implementation in JavaScript:
function getPathToRoot(a) {
if (a.parent == null) return [a];
return [a].concat(getPathToRoot(a.parent));
}
function findPath(a, b) {
let p = getPathToRoot(a);
let q = getPathToRoot(b);
let common = null;
while (p.length > 0 && q.length > 0 && p[p.length-1] == q[q.length-1]) {
common = p.pop();
q.pop();
}
return p.concat(common, q.reverse());
}
// Create the example tree
let nodes = {};
for (let label of "ABCDEFG") {
nodes[label] = { label: label, parent: null };
}
nodes.B.parent = nodes.C.parent = nodes.A;
nodes.D.parent = nodes.E.parent = nodes.B;
nodes.F.parent = nodes.G.parent = nodes.C;
// Perform the algorithm
let path = findPath(nodes.E, nodes.G);
// Output the result
console.log("Path from E to G is:");
for (let node of path) {
console.log(node.label);
}

Map/Fold the multiple .OBJ-index-buffers to OpenGL's 1 index buffer

I am trying to load in a .obj-File and draw it with the help of glDrawElements.
Now, with glDrawArrays everything works perfectly, but it is - of course - inefficient.
The problem I have right now is, that an .obj-file uses multiple index-buffers (for each attribute) while OpenGL may only use one. So I need to map them accordingly.
There are a lot of pseudo-algorithms out there and I even found a C++ implementation. I do know quite a bit of C++ but strangely neither helped me with my implementation in Scala.
Let's see:
private def parseObj(path: String): Model =
{
val objSource: List[String] = Source.fromFile(path).getLines.toList
val positions: List[Vector3] = objSource.filter(_.startsWith("v ")).map(_.split(" ")).map(v => new Vector3(v(1).toFloat,v(2).toFloat,v(3).toFloat))//, 1.0f))
val normals: List[Vector4] = objSource.filter(_.startsWith("vn ")).map(_.split(" ")).map(v => new Vector4(v(1)toFloat,v(2).toFloat, v(3).toFloat, 0.0f))
val textureCoordinates: List[Vector2] = objSource.filter(_.startsWith("vt ")).map(_.split(" ")).map(v => new Vector2(v(1).toFloat, 1-v(2).toFloat)) // TODO 1-y because of blender
val faces: List[(Int, Int, Int)] = objSource.filter(_.startsWith("f ")).map(_.split(" ")).flatten.filterNot(_ == "f").map(_.split("/")).map(a => ((a(0).toInt, a(1).toInt, a(2).toInt)))
val vertices: List[Vertex] = for(face <- faces) yield(new Vertex(positions(face._1-1), textureCoordinates(face._2-1)))
val f: List[(Vector3, Vector2, Vector4)] = for(face <- faces) yield((positions(face._1-1), textureCoordinates(face._2-1), normals(face._3-1)))
println(f.mkString("\n"))
val indices: List[Int] = faces.map(f => f._1-1) // Wrong!
new Model(vertices.toArray, indices.toArray)
}
The val indices: List[Int] was my first naive approach and of course is wrong. But let's start at the top:
I load in the file and go through it. (I assume you know how an .obj-file is made up)
I read in the vertices, texture-coordinates and normals. Then I come to the faces.
Now, each face in my example has 3 values v_x, t_y, n_z defining the vertexAtIndexX, textureCoordAtIndexY, normalAtIndexZ. So each of these define one Vertex while a triple of these (or one line in the file) defines a Face/Polygon/Triangle.
in val vertices: List[Vertex] = for(face <- faces) yield(new Vertex(positions(face._1-1), textureCoordinates(face._2-1))) I actually try to create Vertices (a case-class that currently only holds positions and texture-coordinates and neglects normals for now)
The real problem is this line:
val indices: List[Int] = faces.map(f => f._1-1) // Wrong!
To get the real indices I basically need to do this instead of
val vertices: List[Vertex] = for(face <- faces) yield(new Vertex(positions(face._1-1), textureCoordinates(face._2-1)))
and
val indices: List[Int] = faces.map(f => f._1-1) // Wrong!
Pseudo-Code:
Iterate over all faces
Iterate over all vertices in a face
Check if we already have that combination of(position, texturecoordinate, normal) in our newVertices
if(true)
indices.put(indexOfCurrentVertex)
else
create a new Vertex from the face
store the new vertex in the vertex list
indices.put(indexOfNewVertex)
Yet I'm totally stuck. I've tried different things, but can't come up with a nice and clean solution that actually works.
Things like:
val f: List[(Vector3, Vector2, Vector4)] = for(face <- faces) yield((positions(face._1-1), textureCoordinates(face._2-1), normals(face._3-1)))
and trying to f.distinct are not working, because there is nothing to distinct, all the entries there are unique, which totally makes sense if I look at the file and yet that's what the pseudo-code tells me to check.
Of course then I would need to fill the indices accordingly (preferably in a one-liner and with a lot of functional beauty)
But I should try to find duplicates, so... I'm kind of baffled. I guess I mix up the different "vertices" and "positions" too much, with all the referencing.
So, am I thinking wrong, or is the algorithm/thinking right and I just need to implement this in nice, clean (and actually working) Scala code?
Please, enlighten me!
As per comments, I made a little update:
var index: Int = 0
val map: mutable.HashMap[(Int, Int, Int), Int] = new mutable.HashMap[(Int, Int, Int), Int].empty
val combinedIndices: ListBuffer[Int] = new ListBuffer[Int]
for(face <- faces)
{
val vID: Int = face._1-1
val nID: Int = face._2-1
val tID: Int = face._3-1
var combinedIndex: Int = -1
if(map.contains((vID, nID, tID)))
{
println("We have a duplicate, wow!")
combinedIndex = map.get((vID, nID, tID)).get
}
else
{
combinedIndex = index
map.put((vID, nID, tID), combinedIndex)
index += 1
}
combinedIndices += combinedIndex
}
where faces still is:
val faces: List[(Int, Int, Int)] = objSource.filter(_.startsWith("f ")).map(_.split(" ")).flatten.filterNot(_ == "f").map(_.split("/")).map(a => ((a(0).toInt, a(1).toInt, a(2).toInt)))
Fun fact I'm still not understanding it obviously, because that way I never ever get a duplicate!
Meaning that combinedIndices at the end just holds the natural numbers like:
ListBuffer(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ...)
This is javascript (sorry not scala) but it's commented and shoul be fairly easy to convert.
// bow-tie
var objString = "v 0 0 0\nv 1 1 0\nv 1 -1 0\nv -1 1 0\nv -1 -1 0\n" +
"vt 0 .5\nvt 1 1\nvt 1 0\n" +
"vn 0 0 1\n" +
"f 1/1/1 2/2/1 3/3/1\nf 1/1/1 4/2/1 5/3/1";
// output indices should be [0, 1, 2, 0, 3, 4]
// parse the file
var lines = objString.split("\n");
var data = lines.map(function(line) { return line.split(" "); });
var v = [];
var t = [];
var n = [];
var f = [];
var indexMap = new Map(); // HashMap<face:string, index:integer>
var nextIndex = 0;
var vertices = [];
var indices = [];
// fill vertex, texture and normal arrays
data.filter(function(d) { return d[0] == "v"; }).forEach(function(d) { v.push([parseFloat(d[1]), parseFloat(d[2]), parseFloat(d[3])]); });
data.filter(function(d) { return d[0] == "vt"; }).forEach(function(d) { t.push([parseFloat(d[1]), parseFloat(d[2])]); });
data.filter(function(d) { return d[0] == "vn"; }).forEach(function(d) { n.push([parseFloat(d[1]), parseFloat(d[2]), parseFloat(d[3])]); });
//
console.log("V", v.toString());
console.log("T", t.toString());
console.log("N", n.toString());
// create vertices and indices arrays by parsing faces
data.filter(function(d) { return d[0] == "f"; }).forEach(function(d) {
var f1 = d[1].split("/").map(function(d) { return parseInt(d)-1; });
var f2 = d[2].split("/").map(function(d) { return parseInt(d)-1; });
var f3 = d[3].split("/").map(function(d) { return parseInt(d)-1; });
// 1
if(indexMap.has(d[1].toString())) {
indices.push(indexMap.get(d[1].toString()));
} else {
vertices = vertices.concat(v[f1[0]]).concat(t[f1[1]]).concat(n[f1[2]]);
indexMap.set(d[1].toString(), nextIndex);
indices.push(nextIndex++);
}
// 2
if(indexMap.has(d[2].toString())) {
indices.push(indexMap.get(d[2].toString()));
} else {
vertices = vertices.concat(v[f2[0]]).concat(t[f2[1]]).concat(n[f2[2]]);
indexMap.set(d[2].toString(), nextIndex);
indices.push(nextIndex++);
}
// 3
if(indexMap.has(d[3].toString())) {
indices.push(indexMap.get(d[3].toString()));
} else {
vertices = vertices.concat(v[f3[0]]).concat(t[f3[1]]).concat(n[f3[2]]);
indexMap.set(d[3].toString(), nextIndex);
indices.push(nextIndex++);
}
});
//
console.log("Vertices", vertices.toString());
console.log("Indices", indices.toString());
The output
V 0,0,0,1,1,0,1,-1,0,-1,1,0,-1,-1,0
T 0,0.5,1,1,1,0
N 0,0,1
Vertices 0,0,0,0,0.5,0,0,1,1,1,0,1,1,0,0,1,1,-1,0,1,0,0,0,1,-1,1,0,1,1,0,0,1,-1,-1,0,1,0,0,0,1
Indices 0,1,2,0,3,4
The JSFiddle http://jsfiddle.net/8q7jLvsq/2
The only thing I'm doing ~differently is using the string hat represents one of the parts of a face as the key into my indexMap (ex: "25/32/5").
EDIT JSFiddle http://jsfiddle.net/8q7jLvsq/2/ this version combines repeated values for vertex, texture and normal. This optimizes OBJ files that repeat the same values values making every face unique.
// bow-tie
var objString = "v 0 0 0\nv 1 1 0\nv 1 -1 0\nv 0 0 0\nv -1 1 0\nv -1 -1 0\n" +
"vt 0 .5\nvt 1 1\nvt 1 0\nvt 0 .5\nvt 1 1\nvt 1 0\n" +
"vn 0 0 1\nvn 0 0 1\nvn 0 0 1\nvn 0 0 1\nvn 0 0 1\nvn 0 0 1\n" +
"f 1/1/1 2/2/2 3/3/3\nf 4/4/4 5/5/5 6/6/6";
// output indices should be [0, 1, 2, 0, 3, 4]
// parse the file
var lines = objString.split("\n");
var data = lines.map(function(line) { return line.split(" "); });
var v = [];
var t = [];
var n = [];
var f = [];
var vIndexMap = new Map(); // map to earliest index in the list
var vtIndexMap = new Map();
var vnIndexMap = new Map();
var indexMap = new Map(); // HashMap<face:string, index:integer>
var nextIndex = 0;
var vertices = [];
var indices = [];
// fill vertex, texture and normal arrays
data.filter(function(d) { return d[0] == "v"; }).forEach(function(d, i) {
v[i] = [parseFloat(d[1]), parseFloat(d[2]), parseFloat(d[3])];
var key = [d[1], d[2], d[3]].toString();
if(!vIndexMap.has(key)) {
vIndexMap.set(key, i);
}
});
data.filter(function(d) { return d[0] == "vt"; }).forEach(function(d, i) {
t[i] = [parseFloat(d[1]), parseFloat(d[2])];
var key = [d[1], d[2]].toString();
if(!vtIndexMap.has(key)) {
vtIndexMap.set(key, i);
}
});
data.filter(function(d) { return d[0] == "vn"; }).forEach(function(d, i) {
n[i] = [parseFloat(d[1]), parseFloat(d[2]), parseFloat(d[3])];
var key = [d[1], d[2], d[3]].toString();
if(!vnIndexMap.has(key)) {
vnIndexMap.set(key, i);
}
});
//
console.log("V", v.toString());
console.log("T", t.toString());
console.log("N", n.toString());
// create vertices and indices arrays by parsing faces
data.filter(function(d) { return d[0] == "f"; }).forEach(function(d) {
var f1 = d[1].split("/").map(function(d, i) {
var index = parseInt(d)-1;
if(i == 0) index = vIndexMap.get(v[index].toString());
else if(i == 1) index = vtIndexMap.get(t[index].toString());
else if(i == 2) index = vnIndexMap.get(n[index].toString());
return index;
});
var f2 = d[2].split("/").map(function(d, i) {
var index = parseInt(d)-1;
if(i == 0) index = vIndexMap.get(v[index].toString());
else if(i == 1) index = vtIndexMap.get(t[index].toString());
else if(i == 2) index = vnIndexMap.get(n[index].toString());
return index;
});
var f3 = d[3].split("/").map(function(d, i) {
var index = parseInt(d)-1;
if(i == 0) index = vIndexMap.get(v[index].toString());
else if(i == 1) index = vtIndexMap.get(t[index].toString());
else if(i == 2) index = vnIndexMap.get(n[index].toString());
return index;
});
// 1
if(indexMap.has(f1.toString())) {
indices.push(indexMap.get(f1.toString()));
} else {
vertices = vertices.concat(v[f1[0]]).concat(t[f1[1]]).concat(n[f1[2]]);
indexMap.set(f1.toString(), nextIndex);
indices.push(nextIndex++);
}
// 2
if(indexMap.has(f2.toString())) {
indices.push(indexMap.get(f2.toString()));
} else {
vertices = vertices.concat(v[f2[0]]).concat(t[f2[1]]).concat(n[f2[2]]);
indexMap.set(f2.toString(), nextIndex);
indices.push(nextIndex++);
}
// 3
if(indexMap.has(f3.toString())) {
indices.push(indexMap.get(f3.toString()));
} else {
vertices = vertices.concat(v[f3[0]]).concat(t[f3[1]]).concat(n[f3[2]]);
indexMap.set(f3.toString(), nextIndex);
indices.push(nextIndex++);
}
});
//
console.log("Vertices", vertices.toString());
console.log("Indices", indices.toString());

LINQ Grouping: Is there a cleaner way to do this without a for loop

I am trying to create a very simple distribution chart and I want to display the counts of tests score percentages in their corresponding 10's ranges.
I thought about just doing the grouping on the Math.Round((d.Percentage/10-0.5),0)*10 which should give me the 10's value....but I wasn't sure the best way to do this given that I would probably have missing ranges and all ranges need to appear even if the count is zero. I also thought about doing an outer join on the ranges array but since I'm fairly new to Linq so for the sake of time I opted for the code below. I would however like to know what a better way might be.
Also note: As I tend to work with larger teams with varying experience levels, I'm not all that crazy about ultra compact code unless it remains very readable to the average developer.
Any suggestions?
public IEnumerable<TestDistribution> GetDistribution()
{
var distribution = new List<TestDistribution>();
var ranges = new int[] { 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110 };
var labels = new string[] { "0%'s", "10%'s", "20%'s", "30%'s", "40%'s", "50%'s", "60%'s", "70%'s", "80%'s", "90%'s", "100%'s", ">110% "};
for (var n = 0; n < ranges.Count(); n++)
{
var count = 0;
var min = ranges[n];
var max = (n == ranges.Count() - 1) ? decimal.MaxValue : ranges[n+1];
count = (from d in Results
where d.Percentage>= min
&& d.Percentage<max
select d)
.Count();
distribution.Add(new TestDistribution() { Label = labels[n], Frequency = count });
}
return distribution;
}
// ranges and labels in a list of pairs of them
var rangesWithLabels = ranges.Zip(labels, (r,l) => new {Range = r, Label = l});
// create a list of intervals (ie. 0-10, 10-20, .. 110 - max value
var rangeMinMax = ranges.Zip(ranges.Skip(1), (min, max) => new {Min = min, Max = max})
.Union(new[] {new {Min = ranges.Last(), Max = Int32.MaxValue}});
//the grouping is made by the lower bound of the interval found for some Percentage
var resultsDistribution = from c in Results
group c by
rangeMinMax.FirstOrDefault(r=> r.Min <= c.Percentage && c.Percentage < r.Max).Min into g
select new {Percentage = g.Key, Frequency = g.Count() };
// left join betweem the labels and the results with frequencies
var distributionWithLabels =
from l in rangesWithLabels
join r in resultsDistribution on l.Range equals r.Percentage
into rd
from r in rd.DefaultIfEmpty()
select new TestDistribution{
Label = l.Label,
Frequency = r != null ? r.Frequency : 0
};
distribution = distributionWithLabels.ToList();
Another solution if the ranges and labels can be created in another way
var ranges = Enumerable.Range(0, 10)
.Select(c=> new {
Min = c * 10,
Max = (c +1 )* 10,
Label = (c * 10) + "%'s"})
.Union(new[] { new {
Min = 100,
Max = Int32.MaxValue,
Label = ">110% "
}});
var resultsDistribution = from c in Results
group c by ranges.FirstOrDefault(r=> r.Min <= c.Percentage && c.Percentage < r.Max).Min
into g
select new {Percentage = g.Key, Frequency = g.Count() };
var distributionWithLabels =
from l in ranges
join r in resultsDistribution on l.Min equals r.Percentage
into rd
from r in rd.DefaultIfEmpty()
select new TestDistribution{
Label = l.Label,
Frequency = r != null ? r.Frequency : 0
};
This works
public IEnumerable<TestDistribution> GetDistribution()
{
var range = 12;
return Enumerable.Range(0, range).Select(
n => new TestDistribution
{
Label = string.Format("{1}{0}%'s", n*10, n==range-1 ? ">" : ""),
Frequency =
Results.Count(
d =>
d.Percentage >= n*10
&& d.Percentage < ((n == range - 1) ? decimal.MaxValue : (n+1)*10))
});
}

Resources