Equality of All Elements in A Range - algorithm

What Phobos algorithm should I use to check whether all elements in a range are equal? I've looked in std.algorithm and the closest I've found is equal, but it takes two ranges as arguments. I also cannot find a way to apply reduce to solve this problem.

Nice, Adam. A few more possibilities:
foo.empty || foo.equal(repeat(foo.front, foo.length))
or
foo.empty || repeat(foo.front).startsWith(foo)
or
foo.findAdjacent!("a != b").empty

One option would be to use canFind:
import std.algorithm;
import std.range;

void main() {
    int[] foo = [1, 1, 2];
    if (!foo.empty) {
        if (!canFind!"a != b"(foo, foo.front)) {
            // they are equal
        } else {
            // not equal
        }
    } else {
        // nothing to compare against
    }
}
The logic here is that if all elements are equal, then canFind should not be able to find an item that differs from the first item.
Andrei's answer has several more options!


Solving "Check for BST" question on GeeksForGeeks but keep getting wrong answers but I can't seem to find the issue. Can someone please explain?

I'm trying to solve the "Check for BST" question on GeeksForGeeks, but I keep getting wrong answers and I don't think there is an issue with my code. Can someone please look at my code and explain? Thank you.
public class Solution
{
    // Function to check whether a Binary Tree is BST or not.
    static boolean answer = true;

    boolean isBST(Node root)
    {
        // code here.
        if (root == null) return false;
        traverseLeft(root.left, root.data);
        traverseRight(root.right, root.data);
        return answer;
    }

    void traverseLeft(Node root, int val)
    {
        if (root == null) return;
        if (root.data >= val)
        {
            answer = false;
            return;
        }
        traverseLeft(root.left, root.data);
        traverseRight(root.right, root.data);
    }

    void traverseRight(Node root, int val)
    {
        if (root == null) return;
        if (root.data <= val)
        {
            answer = false;
            return;
        }
        traverseLeft(root.left, root.data);
        traverseRight(root.right, root.data);
    }
}
There are several issues with your attempt:
answer should not be a static variable. As a static variable, its value survives across different test cases, never resetting to true after the first test has run, and can thus give false negatives.
Instead, you should let the recursive function return whether the subtree rooted at the given node is a valid BST. The caller captures that boolean result for both of its children and combines them with a logical AND: it returns true if, and only if, both recursive calls return true and its own value is within the acceptable range; in all other cases it returns false. That way you have no need for a global variable.
It is not enough to compare a node's value with the value of its parent. For instance, your logic would consider the following tree a valid BST, but it isn't:
  6
 /
1
 \
  10
While the value 10 passes the test your code performs (it is greater than 1, its parent), it must also be smaller than 6, because it sits in the left subtree of 6. So a correct algorithm passes two limiting values down to the recursive calls: a lower limit and an upper limit.
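Putting both points together, here is a minimal sketch of the bounds-based check in Java. It assumes the same Node class (int data, Node left, Node right) that the judge supplies for the original code; the limits are passed as long values so that nodes holding Integer.MIN_VALUE or Integer.MAX_VALUE are still handled, and an empty subtree is treated as valid:

boolean isBST(Node root)
{
    return isBST(root, Long.MIN_VALUE, Long.MAX_VALUE);
}

private boolean isBST(Node root, long low, long high)
{
    // An empty subtree is a valid BST.
    if (root == null) return true;
    // The node's value must lie strictly between the limits inherited from its ancestors.
    if (root.data <= low || root.data >= high) return false;
    // Valid if, and only if, both subtrees are valid with the limits tightened by this node's value.
    return isBST(root.left, low, root.data)
        && isBST(root.right, root.data, high);
}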

Using the Haxe While Loop to Remove All of a Value from an Array

I want to remove all occurrences of a possibly duplicated value from an array. At the moment I'm using the remove(x:T):Bool function in a while loop, but I'm wondering about the expression part.
I've started by using:
function removeAll(array:Array<String>, element:String):Void
while (array.remove(element)) {}
but I'm wondering if any of these lines would be more efficient:
while (array.remove(element)) continue;
while (array.remove(element)) true;
while (array.remove(element)) 0;
or if it makes any kind of difference.
I'm guessing that using continue is the least efficient because it actually has to do something, that true and 0 are slightly better but still do something, and that {} would probably be the most efficient.
Does anyone have any background information on this?
While others suggested filter, it creates a new list/array instance, which may cause other code that still references the old array to lose track of the change.
If you call array.remove in a loop, it scans the elements at the front of the array every time, which is not very performant.
IMO a better approach is to use a reverse while loop:
var i = array.length;
while(--i >= 0)
if(array[i] == element) array.splice(i, 1);
It doesn't make any difference. In fact, there's not even any difference in the generated code for the {}, 0 and true cases: they all end up generating {}, at least on the JS target.
However, you could run into issues if you have a large array with many duplicates: in that case, remove() would be called many times, and it has to iterate over the array each time (until it finds a match, that is). In that case, it's probably more efficient to use filter():
function removeAll(array:Array<String>, element:String):Array<String>
return array.filter(function(e) return e != element);
Personally, I also find this to be a bit more elegant than your while-loop with an empty body. But again, it depends on the use case: this does create a new array, and thus causes an allocation. Usually, that's not worth worrying about, but if you for instance do it in the update loop of a game, you might want to avoid it.
In terms of the expression part of the while loop, it seems that it's just set to empty braces ({}) when compiled, so it doesn't really matter what you do.
In terms of performance, a much better solution is Method 2 from the following:
class Test
{
    static function main()
    {
        var thing:Array<String> = new Array<String>();
        for (index in 0...1000)
        {
            thing.push("0");
            thing.push("1");
        }
        var copy1 = thing.copy();
        var copy2 = thing.copy();
        trace("epoch");
        // Method 1.
        while (copy1.remove("0")) {}
        trace("check");
        // Method 2.
        copy2 = [
            for (item in Lambda.filter(copy2, function(v) { return v != "0"; }))
                item
        ];
        trace("check");
    }
}
which can be seen [here](https://try.haxe.org/#D0468 "Try Haxe example."). For 200,000 one-character elements in an Array<String>, Method 2 takes 0.017s while Method 1 takes 44.544s.
For large arrays, would it be faster to use a temporary array and then assign that back after populating it (method3 in the try)?
OR
If you don't want to use a temp array, you can assign back and splice (method4 in the try)?
https://try.haxe.org/#5f80c
Both are more verbose code-wise as I set up vars, but on my Mac they seem faster at runtime. A summary of my method3 approach:
while( i < l ) { if( ( s = copy[ i++ ] ) != '0' ) arr[ j++ ] = s; }
copy = arr;
Am I missing something obvious against these approaches?

Difference between one pass (scan) and two passes (scans)

I had an interview a day ago.
The interviewer told me to "write a program to add a node at the end of a linked list".
I gave him a solution, but he told me to implement it in one pass (one scan).
Can anybody explain to me what one pass means, and how to tell whether a program is written in one pass or two passes?
Here is my code:
public void atLast(int new_data)
{
    Node new_node = new Node(new_data);
    if (head == null)
    {
        head = new Node(new_data);
        return;
    }
    new_node.next = null;
    Node last = head;
    while (last.next != null)
    {
        last = last.next;
    }
    last.next = new_node;
    return;
}
If that is the code you gave, the interviewer must have misread it, because it is a single pass.
In your case a "pass" is your while loop. It could also be done with recursion, a for loop, or any other construct that goes through the elements of the list (or any other collection of items).
In your code you run through the list of Nodes once and insert the element at the end. This is done in one loop, making it a single pass.
Now let's look at a case with two passes. Say, for example, you were asked to remove the element with the largest value and wrote something similar to this:
Node temp_node = head;
int index = 0;
int count = 0;
int max = 0;

// First pass: find the position of the node with the maximum value.
while (temp_node != null)
{
    if (temp_node.data > max)
    {
        index = count;
        max = temp_node.data;
    }
    count++;
    temp_node = temp_node.next;
}

// Second pass: walk the positions again until the one to remove is reached.
for (int i = 0; i < count; i++)
{
    if (i == index)
    {
        //Functionality to remove node.
    }
}
The first pass (while) detects the Node which has the maximum value. The second pass (for) removes this Node by looping through all the elements again until the correct one is found.
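For contrast, here is a sketch of how that same removal could be done in a single pass: while traversing, remember the node just before the current maximum, so no second walk is needed. It assumes a singly linked list with a Node class holding int data and Node next and a head field, none of which are shown in the snippets above:

void removeMax()
{
    if (head == null) return;      // empty list: nothing to remove

    Node beforeMax = null;         // node preceding the current maximum (null while head is the maximum)
    Node max = head;               // node with the largest value seen so far
    Node prev = head;
    Node curr = head.next;

    while (curr != null)           // the one and only pass over the list
    {
        if (curr.data > max.data)
        {
            max = curr;
            beforeMax = prev;
        }
        prev = curr;
        curr = curr.next;
    }

    if (beforeMax == null)
        head = head.next;          // the head held the maximum
    else
        beforeMax.next = max.next; // unlink the maximum from the middle or the end
}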
I'd imagine "two passes" here means that you iterated through the whole list twice in your code. You shouldn't need to do that to add a new node.

Create 3rd vector while looping through 2 others

I'm a total newbie in C++ and I need to solve a problem with vectors. What I need is to merge two existing vectors and create a third one. While I saw several answers, the difference here is that I need vector #3 (values3) to contain not all values, but only those which are in both vector #1 (values1) and vector #2 (values2). So, if the integer 2 is in vector 1 but not in vector 2, this number does not qualify. I have to use the function provided below. The commented lines are the ones I don't know what to write in; the other lines are working.
void CommonValues(vector<MainClass> & values1, vector<MainClass> & values2, vector<MainClass> & values3)
{
    MainClass Class;
    string pav;
    int kiek;
    vector<MainClass>::iterator iter3; // ?
    for (vector<MainClass>::iterator iter1 = values1.begin(); iter1 != values1.end(); iter1++)
    {
        for (vector<MainClass>::iterator iter2 = values2.begin(); iter2 != values2.end(); iter2++)
        {
            if (iter1 == iter2)
            {
                pav = iter2->TakePav();
                iter3->TakePav(pav); // ?
                kiek = iter1->TakeKiek() + iter2->TakeKiek();
                iter3->TakeKie(kiek); // ?
                iter3++; // ?
            }
        }
    }
}
You can sort values1 and values2, then use std::set_intersection: http://en.cppreference.com/w/cpp/algorithm/set_intersection
Your code as it stands won't work; among other problems, you are comparing an iterator into vector 1 with an iterator into vector 2, which doesn't make any sense. If you want to do it by looping, you should iterate through one vector and check whether the value, for example *iter1, is in the 2nd vector.

word distribution problem

I have a big file of words (~100 GB) and limited memory (4 GB). I need to calculate the word distribution from this file. Right now one option is to divide it into chunks, sort each chunk, and then merge them to calculate the word distribution. Is there any other way it can be done faster? One idea is to sample, but I'm not sure how to implement it so that it returns something close to the correct solution.
Thanks
You can build a trie structure where each leaf (and some inner nodes) contains the current count. As words share prefixes with each other, 4 GB should be enough to process 100 GB of data.
Naively, I would just build up a hash table until it hits a certain memory limit, then sort it in memory and write it out. Finally, you can do an n-way merge of the chunks. At most you will have around 100/4 chunks, but probably many fewer, provided some words are more common than others (and depending on how they cluster).
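As an illustration of the counting step only, here is a small Java sketch that streams one chunk line by line and counts words in a HashMap. The spilling of the table to disk once it grows too large, and the final n-way merge of the per-chunk files, are left out, and "chunk.txt" is just a placeholder file name:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Stream;

public class WordCount
{
    public static void main(String[] args) throws IOException
    {
        Map<String, Long> counts = new HashMap<>();

        // Stream the chunk line by line so only the hash table, not the whole file, lives in memory.
        try (Stream<String> lines = Files.lines(Paths.get("chunk.txt")))
        {
            lines.flatMap(line -> Stream.of(line.toLowerCase().split("\\s+")))
                 .filter(word -> !word.isEmpty())
                 .forEach(word -> counts.merge(word, 1L, Long::sum));
        }

        // In the full scheme this table would be sorted and written out whenever it hits the
        // memory limit, and the resulting per-chunk files merged afterwards.
        counts.forEach((word, count) -> System.out.println(word + "\t" + count));
    }
}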
Another option is to use a trie, which was built for this kind of thing: each character in the string becomes a branch in a 256-way tree, and at the leaves you have the counters. Look up the data structure on the web.
If you can pardon the pun, "trie" this:
public class Trie : Dictionary<char, Trie>
{
    public int Frequency { get; set; }

    public void Add(string word)
    {
        this.Add(word.ToCharArray());
    }

    private void Add(char[] chars)
    {
        if (chars == null || chars.Length == 0)
        {
            throw new System.ArgumentException();
        }
        var first = chars[0];
        if (!this.ContainsKey(first))
        {
            this.Add(first, new Trie());
        }
        if (chars.Length == 1)
        {
            this[first].Frequency += 1;
        }
        else
        {
            this[first].Add(chars.Skip(1).ToArray());
        }
    }

    public int GetFrequency(string word)
    {
        return this.GetFrequency(word.ToCharArray());
    }

    private int GetFrequency(char[] chars)
    {
        if (chars == null || chars.Length == 0)
        {
            throw new System.ArgumentException();
        }
        var first = chars[0];
        if (!this.ContainsKey(first))
        {
            return 0;
        }
        if (chars.Length == 1)
        {
            return this[first].Frequency;
        }
        else
        {
            return this[first].GetFrequency(chars.Skip(1).ToArray());
        }
    }
}
Then you can call code like this:
var t = new Trie();
t.Add("Apple");
t.Add("Banana");
t.Add("Cherry");
t.Add("Banana");
var a = t.GetFrequency("Apple"); // == 1
var b = t.GetFrequency("Banana"); // == 2
var c = t.GetFrequency("Cherry"); // == 1
You should be able to add code to traverse the trie and return a flat list of words and their frequencies.
If you find that this still blows your memory limit, then might I suggest that you "divide and conquer": maybe scan the source data for all the first characters, run the trie separately for each, and concatenate the results after all of the runs.
Do you know how many different words you have? If not a lot (e.g. a few hundred thousand), then you can stream the input, determine the words, and use a hash table to keep the counts. After the input is done, just traverse the result.
Just use a DBM file. It’s a hash on disk. If you use the more recent versions, you can use a B+Tree to get in-order traversal.
Why not use any relational DB? The procedure would be as simple as:
Create a table with the word and count.
Create an index on the word column. Some databases have a word index (e.g. Progress).
Do SELECT on this table with the word.
If word exists then increase counter.
Otherwise - add it to the table.
If you are using Python, you can implement __iter__ as a generator: it will read the file line by line and will not cause memory problems. You should not "return" the value but "yield" it.
Here is a sample that I used to read a file and get the vector values.
def __iter__(self):
    for line in open(self.temp_file_name):
        yield self.dictionary.doc2bow(line.lower().split())
