Quickselect algorithm condition - algorithm

So I am going through the Quick select algorithm from the CLRS book and I understand the whole concept of the algorithm. But one thing that I am not able to grasp is the initial condition they have at the top. Following is my implementation of the algorithm from the book:
private int quickSelect(int[] arr, int p, int r, int i){
if(r==p){
return arr[p];
}
int q=partition(arr,p,r);
if(i-1==q){
return arr[q];
}
else if(i-1<q){
return quickSelect(arr,p,q-1,i);
}
else{
return quickSelect(arr,q+1,r,i);
}
}
Here, why do we check if p==r and return arr[p]? The value of position i passed might not be the same as p/r. If they give a value higher than the array length, then this would not make sense. Instead of that, it would be better to throw an exception after checking it. I went through a lot of resources on the internet and some of them also use the same condition at the start.
The post "QuickSelect Algorithm" says that the condition checks whether only one element is present, but I am not sure how that works.
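One way to see why the p == r check is safe: if the caller guarantees 1 <= i <= arr.length, every recursive call keeps p <= i-1 <= r, so once the range has shrunk to a single element, that element has to be the answer. Below is a minimal sketch of that idea. The Lomuto-style partition and the wrapper method select are assumptions made for illustration (the question does not show its partition); the wrapper does the range check once and throws, as the question suggests.

```java
final class QuickSelectSketch {
    /** Returns the i-th smallest element (1-based). Validates i once, up front. */
    static int select(int[] arr, int i) {
        if (arr.length == 0 || i < 1 || i > arr.length) {
            throw new IllegalArgumentException("i must be in [1, " + arr.length + "]");
        }
        int[] copy = arr.clone(); // partitioning reorders elements, so work on a copy
        return quickSelect(copy, 0, copy.length - 1, i);
    }

    // Invariant: p <= i - 1 <= r, so a one-element range must hold the answer.
    private static int quickSelect(int[] arr, int p, int r, int i) {
        if (p == r) {
            return arr[p];
        }
        int q = partition(arr, p, r);
        if (i - 1 == q) {
            return arr[q];
        } else if (i - 1 < q) {
            return quickSelect(arr, p, q - 1, i);
        } else {
            return quickSelect(arr, q + 1, r, i);
        }
    }

    // Lomuto partition (assumed here): last element is the pivot,
    // returns the pivot's final absolute index.
    private static int partition(int[] arr, int p, int r) {
        int pivot = arr[r];
        int store = p;
        for (int j = p; j < r; j++) {
            if (arr[j] <= pivot) {
                swap(arr, store, j);
                store++;
            }
        }
        swap(arr, store, r);
        return store;
    }

    private static void swap(int[] arr, int a, int b) {
        int tmp = arr[a];
        arr[a] = arr[b];
        arr[b] = tmp;
    }
}
```

With the check in the wrapper, the recursive method never sees an out-of-range i, which is why the base case can return arr[p] without comparing it to i.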

Related

Finding Minimum Value of each point when interval with values are given(See Body)

This is from a contest.
Suppose I have M intervals of the form [L,R] (1 <= L, R <= N), each with a cost Ci.
These intervals don't have to be taken as a whole (we can split them).
After splitting them, I have to report the minimum value possible for each point: if i (1 <= i <= N) belongs to K intervals, I want the minimum of the costs of all intervals that contain i.
What am I doing? I tried to build a slightly modified segment tree with lazy propagation. Every segment is useless to me except segments of length one, because I need the minimum value at each point rather than over a segment. So I apply an update for each interval and then build my solution from the leaves.
It doesn't seem to work properly (it gives a wrong answer).
I just want to know whether I am totally wrong (so I can drop this approach) or not.
My Update function:
void Update(int L,int R,int qe ,int qr,int e,int idx)
{
if(Tree[idx].lazy!=INT_MAX)
{
Tree[idx].value=min(Tree[idx].value,Tree[idx].lazy);
if(L!=R)
{
Tree[rc(idx)].lazy=min(Tree[rc(idx)].lazy,Tree[idx].lazy);
Tree[lc(idx)].lazy=min(Tree[lc(idx)].lazy,Tree[idx].lazy);
}
Tree[idx].lazy=INT_MAX;
}
if(L>qr || qe>R)
return ;
if(L>=qe && qr>=R)
{
Tree[idx].value=min(Tree[idx].value,e);
if(L!=R)
{
Tree[rc(idx)].lazy=min(Tree[rc(idx)].lazy,e);
Tree[lc(idx)].lazy=min(Tree[lc(idx)].lazy,e);
}
return ;
}
Update(L,mid(L,R),qe,qr,e,lc(idx));
Update(mid(L,R)+1,R,qe,qr,e,rc(idx));
Tree[idx]=Merge(Tree[lc(idx)],Tree[rc(idx)]);
return ;
}
GET Function:
int Get(int L,int R,int qe,int idx)
{
if(Tree[idx].lazy!=INT_MAX)
{
Tree[idx].value=min(Tree[idx].value,Tree[idx].lazy);
if(L!=R)
{
Tree[rc(idx)].lazy=min(Tree[rc(idx)].lazy,Tree[idx].lazy);
Tree[lc(idx)].lazy=min(Tree[lc(idx)].lazy,Tree[idx].lazy);
}
Tree[idx].lazy=INT_MAX;
}
if(L==qe && qe==R)
return Tree[idx].value;
if(qe<=mid(L,R))
return Get(L,mid(L,R),qe,lc(idx));
else
return Get(mid(L,R)+1,R,qe,rc(idx));
}
Note that the actual problem requires much more than this. This only helps with the problem; it does not solve it.
Actually, my code does work and gives correct output. I later figured out that I was making a mistake somewhere else.
The explanation of my segment tree is as follows:
1) Build a tree with all values set to +INFINITY.
2) Whenever a range comes in, descend to the nodes covering it and mark their children as lazy. We do not overwrite the stored value; we take the min with the lazy value, because this is not an assignment but one more candidate value.
3) When relaxing a lazy node, do not overwrite the value either; take the min of the lazy parameter and the value.
4) Whenever you query a point, the lazy values are pushed down along the path and you get the correct output.
But I realised I can do it via simple brute force too, by maintaining one array, in O(N+M).
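Because only point queries are needed, the same "range min-update, point query" idea can also be written without any push-down: each node keeps the smallest cost applied to its whole segment, and a query takes the minimum along the root-to-leaf path. The following is a minimal sketch of that variant, not the code from the question; the class name MinCoverTree and its methods are made up for illustration, and positions are assumed to be 1-based with n >= 1.

```java
import java.util.Arrays;

final class MinCoverTree {
    private final int n;
    private final int[] tag; // tag[idx] = smallest cost applied to the node's whole segment

    MinCoverTree(int n) {
        this.n = n;
        this.tag = new int[4 * n];
        Arrays.fill(tag, Integer.MAX_VALUE); // "+INFINITY": no interval seen yet
    }

    /** Records an interval [l, r] (1-based, inclusive) with the given cost. */
    void update(int l, int r, int cost) {
        update(1, 1, n, l, r, cost);
    }

    private void update(int idx, int lo, int hi, int l, int r, int cost) {
        if (r < lo || hi < l) {
            return; // no overlap
        }
        if (l <= lo && hi <= r) {
            tag[idx] = Math.min(tag[idx], cost); // fully covered: keep the min, stop here
            return;
        }
        int mid = (lo + hi) >>> 1;
        update(2 * idx, lo, mid, l, r, cost);
        update(2 * idx + 1, mid + 1, hi, l, r, cost);
    }

    /** Minimum cost over all recorded intervals containing pos, or Integer.MAX_VALUE if none. */
    int get(int pos) {
        int idx = 1, lo = 1, hi = n;
        int best = tag[idx];
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (pos <= mid) {
                idx = 2 * idx;
                hi = mid;
            } else {
                idx = 2 * idx + 1;
                lo = mid + 1;
            }
            best = Math.min(best, tag[idx]);
        }
        return best;
    }
}
```

Each update and each query touches O(log N) nodes, so M updates followed by N point queries cost O((N + M) log N).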

Lazy Shuffle Algorithms

I have a large list of elements that I want to iterate in random order. However, I cannot modify the list and I don't want to create a copy of it either, because 1) it is large and 2) it can be expected that the iteration is cancelled early.
List<T> data = ...;
Iterator<T> shuffled = shuffle(data);
while (shuffled.hasNext()) {
T t = shuffled.next();
if (System.console().readLine("Do you want %s?", t).startsWith("y")) {
return t;
}
}
System.out.println("That's all");
return null; // nothing was chosen
I am looking for an algorithm where the code above would run in O(n) (and preferably require only O(log n) space), so caching the elements that were produced earlier is not an option. I don't care if the algorithm is biased (as long as it's not obvious).
(I use pseudo-Java in my question, but you can use other languages if you wish.)
Here is the best I got so far.
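Iterator<T> shuffle(final List<T> data) {
    // pick a random prime that does not divide data.size(), so the stride is coprime to it
    int p = data.size();
    while ((data.size() % p) == 0) p = randomPrime();
    final int prime = p;
    return new Iterator<T>() {
        int n = 0, i = 0;
        public boolean hasNext() { return i < data.size(); }
        public T next() {
            i++;
            // step by the prime and wrap around so the index stays in range
            n = (int) ((n + (long) prime) % data.size());
            return data.get(n);
        }
    };
}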
This iterates over all elements in O(n) time and constant space, but it is obviously biased, as it can produce at most data.size() permutations.
The easiest shuffling approaches I know of work with indices. If the List is not an ArrayList, you may end up with a very inefficient algorithm if you try to use one of the below (a LinkedList does have a get by ID, but it's O(n), so you'll end up with O(n^2) time).
If O(n) space is fine, which I'm assuming it's not, I'd recommend the Fisher-Yates / Knuth shuffle, it's O(n) time and is easy to implement. You can optimise it so you only need to perform a single operation before being able to get the first element, but you'll need to keep track of the rest of the modified list as you go.
My solution:
Ok, so this is not very random at all, but I can't see a better way if you want less than O(n) space.
It takes O(1) space and O(n) time.
There may be a way to use a little more space and get more random results, but I haven't figured that out yet.
It has to do with relatively prime (coprime) numbers. The idea is that, given two coprime numbers a (the generator) and b, when you loop through a % b, 2a % b, 3a % b, 4a % b, ..., you will see every integer 0, 1, 2, ..., b-2, b-1, and this will happen before you see any integer twice. Unfortunately I don't have a link to a proof (the wikipedia link may mention or imply it, I didn't check in too much detail).
I start off by increasing the length until we get a prime, since this implies that any smaller positive number will be coprime to it, which is a whole lot less limiting (values greater than or equal to the original length are simply skipped). Then I generate a random number and use it as the generator.
I'm iterating through and printing out all the values, but it should be easy enough to modify to generate the next one given the current one.
Note that I'm skipping 1 and len-1 with my nextInt, since these will produce 1,2,3,... and ...,3,2,1 respectively. You can include them, but probably not if the length is below a certain threshold.
You may also want to generate a random number to multiply the generator by (mod the length) to start from.
Java code:
static Random gen = new Random();
static void printShuffle(int len)
{
// get first prime >= len
int newLen = len-1;
boolean prime;
do
{
newLen++;
// prime check
prime = true;
for (int i = 2; prime && i < len; i++)
prime &= (newLen % i != 0);
}
while (!prime);
long val = gen.nextInt(len-3) + 2;
long oldVal = val;
do
{
if (val < len)
System.out.println(val);
val = (val + oldVal) % newLen;
}
while (oldVal != val);
}
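The answer mentions that the loop could be changed to produce the next value on demand. A rough sketch of that lazy form is below, under a few assumptions of mine: the class name PrimeStrideIndices is hypothetical, the stride is drawn from 1..modulus-1 (so the trivial orders are not excluded), and a random starting offset is added as the answer suggests. It yields list indices; the caller would pass each one to data.get(...).

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Random;

final class PrimeStrideIndices implements Iterator<Integer> {
    private final int len;      // number of indices to produce: 0 .. len-1
    private final int modulus;  // first prime >= max(len, 2)
    private final long stride;  // generator, coprime to the prime modulus
    private long current;       // current residue
    private int produced = 0;

    PrimeStrideIndices(int len, Random rng) {
        this.len = len;
        this.modulus = nextPrimeAtLeast(Math.max(len, 2));
        this.stride = 1 + rng.nextInt(modulus - 1); // in 1 .. modulus-1
        this.current = rng.nextInt(modulus);        // random starting offset
    }

    @Override
    public boolean hasNext() {
        return produced < len;
    }

    @Override
    public Integer next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        // Step until we land on a residue that is a valid index; residues in
        // len .. modulus-1 are skipped, exactly like the "val < len" test above.
        do {
            current = (current + stride) % modulus;
        } while (current >= len);
        produced++;
        return (int) current;
    }

    private static int nextPrimeAtLeast(int n) {
        for (int candidate = n; ; candidate++) {
            if (isPrime(candidate)) {
                return candidate;
            }
        }
    }

    private static boolean isPrime(int n) {
        if (n < 2) {
            return false;
        }
        for (long d = 2; d * d <= n; d++) {
            if (n % d == 0) {
                return false;
            }
        }
        return true;
    }
}
```

The state is three numbers, so space stays constant; each next() skips at most modulus - len residues in total across the whole iteration.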
This is an old thread, but in case anyone comes across this in future, a paper by Andrew Kensler describes a way to do this in constant time and constant space. Essentially, you create a reversible hash function, and then use it (and not an array) to index the list. Kensler describes a method for generating the necessary function, and discusses "cycle walking" as a way to deal with a domain that is not identical to the domain of the hash function. Afnan Enayet's summary of the paper is here: https://afnan.io/posts/2019-04-05-explaining-the-hashed-permutation/.
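To illustrate the "reversible function plus cycle walking" idea in general terms, here is a small sketch. It is not Kensler's hash: the mixing steps below are a much weaker stand-in that I chose only because each step is easy to see as a bijection on [0, 2^k). The class name CycleWalkPermutation and the method indexAt are hypothetical.

```java
final class CycleWalkPermutation {
    private final int n;      // size of the list being permuted
    private final long mask;  // 2^k - 1 for the smallest 2^k >= n
    private final long key;   // per-shuffle key derived from the seed
    private final int shift;

    CycleWalkPermutation(int n, long seed) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        this.n = n;
        long size = 1;
        int bits = 0;
        while (size < n) {
            size <<= 1;
            bits++;
        }
        this.mask = size - 1;
        this.key = seed & mask;
        this.shift = Math.max(1, bits / 2);
    }

    // Each step is invertible on [0, 2^k), so the composition is a permutation of it.
    private long mix(long x) {
        x = (x * 0x9E3779B1L) & mask; // odd multiplier: bijection modulo 2^k
        x ^= x >>> shift;             // xor-shift: invertible
        x = (x + key) & mask;         // adding a constant modulo 2^k: bijection
        return x;
    }

    /** Maps position i in the shuffled order to an index into the original list. */
    int indexAt(int i) {
        if (i < 0 || i >= n) {
            throw new IndexOutOfBoundsException("i = " + i);
        }
        long x = i;
        // Cycle walking: re-apply the permutation until the value falls inside [0, n).
        do {
            x = mix(x);
        } while (x >= n);
        return (int) x;
    }
}
```

Since 2^k is less than 2n, fewer than half of the domain values are out of range, so on average only a small number of re-applications is expected per lookup. The quality of the shuffle depends entirely on the mixing function, which is where the paper's construction matters.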
You may try using a buffer to do this. Iterate through a limited set of data and put it in a buffer. Extract random values from that buffer and send it to output (or wherever you need it). Iterate through the next set and keep overwriting this buffer. Repeat this step.
You'll end up with n + n operations, which is still O(n). Unfortunately, the result will not be actually random. It will be close to random if you choose your buffer size properly.
On a different note, check these two: Python - run through a loop in non linear fashion, random iteration in Python
Perhaps there's a more elegant algorithm to do this better. I'm not sure though. Looking forward to other replies in this thread.
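The buffer idea from this answer could look roughly like the sketch below: read a chunk into a fixed-size buffer, shuffle only that chunk, hand it out, then refill. The class name ChunkShuffleIterator and the capacity parameter are assumptions for illustration. As the answer says, the result is only locally random, since an element never moves outside its own chunk.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Random;

final class ChunkShuffleIterator<T> implements Iterator<T> {
    private final Iterator<T> source;
    private final int capacity;
    private final Random rng;
    private final List<T> buffer = new ArrayList<>();
    private int pos = 0; // next unread slot in the buffer

    ChunkShuffleIterator(Iterator<T> source, int capacity, Random rng) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be at least 1");
        }
        this.source = source;
        this.capacity = capacity;
        this.rng = rng;
    }

    @Override
    public boolean hasNext() {
        return pos < buffer.size() || source.hasNext();
    }

    @Override
    public T next() {
        if (pos == buffer.size()) {
            if (!source.hasNext()) {
                throw new NoSuchElementException();
            }
            // Overwrite the buffer with the next chunk and shuffle just that chunk.
            buffer.clear();
            while (buffer.size() < capacity && source.hasNext()) {
                buffer.add(source.next());
            }
            Collections.shuffle(buffer, rng);
            pos = 0;
        }
        return buffer.get(pos++);
    }
}
```

Space is bounded by the capacity, and stopping early only costs the chunk that was already read.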
This is not a perfect answer to your question, but perhaps it's useful.
The idea is to use a reversible random number generator and the usual array-based shuffling algorithm done lazily: to get the i'th shuffled item, swap a[i] with a randomly chosen a[j] where j is in [i..n-1], then return a[i]. This can be done in the iterator.
After you are done iterating, reset the array to original order by "unswapping" using the reverse direction of the RNG.
The unshuffling reset will never take longer than the original iteration, so it doesn't change asymptotic cost. Iteration is still linear in the number of iterations.
How to build a reversible RNG? Just use an encryption algorithm. Encrypt the previously generated pseudo-random value to go forward, and decrypt to go backward. If you have a symmetric encryption algorithm, then you can add a "salt" value at each step forward to prevent a cycle of two and subtract it for each step backward. I mention this because RC4 is simple and fast and symmetric. I've used it before for tasks like this. Encrypting 4-byte values then computing mod to get them in the desired range will be quick indeed.
You can press this into the Java iterator pattern by extending Iterator to allow resets. See below. Usage will look like:
ShuffledList<Integer> lst = new ShuffledList<>();
... build the list with the usual operations
ResetableInterator<Integer> i = lst.iterator();
while (i.hasNext()) {
    int val = i.next();
    ... use the randomly selected value
    if (anyConditionAtAll) break;
}
i.reset(); // Unshuffle the array
I know this isn't perfect, but it will be fast and give a good shuffle. Note that if you don't reset, the next iterator will still be a new random shuffle, but the original order will be lost forever. If the loop body can generate an exception, you'd want the reset in a finally block.
class ShuffledList<T> extends ArrayList<T> implements Iterable<T> {
    @Override
    public Iterator<T> iterator() {
        return null;
    }
    public interface ResetableInterator<T> extends Iterator<T> {
        public void reset();
    }
    class ShufflingIterator<T> implements ResetableInterator<T> {
        int mark = 0;
        @Override
        public boolean hasNext() {
            return true;
        }
        @Override
        public T next() {
            return null;
        }
        @Override
        public void remove() {
            throw new UnsupportedOperationException("Not supported.");
        }
        @Override
        public void reset() {
            throw new UnsupportedOperationException("Not supported yet.");
        }
    }
}
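The skeleton above leaves next() and reset() as stubs. One possible way to fill them in is sketched here, with a substitution that should be stated plainly: instead of RC4 or another cipher, it uses a 64-bit linear congruential generator, which is reversible because its odd multiplier has an inverse modulo 2^64. The class name ReversibleShuffle is hypothetical, the statistical quality is only that of a plain LCG, and the list must support set().

```java
import java.util.Collections;
import java.util.List;
import java.util.NoSuchElementException;

final class ReversibleShuffle<T> {
    // Constants of Knuth's MMIX LCG; A is odd, so it is invertible modulo 2^64.
    private static final long A = 6364136223846793005L;
    private static final long C = 1442695040888963407L;
    private static final long A_INV = inverse(A);

    private final List<T> data;
    private long state;
    private int i = 0; // number of items handed out so far

    ReversibleShuffle(List<T> data, long seed) {
        this.data = data;
        this.state = seed;
    }

    // Newton iteration for the inverse of an odd number modulo 2^64.
    private static long inverse(long a) {
        long inv = a; // already correct modulo 8
        for (int k = 0; k < 5; k++) {
            inv *= 2 - a * inv; // doubles the number of correct low bits
        }
        return inv;
    }

    // Index j in [i, n-1] derived from the current state.
    private int pick() {
        return i + (int) ((state >>> 33) % (data.size() - i));
    }

    boolean hasNext() {
        return i < data.size();
    }

    T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        state = state * A + C;              // step the generator forward
        Collections.swap(data, i, pick());  // lazy Fisher-Yates step
        return data.get(i++);
    }

    /** Undoes all swaps in reverse order, restoring the original list order. */
    void reset() {
        while (i > 0) {
            i--;
            Collections.swap(data, i, pick()); // same j as the forward step
            state = (state - C) * A_INV;       // step the generator backward
        }
    }
}
```

Because reset() replays exactly the swaps that were made, its cost is proportional to the number of items consumed, matching the claim that unshuffling never takes longer than the iteration itself.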

Why do the following two duplicate finder algorithms have different time complexity?

I was reading this question. The selected answer contains the following two algorithms. I don't understand why the first one's time complexity is O(ln(n)). In the worst case, if the array doesn't contain any duplicates, it will loop n times, and so will the second one. Am I wrong or am I missing something? Thank you.
1) A faster (in the limit) way
Here's a hash based approach. You gotta pay for the autoboxing, but it's O(ln(n)) instead of O(n^2). An enterprising soul would go find a primitive int-based hash set (Apache or Google Collections has such a thing, methinks.)
boolean duplicates(final int[] zipcodelist)
{
Set<Integer> lump = new HashSet<Integer>();
for (int i : zipcodelist)
{
if (lump.contains(i)) return true;
lump.add(i);
}
return false;
}
2)Bow to HuyLe
See HuyLe's answer for a more or less O(n) solution, which I think needs a couple of add'l steps:
static boolean duplicates(final int[] zipcodelist) {
    final int MAXZIP = 99999;
    boolean[] bitmap = new boolean[MAXZIP+1];
    java.util.Arrays.fill(bitmap, false);
    for (int item : zipcodelist) {
        if (!bitmap[item]) bitmap[item] = true;
        else return true;
    }
    return false;
}
The first solution should have expected complexity of O(n), since the whole zip code list must be traversed, and processing each zip code is O(1) expected time complexity.
Even taking into consideration that insertion into a HashMap may trigger a re-hash, the complexity is still amortized O(1). This is a bit of a non sequitur, since there may be no relation between Java's HashMap and the assumption in the link, but it is there to show that it is possible.
From HashSet documentation:
This class offers constant time performance for the basic operations (add, remove, contains and size), assuming the hash function disperses the elements properly among the buckets.
It's the same for the second solution, which is correctly analyzed: O(n).
(Just an off-topic note, BitSet is faster than array, as seen in the original post, since 8 booleans are packed into 1 byte, which uses less memory).
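For reference, the BitSet variant mentioned in the note could look like the sketch below. The class and method names are hypothetical, and the 0..99999 range is carried over from the MAXZIP constant in the quoted code. It is still a single pass with constant work per element, so O(n) overall.

```java
import java.util.BitSet;

final class ZipDuplicates {
    static final int MAX_ZIP = 99999; // assumed upper bound, as in the quoted code

    /** Returns true if any zip code occurs more than once. Codes must be non-negative. */
    static boolean hasDuplicates(int[] zipCodes) {
        BitSet seen = new BitSet(MAX_ZIP + 1); // one bit per possible zip code
        for (int zip : zipCodes) {
            if (seen.get(zip)) {
                return true; // second occurrence found
            }
            seen.set(zip);
        }
        return false;
    }
}
```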

How to find the rank of a node in an AVL tree?

I need to implement two rank queries [rank(k) and select(r)]. But before I can start on this, I need to figure out how the two functions work.
As far as I know, rank(k) returns the rank of a given key k, and select(r) returns the key of a given rank r.
So my questions are:
1.) How do you calculate the rank of a node in an AVL(self balancing BST)?
2.) Is it possible for more than one key to have the same rank? And if so, what would select(r) return?
I'm going to include a sample AVL tree which you can refer to if it helps answer the question.
Thanks!
Your question really boils down to: "how is the term 'rank' normally defined with respect to an AVL tree?" (and, possibly, how is 'select' normally defined as well).
At least as I've seen the term used, "rank" means the position among the nodes in the tree -- i.e., how many nodes are to its left. You're typically given a pointer to a node (or perhaps a key value) and you need to count the number of nodes to its left.
"Select" is basically the opposite -- you're given a particular rank, and need to retrieve a pointer to the specified node (or the key for that node).
Two notes: First, since neither of these modifies the tree at all, it makes no real difference what form of balancing is used (e.g., AVL vs. red/black); for that matter a tree with no balancing at all is equivalent as well. Second, if you need to do this frequently, you can improve speed considerably by adding an extra field to each node recording how many nodes are to its left.
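As a rough illustration of the second note, here is a sketch that stores a subtree size in each node (a common variant of "how many nodes are to its left") and uses it for both rank and select. It is deliberately a plain, unbalanced BST to keep it short; in an AVL tree the size field would also need to be recomputed after each rotation. The class name SizedBst is hypothetical, ranks are 1-based, and duplicate keys are ignored.

```java
final class SizedBst {
    private static final class Node {
        final int key;
        int size = 1; // number of nodes in the subtree rooted here
        Node left, right;

        Node(int key) {
            this.key = key;
        }
    }

    private Node root;

    private static int size(Node n) {
        return n == null ? 0 : n.size;
    }

    void insert(int key) {
        root = insert(root, key);
    }

    private static Node insert(Node n, int key) {
        if (n == null) {
            return new Node(key);
        }
        if (key < n.key) {
            n.left = insert(n.left, key);
        } else if (key > n.key) {
            n.right = insert(n.right, key);
        }
        n.size = 1 + size(n.left) + size(n.right); // keep the extra field up to date
        return n;
    }

    /** 1-based rank of key, or 0 if the key is not in the tree. */
    int rank(int key) {
        int smaller = 0; // keys known to be smaller than the target
        Node n = root;
        while (n != null) {
            if (key < n.key) {
                n = n.left;
            } else if (key > n.key) {
                smaller += size(n.left) + 1; // left subtree and n itself are smaller
                n = n.right;
            } else {
                return smaller + size(n.left) + 1;
            }
        }
        return 0;
    }

    /** Key with the given 1-based rank. */
    int select(int r) {
        Node n = root;
        while (n != null) {
            int leftSize = size(n.left);
            if (r == leftSize + 1) {
                return n.key;
            } else if (r <= leftSize) {
                n = n.left;
            } else {
                r -= leftSize + 1; // skip the left subtree and n
                n = n.right;
            }
        }
        throw new IllegalArgumentException("rank out of range");
    }
}
```

With distinct keys every rank is unique, so select(r) always has exactly one answer; both operations take time proportional to the height of the tree.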
Rank is the number of nodes in the Left sub tree plus one, and is calculated for every node. I believe rank is not a concept specific to AVL trees - it can be calculated for any binary tree.
Select is just opposite to rank. A rank is given and you have to return a node matching that rank.
The following code will perform rank calculation:
int NumeberofNodeInTree(struct TreeNode *Node); /* forward declaration, used by InitRank */
void InitRank(struct TreeNode *Node)
{
    if(!Node)
    {
        return;
    }
    else
    {
        /* rank = size of the left subtree plus one */
        Node->rank = 1 + NumeberofNodeInTree(Node->LChild);
        InitRank(Node->LChild);
        InitRank(Node->RChild);
    }
}
int NumeberofNodeInTree(struct TreeNode *Node)
{
    if(!Node)
    {
        return 0;
    }
    else
    {
        return(1+NumeberofNodeInTree(Node->LChild)+NumeberofNodeInTree(Node->RChild));
    }
}
Here is the code I wrote, which worked fine for an AVL tree, to get the rank of a particular value. The only difference is that you used a node as the parameter and I used a key. You can adapt it as you see fit. Sample code:
public int rank(int data){
return rank(data,root);
}
private int rank(int data, AVLNode r){
int rank=1;
while(r != null){
if(data<r.data)
r = r.left;
else if(data > r.data){
rank += 1+ countNodes(r.left);
r = r.right;
}
else{
r.rank=rank+countNodes(r.left);
return r.rank;
}
}
return 0;
}
[N.B.] If you want ranks to start from 0, initialize the variable rank to 0.
You also need to have implemented the method countNodes() for this code to run.

Sudoku Solver by Backtracking not working

Assuming a two dimensional array holding a 9x9 sudoku grid, where is my solve function breaking down? I'm trying to solve this using a simple backtracking approach. Thanks!
bool solve(int grid[9][9])
{
int i,j,k;
bool isSolved = false;
if(!isSolved(grid))
isSolved = false;
if(isSolved)
return isSolved;
for(i=0; i<9; i++)
{
for(j=0; j<9; j++)
{
if(grid[i][j] == 0)
{
for(k=1; k<=9; k++)
{
if(legalMove(grid,i,j,k))
{
grid[i][j] = k;
isSolved = solve(grid);
if (isSolved)
return true;
}
grid[i][j] = 0;
}
isSolved = false;
}
}
}
return isSolved;
}
Even after changing the isSolved issues, my solution seems to break down into an infinite loop. It appears that I am missing some base-case step, but I'm not sure where or why. I have looked at similar solutions and still can't identify the issue. I'm just trying to create a basic solver, no need to go for efficiency. Thanks for the help!
Yea your base case is messed up. In recursive functions base cases should be handled at the start. You got
bool isSolved = false;
if(!isSolved(grid))
isSolved = false;
if(isSolved)
return isSolved;
notice your isSolved variable can never be set to true, hence your code
if(isSolved)
return isSolved;
is irrelevant.
Even if you fix this, it's going to feel like an infinite loop even though it is finite. This is because your algorithm has a possible total of 9*9*9 = 729 cases to check every time it calls solve. Entering this function n times may require up to 729^n cases to be checked. It obviously won't check that many cases, because it will hit dead ends when a placement is illegal, but who's to say that 90% of the arrangements of the possible numbers don't result in cases where all but one number fit legally? Moreover, even if you were to check k cases on average, where k is a small number (k<=10), this would still blow up (run time of k^n).
The trick is to "try" placing numbers where they have a high probability of being the actual correct placement. Probably the simplest way I can think of to do this is a constraint satisfaction solver, or a search algorithm with a heuristic (like A*).
I actually wrote a sudoku solver based on a constraint satisfaction solver and it would solve 100x100 sudokus in less than a second.
If by some miracle the "brute force" backtracking algorithm works well for you in the 9x9 case, try larger grids; you will quickly see a deterioration in run time.
I'm not bashing the backtracking algorithm; in fact I love it. It's been shown time and time again that backtracking, if implemented correctly, can be just as efficient as dynamic programming. However, in your case you aren't implementing it correctly. You are brute-forcing it; you might as well make your code non-recursive, since it will accomplish the same thing.
You refer to isSolved as both a function and a boolean variable.
I don't think this is legal, and it's definitely not smart.
Your functions should have distinct names from your variables.
It seems that regardless of whether or not it is a legal move, you are assigning "0" to the square, with that "grid[i][j] = 0;" line. Maybe you meant to put "else" and THEN "grid[i][j] = 0;" ?
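Putting the points from these answers together, a basic backtracking solver might look like the sketch below. It is written in Java rather than the question's language, the class name SudokuSketch is hypothetical, and legalMove is my own guess at what the question's helper does. The key differences from the posted code are that the "no empty cell left" case returns true, a cell is reset only after a failed attempt, and the function returns false as soon as no digit fits the first empty cell instead of moving on to later cells. It assumes the given clues do not already conflict.

```java
final class SudokuSketch {
    /** Fills the grid in place (0 = empty). Returns true if a solution was found. */
    static boolean solve(int[][] grid) {
        for (int row = 0; row < 9; row++) {
            for (int col = 0; col < 9; col++) {
                if (grid[row][col] != 0) {
                    continue;
                }
                for (int value = 1; value <= 9; value++) {
                    if (legalMove(grid, row, col, value)) {
                        grid[row][col] = value;
                        if (solve(grid)) {
                            return true;
                        }
                        grid[row][col] = 0; // undo only after a failed attempt
                    }
                }
                return false; // no digit fits the first empty cell: backtrack
            }
        }
        return true; // base case: no empty cell left
    }

    /** True if value does not already appear in the cell's row, column, or 3x3 box. */
    static boolean legalMove(int[][] grid, int row, int col, int value) {
        int boxRow = row - row % 3;
        int boxCol = col - col % 3;
        for (int k = 0; k < 9; k++) {
            if (grid[row][k] == value || grid[k][col] == value) {
                return false;
            }
            if (grid[boxRow + k / 3][boxCol + k % 3] == value) {
                return false;
            }
        }
        return true;
    }
}
```

Returning false right after the digit loop is what keeps the search finite in practice: each call commits to exactly one empty cell, so the recursion depth is bounded by the number of empty cells.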

Resources