LSD Radix Sort for Integers - ruby

I'm having trouble wrapping my head around using radix sort for a group of fixed-length integers. In my attempt below to implement least-significant-digit radix sort, I have a function called num_at which returns digit d of the number num. The code sets w = 2, where w represents the length of each number; essentially, then, this code is written for two-digit numbers, as my input below shows.
I modeled this off of key-indexed counting for each digit, but I'm getting an output of [0, 12, 32, 32, 44, 0] and frankly having a tough time following why. This is my input: [32, 12, 99, 44, 77, 12]. I initialized the count array with 99 slots because of all the possible values there can be between 0 and 99, in accordance with key-indexed counting. Does anything here immediately jump out as incorrect? Also, is my num_at method the right way to do this, or is there a better way?
def num_at(num, d)
  return num if num / 10 == 0
  a = num
  # return the dth digit of num
  counter = 0
  until counter == d
    a /= 10
    counter += 1
  end
  a % 10
end
def radix_sort(arr)
  w = 2
  # count_arr can have any possible number from 0-99
  aux = Array.new(arr.length) { 0 }
  d = w - 1
  while d >= 0
    count = Array.new(99) { 0 }
    i = 0
    # create freq arr
    while i < arr.length
      count[num_at(arr[i], d) + 1] += 1 # offset by 1
      i += 1
    end
    # compute cumulates
    i = 0
    while i < count.length - 1
      count[i + 1] += count[i]
      i += 1
    end
    z = 0
    # populate aux arr
    while z < arr.length
      aux[num_at(arr[z], d)] = arr[z]
      count[num_at(arr[z], d)] += 1
      z += 1
    end
    # override original arr
    z = 0
    while z < arr.length
      arr[z] = aux[z]
      z += 1
    end
    d -= 1
  end
  arr
end
This is functioning Java code for LSD radix sort, taken from a textbook, that I'm trying to reimplement in Ruby:
public static void lsd(String[] a)
{
    int N = a.length;
    int W = a[0].length();
    String[] temp = new String[N];
    for (int d = W - 1; d >= 0; d--)
    {
        int[] count = new int[R + 1];   // R is the radix (256 in the slides)
        for (int i = 0; i < N; i++)
            count[a[i].charAt(d) + 1]++;
        for (int k = 1; k < R; k++)
            count[k] += count[k - 1];
        for (int i = 0; i < N; i++)
            temp[count[a[i].charAt(d)]++] = a[i];
        for (int i = 0; i < N; i++)
            a[i] = temp[i];
    }
}
and the pseudocode for key-indexed counting (which is repeated for every character):
Task: sort an array a[] of N integers between 0 and R-1
Plan: produce sorted result in array temp[]
1. Count frequencies of each letter using key as index
2. Compute frequency cumulates
3. Access cumulates using key as index to find record positions
4. Copy back into original array
LSD radix sort: consider characters d from right to left; stably sort using the dth character as the key via key-indexed counting.
The above java code / pseudocode information is pulled from this link: http://www.cs.princeton.edu/courses/archive/spring07/cos226/lectures/11RadixSort.pdf
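For reference, here is a minimal corrected sketch (written in Python for compactness; it ports line-for-line to the Ruby above). My reading of the two bugs: each digit pass needs only R + 1 = 11 counters (digits 0-9 plus the +1 offset), not 99, and the cumulative count, not the digit value itself, is what indexes into aux:
def num_at(num, d):
    return (num // 10 ** d) % 10          # a simpler way to grab digit d

def radix_sort(arr, w=2):
    for d in range(w - 1, -1, -1):        # least significant digit first
        count = [0] * 11                  # R = 10 digit values, offset by 1
        for x in arr:
            count[num_at(x, d) + 1] += 1  # frequencies
        for k in range(10):
            count[k + 1] += count[k]      # cumulates
        aux = [0] * len(arr)
        for x in arr:
            aux[count[num_at(x, d)]] = x  # stable placement via the cumulate
            count[num_at(x, d)] += 1
        arr[:] = aux
    return arr

print(radix_sort([32, 12, 99, 44, 77, 12]))  # => [12, 12, 32, 44, 77, 99]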

Related

Why second version of dynamic programming is wrong

Given an array of positive integers, like [2,19,6,16,5,10,7,4,11,6], I wish to find the biggest subset sum attainable from the array such that the sum is divisible by 3. I'm trying to solve it using dynamic programming: let dp[i][j] be the biggest sum attainable up to index i in the array with remainder j, which is 0, 1, or 2 since I am looking for something divisible by 3.
I have the two implementations below:
int n = nums.length;
int[][] dp = new int[n + 1][3];
dp[0][0] = 0;
dp[0][1] = Integer.MIN_VALUE;
dp[0][2] = Integer.MIN_VALUE;
for (int i = 1; i <= n; i++) {
    for (int j = 0; j < 3; j++) {
        int remain = nums[i - 1] % 3;
        int remainder = (j + 3 - remain) % 3;
        dp[i][j] = Math.max(dp[i - 1][remainder] + nums[i - 1], dp[i - 1][j]);
    }
}
return dp[n][0];
int n = nums.length;
int[][] dp = new int[n + 1][3];
dp[0][0] = nums[0] % 3 == 0 ? nums[0] : Integer.MIN_VALUE;
dp[0][1] = nums[0] % 3 == 1 ? nums[0] : Integer.MIN_VALUE;
dp[0][2] = nums[0] % 3 == 2 ? nums[0] : Integer.MIN_VALUE;
for (int i = 1; i < n; i++) {
    for (int j = 0; j < 3; j++) {
        int remain = nums[i] % 3;
        int remainder = (j + 3 - remain) % 3;
        dp[i][j] = Math.max(dp[i - 1][remainder] + nums[i], dp[i - 1][j]);
    }
}
return dp[n - 1][0] == Integer.MIN_VALUE ? 0 : dp[n - 1][0];
Both implementations are based on the fact that I either add nums[i] or not, adding nums[i] to the table entry with the corresponding remainder before/after nums[i] is included, much like knapsack DP. The first version passes all test cases, but the second fails some of them: for [2,19,6,16,5,10,7,4,11,6] it gives 81 instead of the correct answer 84. Can anyone explain why the second version is wrong?
The first version is calculating the largest subset sum divisible by 3; the second version calculates the largest sum divisible by 3 of a subset that includes nums[0], the first element.
The sole difference between the two versions is the base case for the dynamic programming. The first version has the correct base case: after processing zero elements, the only subset sum attainable is zero. In the second version, the base case covers the first element and implies that after processing one element, the only attainable subset sum is the one containing that element; all future subset sums are then forced to use that element.
Try running the code on the array [1, 3]. The second version will return zero, because it does not consider subsets without 1.
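For reference, a minimal Python sketch of the correct formulation (my own rendering, assuming the same recurrence): the base case [0, -inf, -inf] encodes that the empty subset, with sum 0, is always available.
def max_sum_div3(nums):
    NEG = float('-inf')
    dp = [0, NEG, NEG]        # after zero elements, only sum 0 is reachable
    for x in nums:
        prev = dp[:]
        for j in range(3):
            # either skip x, or extend a subset whose remainder fits
            dp[j] = max(prev[j], prev[(j - x) % 3] + x)
    return dp[0]

print(max_sum_div3([2, 19, 6, 16, 5, 10, 7, 4, 11, 6]))  # => 84
print(max_sum_div3([1, 3]))                              # => 3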

How can we output the indices j and k that identify the maximum subarray

Maximum Subarray Problem
Given an array of n integers, find the subarray A[j:k] that maximizes the sum.
Requirement: how can we output the indices j and k that identify the maximum subarray A[j:k]?
Please help me out, thanks.
You could simply use the most popular algorithm for this, Kadane's algorithm.
As the algorithm requires, I assume a non-empty array is passed to it. I track the indices where the maximum sum begins and ends using the variables prev and curr, committing a tentative start s only when a new maximum is found:
int maxsubarray(int arr[])
{
    int maxsofar = arr[0], sum = arr[0], s = 0, prev = 0, curr = 0;
    for i = 1 to arr.size - 1
    {
        // restarting at i beats extending a negative running sum
        if (arr[i] > sum + arr[i]) s = i;
        // maximum sum ending at this point in the array
        sum = max(arr[i], sum + arr[i]);
        // if this is better than what we have so far, commit the start index
        if (sum > maxsofar)
        {
            maxsofar = sum;
            prev = s;
            curr = i;
        }
    }
    print "start index of sum:", prev
    print "end index of sum:", curr
    // return the maximum sum found so far
    return maxsofar;
}
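For completeness, a runnable Python rendering of the pseudocode above (variable names are mine): the tentative start s is committed to the reported start index only when a new best sum is found, which keeps the two printed indices consistent with each other.
def max_subarray(arr):
    best = running = arr[0]
    s = start = end = 0
    for i in range(1, len(arr)):
        if arr[i] > running + arr[i]:   # running sum went negative: restart here
            s = i
        running = max(arr[i], running + arr[i])
        if running > best:              # new best: commit the tentative start
            best, start, end = running, s, i
    return best, start, end

print(max_subarray([-2, 1, -3, 4, -1, 2, 1, -5, 4]))  # => (6, 3, 6)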
The first solution that came to my mind was: instead of storing a single value in M, use an object like this:
struct M {
    float sum;
    int start_index;
    int end_index;
};
Now change your algorithm as below to work with M objects:
Algorithm MaxSubFastest2(A):
    M[0] = {0, 1, 0}
    for t in 1 to n do:
        if M[t-1].sum + A[t] > 0:
            M[t] = {M[t-1].sum + A[t], M[t-1].start_index, M[t-1].end_index + 1}
        else:
            M[t] = {0, t + 1, t}
    m = 0
    start_index = 0
    end_index = 0
    for t in 1 to n do:
        if M[t].sum > m:
            m = M[t].sum
            start_index = M[t].start_index
            end_index = M[t].end_index
    if start_index <= end_index:
        subarray exists => return m, start_index, end_index
    else:
        subarray does not exist => error
This algorithm has the same time complexity as the one above, but uses extra memory.
A more memory-optimized version is here:
Algorithm MaxSubFastest3(A):
    max_so_far = 0
    max_ending_here = 0
    start = 0
    end = 0
    s = 0
    for i in 1 to n:
        max_ending_here += A[i]
        if max_so_far < max_ending_here:
            max_so_far = max_ending_here
            start = s
            end = i
        if max_ending_here < 0:
            max_ending_here = 0
            s = i + 1
    output max_so_far, start, end
This takes the same time and memory as the algorithm in the question.
You can find other variations of the same algorithm around the web.
Also, note that the algorithm given in the question always produces 0 as the maximum sum for a list made up entirely of negative numbers; I wonder if that is the expected behavior.

Smallest monetary amount that can be obtained using only coins of specified denominations that exceeds a threshold

In other words, given a set of n positive integers A and a threshold B, I want to find the smallest C so that:
C > B
C = A[1] * k[1] + A[2] * k[2] + ... + A[n] * k[n], k[i] are integers >= 0
As an example, if A = { 6, 11, 16 } then the values we can obtain are { 0, 6, 11, 12, 16, 17, 18, 22, 23, 24, 27, 28, 29, 30, 32, ... }, so if B = 14 then C would be 16; B = 22 => C = 23; B = 18 => C = 22.
This problem was given with these constraints: 2 < n < 5000, 0 < A[i] < 50000, and 1 < B < 10^9 (this is why I got stuck). Also, you had to compute, for an array of B values of size < 1000, the corresponding array of C values (but this may not matter). The algorithm should run in under 0.3 s in C++.
An algorithm like the one described here solves it, but it is not fast enough: https://www.geeksforgeeks.org/dynamic-programming-set-7-coin-change/
I calculate the table up to B + Amin because if k is the largest integer with Amin * k <= B, then B < Amin * (k + 1) <= B + Amin, so a reachable value always exists in that range.
Here's the algorithm (in pseudo C++):
int n, A[n], B;
int Amin; // smallest number in A
// table[i] tells us whether the amount i can be obtained
bool table[B + Amin + 1];
table[0] = true;
for (int i = 0; i < n; ++i)
{
    int x = A[i]; // current number / denomination
    for (int j = x; j <= B + Amin; ++j)
        if (table[j - x])
            table[j] = true;
}
// now we can do something like this:
int result = B + 1;
while (!table[result])
    ++result;
This algorithm has a complexity of O(n*B) and I'm looking for something independent of B (or maybe O(log(B)) or O(sqrt(B))).
Note: if we make the first requirement C >= B instead, the problem doesn't change (just add 1 to B), and we can phrase it like this: if we have specific coins or banknotes (an infinite supply of each) and want to purchase something with them, what amount can we pay so that the cashier has to give back minimal change?
Things that I suspect may help:
https://en.wikipedia.org/wiki/Coin_problem
If the greatest common divisor gcd(x, y) = 1, then anything higher than xy − x − y can be obtained using x and y.
https://en.wikipedia.org/wiki/Extended_Euclidean_algorithm
https://en.wikipedia.org/wiki/Subset_sum_problem
Edit: added example and note.
I don't think you can do better than O(n*B), because the Frobenius number (the value above which all amounts can be built from the given denominations) of 49999 and 49998 is 2499750005, a lot bigger than 10^9, so for at least some inputs you really have to fill the table all the way up. If gcd(A) > 1 the Frobenius number doesn't exist, but you can handle that case by dividing all A[i] and B (rounding down) by gcd(A) and multiplying the resulting C by gcd(A) to get the final result.
There is still a lot of room for improvement in your pseudocode: you look at every denomination at almost all of the B + Amin table positions, and you set table values to true multiple times.
The standard implementation would look something like this:
sort(A);
table[0] = true;
for (int i = A[0]; i <= B + A[0]; i++)
    for (int j = 0; j < n && A[j] <= i; j++)
        if (table[i - A[j]]) {
            table[i] = true;
            break;
        }
This is already a little better (note the break). I call it the backwards implementation, because from each table position you look back to check whether some already-reachable value lies at the distance of one of the given denominations. You could also introduce a counter for the number of consecutive values set to true (increase it when you set a table value to true, reset it when a value couldn't be built): as soon as the counter reaches A[0], every larger amount is reachable as well, so you can return B + 1 immediately.
Maybe you could get even better results with a forward implementation, because the table can be very sparse; here table positions that are still false are skipped, instead of denominations:
table[0] = true;
for (int i = 0; i <= B + Amin; i++)
    if (table[i])
        for (int j = 0; j < n; j++)
            if (i + A[j] <= B + Amin)
                table[i + A[j]] = true;
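Here is a Python sketch of the forward fill combined with the consecutive-run early exit described above (my own rendering under those assumptions, not the answerer's code): once A_min consecutive amounts are all reachable, every larger amount is reachable too, so the answer is simply B + 1.
def smallest_payable_above(A, B):
    amin = min(A)
    limit = B + amin                 # a reachable amount always exists in (B, B + A_min]
    table = [False] * (limit + 1)
    table[0] = True
    run = 0                          # consecutive reachable amounts seen so far
    for i in range(limit + 1):
        if not table[i]:
            run = 0
            continue
        if i > B:                    # first reachable amount above the threshold
            return i
        run += 1
        if run == amin:              # A_min consecutive hits: everything beyond is reachable
            return B + 1
        for a in A:
            if i + a <= limit:
                table[i + a] = True

print(smallest_payable_above([6, 11, 16], 14))  # => 16
print(smallest_payable_above([6, 11, 16], 22))  # => 23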

Replace operators of equation, so that the sum is equal to zero

I'm given an equation like this one:
n = 7
1 + 1 - 4 - 4 - 4 - 2 - 2
How can I optimally replace operators so that the sum of the equation equals zero, or print -1 if that's impossible? The only algorithm I can think of is to brute-force all cases, with complexity O(n*2^n), but that is not good enough since n < 300.
Here is the link of the problem: http://codeforces.com/gym/100989/problem/M.
You can solve this with dynamic programming: keep a map of all possible partial sums (mapping each to the minimum number of changes needed to reach it), and update it one number at a time.
Here's a concise Python solution:
def signs(nums):
    xs = {nums[0]: 0}
    for num in nums[1:]:
        ys = dict()
        for d, k in xs.items():
            for cost, n in enumerate([num, -num]):
                ys[d + n] = min(ys.get(d + n, float('inf')), k + cost)
        xs = ys
    return xs.get(0, -1)

print(signs([1, 1, -4, -4, -4, -2, -2]))
In theory this has exponential complexity in the worst case (since the number of partial sums can double at each step). However, if (as here) the given numbers are always (bounded) small ints, then the number of partial sums grows linearly, and the program works in O(n^2) time.
A somewhat more optimised version uses a sorted array of (subtotal, cost) instead of a dict. One can discard partial sums that are too large or too small (making it impossible to end up at 0 assuming all of the remaining elements are between -300 and +300). This runs approximately twice as fast, and is a more natural implementation to port to a lower-level language than Python for maximum speed.
def merge(xs, num):
    i = j = 0
    ci = 0 if num >= 0 else 1
    cj = 0 if num < 0 else 1
    num = abs(num)
    while j < len(xs):
        if xs[i][0] + num < xs[j][0] - num:
            yield (xs[i][0] + num, xs[i][1] + ci)
            i += 1
        elif xs[i][0] + num > xs[j][0] - num:
            yield (xs[j][0] - num, xs[j][1] + cj)
            j += 1
        else:
            yield (xs[i][0] + num, min(xs[i][1] + ci, xs[j][1] + cj))
            i += 1
            j += 1
    while i < len(xs):
        yield (xs[i][0] + num, xs[i][1] + ci)
        i += 1

def signs2(nums):
    xs = [(nums[0], 0)]
    for i in range(1, len(nums)):
        limit = (len(nums) - 1 - i) * 300
        xs = [x for x in merge(xs, nums[i]) if -limit <= x[0] <= limit]
    for x, c in xs:
        if x == 0:
            return c
    return -1

print(signs2([1, 1, -4, -4, -4, -2, -2]))
Here is the implementation in C++:
#include <unordered_map>
using namespace std;

int a[] = {1, -1, 4, -4};
int n = sizeof(a) / sizeof(a[0]);
unordered_map<int, int> M;   // partial sum -> (minimum number of changes) + 1

int solve() {
    M[a[0]] = 1;             // the first operand's sign is fixed
    for (int i = 1; i < n; ++i) {
        // a fresh map each step so stale sums from earlier prefixes don't leak in
        unordered_map<int, int> next;
        for (auto it = M.begin(); it != M.end(); ++it) {
            int k = it->first, d = it->second;
            // keep the operator (no extra change) or flip it (one more change)
            if (!next.count(k + a[i]) || next[k + a[i]] > d) next[k + a[i]] = d;
            if (!next.count(k - a[i]) || next[k - a[i]] > d + 1) next[k - a[i]] = d + 1;
        }
        M = next;
    }
    return M.count(0) ? M[0] - 1 : -1;
}
What I can think of: calculate the original equation; this results in orig_eq = -14. Now sort the numbers, taking their + or - signs into account:
-4, -4, -4, -2, -2, 1, 1
When the equation results in a negative number, look for the largest numbers to fix the equation, skipping any number that is too large. Loop over the sorted list and select the current number whenever orig_eq minus that number is closer to zero; this way you pick, one by one, the numbers whose signs to change.

Number of contiguous subarrays in which element of array is max

Given an array of 'n' integers, I need to find, for each element of the array, the number of contiguous subarrays that have that element as their maximum element. Elements can repeat.
Is there a way to do it in less than O(n^2)? O(n log n) or O(n)?
Example: if the array is {1,2,3}, then:
For '1': 1 such subarray: {1}.
For '2': 2 such subarrays: {2}, {1,2}.
For '3': 3 such subarrays: {3}, {2,3}, {1,2,3}.
I am having a hard time trying to explain my solution in words, so I will just add the code; it will explain itself:
#include <iostream>
#include <fstream>
using namespace std;
#define max 10000

int main(int argc, const char * argv[]) {
    ifstream input("/Users/appleuser/Documents/Developer/xcode projects/SubArrayCount/SubArrayCount/input.in");
    int n, arr[max], before[max] = {0}, after[max] = {0}, result[max];
    input >> n;
    for (int i = 0; i < n; i++)
        input >> arr[i];
    // before[i]: how many contiguous smaller elements end just before i
    for (int i = 0; i < n; i++)
        for (int j = i - 1; j >= 0 && arr[j] < arr[i]; j -= before[j] + 1)
            before[i] += before[j] + 1;
    // after[i]: how many contiguous smaller elements start just after i
    for (int i = n - 1; i >= 0; i--)
        for (int j = i + 1; j < n && arr[j] < arr[i]; j += after[j] + 1)
            after[i] += after[j] + 1;
    for (int i = 0; i < n; i++)
        result[i] = (before[i] + 1) * (after[i] + 1);
    for (int i = 0; i < n; i++)
        cout << result[i] << " ";
    cout << endl;
    return 0;
}
Explanation for (before[i]+1)*(after[i]+1): for each value we need the count of contiguous smaller numbers before it and the count of contiguous smaller numbers after it. The answer V(before, after) = (before+1)*(after+1) forms this table:

             count of smaller numbers appearing before
             0    1    2    3    4    5  ...
         0 | 1    2    3    4    5    6
count    1 | 2    4    6    8   10   12
of       2 | 3    6    9   12   15   18
smaller  3 | 4    8   12   16   20   24
after    4 | 5   10   15   20   25   30
         5 | 6   12   18   24   30   36

Example: for a number that has 3 smaller values before it and 4 smaller values after it, the answer is V(3,4) = (3+1) * (4+1) = 20.
Please let me know the results.
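For reference, a compact Python rendering of the same before/after idea (my port of the C++ above), with the example from the question:
def count_max_subarrays(arr):
    n = len(arr)
    before, after = [0] * n, [0] * n
    for i in range(n):                      # smaller elements stretching left of i
        j = i - 1
        while j >= 0 and arr[j] < arr[i]:
            before[i] += before[j] + 1
            j -= before[j] + 1              # skip a whole counted block at once
    for i in range(n - 1, -1, -1):          # smaller elements stretching right of i
        j = i + 1
        while j < n and arr[j] < arr[i]:
            after[i] += after[j] + 1
            j += after[j] + 1
    return [(before[i] + 1) * (after[i] + 1) for i in range(n)]

print(count_max_subarrays([1, 2, 3]))  # => [1, 2, 3]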
You could store subarray sizes in another array (arr2) to save yourself recalculating them. arr2 must have the length of the maximum value in arr1. For example, take the array {1,2,4,6,7,8}; arr2 is declared like this:
arr2 = []
for i in range(max(arr1)):
    arr2.append(0)
Now the algorithm goes like this. Say you hit the number 6. Since 6-1=5 has not been seen yet, position 5 of arr2 still holds the default value 0, so you store 0+1=1 at position 6 of arr2. Then you hit the number 7. You check whether 7-1=6 exists in arr2; it does, with a value of 1, so you store 1+1=2 at position 7 in arr2.
For each value put into arr2 we simply add it to the count, which we can maintain simultaneously in a count variable.
This algorithm is O(n).
Here is my O(N) time Java solution using a stack. The basic idea is to move from left to right keeping track of subarrays ending at "i", and then from right to left keeping track of subarrays starting from "i":
public int[] countSubarrays(int[] arr) {
    Stack<Integer> stack = new Stack<>();
    int[] ans = new int[arr.length];
    for (int i = 0; i < arr.length; i++) {
        while (!stack.isEmpty() && arr[stack.peek()] < arr[i]) {
            ans[i] += ans[stack.pop()];
        }
        stack.push(i);
        ans[i]++;
    }
    stack.clear();
    int[] temp = new int[arr.length];
    for (int i = arr.length - 1; i >= 0; i--) {
        while (!stack.isEmpty() && arr[stack.peek()] < arr[i]) {
            int idx = stack.pop();
            ans[i] += temp[idx];
            temp[i] += temp[idx];
        }
        stack.push(i);
        temp[i]++;
    }
    return ans;
}
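For the question's example, countSubarrays(new int[]{1, 2, 3}) returns [1, 2, 3]; for {3, 1, 2} it returns [3, 1, 2], since {3}, {3,1} and {3,1,2} all have the 3 as their maximum.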
You haven't specified how repeating elements are handled, i.e. whether "that element" means the element at that precise position in the array or any element in the array with the same value.
Assuming the problem is about the element at a precise index, it can be solved easily in linear time:
define ctSubarrays(int[] in, int at)
    int minInd = at, maxInd = at;
    // search for the minimum index (lowest index whose elements are all smaller than in[at])
    for (; minInd > 0 && in[minInd - 1] < in[at]; minInd--);
    // search for the maximum index (highest index whose elements are all smaller than in[at])
    for (; maxInd < length(in) - 1 && in[maxInd + 1] < in[at]; maxInd++);
    // now we've got the bounds of the largest window meeting all constraints;
    // next step: count the subarrays of that window containing in[at]
    int spaceMin = at - minInd;
    int spaceMax = maxInd - at;
    return (spaceMin + 1) * (spaceMax + 1);
Let's look at an example:
{4, 5, 6, 3, 2}
Iterating from beginning to end we detect a single increasing subarray {4, 5, 6} and two single elements, 3 and 2. So we detect runs of lengths 3, 1, and 1.
The first run {4, 5, 6} gives us 6 possible subarrays, i.e. 1 + 2 + 3 = 6; that's the key. For an increasing run of length N we can calculate the number of subarrays as N * (N + 1) / 2.
Therefore we have 3 * (3 + 1)/2 + 1 * (1 + 1)/2 + 1 * (1 + 1)/2, i.e. 6 + 1 + 1 = 8.
Since we need only a single pass, this is an O(n) algorithm.
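A tiny Python sketch of the run-counting computation this answer describes (my rendering): each maximal strictly increasing run of length N contributes N * (N + 1) / 2 subarrays.
def count_increasing_subarrays(arr):
    total, run = 0, 1
    for i in range(1, len(arr)):
        if arr[i] > arr[i - 1]:
            run += 1                        # extend the current increasing run
        else:
            total += run * (run + 1) // 2   # close the run and count its subarrays
            run = 1
    return total + run * (run + 1) // 2     # close the final run

print(count_increasing_subarrays([4, 5, 6, 3, 2]))  # => 8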
If the array is sorted:
count = 1;
for (i = 1 to n-1) {
    if (a[i-1] == a[i]) {
        count = count + 1;
    } else if (a[i-1] + 1 == a[i]) {
        count of subarrays for a[i-1] = count;
        count = count + 1;
    } else {
        count of subarrays for a[i-1] = count;
        count = 1;
    }
}
count of subarrays for a[n-1] = count;
If the array is not sorted:
Assumption 3: if the array is like {3,1,2,3} then the number of subarrays for 3 is 3.
aMin = min(a); // O(n)
aMax = max(a);
len = (aMax - aMin + 1);
create array b of size len;
for (j = 0 to len-1) {
    b[j] = 0;
}
count = 1;
for (i = 1 to n-1) {
    if (a[i-1] == a[i]) {
        count = count + 1;
    } else if (a[i-1] + 1 == a[i]) {
        if (b[a[i-1] - aMin] < count) {
            b[a[i-1] - aMin] = count;
        }
        count = count + 1;
    } else {
        if (b[a[i-1] - aMin] < count) {
            b[a[i-1] - aMin] = count;
        }
        count = 1;
    }
}
if (b[a[n-1] - aMin] < count) {
    b[a[n-1] - aMin] = count;
}
for (i = 0 to n-1) {
    count of subarrays for a[i] = b[a[i] - aMin];
}
This will work even if the array contains negative integers.
If Assumption 3 fails for your problem and instead:
Assumption 4: if the array is like {3,1,2,3} then the number of subarrays for 3 is 4: {3}, {1,2,3}, {2,3}, {3}.
Modify the above code by replacing
if (b[a[i-1] - aMin] < count) {
    b[a[i-1] - aMin] = count;
}
with
b[a[i-1] - aMin] = b[a[i-1] - aMin] + count;
Create a value-to-index map and traverse from bottom to top, maintaining an augmented tree of intervals. Each time an index is added, adjust the appropriate interval and calculate the total from the relevant segment. For example:

A = [5,1,7,2,3]  =>  value-to-index map {1:1, 2:3, 3:4, 5:0, 7:2}

indexes inserted    interval    value => subarrays with that maximum
1                   (1,1)       1 => 1
1,3                 (3,3)       2 => 1
1,3,4               (3,4)       3 => 2
1,3,4,0             (0,1)       5 => 2
1,3,4,0,2           (0,4)       7 => 3 + 2*3 = 9

Insertion and deletion in augmented trees take O(log n) time, so the worst-case total time complexity is O(n log n).
Using JavaScript. Not sure of the Big O notation, but here I loop over the list, then start two inner loops: one counting down from i, and the other counting up from i+1.
function countSubarrays(arr) {
    let countArray = [];
    for (let i = 0; i < arr.length; i++) {
        let count = 0;
        // count downwards starting at i
        for (let j = i; j >= 0; j--) {
            if (arr[j] > arr[i]) { break; }
            count++;
        }
        // count upwards starting at i+1 so the value at i is not counted twice
        for (let j = i + 1; j < arr.length; j++) {
            if (arr[j] >= arr[i]) { break; }
            count++;
        }
        countArray.push(count);
    }
    return countArray;
}
