Understanding Big-O with a specific example - algorithm

I am working on a rather simple question, to make sure that I understand these concepts.
The question is: there exists an array A of n elements, either being RED, WHITE, or BLUE. Rearrange the array such that all WHITE elements come before all BLUE elements, and all BLUE elements come before all RED elements. Construct an algorithm in O(n) time and O(1) space.
From my understanding, the pseudocode for the solution would be:
numW = numB = 0
for i = 0 to n:
if ARRAY[i] == WHITE:
numW++
else if ARRAY[i] == BLUE:
numB++
for i = 0 to n:
if numW > 0:
ARRAY[i] = WHITE
numW--
else if numB > 0:
ARRAY[i] = BLUE
numB--
else:
ARRAY[i] = RED
I believe it is O(n) because it runs through the loop twice and O(2n) is in O(n). I believe the space is O(1) because it is not dependent on the overall number of elements i.e. there will always be a count for each
Is my understanding correct?

If it's linear time, and your algorithm appears to be, then it's O(n) as you suspect. There's a great summary here: Big-O for Eight Year Olds?

Yes, your solution runs in O(n) time in O(1) space.
Below is my solution which also runs in O(n) time and O(1) space, but also works when we have references to objects, as #kenneth suggested in the comments.
import java.util.Arrays;
import java.util.Random;
import static java.lang.System.out;
class Color{
char c;
Color(char c){
this.c = c;
}
}
public class Solution {
private static void rearrangeColors(Color[] collection){
int ptr = 0;
// move all whites to the left
for(int i=0;i<collection.length;++i){
if(collection[i].c == 'W'){
swap(collection,ptr,i);
ptr++;
}
}
// move all blacks to the left after white
for(int i=ptr;i<collection.length;++i){
if(collection[i].c == 'B'){
swap(collection,ptr,i);
ptr++;
}
}
}
private static void swap(Color[] collection,int ptr1,int ptr2){
Color temp = collection[ptr1];
collection[ptr1] = collection[ptr2];
collection[ptr2] = temp;
}
private static void printColors(Color[] collection){
for(int i=0;i<collection.length;++i){
out.print(collection[i].c + ( i != collection.length - 1 ? "," : ""));
}
out.println();
}
public static void main(String[] args) {
// generate a random collection of 'Color' objects
Random r = new Random();
int array_length = r.nextInt(20) + 1;// to add 1 if in case 0 gets generated
Color[] collection = new Color[array_length];
char[] colors_domain = {'B','W','R'};
for(int i=0;i<collection.length;++i){
collection[i] = new Color(colors_domain[r.nextInt(3)]);
}
// print initial state
printColors(collection);
// rearrange them according to the criteria
rearrangeColors(collection);
// print final state
printColors(collection);
}
}

I won't say this is 100% correct, but a quick test case here did work. If anything, it shows the idea of being able to do it in one pass. Is it faster? Probably not. OP's answer I believe is still the best for this case.
#include <stdio.h>
char temp;
#define SWAP(a,b) { temp = a; a = b; b = temp;}
int main()
{
int n = 10;
char arr[] = "RWBRWBRWBR";
printf("%s\n", arr);
int white = 0;
for(int i=0; i<n; i++)
{
if(arr[i] == 'B')
{
SWAP(arr[i], arr[n-1]);
i--; n--;
}
else if(arr[i] == 'R')
{
SWAP(arr[i], arr[white]);
white++;
}
}
printf("%s\n", arr);
}

Related

Why Unrolled LinkedList is semi filled?

Going through all the articles related to the implementation of Unrolled LinkedList, it seems that based on the capacity we generate a threshold and if the elements increase beyond that we create another node and put the element in that node.
Threshold = (capacity/2)+1
But why are we not filling the array in Unrolled LinkedList to its full capacity and then creating another node? Why do we need a threshold and keep the array semi-filled?
Quoted from Geeks for Geeks - insertion in Unrolled Linked List:
/* Java program to show the insertion operation
* of Unrolled Linked List */
import java.util.Scanner;
import java.util.Random;
// class for each node
class UnrollNode {
UnrollNode next;
int num_elements;
int array[];
// Constructor
public UnrollNode(int n)
{
next = null;
num_elements = 0;
array = new int[n];
}
}
// Operation of Unrolled Function
class UnrollLinkList {
private UnrollNode start_pos;
private UnrollNode end_pos;
int size_node;
int nNode;
// Parameterized Constructor
UnrollLinkList(int capacity)
{
start_pos = null;
end_pos = null;
nNode = 0;
size_node = capacity + 1;
}
// Insertion operation
void Insert(int num)
{
nNode++;
// Check if the list starts from NULL
if (start_pos == null) {
start_pos = new UnrollNode(size_node);
start_pos.array[0] = num;
start_pos.num_elements++;
end_pos = start_pos;
return;
}
// Attaching the elements into nodes
if (end_pos.num_elements + 1 < size_node) {
end_pos.array[end_pos.num_elements] = num;
end_pos.num_elements++;
}
// Creation of new Node
else {
UnrollNode node_pointer = new UnrollNode(size_node);
int j = 0;
for (int i = end_pos.num_elements / 2 + 1;
i < end_pos.num_elements; i++)
node_pointer.array[j++] = end_pos.array[i];
node_pointer.array[j++] = num;
node_pointer.num_elements = j;
end_pos.num_elements = end_pos.num_elements / 2 + 1;
end_pos.next = node_pointer;
end_pos = node_pointer;
}
}
// Display the Linked List
void display()
{
System.out.print("\nUnrolled Linked List = ");
System.out.println();
UnrollNode pointer = start_pos;
while (pointer != null) {
for (int i = 0; i < pointer.num_elements; i++)
System.out.print(pointer.array[i] + " ");
System.out.println();
pointer = pointer.next;
}
System.out.println();
}
}
/* Main Class */
class UnrolledLinkedList_Check {
// Driver code
public static void main(String args[])
{
Scanner sc = new Scanner(System.in);
// create instance of Random class
Random rand = new Random();
UnrollLinkList ull = new UnrollLinkList(5);
// Perform Insertion Operation
for (int i = 0; i < 12; i++) {
// Generate random integers in range 0 to 99
int rand_int1 = rand.nextInt(100);
System.out.println("Entered Element is " + rand_int1);
ull.Insert(rand_int1);
ull.display();
}
}
}
why are we not filling the array in Unrolled LinkedList to its full capacity and then creating another node?
Actually, the code that is provided does fill the array to full capacity:
if (end_pos.num_elements + 1 < size_node) {
end_pos.array[end_pos.num_elements] = num;
end_pos.num_elements++;
}
The threshold is not used here. Only when the array reaches its capacity, the threshold plays a role as a new array gets created:
UnrollNode node_pointer = new UnrollNode(size_node);
int j = 0;
for (int i = end_pos.num_elements / 2 + 1; i < end_pos.num_elements; i++)
node_pointer.array[j++] = end_pos.array[i];
Here we see the second half of the full array is copied into the new array. We can imagine this process as splitting a block into two blocks -- much like happens in a B-tree.
This allows for fast insertion the next time a value needs to be inserted (not at the end, but) at a specific offset in that array. If it were left full, it would trigger a new block at each insertion into that array. By leaving slack space in an array, we ensure fast insertion for at least a few of the future insertions that happen to be in that array.
I think it is to ensure having at least 50% utilization.
The algorithm not only splits in half if an insert is done, but also redistributes content if it's utilized at less than 50%. I think the second part is key: if you don't do the first part, you can't add a check that redistributes if the node is underutilized because your newly created node would immediately break that.
If you don't do the redistribution at all you might end up with a situation where you have a lot of underutilized nodes.
My first intuition was the same as the other comment (to avoid creating new nodes every time) but this wouldn't necessarily be an issue if you always check the next node before creating a new node, so I'm not sure that sufficient as a reason? (But maybe I'm missing something here)

Paper cut algorithm

I want to create a function to determine the most number of pieces of paper on a parent paper size
The formula above is still not optimal. If using the above formula will only produce at most 32 cut/sheet.
I want it like below.
This seems to be a very difficult problem to solve optimally. See http://lagrange.ime.usp.br/~lobato/packing/ for a discussion of a 2008 paper claiming that the problem is believed (but not proven) to be NP-hard. The researchers found some approximation algorithms and implemented them on that website.
The following solution uses Top-Down Dynamic Programming to find optimal solutions to this problem. I am providing this solution in C#, which shouldn't be too hard to convert into the language of your choice (or whatever style of pseudocode you prefer). I have tested this solution on your specific example and it completes in less than a second (I'm not sure how much less than a second).
It should be noted that this solution assumes that only guillotine cuts are allowed. This is a common restriction for real-world 2D Stock-Cutting applications and it greatly simplifies the solution complexity. However, CS, Math and other programming problems often allow all types of cutting, so in that case this solution would not necessarily find the optimal solution (but it would still provide a better heuristic answer than your current formula).
First, we need a value-structure to represent the size of the starting stock, the desired rectangle(s) and of the pieces cut from the stock (this needs to be a value-type because it will be used as the key to our memoization cache and other collections, and we need to to compare the actual values rather than an object reference address):
public struct Vector2D
{
public int X;
public int Y;
public Vector2D(int x, int y)
{
X = x;
Y = y;
}
}
Here is the main method to be called. Note that all values need to be in integers, for the specific case above this just means multiplying everything by 100. These methods here require integers, but are otherwise are scale-invariant so multiplying by 100 or 1000 or whatever won't affect performance (just make sure that the values don't overflow an int).
public int SolveMaxCount1R(Vector2D Parent, Vector2D Item)
{
// make a list to hold both the item size and its rotation
List<Vector2D> itemSizes = new List<Vector2D>();
itemSizes.Add(Item);
if (Item.X != Item.Y)
{
itemSizes.Add(new Vector2D(Item.Y, Item.X));
}
int solution = SolveGeneralMaxCount(Parent, itemSizes.ToArray());
return solution;
}
Here is an example of how you would call this method with your parameter values. In this case I have assumed that all of the solution methods are part of a class called SolverClass:
SolverClass solver = new SolverClass();
int count = solver.SolveMaxCount1R(new Vector2D(2500, 3800), new Vector2D(425, 550));
//(all units are in tenths of a millimeter to make everything integers)
The main method calls a general solver method for this type of problem (that is not restricted to just one size rectangle and its rotation):
public int SolveGeneralMaxCount(Vector2D Parent, Vector2D[] ItemSizes)
{
// determine the maximum x and y scaling factors using GCDs (Greastest
// Common Divisor)
List<int> xValues = new List<int>();
List<int> yValues = new List<int>();
foreach (Vector2D size in ItemSizes)
{
xValues.Add(size.X);
yValues.Add(size.Y);
}
xValues.Add(Parent.X);
yValues.Add(Parent.Y);
int xScale = NaturalNumbers.GCD(xValues);
int yScale = NaturalNumbers.GCD(yValues);
// rescale our parameters
Vector2D parent = new Vector2D(Parent.X / xScale, Parent.Y / yScale);
var baseShapes = new Dictionary<Vector2D, Vector2D>();
foreach (var size in ItemSizes)
{
var reducedSize = new Vector2D(size.X / xScale, size.Y / yScale);
baseShapes.Add(reducedSize, reducedSize);
}
//determine the minimum values that an allowed item shape can fit into
_xMin = int.MaxValue;
_yMin = int.MaxValue;
foreach (var size in baseShapes.Keys)
{
if (size.X < _xMin) _xMin = size.X;
if (size.Y < _yMin) _yMin = size.Y;
}
// create the memoization cache for shapes
Dictionary<Vector2D, SizeCount> shapesCache = new Dictionary<Vector2D, SizeCount>();
// find the solution pattern with the most finished items
int best = solveGMC(shapesCache, baseShapes, parent);
return best;
}
private int _xMin;
private int _yMin;
The general solution method calls a recursive worker method that does most of the actual work.
private int solveGMC(
Dictionary<Vector2D, SizeCount> shapeCache,
Dictionary<Vector2D, Vector2D> baseShapes,
Vector2D sheet )
{
// have we already solved this size?
if (shapeCache.ContainsKey(sheet)) return shapeCache[sheet].ItemCount;
SizeCount item = new SizeCount(sheet, 0);
if ((sheet.X < _xMin) || (sheet.Y < _yMin))
{
// if it's too small in either dimension then this is a scrap piece
item.ItemCount = 0;
}
else // try every way of cutting this sheet (guillotine cuts only)
{
int child0;
int child1;
// try every size of horizontal guillotine cut
for (int c = sheet.X / 2; c > 0; c--)
{
child0 = solveGMC(shapeCache, baseShapes, new Vector2D(c, sheet.Y));
child1 = solveGMC(shapeCache, baseShapes, new Vector2D(sheet.X - c, sheet.Y));
if (child0 + child1 > item.ItemCount)
{
item.ItemCount = child0 + child1;
}
}
// try every size of vertical guillotine cut
for (int c = sheet.Y / 2; c > 0; c--)
{
child0 = solveGMC(shapeCache, baseShapes, new Vector2D(sheet.X, c));
child1 = solveGMC(shapeCache, baseShapes, new Vector2D(sheet.X, sheet.Y - c));
if (child0 + child1 > item.ItemCount)
{
item.ItemCount = child0 + child1;
}
}
// if no children returned finished items, then the sheet is
// either scrap or a finished item itself
if (item.ItemCount == 0)
{
if (baseShapes.ContainsKey(item.Size))
{
item.ItemCount = 1;
}
else
{
item.ItemCount = 0;
}
}
}
// add the item to the cache before we return it
shapeCache.Add(item.Size, item);
return item.ItemCount;
}
Finally, the general solution method uses a GCD function to rescale the dimensions to achieve scale-invariance. This is implemented in a static class called NaturalNumbers. I have included the rlevant parts of this class below:
static class NaturalNumbers
{
/// <summary>
/// Returns the Greatest Common Divisor of two natural numbers.
/// Returns Zero if either number is Zero,
/// Returns One if either number is One and both numbers are >Zero
/// </summary>
public static int GCD(int a, int b)
{
if ((a == 0) || (b == 0)) return 0;
if (a >= b)
return gcd_(a, b);
else
return gcd_(b, a);
}
/// <summary>
/// Returns the Greatest Common Divisor of a list of natural numbers.
/// (Note: will run fastest if the list is in ascending order)
/// </summary>
public static int GCD(IEnumerable<int> numbers)
{
// parameter checks
if (numbers == null || numbers.Count() == 0) return 0;
int first = numbers.First();
if (first <= 1) return 0;
int g = (int)first;
if (g <= 1) return g;
int i = 0;
foreach (int n in numbers)
{
if (i == 0)
g = n;
else
g = GCD(n, g);
if (g <= 1) return g;
i++;
}
return g;
}
// Euclidian method with Euclidian Division,
// From: https://en.wikipedia.org/wiki/Euclidean_algorithm
private static int gcd_(int a, int b)
{
while (b != 0)
{
int t = b;
b = (a % b);
a = t;
}
return a;
}
}
Please let me know of any problems or questions you might have with this solution.
Oops, forgot that I was also using this class:
public class SizeCount
{
public Vector2D Size;
public int ItemCount;
public SizeCount(Vector2D itemSize, int itemCount)
{
Size = itemSize;
ItemCount = itemCount;
}
}
As I mentioned in the comments, it would actually be pretty easy to factor this class out of the code, but it's still in there right now.

Make unique array with minimal sum

It is a interview question. Given an array, e.g., [3,2,1,2,7], we want to make all elements in this array unique by incrementing duplicate elements and we require the sum of the refined array is minimal. For example the answer for [3,2,1,2,7] is [3,2,1,4,7] and its sum is 17. Any ideas?
It's not quite as simple as my earlier comment suggested, but it's not terrifically complicated.
First, sort the input array. If it matters to be able to recover the original order of the elements then record the permutation used for the sort.
Second, scan the sorted array from left to right (ie from low to high). If an element is less than or equal to the element to its left, set it to be one greater than that element.
Pseudocode
sar = sort(input_array)
for index = 2:size(sar) ! I count from 1
if sar(index)<=sar(index-1) sar(index) = sar(index-1)+1
forend
Is the sum of the result minimal ? I've convinced myself that it is through some head-scratching and trials but I haven't got a formal proof.
If you only need to find ONE of the best solution, here's the algorythm with some explainations.
The idea of this problem is to find an optimal solution, which can be found only by testing all existing solutions (well, they're infinite, let's stick with the reasonable ones).
I wrote a program in C, because I'm familiar with it, but you can port it to any language you want.
The program does this: it tries to increment one value to the max possible (I'll explain how to find it in the comments under the code sections), than if the solution is not found, decreases this value and goes on with the next one and so on.
It's an exponential algorythm, so it will be very slow on large values of duplicated data (yet, it assures you the best solution is found).
I tested this code with your example, and it worked; not sure if there's any bug left, but the code (in C) is this.
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
typedef int BOOL; //just to ease meanings of values
#define TRUE 1
#define FALSE 0
Just to ease comprehension, I did some typedefs. Don't worry.
typedef struct duplicate { //used to fasten the algorythm; it uses some more memory just to assure it's ok
int value;
BOOL duplicate;
} duplicate_t;
int maxInArrayExcept(int *array, int arraySize, int index); //find the max value in array except the value at the index given
//the result is the max value in the array, not counting th index
int *findDuplicateSum(int *array, int arraySize);
BOOL findDuplicateSum_R(duplicate_t *array, int arraySize, int *tempSolution, int *solution, int *totalSum, int currentSum); //resursive function used to find solution
BOOL check(int *array, int arraySize); //checks if there's any repeated value in the solution
These are all the functions we'll need. All split up for comprehension purpose.
First, we have a struct. This struct is used to avoid checking, for every iteration, if the value on a given index was originally duplicated. We don't want to modify any value not duplicated originally.
Then, we have a couple functions: first, we need to see the worst case scenario: every value after the duplicated ones is already occupied: then we need to increment the duplicated value up to the maximum value reached + 1.
Then, there are the main Function we'll discute later about.
The check Function only checks if there's any duplicated value in a temporary solution.
int main() { //testing purpose
int i;
int testArray[] = { 3,2,1,2,7 }; //test array
int nTestArraySize = 5; //test array size
int *solutionArray; //needed if you want to use the solution later
solutionArray = findDuplicateSum(testArray, nTestArraySize);
for (i = 0; i < nTestArraySize; ++i) {
printf("%d ", solutionArray[i]);
}
return 0;
}
This is the main Function: I used it to test everything.
int * findDuplicateSum(int * array, int arraySize)
{
int *solution = malloc(sizeof(int) * arraySize);
int *tempSolution = malloc(sizeof(int) * arraySize);
duplicate_t *duplicate = calloc(arraySize, sizeof(duplicate_t));
int i, j, currentSum = 0, totalSum = INT_MAX;
for (i = 0; i < arraySize; ++i) {
tempSolution[i] = solution[i] = duplicate[i].value = array[i];
currentSum += array[i];
for (j = 0; j < i; ++j) { //to find ALL the best solutions, we should also put the first found value as true; it's just a line more
//yet, it saves the algorythm half of the duplicated numbers (best/this case scenario)
if (array[j] == duplicate[i].value) {
duplicate[i].duplicate = TRUE;
}
}
}
if (findDuplicateSum_R(duplicate, arraySize, tempSolution, solution, &totalSum, currentSum));
else {
printf("No solution found\n");
}
free(tempSolution);
free(duplicate);
return solution;
}
This Function does a lot of things: first, it sets up the solution array, then it initializes both the solution values and the duplicate array, that is the one used to check for duplicated values at startup. Then, we find the current sum and we set the maximum available sum to the maximum integer possible.
Then, the recursive Function is called; this one gives us the info about having found the solution (that should be Always), then we return the solution as an array.
int findDuplicateSum_R(duplicate_t * array, int arraySize, int * tempSolution, int * solution, int * totalSum, int currentSum)
{
int i;
if (check(tempSolution, arraySize)) {
if (currentSum < *totalSum) { //optimal solution checking
for (i = 0; i < arraySize; ++i) {
solution[i] = tempSolution[i];
}
*totalSum = currentSum;
}
return TRUE; //just to ensure a solution is found
}
for (i = 0; i < arraySize; ++i) {
if (array[i].duplicate == TRUE) {
if (array[i].duplicate <= maxInArrayExcept(solution, arraySize, i)) { //worst case scenario, you need it to stop the recursion on that value
tempSolution[i]++;
return findDuplicateSum_R(array, arraySize, tempSolution, solution, totalSum, currentSum + 1);
tempSolution[i]--; //backtracking
}
}
}
return FALSE; //just in case the solution is not found, but we won't need it
}
This is the recursive Function. It first checks if the solution is ok and if it is the best one found until now. Then, if everything is correct, it updates the actual solution with the temporary values, and updates the optimal condition.
Then, we iterate on every repeated value (the if excludes other indexes) and we progress in the recursion until (if unlucky) we reach the worst case scenario: the check condition not satisfied above the maximum value.
Then we have to backtrack and continue with the iteration, that will go on with other values.
PS: an optimization is possible here, if we move the optimal condition from the check into the for: if the solution is already not optimal, we can't expect to find a better one just adding things.
The hard code has ended, and there are the supporting functions:
int maxInArrayExcept(int *array, int arraySize, int index) {
int i, max = 0;
for (i = 0; i < arraySize; ++i) {
if (i != index) {
if (array[i] > max) {
max = array[i];
}
}
}
return max;
}
BOOL check(int *array, int arraySize) {
int i, j;
for (i = 0; i < arraySize; ++i) {
for (j = 0; j < i; ++j) {
if (array[i] == array[j]) return FALSE;
}
}
return TRUE;
}
I hope this was useful.
Write if anything is unclear.
Well, I got the same question in one of my interviews.
Not sure if you still need it. But here's how I did it. And it worked well.
num_list1 = [2,8,3,6,3,5,3,5,9,4]
def UniqueMinSumArray(num_list):
max=min(num_list)
for i,V in enumerate(num_list):
while (num_list.count(num_list[i])>1):
if (max > num_list[i]+1) :
num_list[i] = max + 1
else:
num_list[i]+=1
max = num_list[i]
i+=1
return num_list
print (sum(UniqueMinSumArray(num_list1)))
You can try with your list of numbers and I am sure it will give you the correct unique minimum sum.
I got the same interview question too. But my answer is in JS in case anyone is interested.
For sure it can be improved to get rid of for loop.
function getMinimumUniqueSum(arr) {
// [1,1,2] => [1,2,3] = 6
// [1,2,2,3,3] = [1,2,3,4,5] = 15
if (arr.length > 1) {
var sortedArr = [...arr].sort((a, b) => a - b);
var current = sortedArr[0];
var res = [current];
for (var i = 1; i + 1 <= arr.length; i++) {
// check current equals to the rest array starting from index 1.
if (sortedArr[i] > current) {
res.push(sortedArr[i]);
current = sortedArr[i];
} else if (sortedArr[i] == current) {
current = sortedArr[i] + 1;
// sortedArr[i]++;
res.push(current);
} else {
current++;
res.push(current);
}
}
return res.reduce((a,b) => a + b, 0);
} else {
return 0;
}
}

Parsing morse code

I am trying to solve this problem.
The goal is to determine the number of ways a morse string can be interpreted, given a dictionary of word.
What I did is that I first "translated" words from my dictionary into morse. Then, I used a naive algorithm, searching for all the ways it can be interpreted recursively.
#include <iostream>
#include <vector>
#include <map>
#include <string>
#include <iterator>
using namespace std;
string morse_string;
int morse_string_size;
map<char, string> morse_table;
unsigned int sol;
void matches(int i, int factor, vector<string> &dictionary) {
int suffix_length = morse_string_size-i;
if (suffix_length <= 0) {
sol += factor;
return;
}
map<int, int> c;
for (vector<string>::iterator it = dictionary.begin() ; it != dictionary.end() ; it++) {
if (((*it).size() <= suffix_length) && (morse_string.substr(i, (*it).size()) == *it)) {
if (c.find((*it).size()) == c.end())
c[(*it).size()] = 0;
else
c[(*it).size()]++;
}
}
for (map<int, int>::iterator it = c.begin() ; it != c.end() ; it++) {
matches(i+it->first, factor*(it->second), dictionary);
}
}
string encode_morse(string s) {
string ret = "";
for (unsigned int i = 0 ; i < s.length() ; ++i) {
ret += morse_table[s[i]];
}
return ret;
}
int main() {
morse_table['A'] = ".-"; morse_table['B'] = "-..."; morse_table['C'] = "-.-."; morse_table['D'] = "-.."; morse_table['E'] = "."; morse_table['F'] = "..-."; morse_table['G'] = "--."; morse_table['H'] = "...."; morse_table['I'] = ".."; morse_table['J'] = ".---"; morse_table['K'] = "-.-"; morse_table['L'] = ".-.."; morse_table['M'] = "--"; morse_table['N'] = "-."; morse_table['O'] = "---"; morse_table['P'] = ".--."; morse_table['Q'] = "--.-"; morse_table['R'] = ".-."; morse_table['S'] = "..."; morse_table['T'] = "-"; morse_table['U'] = "..-"; morse_table['V'] = "...-"; morse_table['W'] = ".--"; morse_table['X'] = "-..-"; morse_table['Y'] = "-.--"; morse_table['Z'] = "--..";
int T, N;
string tmp;
vector<string> dictionary;
cin >> T;
while (T--) {
morse_string = "";
cin >> morse_string;
morse_string_size = morse_string.size();
cin >> N;
for (int j = 0 ; j < N ; j++) {
cin >> tmp;
dictionary.push_back(encode_morse(tmp));
}
sol = 0;
matches(0, 1, dictionary);
cout << sol;
if (T)
cout << endl << endl;
}
return 0;
}
Now the thing is that I only have 3 seconds of execution time allowed, and my algorithm won't work under this limit of time.
Is this the good way to do this and if so, what am I missing ? Otherwise, can you give some hints about what is a good strategy ?
EDIT :
There can be at most 10 000 words in the dictionary and at most 1000 characters in the morse string.
A solution that combines dynamic programming with a rolling hash should work for this problem.
Let's start with a simple dynamic programming solution. We allocate an vector which we will use to store known counts for prefixes of morse_string. We then iterate through morse_string and at each position we iterate through all words and we look back to see if they can fit into morse_string. If they can fit then we use the dynamic programming vector to determine how many ways we could have build the prefix of morse_string up to i-dictionaryWord.size()
vector<long>dp;
dp.push_back(1);
for (int i=0;i<morse_string.size();i++) {
long count = 0;
for (int j=1;j<dictionary.size();j++) {
if (dictionary[j].size() > i) continue;
if (dictionary[j] == morse_string.substring(i-dictionary[j].size(),i)) {
count += dp[i-dictionary[j].size()];
}
}
dp.push_back(count);
}
result = dp[morse_code.size()]
The problem with this solution is that it is too slow. Let's say that N is the length of morse_string and M is the size of the dictionary and K is the size of the largest word in the dictionary. It will do O(N*M*K) operations. If we assume K=1000 this is about 10^10 operations which is too slow on most machines.
The K cost came from the line dictionary[j] == morse_string.substring(i-dictionary[j].size(),i)
If we could speed up this string matching to constant or log complexity we would be okay. This is where rolling hashing comes in. If you build a rolling hash array of morse_string then the idea is that you can compute the hash of any substring of morse_string in O(1). So you could then do hash(dictionary[j]) == hash(morse_string.substring(i-dictionary[j].size(),i))
This is good but in the presence of imperfect hashing you could have multiple words from the dictionary with the same hash. That would mean that after getting a hash match you would still need to match the strings as well as the hashes. In programming contests, people often assume perfect hashing and skip the string matching. This is often a safe bet especially on a small dictionary. In case it doesn't produce a perfect hashing (which you can check in code) you can always adjust your hash function slightly and maybe the adjusted hash function will produce a perfect hashing.

Dynamic programming: Algorithm to solve the following?

I have recently completed the following interview exercise:
'A robot can be programmed to run "a", "b", "c"... "n" kilometers and it takes ta, tb, tc... tn minutes, respectively. Once it runs to programmed kilometers, it must be turned off for "m" minutes.
After "m" minutes it can again be programmed to run for a further "a", "b", "c"... "n" kilometers.
How would you program this robot to go an exact number of kilometers in the minimum amount of time?'
I thought it was a variation of the unbounded knapsack problem, in which the size would be the number of kilometers and the value, the time needed to complete each stretch. The main difference is that we need to minimise, rather than maximise, the value. So I used the equivalent of the following solution: http://en.wikipedia.org/wiki/Knapsack_problem#Unbounded_knapsack_problem
in which I select the minimum.
Finally, because we need an exact solution (if there is one), over the map constructed by the algorithm for all the different distances, I iterated through each and trough each robot's programmed distance to find the exact distance and minimum time among those.
I think the pause the robot takes between runs is a bit of a red herring and you just need to include it in your calculations, but it does not affect the approach taken.
I am probably wrong, because I failed the test. I don't have any other feedback as to the expected solution.
Edit: maybe I wasn't wrong after all and I failed for different reasons. I just wanted to validate my approach to this problem.
import static com.google.common.collect.Sets.*;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import org.apache.log4j.Logger;
import com.google.common.base.Objects;
import com.google.common.base.Preconditions;
import com.google.common.collect.Lists;
import com.google.common.collect.Maps;
public final class Robot {
static final Logger logger = Logger.getLogger (Robot.class);
private Set<ProgrammedRun> programmedRuns;
private int pause;
private int totalDistance;
private Robot () {
//don't expose default constructor & prevent subclassing
}
private Robot (int[] programmedDistances, int[] timesPerDistance, int pause, int totalDistance) {
this.programmedRuns = newHashSet ();
for (int i = 0; i < programmedDistances.length; i++) {
this.programmedRuns.add (new ProgrammedRun (programmedDistances [i], timesPerDistance [i] ) );
}
this.pause = pause;
this.totalDistance = totalDistance;
}
public static Robot create (int[] programmedDistances, int[] timesPerDistance, int pause, int totalDistance) {
Preconditions.checkArgument (programmedDistances.length == timesPerDistance.length);
Preconditions.checkArgument (pause >= 0);
Preconditions.checkArgument (totalDistance >= 0);
return new Robot (programmedDistances, timesPerDistance, pause, totalDistance);
}
/**
* #returns null if no strategy was found. An empty map if distance is zero. A
* map with the programmed runs as keys and number of time they need to be run
* as value.
*
*/
Map<ProgrammedRun, Integer> calculateOptimalStrategy () {
//for efficiency, consider this case first
if (this.totalDistance == 0) {
return Maps.newHashMap ();
}
//list of solutions for different distances. Element "i" of the list is the best set of runs that cover at least "i" kilometers
List <Map<ProgrammedRun, Integer>> runsForDistances = Lists.newArrayList();
//special case i = 0 -> empty map (no runs needed)
runsForDistances.add (new HashMap<ProgrammedRun, Integer> () );
for (int i = 1; i <= totalDistance; i++) {
Map<ProgrammedRun, Integer> map = new HashMap<ProgrammedRun, Integer> ();
int minimumTime = -1;
for (ProgrammedRun pr : programmedRuns) {
int distance = Math.max (0, i - pr.getDistance ());
int time = getTotalTime (runsForDistances.get (distance) ) + pause + pr.getTime();
if (minimumTime < 0 || time < minimumTime) {
minimumTime = time;
//new minimum found
map = new HashMap<ProgrammedRun, Integer> ();
map.putAll(runsForDistances.get (distance) );
//increase count
Integer num = map.get (pr);
if (num == null) num = Integer.valueOf (1);
else num++;
//update map
map.put (pr, num);
}
}
runsForDistances.add (map );
}
//last step: calculate the combination with exact distance
int minimumTime2 = -1;
int bestIndex = -1;
for (int i = 0; i <= totalDistance; i++) {
if (getTotalDistance (runsForDistances.get (i) ) == this.totalDistance ) {
int time = getTotalTime (runsForDistances.get (i) );
if (time > 0) time -= pause;
if (minimumTime2 < 0 || time < minimumTime2 ) {
minimumTime2 = time;
bestIndex = i;
}
}
}
//if solution found
if (bestIndex != -1) {
return runsForDistances.get (bestIndex);
}
//try all combinations, since none of the existing maps run for the exact distance
List <Map<ProgrammedRun, Integer>> exactRuns = Lists.newArrayList();
for (int i = 0; i <= totalDistance; i++) {
int distance = getTotalDistance (runsForDistances.get (i) );
for (ProgrammedRun pr : programmedRuns) {
//solution found
if (distance + pr.getDistance() == this.totalDistance ) {
Map<ProgrammedRun, Integer> map = new HashMap<ProgrammedRun, Integer> ();
map.putAll (runsForDistances.get (i));
//increase count
Integer num = map.get (pr);
if (num == null) num = Integer.valueOf (1);
else num++;
//update map
map.put (pr, num);
exactRuns.add (map);
}
}
}
if (exactRuns.isEmpty()) return null;
//finally return the map with the best time
minimumTime2 = -1;
Map<ProgrammedRun, Integer> bestMap = null;
for (Map<ProgrammedRun, Integer> m : exactRuns) {
int time = getTotalTime (m);
if (time > 0) time -= pause; //remove last pause
if (minimumTime2 < 0 || time < minimumTime2 ) {
minimumTime2 = time;
bestMap = m;
}
}
return bestMap;
}
private int getTotalTime (Map<ProgrammedRun, Integer> runs) {
int time = 0;
for (Map.Entry<ProgrammedRun, Integer> runEntry : runs.entrySet()) {
time += runEntry.getValue () * runEntry.getKey().getTime ();
//add pauses
time += this.pause * runEntry.getValue ();
}
return time;
}
private int getTotalDistance (Map<ProgrammedRun, Integer> runs) {
int distance = 0;
for (Map.Entry<ProgrammedRun, Integer> runEntry : runs.entrySet()) {
distance += runEntry.getValue() * runEntry.getKey().getDistance ();
}
return distance;
}
class ProgrammedRun {
private int distance;
private int time;
private transient float speed;
ProgrammedRun (int distance, int time) {
this.distance = distance;
this.time = time;
this.speed = (float) distance / time;
}
#Override public String toString () {
return "(distance =" + distance + "; time=" + time + ")";
}
#Override public boolean equals (Object other) {
return other instanceof ProgrammedRun
&& this.distance == ((ProgrammedRun)other).distance
&& this.time == ((ProgrammedRun)other).time;
}
#Override public int hashCode () {
return Objects.hashCode (Integer.valueOf (this.distance), Integer.valueOf (this.time));
}
int getDistance() {
return distance;
}
int getTime() {
return time;
}
float getSpeed() {
return speed;
}
}
}
public class Main {
/* Input variables for the robot */
private static int [] programmedDistances = {1, 2, 3, 5, 10}; //in kilometers
private static int [] timesPerDistance = {10, 5, 3, 2, 1}; //in minutes
private static int pause = 2; //in minutes
private static int totalDistance = 41; //in kilometers
/**
* #param args
*/
public static void main(String[] args) {
Robot r = Robot.create (programmedDistances, timesPerDistance, pause, totalDistance);
Map<ProgrammedRun, Integer> strategy = r.calculateOptimalStrategy ();
if (strategy == null) {
System.out.println ("No strategy that matches the conditions was found");
} else if (strategy.isEmpty ()) {
System.out.println ("No need to run; distance is zero");
} else {
System.out.println ("Strategy found:");
System.out.println (strategy);
}
}
}
Simplifying slightly, let ti be the time (including downtime) that it takes the robot to run distance di. Assume that t1/d1 ≤ … ≤ tn/dn. If t1/d1 is significantly smaller than t2/d2 and d1 and the total distance D to be run are large, then branch and bound likely outperforms dynamic programming. Branch and bound solves the integer programming formulation
minimize ∑i ti xi
subject to
∑i di xi = D
∀i xi &in; N
by using the value of the relaxation where xi can be any nonnegative real as a guide. The latter is easily verified to be at most (t1/d1)D, by setting x1 to D/d1 and ∀i ≠ 1 xi = 0, and at least (t1/d1)D, by setting the sole variable of the dual program to t1/d1. Solving the relaxation is the bound step; every integer solution is a fractional solution, so the best integer solution requires time at least (t1/d1)D.
The branch step takes one integer program and splits it in two whose solutions, taken together, cover the entire solution space of the original. In this case, one piece could have the extra constraint x1 = 0 and the other could have the extra constraint x1 ≥ 1. It might look as though this would create subproblems with side constraints, but in fact, we can just delete the first move, or decrease D by d1 and add the constant t1 to the objective. Another option for branching is to add either the constraint xi = ⌊D/di⌋ or xi ≤ ⌊D/di⌋ - 1, which requires generalizing to upper bounds on the number of repetitions of each move.
The main loop of branch and bound selects one of a collection of subproblems, branches, computes bounds for the two subproblems, and puts them back into the collection. The efficiency over brute force comes from the fact that, when we have a solution with a particular value, every subproblem whose relaxed value is at least that much can be thrown away. Once the collection is emptied this way, we have the optimal solution.
Hybrids of branch and bound and dynamic programming are possible, for example, computing optimal solutions for small D via DP and using those values instead of branching on subproblems that have been solved.
Create array of size m and for 0 to m( m is your distance) do:
a[i] = infinite;
a[0] = 0;
a[i] = min{min{a[i-j] + tj + m for all j in possible kilometers of robot. and j≠i} , ti if i is in possible moves of robot}
a[m] is lowest possible value. Also you can have array like b to save a[i]s selection. Also if a[m] == infinite means it's not possible.
Edit: we can solve it in another way by creating a digraph, again our graph is dependent to m length of path, graph has nodes labeled {0..m}, now start from node 0 connect it to all possible nodes; means if you have a kilometer i you can connect 0 and vi with weight ti, except for node 0->x, for all other nodes you should connect node i->j with weight tj-i + m for j>i and j-i is available in input kilometers. now you should find shortest path from v0 to vn. but this algorithm still is O(nm).
Let G be the desired distance run.
Let n be the longest possible distance run without pause.
Let L = G / n (Integer arithmetic, discard fraction part)
Let R = G mod n (ie. The remainder from the above division)
Make the robot run it's longest distance (ie. n) L times, and then whichever distance (a, b, c, etc.) is greater than R by the least amount (ie the smallest available distance that is equal to or greater than R)
Either I understood the problem wrong, or you're all over thinking it
I am a big believer in showing instead of telling. Here is a program that may be doing what you are looking for. Let me know if it satisfies your question. Simply copy, paste, and run the program. You should of course test with your own data set.
import java.util.Arrays;
public class Speed {
/***
*
* #param distance
* #param sprints ={{A,Ta},{B,Tb},{C,Tc}, ..., {N,Tn}}
*/
public static int getFastestTime(int distance, int[][] sprints){
long[] minTime = new long[distance+1];//distance from 0 to distance
Arrays.fill(minTime,Integer.MAX_VALUE);
minTime[0]=0;//key=distance; value=time
for(int[] speed: sprints)
for(int d=1; d<minTime.length; d++)
if(d>=speed[0] && minTime[d] > minTime[d-speed[0]]+speed[1])
minTime[d]=minTime[d-speed[0]]+speed[1];
return (int)minTime[distance];
}//
public static void main(String... args){
//sprints ={{A,Ta},{B,Tb},{C,Tc}, ..., {N,Tn}}
int[][] sprints={{3,2},{5,3},{7,5}};
int distance = 21;
System.out.println(getFastestTime(distance,sprints));
}
}

Resources