K product array - algorithm

I am working on an algorithms problem. You are given an array numbers of size t, a number number_of_elements and a number multiplication_value. You have to find any set of number_of_elements indexes into the array whose elements multiply to multiplication_value. It is guaranteed that such a set of indexes exists.
This problem looks like 2 sum, but I can't extrapolate that to my case.
I have tried a naive O(n) algorithm, but it fails when the array starts with a bad number. I think there is a way to use recursion here. I guess it is a well-known problem, but I couldn't find the solution.
Example in:
t = 7
number_of_elements = 2
multiplication_value = 27
numbers = [9,1,1,27,3,27,3]
Example out:
1 3
My code ideas:
def return_index_values(numbers,multiplication_value,number_of_elements):
    cur_number = int(multiplication_value)
    list_of_indexes = []
    values = []
    for i in range(len(numbers)):
        if ((cur_number == 1) and (len(values) == number_of_elements)):
            print(values)
            #finishing if everything worked
            break
        else:
            if (cur_number % int(numbers[i]) == 0):
                if(len(values) < number_of_elements):
                    #pushing values if possible
                    values.append(int(numbers[i]))
                    list_of_indexes.append(i)
                    cur_number = int(cur_number / int(numbers[i]))
                    print(cur_number)
                else:
                    pass
            if(len(values) == number_of_elements):
                if mult_check(values,int(multiplication_value)):
                    #mult_check checks if the array's element multiplication gives a value
                    break
                else:
                    #started dealing with bad cases, but it doesn't work properly
                    values.sort()
                    val_popped = values.pop()
                    cur_number = cur_number * val_popped
Bad case for my code
numbers = [9,3,1,27,3,27,3]

Here is one implementation. Not necessarily the best solution but it gives you some sense of how it can be done.
It first sorts the numbers by value while keeping each element's original index, then searches recursively. It assumes all numbers are positive integers.
number_of_elements = 2
multiplication_value = 27
numbers = [9,1,1,27,3,27,3]
def preprocess(numbers, multiplication_value, number_of_elements):
    # Pair each value with its original index, then sort by value.
    l = []
    for i, num in enumerate(numbers):
        l.append((num, i))
    return sorted(l, key = lambda tup: tup[0])
def subroutine(numbers, multiplication_value, number_of_elements, idx_start, result):
    # Check the "nothing left to pick" case first, so that a solution ending
    # on the last array element is not rejected by the bounds check below.
    if number_of_elements == 0:
        return multiplication_value == 1
    if idx_start >= len(numbers):
        return False
    for i in range(idx_start, len(numbers)):
        num = numbers[i][0]
        if num <= multiplication_value:
            if multiplication_value % num == 0:
                idx = numbers[i][1]
                result.append(idx)
                # Integer division keeps the remaining product an int.
                found = subroutine(numbers, multiplication_value // num, number_of_elements - 1, i + 1, result)
                if not found:
                    del result[-1]
                else:
                    return True
        else:
            # The list is sorted, so every later value is too large as well.
            return False
    return False
result = []
processed_numbers = preprocess(numbers, multiplication_value, number_of_elements)
subroutine(processed_numbers, multiplication_value, number_of_elements, 0, result)
print(result)

You can use itertools.combinations() (https://www.geeksforgeeks.org/itertools-combinations-module-python-print-possible-combinations/) to select number_of_elements entries from your list in all possible ways, then check each whether they multiply to the required number.
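A minimal sketch of that brute-force idea, assuming Python 3.8+ for math.prod. The name find_indexes is made up for illustration. It tries index combinations in lexicographic order and returns the first one whose elements multiply to the target, so in the worst case it evaluates every one of the C(t, number_of_elements) combinations.

```python
from itertools import combinations
from math import prod

def find_indexes(numbers, multiplication_value, number_of_elements):
    """Return the first index tuple whose elements multiply to the target, or None."""
    for indexes in combinations(range(len(numbers)), number_of_elements):
        if prod(numbers[i] for i in indexes) == multiplication_value:
            return indexes
    return None

print(find_indexes([9, 1, 1, 27, 3, 27, 3], 27, 2))  # (0, 4)
print(find_indexes([9, 3, 1, 27, 3, 27, 3], 27, 2))  # (0, 1)
```

Since the problem accepts any valid set of indexes, the result may differ from the sample output while still being correct.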

Related

Dynamic Programming for shortest subsequence that is not a subsequence of two strings

Problem: Given two sequences s1 and s2 of '0' and '1', return the shortest sequence that is a subsequence of neither of the two sequences.
E.g. s1 = '011', s2 = '1101'. Return s_out = '00' as one possible result.
Note that substring and subsequence are different: in a substring the characters are contiguous, but in a subsequence that need not be the case.
My question: How is dynamic programming applied in the "Solution Provided" below and what is its time complexity?
My attempt involves computing all the subsequences of each string, giving sub1 and sub2. Append a '1' or a '0' to each element of sub1 and determine whether that new subsequence is not present in sub2. Find the one of minimum length. Here is my code:
My Solution
def get_subsequences(seq, index, subs, result):
    if index == len(seq):
        if subs:
            result.add(''.join(subs))
    else:
        get_subsequences(seq, index + 1, subs, result)
        get_subsequences(seq, index + 1, subs + [seq[index]], result)
def get_bad_subseq(subseq):
    min_sub = ''
    length = float('inf')
    for sub in subseq:
        for char in ['0', '1']:
            if len(sub) + 1 < length and sub + char not in subseq:
                length = len(sub) + 1
                min_sub = sub + char
    return min_sub
Solution Provided (not mine)
How does it work and its time complexity?
The solution below looks similar to: http://kyopro.hateblo.jp/entry/2018/12/11/100507
def set_nxt(s, nxt):
    n = len(s)
    idx_0 = n + 1
    idx_1 = n + 1
    for i in range(n, 0, -1):
        nxt[i][0] = idx_0
        nxt[i][1] = idx_1
        if s[i-1] == '0':
            idx_0 = i
        else:
            idx_1 = i
    nxt[0][0] = idx_0
    nxt[0][1] = idx_1
def get_shortest(seq1, seq2):
    len_seq1 = len(seq1)
    len_seq2 = len(seq2)
    nxt_seq1 = [[len_seq1 + 1 for _ in range(2)] for _ in range(len_seq1 + 2)]
    nxt_seq2 = [[len_seq2 + 1 for _ in range(2)] for _ in range(len_seq2 + 2)]
    set_nxt(seq1, nxt_seq1)
    set_nxt(seq2, nxt_seq2)
    INF = 2 * max(len_seq1, len_seq2)
    dp = [[INF for _ in range(len_seq2 + 2)] for _ in range(len_seq1 + 2)]
    dp[len_seq1 + 1][len_seq2 + 1] = 0
    for i in range( len_seq1 + 1, -1, -1):
        for j in range(len_seq2 + 1, -1, -1):
            for k in range(2):
                if dp[nxt_seq1[i][k]][nxt_seq2[j][k]] < INF:
                    dp[i][j] = min(dp[i][j], dp[nxt_seq1[i][k]][nxt_seq2[j][k]] + 1)
    res = ""
    i = 0
    j = 0
    while i <= len_seq1 or j <= len_seq2:
        for k in range(2):
            if (dp[i][j] == dp[nxt_seq1[i][k]][nxt_seq2[j][k]] + 1):
                i = nxt_seq1[i][k]
                j = nxt_seq2[j][k]
                res += str(k)
                break
    return res
I am not going to work it through in detail, but the idea of this solution is to create a 2-D array covering every combination of a position in one sequence and a position in the other. It then populates this array with the length of the shortest sequence that still has to be appended from that pair of positions to run off the end of both sequences.
Just constructing that array takes space (and therefore time) O(len(seq1) * len(seq2)). Filling it in takes a similar time, because each cell looks at only two successors.
This is done with a lot of index bookkeeping that I don't want to track.
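To make the table idea concrete, here is a rough sketch of the same state space explored with a breadth-first search instead of a filled-in dp array. This is my own restatement, not the provided solution: the names next_occurrence and shortest_non_subsequence are made up, and it assumes the inputs contain only '0' and '1'. A state (i, j) means "i characters of s1 and j characters of s2 have been consumed", and the state (len(s1) + 1, len(s2) + 1) means the sequence built so far fits in neither string.

```python
from collections import deque

def next_occurrence(s):
    """nxt[i][c] is the 1-based position of the first bit c in s[i:], or len(s) + 1 if none."""
    n = len(s)
    nxt = [[n + 1, n + 1] for _ in range(n + 2)]
    for i in range(n - 1, -1, -1):
        nxt[i] = nxt[i + 1][:]
        nxt[i][int(s[i])] = i + 1
    return nxt

def shortest_non_subsequence(s1, s2):
    nxt1, nxt2 = next_occurrence(s1), next_occurrence(s2)
    start, goal = (0, 0), (len(s1) + 1, len(s2) + 1)
    parent = {start: None}          # state -> (previous state, bit appended)
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            break
        i, j = state
        for bit in (0, 1):
            following = (nxt1[i][bit], nxt2[j][bit])
            if following not in parent:
                parent[following] = (state, str(bit))
                queue.append(following)
    # Walk the parent links back from the goal to rebuild the answer.
    bits = []
    state = goal
    while parent[state] is not None:
        state, bit = parent[state]
        bits.append(bit)
    return ''.join(reversed(bits))

print(shortest_non_subsequence('011', '1101'))  # 00
```

There are at most (len(s1) + 2) * (len(s2) + 2) states and each has two outgoing moves, which gives the same O(len(seq1) * len(seq2)) bound as the table.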
I have another approach that is clearer to me that usually takes less space and less time, but in the worst case could be as bad. But I have not coded it up.
UPDATE:
Here it is, all coded up. With poor choices of variable names. Sorry about that.
# A trivial data class to hold a linked list for the candidate subsequences
# along with information about how they match in the two sequences.
import collections
SubSeqLinkedList = collections.namedtuple('SubSeqLinkedList', 'value pos1 pos2 tail')
# This finds the position after the first match. No match is treated as off the end of seq.
def find_position_after_first_match (seq, start, value):
    while start < len(seq) and seq[start] != value:
        start += 1
    return start+1
def make_longer_subsequence (subseq, value, seq1, seq2):
    pos1 = find_position_after_first_match(seq1, subseq.pos1, value)
    pos2 = find_position_after_first_match(seq2, subseq.pos2, value)
    gotcha = SubSeqLinkedList(value=value, pos1=pos1, pos2=pos2, tail=subseq)
    return gotcha
def minimal_nonsubseq (seq1, seq2):
    # We start with one candidate for how to start the subsequence
    # Namely an empty subsequence. Length 0, matches before the first character.
    candidates = [SubSeqLinkedList(value=None, pos1=0, pos2=0, tail=None)]
    # Now we try to replace candidates with longer maximal ones - nothing of
    # the same length is better at going farther in both sequences.
    # We keep this list ordered by descending how far it goes in sequence1.
    while candidates[0].pos1 <= len(seq1) or candidates[0].pos2 <= len(seq2):
        new_candidates = []
        for candidate in candidates:
            candidate1 = make_longer_subsequence(candidate, '0', seq1, seq2)
            candidate2 = make_longer_subsequence(candidate, '1', seq1, seq2)
            if candidate1.pos1 < candidate2.pos1:
                # swap them.
                candidate1, candidate2 = candidate2, candidate1
            for c in (candidate1, candidate2):
                if 0 == len(new_candidates):
                    new_candidates.append(c)
                elif new_candidates[-1].pos1 <= c.pos1 and new_candidates[-1].pos2 <= c.pos2:
                    # We have found strictly better.
                    new_candidates[-1] = c
                elif new_candidates[-1].pos2 < c.pos2:
                    # Note, by construction we cannot be shorter in pos1.
                    new_candidates.append(c)
        # And now we throw away the ones we don't want.
        # Those that are on their way to a solution will be captured in the linked list.
        candidates = new_candidates
    answer = candidates[0]
    r_seq = [] # This winds up reversed.
    while answer.value is not None:
        r_seq.append(answer.value)
        answer = answer.tail
    return ''.join(reversed(r_seq))
print(minimal_nonsubseq('011', '1101'))

QuickSort - Median Three

I am working on the QuickSort algorithm with median-of-three pivot selection.
I have no problem with using the first or the last element as the pivot. But when it comes to median-of-three, I am slightly confused. I hope someone can help me with this.
I would appreciate it if someone could provide some pseudocode.
My understanding is that I get the middle index as (start + end) / 2, then swap the median pivot value into the first position; after that it should work like the normal quick sort (partitioning and sorting).
Somehow, I couldn't get it to work. Please help!
#Array Swap function
def swap(A,i,k):
    temp=A[i]
    A[i]=A[k]
    A[k]=temp
# Get Middle pivot function
def middle(lista):
    if len(lista) % 2 == 0:
        result= len(lista) // 2 - 1
    else:
        result = len(lista) // 2
    return result
def median(lista):
    if len(lista) % 2 == 0:
        return sorted(lista)[len(lista) // 2 - 1]
    else:
        return sorted(lista)[len(lista) // 2]
# Create partition function
def partition(A,start,end):
    m = middle(A[start:end+1])
    medianThree = [ A[start], A[m], A[end] ]
    if A[start] == median(medianThree):
        pivot_pos = start
    elif A[m] == median(medianThree):
        tempList = A[start:end+1]
        pivot_pos = middle(A[start:end+1])
        swap(A,start,pivot_pos+start)
    elif A[end] == median(medianThree):
        pivot_pos = end
    #pivot = A[pivot_pos]
    pivot = pivot_pos
    # swap(A,start,end) // This line of code is to switch the first and last element pivot
    swap(A,pivot,end)
    p = A[pivot]
    i = pivot + 1
    for j in range(pivot+1,end+1):
        if A[j] < p:
            swap(A,i,j)
            i+=1
    swap(A,start,i-1)
    return i-1
count = 0
#Quick sort algorithm
def quickSort(A,start,end):
    global tot_comparisons
    if start < end:
        # This to create the partition based on the
        pivot_pos = partition(A,start,end)
        tot_comparisons += len(A[start:pivot_pos-1]) + len(A[pivot_pos+1:end])
        # This to sort the the left partition
        quickSort(A,start,pivot_pos -1)
        #This to sort the right partition
        quickSort(A,pivot_pos+1,end)
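Since the question asks for pseudocode, here is a minimal Python sketch of one common way to do median-of-three, not a fix of the code above line by line. The names median_of_three and quick_sort are my own. It assumes the middle index is (lo + hi) // 2 taken on absolute positions (not on a slice), moves the chosen pivot to the front, then partitions exactly as in the first-element variant. Counting hi - lo comparisons per partition call is also an assumption about what is meant to be counted.

```python
def median_of_three(a, lo, hi):
    """Return the index (lo, mid or hi) that holds the median of the three values."""
    mid = (lo + hi) // 2
    candidates = sorted([(a[lo], lo), (a[mid], mid), (a[hi], hi)])
    return candidates[1][1]

def partition(a, lo, hi):
    p = median_of_three(a, lo, hi)
    a[lo], a[p] = a[p], a[lo]          # move the pivot to the front
    pivot = a[lo]
    i = lo + 1                         # a[lo+1:i] holds the elements < pivot
    for j in range(lo + 1, hi + 1):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[lo], a[i - 1] = a[i - 1], a[lo]  # put the pivot in its final place
    return i - 1

def quick_sort(a, lo=0, hi=None):
    """Sort a in place and return the number of pivot comparisons made."""
    if hi is None:
        hi = len(a) - 1
    comparisons = 0
    if lo < hi:
        p = partition(a, lo, hi)
        comparisons = hi - lo          # each other element is compared once
        comparisons += quick_sort(a, lo, p - 1)
        comparisons += quick_sort(a, p + 1, hi)
    return comparisons

data = [8, 2, 4, 5, 7, 1]
quick_sort(data)
print(data)  # [1, 2, 4, 5, 7, 8]
```

Returning the count from the recursion avoids the global counter, which in the code above is declared as count but updated as tot_comparisons.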

Arithmetic/Geometric series

The code below returns "Arithmetic", "Geometric" if the input array is an arithmetic and geometric series respectively and -1 if it is neither.
Although the code works fine, when I change
if s = arr.length - 1
to
if s == arr.length - 1
in the while loop, the code is not working properly anymore.
I do not understand why. Shouldn't == work instead of =?
def ArithGeo(arr)
  # code goes here
  len = arr.length
  difference = arr[len-1] - arr[len-2]
  ratio = arr[len-1]/arr[len-2]
  k = 0
  s = k + 1
  while (arr[s] - arr[k]) == difference && s < arr.length
    if s = arr.length - 1
      return "Arithmetic"
    end
    k += 1
  end
  k = 0
  while arr[s] / arr[k] == ratio && s < arr.length
    if s = arr.length - 1
      return "Geometric"
    end
    k += 1
  end
  return -1
end
You're never changing the value of s which I think you want to do. You should do that at the point that you increment k
k += 1
s = k + 1
Also, at the point where you reinitialize k for the geometric test, you want to reset s as well...
k = 0
s = k + 1
You could also get rid of the variable s completely and make it a method... add these three lines at the top of the code
def s(k)
k + 1
end
And remove all the lines where you assign a value to s and use s(k)... s(k) will be a method that always returns the next higher value to k
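The fix described above boils down to comparing every adjacent pair (k, k + 1) against the expected difference or ratio. As an illustration only, here is that logic sketched in Python rather than Ruby. The name arith_geo is made up, and the sketch assumes at least two elements and a non-zero first element. The geometric test cross-multiplies instead of dividing, so integer division cannot hide a mismatch.

```python
def arith_geo(arr):
    """Return "Arithmetic", "Geometric", or -1 for a list of at least two numbers."""
    pairs = list(zip(arr, arr[1:]))    # every adjacent pair (arr[k], arr[k + 1])
    difference = arr[1] - arr[0]
    if all(b - a == difference for a, b in pairs):
        return "Arithmetic"
    # b / a == arr[1] / arr[0] rewritten without division
    if arr[0] != 0 and all(b * arr[0] == a * arr[1] for a, b in pairs):
        return "Geometric"
    return -1

print(arith_geo([2, 4, 6, 8]))    # Arithmetic
print(arith_geo([2, 6, 18, 54]))  # Geometric
print(arith_geo([1, 2, 4, 7]))    # -1
```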
The difference between those two statements is that the variable s is assigned in the first statement but not in the second. The first if statement thus has the side effect of setting s to arr.length - 1:
if s = arr.length - 1 # s => arr.length - 1
if s == arr.length - 1 # s is left unchanged
Because the if statement is inside a while loop that uses s in its condition, changing the statement changes the behavior of the program.
With ==, the statement checks whether the two values are equal. With just =, you are only assigning a value to s, and in Ruby the result of assigning an integer is always truthy, so the if branch always runs.
Comparing two values for equality is different from assigning to a variable: the comparison can be false, while the assignment here is always true.

Fastest solution for all possible combinations, taking k elements out of n possible with k>2 and n large

I am using MATLAB to find all of the possible combinations of k elements out of n possible elements. I stumbled across this question, but unfortunately it does not solve my problem. Of course, neither does nchoosek as my n is around 100.
Truth is, I don't need all of the possible combinations at the same time. I will explain what I need, as there might be an easier way to achieve the desired result. I have a matrix M of 100 rows and 25 columns.
Think of a submatrix of M as a matrix formed by ALL columns of M and only a subset of the rows. I have a function f that can be applied to any matrix which gives a result of either -1 or 1. For example, you can think of the function as sign(det(A)) where A is any matrix (the exact function is irrelevant for this part of the question).
I want to know the largest number of rows of M for which the submatrix A formed by these rows is such that f(A) = 1. Notice that if f(M) = 1, I am done. However, if this is not the case then I need to start combining rows, starting with all combinations of 99 rows, then taking the ones with 98 rows, and so on.
Up to this point, my implementation relied on nchoosek, which worked when M had only a few rows. However, now that I am working with a relatively bigger dataset, things get stuck. Can any of you think of a way to implement this without having to use the above function? Any help would be greatly appreciated.
Here is my minimal working example, it works for small obs_tot but fails when I try to use bigger numbers:
value = -1; obs_tot = 100; n_rows = 25;
mat = randi(obs_tot,n_rows);
while value == -1
    possibles = nchoosek(1:obs_tot,i);
    [num_tries,num_obs] = size(possibles);
    num_try = 1;
    while value == 0 && num_try <= num_tries
        check = mat(possibles(num_try,:),:);
        value = sign(det(check));
        num_try = num_try + 1;
    end
    i = i - 1;
end
obs_used = possibles(num_try-1,:)';
Preamble
As you noticed yourself in your question, it would be nice if nchoosek did not return all possible combinations at once, but rather enumerated them one by one, so that memory does not explode when n becomes large. So something like:
enumerator = CombinaisonEnumerator(k, n);
while(enumerator.MoveNext())
    currentCombination = enumerator.Current;
    ...
end
Here is an implementation of such an enumerator as a Matlab class. It is based on the classic IEnumerator<T> interface in C# / .NET and mimics the subfunction combs in nchoosek (the unrolled way):
%
% PURPOSE:
%
% Enumerates all combinations of length 'k' in a set of length 'n'.
%
% USAGE:
%
% enumerator = CombinaisonEnumerator(k, n);
% while(enumerator.MoveNext())
% currentCombination = enumerator.Current;
% ...
% end
%
%% ---
classdef CombinaisonEnumerator < handle
    properties (Dependent) % NB: Matlab R2013b bug => Dependent must be declared before their get/set !
        Current; % Gets the current element.
    end
    methods
        function [enumerator] = CombinaisonEnumerator(k, n)
            % Creates a new combinations enumerator.
            if (~isscalar(n) || (n < 1) || (~isreal(n)) || (n ~= round(n))), error('`n` must be a scalar positive integer.'); end
            if (~isscalar(k) || (k < 0) || (~isreal(k)) || (k ~= round(k))), error('`k` must be a scalar positive or null integer.'); end
            if (k > n), error('`k` must be less or equal than `n`'); end
            enumerator.k = k;
            enumerator.n = n;
            enumerator.v = 1:n;
            enumerator.Reset();
        end
        function [b] = MoveNext(enumerator)
            % Advances the enumerator to the next element of the collection.
            if (~enumerator.isOkNext),
                b = false; return;
            end
            if (enumerator.isInVoid)
                if (enumerator.k == enumerator.n),
                    enumerator.isInVoid = false;
                    enumerator.current = enumerator.v;
                elseif (enumerator.k == 1)
                    enumerator.isInVoid = false;
                    enumerator.index = 1;
                    enumerator.current = enumerator.v(enumerator.index);
                else
                    enumerator.isInVoid = false;
                    enumerator.index = 1;
                    enumerator.recursion = CombinaisonEnumerator(enumerator.k - 1, enumerator.n - enumerator.index);
                    enumerator.recursion.v = enumerator.v((enumerator.index + 1):end); % adapt v (todo: should use private constructor)
                    enumerator.recursion.MoveNext();
                    enumerator.current = [enumerator.v(enumerator.index) enumerator.recursion.Current];
                end
            else
                if (enumerator.k == enumerator.n),
                    enumerator.isInVoid = true;
                    enumerator.isOkNext = false;
                elseif (enumerator.k == 1)
                    enumerator.index = enumerator.index + 1;
                    if (enumerator.index <= enumerator.n)
                        enumerator.current = enumerator.v(enumerator.index);
                    else
                        enumerator.isInVoid = true;
                        enumerator.isOkNext = false;
                    end
                else
                    if (enumerator.recursion.MoveNext())
                        enumerator.current = [enumerator.v(enumerator.index) enumerator.recursion.Current];
                    else
                        enumerator.index = enumerator.index + 1;
                        if (enumerator.index <= (enumerator.n - enumerator.k + 1))
                            enumerator.recursion = CombinaisonEnumerator(enumerator.k - 1, enumerator.n - enumerator.index);
                            enumerator.recursion.v = enumerator.v((enumerator.index + 1):end); % adapt v (todo: should use private constructor)
                            enumerator.recursion.MoveNext();
                            enumerator.current = [enumerator.v(enumerator.index) enumerator.recursion.Current];
                        else
                            enumerator.isInVoid = true;
                            enumerator.isOkNext = false;
                        end
                    end
                end
            end
            b = enumerator.isOkNext;
        end
        function [] = Reset(enumerator)
            % Sets the enumerator to its initial position, which is before the first element.
            enumerator.isInVoid = true;
            enumerator.isOkNext = (enumerator.k > 0);
        end
        function [c] = get.Current(enumerator)
            if (enumerator.isInVoid), error('Enumerator is positioned (before/after) the (first/last) element.'); end
            c = enumerator.current;
        end
    end
    properties (GetAccess=private, SetAccess=private)
        k = [];
        n = [];
        v = [];
        index = [];
        recursion = [];
        current = [];
        isOkNext = false;
        isInVoid = true;
    end
end
We can check that the implementation works from the command window like this:
>> e = CombinaisonEnumerator(3, 6);
>> while(e.MoveNext()), fprintf(1, '%s\n', num2str(e.Current)); end
Which returns as expected the following n!/(k!*(n-k)!) combinations:
1 2 3
1 2 4
1 2 5
1 2 6
1 3 4
1 3 5
1 3 6
1 4 5
1 4 6
1 5 6
2 3 4
2 3 5
2 3 6
2 4 5
2 4 6
2 5 6
3 4 5
3 4 6
3 5 6
4 5 6
Implementation of this enumerator may be further optimized for speed, or by enumerating combinations in an order more appropriate for your case (e.g., test some combinations first rather than others) ... Well, at least it works! :)
Problem solving
Now solving your problem is really easy:
n = 100;
m = 25;
matrix = rand(n, m);
k = n;
cont = true;
while(cont && (k >= 1))
    e = CombinaisonEnumerator(k, n); % same class name as defined above
    while(cont && e.MoveNext())
        cont = f(matrix(e.Current(:), :)) ~= 1;
    end
    if (cont), k = k - 1; end
end
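For comparison, the same search loop can be sketched in Python, where itertools.combinations already yields one combination at a time instead of building the full list. This is only an illustration of the enumeration pattern, not a replacement for the MATLAB class. largest_row_subset is a made-up name, and f stands for whatever -1/1 function you have. The demo f below (parity of the sum of all entries) is purely hypothetical.

```python
from itertools import combinations

def largest_row_subset(matrix, f):
    """Return the first largest tuple of row indexes whose submatrix gives f == 1, or None."""
    n = len(matrix)
    for k in range(n, 0, -1):                      # try the biggest subsets first
        for rows in combinations(range(n), k):     # lazy: one combination at a time
            if f([matrix[r] for r in rows]) == 1:
                return rows
    return None

def demo_f(submatrix):
    # Hypothetical stand-in for f: 1 if the sum of all entries is even, else -1.
    total = sum(sum(row) for row in submatrix)
    return 1 if total % 2 == 0 else -1

print(largest_row_subset([[1], [2], [4]], demo_f))  # (1, 2)
```

Lazy enumeration only fixes the memory problem. The number of combinations for k near n / 2 is still astronomically large when n is around 100, so the search is only practical if a solution turns up for k close to n.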

naive string matching algorithm

The following is a very famous question on naive string matching. Can someone please explain the answer to me?
Suppose that all characters in the pattern P are different. Show how to accelerate NAIVE-STRING MATCHER to run in time O(n) on an n-character text T.
The basic idea:
Iterate through the input and the pattern at the same time, comparing their characters to each other
Whenever you get a non-matching character between the two, you can just reset the pattern position and keep the input position as is
This works because the pattern characters are all different, which means that whenever you have a partial match, there can be no other match overlapping with that, so we can just start looking from the end of the partial match.
Here's some pseudo-code that shouldn't be too difficult to understand:
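input[n]
pattern[k]
pPos = 0
iPos = 0
while iPos < n
    if pattern[pPos] == input[iPos]
        pPos++
        iPos++
        // check right after advancing, so a match ending at the last
        // input character is reported and pattern[pPos] stays in range
        if pPos == k
            FOUND!
            pPos = 0
    else
        // if pPos is already 0, we need to increase iPos,
        // otherwise we just keep comparing the same characters
        if pPos == 0
            iPos++
        pPos = 0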
It's easy to see that iPos increases at least every second loop, thus there can be at most 2n loop runs, making the running time O(n).
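The pseudo-code above translates fairly directly into Python. This is a sketch under the stated assumption that all pattern characters are pairwise distinct (it can miss matches otherwise). The name find_all_distinct is made up, and it collects every shift instead of stopping at the first one.

```python
def find_all_distinct(text, pattern):
    """Return all shifts at which pattern occurs in text.

    Only correct when the characters of pattern are pairwise distinct.
    """
    n, k = len(text), len(pattern)
    if k == 0:
        return []
    matches = []
    i = j = 0                      # i indexes text, j indexes pattern
    while i < n:
        if text[i] == pattern[j]:
            i += 1
            j += 1
            if j == k:             # whole pattern matched, ending at i - 1
                matches.append(i - k)
                j = 0
        elif j == 0:
            i += 1                 # mismatch on the first pattern character
        else:
            j = 0                  # retry text[i] against pattern[0]
    return matches

print(find_all_distinct("abcabdabc", "abc"))  # [0, 6]
```

Every iteration either advances i or resets a non-zero j to 0, and j can only become non-zero by advancing i, so the loop runs at most 2n times.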
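When T[i] and P[j] mismatch in NAIVE-STRING-MATCHER, we can skip all characters before T[i] and begin a new match at T[i] with P[1] (or at T[i + 1] if the mismatch was already at P[1]). No shift in between can match, because the text characters matched so far are all different from P[1].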
NAIVE-STRING-MATCHER(T, P)
n ← length[T]
m ← length[P]
for s ← 0 to n - m
    do if P[1 . . m] = T[s + 1 . . s + m]
        then print "Pattern occurs with shift" s
Naive string search algorithm implementations in Python 2.7:
https://gist.github.com/heyhuyen/4341692
In the middle of implementing Boyer-Moore's string search algorithm, I decided to play with my original naive search algorithm. It's implemented as an instance method that takes a string to be searched. The object has an attribute 'pattern' which is the pattern to match.
1) Here is the original version of the search method, using a double for-loop.
Makes calls to range and len
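def search(self, string):
    # Stop at the last shift where the whole pattern still fits,
    # otherwise string[i+j] can run past the end of the string.
    for i in range(len(string) - len(self.pattern) + 1):
        for j in range(len(self.pattern)):
            if string[i+j] != self.pattern[j]:
                break
            elif j == len(self.pattern) - 1:
                return i
    return -1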
2) Here is the second version, using a double while-loop instead.
Slightly faster, not making calls to range
def search(self, string):
    i = 0
    # Only try shifts where the whole pattern still fits in the string.
    while i <= len(string) - len(self.pattern):
        j = 0
        while j < len(self.pattern) and self.pattern[j] == string[i+j]:
            j += 1
        if j == len(self.pattern):
            return i
        i += 1
    return -1
3) Here is the original, replacing range with xrange.
Faster than both of the previous two.
def search(self, string):
    # xrange is Python 2 only; the bound keeps string[i+j] in range.
    for i in xrange(len(string) - len(self.pattern) + 1):
        for j in xrange(len(self.pattern)):
            if string[i+j] != self.pattern[j]:
                break
            elif j == len(self.pattern) - 1:
                return i
    return -1
4) Storing values in local variables = win! With the double while loop, this is the fastest.
def search(self, string):
    len_pat = len(self.pattern)
    len_str = len(string)
    i = 0
    # Only try shifts where the whole pattern still fits in the string.
    while i <= len_str - len_pat:
        j = 0
        while j < len_pat and self.pattern[j] == string[i+j]:
            j += 1
        if j == len_pat:
            return i
        i += 1
    return -1

Resources