Find correct bracket sequence, build from two parts of original bracket sequence - algorithm

Consider bracket sequences that consist only of '(' and ')'. Let S be any bracket sequence (not necessarily correct) with n items: S[1:n].
I need to write an algorithm that finds a number i (from 1 to n, if one exists) such that S[(i+1):n]+S[1:i] is a correct bracket sequence. The algorithm must use O(n) operations.
It seems to me that I should use a deque: pop the last element and push it to the front until a correct bracket sequence appears. But I can't find an efficient way to check whether the new sequence is correct. If I use a counter that increases each time '(' appears and decreases otherwise (note that a correct sequence must start with '('), then each check takes n operations, one check is needed per rotation, and the algorithm as a whole takes O(n^2) operations. I need linear time.
Should I really use a deque, or is there another way to check the correctness of the sequence in the deque?

You can do this with a single scan from left to right:
Keep track of the nesting depth of the parentheses. So for example, after "((" it is 2, and after "(()))" it is -1.
Keep track of the position at which this depth hit its minimum value during the scan. This will be the potential splitting point (after the bracket that caused the minimum depth).
At the end of the scan verify that the depth has reached 0. If not, the input is not valid, otherwise return the splitting point.
As a consequence, every input that has an equal number of opening and closing parentheses has a solution.
Here is an implementation of this algorithm in an interactive JavaScript snippet. As you enter the input, the output is updated. It displays the rearranged parts of the input, separated by "...":
function findSplit(s) {
    let start = 0;
    let depth = 0;
    let minDepth = 0;
    for (let i = 0; i < s.length; i++) {
        if (s[i] == ")") {
            depth--;
            if (depth < minDepth) {
                minDepth = depth;
                // restart
                start = i+1;
            }
        } else {
            depth++;
        }
    }
    if (depth != 0) return -1; // not valid
    return start;
}
// I/O handling
let input = document.querySelector("input");
let output = document.querySelector("pre");
function refresh() {
    let s = input.value;
    let start = findSplit(s);
    if (start == -1) output.textContent = "Invalid";
    else output.textContent = s.slice(start) + "..." + s.slice(0, start);
}
input.oninput = refresh;
refresh();
<input value="())()(">
<pre></pre>
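As a sanity check on the idea above, here is a minimal Python sketch. Python is not used in the original answer, and the names find_split and is_correct are made up for illustration. It mirrors findSplit and then confirms, with the usual counter check, that the rotated string is a correct sequence.

```python
def find_split(s):
    """Return the index at which to rotate s, or -1 if no rotation works."""
    depth = 0
    min_depth = 0
    start = 0
    for i, ch in enumerate(s):
        if ch == ")":
            depth -= 1
            if depth < min_depth:
                min_depth = depth
                start = i + 1  # rotate just after the bracket that set the new minimum
        else:
            depth += 1
    return start if depth == 0 else -1


def is_correct(s):
    """Standard O(n) check: depth never drops below 0 and ends at 0."""
    depth = 0
    for ch in s:
        depth += 1 if ch == "(" else -1
        if depth < 0:
            return False
    return depth == 0


s = "())()("
start = find_split(s)
rotated = s[start:] + s[:start]
print(start, rotated, is_correct(rotated))  # 3 ()(()) True
```

The rotation works because everything after the first minimum never dips below that minimum again, and the part moved to the end starts from a depth that is high enough to absorb its own dips.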

Related

How can I stop recursion after I succeed my insertion?

The problem definition is that
Given an additional digit 0 ≤ x ≤ 9, write a function that returns
the integer that results from inserting x in n, such that its digits
also appear in ascending order from left to right. For instance, if n
= 24667 and x = 5, the function should return 245667.
My code
// the divisions are integer division, no floating point
int x(int n, int insertValue)
{
    if (n == 0) return 0;
    int val = x(n/10, insertValue);
    if((n%10) > insertValue)
    {
        int q = insertValue * 10 + (n%10);
        return val * 100 + q;
    }
    return val*10 + (n%10);
}
For example, x(2245,3) outputs 223435. But the insertion was already done while processing 224; it shouldn't keep adding the value to be inserted, i.e. the 3 shouldn't appear before the 5.
I could add a boolean flag at each recursion step that detects, by repeatedly taking modulo 10 and dividing by 10 down to the single-digit case, whether the value has already been inserted. If not, enter the if block; otherwise skip it. But that sounds too silly.
When you divide recursively, you are actually going right to left, not left to right, so you should check whether a digit is smaller than the one being inserted, not greater (unless you always let the recursion reach the n==0 condition and make your comparisons on the way back out, but that would be inefficient).
The second thing is that you do not break the recursion once you have inserted the digit (which you were aware of, as I can now see from the question title), so it gets inserted repeatedly before every digit that is larger than insertValue. As to how to do it: you were already stopping the recursion with the if(n==0) condition, i.e. if n==0 the function stops calling itself (it returns immediately). When inserting the digit, the difference is that you need to return a result built from the original value (n) instead of passing it further down.
At this point it works for your example, but there is one more edge case to consider if you want the function to work properly. When you have nothing left to divide [if(n==0)], you need to insert your digit anyway (return insertValue) so that it does not get lost on the left edge, as in the x(2245,1) call.
Improvements for brevity:
% has the same precedence as * and /, so brackets around it are not needed here.
I removed the val variable, as it was used only once and its calculation was not always necessary.
Here's the working code:
int x(int n, int insertValue){
    if(n == 0) return insertValue;
    //if insertion point was found, use original value (n)
    if(n%10 <= insertValue)
        return n*10 + insertValue;
    //if not there yet, keep calling x()
    return x(n/10, insertValue)*10 + n%10;
}
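To try the three cases discussed above quickly, here is a direct Python port of the working function. It is only a sketch: the name insert_digit is made up for illustration, and Python's // stands in for C integer division.

```python
def insert_digit(n, x):
    """Insert digit x into n (digits ascending) so the result stays ascending."""
    if n == 0:
        return x  # ran out of digits: x becomes the leftmost digit
    if n % 10 <= x:
        return n * 10 + x  # insertion point found; stop recursing
    return insert_digit(n // 10, x) * 10 + n % 10


print(insert_digit(24667, 5))  # 245667
print(insert_digit(2245, 3))   # 22345
print(insert_digit(2245, 1))   # 12245
```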

Given a circular linked list, find a suitable head such that the running sum of the values is never negative

I have a linked list with integers. For example:
-2 → 2
↑ ↓
1 ← 5
How do I find a suitable starting point such that the running sum always stays non-negative?
For example:
If I pick -2 as starting point, my sum at the first node will be -2. So that is an invalid selection.
If I pick 2 as the starting point, the running sums are 2, 7, 8, 6 which are all positive numbers. This is a valid selection.
The brute force algorithm is to pick every node as head and do the calculation and return the node which satisfies the condition, but that is O(𝑛²).
Can this be done with O(𝑛) time complexity or better?
Let's say you start doing a running sum at some head node, and you eventually reach a point where the sum goes negative.
Obviously, you know that the head node you started at is invalid as an answer to the question.
But you also know that all of the nodes contributing to that sum are invalid. You've already checked all the prefixes of that sublist, and you know that all the prefixes have nonnegative sums, so removing any prefix from the total sum can only make it smaller. And, of course, the last node you added must be negative, so you can't start there either.
This leads to a simple algorithm:
Start a cumulative sum at the head node.
If it becomes negative, discard all the nodes you've looked at and start at the next one.
Stop when the sum includes the whole list (success), or when you've discarded all the nodes in the list (no answer exists).
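The three steps above can be sketched in Python as follows. This is an illustration under assumptions, not code from the answer: the circular list is modelled as a plain Python list indexed modulo its length, and the name find_start is made up.

```python
def find_start(values):
    """Return an index from which every running sum around the circle is >= 0,
    or None if no such index exists."""
    n = len(values)
    start = 0
    while start < n:
        total = 0
        for offset in range(n):
            total += values[(start + offset) % n]
            if total < 0:
                # Every node from start up to this one is ruled out.
                start += offset + 1
                break
        else:
            return start  # went all the way around without going negative
    return None


print(find_start([-2, 2, 5, 1]))  # 1, i.e. the node holding 2
```

Each failed attempt advances start by exactly the number of nodes it examined, so the failed attempts together read fewer than 2n values, and the successful one reads n.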
The idea is to use a window, i.e. two node references, where one runs ahead of the other, and the sum of the nodes within that window is kept in sync with any (forward) movement of either reference. As long as the sum is non-negative, enlarge the window by adding the front node's value and moving the front reference ahead. When the sum turns negative, collapse the window, as all suffix sums in that window will now be negative. The window becomes empty, with back and front referencing the same node, and the running sum (necessarily) becomes zero, but then the front reference will move ahead again, widening the window.
The algorithm ends when all nodes are in the window, i.e. when the front node reference meets the back node reference. We should also end the algorithm when the back reference hits or overtakes the list's head node, since that would mean we looked at all possibilities, but found no solution.
Here is an implementation of that algorithm in JavaScript. It first defines a class for Node and one for CircularList. The latter has a method getStartingNode which returns the node from where the sum can start and can accumulate without getting negative:
class Node {
    constructor(value, next=null) {
        this.value = value;
        this.next = next;
    }
}
class CircularList {
    constructor(values) {
        // Build a circular list for the given values
        let node = new Node(values[0]);
        this.head = node;
        for (let i = values.length - 1; i > 0; i--) {
            node = new Node(values[i], node);
        }
        this.head.next = node; // close the cycle
    }
    getStartingNode() {
        let looped = false;
        let back = this.head;
        let front = this.head;
        let sum = 0;
        while (true) {
            // As long as the sum is not negative (or window is empty),
            // ...widen the window
            if (front === back || sum >= 0) {
                sum += front.value;
                front = front.next;
                if (front === back) break; // we added all values!
                if (front === this.head) looped = true;
            } else if (looped) {
                // avoid endless looping when there is no solution
                return null;
            } else { // reset window
                sum = 0;
                back = front;
            }
        }
        if (sum < 0) return null; // no solution
        return back;
    }
}
// Run the algorithm for the example given in question
let list = new CircularList([-2, 2, 5, 1]);
console.log("start at", list.getStartingNode()?.value);
As the algorithm is guaranteed to end when the back reference has visited all nodes, and the front reference will never overtake the back reference, it has linear time complexity. It cannot do better, as all node values need to be read to know their sum.
I have assumed that the value 0 is allowed as a running sum, since the title says it should never be negative. If zero is not allowed, then just change the comparison operators used to compare the sum with 0. In that case the comparison back === front is explicitly needed in the first if statement, otherwise you may actually drop it, since that implies the sum is 0, and the second test in that if condition does the job.

Binary search for first occurrence of k

I have code that searches a sorted array and returns the index of the first occurrence of k.
I am wondering whether it's possible to write this code using
while(left<right)
instead of
while(left<=right)
Here is the full code:
public static int searchFirstOfK(List<Integer> A, int k) {
    int left = 0, right = A.size() - 1, result = -1;
    // A.subList(left, right + 1) is the candidate set.
    while (left <= right) {
        int mid = left + ((right - left) / 2);
        if (A.get(mid) > k) {
            right = mid - 1;
        } else if (A.get(mid) == k) {
            result = mid;
            // Nothing to the right of mid can be the first occurrence of k.
            right = mid - 1;
        } else { // A.get(mid) < k
            left = mid + 1;
        }
    }
    return result;
}
How do I know when to use left <= right and when to use just left < right?
Building on this answer to another binary search question: How can I simplify this working Binary Search code in C?
If you want to find the position of the first occurrence, you can't stop when you find a matching element. Your search should look like this (of course this assumes that the list is sorted):
int findFirst(List<Integer> list, int valueToFind)
{
    int pos=0;
    int limit=list.size();
    while(pos<limit)
    {
        int testpos = pos+((limit-pos)>>1);
        if (list.get(testpos)<valueToFind)
            pos=testpos+1;
        else
            limit=testpos;
    }
    if (pos < list.size() && list.get(pos)==valueToFind)
        return pos;
    else
        return -1;
}
Note that we only need to do one comparison per iteration. The binary search finds the unique position where all the preceding elements are less than valueToFind and all the following elements are greater or equal, and then it checks to see if the value you're looking for is actually there.
The linked answer highlights several advantages of writing a binary search this way.
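For comparison, the same "find the lower bound, then check it" structure can be sketched in Python using the standard library's bisect_left, which returns the first index whose element is not less than the target. The wrapper name find_first is made up for illustration.

```python
from bisect import bisect_left


def find_first(values, target):
    """Index of the first occurrence of target in sorted values, or -1."""
    pos = bisect_left(values, target)  # first index with values[pos] >= target
    if pos < len(values) and values[pos] == target:
        return pos
    return -1


print(find_first([1, 2, 2, 2, 3], 2))  # 1
print(find_first([1, 3], 2))           # -1
```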
Simply put No.
Consider the case of array having only one element i.e., {0} and the element to be searched is 0 as well.
In this case, left == right, but if your condition is while(left<right), then searchFirstOfK will return -1.
This answer is in context of the posted code. If we are talking about alternatives so that we can use while(left<right) then Matt Timmermans's answer is correct and is an even better approach.
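The single-element case is easy to reproduce. The sketch below is a Python port of the posted searchFirstOfK; the strict flag is an addition for illustration only, used to switch the loop condition between left <= right and left < right.

```python
def search_first_of_k(a, k, strict=False):
    """Port of searchFirstOfK; strict=True uses `left < right` instead of `<=`."""
    left, right, result = 0, len(a) - 1, -1
    while (left < right) if strict else (left <= right):
        mid = left + (right - left) // 2
        if a[mid] > k:
            right = mid - 1
        elif a[mid] == k:
            result = mid
            right = mid - 1  # keep looking further left
        else:
            left = mid + 1
    return result


print(search_first_of_k([0], 0))               # 0
print(search_first_of_k([0], 0, strict=True))  # -1
```

With the strict condition the loop body never runs for a one-element list, so the only candidate is never examined.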
[Chart not reproduced: a comparison of the approach of Matt (the OP; call it Normal Binary) and that of Matt Timmermans (call it Optimized Binary) for a list containing values between 0 and 5000000.]
This is an extremely interesting question. There is a way to get your binary search right every time: determine the correct ranges and avoid getting stuck on a single remaining element.
while(left+1<right)
{
    m = (left+right)/2;
    if(check condition is true)
        left = m;
    else
        right = m;
}
The only key thing to remember is the invariant: left always points at a position known to satisfy the condition, and right at a position known not to satisfy it. That way you won't get stuck. Once you understand how this method divides the range, you will never fail at binary search.
With this setup, left ends up at the largest condition-satisfying element.
By changing the initialization (and which side the condition holds on), you can get other elements, such as the smallest condition-satisfying element.
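A minimal Python sketch of this loop, under two assumptions that the answer does not spell out: the condition is true on a prefix of the range and false afterwards, and the bounds start at the sentinels -1 and n, which are never passed to the condition. The names last_true and check are made up.

```python
def last_true(n, check):
    """Largest index i in [0, n) with check(i) true, or -1 if none.
    Assumes check is true on a prefix of the range and false afterwards."""
    left, right = -1, n  # sentinels: left "satisfies", right "does not"
    while left + 1 < right:
        m = (left + right) // 2  # always strictly between left and right
        if check(m):
            left = m
        else:
            right = m
    return left


data = [1, 2, 2, 2, 5, 7]
# Last index whose value is < 2; the first occurrence of 2 is right after it.
print(last_true(len(data), lambda i: data[i] < 2) + 1)  # 1
```

Because the loop stops when left and right are adjacent, both answers are available at the end: left is the last position that satisfies the condition and right is the first that does not.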

Find the lexicographically largest unique string

I need an algorithm to find the largest unique (no duplicate characters) substring from a string by removing characters (no rearranging).
String A is greater than String B if it satisfies these two conditions.
1. Has more characters than String B
Or
2. Is lexicographically greater than String B if equal length
For example, if the input string is dedede, then the possible unique combinations are de, ed, d, and e.
Of these combinations, the largest one is therefore ed since it has more characters than d and e and is lexicographically greater than de.
The algorithm must be more efficient than generating all possible unique strings and sorting them to find the largest one.
Note: this is not a homework assignment.
How about this
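#include <string>
using std::string;

string getLargest(string s)
{
    int largest_char_pos = 0;
    string result = "";
    if (s.length() == 1) return s;
    for (int i = 0; i < s.length();)
    {
        // start each pass at i, then find the largest character at or after it
        largest_char_pos = i;
        for (int j = i + 1; j < s.length(); j++)
        {
            if (s[largest_char_pos] < s[j]) largest_char_pos = j;
        }
        result += s[largest_char_pos];
        // continue after the character just taken
        i = largest_char_pos + 1;
    }
    return result;
}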
This code snippet just gives you the lexicographically largest string. If you don't want duplicates, you can keep track of the characters already added.
Let me state the rules for ordering in a way that I think is more clear.
String A is greater than string B if
- A is longer than B
OR
- A and B are the same length and A is lexicographically greater than B
If my restatement of the rules is correct then I believe I have a solution that runs in O(n^2) time and O(n) space. My solution is a greedy algorithm based on the observation that there are as many characters in the longest valid subsequence as there are unique characters in the input string. I wrote this in Go, and hopefully the comments are sufficient to describe the algorithm.
func findIt(str string) string {
    // exc keeps track of characters that we cannot use because they have
    // already been used in an earlier part of the subsequence
    exc := make(map[byte]bool)
    // ret is where we will store the characters of the final solution as we
    // find them
    var ret []byte
    for len(str) > 0 {
        // inc keeps track of unique characters as we scan from right to left so
        // that we don't take a character until we know that we can still make the
        // longest possible subsequence.
        inc := make(map[byte]bool, len(str))
        fmt.Printf("-%s\n", str)
        // best is the largest character we have found that can also get us the
        // longest possible subsequence.
        var best byte
        // best_pos is the lowest index that we were able to find best at, we
        // always want the lowest index so that we keep as many options open to us
        // later if we take this character.
        best_pos := -1
        // Scan through the input string from right to left
        for i := len(str) - 1; i >= 0; i-- {
            // Ignore characters we've already used
            if _, ok := exc[str[i]]; ok { continue }
            if _, ok := inc[str[i]]; !ok {
                // If we haven't seen this character already then it means that we can
                // make a longer subsequence by including it, so it must be our best
                // option so far
                inc[str[i]] = true
                best = str[i]
                best_pos = i
            } else {
                // If we've already seen this character it might still be our best
                // option if it is a lexicographically larger or equal to our current
                // best. If it is equal we want it because it is at a lower index,
                // which keeps more options open in the future.
                if str[i] >= best {
                    best = str[i]
                    best_pos = i
                }
            }
        }
        if best_pos == -1 {
            // If we didn't find any valid characters on this pass then we are done
            break
        } else {
            // include our best character in our solution, and exclude it for
            // consideration in any future passes.
            ret = append(ret, best)
            exc[best] = true
            // run the same algorithm again on the substring that is to the right of
            // best_pos
            str = str[best_pos+1:]
        }
    }
    return string(ret)
}
I am fairly certain you can do this in O(n) time, but I wasn't sure of my solution so I posted this one instead.
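For what it's worth, one candidate for a linear-time solution is a monotonic stack. This is not the answer author's method, just a Python sketch of a different, commonly used technique, and the name largest_unique is made up. The idea: keep the chosen characters on a stack, and pop a smaller character from the top whenever a larger one arrives and the smaller one still occurs later in the input.

```python
from collections import Counter


def largest_unique(s):
    """Longest subsequence of s without repeated characters; ties broken by
    taking the lexicographically largest one."""
    remaining = Counter(s)  # occurrences not yet passed by the scan
    stack = []              # characters chosen so far, in order
    used = set()
    for ch in s:
        remaining[ch] -= 1
        if ch in used:
            continue
        # Drop smaller characters that can still be picked up later.
        while stack and stack[-1] < ch and remaining[stack[-1]] > 0:
            used.remove(stack.pop())
        stack.append(ch)
        used.add(ch)
    return "".join(stack)


print(largest_unique("dedede"))  # ed
```

Every input character is pushed at most once and popped at most once, so the scan does O(n) work in total.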

C/C++/Java/C#: help parsing numbers

I've got a real problem (it's not homework, you can check my profile). I need to parse data whose formatting is not under my control.
The data look like this:
6,852:6,100,752
So there's first a number made of up to 9 digits, followed by a colon.
Then I know for sure that, after the colon:
there's at least one valid combination of numbers that add up to the number before the colon
I know exactly how many numbers add up to the number before the colon (two in this case, but it can go as high as ten numbers)
In this case, 6852 is 6100 + 752.
My problem: I need to find these numbers (in this example, 6100 + 752).
It is unfortunate that in the data I'm forced to parse, the separator between the numbers (the comma) is also the separator used inside the numbers themselves (6100 is written as 6,100).
Once again: that unfortunate formatting is not under my control and, once again, this is not homework.
I need to solve this for up to 10 numbers that need to add up.
Here's an example with three numbers adding up to 6855:
6,855:360,6,175,320
I fear that there are cases where there would be two possible different solutions. HOWEVER if I get a solution that works "in most cases" it would be enough.
How do you typically solve such a problem in a C-style bracket language?
Well, I would start with the brute force approach and then apply some heuristics to prune the search space. Just split the list on the right by commas and iterate over all possible ways to group them into n terms (where n is the number of terms in the solution). You can use the following two rules to skip over invalid possibilities.
(1) You know that any group of 1 or 2 digits must begin a term.
(2) You know that no candidate term in your comma delimited list can be greater than the total on the left. (This also tells you the maximum number of digit groups that any candidate term can have.)
Recursive implementation (pseudo code):
int total; // The total read before the colon
// Takes the list of tokens as integers after the colon
// tokens is the list of tokens left to analyse,
// partialList is the partial list of numbers built so far
// sum is the sum of numbers in partialList
// Aggregate takes 2 ints XXX and YYY and returns XXX,YYY (= XXX*1000+YYY)
function getNumbers(tokens, sum, partialList) =
    if isEmpty(tokens)
        if sum = total return partialList
        else return null // Got to the end with the wrong sum
    // Option 1: tokens[0] is a number on its own
    var result1 = getNumbers(tokens[1:end], sum+tokens[0], Append(partialList, tokens[0]))
    if result1 <> null return result1
    // Option 2: tokens[0] and tokens[1] form one number (needs at least 2 tokens)
    if length(tokens) >= 2
        var result2 = getNumbers(tokens[2:end], sum+Aggregate(tokens[0], tokens[1]), Append(partialList, Aggregate(tokens[0], tokens[1])))
        if result2 <> null return result2
    return null // No solution overall
You can do a lot better from different points of view, like tail recursion, pruning (you can have XXX,YYY only if YYY has 3 digits)... but this may work well enough for your app.
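The brute-force search with pruning described in these answers can be sketched as follows. The sketch is in Python rather than a C-style language, and it rests on assumptions not stated in the question: the line has exactly one colon, the number of summands is passed in as count, and a group of digits can continue a number only if it has exactly 3 digits. The name split_numbers is made up. It returns every grouping that adds up, so ambiguous inputs show up as more than one result.

```python
def split_numbers(line, count):
    """Return every way to group the comma-separated tokens after the colon
    into `count` numbers that add up to the total before the colon."""
    left, right = line.split(":")
    total = int(left.replace(",", ""))
    tokens = right.split(",")
    solutions = []

    def search(pos, parts):
        if len(parts) == count:
            if pos == len(tokens) and sum(parts) == total:
                solutions.append(list(parts))
            return
        value = ""
        for end in range(pos, len(tokens)):
            # Only the first group of a number may have fewer than 3 digits.
            if end > pos and len(tokens[end]) != 3:
                break
            value += tokens[end]
            number = int(value)
            if number > total:
                break  # a single term can never exceed the total
            parts.append(number)
            search(end + 1, parts)
            parts.pop()

    search(0, [])
    return solutions


print(split_numbers("6,852:6,100,752", 2))      # [[6100, 752]]
print(split_numbers("6,855:360,6,175,320", 3))  # [[360, 6175, 320]]
```

Keeping the tokens as strings until a candidate number is formed preserves their digit counts, which the 3-digit rule depends on.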
Divide-and-conquer would make for a nice improvement.
I think you should try all possible ways to parse the string and calculate the sum and return a list of those results that give the correct sum. This should be only one result in most cases unless you are very unlucky.
One thing to note that reduces the number of possibilities is that there is only an ambiguity if you have aa,bbb and bbb is exactly 3 digits. If you have aa,bb there is only one way to parse it.
Reading in C++:
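#include <iostream>
#include <sstream>
#include <utility>
#include <vector>

// Reads e.g. "6,852:6,100,752". The comma-separated groups before the colon
// are joined into the total; the groups after it are returned as raw tokens.
std::pair<int,std::vector<int> > read_numbers(std::istream& is)
{
    std::pair<int,std::vector<int> > result;
    result.first = 0;
    bool before_colon = true;
    for(;;) {
        int i;
        if(!(is >> i)) throw "bar!";
        if(before_colon) result.first = result.first * 1000 + i;
        else result.second.push_back(i);
        char ch;
        if(!(is >> ch)) {
            if(before_colon) throw "foo!"; // input ended before the colon
            return result;
        }
        if(ch == ':' && before_colon) before_colon = false;
        else if(ch != ',') throw "foobar!";
    }
}

// Not defined here; see the note below the code.
std::vector<int> get_winning_combination( int total
                                        , std::vector<int>::const_iterator begin
                                        , std::vector<int>::const_iterator end );

void f()
{
    std::istringstream iss("6,852:6,100,752");
    std::pair<int,std::vector<int> > foo = read_numbers(iss);
    std::vector<int> result = get_winning_combination( foo.first
                                                     , foo.second.begin()
                                                     , foo.second.end() );
    for( std::vector<int>::const_iterator i=result.begin(); i!=result.end(); ++i)
        std::cout << *i << " ";
}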
The actual cracking of the numbers is left as an exercise to the reader. :)
I think your main problem is deciding how to actually parse the numbers. The rest is just rote work with strings->numbers and iteration over combinations.
For instance, in the examples you gave, you could heuristically decide that a single-digit number followed by a three-digit number is, in fact, a four-digit number. Does a heuristic such as this hold true over a larger dataset? If not, you're also likely to have to iterate over the possible input parsing combinations, which means the naive solution is going to have a high polynomial complexity (O(n^x), where x > 4).
Actually checking for which numbers add up is easy to do using a recursive search.
using System.Collections.Generic;
using System.Linq;

List<int> GetSummands(int total, int numberOfElements, List<int> values)
{
    if (numberOfElements == 0)
    {
        if (total == 0)
            return new List<int>(); // Empty list.
        else
            return null; // Indicate no solution.
    }
    else if (total < 0)
    {
        return null; // Indicate no solution.
    }
    else
    {
        for (int i = 0; i < values.Count; ++i)
        {
            List<int> summands = GetSummands(
                total - values[i], numberOfElements - 1, values.Skip(i + 1).ToList());
            if (summands != null)
            {
                // Found solution.
                summands.Add(values[i]);
                return summands;
            }
        }
        return null; // No combination of the remaining values works.
    }
}
