A lecturer gave this question in class:
[question]
A sequence of n integers is stored in
an array A[1..n]. An integer a in A is
called the majority if it appears more
than n/2 times in A.
An O(n) algorithm can be devised to
find the majority based on the
following observation: if two
different elements in the original
sequence are removed, then the
majority in the original sequence
remains the majority in the new
sequence. Using this observation, or
otherwise, write programming code to
find the majority, if one exists, in
O(n) time.
for which this solution was accepted
[solution]
int findCandidate(int[] a)
{
    int maj_index = 0;
    int count = 1;
    for (int i = 1; i < a.length; i++)
    {
        if (a[maj_index] == a[i])
            count++;
        else
            count--;
        if (count == 0)
        {
            maj_index = i;
            count++;
        }
    }
    return a[maj_index];
}
int findMajority(int[] a)
{
    int c = findCandidate(a);
    int count = 0;
    for (int i = 0; i < a.length; i++)
        if (a[i] == c) count++;
    if (count > a.length / 2) return c;
    return -1; // just a marker - no majority found
}
I can't see how the solution provided is a dynamic programming solution, and I can't see how, based on the wording, he pulled that code out.
The term dynamic programming originated as an attempt to describe a really awesome way of optimizing certain kinds of solutions ("dynamic" was chosen because it sounded punchier). In other words, when you see "dynamic programming", you should translate it to "awesome optimization".
'Dynamic programming' has nothing to do with dynamic allocation of memory or anything like that; it's just an old term. In fact, it also has little to do with the modern meaning of "programming".
It is a method for solving a specific class of problems: those where an optimal solution to a subproblem is guaranteed to be part of an optimal solution to the bigger problem. For instance, if you want to pay $567 with the smallest number of bills, the solution will contain a solution for one of the amounts $1..$566 plus one more bill.
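If it helps, here is a minimal sketch of that bill-counting example as a dynamic program; the denominations and all names are assumptions made purely for illustration:
import java.util.Arrays;

// DP sketch for "pay $567 with the fewest bills": the best answer for an amount
// is built from the best answers for smaller amounts, which is the DP property
// described above. The denominations are assumed for illustration only.
public class MinBills {
    static int minBills(int amount, int[] bills) {
        int[] best = new int[amount + 1];            // best[v] = fewest bills summing to v
        Arrays.fill(best, Integer.MAX_VALUE);
        best[0] = 0;
        for (int v = 1; v <= amount; v++) {
            for (int b : bills) {
                if (b <= v && best[v - b] != Integer.MAX_VALUE) {
                    best[v] = Math.min(best[v], best[v - b] + 1);
                }
            }
        }
        return best[amount];
    }

    public static void main(String[] args) {
        int[] bills = {1, 2, 5, 10, 20, 50, 100};    // assumed denominations
        System.out.println(minBills(567, bills));    // 9 bills: 5x$100 + $50 + $10 + $5 + $2
    }
}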
The code you posted, meanwhile, is just an application of the algorithm hinted at in the question.
This is dynamic programming because the findCandidate function breaks the provided array down into smaller, more manageable parts. In this case, it starts with the first element as a candidate for the majority. By increasing the count when the candidate is encountered and decreasing it when it is not, the function tracks whether the candidate could still be the majority. When the count reaches zero, we know that the first i elements have no surviving majority candidate, so we start over from the current element. By continually maintaining this local candidate we never need to iterate through the array more than once in the candidate-identification phase. We then check whether that candidate actually is the majority by going through the array a second time, giving us O(n). It actually runs in about 2n time, since we iterate twice, but the constant doesn't matter.
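For what it's worth, here is a small self-contained sketch you can run to see this in action; it just repeats the solution above (with a.length in place of the undeclared n) plus a made-up main to exercise it:
// Sketch: the accepted solution above, made self-contained, with example inputs.
public class MajorityDemo {
    static int findCandidate(int[] a) {
        int majIndex = 0, count = 1;
        for (int i = 1; i < a.length; i++) {
            if (a[majIndex] == a[i]) count++; else count--;
            if (count == 0) { majIndex = i; count = 1; }
        }
        return a[majIndex];
    }

    static int findMajority(int[] a) {
        int c = findCandidate(a);
        int count = 0;
        for (int v : a) if (v == c) count++;
        return count > a.length / 2 ? c : -1;        // -1 marks "no majority"
    }

    public static void main(String[] args) {
        System.out.println(findMajority(new int[]{3, 1, 3, 3, 2, 3, 3})); // 3 (5 of 7 occurrences)
        System.out.println(findMajority(new int[]{1, 2, 3, 1, 2, 3}));    // -1 (no majority)
    }
}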
Related
To find all prime numbers from 1 to N.
I know we usually approach this problem using the Sieve of Eratosthenes, but I had an alternate approach in mind using gcd that I wanted your views on.
My approach:
Maintain a variable holding the product of all prime numbers found so far. At any iteration, if gcd(that variable, i) == 1, the two numbers are co-prime, so i must be prime.
For ex: gcd(210,11) == 1, so 11 is prime.
{210=2*3*5*7}
Pseudocode:
Init num_list = {numbers 2 to N}   [since 0 and 1 aren't prime]
curr_gcd = 2, gcd_val = 1
For i = 3; i <= N; i++
    gcd_val = __gcd(curr_gcd, i)
    if gcd_val == 1           // prime
        curr_gcd = curr_gcd * i
    else                      // composite, so remove from list
        num_list.remove(i)
Alternatively, we can also have a list and push the prime numbers into that list.
SC = O(N) [space complexity]
TC = O(N log(N)) [the TC to compute one gcd using Euclid's method is O(log(max(a,b)))]
Does this seem right, or am I calculating the TC incorrectly here? Please post your views on this.
TIA!
Looks like the time complexity of my approach is closer to O(log^2(n)) as pointed out by many in the comments.
Also, the curr_gcd var would become quite large as N is increased and would definitely overflow int and long size limits.
Thanks to everyone who responded!
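For concreteness, here is a minimal runnable sketch of the approach described above, using Java's BigInteger for the running product so the overflow mentioned in the edit doesn't bite; all names are my own:
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Sketch of the gcd idea above: keep the product of every prime found so far and
// call i prime exactly when gcd(product, i) == 1. BigInteger removes the overflow
// problem, but the product still grows very quickly, which dominates the cost.
public class GcdPrimes {
    static List<Integer> primesUpTo(int n) {
        List<Integer> primes = new ArrayList<>();
        if (n < 2) return primes;
        primes.add(2);
        BigInteger product = BigInteger.valueOf(2);  // product of primes found so far
        for (int i = 3; i <= n; i++) {
            if (product.gcd(BigInteger.valueOf(i)).equals(BigInteger.ONE)) {
                primes.add(i);
                product = product.multiply(BigInteger.valueOf(i));
            }
        }
        return primes;
    }

    public static void main(String[] args) {
        System.out.println(primesUpTo(30));          // [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
    }
}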
Your method may be theoretically right, but evidently it's not great.
Its efficiency is worse than the SoE, and the range of numbers it has to handle grows far too large. So it may look elegant, but it is hard to use in practice.
In my view, "find all prime numbers from 1 to N" is already a well-known problem, which means its solutions have been thought through carefully.
At first, we might use brute force to deal with it, like this:
int primes[N], cnt;   // store all prime numbers
bool st[N];           // st[i]: whether i has been rejected
void get_primes(int n) {
    for (int i = 2; i <= n; i++) {
        if (st[i]) continue;
        primes[cnt++] = i;
        for (int j = i + i; j <= n; j += i) {
            st[j] = true;
        }
    }
}
It's an O(n^2) time algorithm. Too slow to endure.
Going further, we have the SoE, which uses O(n log log n) time.
But we have an even better algorithm called the "linear sieve", which uses only O(n) time, just as its name suggests. I implemented it in C like this:
int primes[N], cnt;
bool st[N];
void get_primes(int n) {
    for (int i = 2; i <= n; i++) {
        if (!st[i]) primes[cnt++] = i;
        for (int j = 0; primes[j] * i <= n; j++) {
            st[primes[j] * i] = true;
            if (i % primes[j] == 0) break;
        }
    }
}
I have used this O(n) algorithm to solve this kind of problem in interviews at major IT companies and on many kinds of online judges (OJ).
This is my assignment question, which I've been trying to understand for a couple of days so I can ultimately solve it. So far I have had no success, so any guidance or help in understanding or solving the problem is appreciated.
You are given a set of m constraints over n Boolean variables
{x1, x2, ..., xn}.
The constraints are of two types:
equality constraints: xi = xj, for some i != j
inequality constraints: xi != xj, for some i != j
Design an efficient greedy algorithm that given the
set of equality and inequality constraints determines if it is
possible or not to satisfy all the constraints simultaneously.
If it
is possible to satisfy all the constraints, your algorithm should
output an assignment to the variables that satisfies all the
constraints.
Choose a representation for the input to this problem
and state the problem formally using the notation Input: ..., Output:
....
Describe your greedy algorithm in plain English. In what
sense is your algorithm "greedy"?
Describe your greedy algorithm
in pseudocode.
Briefly justify the correctness of your algorithm.
State and justify the running time of your algorithm. The more efficient the algorithm, the better.
What I've figured out so far is that this problem is related to the Boolean satisfiability (SAT) problem. I've tried setting all the variables to false first and then, by counter examples, prove that it cannot satisfy all the constraints at once.
I am getting confused between constraint satisfaction problems (CSP) and Horn SAT. I read certain articles on these to get a solution and this led me to confusion. My logic was to create a tree and apply DFS to check if constraints are satisfied, whereas Horn SAT solutions are leading me to mathematical proofs.
Any help is appreciated as this is my learning stage and I cannot master it all at once. :)
(informal) Classification:
So firstly, it's not the boolean SAT problem, because that's NP-complete. Your teacher has implied that this isn't NP-complete by asking for an efficient (i.e. at most polynomial-time) way to always solve the problem.
Modelling (thinking about) the problem:
One way to think of this problem is as a graph in which each variable is a node, inequality constraints form one type of edge, and equality constraints form another.
Thinking of this problem graphically helped me realise that it's a bit like a graph-colouring problem: we could set all nodes to ? (unset), then choose any node to set to true, then do a breadth-first search from that node to set all connecting nodes (setting them to either true or false), checking for any contradiction. If we complete this for a connected component of the graph, without finding contradictions, then we can ignore all nodes in that part and randomly set the value of another node, etc. If we do this until no connected components are left, and we still have no contradictions, then we've set the graph in a way that represents a legitimate solution.
Solution:
Because there are exactly n elements, we can make an associated "bucket" array for the equalities and another for the inequalities (each "bucket" could contain an array of what its element equates to, but we could get even more efficient than this if we wanted [the complexity would remain the same]).
Your array of arrays for equalities could be imagined like this:
equalities[0] = {1}
equalities[1] = {0, 2}
equalities[2] = {1}
equalities[3] = {4}
equalities[4] = {3}
which would represent that:
0 == 1
1 == 2
3 == 4
Note that this is an irregular (jagged) matrix, and it requires 2*m space in total. We do the same thing for an inequality matrix. Moreover, setting up both of these arrays (of arrays) uses O(m + n) space and time.
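Here is a small sketch, with names of my own choosing, of how those bucket arrays could be built from a pair-based representation of the constraints:
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: build the equality (or inequality) "bucket" arrays from (i, j) pairs.
// The int[]{i, j} encoding of a constraint is an assumption made for this sketch.
public class ConstraintBuckets {
    static List<List<Integer>> buildBuckets(int n, List<int[]> constraints) {
        List<List<Integer>> buckets = new ArrayList<>();
        for (int i = 0; i < n; i++) buckets.add(new ArrayList<>());
        for (int[] c : constraints) {                // each constraint adds two entries,
            buckets.get(c[0]).add(c[1]);             // so the structure holds 2*m values
            buckets.get(c[1]).add(c[0]);
        }
        return buckets;
    }

    public static void main(String[] args) {
        // 0 == 1, 1 == 2, 3 == 4, as in the example above
        List<int[]> equalities = Arrays.asList(new int[]{0, 1}, new int[]{1, 2}, new int[]{3, 4});
        System.out.println(buildBuckets(5, equalities));  // [[1], [0, 2], [1], [4], [3]]
    }
}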
Now, if there exists a solution, {x0, x1, x2, x3}, then {!x0, !x1, !x2, !x3} is also a solution. Proof:
(xi == xj) iff (!xi == !xj)
So it won't affect our solution if we set one of the elements arbitrarily. Let's set xi to true, and set the others to ? [numerically we'll be dealing with three values: 0 (false), 1 (true), and 2 (unset)].
We'll call this array solution (even though it's not finished yet).
Now we can use recursion to consider all the consequences of setting our value:
(The below code is pseudo-code, as the questioner didn't specify a language. I've made it somewhat C++-style, but just to keep it generic and to use the pretty formatting colours.)
bool Set (int i, bool val)   // i is the index
{
    if (solution[i] != '?')
        return (solution[i] == val);
    solution[i] = val;
    for (int j = 0; j < equalities[i].size(); j += 1)
    {
        bool success = Set(equalities[i][j], val);
        if (!success)
            return false;    // Contradiction found
    }
    for (int j = 0; j < inequalities[i].size(); j += 1)
    {
        bool success = Set(inequalities[i][j], !val);
        if (!success)
            return false;    // Contradiction found
    }
    return true;             // No contradiction found
}
void Solve ()
{
    for (int i = 0; i < solution.size(); i += 1)
        solution[i] = '?';
    for (int i = 0; i < solution.size(); i += 1)
    {
        if (solution[i] != '?')
            continue;        // value has already been set/checked
        bool success = Set(i, true);
        if (!success)
        {
            print "No solution";
            return;
        }
    }
    print "At least one solution exists. Here is a solution:";
    print solution;
}
Because of the first if condition in the Set function, the body of the function (beyond that if statement) can execute at most n times, once per node value. Each time the body runs, the work it does is proportional to the number of edges associated with the corresponding node, and the Solve function calls Set directly at most n times. Hence the total number of calls, and therefore the total work done during the solving process, is O(m + n).
A trick here is to recognise that the Solve function will need to call the Set function C times, where C is the number of connected components of the graph. Note that each connected component is independent of each other, so the same rule applies: we can legitimately choose a value of one of its elements then consider the consequences.
The fastest solution would still need to read all of the constraints, O(m) and would need to output a solution when it's possible, O(n); therefore it's not possible to get a solution with better time complexity than O(m+n). The above is a greedy algorithm with O(m+n) time and space complexity.
It's probably possible to get better space complexity (while maintaining the O(m+n) time complexity), maybe even O(1), but I'm not sure.
As for Horn formulas, I'm embarrassed to admit that I know nothing about them, but this answer directly responds to everything that was asked of you in the assignment.
Let's take an example: 110, with constraints x1 = x2 and x2 != x3.
Remember, since we are only given the constraints, the algorithm could also end up generating 001 as output, since it satisfies the constraints too.
One way to solve it would be:
Have two lists, one for each constraint type.
Each list holds (i, j) index pairs.
Sort the lists by the i index.
Now, for each pair in the equality list, check that there is no constraint in the inequality list that conflicts with it.
If there is, you can exit right away.
Otherwise, check whether other pairs in the equality list share an index with that pair.
You can then assign one or zero to those variables, and eventually you will be able to generate the complete output.
I am battling to find the complexity of the given code. I think I am struggling with identifying the correct complexity and with how to actually analyze it. The code to be analyzed is as follows:
public void doThings(int[] arr, int start){
    boolean found = false;
    int i = start;
    while ((found != true) && (i < arr.length)){
        i++;
        if (arr[i] == 17){
            found = true;
        }
    }
}
public void reorganize(int[] arr){
    for (int i = 0; i < arr.length; i++){
        doThings(arr, i);
    }
}
The questions are:
1) What is the best case complexity of the reorganize method and for what inputs does it occur?
2) What is the worst case complexity of the reorganize method and for what inputs does it occur?
My answers are:
1) For the reorganize method there are two possible best cases. The first is when the array length is 1, meaning the loop in reorganize and the loop in doThings each run exactly once. The other possibility is when the ith item of the array is 17, meaning the doThings loop will not run completely on that ith iteration. Thus in both cases the best case = O(n).
2) The worst case would be when the number 17 is at the end of the array, or when the number 17 is not in the array at all. This will mean that the array is traversed n×n times, so the worst case would be O(n^2).
Could anyone please help me answer the questions correctly, if mine is incorrect and if possible explain the problem?
"best case" the array is empty, and you search nothing.
The worst case is that you look at every single element because you never see 17. All other cases are in between.
if (arr[i]==17){ is the "hottest path" of the code, meaning it is run most often.
In the worst case it will execute a total of about n*(n-1)/2 times (I think I did that math right), because even when you set found = true, the reorganize method doesn't know about that, doesn't end, and keeps searching even though you have already scanned the entire array.
Basically, flatten the code by inlining the method call, and you have this question:
What is the Big-O of a nested loop, where number of iterations in the inner loop is determined by the current iteration of the outer loop?
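For illustration, here is a rough sketch of that flattened structure (the names are made up), which makes the triangular iteration count easy to see:
// Flattened sketch of reorganize/doThings: the inner loop's trip count shrinks as the
// outer index grows, so the worst-case work is (n-1) + (n-2) + ... + 0 = n(n-1)/2 = O(n^2).
public class FlattenedSearch {
    static long countInnerSteps(int[] arr) {
        long steps = 0;
        for (int start = 0; start < arr.length; start++) {
            boolean found = false;
            for (int i = start + 1; !found && i < arr.length; i++) {
                steps++;                             // one look at arr[i]
                if (arr[i] == 17) found = true;
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        int[] no17 = new int[10];                    // 17 never appears: the worst case
        System.out.println(countInnerSteps(no17));   // 9 + 8 + ... + 0 = 45 = 10*9/2
    }
}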
I am brand new to the computer science field. I'm hoping to land my first job in the field. It's a Salesforce internship here in SF and I was given this problem to figure out. I'm not really sure where to begin. Any help or just pointing me in the right direction would be greatly appreciated. Thank you and Happy Holidays.
Here is the problem:
Consider a scenario where you have a single list of integers and need to divide it into two separate lists where the sum of the integers in both lists are equal or close to equal. Each sublist must have at least one element. Please describe two approaches to solving this problem. The first approach should use an algorithm that is accurate and the second approach should use an algorithm that is faster. To clarify, our definition of accurate means "guaranteed to find the best possible answer that meets our requirements". For the faster solution, we still expect your algorithm to find a correct answer most of the time. The algorithms should not have the same Big-O notation.
Approximate algorithm with O(n log n) complexity:
Sort the list in descending order (complexity O(n log n));
Place the items of the sorted list in two piles, so that as each item is placed, the difference between the sums of the two piles is minimized (complexity O(n)).
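A minimal sketch of that greedy placement, with made-up names, might look like this:
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Greedy sketch: sort descending, then put each item on the pile with the smaller
// running sum. O(n log n) overall, but only approximate.
public class GreedyPartition {
    static List<List<Integer>> split(List<Integer> nums) {
        List<Integer> sorted = new ArrayList<>(nums);
        sorted.sort(Collections.reverseOrder());
        List<Integer> a = new ArrayList<>();
        List<Integer> b = new ArrayList<>();
        long sumA = 0, sumB = 0;
        for (int x : sorted) {
            if (sumA <= sumB) { a.add(x); sumA += x; }
            else              { b.add(x); sumB += x; }
        }
        return Arrays.asList(a, b);
    }

    public static void main(String[] args) {
        // prints [[8, 5, 4], [7, 6]]: sums 17 vs 13, while the true optimum is 15 vs 15,
        // which is exactly why this is the fast-but-approximate option
        System.out.println(split(Arrays.asList(8, 7, 6, 5, 4)));
    }
}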
Exact solution with O(2^n) complexity:
Using the bits of every integer from 0 to 2^n - 1 as a bit mask for inclusion in list 1, exhaustively test every possible assignment of the list members to the two sets.
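And a minimal sketch of the exhaustive bit-mask version (again, the names are made up):
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Exact sketch: try every mask from 1 to 2^n - 2 (so each side gets at least one
// element) and keep the split with the smallest difference in sums. O(2^n * n).
public class ExactPartition {
    static List<List<Integer>> split(int[] nums) {
        int n = nums.length;
        long bestDiff = Long.MAX_VALUE;
        int bestMask = 1;
        for (int mask = 1; mask < (1 << n) - 1; mask++) {
            long left = 0, right = 0;
            for (int i = 0; i < n; i++) {
                if ((mask & (1 << i)) != 0) left += nums[i]; else right += nums[i];
            }
            if (Math.abs(left - right) < bestDiff) {
                bestDiff = Math.abs(left - right);
                bestMask = mask;
            }
        }
        List<Integer> a = new ArrayList<>();
        List<Integer> b = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            if ((bestMask & (1 << i)) != 0) a.add(nums[i]); else b.add(nums[i]);
        }
        return Arrays.asList(a, b);
    }

    public static void main(String[] args) {
        System.out.println(split(new int[]{8, 7, 6, 5, 4}));  // [[8, 7], [6, 5, 4]]: 15 vs 15
    }
}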
Here is a stupid-simple solution to your problem that may get close to the best answer, but is not guaranteed to. This is a C# version; I don't know what language you are working in.
I'll leave it to you to find a version that is guaranteed to get the best answer.
// requires: using System; using System.Collections.Generic; using System.Linq;
public void SplitLists()
{
    var numbers = new int[100];
    var ran = new Random();
    for (var i = 0; i < numbers.Length; i++)
    {
        numbers[i] = ran.Next(10) + 1;   // fill with random values 1..10
    }

    var list1 = new List<int>();
    var list2 = new List<int>();
    foreach (var num in numbers)
    {
        // put the number on list1 only if that keeps its sum below list2's
        if (list1.Sum() + num < list2.Sum())
            list1.Add(num);
        else
            list2.Add(num);
    }
}
The algorithm:
public boolean search(int[] A, int target)
{
    for (int i = 0; i < A.length; i++)
    {
        if (target == A[i]) return true;
        if (target < A[i]) return false;
    }
    return false;
}
I'm having trouble understanding this problem - I know it has something to do with the series, but the introduction of two comparisons per iteration really has me stumped. Can anybody help me out and explain this to me?
How I used to look at this:
Think of the best case: what's the least possible number of comparisons you can have? That happens when:
target == A[0] // first element
Think of the worst case: what's the most comparisons you can have? That happens when:
target == A[A.length-1] // last element, or not found at all
So what would be our average case?
Well, take into consideration that the first element is really fast, but the last element is slow (O(1) vs O(n)).
Also, as you move away from the beginning it starts taking longer, but at the same time, as you move away from the end it gets faster. So the average case lies in the middle.
If you are looking for a specific number, the average number of comparisons would be about 3 comparisons per iteration (the loop check, target == A[i], and target < A[i]) times n/2 iterations, which gives our average number of comparisons.
If you want to test it, you can add a counter and increase it by 1 every time you do a comparison in your algorithm.
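For example, here is a small sketch of the search above with such a counter bolted on; the counter and the test harness are my own additions:
// Sketch: the search above instrumented with a comparison counter, so the average
// number of comparisons can be measured instead of argued about.
public class SearchWithCounter {
    static long comparisons = 0;

    static boolean search(int[] A, int target) {
        for (int i = 0; i < A.length; i++) {
            comparisons++;                           // target == A[i]
            if (target == A[i]) return true;
            comparisons++;                           // target < A[i]
            if (target < A[i]) return false;
        }
        return false;
    }

    public static void main(String[] args) {
        int[] A = {2, 4, 6, 8, 10, 12, 14, 16, 18, 20};   // sorted, as the early exit assumes
        long total = 0;
        for (int target : A) {                       // average over each element as the target
            comparisons = 0;
            search(A, target);
            total += comparisons;
        }
        System.out.println("average comparisons: " + (double) total / A.length);  // 10.0 here
    }
}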