How to prove a task is done in minimum required commands [closed] - algorithm

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
Is it possible to show that a task is done in the minimum number of commands or lines of code in a given language? Obviously, if you can do a task in one command, that is the shortest way to do it, but this only holds for tasks like addition. If I created, say, a sorting algorithm, how would I know whether a faster way to carry out the task exists?

First off, the minimum number of lines of code does not necessarily mean the minimum number of commands (i.e. processor instructions). As the former is not really significant in an algorithmic sense, I am assuming that you are trying to find out the latter.
On that note, there are a variety of techniques for proving the minimum number of steps (not commands) needed to do some complex tasks. The minimal number of steps necessary to achieve a task does not directly correspond to the minimum number of commands, but it should be relatively straightforward to adapt these techniques to find the minimum number of commands needed to solve the problem. Note that these techniques do not yield a lower bound for every complex task; whether a lower bound can be found depends on the specific task.
Incidentally, (comparison-based) sorting, which you mention in your question, is one of the tasks for which such a proof method exists, namely decision trees. More detailed descriptions of the method are available in many sources, but in short it finds the least number of comparisons that must be made in the worst case to sort an array. It is the well-known technique at the heart of the proof that comparison-based sorting algorithms have a worst-case time complexity lower bound of Ω(N log N).
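To make the decision-tree argument concrete, here is a minimal Python sketch. A decision tree that sorts N distinct items needs at least N! leaves (one per possible ordering), so its height, and hence the worst-case number of comparisons, is at least ceil(log2(N!)). The function name is made up for illustration.

```python
import math

def comparison_lower_bound(n):
    """Worst-case comparisons any comparison sort needs for n distinct items.

    A binary decision tree with n! leaves has height >= ceil(log2(n!)).
    (m - 1).bit_length() equals ceil(log2(m)) for every integer m >= 1,
    which avoids floating-point rounding for large factorials.
    """
    return (math.factorial(n) - 1).bit_length()

for n in (1, 2, 3, 4, 5, 10):
    print(n, comparison_lower_bound(n))
```

Because log2(N!) grows like N log2(N), this bound is where the N log N figure comes from.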

Related

How to select the number of cluster centroid in K means [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I am going through a list of algorithms that I found and trying to implement them for learning purposes. Right now I am coding k-means and am confused about the following:
How do you know how many clusters there are in the original data set?
Is there any particular format that I have to follow in choosing the initial cluster centroids, besides all centroids having to be different? For example, does the algorithm converge if I choose cluster centroids that are different but close together?
Any advice would be appreciated
Thanks
With k-means you are minimizing a sum of squared distances. One approach is to try all plausible values of k. As k increases the sum of squared distances should decrease, but if you plot the result you may see that the sum of squared distances decreases quite sharply up to some value of k, and then much more slowly after that. The last value that gave you a sharp decrease is then the most plausible value of k.
k-means isn't guaranteed to find the best possible answer on each run, and it is sensitive to the starting values you give it. One way to reduce problems from this is to run it many times with different starting values and pick the best answer. It looks a bit odd if the sum of squared distances found for a larger k is actually larger than the one found for a smaller k. One way to avoid this is to use the best answer found for k clusters as the basis (with slight modifications) for one of the starting points for k+1 clusters.
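The approach above (try several values of k, restart each several times, compare the sums of squared distances) can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function names, the toy data with three well-separated groups, the number of restarts and the seeds are all made up for the example.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_sse(points, k, rng, iterations=50):
    """Run plain k-means once from random starting points; return the final SSE."""
    centroids = rng.sample(points, k)  # k distinct data points as initial centroids
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        new_centroids = [
            tuple(sum(xs) / len(cl) for xs in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)

def best_sse_per_k(points, max_k, restarts=10, seed=0):
    """For each k, keep the lowest SSE found over several random restarts."""
    rng = random.Random(seed)
    return {k: min(kmeans_sse(points, k, rng) for _ in range(restarts))
            for k in range(1, max_k + 1)}

# Hypothetical toy data: three groups of 20 points around made-up centres.
data_rng = random.Random(1)
centres = [(0, 0), (10, 0), (0, 10)]
points = [(cx + data_rng.uniform(-0.5, 0.5), cy + data_rng.uniform(-0.5, 0.5))
          for cx, cy in centres for _ in range(20)]

sse = best_sse_per_k(points, max_k=5)
for k in sorted(sse):
    print(k, round(sse[k], 2))
```

Plotting or simply reading the printed values, you would look for the k after which the decrease flattens out.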
In standard k-means the value of K is chosen by you, sometimes based on the problem itself (when you know how many classes exist or how many classes you want to exist), other times as a more or less arbitrary value. Typically the first iteration consists of randomly selecting K points from the dataset to serve as centroids. In the following iterations the centroids are adjusted.
After checking the k-means algorithm, I suggest you also look at k-means++, which is an improvement on the first version: it does not choose K for you, but it picks the initial centroids more carefully (spreading them out), avoiding the sometimes poor clusterings found by the standard k-means algorithm.
If you need more specific details on implementation of some machine learning algorithm, please let me know.
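As a rough illustration of k-means++ seeding: the first centroid is drawn uniformly from the data, and each further centroid is drawn with probability proportional to its squared distance from the nearest centroid chosen so far. The function name and sample points below are hypothetical, and the sketch assumes the data contains at least k distinct points.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeanspp_seeds(points, k, rng):
    """Pick k initial centroids using D^2 weighting (k-means++ seeding)."""
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        # Weight of each point: squared distance to its nearest chosen centroid.
        # Already chosen points get weight 0, so they are not picked again.
        weights = [min(sq_dist(p, c) for c in centroids) for p in points]
        centroids.append(rng.choices(points, weights=weights, k=1)[0])
    return centroids

sample = [(0, 0), (0, 1), (5, 5), (9, 9)]
seeds = kmeanspp_seeds(sample, 3, random.Random(0))
print(seeds)
```

The seeds would then be used as the starting centroids for the usual k-means iterations.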

Powers of a half that sum to one [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
Call every subunitary ratio with its denominator a power of 2 a perplex.
Number 1 can be written in many ways as a sum of perplexes.
Call every sum of perplexes a zeta.
Two zetas are distinct if and only if one of the zetas has at least one perplex that the other does not have. In the image shown above, the last two zetas are considered to be the same.
Find the number of ways 1 can be written as a zeta with N perplexes. Because this number can be big, calculate it modulo 100003.
Please don't post the code, but rather the algorithm. Be as precise as you can.
This problem was given at a contest and the official solution, written in the Romanian language, has been uploaded at https://www.dropbox.com/s/ulvp9of5b3bfgm0/1112_descr_P2_fractii2.docx?dl=0 , as a docx file. (you can use google translate)
I do not understand what the author of the solution meant to say there.
Well, this reminds me of BFS (breadth-first search) algorithms, where you radiate out from a single point to find multiple solutions with different permutations.
Here you can use recursion, with the base case being when N perplexes have been reached in that call stack of the recursive function.
So you can say:
function(int N <-- perplexes still to add, ArrayList<Double> currentNumbers, double dividedNum)
    if N == 0, then you're done - enter the (sorted) currentNumbers array into a hashtable and return
    clone the currentNumbers ArrayList as cloneNumbers
    remove dividedNum from cloneNumbers and add dividedNum/2 twice
    for every number x in cloneNumbers, call function(N - 1, cloneNumbers, x)
This is a rough, very inefficient but short way to do it. There are obviously many ways to prune the algorithm (reduce the number of duplicates going into the hashtable, avoid cloning as much as possible, etc.), but because this generates every possible sequence of splits and then enters the result into a hashtable, the hashtable will use its equals() comparison to see that a sequence already exists (such as your last two zetas) and reject the duplicate, provided each sequence is stored in a canonical (for example sorted) order. That way, you'll be left with the answer you want.
The efficiency of the current algorithm is O(|E|^N), where |E| is the number of numbers in the array at the end of all insertions, and N is the number of insertions (or, as you said, the number of perplexes). Obviously this isn't optimal, but it does work.
Hope this helps!
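The question asked for an algorithm rather than code, but for readers who want to check small cases, the brute-force idea above can be sketched in Python. Note that this is not the contest's official solution; it is only a small checker. It assumes the smallest zeta is 1/2 + 1/2 (since a perplex must be less than 1), stores each perplex 1/2^e as its integer exponent e to avoid floating-point keys, and treats zetas as multisets. The function name is made up.

```python
def count_zetas(n, mod=100003):
    """Count multisets of n perplexes (1/2**e, e >= 1) summing to 1, modulo mod.

    Brute force: start from 1/2 + 1/2 and repeatedly split one perplex
    1/2**e into two copies of 1/2**(e+1), deduplicating with a set of
    sorted exponent tuples. Only practical for small n.
    """
    if n < 2:
        return 0  # assumption: 1 itself is not a perplex
    level = {(1, 1)}  # exponents of 1/2 + 1/2
    for _ in range(n - 2):
        next_level = set()
        for zeta in level:
            for e in set(zeta):  # splitting equal perplexes gives the same result
                parts = list(zeta)
                parts.remove(e)
                parts += [e + 1, e + 1]
                next_level.add(tuple(sorted(parts)))
        level = next_level
    return len(level) % mod

for n in range(2, 9):
    print(n, count_zetas(n))
```

Every valid zeta is reachable this way, because the smallest perplex in a sum equal to 1 always occurs an even number of times, so two copies can be merged back into a zeta with one fewer perplex.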

Algorithmic complexity of checking if an element exists in an array [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
If I have an array of unsorted numbers and a number I'm looking for, I believe there's no way of checking if my number is in it except by going through each member and comparing.
Now, in mathematics and various theoretical branches I've been interested in, there's often the pattern that you usually get what you put in. What I mean is, there's usually an explanation for every unexpected result. Take the Monty Hall problem as an example. It seems counter-intuitive until you realize the host adds more information into the situation because he knows what door the car is behind.
Since you're iterating over the array, instead of just getting a yes or no answer, you also get the exact location of the element (if it's there). Wouldn't it then make sense that there's an algorithm that's less complex, but gives you ONLY a single bit of information?
Am I completely off base here?
Is there an actual correlation between the amount of information you get and the complexity of an algorithm? What's the theory behind the relationship between the amount of information you get from an algorithm and its complexity?
Yes, you're completely off base, sorry!
Algorithmic complexity is defined in terms of how many operations it takes to solve a problem of size N. If the array has N unsorted elements in it, then in the worst case there is no way of determining whether the value appears in the array other than checking all N elements. That makes it linear, or O(N).
The fact that you can also determine the location of the value in O(N) (as indeed you can) doesn't mean that you can solve the simpler problem in less time.
When you are searching an array, indexing is the price that you pay for having an array. An ability to access an element by index is inherent in the structure of the array: in other words, once you say "I am going to search an array" (not a collection, but specifically an array) you have already paid for the index. At this point, there is no way to get rid of it, or of the O(n) cost associated with searching the array.
However, this is not the only solution: if you agree to drop the ability to index as a requirement, you could build a collection that gives you a yes/no answer much faster. For example, if you use a hash table, your search time becomes O(1) on average, once the table has been built. Of course there is no associated index in a hash table: inability to access items in arbitrary order is your payment for an ability to check for presence or absence of items in constant time.
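A small Python sketch of the trade-off described above. The function names are made up; Python's built-in set is used here as the hash table.

```python
def find_index(items, target):
    """Linear scan: returns the position of target, or None. O(n) comparisons."""
    for i, value in enumerate(items):
        if value == target:
            return i
    return None

def contains(items, target):
    """Yes/no only, but still a linear scan over an unsorted list: O(n)."""
    return any(value == target for value in items)

numbers = [7, 3, 9, 1]

# Building the set costs O(n) once; afterwards each membership test is
# O(1) on average, but the positions in the original list are lost.
lookup = set(numbers)

print(find_index(numbers, 9))   # position in the list
print(contains(numbers, 9))     # same cost, less information
print(9 in lookup, 4 in lookup)
```

Asking for less information from the list does not make the scan cheaper; the speed-up comes from paying up front for a different data structure.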

Subset product & quantum computers, is an instance solvable [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 8 years ago.
Suppose you have a quantum computer that can run Shor's algorithm for factorization of integers.
Is it then possible to produce an oracle that determines if no solution exists for an instance of the Subset Product problem, with 100% confidence, in sub-exponential time?
So, the oracle is given a sequence x1, ... xn, as the description of a subset product problem.
It responds either Yes, a solution to this instance does not exist, or No, a solution to this instance may or may not exist.
If we take the prime factors of all elements in the sequence and then check whether all of them are present in the target product's factors, this should tell us if a solution is not at all possible. A solution may exist if and only if all the prime factors are accounted for. On quantum computers, prime factorization is sub-exponential.
I would like some feedback on whether this logic is correct, whether it works, and whether the complexity is indeed different between classical and quantum systems for this oracle/algorithm. I would also appreciate an explanation of reductions: can Subset Product be reduced to 3SAT without consequence?
Your algorithm, if I understood it correctly, will fail for the elements [6, 15] and the target 10. It will determine that 6*15 = 2*3*3*5, which has all of the factors used in 10=2*5, and incorrectly assert that this means you can make 10 by multiplying 6 and 15.
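The counterexample can be checked with a short Python sketch. It assumes the proposed test is read as "every prime factor of the target appears among the prime factors of the elements", which is the interpretation used above; the function names are hypothetical, and the brute-force check is exponential and only meant for tiny inputs.

```python
from itertools import combinations
from math import prod

def prime_factors(n):
    """Set of prime factors of n, by trial division (fine for small n)."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors

def naive_check(elements, target):
    """The proposed test: are all of the target's primes available in the elements?"""
    available = set().union(*(prime_factors(x) for x in elements))
    return prime_factors(target) <= available

def has_subset_product(elements, target):
    """Brute force over all non-empty subsets."""
    return any(prod(subset) == target
               for r in range(1, len(elements) + 1)
               for subset in combinations(elements, r))

print(naive_check([6, 15], 10))         # the factor test is satisfied
print(has_subset_product([6, 15], 10))  # but no subset multiplies to 10
```

The factor test passes for [6, 15] and 10 even though the only subset products are 6, 15 and 90.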
There are two reasons that it's unlikely you'll be able to fix this algorithm:
Subset Product is NP-Complete. Finding a polynomial time quantum algorithm for it, or showing that no such algorithm exists, is probably as hard as determining if P=NP. That is to say, very very hard.
You don't want the prime factors, you want the "no-need-to-reduce" factors. For example, if every time a number in the problem has a prime factor of 13 it's accompanied by a factor of 17 then there's no need to break 221 into 13*17. You can apply Euclid's gcd algorithm to various combinations of elements to find these no-need-to-reduce factors, no quantum-ness required.
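One way to read the "no-need-to-reduce" factors is as a pairwise coprime base obtained using only gcd computations. The sketch below is a simple, unoptimized take on that idea, not necessarily the exact procedure the answer has in mind: whenever two entries share a common divisor g, they are replaced by a/g, g and b/g, until all entries are pairwise coprime. The function name is made up.

```python
from math import gcd

def coprime_base(numbers):
    """Refine numbers into pairwise coprime factors using only gcd.

    Each replacement shrinks the product of the list, so the loop ends.
    Every input stays expressible as a product of powers of the entries.
    """
    base = [n for n in numbers if n > 1]
    changed = True
    while changed:
        changed = False
        for i in range(len(base)):
            for j in range(i + 1, len(base)):
                g = gcd(base[i], base[j])
                if g > 1:
                    a, b = base[i], base[j]
                    rest = [x for k, x in enumerate(base) if k not in (i, j)]
                    base = rest + [x for x in (a // g, g, b // g) if x > 1]
                    changed = True
                    break
            if changed:
                break
    return sorted(base)

print(coprime_base([6, 15]))
print(coprime_base([221, 442, 663]))
```

In the second call 13 and 17 always occur together, so 221 is left unsplit and no factorization of it is needed.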

Do we really have case in this algorithm [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
I am having trouble working out the running time of the following algorithm.
Now, my first question: does the case (best, worst or average) really matter here? I cannot come up with two different inputs of the same size that lead to different running times.
Second, I think this algorithm runs in O(n^2). Am I right?
The comment you wrote on @OBu's answer is only about a quarter right:
1*n + 2*(n-1) + 3*(n-2) + ... + n*1
That equals:
Sum(i=1..n, i*(n-i+1)) = (n+1)*Sum(i) - Sum(i*i) = (n+1)*[n(n+1)/2] - [n(n+1)(2n+1)/6] = n(n+1)(n+2)/6
Whatever the exact constants, the leading term is n^3/6, so the overall complexity is O(n^3).
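The sum 1*n + 2*(n-1) + ... + n*1 can also be checked numerically. The short Python sketch below compares the term-by-term sum with the closed form n(n+1)(n+2)/6, which follows from the standard formulas for Sum(i) and Sum(i*i); the function names are made up for illustration.

```python
def total_operations(n):
    """Term-by-term value of 1*n + 2*(n-1) + ... + n*1."""
    return sum(i * (n - i + 1) for i in range(1, n + 1))

def closed_form(n):
    """n(n+1)(n+2)/6, which grows like n^3 / 6."""
    return n * (n + 1) * (n + 2) // 6

for n in (1, 2, 4, 10, 100):
    print(n, total_operations(n), closed_form(n))
```

Both columns agree, and the cubic growth is visible when n goes from 10 to 100.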
As a rule of thumb (more like a back-of-the-envelope trick I've picked up, just to give you a quick idea): if you are unsure about algorithms with multiple for loops (with different lengths, but all related to n, as you have above), try to compute how many operations are performed around the middle of the algorithm (n/2). This usually gives you an idea of what the running time complexity of the whole thing might look like: you are basically computing the largest element in the sum, so the overall complexity is always at least the thing you compute (in most cases it's the same, though).
Just to give you some hints:
How many loops do you have and how are they nested?
How often does each loop run? (Start solving this from the outer loop and work towards the inner one.)
If in doubt, try it with n=4 or 5 and calculate each step. After this, you'll have a good feeling for what's going on.
