Issue with time-analysis - big-o

I'm teaching myself data structures, and am on a section giving a brief outline of time analysis. The following problem was given:
"Each of the following are formulas for the number of operations in some
algorithm. Express each formula in big-O notation."
The problem then goes on to give multiple scenarios. One was:
g.) The number of times that n can be divided by 10 before dropping below
1.0.
(Note: It doesn't state what n is exactly, so I'm assuming it's just some input size. But I don't think it matters in terms of how the problem is stated)
I reasoned that as this would relate to its order of magnitude, it should just be log n. However, the text says that it should be quadratic. Is there something I am missing?
Any help clarifying my thinking would be greatly appreciated.
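To make the logarithmic intuition concrete, here is a minimal Python sketch that simply counts the divisions. The function name `count_divisions_by_10` is made up for illustration, and it assumes the count includes the final division that takes the value below 1.0.

```python
import math

def count_divisions_by_10(n):
    """Count how many times n is divided by 10 until the value is below 1.0."""
    count = 0
    while n >= 1.0:
        n /= 10
        count += 1
    return count

# For n >= 1 the count tracks floor(log10(n)) + 1, i.e. it grows like log n.
for n in (5, 999, 1000, 10**6):
    print(n, count_divisions_by_10(n), math.floor(math.log10(n)) + 1)
```

Multiplying n by 10 adds only one more division, which is the behaviour one would expect from O(log n) rather than from a quadratic formula.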

Problem size, input size and asymptotic behavior for an algorithm (re post)

I am re-posting my question because I accidentally indicated that another thread (on a similar topic) answered my question, which was not the case. I am sorry for any inconvenience.
I am trying to understand how the input size, problem size and the asymptotic behavior of an arbitrary algorithm given in pseudo code format differ from each other. While I fully understand the input and the asymptotic behavior, I have problems understanding the problem size. To me it looks as if problem size= space complexity for a given problem. But I am not sure. I'd like to illustrate my confusion with the following example:
We have the following pseudo code:
ALGONE(x, y)
    if x = 0 or x = y then
        return 1
    end
    return ALGONE(x-1, y-1) + ALGONE(x, y-1)
So let's say we give two inputs, $x$ and $y$, and $n$ represents the number of digits.
Since we are having addition as our main operation, and addition is an elementary operation, and for two numbers of n digits, we need n operations, then the asymptotic behavior is of the form O(n).
But what about the problem size in this case? I don't understand what I am supposed to say. The term "problem size" is so vague. It depends on the algorithm, but even if one understands the algorithm, what does one give as an answer?
I'd assume that in this particular case the problem size might be the number of bits we need to represent the input. But this is a guess of mine, grounded in nothing.
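One way to probe the estimate above is to translate the pseudocode directly and count how many recursive calls it makes. The sketch below is only an illustration: the `counter` argument is an addition that is not part of the original ALGONE, and it assumes 0 <= x <= y, since otherwise the recursion never reaches a base case.

```python
from math import comb

def algone(x, y, counter):
    """Direct translation of ALGONE; counter[0] tallies the calls made."""
    counter[0] += 1
    if x == 0 or x == y:
        return 1
    return algone(x - 1, y - 1, counter) + algone(x, y - 1, counter)

calls = [0]
result = algone(2, 4, calls)
print(result, calls[0])  # 6 11

# The recurrence is Pascal's rule, so the result matches the binomial coefficient.
print(algone(3, 10, [0]) == comb(10, 3))  # True
```

Every base case contributes 1 to the result, so the number of calls is tied to the value returned (11 calls for a result of 6 here) and not only to the number of digits in the inputs.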

What does "algorithm problem size" actually mean?

I'm currently in a Data Structures course at my university and did do some algorithm analysis in a prior class, but it was the section I had the most difficult time with in the previous course. We are now going over algorithm analysis in my data structures course and so I'm going back through my textbook from the previous course to see what it says on the matter.
In the textbook, it says "For every algorithm we want to analyze, we need to define the size of the problem." Doing some Google searching, it's not entirely clear what "problem size" actually means. I'm trying to get a more concrete definition of what a problem size is so I can identify it in an algorithm.
I know that, if I have an algorithm that is sorting a list of numbers, the problem size is n, the size of the list. With that said, saying that doesn't clarify what "problem size" actually is, except for in that context. An algorithm is not just a process to sort numbers, so I can't always say that the problem size is the number of elements in a list.
Hoping someone out there can clarify things for me, and that you all are doing well.
Thank you
The answer is right there in the part you quoted (emphasis mine):
"For every algorithm we want to analyze, we need to define the size of the problem."
The "problem size" is only defined numerically relative to the algorithm. For an algorithm where the input is an array or a list, the problem size is typically measured by its length; for a graph algorithm, the problem size is typically measured by the number of vertices and the number of edges (two variables); for an algorithm where the input is a single number, the problem size may be measured by the number itself, or by the number of bits required to represent the number in binary, depending on context.
So the meaning of "problem size" is specific to the problem that the algorithm solves. If you want a more universal definition which could apply to all problems, then the problem size can be defined as the number of bits required to represent the input; but this definition is not practical, and is only used in theory to talk about classes of problems (such as those which are solvable in polynomial time).
The problem size is the number of bits needed to store an instance of the problem, when it is specified in a reasonable encoding.
To clarify the concept, let me define this in layman's terms:
Given:
You have a big phone book.
Problem:
You are told to find the phone number of a person named John Mcallister.
Approach:
You can either search for this entry page by page (in a linear manner);
or, if the phone book is sorted, you can use binary search.
Answer to your question:
The algorithm's problem here is finding the entry in the phone book.
The problem's size is the size of the data your algorithm is applied to (in this case, the size of your phone book: if it has 10 entries per page and the book has 50 pages, the size is 50 x 10 = 500, that is, 500 entries).
As your algorithm has to work over the entire phone book, the size of the task/problem you implement the algorithm for is 500.
Problem size is generally denoted by n, and it literally means the size of the input data.
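The phone-book example can be sketched in Python with n = 500 entries. This is only an illustration: the entries are stand-in integers instead of names, and the two helper functions are hypothetical. Both return the number of entries inspected, so the effect of the problem size on each approach is visible.

```python
def linear_search_steps(entries, target):
    """Scan entries in order; return (index, number of entries inspected)."""
    steps = 0
    for index, entry in enumerate(entries):
        steps += 1
        if entry == target:
            return index, steps
    return -1, steps

def binary_search_steps(entries, target):
    """Binary search over sorted entries; return (index, entries inspected)."""
    lo, hi = 0, len(entries) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if entries[mid] == target:
            return mid, steps
        if entries[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, steps

phone_book = list(range(500))  # 50 pages x 10 entries, already sorted
print(linear_search_steps(phone_book, 499))  # (499, 500)
print(max(binary_search_steps(phone_book, t)[1] for t in phone_book))  # 9
```

With the same problem size n = 500, the linear scan may inspect all 500 entries, while binary search never needs more than 9.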

Understanding Big-O Notation [duplicate]

I am reading a book "Beginning Algorithms" by Simon Harris and James Ross.
In the early pages, there is a section on Understanding Big-O Notation. I read this section, and re-read this section maybe about a dozen times. I still am not able to wrap my head around a couple of things. I appreciate any help to get rid of my confusion.
The author / authors state "The precise number of operations is not actually that important. The complexity of an algorithm is usually defined in terms of the order of magnitude of the number of operations required to perform a function, denoted by a capital O for order of followed by an expression representing some growth relative to the size of the problem denoted by the letter N."
This really made my head hurt, and unfortunately, everything else following this paragraph made no sense to me because this paragraph is supposed to lay the foundation for the next reading.
The book does not define "order of magnitude." I googled this and the results just told me that orders of magnitude are defined in powers of 10. But what does that even mean? Do you take the number of operations and express that number as a power of 10, and that equals the complexity? Also, what is considered the "size of the problem"? Is the size of the problem the number of operations? Or is the size of the problem the "order of magnitude of the number of operations required to perform a function"?
Any practical examples and a proper explanation of this would really help.
Thanks!
Keep it simple!
Just think of Big-O as a way to express the performance of the algorithm. That performance will depend on the number of elements the algorithm is handling = n.
For example, suppose you have to compute a sum. You need one statement for the first addition, one statement for the second addition, and so on. So the performance will be linear in the number of elements = O(n).
Imagine a very smart algorithm, such as a search over sorted data, which with each element it handles automatically shortens the work left for the next element. This will be logarithmic in the number of elements = O(log(n)).
Or take a complex formula with parameters where, with each extra parameter, the execution time is multiplied by a constant factor (say 10). This will be exponential = O(10^n).
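As a rough illustration of the linear and exponential cases, the following Python sketch counts steps directly. The function names are made up, and the choice of 10 values per parameter is an assumption picked to match O(10^n).

```python
from itertools import product

def count_sum_steps(values):
    """One addition per element: the step count equals len(values)."""
    steps = 0
    total = 0
    for value in values:
        total += value
        steps += 1
    return steps

def count_combinations(num_params, choices_per_param=10):
    """Enumerate every combination of parameter values and count them."""
    return sum(1 for _ in product(range(choices_per_param), repeat=num_params))

print(count_sum_steps(range(100)))  # 100
print(count_sum_steps(range(200)))  # 200
print(count_combinations(2))        # 100
print(count_combinations(3))        # 1000
```

Doubling the number of elements doubles the work for the sum, while adding a single parameter multiplies the number of combinations by 10.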

Choosing the optimal radix/number-of-buckets when sorting n-bit integers using radix sort

This is a popular question: What is the most efficient (in time complexity) way to sort 1 million 32-bit integers. Most answers seem to agree that one of the best ways would be to use radix sort since the number of bits in those numbers is assumed to be constant. This is also a very common thought exercise when CS students are first learning non-comparison based sorts. However, what I haven't seen described in detail (or at least clearly) is how to optimally choose the radix (or number of buckets) for the algorithm.
In this observed answer, the radix/number of buckets was selected empirically, and it turned out to be 2^8 for 1 million 32-bit integers. I'm wondering if there is a better way to choose that number. "Introduction to Algorithms" (p. 198-199) explains that radix sort's running time should be Big Theta(d(n+k)) (d = digits/passes, n = number of items, k = possible values per digit). It then goes further and says that given n b-bit numbers and any positive integer r <= b, radix sort sorts the numbers in Big Theta((b/r)(n+2^r)) time. It then says: "If b>= floor(lg(n)), choosing r ~= floor(lg(n)) gives the best time to within a constant factor...".
But that gives r = lg(1 million) ~= 20, not the r = 8 (2^8 buckets) that the observed answer suggests.
This tells me that either I'm misinterpreting the book's approach to choosing r and missing something (very likely), or the observed answer didn't choose the optimal value.
Could anyone clear this up for me? Thank you in advance.
The observed answer points to something that seems to want credentials from Google and I'm not keen on "papers, please". However, I think that this is best solved empirically because how long each choice of parameter takes will depend on details of caching and other memory access behaviour. When we work out the time an algorithm will take in theory we don't normally use such a detailed model - we normally just think of the number of operations or number of memory accesses, and we usually even discard constant factors so we can use notations like O(n) vs O(n^2).
If you were doing a lot of similar radix sorts within a long-running program, you could have it time a series of test runs at startup to choose the best setting. This would make sure that it used the fastest setting even if different computers required different settings, because they had different-sized caches or a different ratio of access times between main memory and cache.
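For comparison with the empirical approach, the cost formula quoted in the question, (b/r)(n + 2^r), can be evaluated directly. The Python sketch below is a simplified model: rounding b/r up to a whole number of passes is an assumption, the function names are made up, and constant factors and cache effects are ignored entirely.

```python
import math

def radix_sort_cost(n, b, r):
    """Model cost (b/r)(n + 2^r), with b/r rounded up to whole passes."""
    passes = math.ceil(b / r)
    return passes * (n + 2 ** r)

def best_radix_bits(n, b):
    """Return the r in 1..b that minimises the model cost."""
    return min(range(1, b + 1), key=lambda r: radix_sort_cost(n, b, r))

n, b = 10**6, 32
for r in (8, 16, 20):
    print(r, radix_sort_cost(n, b, r))
# 8 4001024
# 16 2131072
# 20 4097152
print(best_radix_bits(n, b))  # 16
```

Under this model r = 8, r = 16 and r = 20 all land within a factor of two of each other, which fits the book's "to within a constant factor" wording; which one is actually fastest on a given machine is the kind of thing the timing runs described above would settle.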

variant of knapsack problem

I have 'n' number of amounts (non-negative integers). My requirement is to determine an optimal set of amounts so that the sum of the combination is less than or equal to a given fixed limit and the total is as large as possible. There is no limit to the number of amounts that can be included in the optimal set.
For the sake of example: the amounts are 143, 2054, 546, 3564 and 1402, and the given limit is 5000.
As per my understanding the knapsack problem has 2 attributes for each item (weight and value). But the problem stated above has only one attribute (amount). I hope that would make things simpler? :)
Can someone please help me with the algorithm or source code for solving this?
This is still an NP-hard problem, but if you want to (or have to) do something like that, maybe this topic helps you out a bit:
find two or more numbers from a list of numbers that add up towards a given amount
where I solved it like this and NikiC modified it to be faster. The only difference: that one was about getting the exact amount, not "as close as possible", but that would need only some small changes in the code (and you'll have to translate it into the language you're using).
Take a look at the comments in my code to understand what I'm trying to do, which is, in short:
calculating all possible combinations of the given parts and summing them up
if the result is the amount I'm looking for, saving the solution to an array
finally, sorting all possible solutions to get the one using the fewest parts
So you'll have to change:
save a solution if it's lower than or equal to the amount you're looking for
sort solutions by total amount instead of by number of parts used
The book "Knapsack Problems" by Hans Kellerer, Ulrich Pferschy and David Pisinger calls this the Subset Sum Problem and dedicates an entire chapter (Ch. 4) to it. The chapter is very comprehensive and covers algorithms as well as computational results.
Even though this problem is a special case of the knapsack problem, it is still NP-hard.
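For limits as small as the one in the question, a dynamic-programming sketch over reachable totals is usually workable even though the problem is NP-hard in general. The Python code below is not the code referred to in the answers above; `best_subset` is a hypothetical helper, and it assumes each amount may be used at most once.

```python
def best_subset(amounts, limit):
    """Return (best_total, chosen) with best_total <= limit as large as possible."""
    # Map every reachable total (<= limit) to one list of amounts producing it.
    reachable = {0: []}
    for amount in amounts:
        # Iterate over a snapshot so each amount is used at most once.
        for total, chosen in list(reachable.items()):
            new_total = total + amount
            if new_total <= limit and new_total not in reachable:
                reachable[new_total] = chosen + [amount]
    best_total = max(reachable)
    return best_total, reachable[best_total]

print(best_subset([143, 2054, 546, 3564, 1402], 5000))
# (4966, [3564, 1402])
```

The dictionary never holds more than limit + 1 totals, so the work is roughly proportional to n times the limit; that is fine for a limit of 5000 but grows with the numeric value of the limit, not just with the number of amounts.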
