Suppose there is an array A containing n elements, and a line of code that contains an if-statement with multiple conditions, for example:
for i = 2 to n
    if A[i] > m and A[i] - A[1] = EVEN
        then set m to A[i]
Is the runtime for the second line n-1, or is it 2*(n-1) since there are two conditions for the if-statement?
Generally speaking, when you're talking about runtime, you need some sort of "cost model" that says how much each operation "costs." It's actually pretty unusual to see a cost model that goes into the level of detail you're going into here. Usually you'd just abstract away the details and say that the cost of performing all of those tests is O(1) (some constant that doesn't depend on the size of the input) rather than counting at that level of precision.
If you are going to count at that level of precision, you might also want to factor in the cost of the array lookups, whether or not the conditions short-circuit, the effect of branch prediction or misprediction on the runtime, etc., and that partially explains why it's so rare to see people actually talk about things at that level of detail.
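To make the short-circuiting point concrete, here is a small Python sketch of the pseudocode above (the original never initializes m, so the sketch assumes m starts at the first element): the second condition is only evaluated when the first one passes, so the exact count lands somewhere between n-1 and 2*(n-1) depending on the data.

def count_condition_evaluations(A):
    m = A[0]
    tests = 0
    for i in range(1, len(A)):
        tests += 1                      # A[i] > m is always evaluated
        if A[i] > m:
            tests += 1                  # the parity test runs only when the first test passes
            if (A[i] - A[0]) % 2 == 0:
                m = A[i]
    return m, tests

print(count_condition_evaluations([10, 3, 12, 5, 14]))
# (14, 6) for n = 5: more than n-1 = 4 tests, fewer than 2*(n-1) = 8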
Andersen's pointer analysis loops over the program multiple times and represents points-to relations in a graph. It iterates over the code until the points-to graph does not change anymore. I came across the following example from
http://pages.cs.wisc.edu/~fischer/cs701.f14/7.POINTER-ANALYSIS.html#andersen
p = &a;
p = &b;
m = &p;
r = *m;
q = &c;
m = &q;
In this case the points-to relation 'r points to c' is the only relation that is added to the graph after the second iteration. However, a precise outcome would be that r points to b after executing this code. So the second iteration makes the outcome less precise.
I understand that a completely precise pointer analysis is not feasible, so points-to sets are an overapproximation. But in every example I have looked at, the extra iteration only makes the outcome less precise. So why iterate multiple times? Wouldn't a single iteration be sound and more precise?
I expect there is a reason for this, so there must be cases in which one iteration misses points-to relations that actually hold. Can someone come up with an example in which a true points-to relation is not found after the first iteration (i.e. the outcome is not sound)?
Andersen's analysis is flow-insensitive. It has no notion of control flow and treats all instructions (statements) as an unordered set.
In your example, the analysis cannot determine that r only points to b in any real execution, because it does not encode which statement comes first, so it conservatively assumes that any order can occur. (Similarly, the first iteration found r = { a, b } and p = { a, b }, despite it being obvious to any person that r and p would only point to b at execution.)
To get the result you want, you need a flow-sensitive analysis, which operates on a data structure that encodes control flow (at least as an aid), such as a control-flow graph, and maintains multiple points-to sets for each memory object: in the most naive formulation, one points-to set per memory object at each program point, though there are ways to carry around less information.
You can take a look at Flow-sensitive pointer analysis for millions of lines of code by Hardekopf and Lin (a draft is freely available on the second author's website). If you can get an intuition for the def-use graph they describe and the notation they use, you can work through the algorithm in Section V.
The downside of flow-sensitive points-to analysis is that it is much more expensive, in both time and space. I believe there is a lot of room to bring cost down, however. There are also other precision dimensions: field-sensitivity, context-sensitivity, path-sensitivity, and others.
As for your last point, imagine the last three statements were in a loop. In a real execution, r would point to b in the first loop iteration and to c in every subsequent iteration. The first iteration of the Andersen's algorithm in your link does not capture this.
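Here is a rough Python sketch (not the algorithm from the linked notes, just an illustration) in which your six statements are treated as constraints that a solver re-applies until nothing changes; it shows that a single pass over the constraints misses a relation that the second pass picks up.

pts = {"p": set(), "m": set(), "r": set(), "q": set()}

def run_pass():
    changed = False
    def add(var, new_targets):
        nonlocal changed
        before = len(pts[var])
        pts[var] |= set(new_targets)
        if len(pts[var]) != before:
            changed = True
    add("p", {"a"})              # p = &a
    add("p", {"b"})              # p = &b
    add("m", {"p"})              # m = &p
    for t in list(pts["m"]):     # r = *m : r must include pts(x) for every x in pts(m)
        add("r", pts[t])
    add("q", {"c"})              # q = &c
    add("m", {"q"})              # m = &q
    return changed

productive_passes = 0
while run_pass():
    productive_passes += 1
print(productive_passes, pts["r"])   # 2 productive passes; r = {a, b, c}

The first pass leaves r = {a, b}; only the second pass adds c, because the fact that m may point to q is discovered after r = *m has already been processed. If the last three statements sit in a loop, r pointing to c is a relation that really occurs, so stopping after one pass would be unsound.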
For an assignment I have to theoretically analyze the complexity of two algorithms (sorting) to compare them. Then I will implement them and try to confirm the efficiency empirically.
Given that, I analyzed both algorithms and I know the efficiency classes but I have a problem identifying the basic operation. There was a hint that we should be careful in choosing a basic operation because it should be applicable for both algorithms. My problem is that I don't really know why I should take the same basic operation for both algorithms.
Pseudocode
Algo1:
//sorts given array A[0..n-1]
for i = 0 to n-2
    min <- i
    for j <- i+1 to n-1
        if A[j] < A[min] min <- j
    swap A[i] and A[min]
Efficiency: Theta(n^2)
Algo2:
//sorts given array A[0..n-1] whose values lie in the limited range [l, u]
for j = 0 to u-l
    D[j] = 0
for i = 0 to n-1
    D[A[i]-l] = D[A[i]-l] + 1
for j = 1 to u-l
    D[j] = D[j-1] + D[j]
for i = n-1 downto 0
    j = A[i]-l
    S[D[j]-1] = A[i]
    D[j] = D[j]-1
return S
Efficiency: Levitin -> Theta(n); Johnsonbaugh -> Theta(n+m), where m is the number of distinct integers in the array
So my understanding is that I choose the operation occurring most often as the basic operation, and I don't see why it makes a difference if I choose a different basic operation for each algorithm. In the end it leads to the same efficiency class anyway, but maybe it's important for the empirical analysis (comparing the number of basic operations needed for different input sizes)?
What I plan to do now is to choose assignment as the basic operation, which appears 5 times in Algo1 and 6 times in Algo2 (how often each executes depends on the loops, of course). Is there a downside to this approach?
Typical choices for a "basic operation" would be the number of comparisons or the number of swaps.
Consider a system with a memory hierarchy, where "hot" items are in cache and "cold" items result in an L2-miss followed by RAM reference, or result in a disk I/O. Then the cache hit cost might be essentially zero, and the basic operation boils down to cost of cache misses, leading to a new expression for time complexity.
Mostly-ordered lists get sorted more often than you might think. A stable sort may be more cache-friendly than an unstable sort. If it is easy to reason about how a sort's comparison order interacts with cache evictions, that can lead to a good big-O description of its expected running time.
EDIT: "Reading an element of A[]" seems a fair operation to talk about. Fancier analyses would look at how many "cache miss on A[]" operations happen.
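As a sketch of what that empirical comparison could look like, here is a hedged Python version of both algorithms (function names and the test data are made up) that counts reads of an element of A as the shared basic operation; element comparisons would not work as a shared operation because Algo2 never compares elements.

def selection_sort(A):
    A = list(A)
    n = len(A)
    reads = 0
    for i in range(n - 1):
        m = i
        for j in range(i + 1, n):
            reads += 2                      # A[j] and A[m] are both read
            if A[j] < A[m]:
                m = j
        A[i], A[m] = A[m], A[i]
    return A, reads

def distribution_counting_sort(A, lo, hi):
    reads = 0
    D = [0] * (hi - lo + 1)
    for x in A:                             # frequency counts
        reads += 1
        D[x - lo] += 1
    for j in range(1, len(D)):              # prefix sums
        D[j] += D[j - 1]
    S = [None] * len(A)
    for x in reversed(A):                   # stable placement, as in Algo2
        reads += 1
        S[D[x - lo] - 1] = x
        D[x - lo] -= 1
    return S, reads

import random
A = [random.randint(0, 9) for _ in range(1000)]
print(selection_sort(A)[1], distribution_counting_sort(A, 0, 9)[1])
# roughly n*(n-1) reads versus 2*n reads, matching Theta(n^2) and Theta(n)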
I was going through my Data Structures and Algorithms notes and came across the following examples regarding time complexity and Big-O notation. The columns on the left count the number of operations carried out in each line. I don't understand why almost all the lines in the first example have a multiple of 2 in front of them, whereas the other two examples don't. Obviously this doesn't affect the resulting O(n), but I would still like to know where the 2 came from.
I can only find one explanation for this: the sloppiness of the author of the slides.
In a proper analysis one has to explain what kind of operations are performed, at which time, for what input (like, for example, this book does on page 21). Without this you cannot even be sure whether the multiplication of two numbers counts as 1 operation, 2 operations, or something else.
These slides are inconsistent. For example:
In slide 1, currentMax = A[0] takes 2 operations. That kind of makes sense if you count finding the 0-th element of the array as one operation and the assignment as another. But in slide 3, n iterations of s = s + X[i] take n operations, which means that s = s + X[i] takes 1 operation. That also kind of makes sense: we just increase one counter.
But the two slides are inconsistent with each other, because it doesn't make sense that a = X[0] costs 2 operations while a = a + X[0], which does strictly more work, costs only 1.
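To see that the choice of convention only moves the constant, here is a tiny Python sketch (not from the slides) that tallies the same loop under two self-consistent conventions: convention A counts each array access and each assignment as one operation (the slide-1 style), convention B counts each executed statement as one operation (the slide-3 style).

def tally(X):
    ops_a = 0          # convention A: array accesses + assignments
    ops_b = 0          # convention B: executed statements
    s = 0
    ops_a += 1         # s = 0 is one assignment
    ops_b += 1
    for x in X:
        ops_a += 2     # s = s + X[i]: one array access, one assignment
        ops_b += 1     # one statement
        s = s + x
    return ops_a, ops_b

print(tally(range(10)))   # (21, 11): different constants, both linear in n

Either convention is defensible; the slides just need to pick one and stick to it.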
For the sake of security, I probably can't post any of our files' code, but I can describe what's going on. Basically, we have standalone items and others that are composed of smaller parts. The current system we have in place works like this. Assume we have n items and m parts for each of the kits, where m is not constant and less than n in all cases.
for(all items){
    if(standalone){
        process item, record available quantity and associated costs
        write to database
    }
    if(kit){
        process item, get number of pre-assembled kits
        for(each part){
            determine how many are used to produce one kit
            divide total number of this specific part by number required, keep track of smallest result
            add cost of this item to total production cost of item
        }
        use smallest resulting number to determine total available quantity for this kit
        write record to database
    }
}
At first I wanted to say that the total time taken for this is O(n^2), but I'm not convinced that's correct, given that about n/3 of all items are kits and m generally ranges between 3 and 8 parts. What would this come out to? I've tested it a few times and it feels like it's not optimized.
From the pseudo-code that you have posted it is fairly easy to work out the cost. You have a loop over n items (thus this is O(n)), and inside this loop you have another loop of O(m). As you worked out, nested loops mean that the orders are multiplied: if they were both of order n this would give O(n^2); instead it is O(mn).
This has assumed that the processing that you have mentioned runs in constant time (i.e. is independent of the size of the inputs). If those descriptions hide some other processing time then this analysis will be incorrect.
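To put numbers on it, here is a rough Python sketch (the data layout and field names are assumptions, since the real code can't be posted) that mirrors the loop shape and counts how often the per-part work runs; with about n/3 kits and 3 to 8 parts each, that inner work is linear in n, nowhere near n^2.

def process_all(items):
    inner_steps = 0
    for item in items:
        if item["kind"] == "standalone":
            pass                            # constant-time processing plus one DB write
        else:                               # a kit
            smallest = None
            production_cost = 0
            for part in item["parts"]:      # m parts, 3 to 8 in this scenario
                inner_steps += 1
                buildable = part["stock"] // part["per_kit"]
                smallest = buildable if smallest is None else min(smallest, buildable)
                production_cost += part["cost"]
            # `smallest` is the available quantity for the kit, then one DB write
    return inner_steps

items = [{"kind": "standalone"}] * 200 + \
        [{"kind": "kit", "parts": [{"stock": 40, "per_kit": 4, "cost": 2.5}] * 5}] * 100
print(process_all(items))                   # 500 = 100 kits * 5 parts each, i.e. O(n*m)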
The setup: I have two arrays which are not sorted and are not of the same length. I want to see if one of the arrays is a subset of the other. Each array is a set in the sense that there are no duplicates.
Right now I am doing this subset check sequentially in a brute-force manner, so it isn't very fast. I have been having trouble finding any algorithms online that A) go faster and B) work in parallel. Say the maximum size of either array is N; right now it scales something like N^2. I was thinking that maybe if I sorted them and did something clever I could bring it down to something like N log(N), but I'm not sure.
The main thing is I have no idea how to parallelize this operation at all. I could just do something like each processor looks at an equal amount of the first array and compares those entries to all of the second array, but I'd still be doing N^2 work. But I guess it'd be better since it would run in parallel.
Any ideas on how to improve the work and make it parallel at the same time?
Thanks
Suppose you are trying to decide if A is a subset of B, and let len(A) = m and len(B) = n.
If m is a lot smaller than n, then it makes sense to me to sort A and then iterate through B, doing a binary search on A for each element to see if there is a match or not. You can partition B into k parts and have a separate thread iterate through each part doing the binary searches.
To count the matches you can do 2 things. Either you have a num_matched variable that is incremented every time you find a match (you would need to guard this variable with a mutex, though, which might hinder your program's concurrency) and then check whether num_matched == m at the end of the program. Or you have another array or bit vector of size m, and a thread sets the k'th entry if it found a match for the k'th element of A. Then at the end, you make sure this array is all 1's. (On second thought, a bit vector might not work out without a mutex, because threads might overwrite each other's updates when they load the integer containing the bit relevant to them.) The array approach, at least, would not need any mutex that could hinder concurrency.
Sorting would cost you m log(m) and then, if you only had a single thread doing the matching, the matching would cost you n log(m). So if n is a lot bigger than m, this is effectively n log(m). Your worst case still remains N log(N), but I think concurrency would really help a lot here to make this fast.
Summary: Just sort the smaller array.
Alternatively if you are willing to consider converting A into a HashSet (or any equivalent Set data structure that uses some sort of hashing + probing/chaining to give O(1) lookups), then you can do a single membership check in just O(1) (in amortized time), so then you can do this in O(n) + the cost of converting A into a Set.
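As a concrete sketch of both suggestions (names are illustrative, elements are assumed hashable and comparable, and Python threads are used only to show the partitioning scheme, not for real speedup), something like this would work:

from bisect import bisect_left
from concurrent.futures import ThreadPoolExecutor

def is_subset_hash(A, B):
    # Hash the smaller array A, scan B once, and cross off every element of A
    # that is seen; A is a subset of B iff nothing is left over.
    remaining = set(A)
    for x in B:
        remaining.discard(x)
        if not remaining:
            return True
    return not remaining

def is_subset_sorted_parallel(A, B, workers=4):
    # Sort the smaller array A once (m log m), then let each worker scan its own
    # chunk of B, binary-searching A and marking which elements of A it found.
    A_sorted = sorted(A)
    matched = [False] * len(A_sorted)

    def scan(chunk):
        for x in chunk:
            i = bisect_left(A_sorted, x)
            if i < len(A_sorted) and A_sorted[i] == x:
                matched[i] = True           # no duplicates, so at most one writer per slot

    step = max(1, len(B) // workers)
    chunks = [B[i:i + step] for i in range(0, len(B), step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pool.map(scan, chunks)
    return all(matched)                     # every element of A was found somewhere in B

A = [7, 3, 11]
B = [1, 3, 5, 7, 9, 11, 13]
print(is_subset_hash(A, B), is_subset_sorted_parallel(A, B))   # True True

The marked-array version mirrors the bit-vector idea above but avoids the mutex problem, because each slot is a separate list element written by at most one thread.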