Consider a class containing two doubles:
class path_cost {
double length;
double time;
};
If I want to lexicographically order a list of path_costs, I have a problem. Read on :)
If I use exact equality for the equality test, like so
bool operator<(const path_cost& rhs) const {
if (length == rhs.length) return time < rhs.time;
return length < rhs.length;
}
the resulting order is likely to be wrong, because a small deviation (e.g. due to numerical inaccuracies in the calculation of the length) may cause the length test to fail, so that e.g.
{ 231.00000000000001, 40 } < { 231.00000000000002, 10 }
erroneously holds.
If I alternatively use a tolerance like so
bool operator<(const path_cost& rhs) const {
    if (std::fabs(length - rhs.length) < 1e-6) return time < rhs.time;
return length < rhs.length;
}
then the sorting algorithm may horribly fail since the <-operator is no longer transitive (that is, if a < b and b < c then a < c may not hold)
Any ideas? Solutions? I have thought about partitioning the real line, so that numbers within each partition are considered equal, but that still leaves too many cases where the equality test fails but should not.
(UPDATE by James Curran, hopefully explaining the problem):
Given the numbers:
A = {231.0000012000, 10}
B = {231.0000005000, 40}
C = {231.0000001000, 60}
A.Length & B.Length differ by 7e-7, so we use time, and A < B.
B.Length & C.Length differ by 4e-7, so we use time, and B < C.
A.Length & C.Length differ by 1.1e-6, so we use length, and A > C.
(Update by Esben Mose Hansen)
This is not purely theoretical. The standard sort algorithms tend to crash, or worse, when given a non-transitive comparison operator. And this is exactly what I have been contending with (and boy, was that fun to debug ;) )
Do you really want just a compare function?
Why don't you sort by length first, then group the pairs you think have the same length, and then sort within each group by time?
Once sorted by length, you can apply whatever heuristic you need to determine 'equality' of lengths and do the grouping.
I don't think you are going to be able to do what you want. Essentially you seem to be saying that in certain cases you want to ignore the fact that a>b and pretend a=b. I'm pretty sure that you can construct a proof that says if a and b are equivalent when the difference is smaller than a certain value then a and b are equivalent for all values of a and b. Something along the lines of:
For a tolerance of C and two numbers A and B where, without loss of generality, A > B, define D(n) = B + n*(C/10) for 0 <= n <= 10*(A-B)/C. Trivially, D(n) is within the tolerance of D(n-1) and D(n+1), and therefore equivalent to them. Since D(0) = B and D(10*(A-B)/C) = A, A and B are linked by a chain of equivalences and so can be said to be equivalent.
I think the only way you can solve that problem is using a partitioning method. Something like multiplying by 10^6 and then converting to an int should partition pretty well, but it will mean that if you have 1.00001*10^-6 and 0.999999*10^-6 then they will come out in different partitions, which may not be desired.
The problem then becomes looking at your data to work out how to best partition it which I can't help with since I don't know anything about your data. :)
P.S. Do the algorithms actually crash when given such a comparison function, or just when they encounter specific unsolvable cases?
I can think of two solutions.
You could carefully choose a sorting algorithm that does not fail when the comparisons are intransitive. For example, quicksort shouldn't fail, at least if you implement it yourself. (If you are worried about the worst case behavior of quicksort, you can first randomize the list, then sort it.)
Or you could extend your tolerance patch so that it becomes an equivalence relation and you restore transitivity. There are standard union-find algorithms to complete any relation to an equivalence relation. After applying union-find, you can replace the length in each equivalence class with a consensus value (such as the average, say) and then do the sort that you wanted to do. It feels a bit strange to doctor floating point numbers to prevent spurious reordering, but it should work.
Actually, Moron makes a good point. Instead of union and find, you can sort by length first, then link together neighbors that are within tolerance, then do a subsort within each group on the second key. That has the same outcome as my second suggestion, but it is a simpler implementation.
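To make that concrete, here is a minimal Java sketch of the sort-then-link-then-subsort idea (PathCost, TolerantSort and EPS are illustrative names I am introducing; the 1e-6 tolerance is taken from the question, and path_cost is rendered as a Java class only for the sake of the example):

import java.util.Comparator;
import java.util.List;

class PathCost {
    final double length, time;
    PathCost(double length, double time) { this.length = length; this.time = time; }
}

class TolerantSort {
    static final double EPS = 1e-6;  // tolerance from the question

    static void sort(List<PathCost> items) {
        // 1. Sort on the primary key exactly; this is a valid strict weak ordering.
        items.sort(Comparator.comparingDouble(p -> p.length));
        // 2. Link neighbours whose lengths are within EPS into one group
        //    (the transitive closure of the tolerance relation), then
        // 3. order each group by the secondary key.
        int start = 0;
        while (start < items.size()) {
            int end = start + 1;
            while (end < items.size()
                    && items.get(end).length - items.get(end - 1).length < EPS) {
                end++;
            }
            items.subList(start, end).sort(Comparator.comparingDouble(p -> p.time));
            start = end;
        }
    }
}

Because neighbours are chained, a group may end up spanning more than EPS from end to end; that is exactly the equivalence-closure behaviour described above, and whether it is acceptable depends on the data.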
I'm not familiar with your application, but I'd be willing to bet that the differences in distance between points in your graph are many orders of magnitude larger than the rounding errors on floating point numbers. Therefore, if two entries differ by only the round-off error, they are essentially the same, and it makes no difference in which order they appear in your list. From a common-sense perspective, I see no reason to worry.
You will never get 100% precision with ordinary doubles. You say that you are afraid that using tolerances will affect the correctness of your program. Have you actually tested this? What level of precision does your program actually need?
In most common applications I find a tolerance of something like 1e-9 suffices. Of course it all depends on your application. You can estimate the level of accuracy you need and just set the tolerance to an acceptable value.
If even that fails, it means that double is simply inadequate for your purposes. This scenario is highly unlikely, but can arise if you need very high precision calculations. In that case you have to use an arbitrary precision package (e.g. BigDecimal in Java or something like GMP for C). Again, only choose this option when there is no other way.
Related
I am using PARI/GP which is a mathematics program with some helpful functionality for number theory, especially because it supports very large integers out of the box. For a previous C++ project I had to use a library called BigInt.
At the moment, using PARI/GP I am utilising the gcd() function to calculate the greatest common divisor (GCD) for numbers ranging from 0 to 255 digits in length, so as you can imagine the numbers do get very large! I set a = 0 and then my loop iterates upwards, each time calculating gcd(a,b), where b is a long, fixed number that never changes.
I was wondering if perhaps I should use Euclid's approach to calculating the GCD, which I believe is the following simple formula: gcd(b, a % b), where the % symbol means modulo. Hopefully I got the variables in the correct order!
Is there a rough and quick way to approximate which approach shown above for calculating GCD is quickest? I would, of course, be open minded to other approaches which are quicker.
I do not expect my algorithm to ever finish; this is just an experiment to see how far it can reach based on which approach I use for calculating the GCD.
Binary GCD should generally be better than naive Euclid, but a being very small compared to b is a special circumstance that may trigger poor performance from Binary GCD. I’d try one round of Euclid, i.e., gcd(b, a%b) where gcd is Binary GCD.
(But without knowing the underlying problem here, I’m not sure that this is the best advice.)
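A rough Java sketch of that combination, for non-negative word-sized operands only (the Gcd class is something I am making up for illustration; for the 255-digit numbers in the question you would stay with PARI's gcd, which, as the next answer explains, already chooses suitable algorithms on its own):

class Gcd {
    // One Euclidean reduction first (cheap and effective when one operand is
    // much smaller than the other), then plain binary GCD, which needs only
    // shifts and subtractions.
    static long gcd(long a, long b) {
        if (a == 0) return b;
        if (b == 0) return a;
        long small = Math.min(a, b);
        long r = Math.max(a, b) % small;                // one round of Euclid
        a = small;
        b = r;
        if (b == 0) return a;
        int shift = Long.numberOfTrailingZeros(a | b);  // shared power of two
        a >>= Long.numberOfTrailingZeros(a);            // make a odd
        while (b != 0) {
            b >>= Long.numberOfTrailingZeros(b);        // make b odd
            if (a > b) { long t = a; a = b; b = t; }    // keep a <= b
            b -= a;                                     // gcd(a, b) = gcd(a, b - a)
        }
        return a << shift;
    }
}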
The best approach is to let pari do the work for you.
First, you can compute the gcd of a large number of inputs stored in a vector v as gcd(v).
? B=10^255; v = vector(10^6,i,random(B));
? gcd(v);
time = 22 ms.
? a = 0; for(i = 1, #v, a = gcd(a,v[i]))
time = 232 ms. \\ much worse
There are two reasons for gcd(v) to be much faster on such small inputs: loop overhead and variable assignments on the one hand, and early abort on the other (as soon as the intermediate answer is 1, we can stop). You can multiply v by 2, say, to prevent the second optimization; the simple gcd(v) will remain faster [because the loop and assignment overhead still occurs, but in C rather than in interpreted GP; for small inputs this overhead is very noticeable, and it becomes negligible as the sizes increase].
Similarly, it should always be faster on average to let the gcd function work out by itself how best to compute gcd(a,b) than to try to "improve" things by using tricks such as gcd(b, a % b) [note: the order doesn't matter, and this will error out if b = 0, which gcd is clever enough to check]. gcd(a, b-a) will not error out, but it will slow things down on average. For instance, gcd(a,b) will try an initial Euclidean step in case a and b have vastly differing sizes, so it shouldn't help to try and add that step yourself.
Finally, the exact algorithms used depend on the underlying multiprecision library, either native PARI or GNU's GMP, the latter being faster due to a highly optimized implementation. In both cases, as operand sizes increase, this includes Euclid's algorithm, binary plus/minus [dividing out powers of 2, we can assume a, b odd, then use gcd(b, (a-b)/4) if a = b mod 4 and gcd(b, (a+b)/4) otherwise; the divisions are just binary shifts], and asymptotically fast half-gcd (almost linear in the bit size). The latter is almost surely not being used in your computations, since the threshold should be over 10,000 decimal digits. On the other hand, Euclid's algorithm will only be used for tiny (word-size) operands, but since all the algorithms are recursive, it will eventually be used once the size has become tiny enough.
If you want to investigate the speed of the gcd function, try it with integers around 100,000 decimal digits (then double that size, say); you should observe the almost-linear complexity.
Just wondering why the base case for Karatsuba multiplication (shown here: http://www.sanfoundry.com/java-program-karatsuba-multiplication-algorithm/) is chosen to be "N <= 10"? I found that "N <= 4, 3, 2, 1" will not give me a correct result. Can anyone explain?
Karatsuba's algorithm will work correctly with any "sufficiently large" base case, where "sufficiently large" means "large enough that when it's divided into smaller subproblems, those subproblems are indeed smaller and produce the right answer." In that sense, there isn't "a" base case for Karatsuba as much as a general rule for what base cases might look like.
Honestly, the code you linked doesn't seem like it's a very reasonable implementation of the algorithm. It works with longs, which can already be multiplied in O(1) time on any reasonable system, and its base case is to stop when the numbers are less than 10^10, meaning that with 64-bit numbers the recursion always terminates after a single step. A better implementation would likely use something like a BigInteger type that can support arbitrary-precision multiplication. At that point, choosing the optimal base case is a matter of performance tuning. Make the base case have a number of digits that's too small and the recursion on smaller numbers will take dramatically more time than just doing a naive multiply. Make the base case too high and you'll start to see slowdowns as cases better handled by the recursive step instead get handled by naive multiplication.
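To make the tuning point concrete, here is a sketch of what a BigInteger-based Karatsuba with an explicit, tunable base case might look like. The class name, the bit-length split and the 1600-bit threshold are illustrative choices of mine, not taken from the linked code, and the sketch assumes non-negative inputs; note also that the JDK's own BigInteger.multiply already switches to Karatsuba (and Toom-Cook) internally for large operands.

import java.math.BigInteger;

class Karatsuba {
    // Below this size, fall back to the built-in multiply: recursing further
    // costs more than it saves. The exact value is a tuning knob.
    static final int BASE_CASE_BITS = 1600;  // assumption; tune empirically

    static BigInteger multiply(BigInteger x, BigInteger y) {
        int n = Math.max(x.bitLength(), y.bitLength());
        if (n <= BASE_CASE_BITS) return x.multiply(y);   // base case

        int half = n / 2;
        BigInteger mask = BigInteger.ONE.shiftLeft(half).subtract(BigInteger.ONE);
        BigInteger xHigh = x.shiftRight(half), xLow = x.and(mask);
        BigInteger yHigh = y.shiftRight(half), yLow = y.and(mask);

        BigInteger a = multiply(xHigh, yHigh);
        BigInteger b = multiply(xLow, yLow);
        // (xHigh + xLow)(yHigh + yLow) - a - b == xHigh*yLow + xLow*yHigh
        BigInteger c = multiply(xHigh.add(xLow), yHigh.add(yLow)).subtract(a).subtract(b);

        return a.shiftLeft(2 * half).add(c.shiftLeft(half)).add(b);
    }
}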
If you included the source code in your post, you might well have gotten a to-the-point answer sooner.
If you used something like BigInteger.divideAndRemainder() to "split" your numbers, you wouldn't run the risk of writing something like
long d = y / m;
long c = y - (d * N);
(using a multiplier different from the divisor).
Note that the product of two 10 digit numbers doesn't always fit into Java's long.
Numbers with less than 10 digits can be multiplied natively (x*y), because the result will always fit in a signed 64-bit integer.
Using the long datatype doesn't make much sense though, since most number combinations that don't overflow will just get evaluated natively. You would have to change to BigInteger or something similar, and use much larger numbers, to get any gains from the algorithm.
As for why it is failing for lower limits of N, I am not sure. The algorithm has to be able to split both numbers into two similarly sized parts. I guess it ends up with zeros or negative numbers in some cases.
Given is an array of n distinct objects (not integers), where n is between 5 and 15. I have a comparison function cmp(a, b) which is true if a < b and false otherwise, but it's very expensive to call. I'm looking for a sorting algorithm with the following properties:
It calls cmp(a, b) as few times as possible (subject to constraints below). Calls to cmp(a, b) can't be parallelized or replaced. The cost is unavoidable, i.e. think of each call to cmp(a, b) as costing money.
Aborting the algorithm should give good-enough results (a best-fit sort of the array). Ideally the algorithm should attempt to produce a coarse order of the whole array, as opposed to partially sorting one subset at a time. This may imply that the overall number of calls is not as small as theoretically possible to sort the entire array.
cmp(a, b) implies not cmp(b, a) => No items in the array are equal => Stability is not required. This is always true, unless...
In rare cases cmp(a, b) violates transitivity. For now I'll ignore this, but ultimately I would like this to be handled as well. Transitivity could be violated in short chains, i.e. x < y < z < x, but not in longer chains. In this case the final order of x y z doesn't matter.
Only the number of calls to cmp() needs to be optimized; algorithm complexity, space, speed and other factors are irrelevant.
Back story
Someone asked where this odd problem arose. Well, despite my shallow attempt at formalism, the problem is actually not formal at all. A while back a friend of mine found a web page on the internets that allowed him to put some stuff in a list and make comparisons on that list in order to get it sorted. He has since lost that web page, and asked me to help him out. Sure, I said, and smashed my keyboard, arriving at this implementation. You are welcome to peruse the source code to see how I pretended to solve the problem above. Since I was quite inebriated when all this happened, I decided to outsource the real thinking to Stack Overflow.
Your best bet to start with would be Chapter 5 of Knuth's TAOCP Vol. III; it is about optimal sorting (i.e., sorting with a minimal number of comparisons). OTOH, since the number of objects you are sorting is very small, I doubt there will be any noticeable difference between an optimal algorithm and, say, bubble sort. So perhaps you will need to focus on making the comparisons cheaper. Strange problem, though... would you mind giving details? Where does it arise?
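As a concrete baseline for n between 5 and 15, a plain binary insertion sort already gets close to the information-theoretic minimum: inserting the k-th item costs about log2(k) comparisons, around 45 calls to cmp for 15 items against a lower bound of ceil(log2(15!)) = 41. The sketch below is illustrative only (it is not the Ford-Johnson merge-insertion sort from TAOCP), and it does nothing special about the graceful-abort or intransitivity requirements.

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class FewComparisonsSort {
    static <T> List<T> sort(List<T> items, Comparator<? super T> cmp) {
        List<T> sorted = new ArrayList<>();
        for (T item : items) {
            int lo = 0, hi = sorted.size();
            while (lo < hi) {                        // binary search for the slot
                int mid = (lo + hi) >>> 1;           // ~log2(k) calls to cmp per item
                if (cmp.compare(item, sorted.get(mid)) < 0) hi = mid;
                else lo = mid + 1;
            }
            sorted.add(lo, item);                    // insertion itself costs no comparisons
        }
        return sorted;
    }
}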
I have a very large positive integer (a million digits). I need to represent it with the smallest possible function; the number is variable, which means I need an algorithm that generates the smallest possible function for any given number.
Example: for the number 29512665430652752148753480226197736314359272517043832886063884637676943433478020332709411004889 the algorithm must return "9^99". It must be able to analyze numbers and always return a math function that represents the number. For example, the number 21847450052839212624230656502990235142567050104912751880812823948662932355202 must return "9^5^16+1".
Heard of Kolmogorov complexity?
To answer your question: unless you restrict yourself to some specific set of functions, it's impossible.
EDIT: Even in your example, how do you know that the shortest representation of 21847450052839212624230656502990235142567050104912751880812823948662932355202 is actually 9^5^16+1? Isn't that quite hard to prove, even in this specific case?
If you restrict yourself to some set of functions then you can use the following algorithm:
For i = 1 to n
    enumerate all strings s of length i
    if s represents a valid expression according to rules chosen a priori,
       and evaluates to the number in the input,
        return s
It is guaranteed to halt because on the last iteration of the outer loop (i = n) you will eventually get to a string that contains the input verbatim.
Of course, this is not very efficient. Specifically, it is O(b^n), where n is the length of the input and b is the size of the alphabet.
Expanding on @ybungalobill's terse answer, your function is equivalent to a function that computes the Kolmogorov complexity of an arbitrary string. (The equivalence is obvious if you treat each digit of your very large numbers as a character, and the numbers as sequences of characters.)
According to the Wikipedia page on Kolmogorov complexity, the K(s) function that gives the complexity of a string s is not a computable function. (The page includes a proof.)
In other words, the algorithm you want simply does not exist.
@BlueRaja - Danny Pflughoeft: yes, it is. I'm trying to create some compression that uses this algorithm, but as it turns out, this is impossible.
That's because it's technically impossible to compress arbitrary data, for the same reason, but that doesn't stop us from doing it :)
There are much better ways of compressing data, however. Take a look at, for instance, LZ. It is so ubiquitous that you can almost certainly find a library to do the compression for you, regardless of what language you're writing in. DEFLATE is another popular one.
Hope that helps!
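For illustration, a minimal compression call with the JDK's built-in DEFLATE support might look like the sketch below (the class name and the generous output-buffer slack are simplifications of mine, not a robust implementation):

import java.util.Arrays;
import java.util.zip.Deflater;

class DeflateExample {
    // Compress a byte array with java.util.zip.Deflater. For tiny or
    // incompressible inputs DEFLATE can expand the data slightly, hence
    // the extra slack in the output buffer.
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[input.length + 64];
        int written = deflater.deflate(buffer);
        deflater.end();
        return Arrays.copyOf(buffer, written);
    }
}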
If you're not looking for optimality, just a reasonably good job, then there are a bunch of heuristics you can use. For example, try to decompose n using all of the following
n = a^k + b
for k = 2, 3, ..., log n, and pick the one with the smallest a + b, say. You can compute a and b using a = floor(n^(1/k)) and b = n-a^k. Then recurse on a and b.
Of course, this uses only exponentiation and addition to find a good compression. If you allow subtraction as well, use a=round(n^(1/k)) instead and let b be negative.
Allowing multiplication as well makes it quite a bit harder because you would probably need to factor n.
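Here is a sketch of that heuristic in Java, using only exponentiation and addition. The class and method names are made up, the 8-bit base case and the smallest-a-plus-b rule are arbitrary choices, no optimality is claimed, and it is not tuned for speed.

import java.math.BigInteger;

class PowerDecomposition {
    static final BigInteger TWO = BigInteger.valueOf(2);

    // Try n = a^k + b for k = 2..log2(n) with a = floor(n^(1/k)), keep the
    // candidate with the smallest a + b, then recurse on a and b.
    static String compress(BigInteger n) {
        if (n.bitLength() <= 8) return n.toString();     // small numbers verbatim
        BigInteger bestA = null, bestB = null;
        int bestK = 0;
        for (int k = 2; k <= n.bitLength(); k++) {
            BigInteger a = floorKthRoot(n, k);
            if (a.compareTo(TWO) < 0) break;             // a = 1 never helps
            BigInteger b = n.subtract(a.pow(k));
            if (bestA == null || a.add(b).compareTo(bestA.add(bestB)) < 0) {
                bestA = a; bestB = b; bestK = k;
            }
        }
        String base = compress(bestA);
        if (base.contains("+") || base.contains("^")) base = "(" + base + ")";
        String tail = bestB.signum() == 0 ? "" : "+" + compress(bestB);
        return base + "^" + bestK + tail;
    }

    // Floor of the k-th root of n, by binary search.
    static BigInteger floorKthRoot(BigInteger n, int k) {
        BigInteger lo = BigInteger.ONE;
        BigInteger hi = BigInteger.ONE.shiftLeft(n.bitLength() / k + 1);
        while (lo.compareTo(hi) < 0) {
            BigInteger mid = lo.add(hi).add(BigInteger.ONE).shiftRight(1);
            if (mid.pow(k).compareTo(n) <= 0) lo = mid; else hi = mid.subtract(BigInteger.ONE);
        }
        return lo;
    }
}

For example, compress(BigInteger.valueOf(257)) returns "2^8+1".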
I haven't been able to find any satisfactory coverage of this topic all in one place, so I was wondering:
What are the fastest set intersect, union, and disjoin algorithms?
Are there any interesting ones with limited domains?
Can anyone beat O(Z) where Z is the actual size of intersection?
If your approach relies on sorted sets, please note that, but don't consider it a disqualifying factor. It seems to me that there must be a veritable storehouse of subtle optimizations to be shared, and I don't want to miss any of them.
A few algorithms I know rely on bitwise operations beyond the vanilla, so you may assume the presence of SSE4 and access to intrinsics like popcount. Please note this assumption.
Of interest:
An Implementation of B-Y Intersect
Update
We've got some really good partial answers, but I'm still hoping for some more complete attacks on the problem. I'm particularly interested in seeing a more fully articulated use of bloom filters in attacking the problem.
Update
I've done some preliminary work on combining bloom filters with a cuckoo hash table. It's looking almost obnoxiously promising, because they have very similar demands. I've gone ahead and accepted an answer, but I'm not really satisfied at the moment.
If you're willing to consider set-like structures then bloom filters have trivial union and intersect operations.
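To illustrate why those operations are trivial, here is a toy Bloom filter in Java where union is a bitwise OR and (approximate) intersection is a bitwise AND of the underlying bit arrays. The class layout and the derived-hash scheme are simplifications of mine, not a production-quality filter, and both filters must share the same size and hash functions.

import java.util.BitSet;

class BloomFilter {
    final BitSet bits;
    final int size;
    final int hashes;

    BloomFilter(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    private int index(Object o, int i) {
        // crude derived hashing, good enough for the illustration
        return Math.floorMod(o.hashCode() + i * 0x9E3779B9, size);
    }

    void add(Object o) {
        for (int i = 0; i < hashes; i++) bits.set(index(o, i));
    }

    boolean mightContain(Object o) {
        for (int i = 0; i < hashes; i++) if (!bits.get(index(o, i))) return false;
        return true;
    }

    BloomFilter union(BloomFilter other) {
        BloomFilter result = new BloomFilter(size, hashes);
        result.bits.or(this.bits);
        result.bits.or(other.bits);    // union of Bloom filters: OR the bit arrays
        return result;
    }

    BloomFilter intersect(BloomFilter other) {
        BloomFilter result = new BloomFilter(size, hashes);
        result.bits.or(this.bits);
        result.bits.and(other.bits);   // approximate intersection: AND the bit arrays
        return result;
    }
}

Note that the AND-ed filter may report more false positives than a filter built directly from the true intersection, which is part of the "set-like" caveat above.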
For reasonably dense sets, interval lists can beat O(n) for the operations you specify, where n is the number of elements in the set.
An interval list is essentially a strictly increasing list of numbers, [a1, b1, a2, b2, ..., an, bn], where each pair ai, bi denotes the interval [ai, bi). The strictly increasing constraint ensures that every describable set has a unique representation. Representing a set as an ordered collection of intervals allows your set operations to deal with multiple consecutive elements on each iteration.
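Here is a sketch of intersection on that representation (the flattened List<Long> layout [a1, b1, a2, b2, ...] and the class name are illustrative): each step consumes at least one interval, so the cost is proportional to the number of intervals rather than to the number of set elements.

import java.util.ArrayList;
import java.util.List;

class IntervalList {
    // Both inputs are strictly increasing flattened interval lists; the result
    // is their intersection in the same representation.
    static List<Long> intersect(List<Long> x, List<Long> y) {
        List<Long> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < x.size() && j < y.size()) {
            long lo = Math.max(x.get(i), y.get(j));          // overlap starts at the later start
            long hi = Math.min(x.get(i + 1), y.get(j + 1));  // and ends at the earlier end
            if (lo < hi) { out.add(lo); out.add(hi); }
            // advance whichever interval finishes first
            if (x.get(i + 1) <= y.get(j + 1)) i += 2; else j += 2;
        }
        return out;
    }
}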
If the sets are actually hash sets and both have the same hash function and table size, then we can skip all buckets that are non-empty in only one of the sets. That could narrow the search a bit.
The following paper presents algorithms for union, intersection and difference on ordered sets that beat O(Z) if the intersection is larger than the difference (Z > n/2):
Confluently Persistent Sets and Maps
There is no solution better than O(Z), because if you think about the problem logically, each of the intersect, union and disjoin algorithms must read all of the input elements at least once, so Z reads are a must. Also, since the sets are not sorted by default, no further optimization could beat O(Z).
Abstractly, a set is something that supports an operation, "is X a member?". You can define that operation on the intersection A ∩ B in terms of the same operation on A and B. An implementation would look something like:
interface Set { boolean isMember(Object x); }

class Intersection implements Set {
    private final Set a, b;
    public Intersection(Set a, Set b) { this.a = a; this.b = b; }
    public boolean isMember(Object x) {
        // x is in the intersection exactly when it is in both A and B
        return a.isMember(x) && b.isMember(x);
    }
}
A and B could be implemented using an explicit set type, like a HashSet. The cost of that operation on each is quite cheap; let's approximate it with O(1), so the cost on the intersection is just 2 × O(1). ;-)
Admittedly if you build a large hierarchy of intersections like this, checking for a member can be more expensive, up to O(n) for n sets in the hierarchy. A potential optimisation for this could be to check the depth of the hierarchy against a threshold, and materialise it into a HashSet if it exceeds it. This will reduce the member operation cost, and perhaps amortise the construction cost when many intersections are applied.