Given two lists, say A = [1, 3, 2, 7] and B = [2, 3, 6, 3]
Find the set of all products that can be formed by multiplying a number in A with a number in B. (By set, I mean I do not want duplicates.) I am looking for the fastest running time possible. Hash functions are not allowed.
The first approach would be brute force, where we multiply every number from A with every number in B, and if we find a product that is not already in the list, we add it to the list. Finding all possible products costs O(n^2), and verifying whether a product is already present in the list costs O(n^2) per product, because the list can grow to n^2 entries. So the total comes to O(n^4).
I am looking to optimize this solution. The first thing that comes to mind is to remove duplicates in list B. In my example, 3 is a duplicate, and I do not need to compute the product of all elements from A with the duplicate 3 again. But this still doesn't help reduce the overall runtime.
I am guessing the fastest possible run time can be O(n^2) if all the numbers in A and B combined are unique AND prime. That way it is guaranteed that there will be no duplicates and I do not need to verify if my product is already present in the list. So I am thinking if we can pre-process our input list such that it will guarantee unique product values (One way to pre-process is to remove duplicates in list B like I mentioned above).
Is this possible in O(n^2) time and will it make a difference if I only care about the number of unique possible products instead of the actual products?
for i = 1 to A.length:
    for j = 1 to B.length:
        if (A[i] * B[j]) not already present in list:  // takes O(n^2) time to verify this
            Add (A[i] * B[j]) to list
        end if
    end for
end for
print list
Expected result for the above input: 2, 3, 6, 9, 18, 4, 12, 14, 21, 42
EDIT:
I can think of an O(n^2 log n) solution:
1) Generate all possible product values without worrying about duplicates. This is O(n^2).
2) Sort these product values. This is O(n^2 log n) because there are n^2 numbers to sort.
3) Remove the duplicates in linear time, since the elements are now sorted.
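The three steps above can be sketched in Python as follows. This is only a minimal illustration: the function name `unique_products` is made up for this example, and it relies on the built-in comparison sort rather than any hashing.

```python
def unique_products(a, b):
    """Return the distinct products x * y (x in a, y in b) in ascending order."""
    # Step 1: generate all n^2 products, duplicates included.
    # Step 2: sort them with a comparison sort, O(n^2 log n).
    products = sorted(x * y for x in a for y in b)

    # Step 3: one linear pass; equal values are adjacent after sorting.
    result = []
    for p in products:
        if not result or result[-1] != p:
            result.append(p)
    return result


print(unique_products([1, 3, 2, 7], [2, 3, 6, 3]))
```

If only the number of distinct products is needed, the same pass can count instead of appending, but the sort still dominates the running time.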
Use sets to eliminate duplicates.
A=[3, 6, 6, 8]
B=[7, 8, 56, 3, 2, 8]
setA = set(A)
setB = set(B)
prod = {i * j for i in setA for j in setB}  # set comprehension drops duplicate products
print(prod)
{64, 448, 6, 168, 9, 42, 12, 16, 48, 18, 336, 21, 24, 56}
Expected complexity is O(n^2). Note that Python sets are hash-based.
Another way is the following.
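O(n^4) worst-case complexity, since each of the n^2 products is compared against a sorted list that can itself grow to n^2 entries.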
prod=[]
A=[1,2,2,3]
B=[5,6,6,7]
for i in A:
    for j in B:
        if prod==[]:
            prod.append(i*j)
            continue
        # prod is kept sorted: scan for the insertion point
        for k in range(len(prod)):
            if i*j < prod[k]:
                prod.insert(k,i*j)
                break
            elif i*j == prod[k]:
                # duplicate product, skip it
                break
            if k==len(prod)-1:
                # larger than everything seen so far
                prod.append(i*j)
print(prod)
Yet another way. This could be using hash functions internally.
from toolz import unique
A=[1,2,2,3]
B=[5,5,7,8]
print(list(unique([i*j for i in A for j in B])))
Okay, so I have a huge array of unsorted elements of an unknown data type. All elements are of the same type, obviously; I just can't make assumptions about it, as they could be numbers, strings, or any type of object that overloads the < and > operators. The only assumption I can make about those objects is that no two of them are the same, and comparing them (A < B) should tell me which one should show up first if the array was sorted. The "smallest" should be first.
I receive this unsorted array (type std::vector, but honestly it's more of an algorithm question so no language in particular is expected), a number of objects per "group" (groupSize), and the group number that the sender wants (groupNumber).
I'm supposed to return an array containing groupSize elements, or fewer if the group requested is the last one. (Examples: 17 results with groupSize of 5 would only return two of them if you ask for the fourth group. Also, the fourth group is group number 3 because it's a zero-indexed array.)
Example:
Received Array: {1, 5, 8, 2, 19, -1, 6, 6.5, -14, 20}
Received groupSize: 3
Received groupNumber: 2
If the array was sorted, it would be: {-14, -1, 1, 2, 5, 6, 6.5, 8, 19, 20}
If it was split in groups of size 3: {{-14, -1, 1}, {2, 5, 6}, {6.5, 8, 19}, {20}}
I have to return the third group (groupNumber 2 in a 0-indexed array): {6.5, 8, 19}
The biggest problem is the fact that it needs to be lightning fast. I can't sort the array because it has to be faster than O(n log n).
I've tried several methods, but can never get under O(n log n).
I'm aware that I should be looking for a solution that doesn't fill up all the other groups, and skips a pretty big part of the steps shown in the example above, to create only the requested group before returning it, but I can't figure out a way to do that.
You can find the value of the smallest element s in the group in linear time using the standard C++ std::nth_element function (because you know its index in the sorted array). You can find the largest element S in the group in the same way. After that, you need a linear pass to find all elements x such that s <= x <= S and return them. The total time complexity is O(n) on average. Note that the linear pass returns the group's elements in their original relative order, not sorted.
Note: this answer is not C++ specific. You just need an implementation of the k-th order statistics in linear time.
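As a rough illustration of the idea in a language without std::nth_element, here is a Python sketch. The helper names `kth_smallest` and `get_group` are invented for this example, and the selection step is a plain randomized quickselect (expected linear time, not worst-case linear). It only uses the < operator and assumes, as the question does, that all elements are distinct.

```python
import random


def kth_smallest(items, k):
    """Return the element that would be at index k if items were sorted."""
    items = list(items)
    while True:
        pivot = random.choice(items)
        smaller = [x for x in items if x < pivot]
        larger = [x for x in items if pivot < x]
        equal_count = len(items) - len(smaller) - len(larger)
        if k < len(smaller):
            items = smaller
        elif k < len(smaller) + equal_count:
            return pivot
        else:
            k -= len(smaller) + equal_count
            items = larger


def get_group(items, group_size, group_number):
    """Return the requested group without sorting the whole input."""
    start = group_size * group_number
    if start >= len(items):
        return []
    end = min(start + group_size, len(items)) - 1
    lo = kth_smallest(items, start)   # smallest element of the group
    hi = kth_smallest(items, end)     # largest element of the group
    # Linear pass: keep every x with lo <= x <= hi, using only "<".
    group = [x for x in items if not (x < lo) and not (hi < x)]
    # Sorting just the group costs O(g log g) for g = group_size.
    return sorted(group)


data = [1, 5, 8, 2, 19, -1, 6, 6.5, -14, 20]
print(get_group(data, 3, 2))
```

The final `sorted` call is optional and only touches the returned group, so the overall cost stays linear in the input size for a fixed group size.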
Possible Duplicate: Mapping two integers to one, in a unique and deterministic way
I'm trying to create a unique identifier for a pair of two integers (Ruby):
f(i1,i2) = f(i2, i1) = some_unique_value
So i1+i2, i1*i2, and i1^i2 are not unique, and neither is (i1>i2) ? "i1" + "i2" : "i2" + "i1".
I think following solution will be ok:
(i1>i2) ? "i1" + "_" + "i2" : "i2" + "_" + "i1"
but:
I have to save the result in a DB and index it, so I'd prefer it to be an integer and as small as possible.
Can Zlib.crc32(f(i1,i2)) guarantee uniqueness?
Thanks.
UPD:
Actually, I'm not sure the result MUST be integer. Maybe I can convert it to decimal:
(i1>i2) ? i1.i2 : i2.i1
?
What you're looking for is called a Pairing function.
The following illustration from the German wikipedia page clearly shows how it works:
Implemented in Ruby:
def cantor_pairing(n, m)
  (n + m) * (n + m + 1) / 2 + m
end
(0..5).map do |n|
  (0..5).map do |m|
    cantor_pairing(n, m)
  end
end
=> [[ 0,  2,  5,  9, 14, 20],
    [ 1,  4,  8, 13, 19, 26],
    [ 3,  7, 12, 18, 25, 33],
    [ 6, 11, 17, 24, 32, 41],
    [10, 16, 23, 31, 40, 50],
    [15, 22, 30, 39, 49, 60]]
Note that you will need to store the result of this pairing in a datatype with at least as many bits as both your input numbers put together. (If both input numbers are 32-bit, you will need at least a 64-bit datatype to be able to store all possible combinations; for the very largest 32-bit inputs the Cantor value itself slightly exceeds 64 bits.)
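The pairing function above treats (n, m) and (m, n) as different pairs, while the question asks for f(i1, i2) = f(i2, i1). One possible way to get that, sketched here in Python under the assumption that both integers are non-negative, is to order the two values before pairing them. The name `unordered_pair_id` is hypothetical.

```python
def cantor_pairing(n, m):
    """Cantor pairing for non-negative integers (ordered pairs)."""
    return (n + m) * (n + m + 1) // 2 + m


def unordered_pair_id(i1, i2):
    """Same id for (i1, i2) and (i2, i1): order the values, then pair them."""
    lo, hi = sorted((i1, i2))
    if lo < 0:
        raise ValueError("this sketch assumes non-negative integers")
    return cantor_pairing(lo, hi)


print(unordered_pair_id(3, 7), unordered_pair_id(7, 3))
```

Python integers are arbitrary precision, so the sketch does not run into the storage limit discussed above; a database column would still need to be wide enough.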
No, Zlib.crc32(f(i1,i2)) is not unique for all integer values of i1 and i2.
If i1 and i2 are also 32-bit numbers, then there are many more combinations of them than can be stored in a 32-bit number, which is what CRC32 returns.
CRC32 is not unique, and wouldn't be good to use as a key. Assuming you know the maximum value of your integers i1 and i2:
unique_id = (max_i2+1)*i1 + i2
If your integers can be negative, or will never be below a certain positive integer, you'll need the max and min values:
(max_i2-min_i2+1) * (i1-min_i1) + (i2-min_i2)
This will give you the absolute smallest number possible to identify both integers.
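As a small sketch of the range-based formula, here is one way it could look in Python. The function names and the decoding helper are additions for illustration; the bounds are assumed to be known in advance, and the mapping is for ordered pairs, so the two values would have to be put in a fixed order first if (i1, i2) and (i2, i1) should share an id.

```python
def range_pair_id(i1, i2, min_i1, min_i2, max_i2):
    """Map (i1, i2) to a single integer, given known bounds."""
    width = max_i2 - min_i2 + 1          # number of possible i2 values
    return width * (i1 - min_i1) + (i2 - min_i2)


def range_pair_decode(uid, min_i1, min_i2, max_i2):
    """Invert range_pair_id, recovering (i1, i2)."""
    width = max_i2 - min_i2 + 1
    q, r = divmod(uid, width)
    return q + min_i1, r + min_i2


uid = range_pair_id(-3, 4, min_i1=-3, min_i2=-2, max_i2=5)
print(uid, range_pair_decode(uid, min_i1=-3, min_i2=-2, max_i2=5))
```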
Well, no 4-byte hash will be unique when its input is an arbitrary binary string of more than 4 bytes. Your strings are from a highly restricted symbol set, so collisions will be fewer, but "no, not unique".
There are two ways to use a smaller integer than the possible range of values for both of your integers:
Have a system that works despite occasional collisions
Check for collisions and use some sort of rehash
The obvious way to solve your problem with a 1:1 mapping requires that you know the maximum value of one of the integers. Just multiply one by the maximum value and add the other, or determine a power of two ceiling, shift one value accordingly, then OR in the other. Either way, every bit is reserved for one or the other of the integers. This may or may not meet your "as small as possible" requirement.
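The shift-and-OR variant could look roughly like this in Python. The names `pack_pair` and `unpack_pair` and the default width of 32 bits are assumptions for the example; the values are ordered first so that both argument orders give the same result, as the question requires.

```python
def pack_pair(i1, i2, bits=32):
    """Pack two non-negative integers below 2**bits into one integer."""
    lo, hi = sorted((i1, i2))            # same result for (i1, i2) and (i2, i1)
    if lo < 0 or hi >= (1 << bits):
        raise ValueError("values must fit in the given number of bits")
    return (hi << bits) | lo             # high half: larger value, low half: smaller


def unpack_pair(packed, bits=32):
    """Recover (smaller, larger) from a packed value."""
    return packed & ((1 << bits) - 1), packed >> bits


print(pack_pair(7, 3), unpack_pair(pack_pair(7, 3)))
```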
Your ###_### string is unique per pair; if you could just store that as a string you win.
Here's a better, more space-efficient solution: my answer on it here.
I have an array of integers which I need to sort. However, the result should not contain the integer values but their indices, i.e. the new order of the old array.
For example: [10, 20, 30]
should result in: [2, 1, 0]
What is an optimized algorithm to achieve this?
You can achieve this with any sorting algorithm, if you convert each element to a tuple of (value, position) and sort this.
That is, [10, 20, 30] would become [(10, 0), (20, 1), (30, 2)]. You'd then sort this array using a comparator that looks at the first element of the tuples, giving you [(30, 2), (20, 1), (10, 0)]. From this, you can simply grab the second element of each tuple to get what you want, [2, 1, 0]. (Under the assumption you want reverse sorting.)
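The tuple approach described above might be sketched in Python like this. The function name `sorted_indices_desc` is made up for the example, and descending order is assumed because that is what the question's expected output shows.

```python
def sorted_indices_desc(values):
    """Return the original positions of values, ordered from largest to smallest."""
    # Pair every value with its position: [(10, 0), (20, 1), (30, 2)]
    pairs = [(value, position) for position, value in enumerate(values)]
    # Sort on the value only; the positions just travel along.
    pairs.sort(key=lambda pair: pair[0], reverse=True)
    # Keep the positions and drop the values.
    return [position for _, position in pairs]


print(sorted_indices_desc([10, 20, 30]))
```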
Won't be different from any other sorting algorithm, just modify it so that it builds or takes in an array of indices and then manipulates both the data and array of indices instead of just the data.
You could create an array of pointers to the original array of integers, perform a merge sort or whatever sorting algorithm you find most suitable (comparing the values at the pointers), then just run down the list calculating the indices based on each pointer's address relative to the beginning of the allocated block containing the original array of integers.
I've been searching Google (and Stack Overflow of course!) for a way to sort a list of integers by value, but also by an extra factor. I'm looking for some sort of algorithm to implement, I suppose. So far, I have an array in Delphi 2007 that sorts the values from largest to smallest, but now I would like it to move a value only if it is more than X greater than the previous number in the list.
For example, the values 5, 7, 25, 15 are currently sorted to 25, 15, 7, 5. The order I am trying to get now, with the X value being 5, is 25, 15, 5, 7. As you can see, the 5 and 7 haven't switched positions because there isn't a difference of more than 5 between them.
Not sure if I'm explaining this particularly well, but that's the general idea.
One more example would be the values 10, 40, 18, 20, 16, 28. Sorted, they should be 40, 28, 18, 20, 16, 10. 18, 20 and 16 haven't moved, because again, there is not more than 5 between each of the numbers.
The idea behind it is so that the item linked to the number (for example the number of times something has been ordered) isn't changing all the time because of a difference of only 1 or 2. For example, if the list of most frequently ordered paper is displayed on a web page by frequency purchased, then the order of a particular type of paper will only change for the user if it has been ordered more than five times more than the next most frequent.
Hope this makes sense, and thanks for your time!
I think your requirement leads to really strange results. You can, ultimately, have a sort order where items are sorted exactly the wrong way round, and how they are sorted depends on how they change.
I think you need to establish "classes" of values (use percentiles?) and then sort the newspapers alphabetically within each class.
For example: barely ordered (90% of papers are ordered more than this one), lower than median (50% of newspapers are ordered more than these), higher than median, top 10 ordered (sorted by number of orders of course).
At the very least you need to use a stable sort, i.e., a sort that preserves ordering of equivalent values.
Once you have that, you should probably define your comparison so that if abs(a - b) < 5 the values compare as equal; otherwise, do the normal comparison. That keeps compare(a, b) == -compare(b, a), which is something that most good sorting implementations should assert.
If you are using C++, you can use std::stable_sort to do this.
Having thought about this some more I think a bubble sort is the only type of sort that will work. This is because in a bubble sort a number must be explicitly larger (or in this case 5 larger) than the numbers above it in order for them to change places. This is exactly what you have asked for.
From a high level point of view here is what I think you will need:
A bubble sorting implementation that takes in a custom compare function.
A compare function that returns equal if the difference is less than 5.
Merging these 2 things should leave you with what you are after.
There are two things that are important for this to work:
1) You need a sorting algorithm that is stable (it does not need to be bubble sort though, and it probably shouldn't be), meaning it preserves the original order if the comparator returns 0 (= equal). See Wikipedia's comparison of sorting algorithms for a good overview.
2) You need a custom comparator that returns 0 in exactly those cases you mentioned.
Try sorting with a custom compare function. You can quickly try this if you fill a TList with numbers and call MyList.Sort(myCompareFunc);
function myCompareFunc(item1, item2: Pointer): Integer;
var
  v1, v2: Integer;
begin
  // The TList items hold the integers themselves, cast to Pointer
  v1 := Integer(item1);
  v2 := Integer(item2);
  if v1 > v2 then
    Result := 1
  else if v1 < v2 then
    Result := -1
  else
    Result := 0;
  // most important: values closer than 5 compare as equal
  if Abs(v1 - v2) < 5 then
    Result := 0;
end;
It might make sense to apply your sorting principle to an almost sorted list, but you should be aware of the fact that it violates a basic assumption of general sorting algorithms: that the elements to be sorted form a total order. This roughly means that the < operator behaves the way it does for real numbers. If you violate this rule, then you can get strange effects, and library algorithms might not even terminate (or might even crash; I've seen this happen for std::sort in C++).
One way to modify your idea to make it a total order is to sort with respect to each number rounded down to the closest multiple of 5.
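A minimal Python sketch of the rounding idea, assuming non-negative integers, a bucket width of 5, and descending order as in the question. The function name `bucket_sort_desc` is invented here. It relies on Python's built-in sort being stable, so values in the same bucket keep their current relative order.

```python
def bucket_sort_desc(values, width=5):
    """Sort descending by value rounded down to a multiple of width.

    Values that fall into the same bucket keep their existing order,
    because Python's sort is stable (also with reverse=True).
    """
    return sorted(values, key=lambda v: v // width, reverse=True)


print(bucket_sort_desc([5, 7, 25, 15]))
print(bucket_sort_desc([10, 40, 18, 20, 16, 28]))
```

For the second list this gives 40, 28, 20, 18, 16, 10 rather than the order asked for in the question, since 20 falls into a different bucket than 18 and 16. That is the price of turning the rule into a consistent ordering.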
I think this needs two passes. On the first pass, identify any values which are subject to being sorted, and on the second pass actually move only the ones that were so identified. In your second example, using an array of Y/N flags with Y meaning "sort this value", you would end up with [ Y, Y, N, N, N, Y ]. By sorting only the Y values and treating the N values as not eligible for sorting, you would get your resulting list of [40, 28, 18, 20, 16, 10]. The group of N values (it will always be a contiguous group, based on the definition) should be compared using the highest value in its group.
TList.Sort won't work because it uses the quicksort algorithm, as mghie commented.
The bubble sort algorithm is not that fast; here is an implementation so you can try it, this time in Pascal.
var
  I, N, T, D: Integer;
  A: array of Integer;
  Change: Boolean;
begin
  SetLength(A, 4);
  A[0] := 5;
  A[1] := 7;
  A[2] := 25;
  A[3] := 15;
  Change := True;
  N := High(A);
  // Repeat passes until a full pass makes no swap
  while Change do
  begin
    Change := False;
    for I := 1 to N do
    begin
      D := A[I] - A[I-1];
      // Swap only when the later value is more than 5 larger
      if (D > 5) then
      begin
        T := A[I-1];
        A[I-1] := A[I];
        A[I] := T;
        Change := True;
      end;
    end;
  end;
end;
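For readers who want to try the same threshold bubble sort outside Delphi, here is a rough Python rendering of it. The function name `threshold_bubble_sort` and the `threshold` parameter are choices made for this sketch; the swap rule (swap neighbours only when the later value is more than 5 larger) is taken from the Pascal code.

```python
def threshold_bubble_sort(values, threshold=5):
    """Bubble sort (descending) that swaps neighbours only when the
    later value exceeds the earlier one by more than threshold."""
    result = list(values)
    changed = True
    while changed:
        changed = False
        for i in range(1, len(result)):
            if result[i] - result[i - 1] > threshold:
                result[i - 1], result[i] = result[i], result[i - 1]
                changed = True
    return result


print(threshold_bubble_sort([5, 7, 25, 15]))
print(threshold_bubble_sort([10, 40, 18, 20, 16, 28]))
```

Both example lists from the question come out in the order that was asked for. The result depends on the order of the input, which is the intended behaviour here but also the reason general-purpose sort routines cannot be used for it.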