list1 --> aaa, bbb, ddd, xyxz, ...
list2 --> bbb, ccc, ccc, glk, hkp, ...
list3 --> ddd, eee, ffff, lmn, ...
Within each list the words are sorted.
I want to remove the words that are repeated across lists and print the rest in sorted order.
A word repeated within the same list is valid and is kept.
For the lists above, the output should be:
aaa --> ccc --> ccc --> eee --> ffff --> glk --> hkp --> lmn --> xyxz
Here ccc is repeated only within one list, so it is printed; bbb and ddd are removed because they appear in more than one list.
I am not looking for code, just a better way to solve this. I searched for 3 hours, so I just want to know the approach.
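1. Start with an empty list for the results.
2. Keep 3 pointers (or indices), one at the beginning of each of the 3 sorted lists.
3. Compare the words under the 3 pointers and find the smallest. If only one list has that word under its pointer, add it to the result list (once per copy in that list); if two or more lists have it, it is repeated across lists, so skip it.
4. Advance each of the 3 pointers until the word it points to is larger than the word just handled.
5. Repeat steps 3 and 4 until all the pointers reach the end of their lists.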
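The pointer walk can be sketched as follows. This is only a minimal illustration, not a definitive implementation: the function name `merge_unique_across` is made up for the example, and the sketch adds a word only when exactly one list currently holds it, since the question wants words shared between lists dropped while repeats inside one list are kept.

```python
def merge_unique_across(lists):
    """Merge sorted lists, dropping any word that occurs in more than one list."""
    pointers = [0] * len(lists)  # one index per sorted input list
    result = []
    while True:
        # Words currently under each pointer (exhausted lists are skipped).
        heads = [lst[p] for lst, p in zip(lists, pointers) if p < len(lst)]
        if not heads:
            break
        smallest = min(heads)
        # Lists whose current word equals the smallest word.
        holders = [i for i, (lst, p) in enumerate(zip(lists, pointers))
                   if p < len(lst) and lst[p] == smallest]
        for i in holders:
            start = pointers[i]
            # Move this pointer past every copy of the smallest word.
            while pointers[i] < len(lists[i]) and lists[i][pointers[i]] == smallest:
                pointers[i] += 1
            if len(holders) == 1:
                # Only one list has the word: keep all of its copies.
                result.extend(lists[i][start:pointers[i]])
    return result


list1 = ["aaa", "bbb", "ddd", "xyxz"]
list2 = ["bbb", "ccc", "ccc", "glk", "hkp"]
list3 = ["ddd", "eee", "ffff", "lmn"]
print(merge_unique_across([list1, list2, list3]))
```

Because every list is sorted, a word that exists in a list is always under that list's pointer by the time it becomes the global smallest, so counting the holders is enough to detect a cross-list repeat. Each word is visited once, and the output comes out already sorted.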
For each list, make a copy and store it in a set to remove strings duplicated within that list.
e.g.
list2 --> bbb, ccc, ccc, glk, hkp
is copied as
set2 --> bbb, ccc, glk, hkp, ...
(This step only helps build the frequency table below; you can skip it if you have another way to build the table.)
Then use a hash table as a frequency table that maps each string s to the number of sets that contain s.
Using the table, you can check whether a string appears in more than one list.
Then you just concatenate the input word lists, remove the strings that appear in more than one list, and sort what is left.
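A minimal sketch of the frequency-table approach, assuming the lists fit in memory. The name `remove_cross_list_words` is invented for the example, and `collections.Counter` stands in for the hash table. The final sort is included because concatenation alone does not produce globally sorted output.

```python
from collections import Counter


def remove_cross_list_words(lists):
    # For each word, count the number of lists that contain it.
    # set(lst) collapses repeats inside one list so they count once.
    list_count = Counter(word for lst in lists for word in set(lst))
    # Concatenate the lists, keeping only words found in exactly one list.
    kept = [word for lst in lists for word in lst if list_count[word] == 1]
    # Added step: restore sorted order after concatenation.
    return sorted(kept)


words = [
    ["aaa", "bbb", "ddd", "xyxz"],
    ["bbb", "ccc", "ccc", "glk", "hkp"],
    ["ddd", "eee", "ffff", "lmn"],
]
print(remove_cross_list_words(words))
```

This trades the pointer bookkeeping for extra memory: one set per list plus the table.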
I have two groups of objects, each consisting of 4 objects. The goal is to compute the degree of similarity between these two groups. Comparing two objects yields an int. The lower this number is, the more similar the objects are. The order of the objects within a group doesn't matter for group equality.
So what I must do is compare each object of group 1 with each object of group 2, which gives me 16 comparison results. I store these in a 4x4 int table called costs.
int[][] costs = new int[4][4];
for (int i = 0; i < 4; i++) {
    for (int j = 0; j < 4; j++) {
        costs[i][j] = compare(objectGroup1[i], objectGroup2[j]);
    }
}
Now I have 4 sets of 4 comparison results, and I must choose one result from each set and add them up to compute the total distance metric between the groups. This is the point where I got stuck.
I must try all combinations of four and take the minimum sum, but there is the restriction that each object may be used only once.
Example: if the first of the four values to add is the comparison result for objectGroup1[1] - objectGroup2[1], then I can't use in this foursome any other comparison result that involves objectGroup1[1], and the same goes for objectGroup2[1].
Valid example: group1[1]-group2[2], group1[2]-group2[1], group1[3]-group2[3], group1[4]-group2[4] ----> each object from each group appears only once
What kind of algorithm can I use here?
It sounds like you're trying to find the permutation of group 1's items that makes it most similar to group 2's items when pairing the items off.
Eric Lippert has a good series of blog posts on producing permutations. So all you have to do is iterate over them, computing the score by pairing items, and return the best score. Basically just Zip-ing and MinBy-ing:
groupSimilarity =
    item1.Groups
        // (you have to implement Permutations)
        .Permutations()
        // we want to compute the best score, but we don't know which permutation will win
        // so we MinBy a function computing the permutation's score
        .MinBy(permutation =>
            // pair up the items and combine them, using the Similarity function
            permutation.Zip(item2.Groups, SimilarityFunction)
                // add up the similarity scores
                .Sum()
        )
The above code is C#, written in a "Linqy" functional style (sorry if you're not familiar with that). MinBy is a useful function from MoreLinq, Zip is a standard Linq operator.
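The same idea can be sketched directly against the 4x4 costs table from the question, here in Python with `itertools.permutations`. The function name `group_distance` and the sample cost values are made up for illustration; the sketch tries every pairing in which each object is used exactly once and keeps the cheapest.

```python
from itertools import permutations


def group_distance(costs):
    """Return (best_total, best_pairing) for a square cost matrix.

    best_pairing[i] is the group 2 index matched to group 1 object i.
    """
    n = len(costs)
    best_total, best_pairing = None, None
    # Each permutation assigns every row a distinct column,
    # so no object from either group is used twice.
    for perm in permutations(range(n)):
        total = sum(costs[i][perm[i]] for i in range(n))
        if best_total is None or total < best_total:
            best_total, best_pairing = total, perm
    return best_total, best_pairing


# Hypothetical comparison results, not taken from the question.
sample_costs = [
    [1, 9, 9, 9],
    [9, 9, 2, 9],
    [9, 3, 9, 9],
    [9, 9, 9, 4],
]
print(group_distance(sample_costs))  # (10, (0, 2, 1, 3))
```

With 4 objects there are only 4! = 24 pairings, so brute force is fine. For much larger groups this is the assignment problem, for which the Hungarian algorithm is the usual polynomial-time method.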
For example, I am given a list [3;5;6;2;10;4;9;1;3] that maps out to a matrix like this:
3 5 6
2 10 4
9 1 3
I have already coded a function "find_size" that finds the size of the matrix given a list (for this example, it would give me the int 3),
as well as a function "find_position" that tells me where in the matrix 10 is (so here, it would give me the tuple (1, 1) because it is in the second row and second column).
I think the best way to approach this would be to iterate over the list using a fold function that applies a given function to each element and keeps track of the results. Therefore, my goal is to create a function in OCaml that tells me whether a certain element in the matrix is adjacent to 10 (if it is, I would tell the fold function to add that element to the answer list).
The final answer for this would be [5;2;4;1] because those elements are adjacent to 10.
You can fairly easily translate your size and position values into a list of the adjacent positions as tuples. You can translate the size and a tuple into a single list index. With a sorted list of indices, you can fold over the list and extract the elements at the given indices.
This would probably be a lot easier if you used an array of numbers, for what it's worth.
Without seeing some code you've tried, and hearing how it did or didn't work for you, it's difficult to say more than this.
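As a rough sketch of the index translation described above, assuming "adjacent" means directly above, below, left or right (which matches the expected answer [5;2;4;1]). It is written in Python rather than OCaml purely for illustration. The name `adjacent_elements` is hypothetical, and the size and position arguments are the values the asker's find_size and find_position would supply.

```python
def adjacent_elements(flat, size, position):
    """Return the elements directly above, left of, right of and below `position`."""
    row, col = position
    candidates = [(row - 1, col), (row, col - 1), (row, col + 1), (row + 1, col)]
    # Keep only positions inside the size x size matrix and convert
    # each (row, col) tuple into a single index into the flat list.
    indices = sorted(r * size + c for r, c in candidates
                     if 0 <= r < size and 0 <= c < size)
    # Extract the elements at those indices, in list order.
    return [flat[i] for i in indices]


matrix = [3, 5, 6, 2, 10, 4, 9, 1, 3]
print(adjacent_elements(matrix, 3, (1, 1)))  # [5, 2, 4, 1]
```

In OCaml the same shape would be a fold that carries the current index and the remaining sorted indices, or simply `Array.get` on an array, which is why an array makes this easier.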
I want to remember the last n unique numbers, in order.
Here is what I mean: Let's say n = 4.
My current list is 5 3 4 2. If I add 6, it turns into 3 4 2 6. If I add 3 instead, the list turns into 5 4 2 3, where 3 moves to the front.
I would do it like this: Store the numbers in a queue. When adding a new number, search through the queue for the number. If the number is not found, pop the number at the end, and push the new number in the front. If the number is found, remove the number at that position, then push the new number in front.
Now obviously, removing a number from an arbitrary position in a container optimized for queue operations (like std::deque in C++) will be quite slow. A linked list makes the removal cheap, but searching through it is slower. Is there a better combination of algorithm + data structure to accomplish this sort of task?
If it makes any difference, I don't necessarily care about "remembering the last n unique numbers, in order." I specifically need to know which element (if any) has been removed from the list upon an addition.
You could use a doubly linked list. Store the n numbers to be remembered in a hash table where the key is the number itself and the value is a pointer to the node of the linked list that contains that number.
Then, in the step you describe as "search through the queue for the number", look the number up in the hash table instead, which takes constant time rather than the linear time of scanning the queue.
The pop and push operations you describe can be performed in constant time if you store a pointer p that points to the first element of the doubly linked list and a pointer q that points to the last element of your list.
Your step "If the number is found, remove the number at that position" can be performed in constant time, since you already have the position of the number to be removed (by position I mean the pointer you get from the hash table).
UPDATE:
Remember to update the hash table whenever a number is removed from or added to the list.
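A compact way to try this out is Python's `collections.OrderedDict`, which pairs a hash table with a doubly linked list internally. It is swapped in here for the hand-rolled structure described above. The class name `RecentUnique` and its methods are invented for the example; `add` returns the evicted number, or None when nothing was removed, which is the piece of information the question asks for.

```python
from collections import OrderedDict


class RecentUnique:
    """Remember the last `capacity` distinct numbers, oldest first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # keys are the numbers; values are unused

    def add(self, number):
        """Insert `number` as the newest entry; return the evicted number or None."""
        if number in self.items:
            # Already remembered: unlink it and re-append it as newest.
            self.items.move_to_end(number)
            return None
        self.items[number] = None
        if len(self.items) > self.capacity:
            # Over capacity: drop the oldest entry and report it.
            evicted, _ = self.items.popitem(last=False)
            return evicted
        return None

    def as_list(self):
        return list(self.items)


recent = RecentUnique(4)
for value in (5, 3, 4, 2):
    recent.add(value)
print(recent.add(3), recent.as_list())  # None [5, 4, 2, 3]
print(recent.add(6), recent.as_list())  # 5 [4, 2, 3, 6]
```

The lookup, the move to the newest position and the eviction are all constant time on average, matching the hash table plus linked list analysis.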
I'm looking for an algorithm that can find/assign order and overlap given a list of ordered elements and a list of unordered elements (between which overlap might or might not exist).
For this example I'll use integers, but they could just as well be people's names, ID codes, etc. That is, the numeric values can't be used to solve the real problem; to help explain the problem, I use the ordered set (1,2,3,4,5,6,7,8,9,10) as the holy grail answer.
Input:
Ordered List of Lists: (1,2,3,4), (8,9,10), (3,4,5)
UnOrdered List of Lists: (3,4,2), (6,4,5,7), (10,9)
Thought process for how I do this algorithm in my head:
1. Lists 3,4,5 and 1,2,3,4 are ordered and have 3,4 in common, therefore the 2 ordered lists overlap to form 1,2,3,4,5 in that order.
2. The unordered list 3,4,2 is a subset of the ordered list 1,2,3,4,5, therefore it can be reordered as 2,3,4 and said to overlap the ordered list 1,2,3,4,5.
3. Same idea (as step 2) for the ordered list 8,9,10 when compared with the unordered 10,9: it should be 9,10, overlapping 8,9,10.
4. Comparing the ordered list 1,2,3,4,5 with the unordered 6,4,5,7, they have an intersection of 4,5, so you can conclude that the result is 1,2,3,4,5,(6,7|7,6), where (6,7|7,6) means either a 6 followed by a 7 or a 7 followed by a 6 (which one is correct is unknown).
Output:
I would like to be able to parse a matrix/tree/whatever kind of data structure to see what overlapped where and in what order,
and a summarized list containing sets of partially known order:
set1: 1,2,3,4,5,(6,7|7,6)
set2: 8,9,10
Does anyone know of a similar problem or algorithm I could use? Ideally it would be in Perl but pseudo code or algorithms from another language would be fine.
Thanks
If I understand this right, you need one ordered list from a set of ordered and unordered lists. A possible solution would be to iterate over all the values in all the sets and add them to a hash table structure. In implementation terms that could be a C++ map, a Java HashMap, a Python dictionary, etc. That would look like:
for i over all sets S                // (ordered and unordered)
    for j over all values in S[i]
        H.insert(S[i][j])            // H is the hash table
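Now iterate over the hash table entries in key order to get the required ordered list (an ordered map such as C++ std::map already iterates in key order; a plain hash map needs an explicit sort). This is simple and practical.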
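A minimal Python sketch of this first approach, using a built-in `set` as the hash table. The name `merged_order` is made up for the example, and the sketch assumes the values have a natural sort order. That holds for the integers in the example, but the question notes it may not hold for the real data.

```python
def merged_order(all_lists):
    seen = set()  # hash-based set of every value encountered
    for lst in all_lists:  # ordered and unordered lists alike
        for value in lst:
            seen.add(value)
    # A hash set has no useful iteration order, so sort explicitly.
    return sorted(seen)


ordered = [(1, 2, 3, 4), (8, 9, 10), (3, 4, 5)]
unordered = [(3, 4, 2), (6, 4, 5, 7), (10, 9)]
print(merged_order(ordered + unordered))  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

Note that this yields a single merged list only; it does not record which lists overlapped or mark pairs such as (6,7|7,6) whose relative order is unknown.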
A not-so-practical-solution but worth mentioning for its niceness is this:
Assign every unique number a corresponding prime number. For your example, map the numbers as follows:
p[1] = 2, p[2] = 3, p[3] = 5, p[4] = 7, p[5] = 11, p[6] = 13, p[7] = 17, p[8] = 19, p[9] = 23, p[10] = 29
Now each set Si can be represented by a value Vi, the product of the corresponding primes. So a set Si = (1,2,3) (or, for that matter, (2,1,3)) would have value Vi = p[1]*p[2]*p[3].
Find the LCM of all the Vi's. Call this V:
V = LCM{Vi} over all i
Factorise V into its prime factors. Each prime factor represents an element in your final ordered list.
This second solution is neat but breaks down for practical purposes because we enter bignum space very quickly.
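For the curious, the prime trick can be sketched in a few lines of Python, whose integers are arbitrary precision, so the bignum issue shows up as slowness rather than overflow. The names `union_via_primes` and `prime_of` are invented for the example. Instead of a general factorisation, the sketch simply tests each assigned prime against V, which is enough because every prime factor of V is one of the assigned primes.

```python
from functools import reduce
from math import gcd


def lcm(a, b):
    return a * b // gcd(a, b)


def union_via_primes(sets, prime_of):
    # V_i: product of the primes assigned to the elements of each set.
    values = [reduce(lambda acc, x: acc * prime_of[x], s, 1) for s in sets]
    # V: least common multiple of all the V_i.
    v = reduce(lcm, values, 1)
    # Recover the elements: an element is present if its prime divides V.
    return [x for x, p in prime_of.items() if v % p == 0]


# The mapping from the answer: p[1] = 2, p[2] = 3, ..., p[10] = 29.
prime_of = dict(zip(range(1, 11), [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]))
all_sets = [(1, 2, 3, 4), (8, 9, 10), (3, 4, 5), (3, 4, 2), (6, 4, 5, 7), (10, 9)]
print(union_via_primes(all_sets, prime_of))  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

The result comes back in the order of the prime assignment, so the ordering is whatever order the primes were handed out in.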
Hope at least one of these works for you!