I'm looking to implement an algorithm that picks items from a list randomly, but in a biased way. Say each element in that list has a priority value; I would like elements with higher priority to appear more often in the selection.
Another thing is that the priorities of these elements can change. If someone explicitly picks an element, rather than letting it come from the random selection, the priority of that element should be increased by a certain amount.
I'm afraid I'm not quite sure how to phrase my question mathematically, and I don't know where to begin, either. I found an article here which deals with the first part but doesn't deal with dynamically increasing the priority of elements.
One approach I thought of is to make more copies of an element the higher its priority is. However, this is very impractical to implement in code.
I've also posted this question on math.stackexchange hoping for some mathematical insight.
NOTE: Just to be clear, I'm not looking for an implementation of any sort. I'm just looking for a clear direction/a few insights so I can go ahead and code up my own algorithm.
One of the possible ways I can think of right now is,
Suppose there are two arrays, priority[k] and values[k]
priority[]={2,1,5,7}
values[]={"apple","banana","oranges","lollipop"}
To get a random value in a biased way:
1. Let the sum of the priority array be S. In this example S=15.
2. Build a cumulative array of the priority array: cumulative[] = {2,3,8,15}
3. Now generate a random number r with 1 <= r <= S. Suppose it is 7.
4. Find the first element of the cumulative array that is >= r. Here that is 8, the third element, so you return "oranges".
If r=3, return "banana"; if r=10, return "lollipop".
If someone explicitly chooses an element, simply increase that element's priority value by k.
Suppose k=2 and someone chooses oranges; the new arrays are
priority[]={2,1,7,7}
values[]={"apple","banana","oranges","lollipop"}
cumulative={2,3,10,17}
Using the same algorithm as described above, you will get the desired results.
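A minimal sketch of both operations (Python purely as illustration; the function names pick and choose are made up): bisect finds the first cumulative entry that is >= r.

import bisect
import random

priority = [2, 1, 5, 7]
values = ["apple", "banana", "oranges", "lollipop"]

def pick(values, priority):
    # Draw one value with probability proportional to its priority.
    cumulative, total = [], 0
    for p in priority:
        total += p
        cumulative.append(total)           # here: [2, 3, 8, 15]
    r = random.randint(1, total)           # 1 <= r <= S
    i = bisect.bisect_left(cumulative, r)  # first index with cumulative >= r
    return values[i]

def choose(values, priority, value, k=2):
    # Someone explicitly picked `value`: raise its priority by k.
    priority[values.index(value)] += k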
So, I work in industrial automation, and normally program with ladder logic, so it's rather odd compared to what I would consider normal programming. Anyway, I needed to sort a list of numbers from smallest to biggest. I was looking through sorting algorithms trying to find one I could easily implement in ladder logic. I was having a hard time, but after some thinking I came up with something that isn't on the Wikipedia list of sorting algorithms. Well, it might be, but I can't find it. I know this isn't a very efficient sorting algorithm, but it does work. I want to know its name, if it has one.
The basic version is this: imagine an array of numbers. Take the first number in the list and compare it to all the other numbers, counting how many of them it is bigger than. That count is the index where it goes in the output array. To place it there, check whether something has already been written to that spot; if so, add one to the index and check again until you find an empty spot, then write the number to the output array. Once you have done this for every number in the list, you will have an output array of the same size as the input, sorted smallest to biggest. Note that this assumes zero-based indexing.
If this wasn't clear enough, I'm happy to elaborate further if needed.
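For concreteness, here is a minimal sketch of the procedure described above (Python purely as illustration, since ladder logic looks nothing like this):

def placement_sort(nums):
    output = [None] * len(nums)
    for x in nums:
        index = sum(1 for y in nums if x > y)  # how many numbers x is bigger than
        while output[index] is not None:       # spot already written (duplicates)
            index += 1
        output[index] = x
    return output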
I would say it's a worse version of counting sort:
It operates by counting the number of objects that possess distinct key values, and applying prefix sum on those counts to determine the positions of each key value in the output sequence
So it basically does the same thing you're doing: put each element in its final position by using counts. Counting sort stores the needed counts in an array up front, whereas you re-scan the list to compute the count for each element as you go.
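For comparison, a minimal counting sort sketch (assuming small non-negative integer keys; names are illustrative):

def counting_sort(nums):
    counts = [0] * (max(nums) + 1)
    for x in nums:
        counts[x] += 1                   # count occurrences of each key
    for i in range(1, len(counts)):
        counts[i] += counts[i - 1]       # prefix sum gives final positions
    output = [None] * len(nums)
    for x in reversed(nums):             # reversed pass keeps the sort stable
        counts[x] -= 1
        output[counts[x]] = x
    return output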
I don't think there's a name for your exact algorithm.
This problem is 4-11 in Skiena. The standard solution for finding a majority element - one repeated more than n/2 times - is the majority algorithm. Can we use it to find all numbers repeated more than n/4 times?
Misra and Gries describe a couple approaches. I don't entirely understand their paper, but a key idea is to use a bag.
Boyer and Moore's original majority algorithm paper has a lot of incomprehensible proofs and discussion of formal verification of FORTRAN code, but it has a very good start of an explanation of how the majority algorithm works. The key concept starts with the idea that if the majority of the elements are A and you remove, one at a time, a copy of A and a copy of something else, then in the end you will have only copies of A. Next, it should be clear that removing two different items, neither of which is A, can only increase the majority that A holds. Therefore it's safe to remove any pair of items, as long as they're different. This idea can then be made concrete. Take the first item out of the list and stick it in a box. Take the next item out and stick it in the box. If they're the same, let them both sit there. If the new one is different, throw it away, along with an item from the box. Repeat until all items are either in the box or in the trash. Since the box is only allowed to have one kind of item at a time, it can be represented very efficiently as a pair (item type, count).
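A minimal sketch of that box-and-count idea (Python as illustration); a second pass over the data is still needed to verify that the candidate really is a majority:

def majority_candidate(items):
    candidate, count = None, 0
    for x in items:
        if count == 0:               # box is empty: put x in the box
            candidate, count = x, 1
        elif x == candidate:         # same kind: let it sit there
            count += 1
        else:                        # different: throw both away
            count -= 1
    return candidate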
The generalization to find all items that may occur more than n/k times is simple, but explaining why it works is a little harder. The basic idea is that we can find and destroy groups of k distinct elements without changing anything. Why? If w > n/k then w-1 > (n-k)/k. That is, if we take away one of the popular elements, and we also take away k-1 other elements, then the popular element remains popular!
Implementation: instead of only allowing one kind of item in the box, allow k-1 of them. Whenever you see a group of k different items show up (that is, there are k-1 types in the box, and the one arriving doesn't match any of them), you throw one of each type in the trash, including the one that just arrived. What data structure should we use for this "box"? Well, a bag, of course! As Misra and Gries explain, if the elements can be ordered, a tree-based bag with O(log k) basic operations will give the whole algorithm a complexity of O(n log k). One point to note is that the operation of removing one of each element is a bit expensive (O(k) for a typical implementation), but that cost is amortized over the arrivals of those elements, so it's no big deal. Of course, if your elements are hashable rather than orderable, you can use a hash-based bag instead, which under certain common assumptions will give even better asymptotic performance (but it's not guaranteed). If your elements are drawn from a small finite set, you can guarantee that. If they can only be compared for equality, then your bag gets much more expensive and I'm pretty sure you end up with something like O(nk) instead.
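A sketch of the generalization with a plain dict standing in for the bag (Python as illustration; a tree- or hash-based bag changes the cost per operation, not the idea). The result is only a candidate set, so one verification pass over the data is still required:

def frequent_candidates(items, k):
    box = {}                              # item type -> count, at most k-1 types
    for x in items:
        if x in box:
            box[x] += 1
        elif len(box) < k - 1:
            box[x] = 1
        else:                             # k distinct types seen: trash one of each
            for key in list(box):
                box[key] -= 1
                if box[key] == 0:
                    del box[key]
    return list(box)                      # candidates for count > n/k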
Find the majority element, which appears more than n/2 times, with Moore's voting algorithm.
See method 3 of the given link for Moore's voting algorithm (http://www.geeksforgeeks.org/majority-element/).
Time: O(n)
Now, after finding the majority element, scan the array again and remove it (or overwrite it with -1).
Time: O(n)
Now apply Moore's voting algorithm to the remaining elements of the array (ignoring the -1s, since that element has already been counted). The new candidate is the element that may appear more than n/4 times; verify it with one more scan.
Time: O(n)
Total time: O(n)
Extra space: O(1)
You can repeat this for elements appearing more than n/8, n/16, ... times.
EDIT:
There may exist a case when there is no majority element in the array:
For example, if the input array is {3, 1, 2, 2, 1, 2, 3, 3}, then the output should be [2, 3].
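A sketch of this repeated-voting procedure with the verification pass built in (Python as illustration; note that, without the verification, repeated voting is not in general guaranteed to surface every element above n/4 - the bag-based method described earlier gives that guarantee):

def moore_candidate(arr, skip):
    candidate, count = None, 0
    for x in arr:
        if x in skip:                # ignore elements already found
            continue
        if count == 0:
            candidate, count = x, 1
        elif x == candidate:
            count += 1
        else:
            count -= 1
    return candidate

def repeated_voting(arr):
    n, found = len(arr), []
    for _ in range(3):               # at most 3 elements can exceed n/4
        c = moore_candidate(arr, set(found))
        if c is not None and arr.count(c) > n / 4:   # verification pass
            found.append(c)
    return found

# repeated_voting([3, 1, 2, 2, 1, 2, 3, 3]) -> [3, 2] (the set {2, 3})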
Given an array of size n and a number k, find all elements that appear more than n/k times
See this link for the answer:
https://stackoverflow.com/a/24642388/3714537
References:
http://www.cs.utexas.edu/~moore/best-ideas/mjrty/
See this paper for a solution that uses constant memory and runs in linear time, finding 3 candidates for elements that occur more than n/4 times. Note that if you assume your data is given as a stream that you can only go through once, this is the best you can do: you have to go through the stream one more time to test each of the 3 candidates and see whether it occurs more than n/4 times. However, if you assume a priori that there are 3 elements occurring more than n/4 times, then you only need to go through the stream once, so you get a linear-time online algorithm (a single pass through the stream) that requires only constant storage.
Since you didn't mention space complexity, one possible solution is to use a hash table that maps each element to its count: increment the count whenever the element is seen, then report the elements whose counts exceed n/k.
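For example, with Python's collections.Counter (O(n) time, O(n) extra space):

from collections import Counter

def above_n_over_k(items, k):
    counts = Counter(items)                  # element -> frequency
    return [x for x, c in counts.items() if c > len(items) / k]

# above_n_over_k([3, 1, 2, 2, 1, 2, 3, 3], 4) -> [3, 2] (the set {2, 3})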
I'm writing a program for a competition and I need to be faster than all the other competitors. For this I need a little algorithm help; ideally I'd be using the fastest algorithm.
For this problem I am given 2 things. The first is a list of tuples, each of which contains exactly two elements (strings), each of which represents an item. The second is an integer, which indicates how many unique items there are in total. For example:
# of items = 3
[("ball","chair"),("ball","box"),("box","chair"),("chair","box")]
(The same tuples can be repeated; they are not necessarily unique.) My program is supposed to figure out the maximum number of tuples that can "agree" when the items are sorted into two groups. That is, if all the items are split into two ideal groups, group 1 and group 2, what is the maximum number of tuples that can have their first item in group 1 and their second item in group 2?
For example, the answer to my earlier example would be 2, with "ball" in group 1 and "chair" and "box" in group 2, satisfying the first two tuples. I do not necessarily need to know which items go in which group; I just need to know the maximum number of satisfied tuples.
At the moment I'm trying a recursive approach, but it's running in O(n^2), far too inefficient in my opinion. Does anyone have a method that could produce a faster algorithm?
Thanks!
Approaches to speed up your task:
1. Use integers
Convert the strings to integers (store the strings in an array and use each string's position in the tuples).
String[] words = {"ball", "chair", "box"};
In the tuples, ball now has number 0 (position 0 in the array), chair 1, box 2.
Comparing ints is faster than comparing Strings.
2. Avoid recursion
Recursion is slow, due to the call overhead.
For example, look at the binary search algorithm in a recursive implementation, then look at how Java implements binSearch() (with a while loop and iteration).
Recursion is helpful when a problem is so complex that a non-recursive implementation is too complex for a human brain.
Iteration is faster, but not when you merely mimic recursive calls by implementing your own stack.
However, you can start with a recursive algorithm; once it works and it is a suitable algorithm, try to convert it to a non-recursive implementation.
3. If possible, avoid objects
If you want the fastest, now it becomes ugly!
A tuple array can be stored either as an array of a class Point(x, y) or, probably faster, as an array of int:
Example:
(1,2), (2,3), (3,4) can be stored as array: (1,2,2,3,3,4)
This needs much less memory, because an object needs at least 12 bytes (in Java).
Less memory means faster: when the arrays are really big, the flat structure will hopefully fit in the processor cache, while the object array will not. (Points 1 and 3 are sketched together after this list.)
4. Programming language
In C it will be faster than in Java.
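As a sketch of points 1 and 3 combined (Python here just to show the idea; in Java you would use a HashMap<String, Integer> and an int[]): intern each string as a small int and store the pairs in one flat array.

def encode_tuples(tuples):
    ids = {}       # string -> small int id
    flat = []      # pairs stored flat: (a0, b0, a1, b1, ...)
    for a, b in tuples:
        for s in (a, b):
            flat.append(ids.setdefault(s, len(ids)))
    return ids, flat

# encode_tuples([("ball", "chair"), ("ball", "box")])
# -> ({'ball': 0, 'chair': 1, 'box': 2}, [0, 1, 0, 2])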
Maximum cut is a special case of your problem, so I doubt you have a quadratic algorithm for it. (Maximum cut is NP-complete and it corresponds to the case where every tuple (A,B) also appears in reverse as (B,A) the same number of times.)
The best strategy for you to try here is "branch and bound." It's a variant of the straightforward recursive search you've probably already coded up. You keep track of the value of the best solution you've found so far. In each recursive call, you check whether it's even possible to beat the best known solution with the choices you've fixed so far.
One thing that may help (or may hurt) is to "probe": for each as-yet-unfixed item, see if putting that item on one of the two sides leads only to suboptimal solutions; if so, you know that item needs to be on the other side.
Another useful trick is to recurse on items that appear frequently both as the first element and as the second element of your tuples.
You should pay particular attention to the "bound" step --- finding an upper bound on the best possible solution given the choices you've fixed.
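An illustrative sketch of such a branch and bound (Python, with made-up names; the bound below simply counts the tuples not yet ruled out, and a real contest entry would add the sharper bounds and probing discussed above):

def max_satisfied(tuples, items):
    best = 0

    def sat_and_open(assign):
        # sat: tuples already satisfied; opn: tuples that still could be
        sat = opn = 0
        for a, b in tuples:
            ga, gb = assign.get(a), assign.get(b)
            if ga == 1 and gb == 2:
                sat += 1
            elif ga in (None, 1) and gb in (None, 2):
                opn += 1
        return sat, opn

    def recurse(i, assign):
        nonlocal best
        sat, opn = sat_and_open(assign)
        if sat + opn <= best:            # bound: cannot beat the best known
            return
        if i == len(items):
            best = sat
            return
        for g in (1, 2):                 # branch: put item i in group 1 or 2
            assign[items[i]] = g
            recurse(i + 1, assign)
        del assign[items[i]]

    recurse(0, {})
    return best

# max_satisfied([("ball","chair"), ("ball","box"), ("box","chair"),
#                ("chair","box")], ["ball", "chair", "box"]) -> 2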
How to randomize order of approximately 20 elements with lowest complexity? (generating random permutations)
Knuth's shuffle algorithm is a good choice.
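A minimal sketch of it (Python; random.shuffle implements the same idea):

import random

def knuth_shuffle(items):
    for i in range(len(items) - 1, 0, -1):
        j = random.randint(0, i)                 # pick 0 <= j <= i uniformly
        items[i], items[j] = items[j], items[i]  # swap into place
    return items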
Some months ago I blogged about obtaining a random permutation of a list of integers.
You could use that as a permutation of indexes of the set containing your elements, and then you have what you want.
Generating random int list in F#
Uniformly distributed random list permutation in F#
In the first post I explore some possibilities, and finally I obtain "a function to randomly permutate a generic list with O(n) complexity", properly encapsulated to work on immutable data (ie, it is side-effect free).
In the second post, I make it uniformly distributed.
The code is in F#, I hope you don't mind!
Good luck.
EDIT: I don't have a formal proof, but intuition tells me that the complexity of such an algorithm cannot be lower than O(n). I'd really appreciate seeing it done faster!
A simple way to randomise the order is to make a new list of the correct size (20 in your case), iterate over the first list, and add each element in a random position to the second list. If the random position is already filled, put it in the next free position.
I think this pseudocode is correct:
list newList = new list of firstList.Length nulls   // must be pre-sized, or indexing fails
foreach (element in firstList)
    int position = Random.Int(0, firstList.Length - 1)
    while (newList[position] != null)               // spot taken: probe forward
        position = (position + 1) % firstList.Length
    newList[position] = element
EDIT: so it turns out that this answer isn't actually that good. It is neither particularly fast, nor particularly random. Thank you for your comments. For a good answer, please scroll back to the top of the page :-)
Probably someone already implemented the shuffling for you. For example, in Python you can use random.shuffle, in C++ random_shuffle, and in PHP shuffle.
I'm taking comp 2210 (Data Structures) next semester and I've been doing the homework for the summer semester that is posted online. Until now, I've had no problems doing the assignments. Take a look at assignment 4 below, and see if you can give me a hint as to how to approach it. Please don't provide a complete algorithm, just an approach. Thanks!
A "costed sort" is an algorithm in which a sequence of values must be arranged in ascending order. The sort is carried out by interchanging the positions of two values, one pair at a time, until the sequence is in the proper order. Each interchange incurs a cost, which is calculated as the sum of the two values involved in the interchange. The total cost of the sort is the sum of the costs of the interchanges.
For example, suppose the starting sequence were {3, 2, 1}. One possible series of interchanges is
Interchange 1: {3, 1, 2} interchange cost = 3
Interchange 2: {1, 3, 2} interchange cost = 4
Interchange 3: {1, 2, 3} interchange cost = 5
giving a total cost of 12.
You are to write a program that determines the minimal cost to arrange a specific sequence of numbers.
Edit: The professor does not allow brute forcing.
If you want to surprise your professor, you could use Simulated Annealing. Then again, if you manage that, you can probably skip a few courses :). Note that this algorithm will only give an approximate answer.
Otherwise: try a Backtracking algorithm, or Branch and Bound. These will both find the optimal answer.
What do you mean "brute forcing?" Do you mean "try all possible combinations and select the cheapest?" Just checking.
I think "branch and bound" is what you're looking for - check any source on algorithms. It is "like" brute force, except that as you try a sequence of moves, as soon as that sequence costs more than the best complete sequence found so far, you can abandon it, since its cost can only grow. This is one flavor of the "backtracking" mentioned above.
My preferred language for doing this would be Prolog but I'm weird.
Simulated annealing is a probabilistic algorithm - if the solution space has local minima, then you may be trapped in one and get what you think is the right answer but isn't. There are ways around that, and the literature about them is easy to find, but I don't agree that it's the tool you want.
You could try the related genetic algorithms too if that's the way you want to go.
Have you learned trees? You could create a tree with all possible changes leading to the desired result. The trick, of course, is to avoid creating the whole tree -- particularly when a part of it is obviously not the best solution, right?
I think the appropriate approach is to think hard about what defining properties a minimal-cost sort has, then figure out the cost by simulating this ideal sort. The key point is that you don't have to implement a general minimal-cost sorting algorithm.
For example, let's say the defining property of a minimal-cost sort is that every exchange puts at least one of the exchanged elements in its sorted position (I don't know if this is true). Every exchange-based sort would love to have this property, but it's not easy (possible?) in the general case. However, you can easily create a program that takes an unsorted array and its sorted version (which can itself be generated by a non-optimal algorithm), and then uses this information to decide the minimum cost to get from the unsorted array to the sorted one.
Description
I think the cheapest way to do this is to swap the cheapest misplaced item with the item that belongs in its spot. I believe this reduces cost by moving the most expensive things just once. If there are n elements out of place, then there will be at most n-1 swaps to put them in place, at a cost of (n-1) * (cost of the cheapest misplaced item) + (cost of all the other out-of-place items).
If the globally cheapest element is not misplaced, and the spread between it and the cheapest misplaced one is great enough, it can be cheaper to first swap the globally cheapest one out of its correct place with the cheapest misplaced one. The cost is then (n-1) * (globally cheapest item) + (cost of all out-of-place items) + (cost of the cheapest misplaced item).
Example
For [4,1,2,3], this algorithm exchanges (1,2) to produce:
[4,2,1,3]
and then swaps (3,1) to produce:
[4,2,3,1]
and then swaps (4,1) to produce:
[1,2,3,4]
Notice that each of the misplaced items 2, 3, and 4 is moved only once, always by swapping with the lowest-cost item, 1.
Code
Ooops: "Please don't provide a complete algorithm, just an approach." Removed my code.
In an effort to only get you going on it, this may not make complete sense.
Determine all possible moves and the cost of each, and store them somehow. Perform the least expensive move, then determine the moves that can be performed from this new arrangement, storing those along with the rest of your stored moves; perform the least expensive, and so on, until the array is sorted.
I love solving things like this.
This problem is also known as the Silly Sort problem in some ACM contests. Take a look at this solution using Divide & Conquer.
Try different sorting algorithms on the same input data and print the minimum.