Sorting a dictionary's keys by its values works perfectly well for simple Doubles or Ints, as in the example provided here.
But what about more complex dictionary structures?
I have a dictionary whose values are each an array of Double tuples (pretty complex, I know).
I would like to sort the dictionary by the sum of the second tuple elements: all second elements of a value's tuple array are summed, and the entries are ordered by these sums, smallest first, without losing track of the dictionary keys. The method should return an array of keys ordered according to these sums.
Here is my "poor" attempt at this problem:
I tried to sort the keys according to the first tuple element of the array of tuples with the following Playground example (see below), but it does not work yet.
This works for basic types:
extension Dictionary {
func keysSortedByValue(isOrderedBefore:(Value, Value) -> Bool) -> [Key] {
return sorted(self) {
let (lk, lv) = $0
let (rk, rv) = $1
return isOrderedBefore(lv, rv)
}.map { (k,v) in k }
}
}
let dict = ["a":2, "c":1, "b":3]
dict.keysSortedByValue(<) // result array of keys: ["c", "a", "b"]
dict.keysSortedByValue(>) // result array of keys: ["b", "a", "c"]
But in my more complex case, it doesn't work:
var criteria_array1 = [(Double, Double)]()
var criteria_array2 = [Double]()
var criteria_dict1 = [String:[(Double, Double)]]()
var criteria_dict2 = [String:[Double]]()
// Random creation of two dictionaries with a complex value-structure...
// Dictionary1: keys = Strings, values = array of Double-Tuples
// Dictionary2: keys = Strings, values = array of Doubles
for n in 1...5 {
let currentTopoString: String = "topo_\(n)"
for t in 0...14 {
let a: Double = Double(arc4random_uniform(1000))
let b: Double = Double(Double(arc4random_uniform(1000))/1000)
criteria_array1 += [(a, b)]
criteria_array2 += [b]
}
criteria_dict1[currentTopoString] = criteria_array1
criteria_dict2[currentTopoString] = criteria_array2
criteria_array1.removeAll()
criteria_array2.removeAll()
}
// the following two instructions generate compiler errors...
// why?
// How can a sorting method be applied to a complex dictionary-value structure?
criteria_dict1.keysSortedByFirstTupleValue(>)
criteria_dict2.keysSortedByFirstTupleValue(>)
This is a question of implementing the isOrderedBefore function appropriately. Just passing in > is not going to cut it (even assuming there was an implementation of > for arrays of tuples, it almost certainly wouldn't do the comparison-of-summation you are looking for).
If I understand your goal correctly, you want to sort the keys based on the value of the sum of one of the tuple entries in an array of tuples?
So something like this:
criteria_dict1.keysSortedByValue { lhs, rhs in
// if you actually want to sort by sum of first element in tuple,
// change next.1 to next.0
let left_sum = reduce(lhs, 0) { total, next in total + next.1 }
let right_sum = reduce(rhs, 0) { total, next in total + next.1 }
return left_sum > right_sum
}
This is quite inefficient, since you're summing the array for every comparison – in practice you may want to memoize it, or maybe rethink the problem in terms of a different data structure if you do this a lot.
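The memoization mentioned above can be sketched as follows, in Python for brevity: each sum is computed once per key, and the keys are then sorted by the cached sums. The function name `keys_sorted_by_second_sum` and the sample data are made up for illustration; this is a sketch of the idea, not a port of the Swift extension.

```python
def keys_sorted_by_second_sum(d, descending=False):
    """Return the keys of d ordered by the sum of the second tuple elements."""
    # Compute each sum exactly once instead of once per comparison.
    sums = {key: sum(second for _, second in pairs) for key, pairs in d.items()}
    return sorted(sums, key=sums.get, reverse=descending)


# Hypothetical sample data in the shape of criteria_dict1.
criteria = {
    "topo_1": [(10.0, 0.5), (20.0, 0.25)],     # second elements sum to 0.75
    "topo_2": [(30.0, 0.125), (40.0, 0.125)],  # 0.25
    "topo_3": [(50.0, 1.0), (60.0, 0.5)],      # 1.5
}
print(keys_sorted_by_second_sum(criteria))  # ['topo_2', 'topo_1', 'topo_3']
```

With n keys this performs n summations rather than roughly n log n pairs of them.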
With the U.S.'s large $1.5 Billion lottery this week, I wrote a function in Ruby to make Powerball picks. In Powerball, you choose 5 numbers from the range 1..69 (with no duplicates) and 1 number from the range 1..26.
This is what I came up with:
def pball
Array(1..69).shuffle[0..4].sort + [rand(1..26)]
end
It works by creating an array of integers from 1 to 69, shuffling that array, choosing the first 5 numbers, sorting those, and finally adding on a number from 1 to 26.
To do this in Swift takes a bit more work since Swift doesn't have the built-in shuffle method on Array.
This was my attempt:
func pball() -> [Int] {
let arr = Array(1...69).map{($0, drand48())}.sort{$0.1 < $1.1}.map{$0.0}[0...4].sort()
return arr + [Int(arc4random_uniform(26) + 1)]
}
Since there is no shuffle method, it works by creating an [Int] with values in the range 1...69. It then uses map to create [(Int, Double)], an array of tuple pairs that contain the numbers and a random Double in the range 0.0 ..< 1.0. It then sorts this array using the Double values and uses a second map to return to [Int] and then uses the slice [0...4] to extract the first 5 numbers and sort() to sort them.
In the second line, it appends a number in the range 1...26. I tried adding this to the first line, but Swift gave the error:
Expression was too complex to be solved in reasonable time; consider
breaking up the expression into distinct sub-expressions.
Can anyone suggest how to turn this into a 1-line function? Perhaps there is a better way to choose the 5 numbers from 1...69.
Xcode 8.3 • Swift 3.1
import GameKit
var powerballNumbers: [Int] {
return (GKRandomSource.sharedRandom().arrayByShufflingObjects(in: Array(1...69)) as! [Int])[0..<5].sorted() + [Int(arc4random_uniform(26) + 1)]
}
powerballNumbers // [5, 9, 62, 65, 69, 2]
Swift 2.x
import GameKit
var powerballNumbers: [Int] {
return (GKRandomSource.sharedRandom().arrayByShufflingObjectsInArray(Array(1...69)) as! [Int])[0...4].sort() + [Int(arc4random_uniform(26).successor())]
}
powerballNumbers // [21, 37, 39, 42, 65, 23]
I don't find the "one-liner" concept very compelling. Some languages lend themselves to it; others don't. I would suggest giving Swift a shuffle method to start with:
extension Array {
mutating func shuffle () {
for var i = self.count - 1; i != 0; i-- {
let ix1 = i
let ix2 = Int(arc4random_uniform(UInt32(i+1)))
(self[ix1], self[ix2]) = (self[ix2], self[ix1])
}
}
}
But since I made this mutating, we still need more than one line to express the entire operation because we have to have a var reference to our starting array:
var arr = Array(1...69)
(1...4).forEach {_ in arr.shuffle()}
let result = Array(arr[0..<5]) + [Int(arc4random_uniform(26)) + 1]
If you really insist on the one-liner, and you don't count the code needed to implement shuffle, then you can do it, though less efficiently, by defining shuffle more like this:
extension Array {
func shuffle () -> [Element] {
var arr = self
for var i = arr.count - 1; i != 0; i-- {
let ix1 = i
let ix2 = Int(arc4random_uniform(UInt32(i+1)))
(arr[ix1], arr[ix2]) = (arr[ix2], arr[ix1])
}
return arr
}
}
And here's your one-liner:
let result = Array(1...69).shuffle().shuffle().shuffle().shuffle()[0..<5] + [Int(arc4random_uniform(26)) + 1]
But oops, I omitted your sort. I don't see how to do that without getting the "too complex" error; to work around that, I had to split it into two lines:
var result = Array(1...69).shuffle().shuffle().shuffle().shuffle()[0..<5].sort(<)
result.append(Int(arc4random_uniform(26)) + 1)
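For comparison, here is the same Fisher-Yates idea as a minimal Python sketch. The name `fisher_yates_shuffled` and the `rng` parameter are assumptions made for this example; Python's standard library already offers `random.shuffle`, so this is only meant to show the loop.

```python
import random


def fisher_yates_shuffled(items, rng=random):
    """Return a shuffled copy of items using the Fisher-Yates algorithm."""
    arr = list(items)
    # Walk from the last index down to 1, swapping each slot with a
    # uniformly chosen slot at or below it.
    for i in range(len(arr) - 1, 0, -1):
        j = rng.randrange(i + 1)  # 0 <= j <= i
        arr[i], arr[j] = arr[j], arr[i]
    return arr


shuffled = fisher_yates_shuffled(range(1, 70))
print(shuffled[:5])
```

Assuming the random source is uniform, a single pass of this loop already produces a uniformly random permutation, so shuffling several times in a row should not be needed.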
How about this:
let winningDraw = (1...69).sort{ _ in arc4random_uniform(2) > 0}[0...4].sort() + [Int(arc4random_uniform(26)+1)]
[edit] The formula above wasn't random, but this one will be:
(1...69).map({Int(rand()%1000*70+$0)}).sort().map({$0%70})[0...4].sort() + [Int(rand()%26+1)]
For the fun of it, here is a (long) non-GameplayKit one-liner for Swift 3. It uses the global sequence(state:next:) function to draw random elements from a mutable state array rather than shuffling the array (although it mutates the state array 5 times, so there are some extra copy operations here):
let powerballNumbers = Array(sequence(state: Array(1...69), next: {
(s: inout [Int]) -> Int? in s.remove(at: Int(arc4random_uniform(UInt32(s.count))))})
.prefix(5).sorted()) + [Int(arc4random_uniform(26) + 1)]
... broken down for readability.
(Possible in future Swift version)
If type inference weren't broken for inout closure parameters (as arguments to closures), we could reduce the above to:
let powerballNumbers = Array(sequence(state: Array(1...69), next: {
$0.remove(at: Int(arc4random_uniform(UInt32($0.count)))) })
.prefix(5).sorted()) + [Int(arc4random_uniform(26) + 1)]
If we'd also allow the following extension
extension Int {
var rand: Int { return Int(arc4random_uniform(UInt32(exactly: self) ?? 0)) }
}
Then, we could go on to reduce the one-line to:
let powerballNumbers = Array(sequence(state: Array(1...69), next: { $0.remove(at: $0.count.rand) }).prefix(5).sorted()) + [26.rand + 1]
Xcode 10 • Swift 4.2
Swift 4.2 added shuffled() to Sequence (so it is available on ClosedRange) and random(in:) to Int, which makes this easy to accomplish in one line:
func pball() -> [Int] {
return (1...69).shuffled().prefix(5).sorted() + [Int.random(in: 1...26)]
}
Further trimmings:
Because of the return type of pball(), the Int can be inferred in the random method call. Also, .prefix(5) can be replaced with [...4]. Finally, return can be omitted from the one-line function:
func pball() -> [Int] {
(1...69).shuffled()[...4].sorted() + [.random(in: 1...26)]
}
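As a cross-language illustration of the same draw, here is a short Python sketch that samples without replacement instead of shuffling the whole range. The `rng` parameter is an assumption added so a seeded generator can be passed in.

```python
import random


def pball(rng=random):
    """Five distinct numbers from 1..69, sorted, plus one number from 1..26."""
    white = sorted(rng.sample(range(1, 70), 5))  # sample draws without replacement
    red = rng.randint(1, 26)                     # randint includes both endpoints
    return white + [red]


print(pball())
```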
I'm trying to return the indices of an array which correspond to the sorted values. For example,
let arr = [7, 10, -3]
let idxs = argsort(arr) // [2, 0, 1]
My attempt works but is not pretty, and only functions for CGFloat. I'm looking for some ways in which I can improve the function, make it generic and easier to read. The code just looks ugly,
func argsortCGFloat( a : [CGFloat] ) -> [Int] {
/* 1. Values are wrapped in (index, values) tuples */
let wrapped_array = Array(Zip2(indices(a),a))
/* 2. A comparator compares the numerical value from
two tuples and the array is sorted */
func comparator(a: (index : Int, value : CGFloat), b: (index : Int, value : CGFloat)) -> Bool {
return a.value < b.value
}
var values = sorted(wrapped_array, comparator)
/* 3. The sorted indexes are extracted from the sorted
array of tuples */
var sorted_indexes: [Int] = []
for pair in values {
sorted_indexes.append(pair.0)
}
return sorted_indexes
}
You can do it by creating an array of indexes, and sorting them using the array from the outer context, like this:
func argsort<T:Comparable>( a : [T] ) -> [Int] {
var r = Array(indices(a))
r.sort({ a[$0] < a[$1] }) // ascending order, so [7, 10, -3] gives [2, 0, 1]
return r
}
let arr = [7, 10, -3]
let idxs = argsort(arr)
println (idxs)
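The same index-sorting trick can be sketched in Python, where the sort key looks each index up in the original list. The function name `argsort` simply mirrors the one used in the question.

```python
def argsort(values):
    """Return the indices that would sort values in ascending order."""
    return sorted(range(len(values)), key=lambda i: values[i])


print(argsort([7, 10, -3]))  # [2, 0, 1]
```

It works for any element type that supports comparison, which is the generic behaviour the question asks for.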
Given a sentence that is spread over a linked list where each item in the list is a word, for example:
Hello -> Everybody -> How -> Are -> You -> Feeling -> |
Given that this list is sorted, eg:
Are -> Everybody -> Feeling -> Hello -> How -> You -> |
How would you write the recursion that will find the initial letter that appears the most in the sentence (in this example the letter H from Hello & How) ?
Edit: I have updated the code to a recursive version.
In order to run it you call
GetMostLetterRecursion(rootNode , '0', 0, '0', 0)
The code itself look like this:
public char GetMostLetterRecursion(LinkedListNode<String> node, char currentChar, int currentCount, char maxChar, int maxCount)
{
if (node == null) return currentCount > maxCount ? currentChar : maxChar; // the final run may be the longest
char c = node.Value[0];
if (c == currentChar)
{
return GetMostLetterRecursion(node.Next, currentChar, currentCount + 1, maxChar, maxCount);
}
if(currentCount > maxCount)
{
return GetMostLetterRecursion(node.Next, c, 1, currentChar, currentCount);
}
return GetMostLetterRecursion(node.Next, c, 1, maxChar, maxCount);
}
Solution 1
Loop over the words, keeping a tally of how many words start with each letter. Return the most popular letter according to the tally (easy if you used a priority queue for the tally).
This takes O(n) time (the number of words) and O(26) memory (the number of letters in alphabet).
Solution 2
Sort the words alphabetically. Loop over the words. Keep a record of the current letter and its frequency, as well as the most popular letter so far and its frequency. At the end of the loop, that's the most popular letter over the whole list.
This takes O(n log n) time and O(1) memory.
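Solution 2 can be written recursively. The following Python sketch assumes the words are already sorted and held in a list rather than a linked list; the function name and its parameters are made up for this example.

```python
def most_common_initial(sorted_words, i=0, current="", count=0, best="", best_count=0):
    """Recursively scan a sorted word list, tracking the longest run of equal initials."""
    # Promote the current run if it beats the best one seen so far.
    if count > best_count:
        best, best_count = current, count
    if i == len(sorted_words):
        return best
    initial = sorted_words[i][0]
    if initial == current:
        # Same initial letter: extend the current run.
        return most_common_initial(sorted_words, i + 1, current, count + 1, best, best_count)
    # New initial letter: start a fresh run of length 1.
    return most_common_initial(sorted_words, i + 1, initial, 1, best, best_count)


words = ["Are", "Everybody", "Feeling", "Hello", "How", "You"]
print(most_common_initial(words))  # H
```

On a tie the earliest letter wins. Python limits recursion depth, so for very long lists an iterative loop would be the safer choice.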
Keep an array to store the count of occurrences and go through the linked list once to fill it. Finally, loop through the array to find the highest count.
Rough sketch in C:
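int count[26] = {0};
while (head != NULL) // visit every node, including the last one
{
    count[head->word[0] - 'A']++; // Assuming 'word' is a string starting with an uppercase letter
    head = head->next;
}
int max = 0; // index of the most frequent letter so far
for (int i = 1; i < 26; i++)
{
    if (count[max] < count[i])
        max = i;
}
// 'A' + max is the most frequent initial letter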
You can modify it to use recursion and handle lower case letters.
Here is a pure recursive implementation in Python. I haven't tested it, but it should work modulo typos or syntax errors. I used a Dictionary to store counts, so it will work with Unicode words too. The problem is split into two functions: one to count the occurrences of each letter, and another to find the maximum recursively.
# returns a dictionary where dict[letter] contains the count of letter
def count_first_letters(words):
    def count_first_letters_rec(words, count_so_far):
        if len(words) == 0:
            return count_so_far
        first_letter = words[0][0]
        # could use defaultdict but this is an exercise :)
        try:
            count_so_far[first_letter] += 1
        except KeyError:
            count_so_far[first_letter] = 1
        # recursive call
        return count_first_letters_rec(words[1:], count_so_far)
    return count_first_letters_rec(words, {})
# takes a list of (item, count) pairs and returns the item with largest count.
def argmax(item_count_pairs):
    def argmax_rec(item_count_pairs, max_so_far, argmax_so_far):
        if len(item_count_pairs) == 0:
            return argmax_so_far
        item, count = item_count_pairs[0]
        if count > max_so_far:
            max_so_far = count
            argmax_so_far = item
        # recursive call
        return argmax_rec(item_count_pairs[1:], max_so_far, argmax_so_far)
    return argmax_rec(item_count_pairs, 0, None)
def most_common_first_letter(words):
    counts = count_first_letters(words)
    # this returns a dictionary, but we need to convert to
    # a list of (key, value) tuples because recursively iterating
    # over a dictionary is not so easy
    kvpairs = list(counts.items())
    # list() is needed in Python 3, where items() returns a view that cannot be sliced
    return argmax(kvpairs)
I use an array of length 26 (one slot per English letter, so index 0 is for 'a', index 1 for 'b', and so on). Each time a letter occurs, I increment its value in the array. If the value exceeds the current maximum, I update the maximum and take that letter as the most frequent one. Then I call the method for the next node.
This is the code in Java:
import java.util.LinkedList;
public class MostOccurance {
    LinkedList<String> list = new LinkedList<String>();
    int[] letters = new int[26]; // letters[0] counts 'a', letters[1] counts 'b', ...
    public char start() {
        return findMostOccuredChar(0, '0', 0);
    }
    public char findMostOccuredChar(int node, char most, int max) {
        if (node >= list.size())
            return most;
        char c = list.get(node).charAt(0);
        // getNumericValue maps 'a'/'A' to 10 ... 'z'/'Z' to 35,
        // so words are assumed to start with a Latin letter
        int index = Character.getNumericValue(c) - 10;
        letters[index]++;
        if (letters[index] > max) {
            max = letters[index];
            most = c;
        }
        // pass node + 1 (node++ would pass the old value and never advance)
        return findMostOccuredChar(node + 1, most, max);
    }
}
Of course, you have to add each word to your linked list. I didn't do that, because I was just showing an example.
This should be easy.
I want to check whether two lists are the same in that they contain all the same elements, regardless of order.
Duplicated elements are considered equal, i.e., new[]{1,2,2} is the same as new[]{2,1}
var same = list1.Except(list2).Count() == 0 &&
list2.Except(list1).Count() == 0;
Edit: This was written before the OP added that { 1, 2, 2 } equals { 1, 1, 2 } (regarding handling of duplicate entries).
This will work as long as the elements are comparable for order.
bool equal = list1.OrderBy(x => x).SequenceEqual(list2.OrderBy(x => x));
The SetEquals method of HashSet is best suited for checking whether two sets are equal as defined in this question:
string stringA = "1,2,2";
string stringB = "2,1";
HashSet<string> setA = new HashSet<string>((stringA.Trim()).Split(',').Select(t => t.Trim()));
HashSet<string> setB = new HashSet<string>((stringB.Trim()).Split(',').Select(t => t.Trim()));
bool isSetsEqual = setA.SetEquals(setB);
REFERENCE:
Check whether two comma separated strings are equal (for Content set)
You need to get the intersection of the two lists:
bool areIntersected = t1.Intersect(t2).Count() > 0;
In response to your modified question:
bool areSameIntersection = t1.Except(t2).Count() == 0 && t2.Except(t1).Count() == 0;
If every element of list1 is in list2 and every element of list2 is in list1, then the lists are subsets of each other; in other words, they contain the same elements. Comparing only the two counts is not enough: {1, 2} and {2, 3} each have exactly one element in the other list, yet they differ.
return list1.All(l => list2.Contains(l)) && list2.All(l => list1.Contains(l));
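The set-based comparison used in several of these answers can be sketched in Python, where built-in sets ignore both order and duplicates. The function name `same_elements` is made up, and the elements are assumed to be hashable.

```python
def same_elements(list1, list2):
    """True when both lists contain the same distinct elements, ignoring order and duplicates."""
    return set(list1) == set(list2)


print(same_elements([1, 2, 2], [2, 1]))  # True
print(same_elements([1, 2], [2, 3]))     # False
```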
I have a situation where I need to find the value with the key closest to the one I request. It's kind of like a nearest map that defines distance between keys.
For example, if I have the keys {A, C, M, Z} in the map, a request for D would return C's value.
Any idea?
Most tree data structures keep their keys sorted so that they can store and find them efficiently. Many implementations can locate a key close to the one you probe with (usually either the closest below or the closest above). For example, Java's TreeMap implements such a data structure, and you can ask it for the closest key below your lookup key or the closest key above it (lowerKey and higherKey).
If you can calculate distances (it's not always easy: Java's interface only requires you to know whether any given key is "below" or "above" any other given key), then you can ask for both the closest above and the closest below and work out for yourself which one is closer.
What's the dimensionality of your data? If it's just one-dimensional, a sorted array will do it: a binary search will locate the exact match and/or reveal between which two keys your search key lies, and a simple test will tell you which is closer.
If you need to locate not just the nearest key, but an associated value, maintain an identically sorted array of values - the index of the retrieved key in the key array is then the index of the value in the value array.
Of course, there are many alternative approaches - which one to use depends on many other factors, such as memory consumption, whether you need to insert values, if you control the order of insertion, deletions, threading issues, etc...
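The sorted-array approach with a parallel value array can be sketched in Python using the standard `bisect` module. The function name `nearest_value`, the tie-breaking rule (the lower key wins) and the sample values are assumptions made for this example; keys are represented by their character codes so that distances can be computed by subtraction.

```python
import bisect


def nearest_value(keys, values, target):
    """keys is sorted ascending; values[i] belongs to keys[i]."""
    if not keys:
        raise ValueError("empty key list")
    i = bisect.bisect_left(keys, target)  # first index with keys[i] >= target
    if i == 0:
        return values[0]    # target is at or below the smallest key
    if i == len(keys):
        return values[-1]   # target is above the largest key
    below, above = keys[i - 1], keys[i]
    # On a tie the lower key wins.
    return values[i] if above - target < target - below else values[i - 1]


keys = [ord(c) for c in "ACMZ"]
values = ["value of A", "value of C", "value of M", "value of Z"]
print(nearest_value(keys, values, ord("D")))  # value of C
```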
BK-trees do precisely what you want. Here's a good article on implementing them.
And here is a Scala implementation:
class BKTree[T](computeDistance: (T, T) => Int, node: T) {
val subnodes = scala.collection.mutable.HashMap.empty[Int,BKTree[T]]
def query(what: T, distance: Int): List[T] = {
val currentDistance = computeDistance(node, what)
val minDistance = currentDistance - distance
val maxDistance = currentDistance + distance
val elegibleNodes = (
subnodes.keys.toList
filter (key => minDistance to maxDistance contains key)
map subnodes
)
val partialResult = elegibleNodes flatMap (_.query(what, distance))
if (currentDistance <= distance) node :: partialResult else partialResult
}
def insert(what: T): Boolean = if (node == what) false else (
subnodes.get(computeDistance(node, what))
map (_.insert(what))
getOrElse {
subnodes(computeDistance(node, what)) = new BKTree(computeDistance, what)
true
}
)
override def toString = node.toString+"("+subnodes.toString+")"
}
object Test {
def main(args: Array[String]) {
val root = new BKTree(charDistance, 'A')
root.insert('C')
root.insert('M')
root.insert('Z')
println(findClosest(root, 'D'))
}
def charDistance(a: Char, b: Char) = a - b abs
def findClosest[T](root: BKTree[T], what: T): List[T] = {
var distance = 0
var closest = root.query(what, distance)
while(closest.isEmpty) {
distance += 1
closest = root.query(what, distance)
}
closest
}
}
I'll admit to a certain dirtiness and ugliness about it, and to being way too clever with the insertion algorithm. Also, it only works well for small distances; otherwise you'll search the tree repeatedly. Here's an alternate implementation that does a better job of it:
class BKTree[T](computeDistance: (T, T) => Int, node: T) {
val subnodes = scala.collection.mutable.HashMap.empty[Int,BKTree[T]]
def query(what: T, distance: Int): List[T] = {
val currentDistance = computeDistance(node, what)
val minDistance = currentDistance - distance
val maxDistance = currentDistance + distance
val elegibleNodes = (
subnodes.keys.toList
filter (key => minDistance to maxDistance contains key)
map subnodes
)
val partialResult = elegibleNodes flatMap (_.query(what, distance))
if (currentDistance <= distance) node :: partialResult else partialResult
}
private def find(what: T, bestDistance: Int): (Int,List[T]) = {
val currentDistance = computeDistance(node, what)
val presentSolution = if (currentDistance <= bestDistance) List(node) else Nil
val best = currentDistance min bestDistance
subnodes.keys.foldLeft((best, presentSolution))(
(acc, key) => {
val (currentBest, currentSolution) = acc
val (possibleBest, possibleSolution) =
if (key <= currentDistance + currentBest)
subnodes(key).find(what, currentBest)
else
(0, Nil)
(possibleBest, possibleSolution) match {
case (_, Nil) => acc
case (better, solution) if better < currentBest => (better, solution)
case (_, solution) => (currentBest, currentSolution ::: solution)
}
}
)
}
def findClosest(what: T): List[T] = find(what, computeDistance(node, what))._2
def insert(what: T): Boolean = if (node == what) false else (
subnodes.get(computeDistance(node, what))
map (_.insert(what))
getOrElse {
subnodes(computeDistance(node, what)) = new BKTree(computeDistance, what)
true
}
)
override def toString = node.toString+"("+subnodes.toString+")"
}
object Test {
def main(args: Array[String]) {
val root = new BKTree(charDistance, 'A')
root.insert('C')
root.insert('E')
root.insert('M')
root.insert('Z')
println(root.findClosest('D'))
}
def charDistance(a: Char, b: Char) = a - b abs
}
With C++ and STL containers (std::map) you can use the following template function:
#include <iostream>
#include <map>
//!This function returns the nearest item by the metric specified in "operator -" of type T
//!If two items in the map are equidistant from item_to_find, the one with the smaller key is returned
//!The map is taken by reference so that the returned iterator stays valid
template <class T,class U> typename std::map<T,U>::iterator find_nearest(std::map<T,U>& map_for_search,const T& item_to_find)
{
typename std::map<T,U>::iterator itlow,itprev;
itlow=map_for_search.lower_bound(item_to_find);
//for cases when "item_to_find" occurs before the first element of the map (or the map is empty)
if (itlow==map_for_search.begin())
return itlow;
itprev=itlow;
itprev--;
//if "item_to_find" is beyond the last element of the map
if (itlow==map_for_search.end())
return itprev;
//for cases when we have the "item_to_find" element in our map
if (itlow->first==item_to_find)
return itlow;
return (itlow->first-item_to_find < item_to_find-itprev->first)?itlow:itprev; // C will be returned
//note that "operator -" is used here as a function for distance metric
}
int main ()
{
std::map<char,int> mymap;
std::map<char,int>::iterator nearest;
//fill map with some information
mymap['B']=20;
mymap['C']=40;
mymap['M']=60;
mymap['Z']=80;
char ch='D'; //C should be returned
nearest=find_nearest<char,int>(mymap,ch);
std::cout << nearest->first << " => " << nearest->second << '\n';
ch='Z'; //Z should be returned
nearest=find_nearest<char,int>(mymap,ch);
std::cout << nearest->first << " => " << nearest->second << '\n';
ch='A'; //B should be returned
nearest=find_nearest<char,int>(mymap,ch);
std::cout << nearest->first << " => " << nearest->second << '\n';
ch='H'; // equidistant to C and M -> C is returned
nearest=find_nearest<char,int>(mymap,ch);
std::cout << nearest->first << " => " << nearest->second << '\n';
return 0;
}
Output:
C => 40
Z => 80
B => 20
C => 40
It is assumed that operator - is used as the function that evaluates distance. You should implement that operator if class T is your own class whose objects serve as keys in the map.
You could also change the code to use a static member function of class T (say, distance) instead of operator -:
return (T::distance(itlow->first,item_to_find) < T::distance(item_to_find,itprev->first))?itlow:itprev;
where distance should be something like
static distance_type distance(const some_type& first, const some_type& second){//...}
and distance_type should support comparison by operator <
You can implement something like this as a tree. A simple approach is to assign each node in the tree a bitstring. Each level of the tree is stored as a bit. All parent information is encoded in the node's bitstring. You can then easily locate arbitrary nodes, and find parents and children. This is how Morton ordering works, for example. It has the extra advantage that you can calculate distances between nodes by simple binary subtraction.
If you have multiple links between data values, then your data structure is a graph rather than a tree. In that case, you need a slightly more sophisticated indexing system. Distributed hash tables do this sort of thing. They typically have a way of calculating the distance between any two nodes in the index space. For example, the Kademlia algorithm (used by Bittorrent) uses XOR distances applied to bitstring ids. This allows Bittorrent clients to lookup ids in a chain, converging on the unknown target location. You can use a similar approach to find the node(s) closest to your target node.
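The XOR metric itself is easy to illustrate. The following Python sketch only shows how "closest" is defined under that metric for a handful of made-up integer ids; it is not an implementation of Kademlia's routing or lookup procedure.

```python
def closest_by_xor(node_ids, target):
    """Return the id whose XOR with target is smallest (Kademlia-style metric)."""
    return min(node_ids, key=lambda node_id: node_id ^ target)


# Hypothetical 4-bit ids.
ids = [0b0001, 0b0100, 0b0111, 0b1100]
print(bin(closest_by_xor(ids, 0b0110)))  # 0b111
```

Here 0b0111 wins because it differs from the target only in the lowest bit, giving an XOR distance of 1.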
If your keys are strings and your similarity function is Levenshtein distance, then you can use finite-state machines:
Your map is a trie built as a finite-state machine (by taking the union of all key/value pairs and determinizing the result). Then, compose your input query with a simple finite-state transducer that encodes the Levenshtein distance, and compose that with your trie. Then, use the Viterbi algorithm to extract the shortest path.
You can implement all this with only a few function calls using a finite-state toolkit.
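Without a finite-state toolkit at hand, the same lookup can be approximated by brute force. The Python sketch below deliberately swaps in a simpler technique: it computes the Levenshtein distance to every key with the classic dynamic-programming recurrence and picks the minimum, which is fine for small maps but does not scale the way the transducer composition does. The function names and sample data are made up.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution (free on a match)
            ))
        previous = current
    return previous[-1]


def closest_value(mapping, query):
    """Return the value whose key has the smallest edit distance to query."""
    best_key = min(mapping, key=lambda key: levenshtein(key, query))
    return mapping[best_key]


words = {"kitten": 1, "sitting": 2, "mitten": 3}
print(closest_value(words, "mittens"))  # 3
```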
In Scala, this is a technique I use to find the closest Int key <= the key you are looking for:
val sMap = SortedMap(1 -> "A", 2 -> "B", 3 -> "C")
sMap.to(4).lastOption.get // Returns (3, "C"), the entry with the closest key <= 4
sMap.to(-1) // Returns an empty Map