Collapse array of sets to single set in Julia

How do I collapse an array of sets to a single set containing all unique set elements in the following way?
Array_of_sets = Set{String}[Set(["EUR", "GBP", "USD"]), Set(["AUD"])]
# do something to Array_of_sets which produces
Set{String}(["EUR", "GBP", "USD", "AUD"])
union, vcat and unique do not seem to work in this case.

Sets already store each element only once:
julia> S = Set(["a", "b"])
Set(["b", "a"])
julia> push!(S, "a")
Set(["b", "a"])
So there is no need for unique:
julia> A = Set{String}[Set(["EUR", "GBP", "USD"]), Set(["EUR", "AUD"])]
julia> reduce(union!, A)
Set(["EUR", "GBP", "AUD", "USD"])
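One caveat worth noting: reduce(union!, A) folds everything into A[1], so the first set in the array is modified in place. If that matters, a non-mutating variant can be sketched as follows (assuming a recent Julia 1.x; the variable names are only illustrative):

```julia
A = Set{String}[Set(["EUR", "GBP", "USD"]), Set(["EUR", "AUD"])]

# Non-mutating fold; `init` also makes this safe when A is empty.
combined = reduce(union, A; init = Set{String}())

# Splatting gives the same result for a non-empty array.
splatted = union(A...)

# The inputs are untouched: the first set still has three elements.
first_len = length(A[1])
```

Splatting is convenient for short arrays; for very long ones the reduce form avoids passing thousands of arguments to a single call.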

I would convert the sets to arrays, and vcat them:
Array_of_sets = Set{String}[Set(["EUR", "GBP", "USD"]), Set(["AUD"])]
Array_of_arrays = map(collect, Array_of_sets)
Set(vcat(Array_of_arrays...))
> Set{String}(["EUR", "GBP", "USD", "AUD"])
Edit: But then I'm not as clever as #stillearningsomething

In what way are Julia sets immutable?

If I create a set in Julia, then Julia will tell me that the set is immutable.
julia> pets = Set(["dog", "cat", "budgerigar"])
Set{String} with 3 elements:
"cat"
"budgerigar"
"dog"
julia> ismutable(pets)
false
Nonetheless, I can modify the set in place.
julia> push!(pets, "orangutan")
Set{String} with 4 elements:
"orangutan"
"cat"
"budgerigar"
"dog"
And I can check that the set contents have changed.
julia> display(pets)
Set{String} with 4 elements:
"orangutan"
"cat"
"budgerigar"
"dog"
Similarly, I can delete from the set in place
julia> delete!(pets, "dog")
Set{String} with 3 elements:
"orangutan"
"cat"
"budgerigar"
So my question is, in what way are sets immutable? In what way is their mutability different when compared with dictionaries?
julia> ismutable(Dict())
true
What am I not understanding?
If you check the source for Set, you can see that a Set is just a wrapper around a Dict{T,Nothing}. When you add a new item, say x::T, to the Set, Julia simply creates a new entry x => nothing in that Dict. That is, the items are stored as the keys of the Dict, and the values are irrelevant, so they are set to nothing.
Clearly, a Dict needs to be mutable, as you observed in the question. The Set itself does not need to be mutable, since all the mutation is performed within the Dict that is wrapped by the Set. To see what I mean, we can mess around with the internals of a Set.
julia> s = Set(["a", "b"])
Set{String} with 2 elements:
"b"
"a"
julia> s.dict
Dict{String,Nothing} with 2 entries:
"b" => nothing
"a" => nothing
julia> push!(s, "c")
Set{String} with 3 elements:
"c"
"b"
"a"
julia> s.dict["d"] = nothing
julia> s.dict
Dict{String,Nothing} with 4 entries:
"c" => nothing
"b" => nothing
"a" => nothing
"d" => nothing
julia> s.dict = Dict("a new set"=>nothing)
ERROR: setfield! immutable struct of type Set cannot be changed
Stacktrace:
[1] setproperty!(::Set{String}, ::Symbol, ::Dict{String,Nothing}) at ./Base.jl:34
[2] top-level scope at REPL[14]:1
The key insight here is that I can play around with the mutability of s.dict as much as I want. But because Set is immutable, I can't replace s.dict with an entirely new Dict. That triggers the error you see in my session above.
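The same behaviour can be reproduced with a small type of your own. This is only a sketch: Basket is a hypothetical struct made up for illustration, not anything from Base, and it assumes Julia 1.5 or later for ismutable.

```julia
# Hypothetical wrapper type, analogous in spirit to how Set wraps a Dict.
struct Basket
    items::Vector{String}
end

b = Basket(["apple"])

# Mutating the object the field points to is allowed.
push!(b.items, "pear")

# Rebinding the field itself is not, because Basket is an immutable struct.
rebind_failed = try
    b.items = ["plum"]
    false
catch err
    err isa ErrorException
end
```

Here ismutable(b) is false while ismutable(b.items) is true, which mirrors the Set and Dict results in the question.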
If it isn't clear what it means for an immutable type to have mutable internals, I asked a similar question about this years ago on StackOverflow. It can be found here

How do I check if a dictionary has a key in it in Julia?

Suppose I have a Dict object and a key, and I want to see whether there is already an entry in the dictionary for that key. How do I do this?
There are a few ways to do this. Suppose this is your dictionary:
d = Dict(
"aardvark" => 1,
"bear" => 2,
"cat" => 3,
"dog" => 4,
)
If you have a key you can check for its presence using the haskey function:
julia> haskey(d, "cat")
true
julia> haskey(d, "zebra")
false
A slightly fancier way to check this is to check if the key is in the set of keys returned by calling keys(d):
julia> ks = keys(d)
Base.KeySet for a Dict{String,Int64} with 4 entries. Keys:
"aardvark"
"bear"
"cat"
"dog"
julia> "cat" in ks
true
julia> "zebra" in ks
false
Finally, it's fairly common that you want to get the value associated with a key if it is present in the dictionary. You can do that as a separate step by doing d[k] after checking that k is present in keys(d) but that involves an additional dictionary lookup. Instead, if there is some sentinel value that you know cannot be a value in your dictionary, such as nothing, then you can use the get function to look up the key with a default:
v = get(d, k, nothing)
if v !== nothing
    # keys(d) contains k
end
If you know nothing about the kinds of values that d can map keys to, this is not a safe option since it could be the case that the pair k => nothing is present in d.
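When nothing might be a legitimate value, one way around this is a sentinel that cannot collide with real data. The sketch below assumes a hypothetical NotFound type and lookup helper, both invented here for illustration; it still performs only a single dictionary lookup.

```julia
# Hypothetical private sentinel type; no dictionary value is an instance of it
# unless the caller deliberately stores one.
struct NotFound end

d = Dict{String,Any}("cat" => 3, "ghost" => nothing)

# Single lookup: get returns the stored value, or the sentinel if k is absent.
function lookup(d, k)
    v = get(d, k, NotFound())
    return v isa NotFound ? "absent" : "present: $(repr(v))"
end
```

With this, lookup(d, "ghost") reports the key as present even though its value is nothing, while lookup(d, "zebra") reports it as absent.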
OK, this is not exactly an answer, but it is still relevant to the discussion: something I learnt recently about Julia's dictionary accessor methods that has made my life a lot easier.
val = get(dict, "key", nothing)
also has a version that creates the dictionary entry and sets it to the default value if it does not already exist
val = get!(dict, "key", nothing)
which does away with writing a lot of blocks like this
if !haskey(dict, "key")
    dict["key"] = nothing
end
val = dict["key"]
So instead of littering your code with key checks and value-initialization blocks, you can use get!.
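A typical use is building up collections per key. The following is a minimal sketch with made-up data; it uses the do-block form of get!, which only constructs the default when the key is actually missing.

```julia
words = ["apple", "avocado", "banana", "blueberry", "cherry"]
groups = Dict{Char,Vector{String}}()

for w in words
    # The do-block only runs when the key is missing, so no throwaway
    # vector is allocated for keys that already exist.
    bucket = get!(groups, first(w)) do
        String[]
    end
    push!(bucket, w)
end
```

Afterwards groups maps each first letter to the words starting with it, in the order they appeared in words.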

Ruby custom sorting of first n elements

I would like to sort an array of strings using a custom ordering. The problem is that I don't know all the elements in the array, but I am sure that it contains three strings (high/med/low). I would like those three to be the first three values, with everything else after them.
Eg:
Incoming arrays
array1 = ["high", "not impt" , "med" , "kind of impt" , "low" ]
array2 = ["low", "rand priority", "med", "high"]
Only high, med and low are fixed; the rest keep changing or might not be present at all.
Required output:
["high", "med", "low", rest... (order doesn't matter)]
I know I can delete and merge, but then it would not be obvious from the code why I am deleting and merging. Is there a better way?
You can use the sort_by method and implement something like this:
["high", "not impt" , "med" , "kind of impt" , "low" ].sort_by do |a|
  ["high", "med", "low"].index(a) || Float::INFINITY
end
The index method returns 0, 1 and 2 for "high", "med" and "low" respectively, and nil for any other value. Thus "high", "med" and "low" come first and everything else goes to the end, since 0, 1 and 2 are all less than Float::INFINITY.

redis sorted set highest score

Is there a simple way to get the highest score from a Redis sorted set? I found the approach below (in Ruby), but maybe there is a better one:
all_scores = Redis.zrange('foo', 0, -1, with_scores: true) # => [["item 1", 2.5], ["item 2", 3.4]]
all_scores.flatten.last # => 3.4
It doesn't seem like the best way.
You can use the ZREVRANGE command:
ZREVRANGE foo 0 0 withscores
This returns the member with the highest score, together with that score.
http://redis.io/commands/zrevrange

Julia: Sorting a dict of types

I have a Dict filled with Job types.
A job has a name (string) and a score (int).
I managed to load the jobs into a Dict, and I want to sort them by score using the sort function. However, when I sort the values of the Dict (call it jobs), I just get a new vector of the sorted scores.
Is there any way to sort the Dict while keeping track of which job has which score?
jobs = Dict([(nurse, nurse.score), (construction, construction.score),
(programmer, programmer.score), (retail, retail.score)])
sort(collect(values(jobs)))
So if nurse has a score of 3, programmer a score of 6, retail a score of 0, and construction a score of 4, I would want the output to be a Dict (or something similar) that contains:
programmer, 6
construction, 4
nurse, 3
retail, 0
Or, even better, could I sort by the values but get the output as a vector containing just the jobs, and then reference that vector later in my code?
This works in your specific case:
jobs = Dict("nurse"=>3, "construction"=>4, "programmer"=>6, "retail"=>0)
jobpairs = collect(jobs)
jobvalues = collect(values(jobs))
sind = sort(collect(values(jobs)), rev=true)
julia> sortedNames = [jobpairs[i] for i in indexin(sind, jobvalues)]
4-element Array{Any,1}:
"programmer"=>6
"construction"=>4
"nurse"=>3
"retail"=>0
If two keys have the same value, more work is needed to handle the indices correctly.
UPDATE:
As Matt suggested in the comments, sortperm is the better choice: indexin breaks as soon as two keys in the Dict share the same value.
jobs = Dict("nurse"=>3, "construction"=>4, "foo"=>3, "programmer"=>6, "retail"=>0)
jobpairs = collect(jobs)
jobvalues = collect(values(jobs))
sind = sortperm(collect(values(jobs)), rev=true)
julia> sortedNames = [jobpairs[i].first for i in sind]
5-element Array{Any,1}:
"programmer"
"construction"
"foo"
"nurse"
"retail"
This needs less code, though I have not measured its performance, and the result is a vector of pairs rather than a Dict:
sort(collect(jobs),by=x->x[2],rev=true)
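To get just the job names in ranked order, which the question asks about at the end, the same idea can be extended as in this sketch. It uses string keys as stand-ins for the question's Job objects, and ranked and ranked_names are illustrative names.

```julia
jobs = Dict("nurse" => 3, "construction" => 4, "programmer" => 6, "retail" => 0)

# Sort the key => value pairs by value, highest first.
ranked = sort(collect(jobs); by = last, rev = true)

# Keep only the keys, in ranked order.
ranked_names = first.(ranked)
```

ranked_names is an ordinary Vector, so it can be indexed or iterated later without touching the Dict again. Note that the relative order of keys with equal scores depends on the Dict's iteration order.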
Currently I think the recommended way to do it is:
julia> using DataStructures
julia> jobs = Dict("nurse"=>3, "construction"=>4, "programmer"=>6, "retail"=>0)
Dict{String,Int64} with 4 entries:
"programmer" => 6
"retail" => 0
"construction" => 4
"nurse" => 3
julia> sort!(OrderedDict(jobs), byvalue=true, rev=true)
OrderedDict{String,Int64} with 4 entries:
"programmer" => 6
"construction" => 4
"nurse" => 3
"retail" => 0
This gives you a dictionary, as you wanted; because it is an OrderedDict, it keeps the sorted order shown above.
