Is it possible to sort a dictionary in Julia?

I have created a dictionary from two arrays using zip(), like this:
list1 = [1,2,3,4,5]
list2 = [6,7,8,9,19]
dictionary1 = Dict(zip(list1,list2))
Now I want to sort this dictionary by key (list1) or by value (list2). Can somebody show me a function or approach for doing this?

Sort also takes a by keyword, which means you can do:
julia> sort(collect(dictionary1), by=x->x[2])
5-element Array{Tuple{Int64,Int64},1}:
(1,6)
(2,7)
(3,8)
(4,9)
(5,19)
Also note that there is a SortedDict in DataStructures.jl, which maintains sort order, and there's an OrderedDict which maintains insertion order. Finally, there's a pull request which would allow direct sorting of OrderedDicts (but I need to finish it up and commit it).

While SortedDict may be useful if it is necessary to keep the dictionary sorted, it is often only necessary to sort the dictionary for output, in which case, the following may be what is required:
list1 = [1,2,3,4,5]
list2 = [6,7,8,9,19]
dictionary1 = Dict(zip(list1,list2))
sort(collect(dictionary1))
... which produces:
5-element Array{(Int64,Int64),1}:
(1,6)
(2,7)
(3,8)
(4,9)
(5,19)
We can sort by values with:
sort(collect(zip(values(dictionary1),keys(dictionary1))))
... which gives:
5-element Array{(Int64,Int64),1}:
(6,1)
(7,2)
(8,3)
(9,4)
(19,5)

With OrderedCollections.jl loaded (it is also re-exported by DataStructures.jl), sort accepts a Dict, and its byvalue keyword sorts the entries by value instead of by key. The result is an OrderedDict; sort! performs the same sort in place on an OrderedDict.
using OrderedCollections
list1 = [2,1,3,4,5]
list2 = [9,10,8,7,6]
dictionary1 = Dict(zip(list1,list2))
Sort by value (i.e. by list2):
sort(dictionary1; byvalue=true)
Output:
OrderedDict{Int64, Int64} with 5 entries:
5 => 6
4 => 7
3 => 8
2 => 9
1 => 10
Sort by key (i.e. by list1):
sort(dictionary1)
Output:
OrderedDict{Int64, Int64} with 5 entries:
1 => 10
2 => 9
3 => 8
4 => 7
5 => 6

Related

Julia - How to sort array and obtain the indexes

I have two arrays: one contains weights and the other contains categories (e.g. w = [3, 4, 1, 2], x = ["a", "b", "c", "c"]). Now I'd like to sort the array x using the array of weights. How can one do this with the least amount of code? Is there a way of sorting an array and obtaining the corresponding indexes, so that the new sorted order can be applied to any other array of the same size?
I know that one can do this using DataFrames, but I'm looking for a way of doing this without resorting to that.
You want the sortperm function.
w = [30, 40, 10, 20]
x = ["a","b","c","d"]
julia> permvec = sortperm(w)
4-element Array{Int64,1}:
3
4
1
2
julia> wsorted = w[permvec]
4-element Vector{Int64}:
10
20
30
40
julia> xsorted = x[permvec]
4-element Array{String,1}:
"c"
"d"
"a"
"b"

Data structure to handle numerous queries on large size array

You are given q queries on a list, each of one of the following forms:
1 x y: add the number x to the list y times.
2 n: find the nth number of the sorted list.
constraints
1 <= q <= 5 * 100000
1 <= x, y <= 1000000000
1 <= n < length of list
sample.
input
4
1 3 6
1 5 2
2 7
2 4
output
5
3
This is a competitive programming problem that it's too early in the morning for me to solve right now, but I can try and give some pointers.
If you were to store the entire array explicitly, it would obviously blow out your memory. But you can exploit the structure of the array to instead store the number of times each entry appears in the array. So if you got the query
1 3 5
then instead of storing [3, 3, 3, 3, 3], you'd store the pair (3, 5), indicating that the number 3 is in the list 5 times.
You can pretty easily build this, perhaps as a vector of pairs of ints that you update.
The remaining task is to implement the 2 query, where you find an element by its index. A side effect of the structure we've chosen is that you can't directly index into that vector of pairs of ints, since the indices in that list don't match up with the indices into the hypothetical array. We could keep the vector sorted by value and add up the count of each entry from the start until we hit the index we want, but that's O(n) per lookup and O(n^2) overall in the number of queries processed so far, which is likely too slow. Instead, we probably want some updatable data structure for prefix sums, perhaps as described in this answer.
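A minimal Python sketch of this idea, under one stated assumption: all queries can be read up front (offline), so the distinct x values can be coordinate-compressed before any query is processed. It keeps a Fenwick (binary indexed) tree of counts over those values, which is one possible updatable prefix-sum structure and not necessarily the one the linked answer describes. The function name answer_queries and the tuple encoding of the queries are illustrative choices, not part of the problem statement.

```python
from bisect import bisect_left


def answer_queries(queries):
    """queries: list of (1, x, y) or (2, n) tuples. Returns answers to the type-2 queries."""
    # Assumption: queries are known in advance, so we can compress the inserted values.
    values = sorted({q[1] for q in queries if q[0] == 1})
    size = len(values)
    tree = [0] * (size + 1)  # 1-based Fenwick tree; holds counts per compressed value

    def add(pos, amount):
        # Add `amount` copies at compressed position `pos`.
        while pos <= size:
            tree[pos] += amount
            pos += pos & -pos

    def kth(k):
        # Binary descent: smallest position whose prefix sum of counts is >= k.
        pos = 0
        step = 1 << size.bit_length()
        while step:
            nxt = pos + step
            if nxt <= size and tree[nxt] < k:
                pos = nxt
                k -= tree[nxt]
            step >>= 1
        return pos + 1

    answers = []
    for q in queries:
        if q[0] == 1:
            _, x, y = q
            add(bisect_left(values, x) + 1, y)
        else:
            # n is treated as 1-based, matching the sample.
            answers.append(values[kth(q[1]) - 1])
    return answers


# The sample from the question.
print(answer_queries([(1, 3, 6), (1, 5, 2), (2, 7), (2, 4)]))  # [5, 3]
```

Both the update and the lookup take O(log m) time, where m is the number of distinct inserted values, so the whole run is O(q log q) rather than quadratic.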

How can I reverse-sort a list in Red?

I've been playing with Red, and I figured out how to sort a list:
--== Red 0.5.1 ==--
Type HELP for starting information.
red>> list: [1 9 6 8]
== [1 9 6 8]
red>> sort list
== [1 6 8 9]
I'd like to sort this list backwards. How can I do this? I've tried various combinations:
red>> sort !list
*** Script error: !list has no value
*** Where: sort
red>> !sort list
*** Script error: !sort has no value
*** Where: try
red>> sort reverse list
== [1 6 8 9]
red>> sort list reverse
*** Script error: reverse is missing its series argument
*** Where: reverse
SORT has a /reverse refinement, which enables you to achieve what you want:
red>> sort/reverse [1 9 6 8]
== [9 8 6 1]
Also be aware that SORT modifies its argument.
You can find out more about how SORT (or any other function) works, by using the integrated help system:
red>> help sort
USAGE:
sort series /case /skip size /compare comparator /part length /all /reverse /stable
DESCRIPTION:
Sorts a series (modified); default sort order is ascending.
sort is of type: action!
ARGUMENTS:
series [series!]
REFINEMENTS:
/case => Perform a case-sensitive sort.
/skip => Treat the series as fixed size records.
size [integer!]
/compare => Comparator offset, block or function.
comparator [integer! block! any-function!]
/part => Sort only part of a series.
length [number! series!]
/all => Compare all fields.
/reverse => Reverse sort order.
/stable => Stable sorting.
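Alternatively, sort the list and then reverse the result:
red>> reverse sort list
== [9 8 6 1]
Red evaluates a function's arguments before applying it, so sort list runs first and its result is passed to reverse. Parentheses make the order explicit:
red>> reverse (sort list)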

Insert item into a sorted list with Julia (with and without duplicates)

Main Question: What is the fastest way to insert an item into a list that is already sorted using Julia?
Currently, I do this:
v = [1, 2, 3, 5] #example list
x = 4 #value to insert
index = searchsortedfirst(v, x) #find index at which to insert x
insert!(v, index, x) #insert x at index
Bonus Question: What if I want to simultaneously ensure no duplicates?
You can use searchsorted to get the range of indices where the value occurs instead of just the first one and then use splice! to replace the values in that range with a new set of values:
insert_and_dedup!(v::Vector, x) = (splice!(v, searchsorted(v,x), [x]); v)
That's a nice little one-liner that does what you want.
julia> v = [1, 2, 3, 3, 5];
julia> insert_and_dedup!(v, 4)
6-element Array{Int64,1}:
1
2
3
3
4
5
julia> insert_and_dedup!(v, 3)
5-element Array{Int64,1}:
1
2
3
4
5
This made me think that splice! should handle the case where the replacement is a single value rather than an array, so I may add that feature.

How do I delete the intersection of sets A and B from A without sorting in MATLAB?

Two matrices, A and B:
A = [1 2 3
9 7 5
4 9 4
1 4 7]
B = [1 2 3
1 4 7]
All rows of matrix B are members of matrix A. I wish to delete the common rows of A and B from A without sorting.
I have tried setdiff() but this sorts the output.
For my particular problem (atomic coordinates in protein structures) maintaining the ordered integrity of the rows is important.
Use ISMEMBER:
%# find rows in A that are also in B
commonRows = ismember(A,B,'rows');
%# remove those rows
A(commonRows,:) = [];
I needed the difference between two arrays without sorting the data, and found this option in the MATLAB documentation for setdiff.
The relevant syntax is [C,ia] = setdiff(___,setOrder).
To keep the output in its original order, pass 'stable'; pass 'sorted' (or omit the argument) to get sorted output.
Here was my use case:
yDataSent = setdiff(ScopeDataY, yDataBefore, 'stable')
yDataBefore = ScopeDataY;
