Julia: Self-referential and recursive types - data-structures

What I am trying to do is not very straightforward; maybe it is easier if I start with the result and then explain how I am trying to get there.
I have a struct with two fields:
struct data{T}
    point::T
    mat::Array
end
What I would like to do is nest this and make the field mat self-referential to get something like this:
data{data{Int64}}(data{Int64}(1, [1]), [1])
The 'outer' type should not store [1] itself but a reference to the innermost mat; the point is to avoid storing the same (potentially large) array repeatedly. I am not sure if this makes sense or is even possible.
I have tried something like this (n is the number of nested types):
struct data{T}
    point::T
    g::Array
    function D(f, g, n)
        for i = 1:n
            (x = new{T}(f, g); x.f = x)
        end
    end
end
Again I am not sure if I understand self-referential constructors enough, or if this is possible. Any help/clarification would be appreciated, thanks!

The exact pattern will depend on what you want to achieve but here is one example:
struct Data{V, A <: AbstractArray{V}, T}
    mat::A
    point::T
    Data(mat::A, point::T = nothing) where {V, A <: AbstractArray{V}, T} =
        new{V,A,T}(mat,point)
end
Usage
julia> d0 = Data([1,2,3])
Data{Int64,Array{Int64,1},Nothing}([1, 2, 3], nothing)
julia> d1 = Data([1.0,2.0],d0)
Data{Float64,Array{Float64,1},Data{Int64,Array{Int64,1},Nothing}}([1.0, 2.0], Data{Int64,Array{Int64,1},Nothing}([1, 2, 3], nothing))
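Accessing the nested data then reads naturally, e.g. (a quick usage sketch):
d1.point.mat   # the inner array: [1, 2, 3]
d1.mat         # the outer array: [1.0, 2.0]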
Tips:
Never use untyped containers. Hence, when you want to store an Array you need to have its full type in your struct definition.
Use names starting with a capital letter for structs.
Provide constructors to keep your API readable.
Last but not least: if you want several nesting levels in such a structure, compile times will increase hugely. In that case it is usually better to use homogeneous types; in such scenarios you could perhaps use type Unions instead (unions of a small number of types are fast in Julia). A sketch of that idea follows.
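For illustration, here is a minimal homogeneous sketch (FlatData is a hypothetical name, not part of the answer above): every nesting level has the same concrete type, and point holds either another FlatData or nothing.
struct FlatData{A <: AbstractArray}
    mat::A
    point::Union{FlatData{A}, Nothing}   # same concrete type at every level
end
FlatData(mat::A) where {A <: AbstractArray} = FlatData{A}(mat, nothing)
FlatData(mat::A, inner::FlatData{A}) where {A <: AbstractArray} = FlatData{A}(mat, inner)

d0 = FlatData([1, 2, 3])
d1 = FlatData([4, 5, 6], d0)   # both d0 and d1 have type FlatData{Vector{Int64}}
With this layout deep nesting does not change the type, so compile times stay flat; the trade-off is a small Union field instead of carrying the nesting depth in the type parameters.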

Based on your description, the data seems like a very general wrapper. Maybe you can try something like this:
mutable struct Wrap{T}
    w::Wrap{T}
    d::T
    function Wrap(d::T) where T
        w = new{T}()
        w.d = d
        w
    end
end
function Wrap(d, n::Int)
    res = Wrap(d)
    cur = res
    for _ in 1:n-1
        cur.w = Wrap(d)
        cur = cur.w
    end
    res
end
Wrap([1], 4)
# Wrap{Array{Int64,1}}(Wrap{Array{Int64,1}}(Wrap{Array{Int64,1}}(Wrap{Array{Int64,1}}(#undef, [1]), [1]), [1]), [1])
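Note that the innermost node's w field is left uninitialized (the #undef in the output above). If you later need to walk such a chain, you can detect the unset field with isdefined; a small sketch (depth is a hypothetical helper, not part of the answer):
function depth(x::Wrap)
    n = 1
    while isdefined(x, :w)   # stop at the node whose `w` was never assigned
        x = x.w
        n += 1
    end
    n
end
depth(Wrap([1], 4))  # 4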

Related

Sorting array of struct in Julia

Suppose I have the following in Julia:
mutable struct emptys
    begin_time::Dict{Float64,Float64}
    finish_time::Dict{Float64,Float64}
    Revenue::Float64
end
population = [emptys(Dict(), Dict(), -Inf) for i in 1:n_pop] # n_pop is a large positive integer value.
for ind in 1:n_pop
    r = rand()
    population[ind].Revenue = r
    population[ind].begin_time[r] = cld(r^2, rand())
    population[ind].finish_time[r] = r^3 / rand()
end
Now I want to sort this population based on the Revenue value. Is there any way for Julia to achieve this? If I were to do it in Python it would be something like this:
sorted(population, key = lambda x: x.Revenue) # In Python, the population can be prepared using the https://pypi.org/project/ypstruct/ library.
Please help.
There is a whole range of sorting functions in Julia. The key functions are sort (corresponding to Python's sorted) and sort! (corresponding to Python's list.sort).
And as in Python, they have a couple of keyword arguments, one of which is by, corresponding to key.
Hence the translation of
sorted(population, key = lambda x: x.Revenue)
would be
getrevenue(e::emptys) = e.Revenue
sort(population, by=getrevenue)
Or e -> e.Revenue, but having a getter function is good style anyway.
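If you want to sort in place, or in descending order of revenue, sort! takes the same keyword arguments; for example:
sort!(population, by=getrevenue, rev=true)   # in-place, highest Revenue first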

Julia type instability: Array of LinearInterpolations

I am trying to improve the performance of my code by removing any sources of type instability.
For example, I have several instances of Array{Any} declarations, which I know generally destroy performance. Here is a minimal example (greatly simplified compared to my code) of a 2D Array of LinearInterpolation objects, i.e.:
using Interpolations   # LinearInterpolation comes from Interpolations.jl

n, m = 5, 5
abstract_arr = Array{Any}(undef, n+1, m+1)
arr_x = LinRange(1, 10, 100)
for l in 1:n
    for alpha in 1:m
        abstract_arr[l, alpha] = LinearInterpolation(arr_x, alpha .* arr_x .^ n)
    end
end
so that typeof(abstract_arr) gives Array{Any,2}.
How can I initialize abstract_arr to avoid using Array{Any} here?
And how can I do this in general for Arrays whose entries are structures like Dicts() where the Dicts() are dictionaries of 2-tuples of Float64?
If you make a comprehension, the type will be figured out for you:
arr = [LinearInterpolation(arr_x, alpha .* arr_x .^ n) for l in 1:n, alpha in 1:m]
isconcretetype(eltype(arr)) # true
When it can predict the type & length, it will make the right array the first time. When it cannot, it will widen or extend it as necessary. So probably some of these will be Vector{Int}, and some Vector{Union{Nothing, Int}}:
[rand()>0.8 ? nothing : 0 for i in 1:3]
[rand()>0.8 ? nothing : 0 for i in 1:3]
[rand()>0.8 ? nothing : 0 for i in 1:10]
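The same idea covers the second part of the question about arrays of Dicts: give the comprehension a concrete Dict type. A minimal sketch, assuming (as a guess at the intended layout) Float64 keys and 2-tuples of Float64 as values:
dict_arr = [Dict{Float64, NTuple{2, Float64}}() for l in 1:n, alpha in 1:m]
isconcretetype(eltype(dict_arr)) # true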
The main trick is that you just need to know the type of the object returned by LinearInterpolation, and then you can specify that instead of Any when constructing the array. To determine it, let's look at the typeof of one of these objects:
julia> typeof(LinearInterpolation(arr_x,arr_x.^2))
Interpolations.Extrapolation{Float64, 1, ScaledInterpolation{Float64, 1, Interpolations.BSplineInterpolation{Float64, 1, Vector{Float64}, BSpline{Linear{Throw{OnGrid}}}, Tuple{Base.OneTo{Int64}}}, BSpline{Linear{Throw{OnGrid}}}, Tuple{LinRange{Float64}}}, BSpline{Linear{Throw{OnGrid}}}, Throw{Nothing}}
This gives a fairly complicated type, but we don't necessarily need to use the whole thing (though in some cases it might be more efficient to). So for instance, we can say
using Interpolations
n, m = 5, 5
abstract_arr = Array{Interpolations.Extrapolation}(undef, n+1, m+1)
arr_x = LinRange(1, 10, 100)
for l in 1:n
    for alpha in 1:m
        abstract_arr[l, alpha] = LinearInterpolation(arr_x, alpha .* arr_x .^ n)
    end
end
which gives us a result of type
julia> typeof(abstract_arr)
Matrix{Interpolations.Extrapolation} (alias for Array{Interpolations.Extrapolation, 2})
Since the return type of this LinearInterpolation does not seem to be of known size, and
julia> isbitstype(typeof(LinearInterpolation(arr_x,arr_x.^2)))
false
each assignment to this array will still trigger allocations, and consequently there actually may not be much or any performance gain from the added type stability when it comes to filling the array. Nonetheless, there may still be performance gains down the line when it comes to using values stored in this array (depending on what is subsequently done with them).
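If you do want the fully concrete element type printed above, one option (a sketch with hypothetical variable names, assuming every entry is built from the same argument types) is to construct one representative object first and parameterize the array on its typeof:
itp = LinearInterpolation(arr_x, arr_x .^ 2)          # one representative object
concrete_arr = Matrix{typeof(itp)}(undef, n+1, m+1)   # fully concrete element type
This recovers full type information for downstream code, at the cost of a much less readable array type.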

Does an equivalent function in OCaml exist that works the same way as "set!" in Scheme?

I'm trying to make a function that defines a vector that varies based on the function's input, and set! works great for this in Scheme. Is there a functional equivalent for this in OCaml?
I agree with sepp2k that you should expand your question, and give more detailed examples.
Maybe what you need are references.
As a rough approximation, you can see them as variables to which you can assign:
let a = ref 5;;
!a;; (* This evaluates to 5 *)
a := 42;;
!a;; (* This evaluates to 42 *)
Here is a more detailed explanation from http://caml.inria.fr/pub/docs/u3-ocaml/ocaml-core.html:
The language we have described so far is purely functional. That is, several evaluations of the same expression will always produce the same answer. This prevents, for instance, the implementation of a counter whose interface is a single function next : unit -> int that increments the counter and returns its new value. Repeated invocation of this function should return a sequence of consecutive integers — a different answer each time.
Indeed, the counter needs to memorize its state in some particular location, with read/write accesses, but before all, some information must be shared between two calls to next. The solution is to use mutable storage and interact with the store by so-called side effects.
In OCaml, the counter could be defined as follows:
let new_count =
  let r = ref 0 in
  let next () = r := !r + 1; !r in
  next;;
Another, maybe more concrete, example of mutable storage is a bank account. In OCaml, record fields can be declared mutable, so that new values can be assigned to them later. Hence, a bank account could be a two-field record, its number, and its balance, where the balance is mutable.
type account = { number : int; mutable balance : float }
let retrieve account requested =
  let s = min account.balance requested in
  account.balance <- account.balance -. s; s;;
In fact, in OCaml, references are not primitive: they are special cases of mutable records. For instance, one could define:
type 'a ref = { mutable content : 'a }
let ref x = { content = x }
let deref r = r.content
let assign r x = r.content <- x; x
set! in Scheme assigns to a variable. You cannot assign to a variable in OCaml, at all. (So "variables" are not really "variable".) So there is no equivalent.
But OCaml is not a pure functional language. It has mutable data structures. The following things can be assigned to:
Array elements
String elements
Mutable fields of records
Mutable fields of objects
In these situations, the <- syntax is used for assignment.
The ref type mentioned by #jrouquie is a simple, built-in mutable record type that acts as a mutable container of one thing. OCaml also provides ! and := operators for working with refs.

Erlang binary matching efficiency

What would be the difference between matching like:
fun(Binary) ->
    [Value, Rest] = binary:split(Binary, <<101>>)
end
and
fun(Binary) ->
    [Value, <<Rest/binary>>] = binary:split(Binary, <<101>>)
end
I am thinking that one may simply increment a counter as it traverses the binary and keep the sub binary pointer and the other will copy a new binary. Any ideas?
I can think of pattern matching in two ways.
Method 1:
[A,B] = [<<"abcd">>,<<"fghi">>]
Method 2:
[A, <<B/binary>>] = [<<"abcd">>,<<"fghi">>]
Unless you need to make sure B is a binary, Method 2 will take slightly longer (a few microseconds), because it is not just assigning <<"fghi">> to B, it also checks that it is a binary.
However, if you need to parse further, Method 2 can go deeper in a way Method 1 cannot:
[A, <<B:8, Rest/binary>>] = [<<"abcd">>,<<"fghi">>].
I think you could measure the difference with the timer module's tc/N functions.

Populating a list in Scala with random doubles is taking forever

I am new to Scala and am trying to get a list of random double values:
The thing is, when I try to run this, it takes way too long compared to its Java counterpart. Any ideas on why this is or a suggestion on a more efficient approach?
def random: Double = java.lang.Math.random()
var f = List(0.0)
for (i <- 1 to 200000)
  f = f ::: List(random*100)
f = f.tail
You can also achieve it like this:
List.fill(200000)(math.random)
the same goes for e.g. Array ...
Array.fill(200000)(math.random)
etc ...
You could construct an infinite stream of random doubles:
def randomList(): Stream[Double] = Stream.cons(math.random, randomList)
val f = randomList().take(200000)
This will leverage lazy evaluation so you won't calculate a value until you actually need it. Even evaluating all 200,000 will be fast though. As an added bonus, f no longer needs to be a var.
Another possibility is:
val it = Iterator.continually(math.random)
it.take(200000).toList
Stream also has a continually method if you prefer.
First of all, it is not taking longer than java because there is no java counterpart. Java does not have an immutable list. If it did, performance would be about the same.
Second, it's taking a lot of time because appending to a list has linear cost, so the whole thing has quadratic cost.
Instead of appending, prepend, which has constant cost.
If you're using mutable state anyway, you should use a mutable collection like a Buffer (e.g. ListBuffer), which you can add to with += (that would be the real counterpart of the Java code).
But why don't you use a for-comprehension?
val f = for (_ <- 1 to 200000) yield (math.random * 100)
By the way: var f = List(0.0) ... f = f.tail can be replaced by var f: List[Double] = Nil in your example (no performance gain, but nicer ;).
Yet more options! Tail recursion:
def randlist(n: Int, part: List[Double] = Nil): List[Double] = {
  if (n <= 0) part
  else randlist(n-1, 100*random :: part)
}
or mapped ranges:
(1 to 200000).map(_ => 100*random).toList
Looks like you want to use Vector instead of List. List has O(1) prepend, while Vector has (effectively) O(1) append. Since you are appending (via concatenation), it will be faster to use Vector:
def random: Double = java.lang.Math.random()
var f: Vector[Double] = Vector()
for (i <- 1 to 200000)
  f = f :+ (random*100)
Got it?
