Why does Julia use column-major order? Is it for performance? - performance

I have read many articles on Julia and its performance, but nowhere can I find a clue about why the Julia team decided to use column-major order for matrix operations. Is it because their way of operating on matrices fits column-major, or something like that?
Thanks in advance.

"Multidimensional arrays in Julia are stored in column-major order. This means that arrays are stacked one column at a time. This can be verified using the vec function or the syntax [:] ..."
"This convention for ordering arrays is common in many languages like Fortran, Matlab, and R (to name a few). The alternative to column-major ordering is row-major ordering, which is the convention adopted by C and Python (numpy) among other languages."
For examples and discussion of performance differences, see the Performance Tips section of Julia's Manual.
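A quick way to see the practical consequence is loop order. In the sketch below (the function names are just for illustration), the column-friendly version walks down each column and touches consecutive memory, while the swapped loop strides across columns and is typically noticeably slower on large matrices:
function sum_by_columns(A)
    s = zero(eltype(A))
    for j in axes(A, 2)      # outer loop over columns
        for i in axes(A, 1)  # inner loop walks down a column (contiguous memory)
            s += A[i, j]
        end
    end
    return s
end

function sum_by_rows(A)
    s = zero(eltype(A))
    for i in axes(A, 1)
        for j in axes(A, 2)  # inner loop jumps a whole column's length each step
            s += A[i, j]
        end
    end
    return s
end

A = rand(4000, 4000)
@time sum_by_columns(A)  # usually clearly faster than the version below
@time sum_by_rows(A)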

Related

Tuple vs. StaticVectors in Julia

If I understand correctly, since tuples are immutable in Julia, they must also be stack allocated (similar to StaticVectors). So there should not be any advantage to using StaticVectors in place of tuples when I am dealing with small vectors, say a length-3 vector for the coordinates of a particle. Can someone highlight the advantages of using StaticVectors in such cases? And more broadly, what are the use cases where I would want to choose one over the other?
Thanks.
The raw performance is similar, since StaticArrays are built on tuples. The point of StaticArrays is all the functionality, the linear algebra, the solvers, sorting, the mutable arrays, etc.
Tuples are a barebones data collection with barely any mathematical structure. That's fine as far as it goes, but StaticArrays has done most of the work you would have to do yourself with tuples.
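To make that concrete, here is a small sketch (assuming the StaticArrays.jl package is installed): the tuple stores the three coordinates but has no arithmetic defined on it, while an SVector of the same size gets the whole array and linear-algebra interface.
using StaticArrays, LinearAlgebra

p = (1.0, 2.0, 3.0)              # plain tuple: immutable and stack-friendly,
                                 # but p + p or norm(p) are not defined

v = SVector(1.0, 2.0, 3.0)       # also immutable and stack-friendly
w = 2v + SVector(0.0, 1.0, 0.0)  # elementwise arithmetic just works
d = dot(v, w)                    # and so does linear algebra,
n = norm(v)                      # norms, sorting, small solvers, ...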

Is there a canonical/performant way to reduce arrays/matrices by removing the border values?

A motivating issue, implemented in Matlab:
N = 1000;
R = zeros(2*N);
for i=0:N-1
    R = R(2:end-1, 2:end-1);
end
For this code, timeit() gives a time of 2.9793 seconds on my machine. It isn't really great.
By a canonical way I mean not merely an acceptable approach, but a performant implementation that holds up when very large matrices are reduced. I would be very appreciative of any answer, or referrals to other discussions or literature.
As for language, I am not really a programmer, this question is motivated by a mathematics inquiry and I have encountered performance issues implementing any such reduction process in Matlab. Is there a solution to this in Matlab, or must one delve into the scary depths of C/C++?
One note: one may ask, why not just keep the matrix as is and consider parts of it as needed? To clarify, the reduction process in practice of course depends on the actual (nonzero) values of the elements, e.g. by processing the matrix in 2x2 blocks, and the removal of edge values is needed to prepare the matrix for the next reduction step.
R(2:end-1, 2:end-1) is the correct way of extracting the part of the array that is all values except the ones at the edges. This requires copying the data, so it will take some time. There is no legal way around the copy, and no alternative syntax for extracting a part of the array. (subsref might seem like an alternative, but it is the function that is internally called for this syntax.)
As for illegal ways, you could try James Tursa's sharedchild from the MATLAB File Exchange. It allows you to create an array that references a subset of the data of another array. James is well known in the MATLAB user community as one of the people reverse-engineering the system and bending it to his will, and this is solid code. But every version of MATLAB introduces new changes to the infrastructure, so upgrading MATLAB might break your program if you use this code.
You don't need the for loop. If you want to remove L elements from the borders, simply do:
R=R(L+1:end-L, L+1:end-L)
I am surprised you didn't get an error with that code. Each pass through the loop removes two rows and two columns, so after N iterations the 2N-by-2N matrix has shrunk to 0-by-0; I think you should end up with an empty matrix at the end of the loop.
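For comparison with the Julia questions above: in Julia the interior of the array can be referenced without copying via a view, which is essentially what sharedchild emulates in MATLAB. A minimal sketch (not a MATLAB answer):
N = 1000
R = zeros(2N, 2N)
V = @view R[2:end-1, 2:end-1]   # no copy: V just references the interior of R
V[1, 1] = 42.0                  # writes through to R[2, 2]
size(V)                         # (1998, 1998)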

Automatic probability densities

I have found automatic differentiation to be extremely useful when writing mathematical software. I now have to work with random variables and functions of the random variables, and it seems to me that an approach similar to automatic differentiation could be used for this, too.
The idea is to start with a basic random vector with a given multivariate distribution, and then to work with the implied probability distributions of functions of the components of that random vector. You would define operators that automatically combine two probability distributions appropriately when you add, multiply, or divide two random variables, and that transform a distribution appropriately when you apply a scalar function such as exponentiation. You could then combine these to build any function you need of the original random variables and automatically have the corresponding probability distribution available.
Does this sound feasible? If not, why not? If so, and since it's not a particularly original thought, could someone point me to an existing implementation, preferably in C?
There has been a lot of work on probabilistic programming. One issue is that as your distribution gets more complicated you start needing more complex techniques to sample from it.
There are a number of ways this is done. Probabilistic graphical models give one vocabulary for expressing these models, and you can then sample from them using various Metropolis-Hastings-style methods. Here is a crash course.
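To give a flavour of what a Metropolis-Hastings-style sampler looks like, here is a minimal random-walk sketch in Julia (the target density, step size, and sample count are illustrative placeholders, not taken from the linked material):
# logp: log of the (possibly unnormalized) target density; x0: starting point.
function metropolis(logp, x0; nsamples = 10_000, step = 0.5)
    x, lp = x0, logp(x0)
    samples = Vector{typeof(x0)}(undef, nsamples)
    for k in 1:nsamples
        xprop = x + step * randn()       # symmetric random-walk proposal
        lpprop = logp(xprop)
        if log(rand()) < lpprop - lp     # accept with probability min(1, ratio)
            x, lp = xprop, lpprop
        end
        samples[k] = x                   # on rejection, keep the old state
    end
    return samples
end

# Example: sample from a standard normal given only its log-density.
s = metropolis(x -> -x^2 / 2, 0.0)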
Another approach is probabilistic programming, which can be done directly through an embedded domain-specific language. Oleg Kiselyov's HANSEI is an example of this approach. Once they have the program, they can inspect the tree of decisions and expand it out by a form of importance sampling to gain the most information possible at each step.
You may also want to read "Nonstandard Interpretations of Probabilistic Programs for Efficient Inference" by Wingate et al., which describes one way to use extra information about the derivative of your distribution to accelerate Metropolis-Hastings-style sampling techniques. I personally use automatic differentiation to calculate those derivatives, which brings the topic back to automatic differentiation. ;)
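As a rough illustration of the idea in the question, one can overload arithmetic on a random-variable type, much as automatic differentiation overloads arithmetic on dual numbers. The sketch below propagates an empirical (Monte Carlo) approximation rather than an exact density, and the type and function names are made up for the example, not an existing library:
# A random variable represented by Monte Carlo samples of its distribution.
struct RV
    samples::Vector{Float64}
end

# Combining two random variables combines their samples elementwise, so the
# result carries an empirical approximation of the implied distribution.
Base.:+(a::RV, b::RV) = RV(a.samples .+ b.samples)
Base.:*(a::RV, b::RV) = RV(a.samples .* b.samples)
Base.:/(a::RV, b::RV) = RV(a.samples ./ b.samples)
Base.exp(a::RV)       = RV(exp.(a.samples))

rv_mean(a::RV) = sum(a.samples) / length(a.samples)

# Start from a basic random vector (two independent normals here) ...
n = 100_000
x = RV(randn(n))
y = RV(2 .* randn(n) .+ 1)

# ... and any expression built from it automatically carries a distribution.
z = exp(x) + x * y
rv_mean(z)
An exact version of the questioner's idea would replace the sample arrays with density objects and the elementwise operations with convolutions and change-of-variable formulas, which is where the real work lies.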

What are the differences between genetic algorithms and evolution strategies?

I've read a couple of introductory sections of books as well as a few papers on both topics, and it looks to me that these two methods are pretty much exactly the same. That said, I haven't had the time to actually deeply research the topics yet, so I might be wrong.
What are the distinctions between genetic algorithms and evolution strategies? What makes them different, and where are they similar?
In evolution strategies, the individuals are coded as vectors of real numbers. On reproduction, parents are selected randomly, and the fittest offspring are selected and inserted into the next generation. ES individuals are self-adapting: the step size or "mutation strength" is encoded in the individual, so good parameters get to the next generation by selecting good individuals.
In genetic algorithms, the individuals are coded as integers. Selection is done by choosing parents proportionally to their fitness, so individuals must be evaluated before the first selection is done. Genetic operators work at the bit level (e.g. cutting a bit string into multiple pieces and interchanging them with the pieces of the other parent, or flipping single bits).
That's the theory. In practice, it is sometimes hard to distinguish between the two kinds of evolutionary algorithm, and you may need to create hybrid algorithms (e.g. integer (bit-string) individuals that encode the parameters of the genetic operators).
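A compressed sketch of the two flavours of variation operator described above, in Julia (the parameter choices, such as the learning rate tau, are illustrative rather than canonical):
# GA-style: a bit-string individual; crossover cuts the strings and swaps
# the pieces, mutation flips single bits.
mutate_ga(bits, pmut) = [rand() < pmut ? !b : b for b in bits]
function crossover_ga(a, b)
    cut = rand(1:length(a) - 1)
    vcat(a[1:cut], b[cut+1:end]), vcat(b[1:cut], a[cut+1:end])
end

# ES-style: a real vector y plus its own mutation strength sigma, which is
# itself mutated (self-adaptation) and passed on together with the genome.
function mutate_es(y, sigma; tau = 1 / sqrt(2 * length(y)))
    sigma_new = sigma * exp(tau * randn())        # log-normal step-size update
    y_new = y .+ sigma_new .* randn(length(y))
    return y_new, sigma_new
end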
Just stumbled on this thread when researching Evolution Strategies (ES).
As Paul noticed before, the encoding is not really the difference here, as this is an implementation detail of specific algorithms, although it seems more common in ES.
To answer the question, we first need to take a small step back and look at the internals of an ES algorithm.
In ES there is a concept of endogenous and exogenous parameters of the evolution. Endogenous parameters are associated with individuals and are therefore evolved together with them; exogenous parameters are provided from "outside" (e.g. set constant by the developer, or there can be a function/policy which sets their value depending on the iteration number).
The individual k therefore consists of two parts:
y(k) - a set of object parameters (e.g. a vector of real/int values) which denotes the individual's genotype
s(k) - a set of strategy parameters (e.g. again a vector of real/int values) which can, for example, control statistical properties of the mutation
Those two vectors are being selected, mutated, recombined together.
The main difference between GA and ES is that in classic GA there is no distinction between types of algorithm parameters; in fact, all the parameters are set from "outside", so in ES terms they are exogenous.
There are also other, minor differences: e.g. in ES the selection policy is usually one and the same, while in GA there are multiple different approaches which can be interchanged.
You can find a more detailed explanation here (see Chapter 3): Evolution strategies. A comprehensive introduction
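In code, the split between endogenous and exogenous parameters might look like the following Julia sketch (the field names, the (mu, lambda) selection, and all constants are illustrative choices, not taken from the cited introduction):
# Endogenous parameters live inside the individual and evolve with it.
struct Individual
    y::Vector{Float64}   # object parameters: the genotype itself
    s::Vector{Float64}   # strategy parameters, e.g. per-coordinate step sizes
end

# The strategy parameters are mutated first and then drive the mutation of y.
function mutate(p::Individual)
    s = p.s .* exp.(0.1 .* randn(length(p.s)))
    Individual(p.y .+ s .* randn(length(p.y)), s)
end

# Exogenous parameters (mu, lambda, generations) are supplied from outside.
function evolve(population, fitness; mu = 5, lambda = 35, generations = 100)
    for _ in 1:generations
        offspring = [mutate(rand(population)) for _ in 1:lambda]
        population = sort(offspring; by = fitness)[1:mu]   # keep the mu best (minimizing)
    end
    return population
end

# Example: minimize the sphere function.
pop0 = [Individual(randn(3), fill(0.5, 3)) for _ in 1:5]
best = evolve(pop0, ind -> sum(abs2, ind.y))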
In most newer textbooks on GA, real-valued coding is introduced as an alternative to the integer one, i.e. individuals can be coded as vectors of real numbers. This is called continuous parameter GA (see e.g. Haupt & Haupt, "Practical Genetic Algorithms", J.Wiley&Sons, 1998). So this is practically identical to ES real number coding.
With respect to parent selection, there are many different strategies published for GAs. I don't know them all, but I assume selection among all (not only the best) has been used for some applications.
The main difference seems to be that a genetic algorithm represents a solution using a sequence of integers, whereas an evolution strategy uses a sequence of real numbers -- reference: http://en.wikipedia.org/wiki/Evolutionary_algorithm#
As the Wikipedia source (http://en.wikipedia.org/wiki/Genetic_algorithm) and Vaughn Cato said, the difference between the two techniques lies in the implementation: ES uses real numbers and GA uses integers.
However, in practice I think you could use integers or real numbers in the formulation of your problem and in your program; it depends on you. For instance, for protein folding you can say the set of dihedral angles forms a vector. This is a vector of real numbers, but the entries are labeled by integers, so I think you can formulate your problem and write your program based on integer arithmetic. It is just an idea.

How to speed up 'sort' function in Matlab?

I was using the built-in sort function of Matlab:
[temp, Idx] = sort(M,2);
I would like to have the sorted index of each row of M, which is a matrix of size > 50k.
I searched hard but did not find anything. It would be greatly appreciated if you have any comments!
To get a sense of how much room for improvement you have, I would suggest writing a test program in C using qsort, or in C++ using std::sort, and carefully timing it on 7000 inputs of size 7000 (or whatever setup you have in Matlab).
I'm going to give you my estimate: Matlab's sort probably runs (on properly vectorized code, like yours) about as fast as C++, and you're just seeing the effect of running an algorithm that takes O(n^2 log n): sorting each of the n rows of an n-by-n matrix costs O(n log n), so the whole call costs O(n^2 log n). It is reported in Matlab's marketing material that its sort function was faster than C's qsort, but take it with a grain of salt.
The best way to speed up that sort is to get a faster computer. It will speed everything else up too. :)
The fact is, you can rarely speed up a single call to something like a sort. MATLAB is already doing that in an efficient manner, using optimized code internally. (Reread the carlosdc answer.) The things you can sometimes get a boost on are tools that are written in MATLAB itself.
So, what can you do? Short of buying that new computer, you can look at your overall code. One single sort of that size is never that big of a problem, but doing that sort over and over again is. Think carefully about the code, about whether you can change the flow or avoid a sort that is repeated many times. An algorithm change is often a FAR bigger source of improvement than the wee bit you would ever get even if you could improve that sort.
Sorting is fundamentally O(n log n).
As long as you have a reasonably efficient implementation, this is unlikely to change much.
That said, as Andrew Janke's comment suggests, multi-threading can improve things dramatically.
GPU programming can be a way to get massive speedups. If you have R2010b or later, you may be able to use accelerated versions of built-in functions like sort from MathWorks.
Otherwise, write a mex wrapper around the CUDA Thrust library which includes a sort.
You could write your own sort function in C/C++ as MEX. MATLAB documentation has examples for it.
There exist many sort algorithms which are better than others in edge cases, for example for almost-sorted data, or with respect to stability (which does not matter in MATLAB because all its types are value types).
Is your data numeric or strings? For strings there are probably special algorithms for ASCII sort; sometimes natural sort is preferable.
