I'm trying to play a little bit with Knet.jl and CNNs. Every example I found requires the input for the CNN to be in the form [dim1, dim2, n_of_channels, N], where N is the number of actual images.
I'm a bit new to Julia and I don't know how to accomplish this.
I loaded images from a private directory and pushed them to a vector, so that its length is N.
images = Vector()
for img_file in readdir(dir)
    img = load("$dir/$img_file")
    push!(images, img)
end
summary(images)
"320-element Array{Any,1}"
However, in the following example, xtrn is stored as a 28x28x1x60000 Array, and that is what I would like to accomplish with my private dataset.
using Knet; include(Knet.dir("data","mnist.jl"))
xtrn,ytrn,_,_= mnist()
typeof(xtrn)
Array{Float32,4}
I'm aware of functions such as channelview and reshape, and it seems they should provide a solution, but I played with them a bit and got a DimensionMismatch error every time. I guess there's something I'm missing.
I don't have the files you are using in your example. But I would use cat in conjunction with a generator. Here's an example of something you can do:
julia> reduce((x,y)->cat(x, y, dims=4), rand(3,3) for _ in 1:3)
3×3×1×3 Array{Float64,4}:
[:, :, 1, 1] =
0.366818 0.847529 0.209042
0.281807 0.467918 0.68881
0.179162 0.222919 0.348935
[:, :, 1, 2] =
0.0418451 0.256611 0.609398
0.65166 0.281397 0.340405
0.11109 0.387638 0.974488
[:, :, 1, 3] =
0.454959 0.37831 0.554323
0.213613 0.980773 0.743419
0.133154 0.782516 0.669733
In order to do this with your files, this might work (untested):
images = reduce((x,y)->cat(x, y, dims=4), load(joinpath(dir, img_file)) for img_file in readdir(dir))
BTW. You should not initialize vectors like this:
images = Vector()
This makes an untyped container, which will have very bad performance. Write e.g.
images = Matrix{Float32}[]
This initializes an empty vector of Matrix{Float32}s.
Just to complete DNF's answer: this code results in an Array of the form [dim1, dim2, 1, N]:
images = reduce((x,y)->cat(x, y, dims=4), load(joinpath(dir, img_file)) for img_file in readdir(dir))
I wanted the 3rd dimension to be the channel dimension, and hence the expected output is produced by:
images = reduce((x, y) -> cat(x, y, dims=4), permutedims(channelview(load(joinpath(dir, img_file))), (2, 3, 1)) for img_file in readdir(dir))
I've been trying to figure out a way to generate all distinct size-n partitions of a multiset, but so far have come up empty-handed. First let me show what I'm trying to achieve.
Let's say we have an input vector of uint32_t:
std::vector<uint32_t> input = {1, 1, 2, 2};
And let's say we want to create all distinct size-2 partitions. There are only two of these, namely:
[[1, 1], [2, 2]], [[1, 2], [1, 2]]
Note that order does not matter, i.e. all of the following are duplicate, incorrect solutions.
Duplicate because the order within a group does not matter:
[[2, 1], [1, 2]]
Duplicate because order of groups does not matter:
[[2, 2], [1, 1]]
This is not homework of any kind, BTW. I encountered it while coding something at work, but by now it is out of personal interest that I'd like to know how to deal with this. The parameters for the work-related problem were small enough that generating a couple thousand duplicate solutions didn't really matter.
Current solution (generates duplicates)
In order to illustrate that I'm not just asking without having tried to come up with a solution, let me try to explain my current algorithm (which generates duplicate solutions when used with multisets).
It works as follows: the state has a bitset with n bits set to 1 for each partition block. The length of the bitset for partition block i (counting from zero) is size(input) - n * i, e.g. if the input vector has 8 elements and n = 2, then the first partition block uses an 8-bit bitset with 2 bits set to 1, the next partition block uses a 6-bit bitset with 2 bits set to 1, etc.
A partition is created from these bitsets by iterating over each bitset in order and extracting the elements of the input vector with indices equal to the position of 1-bits in the current bitset.
In order to generate the next partition, I iterate over the bitsets in reverse order. The next bitset permutation is calculated (using a reverse of Gosper's hack). If the first bit in the current bitset is not set (i.e. vector index 0 not selected), then that bitset is reset to its starting state. Enforcing that the first bit is always set prevents generating duplicates when creating size-n set partitions (duplicates of the 2nd kind shown above). If the current bitset is equal to its starting value, this step is then repeated for the previous (longer) bitset.
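For reference, the standard forward form of Gosper's hack looks like the snippet below (my bitsets step through these permutations in the reverse direction, but the trick is the same); this is the well-known bit-twiddling version, not my exact code:
#include <cstdint>
#include <cstdio>

// Lexicographically next bit permutation ("Gosper's hack"): given a word v with
// k bits set, return the next larger word that also has exactly k bits set.
// (__builtin_ctz is GCC/Clang-specific; other compilers need their own intrinsic.)
uint32_t next_bit_permutation(uint32_t v) {
    uint32_t t = v | (v - 1);   // set all bits below the lowest set bit
    return (t + 1) | (((~t & -~t) - 1) >> (__builtin_ctz(v) + 1));
}

int main() {
    // Walks the 2-bit subsets of a 4-bit word:
    // prints 3, 5, 6, 9, 10, 12 (i.e. 0011, 0101, 0110, 1001, 1010, 1100).
    for (uint32_t v = 3; v < (1u << 4); v = next_bit_permutation(v))
        std::printf("%u\n", v);
}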
This works great (and very fast) for sets. However, when used with multisets it generates duplicate solutions, since it is unaware that elements can appear more than once in the input vector.
std::vector<uint32_t> input = {1, 2, 3, 4};
printAllSolutions(myCurrentAlgo(input, 2));
=> [[2, 1], [4, 3]], [[3, 1], [4, 2]], [[4, 1], [3, 2]]
std::vector<uint32_t> input = {1, 1, 2, 2};
printAllSolutions(myCurrentAlgo(input, 2));
=> [[1, 1], [2, 2]], [[2, 1], [2, 1]], [[2, 1], [2, 1]]
That last (duplicate) solution is generated simply because the algorithm is unaware of duplicates in the input; it generates the exact same internal states (i.e. which indices to select) in both examples.
Wanted solution
I guess it's pretty clear by now what I'm trying to end up with. Just for the sake of completeness, it would look somewhat as follows:
std::vector<uint32_t> multiset = {1, 1, 2, 2};
MagicClass myGenerator(multiset, 2);
do {
    std::vector<std::vector<uint32_t> > nextSolution = myGenerator.getCurrent();
    std::cout << nextSolution << std::endl;
} while (myGenerator.calcNext());
=> [[1, 1], [2, 2]]
[[1, 2], [1, 2]]
I.e. the code would work somewhat like std::next_permutation, informing that it has generated all solutions and has ended back at the "first" solution (for whatever definition of first you want to use; probably lexicographic, but it doesn't need to be).
The closest related algorithm I found is Algorithm M from Knuth's The Art of Computer Programming, Volume 4 Part 1, section 7.2.1.5 (p. 430). However, that generates all possible multiset partitions. There is also an exercise in the book (7.2.1.5.69, solution on p. 778) about how to modify Alg. M in order to generate only solutions with at most r partitions. However, that still allows partitions of different sizes (e.g. [[1, 2, 2], [1]] would be a valid output for r = 2).
Any ideas/tricks/existing algorithms on how to go about this? Note that the solution should be efficient: keeping track of all previously generated solutions, figuring out whether the currently generated one is a permutation of an earlier one and, if so, skipping it, is infeasible because of the rate at which the solution space explodes for longer inputs with more duplicates.
A recursive algorithm to distribute the elements one-by-one could be based on a few simple rules:
Start by sorting or counting the different elements; they don't have to be in any particular order, you just want to group identical elements together. (This step will simplify some of the following steps, but could be skipped.)
{A,B,D,C,C,D,B,A,C} -> {A,A,B,B,D,D,C,C,C}
Start with an empty solution, and insert the elements one by one, using the following rules:
{ , , } { , , } { , , }
Before inserting an element, find the duplicate blocks, e.g.:
{A, , } { , , } { , , }
                 ^dup^
{A, , } {A, , } {A, , }
         ^dup^   ^dup^
Insert the element into every non-duplicate block with available space:
partial solution: {A, , } {A, , } { , , }
                           ^dup^
insert element B: {A,B, } {A, , } { , , }
                  {A, , } {A, , } {B, , }
If an identical element is already present, don't put the new element before it:
partial solution: {A, , } {B, , } { , , }
insert another B: {A,B, } {B, , } { , , } <- ILLEGAL
                  {A, , } {B,B, } { , , } <- OK
                  {A, , } {B, , } {B, , } <- OK
When inserting an element of which there are another N identical elements, make sure to leave N open spots after the current element:
partial solution: {A, , } {A, , } {B,B, }
insert first D:   {A,D, } {A, , } {B,B, } <- OK
                  {A, , } {A, , } {B,B,D} <- ILLEGAL (NO SPACE FOR 2ND D)
The last group of identical elements can be inserted in one go:
partial solution: {A,A, } {B,B,D} {D, , }
insert C,C,C:     {A,A,C} {B,B,D} {D,C,C}
So the algorithm would be something like this:
// PREPARATION
Sort or group input. // {A,B,D,C,C,D,B,A,C} -> {A,A,B,B,D,D,C,C,C}
Create empty partial solution. // { , , } { , , } { , , }
Start recursion with empty partial solution and index at start of input.
// RECURSION
Receive partial solution, index, group size and last-used block.
If group size is zero:
    Find group size of identical elements in input, starting at index.
    Set last-used block to first block.
Find empty places in partial solution, starting at last-used block.
If index is at last group in input:
    Fill empty spaces with elements of last group.
    Store complete solution.
    Return from recursion.
Mark duplicate blocks in partial solution.
For each block in partial solution, starting at last-used block:
    If current block is not a duplicate, and has empty places,
    and the places left in current and later blocks are not less than the group size:
        Insert element into copy of partial solution.
        Recurse with copy, index + 1, group size - 1, current block.
I tested a simple JavaScript implementation of this algorithm, and it gives the correct output.
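For readers who prefer C++ (the language of the question), here is a rough, untested sketch of the same recursion; the function and variable names are mine, so treat it as an illustration of the rules above rather than a reference implementation:
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <utility>
#include <vector>

using Block    = std::vector<uint32_t>;
using Solution = std::vector<Block>;

// `input` must be sorted (or at least grouped) so identical elements are adjacent.
// `groupLeft` is how many elements of the current group of identical values still
// have to be placed; `lastBlock` is the block the previous one of them went into.
void distribute(std::vector<uint32_t> const& input, std::size_t index,
                std::size_t groupLeft, std::size_t lastBlock, Solution partial,
                std::size_t blockSize, std::vector<Solution>& out) {
    if (groupLeft == 0) {
        if (index == input.size()) { out.push_back(partial); return; }
        // Find the size of the group of identical elements starting at `index`,
        // and reset the last-used block to the first block.
        groupLeft = 1;
        while (index + groupLeft < input.size() &&
               input[index + groupLeft] == input[index]) ++groupLeft;
        lastBlock = 0;
        // Last group of identical elements: fill the remaining spots in one go.
        if (index + groupLeft == input.size()) {
            for (auto& block : partial) block.resize(blockSize, input[index]);
            out.push_back(partial);
            return;
        }
    }
    for (std::size_t b = lastBlock; b < partial.size(); ++b) {
        // Skip blocks identical to the block before them (duplicates) and full blocks.
        if (b > 0 && partial[b] == partial[b - 1]) continue;
        if (partial[b].size() == blockSize) continue;
        // Leave enough open spots at or after this block for the rest of the group.
        std::size_t spots = 0;
        for (std::size_t c = b; c < partial.size(); ++c)
            spots += blockSize - partial[c].size();
        if (spots < groupLeft) continue;
        Solution next = partial;
        next[b].push_back(input[index]);
        distribute(input, index + 1, groupLeft - 1, b, std::move(next), blockSize, out);
    }
}

int main() {
    std::size_t const blockSize = 2;
    std::vector<uint32_t> input = {1, 1, 2, 2};   // must already be sorted
    std::vector<Solution> out;
    distribute(input, 0, 0, 0, Solution(input.size() / blockSize), blockSize, out);
    for (auto const& solution : out) {
        for (auto const& block : solution) {
            std::cout << "{ ";
            for (auto element : block) std::cout << element << ' ';
            std::cout << "} ";
        }
        std::cout << '\n';
    }
}
For input = {1, 1, 2, 2} and blockSize = 2 this should print { 1 1 } { 2 2 } and { 1 2 } { 1 2 }, i.e. the two expected solutions.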
Here's my pencil and paper algorithm:
Describe the multiset in item quantities, e.g., {(1,2),(2,2)}
f(multiset,result):
if the multiset is empty:
return result
otherwise:
call f again with each unique distribution of one element added to result and
removed from the multiset state
Example:
{(1,2),(2,2),(3,2)} n = 2
11 -> 11 22 -> 11 22 33
      11 2 2 -> 11 23 23
1 1 -> 12 12 -> 12 12 33
       12 1 2 -> 12 13 23
       1 1 22 -> 13 13 22
Example:
{(1,2),(2,2),(3,2)} n = 3
11 -> 112 2 -> 112 233
      11 22 -> 113 223
1 1 -> 122 1 -> 122 133
       12 12 -> 123 123
Let's solve the problem, raised by m69 in the comments, of dealing with potentially duplicate distributions:
{A,B,B,C,C,D,D,D,D}
We've reached {A, , }{B, , }{B, , }, have 2 C's to distribute
and we'd like to avoid `ac bc b` being generated along with `ac b bc`.
Because our generation in the level just above is ordered, the series of identical
counts will be continuous. When a series of identical counts is encountered, make
the assignment for the whole block of identical counts (rather than each one),
and partition that contribution in descending parts; for example,
     | identical |
ac     b    b
ac     bc   b     // descending parts [1,0]
Example of longer block:
     | identical block |   descending parts
ac     bcccc  b   b   b    // [4,0,0,0]
ac     bccc   bc  b   b    // [3,1,0,0]
ac     bcc    bcc b   b    // [2,2,0,0]
...
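To make that sub-step concrete, here is a minimal sketch of just the descending-parts enumeration (the helper and its names are mine, and it assumes the run's share of the identical elements and the free space per block are already known):
#include <algorithm>
#include <iostream>
#include <vector>

// Split `total` identical elements over a run of `slots` currently identical
// blocks, emitting only non-increasing sequences of parts. `cap` is the largest
// part still allowed (initially the free space of one block in the run).
void descendingParts(int total, int slots, int cap, std::vector<int>& parts,
                     std::vector<std::vector<int>>& out) {
    if (slots == 0) {
        if (total == 0) out.push_back(parts);   // every element found a place
        return;
    }
    // Each part may be at most the previous one, so [1,0] is generated
    // but its mirror [0,1] is not.
    for (int take = std::min(total, cap); take >= 0; --take) {
        parts.push_back(take);
        descendingParts(total - take, slots - 1, take, parts, out);
        parts.pop_back();
    }
}

int main() {
    std::vector<int> parts;
    std::vector<std::vector<int>> out;
    descendingParts(4, 4, 4, parts, out);   // 4 c's over a run of 4 identical blocks
    for (auto const& p : out) {
        for (int part : p) std::cout << part << ' ';
        std::cout << '\n';
    }
}
Called with (4, 4, 4) as above, it produces [4,0,0,0], [3,1,0,0], [2,2,0,0], [2,1,1,0] and [1,1,1,1], matching the pattern in the longer-block example.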
Here's a working solution that makes use of the next_combination function presented by Hervé Brönnimann in N2639. The comments should make it pretty self-explanatory. The "herve/combinatorics.hpp" file contains the code listed in N2639 inside the herve namespace. It's in C++11/14; converting to an older standard should be pretty trivial.
Note that I only quickly tested the solution. Also, I extracted it from a class-based implementation just a couple of minutes ago, so some extra bugs might have crept in. A quick initial test seems to confirm it works, but there might be corner cases for which it won't.
#include <algorithm>
#include <cstdint>
#include <iterator>
#include "herve/combinatorics.hpp"
template <typename BidirIter>
bool next_combination_partition (BidirIter const & startIt,
                                 BidirIter const & endIt, uint32_t const groupSize) {
    // Typedefs
    using tDiff = typename std::iterator_traits<BidirIter>::difference_type;
    // Skip the last partition, because it consists of the remaining elements.
    // Thus if there are 2 groups or fewer, the start should be at position 0.
    tDiff const totalLength = std::distance(startIt, endIt);
    uint32_t const numTotalGroups = std::max(static_cast<uint32_t>((totalLength - 1) / groupSize + 1), 2u);
    uint32_t curBegin = (numTotalGroups - 2) * groupSize;
    uint32_t const lastGroupBegin = curBegin - 1;
    uint32_t curMid = curBegin + groupSize;
    bool atStart = (totalLength != 0);
    // Iterate over combinations from back of list to front. If a combination ends
    // up at its starting value, update the previous one as well.
    for (; (curMid != 0) && (atStart);
           curMid = curBegin, curBegin -= groupSize) {
        // To prevent duplicates, first element of each combination partition needs
        // to be fixed. So move start iterator to the next element. This is not true
        // for the starting (2nd to last) group though.
        uint32_t const startIndex = std::min(curBegin + 1, lastGroupBegin + 1);
        auto const iterStart = std::next(startIt, startIndex);
        auto const iterMid = std::next(startIt, curMid);
        atStart = !herve::next_combination(iterStart, iterMid, endIt);
    }
    return !atStart;
}
Edit: Below is my quickly thrown together test code ("combopart.hpp" obviously being the file containing the above function).
#include "combopart.hpp"
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <iterator>
#include <vector>
int main (int argc, char* argv[]) {
    uint32_t const groupSize = 2;
    std::vector<uint32_t> v;
    v = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    v = {0, 0, 0, 1, 1, 1, 2, 2, 2, 3};
    v = {1, 1, 2, 2};
    // Make sure contents are sorted
    std::sort(v.begin(), v.end());
    uint64_t count = 0;
    do {
        ++count;
        std::cout << "[ ";
        uint32_t elemCount = 0;
        for (auto it = v.begin(); it != v.end(); ++it) {
            std::cout << *it << " ";
            elemCount++;
            if ((elemCount % groupSize == 0) && (it != std::prev(v.end()))) {
                std::cout << "| ";
            }
        }
        std::cout << "]" << std::endl;
    } while (next_combination_partition(v.begin(), v.end(), groupSize));
    std::cout << std::endl << "# elements: " << v.size() << " - group size: " <<
        groupSize << " - # combination partitions: " << count << std::endl;
    return 0;
}
Edit 2: Improved the algorithm. Replaced the early-exit branch with a combination of a conditional move (using std::max) and setting the atStart boolean to false. Untested though, be warned.
Edit 3: Needed an extra modification so as not to "fix" the first element in the 2nd-to-last partition. The additional code should compile as a conditional move, so there should be no branching cost associated with it.
P.S.: I am aware that the code to generate combinations by Howard Hinnant (available at https://howardhinnant.github.io/combinations.html) is much faster than the one by Hervé Brönnimann. However, that code cannot handle duplicates in the input (because, as far as I can see, it never even dereferences an iterator), which my problem explicitly requires. On the other hand, if you know for sure your input won't contain duplicates, it is definitely the code you want to use with my function above.