Vectorized indexing of matrices with other matrices (in Octave)

Suppose we have a 2D (5x5) matrix:
test =
39 13 90 5 71
60 78 38 4 11
87 92 46 45 35
40 96 61 17 1
90 50 46 89 63
And a second 2D (5x2) matrix:
tidx =
1 3
2 4
2 3
2 4
4 5
And now we want to use tidx as an index into test, so that row i of tidx selects the columns to take from row i of test, giving the following output:
out =
39 90
78 4
92 46
96 17
89 63
One way to do this is with a for loop...
for i=1:size(test,1)
out(i,:) = test(i,tidx(i,:));
end
Question:
Is there a way to vectorize this so the same output is generated without a for loop?

Here is one way:
test(repmat([1:rows(test)]',1,columns(tidx)) + (tidx-1)*rows(test))
What you describe is a linear indexing problem. When you flatten a matrix into a single column (Octave stores matrices column by column), you get
test(:) =
39
60
87
40
90
13
78
92
96
50
90
38
46
61
46
5
4
45
17
89
71
11
35
1
63
Each element of this column can be addressed with a single number, its linear index. Here is how you figure out how to transform tidx into that form.
First, I use the listing above to read off the linear indices of the elements we want, which are:
outinx =
1 11
7 17
8 13
9 19
20 25
Then I start trying to figure out the pattern. This calculation gives a clue:
(tidx-1)*rows(test) =
0 10
5 15
5 10
5 15
15 20
This will move the index count to the correct column of test. Now I just need the correct row.
outinx-(tidx-1)*rows(test) =
1 1
2 2
3 3
4 4
5 5
This pattern is simply the row number, the same value the for loop supplied as i. I created that matrix with:
[1:rows(test)]' * ones(1,columns(tidx))
*EDIT: This does the same thing with a built-in function:
repmat([1:rows(test)]',1,columns(tidx))
I then add the two matrices together and use the result as the linear index into test.
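The index arithmetic above can also be checked outside Octave. The following is a minimal sketch in plain Python (not Octave), assuming the same test and tidx values as in the question; the names flat, n_rows and outinx are only illustrative, and the list flat plays the role of test(:).

```python
# Column-major (Octave-style) linear indexing, written with plain Python lists.
test = [
    [39, 13, 90, 5, 71],
    [60, 78, 38, 4, 11],
    [87, 92, 46, 45, 35],
    [40, 96, 61, 17, 1],
    [90, 50, 46, 89, 63],
]
tidx = [[1, 3], [2, 4], [2, 3], [2, 4], [4, 5]]  # 1-based column numbers

n_rows = len(test)

# Equivalent of test(:): stack the columns on top of each other.
flat = [test[r][c] for c in range(len(test[0])) for r in range(n_rows)]

# 1-based linear index: row number + (column - 1) * number of rows.
outinx = [[(r + 1) + (col - 1) * n_rows for col in cols]
          for r, cols in enumerate(tidx)]

# Look up each linear index (subtract 1 because Python lists are 0-based).
out = [[flat[k - 1] for k in row] for row in outinx]

print(outinx)
print(out)
```

The two printed matrices should agree with outinx and out as listed in the question and answer.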

Related

How to subset rows from one dataframe based on matching values from a second smaller data frame in R

I want to select a control group from one data frame by matching on age against a second data frame. As an example I have subject.df
subject.df
id age
1 1 55
2 2 62
3 3 73
4 4 54
5 5 66
I'd like to subset control.df by matching age directly, one control per subject (1:1 matching), against the subject.df data frame.
control.df
id age
6 6 66
7 7 71
8 8 80
9 9 51
10 10 55
11 11 56
12 12 77
13 13 62
14 14 64
15 15 73
16 16 67
17 17 54
18 18 75
19 19 77
20 20 78
21 21 53
22 22 64
23 23 83
24 24 61
25 25 77
I'm fairly new to R. In the past I've used Matlab and in this instance would use a for loop to iterate over the control.df dataframe, but I've been told that R doesn't always like for loops and that it can be computationally difficult in R.
In the end I'll be doing this on a much larger data set where the subject group is around 250 and the control group is more than 40K so I know that 1:1 matching is possible.
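No answer is recorded here for this question. As a rough illustration of what 1:1 exact-age matching could look like, here is a minimal sketch in plain Python (not R). It assumes that "matching" means exact equality of age, that the first unused control of that age is taken, and that a subject with no remaining control is left unmatched; the names by_age and matches are hypothetical.

```python
from collections import defaultdict

# (id, age) pairs copied from subject.df and control.df above.
subjects = [(1, 55), (2, 62), (3, 73), (4, 54), (5, 66)]
controls = [
    (6, 66), (7, 71), (8, 80), (9, 51), (10, 55), (11, 56), (12, 77),
    (13, 62), (14, 64), (15, 73), (16, 67), (17, 54), (18, 75), (19, 77),
    (20, 78), (21, 53), (22, 64), (23, 83), (24, 61), (25, 77),
]

# Group control ids by age so each lookup is a dictionary access, not a scan.
by_age = defaultdict(list)
for cid, age in controls:
    by_age[age].append(cid)

# Assumption: take the first unused control with exactly the same age.
matches = {}
for sid, age in subjects:
    pool = by_age[age]
    # pop(0) removes the control so it cannot be matched to a second subject.
    matches[sid] = pool.pop(0) if pool else None

print(matches)
```

Grouping the controls once keeps the work roughly proportional to the size of both tables, which matters when the control group has more than 40K rows.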

Julia: How to insert a specific row of matrix inside a specific row of another one

I have the following matrix:
L = [3 6 18 92 2
2 24 39 59 3];
I intend to enter the first row of matrix L into the 2nd row of the following matrix:
X = [2 7 43 52 1
4 21 14 97 4
3 17 27 85 5];
And the result should be:
Xnew = [2 7 43 52 1
3 6 18 92 2
4 21 14 97 4
3 17 27 85 5];
How can I do that in Julia?
This is a way to do it:
julia> @views [X[1:1, :]; L[1:1, :]; X[2:end, :]]
4×5 Matrix{Int64}:
2 7 43 52 1
3 6 18 92 2
4 21 14 97 4
3 17 27 85 5
You could get the same result without @views, but it would be less efficient because it would create intermediate copies of the data.
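The same splice (rows before the insertion point, then the new row, then the remaining rows) can be sketched in plain Python with nested lists. This is only an analogy to the Julia expression, not Julia code, and it assumes the matrices are stored as lists of rows.

```python
# Matrices from the question, stored as lists of rows.
L = [[3, 6, 18, 92, 2],
     [2, 24, 39, 59, 3]]
X = [[2, 7, 43, 52, 1],
     [4, 21, 14, 97, 4],
     [3, 17, 27, 85, 5]]

# First row of X, then the first row of L, then the rest of X.
Xnew = X[:1] + [L[0]] + X[1:]

for row in Xnew:
    print(row)
```

Note that this builds a new outer list but reuses the existing row objects, so X itself is left unchanged.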

Is there a way to sum pairwise in Octave, vectorized (ie. mapping and reducing matrices)?

Is there a way to sum pairwise in Octave?
If, for example, I have a 10-row by 4-column matrix, I want a new 10-row by 2-column matrix where each column is the sum of one pair of adjacent columns.
ex.
[ 1 2 3 4
2 3 4 5
...
]
=> [ 3 7
5 9
...
]
I know how to accomplish this using for loops and accumarray etc, but I'm just not sure if there's a way to do it that is completely vectorized.
Here are a few more options.
Given:
a = reshape(1:40, 10, 4)
a =
1 11 21 31
2 12 22 32
3 13 23 33
4 14 24 34
5 15 25 35
6 16 26 36
7 17 27 37
8 18 28 38
9 19 29 39
10 20 30 40
Keep it simple
b = [sum(a(:,1:2),2) sum(a(:,3:4),2)]
b =
12 52
14 54
16 56
18 58
20 60
22 62
24 64
26 66
28 68
30 70
Squeeze a little
b = squeeze(sum(reshape(a, [], 2, 2), 2))
b =
12 52
14 54
16 56
18 58
20 60
22 62
24 64
26 66
28 68
30 70
Or, my personal favorite...
Mathemagic
b = a * [1 1 0 0; 0 0 1 1].'
b =
12 52
14 54
16 56
18 58
20 60
22 62
24 64
26 66
28 68
30 70
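To see why the matrix product works: each column of the transposed 0/1 matrix marks which columns of a belong to one pair, so multiplying adds exactly those columns. A minimal sketch in plain Python (not Octave), assuming the same a as above; the name selector is only illustrative.

```python
# Same values as a = reshape(1:40, 10, 4): row r is [r, r+10, r+20, r+30].
a = [[r + 10 * c for c in range(4)] for r in range(1, 11)]

# Transposed selector (4x2): column 0 picks columns 1-2, column 1 picks 3-4.
selector = [[1, 0],
            [1, 0],
            [0, 1],
            [0, 1]]

# Plain matrix product b = a * selector.
b = [[sum(row[k] * selector[k][j] for k in range(4)) for j in range(2)]
     for row in a]

for row in b:
    print(row)
```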
Perhaps someone will come up with a better idea:
a = [1 2 3 4; 2 3 4 5]
b = reshape (sum (reshape (a.', 2, [])), [], rows(a)).'
gives
b =
3 7
5 9

How does the "successive passes in opposite direction" improvement work for bubble sort?

According to Data Structures Using C by Tenenbaum, one of the improvements of bubble sort is to have successive passes go in opposite direction so that the small elements move quickly to the front which will reduce the required number of passes [pg 336].
I worked out two examples, one which supports this statement and another which contradicts it.
Supports: 25 48 37 12 57 86 33 92
iterations using usual Bubble sort :
25 48 37 12 57 86 33 92
25 37 12 48 57 33 86 92
25 12 37 48 33 57 86 92
12 25 37 33 48 57 86 92
12 25 33 37 48 57 86 92
iterations using improvement:
25 48 37 12 57 86 33 92
25 37 12 48 57 33 86 92
12 25 37 33 48 57 86 92
12 25 33 37 48 57 86 92
against: 3 4 1 2 5
iterations using usual Bubble sort:
3 4 1 2 5
3 1 2 4 5
1 2 3 4 5
iterations using improvement:
3 4 1 2 5
3 1 2 4 5
1 3 2 4 5
1 2 3 4 5
So is the statement that this improvement will always help incorrect? Or am I doing something wrong here?
The example you gave above shows that this algorithm isn't a strict improvement over a standard bubble sort.
The advantage of this approach (sometimes called "cocktail sort," by the way) is that in cases where there are a lot of small elements at the end of the array, it rapidly pulls them to the front compared against normal bubble sort. For example, consider this array:
2 3 4 5 6 7 8 9 10 11 12 ... 10,000,000 1
With a normal bubble sort, it would take 9,999,999 passes over this array to sort it because the element 1, which is way out of place, only gets swapped one step forward on each iteration. On the other hand, with a cocktail sort, this would take just two passes - one initial pass and then a reverse pass.
While the above example is definitely contrived, in a randomly-shuffled array, there are likely going to be some smaller elements toward the end of the array and the number of passes of bubblesort is going to have to be large to move them back. Going in both directions helps speed this up.
That said, bubblesort is a pretty poor choice of a sorting algorithm, so hopefully this is just a theoretical discussion. :-)
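The pass counts discussed above can be reproduced with a small experiment. This is a minimal sketch in Python, assuming a "pass" means one sweep that performs at least one swap (the final sweep that only confirms the array is sorted is not counted); the function names are hypothetical, and the contrived array is shortened to 1,000 elements.

```python
def bubble_passes(values):
    """Sort a copy with forward-only sweeps; return (swapping passes, result)."""
    a = list(values)
    passes = 0
    while True:
        swapped = False
        for i in range(len(a) - 1):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
        if not swapped:          # a sweep without swaps means a is sorted
            return passes, a
        passes += 1


def cocktail_passes(values):
    """Same, but sweeps alternate direction (forward, backward, ...)."""
    a = list(values)
    passes = 0
    forward = True
    while True:
        swapped = False
        order = range(len(a) - 1) if forward else range(len(a) - 2, -1, -1)
        for i in order:
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
        if not swapped:
            return passes, a
        passes += 1
        forward = not forward


examples = {
    "supports": [25, 48, 37, 12, 57, 86, 33, 92],
    "against": [3, 4, 1, 2, 5],
    "contrived": list(range(2, 1001)) + [1],
}
for name, data in examples.items():
    print(name, bubble_passes(data)[0], cocktail_passes(data)[0])
```

Under this counting, the first example needs fewer passes with alternating sweeps, the second needs more, and the contrived array shows the large gap described in the answer.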

Selecting the "P" in Prune and Search Algorithm

Note: the diagram above shows a partition into groups of 5 (the columns). The horizontal box denotes the median values of each partition. The 'P' item indicates the median of medians.
Most of the references I have seen use this picture when selecting their "P", and it always has an odd number of elements. But what if the number of elements you have is even?
ex.
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
51 52 53 54 55 56 57 58 59 60
How do you get your "P" from an even number of elements?
This explanation gives the detail I think you're looking for:
https://www.cs.duke.edu/courses/summer10/cps130/files/Edelsbrunner_Median.pdf
The median of the set plays a special role in this algorithm, and it
is defined as the i-smallest item where i = (n+1)/2 if n is odd and i =
n/2 or (n+2)/2 if n is even.
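The quoted rule translates directly into a rank computation. Below is a minimal sketch in Python, assuming the lower median (i = n/2) is used for even n unless upper=True is passed, and assuming the 60 numbers are taken in consecutive groups of 5. The names median_rank and pick_p are hypothetical, and sorting is used for brevity where a real implementation would select the median of medians recursively.

```python
def median_rank(n, upper=False):
    """1-based rank of the median among n items, following the quoted rule."""
    if n % 2 == 1:
        return (n + 1) // 2
    # Even n: either n/2 (lower) or (n+2)/2 (upper) is acceptable.
    return (n + 2) // 2 if upper else n // 2


def pick_p(items, group_size=5, upper=False):
    """Median of the group medians, using sorted() as a stand-in for selection."""
    groups = [sorted(items[i:i + group_size])
              for i in range(0, len(items), group_size)]
    medians = sorted(g[median_rank(len(g), upper) - 1] for g in groups)
    return medians[median_rank(len(medians), upper) - 1]


data = list(range(1, 61))           # the 60 numbers from the question
print(pick_p(data))                 # lower median of the 12 group medians
print(pick_p(data, upper=True))     # upper median of the 12 group medians
```

With 60 items there are 12 group medians (3, 8, ..., 58), so the choice between the two ranks decides which of the two middle medians becomes "P"; either choice fits the quoted definition.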

Resources