Stata: Transposing panel rows to columns - panel

I am trying to rearrange the following panel data set into a form that I can merge with another data set. I would like to transform this:
Gender Year IndA IndB IndC
1 2008 0.22 0.34 0.45
2 2008 0.78 0.66 0.55
1 2009 0.25 0.36 0.49
2 2009 0.75 0.64 0.51
1 2010 0.28 0.38 0.48
2 2010 0.72 0.62 0.52
Into:
(ID) Year Industry 1 2
1 2008 A 0.22 0.78
2 2009 A 0.25 0.75
3 2010 A 0.28 0.72
4 2008 B 0.34 0.66
5 2009 B 0.36 0.64
6 2010 B 0.38 0.62
7 2008 C 0.45 0.55
8 2009 C 0.49 0.51
9 2010 C 0.48 0.52
I am new to Stata and am having difficulties reshaping both the columns and the genders.

See help reshape. One way to do this is with two consecutive reshapes. You can execute the first line, look at the data in the data browser, then execute the second line to see how this works. Stata variable names cannot start with a digit, so the final variables need names other than 1 and 2; with the commands below they come out as Ind1 and Ind2.
reshape long Ind, i(Year Gender) j(Industry) string
reshape wide Ind, i(Year Industry) j(Gender)
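For readers who want to check the logic outside Stata, the same long-then-wide idea can be sketched in plain Python. This is only an illustration on the sample data from the question; the names rows, long_rows, wide and result are made up for the example and are not part of any Stata workflow.

```python
# Sample data from the question: (Gender, Year, IndA, IndB, IndC)
rows = [
    (1, 2008, 0.22, 0.34, 0.45),
    (2, 2008, 0.78, 0.66, 0.55),
    (1, 2009, 0.25, 0.36, 0.49),
    (2, 2009, 0.75, 0.64, 0.51),
    (1, 2010, 0.28, 0.38, 0.48),
    (2, 2010, 0.72, 0.62, 0.52),
]
industries = ["A", "B", "C"]

# Step 1 (like "reshape long"): one record per Gender, Year, Industry.
long_rows = [
    (gender, year, industry, value)
    for gender, year, *values in rows
    for industry, value in zip(industries, values)
]

# Step 2 (like "reshape wide"): one record per Industry, Year,
# holding one value per Gender.
wide = {}
for gender, year, industry, value in long_rows:
    wide.setdefault((industry, year), {})[gender] = value

# Sort by Industry, then Year, and number the rows as in the desired layout.
result = [
    (i, year, industry, by_gender[1], by_gender[2])
    for i, ((industry, year), by_gender) in enumerate(sorted(wide.items()), start=1)
]

for record in result:
    print(record)
```

Each tuple in result is (ID, Year, Industry, Gender 1 value, Gender 2 value), which mirrors the target layout in the question.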

You can also replace the first reshape with a stack (less legible, but can sometimes be faster than a reshape):
stack Gender Year IndA Gender Year IndB Gender Year IndC, into(Gender Year Y) clear
rename _stack Industry
lab define Industry 1 "A" 2 "B" 3 "C"
lab val Industry Industry
reshape wide Y, i(Industry Year) j(Gender)
sort Industry Year
gen id = _n
order id Year Industry
list, sepby(Industry) noobs

As a third variation on the same theme, note that proportions for the two Genders sum to 1, so we only need one.
clear
input Gender Year IndA IndB IndC
1 2008 0.22 0.34 0.45
2 2008 0.78 0.66 0.55
1 2009 0.25 0.36 0.49
2 2009 0.75 0.64 0.51
1 2010 0.28 0.38 0.48
2 2010 0.72 0.62 0.52
end
drop if Gender == 1
drop Gender
reshape long Ind , i(Year) j(Type) string
list , sepby(Year)
     +-------------------+
     | Year   Type   Ind |
     |-------------------|
  1. | 2008      A   .78 |
  2. | 2008      B   .66 |
  3. | 2008      C   .55 |
     |-------------------|
  4. | 2009      A   .75 |
  5. | 2009      B   .64 |
  6. | 2009      C   .51 |
     |-------------------|
  7. | 2010      A   .72 |
  8. | 2010      B   .62 |
  9. | 2010      C   .52 |
     +-------------------+
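The claim that only one Gender is needed can be checked quickly. The following is a small Python sketch on the sample data, not Stata code; the names gender_1, gender_2 and recovered_gender_1 are made up for the illustration, and rounding to two decimals is an assumption that matches the precision of the sample.

```python
# Sample data from the question: (Gender, Year, IndA, IndB, IndC)
rows = [
    (1, 2008, 0.22, 0.34, 0.45),
    (2, 2008, 0.78, 0.66, 0.55),
    (1, 2009, 0.25, 0.36, 0.49),
    (2, 2009, 0.75, 0.64, 0.51),
    (1, 2010, 0.28, 0.38, 0.48),
    (2, 2010, 0.72, 0.62, 0.52),
]

# Shares per Year, split by Gender.
gender_1 = {year: values for gender, year, *values in rows if gender == 1}
gender_2 = {year: values for gender, year, *values in rows if gender == 2}

# Rebuild the Gender 1 shares as 1 minus the Gender 2 shares.
recovered_gender_1 = {
    year: [round(1 - share, 2) for share in shares]
    for year, shares in gender_2.items()
}

print(recovered_gender_1 == gender_1)
```

If the comparison holds, dropping one Gender loses no information, which is what the drop if Gender == 1 step relies on.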

Related

F1 score - sklearn

What is the F1-score of the model in the following? I used the scikit-learn package.
print(classification_report(y_true, y_pred, target_names=target_names))
              precision    recall  f1-score   support
<BLANKLINE>
     class 0       0.50      1.00      0.67         1
     class 1       0.00      0.00      0.00         1
     class 2       1.00      0.67      0.80         3
<BLANKLINE>
    accuracy                           0.60         5
   macro avg       0.50      0.56      0.49         5
weighted avg       0.70      0.60      0.61         5
This article explains it pretty well
Basically, for each class it's the harmonic mean of precision and recall:
F1 = 2 * precision * recall / (precision + recall)
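As a rough check, the formula can be applied by hand to the per-class numbers in the report. The sketch below uses plain Python rather than scikit-learn; f1_score here is a local helper written for this example (not the sklearn function of the same name), and the recall of class 2 is assumed to be 2/3, which the report rounds to 0.67.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, support) per class, read off the report above.
# Assumption: class 2 recall is exactly 2/3 (shown as 0.67).
report = {
    "class 0": (0.50, 1.00, 1),
    "class 1": (0.00, 0.00, 1),
    "class 2": (1.00, 2 / 3, 3),
}

f1 = {name: f1_score(p, r) for name, (p, r, _) in report.items()}
total_support = sum(support for _, _, support in report.values())

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1.values()) / len(f1)

# Weighted average: per-class F1 weighted by support.
weighted_f1 = sum(f1[name] * s for name, (_, _, s) in report.items()) / total_support

for name, score in f1.items():
    print(name, round(score, 2))
print("macro avg", round(macro_f1, 2))
print("weighted avg", round(weighted_f1, 2))
```

So there is no single F1 for the model in this report: the per-class scores sit in the f1-score column, and the macro avg and weighted avg rows give two different ways of combining them.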

How to transform a correlation matrix into a single row?

I have a 200x200 correlation matrix text file that I would like to turn into a single row.
e.g.
a b c d e
a 1.00 0.33 0.34 0.26 0.20
b 0.33 1.00 0.40 0.48 0.41
c 0.34 0.40 1.00 0.59 0.35
d 0.26 0.48 0.59 1.00 0.43
e 0.20 0.41 0.35 0.43 1.00
I want to turn it into:
a_b a_c a_d a_e b_c b_d b_e c_d c_e d_e
0.33 0.34 0.26 0.20 0.40 0.48 0.41 0.59 0.35 0.43
I need code that can:
1. Join the variable names to make a single row of headers (e.g. turn "a" and "b" into "a_b") and
2. Turn only one half of the correlation matrix (bottom or top triangle) into a single row
A bit of extra information: I have around 500 participants in a study and each of them has a correlation matrix file. I want to consolidate these separate data files into one file where each row is one participant's correlation matrix.
Does anyone know how to do this?
Thanks!!
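One possible approach, sketched in plain Python: walk over every pair of variables in the upper triangle, join the two names with an underscore, and collect the matching value. The function name flatten_upper_triangle is made up for this example, and the sketch assumes each file has already been parsed into a list of names and a list of rows; reading the 500 text files is not shown.

```python
from itertools import combinations


def flatten_upper_triangle(names, matrix):
    """Return (headers, values) for the upper triangle, diagonal excluded."""
    headers, values = [], []
    # combinations yields (0, 1), (0, 2), ..., so i < j always holds.
    for i, j in combinations(range(len(names)), 2):
        headers.append(f"{names[i]}_{names[j]}")
        values.append(matrix[i][j])
    return headers, values


# The 5x5 example from the question.
names = ["a", "b", "c", "d", "e"]
matrix = [
    [1.00, 0.33, 0.34, 0.26, 0.20],
    [0.33, 1.00, 0.40, 0.48, 0.41],
    [0.34, 0.40, 1.00, 0.59, 0.35],
    [0.26, 0.48, 0.59, 1.00, 0.43],
    [0.20, 0.41, 0.35, 0.43, 1.00],
]

headers, values = flatten_upper_triangle(names, matrix)
print(" ".join(headers))
print(" ".join(f"{v:.2f}" for v in values))
```

To consolidate participants, the same function could be called once per file, writing the headers once and then one row of values per participant.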

What's the fastest way to unroll a matrix in MATLAB?

How do I turn a matrix:
[ 0.12 0.23 0.34 ;
0.45 0.56 0.67 ;
0.78 0.89 0.90 ]
into a 'coordinate' matrix with a bunch of rows?
[ 1 1 0.12 ;
1 2 0.23 ;
1 3 0.34 ;
2 1 0.45 ;
2 2 0.56 ;
2 3 0.67 ;
3 1 0.78 ;
3 2 0.89 ;
3 3 0.90 ]
(permutation of the rows is irrelevant, it only matters that the data is in this structure)
Right now I'm using a for loop but that takes a long time.
Here is an option using ind2sub:
mat= [ 0.12 0.23 0.34 ;
0.45 0.56 0.67 ;
0.78 0.89 0.90 ] ;
[I,J] = ind2sub(size(mat), 1:numel(mat));
r=[I', J', mat(:)]
r =
1.0000 1.0000 0.1200
2.0000 1.0000 0.4500
3.0000 1.0000 0.7800
1.0000 2.0000 0.2300
2.0000 2.0000 0.5600
3.0000 2.0000 0.8900
1.0000 3.0000 0.3400
2.0000 3.0000 0.6700
3.0000 3.0000 0.9000
Note that the indices are reversed compared to your example.
A = [ .12 .23 .34 ;
.45 .56 .67 ;
.78 .89 .90 ];
[ii jj] = meshgrid(1:size(A,1),1:size(A,2));
B = A.';
R = [ii(:) jj(:) B(:)];
If you don't mind a different order (according to your edit), you can do it more easily:
[ii jj] = ndgrid(1:size(A,1),1:size(A,2));
R = [ii(:) jj(:) A(:)];
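The difference between the two orderings can be illustrated outside MATLAB as well. The following plain-Python sketch is not a replacement for the MATLAB code; it just builds the same coordinate list both ways, with row_major matching the layout in the question and column_major matching the ndgrid version. Both names are made up for the example.

```python
A = [
    [0.12, 0.23, 0.34],
    [0.45, 0.56, 0.67],
    [0.78, 0.89, 0.90],
]
n_rows, n_cols = len(A), len(A[0])

# Read across each row (the order shown in the question), 1-based indexes.
row_major = [
    (i, j, value)
    for i, row in enumerate(A, start=1)
    for j, value in enumerate(row, start=1)
]

# Read down each column (MATLAB's natural order, as with ndgrid and A(:)).
column_major = [
    (i, j, A[i - 1][j - 1])
    for j in range(1, n_cols + 1)
    for i in range(1, n_rows + 1)
]

for triple in row_major:
    print(triple)
```

Both lists hold the same triples, only in a different order, which is why the ndgrid version is acceptable when row order does not matter.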
In addition to generating the row/col indexes with meshgrid, you can use all three outputs of find as follows:
[II,JJ,AA]= find(A.'); %' note the transpose since you want to read across
M = [JJ II AA]
M =
1 1 0.12
1 2 0.23
1 3 0.34
2 1 0.45
2 2 0.56
2 3 0.67
3 1 0.78
3 2 0.89
3 3 0.9
This has limited application because find drops zero entries. A nasty but correct workaround (thanks user664303):
B = A.'; v = B == 0; %' transpose to read across, otherwise work directly with A
[II, JJ, AA] = find(B + v);
M = [JJ II AA-v(:)];
Needless to say, I would recommend one of the other solutions. :) In particular, ndgrid is the most natural solution to obtaining the row,col inds.
I find ndgrid to be the most natural solution, but here's a fun way to do it manually with the odd couple of kron and repmat:
M = [kron(1:size(A,1),ones(1,size(A,2))).' ... %' row indexes
repmat((1:size(A,2))',size(A,1),1) ... %' col indexes
reshape(A.',[],1)] %' matrix values, read across
Simple adjustment to read down, as is natural in MATLAB:
M = [repmat((1:size(A,1))',size(A,2),1) ... %' row indexes
kron(1:size(A,2),ones(1,size(A,1))).' ... %' column indexes
A(:)] % matrix values, read down
(Also since my first answer was obscenely hackish.)
I also find kron to be a nice tool to replicate each element at a time rather than the entire array at a time, as repmat does. For example:
>> 1:size(A,2)
ans =
1 2 3
>> kron(1:size(A,2),ones(1,size(A,1)))
ans =
1 1 1 2 2 2 3 3 3
Taking this a bit further, we can generate a new function called repel to replicate elements of an array as opposed to the whole array:
>> repel = @(x,m,n) kron(x,ones(m,n));
>> repel(1:4,1,2)
ans =
1 1 2 2 3 3 4 4
>> repel(1:3,2,2)
ans =
1 1 2 2 3 3
1 1 2 2 3 3
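For comparison, the element-wise replication that repel performs can be mimicked in plain Python. This sketch only covers the case shown above, where x is a one-dimensional sequence; it is an illustration of the idea, not a general Kronecker product.

```python
def repel(x, m, n):
    """Repeat each element of the 1-D sequence x n times, stacked in m rows."""
    row = [value for value in x for _ in range(n)]
    return [list(row) for _ in range(m)]


print(repel(range(1, 5), 1, 2))
print(repel([1, 2, 3], 2, 2))
```

The first call gives one row with every element doubled, and the second gives two identical rows, mirroring the two MATLAB calls above.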

Filter Data In a Cleaner/More Efficient Way

I have a set of data with a bunch of columns. Something like the following (in reality my data has about half a million rows):
big = [
1 1 0.93 0.58;
1 2 0.40 0.34;
1 3 0.26 0.31;
1 4 0.40 0.26;
2 1 0.60 0.04;
2 2 0.84 0.55;
2 3 0.53 0.72;
2 4 0.00 0.39;
3 1 0.27 0.51;
3 2 0.46 0.18;
3 3 0.61 0.01;
3 4 0.07 0.04;
4 1 0.26 0.43;
4 2 0.77 0.91;
4 3 0.49 0.80;
4 4 0.40 0.55;
5 1 0.77 0.40;
5 2 0.91 0.28;
5 3 0.80 0.65;
5 4 0.05 0.06;
6 1 0.41 0.37;
6 2 0.11 0.87;
6 3 0.78 0.61;
6 4 0.87 0.51
];
Now, let's say I want to get rid of the rows where the first column is a 3 or a 6.
I'm doing that like so:
filterRows = [3 6];
for i = filterRows
big = big(~ismember(1:size(big,1), find(big(:,1) == i)), :);
end
Which works, but the loop makes me think I'm missing a more efficient trick. Is there a better way to do this?
Originally I tried:
big(find(big(:,1) == filterRows ),:) = [];
but of course that doesn't work.
Use logical indexing:
rows = (big(:, 1) == 3 | big(:, 1) == 6);
big(rows, :) = [];
In the general case, where the values of the first column are stored in filterRows, you can generate the logical vector rows with ismember:
rows = ismember(big(:, 1), filterRows);
or with bsxfun:
rows = any(bsxfun(@eq, big(:, 1), filterRows(:).'), 2);
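The membership test behind ismember can be sketched in plain Python too. The example below uses a shortened copy of the data (only the first row of each group, an abbreviation made for this illustration) and the made-up names filter_values and kept.

```python
# Shortened sample: first row of each group from the question's data.
big = [
    (1, 1, 0.93, 0.58),
    (2, 1, 0.60, 0.04),
    (3, 1, 0.27, 0.51),
    (4, 1, 0.26, 0.43),
    (5, 1, 0.77, 0.40),
    (6, 1, 0.41, 0.37),
]

# A set gives fast membership tests for the values to drop.
filter_values = {3, 6}

# Keep every row whose first column is not in the filter set.
kept = [row for row in big if row[0] not in filter_values]

for row in kept:
    print(row)
```

As with the logical-indexing answer, the whole filter is one pass over the rows with no loop over the filter values.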

Pivot Table in Oracle 11g

Could you please help me figure out the pivot table? Here is the first table:
Date 1 2 3 4 5
-----------------------------------------
20130101 0.12 0.13 0.43 0.32 0.22
20130102 0.22 0.31 0.13 0.31 0.29
20130103 0.32 0.12 0.33 0.12 0.34
I want this table to be like this :
Date Number Values
---------------------------
20130101 1 0.12
20130101 2 0.13
20130101 3 0.43
20130101 4 0.32
20130101 5 0.22
20130102 1 0.22
20130102 2 0.31
20130102 3 0.13
20130102 4 0.31
20130102 5 0.29
20130103 1 0.32
20130103 2 0.12
20130103 3 0.33
20130103 4 0.12
20130103 5 0.34
I've tried to find a specific query for this, such as one using "decode", but it didn't work for me.
Here is a website that I've tried:
Advice Using Pivot Table in Oracle.
Could you please help me to figure this out?
Thank you so much for your help.
You don't need a PIVOT but an UNPIVOT, which turns the columns "1" to "5" into rows:
SELECT *
FROM table1
unpivot
(
"Values" FOR "Number" IN ("1","2","3","4","5")
);
Here is a sqlfiddle demo
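To make the effect of UNPIVOT concrete, here is a small plain-Python sketch of the same wide-to-long transformation on the sample rows. It is not Oracle code; wide_rows, column_names and long_rows are names made up for the illustration.

```python
# Sample rows from the question: (Date, "1", "2", "3", "4", "5")
wide_rows = [
    ("20130101", 0.12, 0.13, 0.43, 0.32, 0.22),
    ("20130102", 0.22, 0.31, 0.13, 0.31, 0.29),
    ("20130103", 0.32, 0.12, 0.33, 0.12, 0.34),
]
column_names = ["1", "2", "3", "4", "5"]

# One output row per (Date, Number) pair, like UNPIVOT ... FOR "Number" IN (...).
long_rows = [
    (date, number, value)
    for date, *values in wide_rows
    for number, value in zip(column_names, values)
]

for row in long_rows:
    print(row)
```

Each input row yields five output rows, so the three sample dates produce fifteen rows in total.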
