Confused: why would correlation be "--" in Statsample? - ruby

I am very new to Statsample and have a basic question. With this sample data:
[[1, 2, 3, 3],[2, 3, 3, 5],[4, 1, 3, 4]]
I create a 4x4 Statsample dataset called ds and get the following output for each call:
puts ds.summary
gets
= Dataset 1
Cases: 3
Element:[actuals]
== Vector 3
n :3
n valid:3
factors:3
mode: 3
Distribution
+---+---+---------+
| 3 | 3 | 100.00% |
+---+---+---------+
Element:[mids]
== Vector 2
n :3
n valid:3
factors:1,2,3
mode: 2
Distribution
+---+---+--------+
| 1 | 1 | 33.33% |
| 2 | 1 | 33.33% |
| 3 | 1 | 33.33% |
+---+---+--------+
Element:[predicteds]
== Vector 4
n :3
n valid:3
factors:3,4,5
mode: 3
Distribution
+---+---+--------+
| 3 | 1 | 33.33% |
| 4 | 1 | 33.33% |
| 5 | 1 | 33.33% |
+---+---+--------+
Element:[prediction_error]
== Vector 5
n :3
n valid:3
factors:0,1,2
mode: 0
Distribution
+---+---+--------+
| 0 | 1 | 33.33% |
| 1 | 1 | 33.33% |
| 2 | 1 | 33.33% |
+---+---+--------+
Element:[uids]
== Vector 1
n :3
n valid:3
factors:1,2,4
mode: 1
Distribution
+---+---+--------+
| 1 | 1 | 33.33% |
| 2 | 1 | 33.33% |
| 4 | 1 | 33.33% |
+---+---+--------+
Which seems reasonable but then:
cm = ds.correlation_matrix
puts cm.summary
gets this, which is confusing:
Correlation Matrix
+------------------+---------+-------+------------+------------------+-------+
| | actuals | mids | predicteds | prediction_error | uids |
+------------------+---------+-------+------------+------------------+-------+
| actuals | 1.000 | -- | -- | -- | -- |
| mids | -- | 1.000 | -- | -- | -- |
| predicteds | -- | -- | 1.000 | -- | -- |
| prediction_error | -- | -- | -- | 1.000 | -- |
| uids | -- | -- | -- | -- | 1.000 |
+------------------+---------+-------+------------+------------------+-------+

You created a dataset with nominal vectors, not scalar (numeric) ones, so correlations between these non-numeric vectors are always 0.
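For comparison, here is a minimal plain-Ruby sketch of what a Pearson correlation gives when the same values are treated as numbers. It does not call Statsample at all. The `pearson` helper is hypothetical, and the mapping of the sample rows to `uids`, `mids`, `actuals` and `predicteds` is inferred from the summary output in the question. Note that `actuals` is constant, so its correlation with anything is undefined whatever the vector type.

```ruby
# Plain-Ruby Pearson correlation; a hypothetical helper, not a Statsample call.
def pearson(xs, ys)
  n = xs.size.to_f
  mean_x = xs.sum / n
  mean_y = ys.sum / n
  cov   = xs.zip(ys).sum { |x, y| (x - mean_x) * (y - mean_y) }
  var_x = xs.sum { |x| (x - mean_x)**2 }
  var_y = ys.sum { |y| (y - mean_y)**2 }
  # A constant vector has zero variance, so r is undefined.
  return nil if var_x.zero? || var_y.zero?
  cov / Math.sqrt(var_x * var_y)
end

# Columns inferred from the question's rows [uid, mid, actual, predicted].
uids       = [1, 2, 4]
mids       = [2, 3, 1]
actuals    = [3, 3, 3]
predicteds = [3, 5, 4]

puts pearson(uids, mids).round(3)        # negative correlation
puts pearson(uids, predicteds).round(3)  # positive correlation
puts pearson(uids, actuals).inspect      # nil, because actuals never varies
```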

Related

Subtract value row by row in matlab

I have a 1 column matrix with the following values:
*-------*
| 6 |
| 4 |
| 3 |
| 1 |
| 1 |
*-------*
With this function, starting from the first value, I subtract the value in the following row and place 0 at the end. This is the result:
Delta = Ctv_ds_universal(1:(end-1),1)-Ctv_ds_universal(2:end,1);
Delta(end+1)=0;
*-----------*
| 2 (6-4) |
| 1 (4-3) |
| 2 (3-1) |
| 0 (1-1) |
| 0 |
*-----------*
Now I would like to reverse the order and subtract from the bottom up, placing 0 at the beginning. How can I modify the function?
*------------*
| 0 |
| -2 (4-6) |
| -1 (3-4) |
| -2 (1-3) |
| 0 (1-1) |
*------------*
Delta = 0;
Delta = [Delta; Ctv_ds_universal(2:end,1)-Ctv_ds_universal(1:end-1,1)];
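The same row-by-row difference can be sketched outside MATLAB to check the expected numbers. This is an illustrative Ruby version with a hypothetical helper name, `backward_delta`, using the five values from the question.

```ruby
# Each entry is the current row minus the row above it.
# The first row has no predecessor, so it gets 0.
def backward_delta(values)
  [0] + values.each_cons(2).map { |prev, curr| curr - prev }
end

p backward_delta([6, 4, 3, 1, 1])  # prints [0, -2, -1, -2, 0]
```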

Combine the values in a table to get the sum of each combination

I have a table of numeric data, and I need to generate the different combinations of its own values.
For example:
| A |
|---|
| 1 |
| 2 |
| 3 |
| 4 |
I need to combine this single column to get the following result:
| A | B | C | D |
| - | - | - | - |
| 1 | | | |
| 1 | 2 | | |
| 1 | 2 | 3 | |
| 1 | 2 | 3 | 4 |
| 1 | 2 | | 4 |
| 1 | | 3 | |
| 1 | | 3 | 4 |
| 1 | | | 4 |
| | 2 | | |
| | 2 | 3 | |
| | 2 | 3 | 4 |
| | 2 | | 4 |
| | | 3 | |
| | | 3 | 4 |
| | | | 4 |
At the end of the table, I have to add one column with the count of the columns that contain data and another column with the sum of the numbers in those columns.
Maybe it sounds very difficult or impossible, but I haven't found a way to make it work.
I have tried a "Cross Join" in SQL but didn't get the expected result.
Help!
In this case, you can solve this by counting in binary, using as many binary digits as there are numbers in the set. For example, for the starting set 2, 5, 6, 8 the count would end at 1111. Each digit of the binary number decides whether the corresponding number is shown in that row. Here's a table of how it would work.
| A |
|---|
| 2 |
| 5 |
| 6 |
| 8 |
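| A | B | C | D | Binary | Row number |
|---|---|---|---|--------|------------|
|   |   |   | 8 | 0001   | 1          |
|   |   | 6 |   | 0010   | 2          |
|   |   | 6 | 8 | 0011   | 3          |
|   | 5 |   |   | 0100   | 4          |
|   | 5 |   | 8 | 0101   | 5          |
|   | 5 | 6 |   | 0110   | 6          |
|   | 5 | 6 | 8 | 0111   | 7          |
| 2 |   |   |   | 1000   | 8          |
| 2 |   |   | 8 | 1001   | 9          |
| 2 |   | 6 |   | 1010   | 10         |
| 2 |   | 6 | 8 | 1011   | 11         |
| 2 | 5 |   |   | 1100   | 12         |
| 2 | 5 |   | 8 | 1101   | 13         |
| 2 | 5 | 6 |   | 1110   | 14         |
| 2 | 5 | 6 | 8 | 1111   | 15         |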
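The binary-counting idea can be sketched in a few lines. This is an illustrative Ruby version, not SQL; the helper name `subsets_with_totals` and the hash keys are hypothetical. It also adds the count and sum columns the question asks for, assuming the leftmost bit maps to the first value as in the answer's table.

```ruby
# Enumerate every non-empty combination by counting from 1 to 2**n - 1.
def subsets_with_totals(values)
  n = values.size
  (1...(1 << n)).map do |mask|
    bits = mask.to_s(2).rjust(n, "0")
    # The leftmost bit corresponds to the first value; nil marks an empty cell.
    columns = values.each_with_index.map { |v, i| bits[i] == "1" ? v : nil }
    chosen = columns.compact
    { row: mask, binary: bits, columns: columns,
      count: chosen.size, sum: chosen.sum }
  end
end

subsets_with_totals([2, 5, 6, 8]).each do |r|
  cells = r[:columns].map { |c| c.nil? ? " " : c.to_s }.join(" | ")
  puts "| #{cells} | #{r[:binary]} | count=#{r[:count]} | sum=#{r[:sum]} |"
end
```

A set of n values yields 2**n - 1 rows, so this approach is only practical for small sets.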

LINQ code that counts employees by gender in each position, grouped by department, and places the result in a matrix table

I'd like to ask how to write LINQ code that can fill my HTML table.
Please look at my tables below.
Table: EMP (note: my "Male" column is boolean)
+----+------+--------+---------+--------+
| id | Male | JS_REF | DEPT_ID | POS_ID |
+----+------+--------+---------+--------+
| 1  | 1    | 1      | 2       | 3      |
| 2  | 0    | 2      | 2       | 3      |
| 3  | 1    | 3      | 1       | 2      |
| 4  | 1    | 2      | 4       | 2      |
| 5  | 1    | 1      | 5       | 5      |
| 6  | 0    | 4      | 6       | 1      |
| 7  | 1    | 1      | 1       | 1      |
| 8  | 0    | 2      | 2       | 3      |
+----+------+--------+---------+--------+
Table:JOB_STATUS
+----+--------+--------------+
| id | JS_REF | JS_TITLE     |
+----+--------+--------------+
| 1  | 1      | Undefined    |
| 2  | 2      | Regular      |
| 3  | 3      | Contructual  |
| 4  | 4      | Probationary |
+----+--------+--------------+
Table:DEPTS
+----+---------+------------+
| id | DEPT_ID | DEPT_NAME  |
+----+---------+------------+
| 1  | 1       | Admin      |
| 2  | 2       | Accounting |
| 3  | 3       | Eginnering |
| 4  | 4       | HR         |
+----+---------+------------+
Table: POSITIONS
+----+--------+------------+
| id | POS_ID | DEPT_NAME  |
+----+--------+------------+
| 1  | 1      | Clerk      |
| 2  | 2      | Accountant |
| 3  | 3      | Bookeeper  |
| 4  | 4      | Assistant  |
| 5  | 5      | Mechanic   |
| 6  | 6      | Staff      |
+----+--------+------------+
I've made a static table showing what the outcome of the LINQ code should be.
Here's the picture:
Here's what I've tried so far:
SELECT TB.DEPT_NAME, TB.JS_TITLE, TB.Male, TB.Female, (TB.Male + TB.Female) AS 'Total Employees'
FROM (
    SELECT JS_TITLE, DEPT_NAME,
           SUM(CASE WHEN MALE = 1 THEN 1 ELSE 0 END) AS Male,
           SUM(CASE WHEN MALE = 0 THEN 1 ELSE 0 END) AS Female
    FROM EMP
    LEFT JOIN JOB_STATUS ON JOB_STATUS.JS_REF = EMP.JS_REF
    LEFT JOIN DEPTS ON DEPTS.DEPT_ID = EMP.DEPT_ID
    GROUP BY JS_TITLE, DEPT_NAME
) AS TB
ORDER BY CASE WHEN TB.MALE IS NULL THEN 1 ELSE 0 END
I'm stuck on this part, so any help or tips on how to implement it would be appreciated.
101 is the total count for male and 23 for female. (The values were just copied and pasted, which is why they are the same.)
(Actual data result)

Fast algorithm for simple data group

There are several billion rows like this:
id | type | groupId
---+------+--------
1 | a |
1 | b |
2 | a |
2 | c |
1 | a |
2 | d |
2 | a |
1 | e |
5 | a |
1 | f |
4 | a |
1 | b |
4 | a |
1 | t |
8 | a |
3 | c |
6 | a |
I need to add a groupId to this data: if two rows have the same id or the same type, they belong to the same group. The result should look like this:
id | type | group
---+------+--------
1 | a | 1
1 | b | 1
2 | a | 1
2 | c | 1
1 | a | 1
2 | d | 1
2 | a | 1
1 | e | 1
5 | a | 1
1 | f | 1
4 | a | 1
1 | b | 1
4 | a | 1
7 | t | 2
8 | g | 3
3 | c | 1
6 | a | 1
I tried to do this with a loop, but it is very inefficient: the server would need weeks to finish.
This is a classic case for a union-find (quick-union) algorithm, using weighting and path compression.
Computational limits:
Time complexity for grouping N rows: O(N log* N), where log* N is the number of times you need to take the lg of a number until it reaches 1, e.g. log* 10^100 = 5.
Space complexity: O(N)
Read more on this algorithm:
https://www.youtube.com/watch?v=MaNCMWhYIHo ,
https://www.cs.princeton.edu/~rs/AlgsDS07/01UnionFind.pdf
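A minimal sketch of how union-find could be applied to this grouping, written in Ruby for illustration. The class and method names (`UnionFind`, `assign_groups`) are hypothetical. It assumes every id and every type is treated as a node and each row links its id node to its type node, so rows sharing an id or a type end up under the same root. An in-memory version like this is only meant to show the idea; billions of rows would need a streaming or partitioned variant.

```ruby
# Weighted quick-union with path compression.
class UnionFind
  def initialize
    @parent = {}
    @size = Hash.new(1)
  end

  def find(x)
    @parent[x] ||= x
    root = x
    root = @parent[root] until @parent[root] == root
    # Path compression: point every node on the path straight at the root.
    while @parent[x] != root
      nxt = @parent[x]
      @parent[x] = root
      x = nxt
    end
    root
  end

  def union(a, b)
    ra = find(a)
    rb = find(b)
    return if ra == rb
    ra, rb = rb, ra if @size[ra] < @size[rb]  # attach the smaller tree
    @parent[rb] = ra
    @size[ra] += @size[rb]
  end
end

# rows is an array of [id, type]; returns [id, type, group] triples.
def assign_groups(rows)
  uf = UnionFind.new
  rows.each { |id, type| uf.union([:id, id], [:type, type]) }
  labels = {}
  rows.map do |id, type|
    root = uf.find([:id, id])
    labels[root] ||= labels.size + 1  # number groups in order of appearance
    [id, type, labels[root]]
  end
end

p assign_groups([[1, "a"], [1, "b"], [2, "a"], [7, "t"], [8, "g"], [3, "c"], [2, "c"]])
```

Two passes are used: the first builds the links, the second reads the final root for each row, because a later row can merge groups that looked separate earlier.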

How to build pivot table through linq using c#

I have got this datatable in my c# code:
Date | Employee | Job1 | Job2 | Job3 |
---------|----------|------|------|-------|
1/1/2012 | A | 1.00 | 1 | 1 |
1/1/2012 | B | 2.5 | 2 | 2 |
1/1/2012 | C | 2.89 | 1 | 4 |
1/1/2012 | D | 4.11 | 2 | 1 |
1/2/2012 | A | 3 | 2 | 5 |
1/2/2012 | B | 2 | 2 | 2 |
1/2/2012 | C | 3 | 3 | 3 |
1/2/2012 | D | 1 | 1 | 1 |
1/3/2012 | A | 5 | 5 | 5 |
1/3/2012 | B | 2 | 2 | 6 |
1/3/2012 | C | 1 | 1 | 1 |
1/3/2012 | D | 2 | 3 | 4 |
2/1/2012 | A | 2 | 2 | 2 |
2/1/2012 | B | 5 | 5 | 2 |
2/1/2012 | D | 2 | 2 | 2 |
2/2/2012 | A | 3 | 3 | 3 |
2/2/2012 | B | 2 | 3 | 3 |
3/1/2012 | A | 4 | 4 | 2 |
Now I want to create another DataTable which would look like this:
Job1
Employee | 1/1/2012 | 1/2/2012 | 1/3/2012 | 2/1/2012 | 2/2/2012 |
---------|----------|----------|----------|----------|----------|
A | 1.00 | 3 | 5 | 2 | 3 |
B | 2.50 | 2 | 2 | 5 | 2 |
C | 2.89 | 3 | 1 | - | |
D | 4.11 | 1 | 2 | 2 | |
Total | 10.50 | 9 | 10 | 9 | 5 |
Please suggest how to make this pivot table using Linq and C#.
var query = from foo in db.Foos
            group foo by foo.Date into g
            select new {
                Date = g.Key,
                A = g.Where(x => x.Employee == "A").Sum(x => x.Job1),
                B = g.Where(x => x.Employee == "B").Sum(x => x.Job1),
                C = g.Where(x => x.Employee == "C").Sum(x => x.Job1),
                D = g.Where(x => x.Employee == "D").Sum(x => x.Job1),
                Total = g.Sum(x => x.Job1)
            };
You can also apply OrderBy(x => x.Date) to query.
You won't be able to do this with LINQ due to the dynamic nature of the columns. LINQ to SQL needs a static way to map result set fields to property values. Instead, you can look into the PIVOT SQL statement and fill the results into a DataTable:
http://msdn.microsoft.com/en-us/library/ms177410(v=sql.105).aspx
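When the data is already in memory, the dynamic-column problem can also be sidestepped by keying the pivot on the date at run time instead of mapping to fixed properties. The sketch below illustrates that idea in Ruby rather than C#. The helper `pivot_job1` and the row hashes are hypothetical, and only the first two dates from the question are used.

```ruby
# Pivot Job1 values into employee => { date => value }, plus per-date totals.
def pivot_job1(rows)
  dates = rows.map { |r| r[:date] }.uniq         # column order = first appearance
  table = Hash.new { |h, k| h[k] = Hash.new(0) } # nested hash, cells default to 0
  totals = Hash.new(0)
  rows.each do |r|
    table[r[:employee]][r[:date]] += r[:job1]
    totals[r[:date]] += r[:job1]
  end
  { dates: dates, by_employee: table, totals: totals }
end

rows = [
  { date: "1/1/2012", employee: "A", job1: 1.00 },
  { date: "1/1/2012", employee: "B", job1: 2.5 },
  { date: "1/1/2012", employee: "C", job1: 2.89 },
  { date: "1/1/2012", employee: "D", job1: 4.11 },
  { date: "1/2/2012", employee: "A", job1: 3 },
  { date: "1/2/2012", employee: "B", job1: 2 },
  { date: "1/2/2012", employee: "C", job1: 3 },
  { date: "1/2/2012", employee: "D", job1: 1 }
]

result = pivot_job1(rows)
result[:by_employee].each do |emp, cells|
  puts "#{emp} | " + result[:dates].map { |d| cells[d] }.join(" | ")
end
puts "Total | " + result[:dates].map { |d| result[:totals][d].round(2) }.join(" | ")
```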
