Can I calculate TP, TN, FPR and FNR in multiclass - performance

If I classify data into 5 classes, I get a 5 x 5 confusion matrix, but I do not know how to calculate these metrics from it:
4822 18 9 0 40
0 1106 0 0 0
0 2 1990 0 0
0 0 1 2000 0
0 0 0 0 12
Can I calculate TP, TN, FPR and FNR in a multiclass problem?
Thank you!

You can calculate these values per class and then aggregate them if you wish. When calculating for one class, you treat that class as "positive" and the union of the other classes as "negative". To aggregate into an overall value, I would suggest using the median, which is less sensitive to outliers.

Yes, you can calculate these metrics with the following steps:
1- Collapse your matrix to a 2 x 2 matrix as below
a. suppose your first class is A and all the remaining classes together form B in the new 2 x 2 matrix
b. the new 2 x 2 matrix should look like this
        Predicted Class
        A       B
A       4822    67      // 67 comes from the summation of 18+9+0+40
B       0       5111    // 0 comes from the summation of 0+0+0+0 under 4822
                        // 5111 comes from the summation of the remaining numbers
2- Calculate the TP, TN, FP and FN rates using the equations in this URL page: http://www2.cs.uregina.ca/~dbd/cs831/notes/confusion_matrix/confusion_matrix.html
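The per-class collapse described above can be sketched in a few lines of Python. This is only a minimal sketch, assuming that rows of the confusion matrix are the actual classes and columns are the predicted classes (the question does not say which); the function name per_class_rates is made up for illustration.

```python
def per_class_rates(cm):
    """One-vs-rest TP, FN, FP, TN, FPR and FNR for each class.

    Assumes cm is square, rows = actual class, columns = predicted class.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]                                # class k predicted as k
        fn = sum(cm[k]) - tp                         # class k predicted as something else
        fp = sum(cm[i][k] for i in range(n)) - tp    # other classes predicted as k
        tn = total - tp - fn - fp                    # everything that does not involve k
        results.append({
            "TP": tp, "FN": fn, "FP": fp, "TN": tn,
            "FPR": fp / (fp + tn) if fp + tn else 0.0,
            "FNR": fn / (fn + tp) if fn + tp else 0.0,
        })
    return results


cm = [
    [4822, 18, 9, 0, 40],
    [0, 1106, 0, 0, 0],
    [0, 2, 1990, 0, 0],
    [0, 0, 1, 2000, 0],
    [0, 0, 0, 0, 12],
]
rates = per_class_rates(cm)
for k, r in enumerate(rates):
    print(k, r)
```

For the first class this gives TP = 4822, FN = 67, FP = 0 and TN = 5111, which matches the 2 x 2 matrix above.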

Related

size reduction of matrix whose rank is not full in julia

I have a N×N general matrix H with rank n(<N).
Is there any way to get a n×n matrix with rank n from H?
For example,
    |1 2 3|
H = |4 8 6|
    |0 0 1|
has three eigenvalues 0, 1, 9 and its rank is 2. I want to get a 2×2 matrix with rank 2 that corresponds to the eigenspace spanned by the eigenvectors for the eigenvalues 1 and 9.
We are given a 3x3 matrix H that is known to have rank n < 3:
1 2 3
4 8 6
0 0 1
One can obtain an nxn matrix comprised of the intersection of rows and columns of H that has rank n by computing the reduced row echelon form (RREF) of H (also called the row canonical form).
After doing so, for each of n row indices i there will be a column in the RREF that contains a 1 in row i (i.e., the row having index i) and zeroes in all other rows. It is seen here that the RREF of H is the following.
1 2 0
0 0 1
0 0 0
As column 0 (i.e., the column having index 0) in the RREF has a 1 in row 0 and zeroes in all other rows, and column 2 has a 1 in row 1 and zeroes in all other rows, and no other column has a 1 in one row and zeroes in all other rows, we conclude that:
H has rank 2; and
the nxn matrix comprised of elements in H that are in rows 0 and 1 and columns 0 and 2 has rank n.
Here an nxn matrix with rank n is therefore found to be
1 3
4 6
The same procedure is followed regardless of the size of H (which need not be square) and the rank of H need not be known in advance.
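The RREF procedure can be sketched in Python with exact fractions. This is a minimal sketch, not a definitive implementation: the helper names rref_pivots and full_rank_submatrix are made up, and indices are zero-based as in the answer. One deliberate difference is that, instead of simply taking the first n rows, the sketch picks the rows with a second RREF on the transpose of the selected columns, on the assumption that the first n rows are not guaranteed to be independent in general.

```python
from fractions import Fraction


def rref_pivots(rows):
    """Return (reduced row echelon form, list of pivot column indices)."""
    m = [[Fraction(x) for x in r] for r in rows]
    pivots = []
    r = 0
    n_cols = len(m[0]) if m else 0
    for c in range(n_cols):
        if r == len(m):
            break
        # find a row at or below r with a nonzero entry in column c
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        pivot = m[r][c]
        m[r] = [x / pivot for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return m, pivots


def full_rank_submatrix(h):
    """Pick independent columns, then independent rows among those columns."""
    _, cols = rref_pivots(h)
    selected = [[row[c] for c in cols] for row in h]
    _, rows = rref_pivots([list(col) for col in zip(*selected)])
    return [[h[i][c] for c in cols] for i in rows]


H = [[1, 2, 3],
     [4, 8, 6],
     [0, 0, 1]]
reduced, pivot_cols = rref_pivots(H)
sub = full_rank_submatrix(H)
print(pivot_cols, sub)
```

For the H above, the pivot columns are 0 and 2 and the selected rows are 0 and 1, so the result is the same 2x2 matrix as in the answer.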
Using the RowEchelon.jl package, we can apply the method described in @CarySwoveland's answer pretty easily. (This is not my area of expertise though, so any corrections to it are welcome; specifically, the choice of rows as 1 to number of pivots is an educated guess based on some trials.)
julia> H = [1 2 3
4 8 6
0 0 1];
julia> using RowEchelon
julia> _, pivotcols = rref_with_pivots(H)
([1.0 2.0 0.0; 0.0 0.0 1.0; 0.0 0.0 0.0], [1, 3])
julia> result = H[1:length(pivotcols), pivotcols]
2×2 Matrix{Int64}:
1 3
4 6
The package is just a home for code that used to be in Base Julia, so you can even just copy the code if you don't want to add it as a dependency.

Evaluating the model in WEKA

I have applied a classification algorithm to a dataset and got the stats below:
Correctly Classified Instances 684 76.1693 %
Incorrectly Classified Instances 214 23.8307 %
Kappa statistic 0
Mean absolute error 0.1343
Root mean squared error 0.2582
Relative absolute error 100 %
Root relative squared error 100 %
Total Number of Instances 898
=== Detailed Accuracy By Class ===
                 TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area   Class
                 0         0         0           0        0           0.5        1
                 0         0         0           0        0           0.5        2
                 1         1         0.762       1        0.865       0.5        3
                 0         0         0           0        0           ?          4
                 0         0         0           0        0           0.5        5
                 0         0         0           0        0           0.5        U
Weighted Avg.    0.762     0.762     0.58        0.762    0.659       0.5
=== Confusion Matrix ===
a b c d e f <-- classified as
0 0 8 0 0 0 | a = 1
0 0 99 0 0 0 | b = 2
0 0 684 0 0 0 | c = 3
0 0 0 0 0 0 | d = 4
0 0 67 0 0 0 | e = 5
0 0 40 0 0 0 | f = U
I can understand much of the output; however, since I am new to Weka, I have trouble interpreting the values:
1. Which error rate should I report overall?
2. How can I tell whether there is something interesting about the model?
1) Overall error measure
The triplet Precision, Recall and F-Measure together is reported quite often because each number represents a different aspect of the model.
If you would like to have a single number only, then take Percent (In)correctly Classified Instances or the Weighted Avg. F-Measure.
The other error measures are also useful but they require deeper knowledge of statistics (which I'm lacking :-)
2) Something interesting about the model
From Detailed Accuracy By Class and Confusion Matrix you can see that the model is quite simple: it classifies everything as class 3. The error measures look quite successful, but that is only because 76% of the instances in the dataset belong to class 3. The model corresponds to the often-used baseline algorithm called "most common class".
The ROC area is also useful in terms of evaluating accuracy and interpreting how interesting a model is. Simply speaking, the true positive rate is plotted against the false positive rate and the ROC area is calculated as the area underneath this curve. A high ROC area, say 0.9 to 1, indicates that the model is very good at classifying instances, whereas a ROC area of 0.5 (as in your model) means that the model is no better at classification than a random method like flipping coins.
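To see numerically why this model is no better than a baseline, the accuracy and the kappa statistic can be recomputed from the confusion matrix. This is a minimal sketch, assuming that the "Kappa statistic" WEKA reports is Cohen's kappa (observed agreement corrected for chance agreement); the function name accuracy_and_kappa is made up for illustration.

```python
def accuracy_and_kappa(cm):
    """Accuracy and Cohen's kappa from a square confusion matrix (rows = actual)."""
    n = sum(sum(row) for row in cm)
    diag = sum(cm[i][i] for i in range(len(cm)))
    row_tot = [sum(row) for row in cm]
    col_tot = [sum(row[j] for row in cm) for j in range(len(cm))]
    # n^2 times the agreement expected by chance from the marginals
    chance = sum(r * c for r, c in zip(row_tot, col_tot))
    accuracy = diag / n
    denom = n * n - chance
    kappa = (n * diag - chance) / denom if denom else 0.0
    return accuracy, kappa


weka_cm = [
    [0, 0, 8, 0, 0, 0],
    [0, 0, 99, 0, 0, 0],
    [0, 0, 684, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 67, 0, 0, 0],
    [0, 0, 40, 0, 0, 0],
]
acc, kappa = accuracy_and_kappa(weka_cm)
print(round(100 * acc, 4), kappa)
```

The accuracy is 684/898, about 76.17 %, and kappa comes out as 0, in line with the summary above: all of the apparent accuracy is explained by the class distribution.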

How to Shuffle an Array with Fixed Row/Column Sum?

I need to assign random papers to students of a class, but I have the constraints that:
Each student should have two papers assigned.
Each paper should be assigned to (approximately) the same number of students.
Is there an elegant way to generate a matrix that has this property? i.e. it is shuffled but the row and column sums are constant? As an illustration:
Student A 1 0 0 1 1 0 | 3
Student B 1 0 1 0 0 1 | 3
Student C 0 1 1 0 1 0 | 3
Student D 0 1 0 1 0 1 | 3
          -----------
          2 2 2 2 2 2
I thought of first building an "initial matrix" with the right row/column sums, then randomly permuting first the rows, then the columns, but how do I generate this initial matrix? The problem here is that I'd be choosing between (e.g.) the following alternatives, and the fact that there are two students with the same set of papers assigned (in the left setup) won't change through row/column shuffling:
INITIAL (MA): OR (MB):
A 1 1 1 0 0 0 || 1 1 1 0 0 0
B 1 1 1 0 0 0 || 0 1 1 1 0 0
C 0 0 0 1 1 1 || 0 0 0 1 1 1
D 0 0 0 1 1 1 || 1 0 0 0 1 1
I know I could come up with something quick/dirty and just tweak where necessary but it seemed like a fun exercise.
If you want to proceed by permutations, what about:
Choose a student at random, say student 1
For this student, choose at random a paper he has, say paper A
Choose another student at random, say student 2
For this student, choose at random a paper he has, say paper B (different from A)
Give paper B to student 1 and paper A to student 2.
That way, you preserve both the number of copies of each paper and the number of papers per student. Indeed, both students give one paper and receive one back. Moreover, no paper is created or deleted.
In terms of the table, it means finding two pairs of indices (i1,i2) and (j1,j2) such that A(i1,j1) = 1, A(i2,j2) = 1, A(i1,j2) = 0 and A(i2,j1) = 0, and changing the 0s to 1s and the 1s to 0s. The sums of the rows and columns do not change.
Remark 1: If you do not want to proceed by permutations, you can simply put all the papers in a vector (paper A twice, paper B twice, ...). Then shuffle the vector randomly and give the first k entries to the first student, the next k to student 2, and so on. However, you can end up with a student having the same paper several times. In this case, make some permutations starting with the surplus papers.
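The swap on the table can be sketched as follows. This is a minimal sketch under stated assumptions: the function name swap_shuffle and the number of attempted swaps are made up, and nothing here guarantees that the result is drawn uniformly from all valid matrices.

```python
import random


def swap_shuffle(matrix, n_swaps, rng=None):
    """Randomise a 0/1 matrix while keeping every row sum and column sum fixed."""
    rng = rng or random.Random()
    a = [row[:] for row in matrix]
    n_rows, n_cols = len(a), len(a[0])
    for _ in range(n_swaps):
        i1, i2 = rng.sample(range(n_rows), 2)   # two distinct students
        j1, j2 = rng.sample(range(n_cols), 2)   # two distinct papers
        # only swap when the 2x2 sub-table is [[1, 0], [0, 1]]
        if a[i1][j1] == 1 and a[i2][j2] == 1 and a[i1][j2] == 0 and a[i2][j1] == 0:
            a[i1][j1] = a[i2][j2] = 0
            a[i1][j2] = a[i2][j1] = 1
    return a


start = [
    [1, 0, 0, 1, 1, 0],
    [1, 0, 1, 0, 0, 1],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 0, 1, 0, 1],
]
shuffled = swap_shuffle(start, 1000, random.Random(0))
for row in shuffled:
    print(row)
```

Attempts that do not find the required pattern are simply skipped, so the number of swaps actually performed is usually smaller than n_swaps.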
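You can generate the initial matrix as follows (Python; each row is a student, each column is a paper):
def initial_rows(n_papers, max_allowed, n_wanted):
    # column_sum[k] counts how many students already have paper k
    column_sum = [0] * n_papers
    rows = []
    for i in range(n_papers):
        for j in range(i + 1, n_papers):
            if len(rows) == n_wanted:
                return rows
            if column_sum[i] < max_allowed and column_sum[j] < max_allowed:
                # row with ones at columns i and j, zeros elsewhere
                rows.append([1 if k in (i, j) else 0 for k in range(n_papers)])
                column_sum[i] += 1
                column_sum[j] += 1
    return rows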
This is a straightforward iteration over all n choose 2 distinct rows, but with the constraint on column sums enforced as early as possible.

3d Hill generating algorithm?

Supposing you have a 3d box of cubes, with each cube having 3 indices: (x,y,z), and 1 additional attribute to specify if it represents land or air.
Let's say that we have a 3d array to represent this box of cubes, with each cube being an element in the 3d array.
The following array, for example, would represent a bowl shaped piece of land:
y=0:
0 0 0 0 0
0 0 0 0 0
1 1 1 1 1
1 1 1 1 1
y=1:
0 0 0 0 0
0 0 0 0 0
1 0 0 0 1
1 1 1 1 1
y=2:
0 0 0 0 0
0 0 0 0 0
1 0 0 0 1
1 1 1 1 1
y=3:
0 0 0 0 0
0 0 0 0 0
1 1 1 1 1
1 1 1 1 1
What is an algorithm such that given a selection box it would generate hills with f frequency and with average height of h, with v average variation in height?
We can assume that the lowest level of the bounding box is the "baseline", or "sea-level".
function makeTrees(double frequency, int height, double variation)
{
//return 3d array.
}
I'm writing a minecraft MCEdit filter plugin :P
Simplest way is to decompose the problem into three parts:
Write a routine to generate the cubes for a single hill of height h. Start off by making this a simple cone (play with apex angles till you find something that looks pleasing)
Generate a set of n heights between h-v and h+v, using the random number generator of your choice
Place n mountains randomly on your cube. It doesn't matter if they intersect - indeed, it will lead to a better-looking range.
However, I'd also suggest abandoning this approach and simply generating a fractal terrain within your bounding cube, then discretizing it. You can play with the parameters of your fractal generator to bound the height and variance.
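The three-part decomposition with cones can be sketched as below. This is a rough sketch under several assumptions that are not in the original: z is taken as the vertical axis, the hill count n_hills stands in for the "frequency", heights are drawn uniformly from h-v to h+v, and the slope parameter plays the role of the apex angle. The function name make_hills is made up.

```python
import random


def make_hills(size_x, size_y, size_z, n_hills, h, v, slope=1.0, rng=None):
    """Return a 3D list world[x][y][z] with 1 = land and 0 = air."""
    rng = rng or random.Random()
    # height map: for each (x, y), the height of the tallest cone covering it
    heights = [[0.0] * size_y for _ in range(size_x)]
    for _ in range(n_hills):
        cx, cy = rng.randrange(size_x), rng.randrange(size_y)
        peak = rng.uniform(h - v, h + v)
        for x in range(size_x):
            for y in range(size_y):
                dist = ((x - cx) ** 2 + (y - cy) ** 2) ** 0.5
                # overlapping cones merge by taking the maximum
                heights[x][y] = max(heights[x][y], peak - slope * dist)
    # discretize: a cube is land if it lies below the height map
    return [[[1 if z < heights[x][y] else 0 for z in range(size_z)]
             for y in range(size_y)]
            for x in range(size_x)]


world = make_hills(8, 8, 6, n_hills=3, h=4, v=1, rng=random.Random(1))
for x in range(8):
    # print the number of land cubes in each column as a crude height map
    print([sum(world[x][y]) for y in range(8)])
```

The same discretization step (land where z is below the height map) would apply unchanged if the height map came from a fractal generator instead of cones.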
Assuming you would like sinusoidal hills of frequency f (or rather, wavenumber f, since "frequency" is usually used for temporal quantities) as a function of radius r = sqrt(x^2+y^2) from the center:
Define a threshold function like this:
Any element (x,y,z) with z < z_m will be land, and the rest will be air.

Sorting a binary 2D matrix?

I'm looking for some pointers here as I don't quite know where to start researching this one.
I have a 2D matrix with 0 or 1 in each cell, such as:
1 2 3 4
A 0 1 1 0
B 1 1 1 0
C 0 1 0 0
D 1 1 0 0
And I'd like to sort it so it is as "upper triangular" as possible, like so:
4 3 1 2
B 0 1 1 1
A 0 1 0 1
D 0 0 1 1
C 0 0 0 1
The rows and columns must remain intact, i.e. elements can't be moved individually and can only be swapped "whole".
I understand that there'll probably be pathological cases where a matrix has multiple possible sorted results (i.e. same shape, but differ in the identity of the "original" rows/columns.)
So, can anyone suggest where I might find some starting points for this? An existing library/algorithm would be great, but I'll settle for knowing the name of the problem I'm trying to solve!
I doubt it's a linear algebra problem as such, and maybe there's some kind of image processing technique that's applicable.
Any other ideas aside, my initial guess is just to write a simple insertion sort on the rows, then the columns and iterate that until it stabilises (and hope that detecting the pathological cases isn't too hard.)
More details: Some more information on what I'm trying to do may help clarify. Each row represents a competitor, each column represents a challenge. Each 1 or 0 represents "success" for the competitor on a particular challenge.
By sorting the matrix so all 1s are in the top-right, I hope to then provide a ranking of the intrinsic difficulty of each challenge and a ranking of the competitors (which will take into account the difficulty of the challenges they succeeded at, not just the number of successes.)
Note on accepted answer: I've accepted Simulated Annealing as "the answer" with the caveat that this question doesn't have a right answer. It seems like a good approach, though I haven't actually managed to come up with a scoring function that works for my problem.
An algorithm based on simulated annealing can handle this sort of thing without too much trouble. It is not great if you have small matrices, which most likely have a fixed solution, but it is great if your matrices get larger and the problem becomes more difficult.
(However, it also fails your desire that insertions can be done incrementally.)
Preliminaries
Devise a performance function that "scores" a matrix - matrices that are closer to your triangleness should get a better score than those that are less triangle-y.
Devise a set of operations that are allowed on the matrix. Your description was a little ambiguous, but if you can swap rows then one op would be SwapRows(a, b). Another could be SwapCols(a, b).
The Annealing loop
I won't give a full exposition here, but the idea is simple. You perform random transformations on the matrix using your operations. You measure how much "better" the matrix is after the operation (using the performance function before and after the operation). Then you decide whether to commit that transformation. You repeat this process a lot.
Deciding whether to commit the transform is the fun part: you need to decide whether to perform that operation or not. Toward the end of the annealing process, you only accept transformations that improved the score of the matrix. But earlier on, in a more chaotic time, you allow transformations that don't improve the score. In the beginning, the algorithm is "hot" and anything goes. Eventually, the algorithm cools and only good transforms are allowed. If you linearly cool the algorithm, then the choice of whether to accept a transformation is:
public bool ShouldAccept(double cost, double temperature, Random random) {
    return Math.Exp(-cost / temperature) > random.NextDouble();
}
You should read the excellent information contained in Numerical Recipes for more information on this algorithm.
Long story short, you should learn some of these general purpose algorithms. Doing so will allow you to solve large classes of problems that are hard to solve analytically.
Scoring algorithm
This is probably the trickiest part. You will want to devise a scorer that guides the annealing process toward your goal. The scorer should be a continuous function that results in larger numbers as the matrix approaches the ideal solution.
How do you measure the "ideal solution" - triangleness? Here is a naive and easy scorer: For every point, you know whether it should be 1 or 0. Add +1 to the score if the matrix is right, -1 if it's wrong. Here's some code so I can be explicit (not tested! please review!)
int Score(Matrix m) {
    var score = 0;
    for (var r = 0; r < m.NumRows; r++) {
        for (var c = 0; c < m.NumCols; c++) {
            var val = m.At(r, c);
            var shouldBe = (c >= r) ? 1 : 0;
            if (val == shouldBe) {
                score++;
            }
            else {
                score--;
            }
        }
    }
    return score;
}
With this scoring algorithm, a random field of 1s and 0s will give a score of 0. An "opposite" triangle will give the most negative score, and the correct solution will give the most positive score. Diffing two scores will give you the cost.
If this scorer doesn't work for you, then you will need to "tune" it until it produces the matrices you want.
This algorithm is based on the premise that tuning this scorer is much simpler than devising the optimal algorithm for sorting the matrix.
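Putting the pieces together, the annealing loop might look like the following Python sketch. It reuses the naive scorer and the acceptance rule described above; everything else is an assumption made for illustration: the function names, the step count, the start and end temperatures, and the choice to keep the best matrix seen so far. Row and column labels are not tracked here, so a real version would need to carry them along with each swap.

```python
import math
import random


def score(m):
    """+1 for every cell that matches the upper-triangular target, -1 otherwise."""
    total = 0
    for r, row in enumerate(m):
        for c, val in enumerate(row):
            should_be = 1 if c >= r else 0
            total += 1 if val == should_be else -1
    return total


def swap_rows(m, a, b):
    m[a], m[b] = m[b], m[a]


def swap_cols(m, a, b):
    for row in m:
        row[a], row[b] = row[b], row[a]


def anneal(matrix, steps=5000, t_start=2.0, t_end=0.01, rng=None):
    rng = rng or random.Random()
    m = [row[:] for row in matrix]
    current = score(m)
    best, best_score = [row[:] for row in m], current
    for step in range(steps):
        # linear cooling from t_start towards t_end (stays above zero)
        temperature = t_start + (t_end - t_start) * step / steps
        op = swap_rows if rng.random() < 0.5 else swap_cols
        size = len(m) if op is swap_rows else len(m[0])
        a, b = rng.sample(range(size), 2)
        op(m, a, b)
        new = score(m)
        cost = current - new  # positive cost means the matrix got worse
        if cost <= 0 or math.exp(-cost / temperature) > rng.random():
            current = new
            if current > best_score:
                best, best_score = [row[:] for row in m], current
        else:
            op(m, a, b)  # reject: undo the swap
    return best, best_score


start = [[0, 1, 1, 0],
         [1, 1, 1, 0],
         [0, 1, 0, 0],
         [1, 1, 0, 0]]
best, best_score = anneal(start, rng=random.Random(0))
print(score(start), best_score)
for row in best:
    print(row)
```

Because the best matrix is remembered separately, the returned score is never lower than the score of the starting matrix, even if the walk ends in a worse state.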
I came up with the below algorithm, and it seems to work correctly.
Phase 1: move rows with most 1s up and columns with most 1s right.
First the rows. Sort the rows by counting their 1s. We don't care if 2 rows have the same number of 1s.
Now the columns. Sort the cols by counting their 1s. We don't care if 2 cols have the same number of 1s.
Phase 2: repeat phase 1 but with extra criteria, so that we approach the triangular shape.
Criterion for rows: if 2 rows have the same number of 1s, we move up the row that begins with fewer 0s.
Criterion for cols: if 2 cols have the same number of 1s, we move right the col that has fewer 0s at the bottom.
Example:
Phase 1
  1 2 3 4                     1 2 3 4                     4 1 3 2
A 0 1 1 0                   B 1 1 1 0                   B 0 1 1 1
B 1 1 1 0  - sort rows ->   A 0 1 1 0  - sort cols ->   A 0 0 1 1
C 0 1 0 0                   D 1 1 0 0                   D 0 1 0 1
D 1 1 0 0                   C 0 1 0 0                   C 0 0 0 1
Phase 2
  4 1 3 2                     4 1 3 2
B 0 1 1 1                   B 0 1 1 1
A 0 0 1 1  - sort rows ->   D 0 1 0 1  - sort cols ->   "completed"
D 0 1 0 1                   A 0 0 1 1
C 0 0 0 1                   C 0 0 0 1
Edit: it turns out that my algorithm doesn't always give proper triangular matrices.
For example:
Phase 1
  1 2 3 4                     1 2 3 4
A 1 0 0 0                   B 0 1 1 1
B 0 1 1 1  - sort rows ->   C 0 0 1 1  - sort cols ->   "completed"
C 0 0 1 1                   A 1 0 0 0
D 0 0 0 1                   D 0 0 0 1
Phase 2
  1 2 3 4                     1 2 3 4                     2 1 3 4
B 0 1 1 1                   B 0 1 1 1                   B 1 0 1 1
C 0 0 1 1  - sort rows ->   C 0 0 1 1  - sort cols ->   C 0 0 1 1
A 1 0 0 0                   A 1 0 0 0                   A 0 1 0 0
D 0 0 0 1                   D 0 0 0 1                   D 0 0 0 1
                            (no change)
(*) Perhaps a phase 3 would improve the results. In that phase we place the rows that start with fewer 0s at the top.
Look for a 1987 paper by Anna Lubiw on "Doubly Lexical Orderings of Matrices".
There is a citation below. The ordering is not identical to what you are looking for, but is pretty close. If nothing else, you should be able to get a pretty good idea from there.
http://dl.acm.org/citation.cfm?id=33385
Here's a starting point:
Convert each row from binary bits into a number
Sort the numbers in descending order.
Then convert each row back to binary.
Basic algorithm:
Determine the row sums and store values. Determine the column sums and store values.
Sort the row sums in ascending order. Sort the column sums in ascending order.
Hopefully, you should have a matrix with as close to an upper-right triangular region as possible.
Treat rows as binary numbers, with the leftmost column as the most significant bit, and sort them in descending order, top to bottom
Treat the columns as binary numbers with the bottommost row as the most significant bit and sort them in ascending order, left to right.
Repeat until you reach a fixed point. Proof that the algorithm terminates is left as an exercise for the reader.
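The repeated row/column sort just described can be sketched in Python. This is a minimal sketch: the function name lexical_sort is made up, and the max_iters cap is an added safety net, since termination is only asserted above, not shown. Comparing rows as tuples is equivalent to comparing them as binary numbers with the first element as the most significant bit.

```python
def lexical_sort(matrix, row_ids, col_ids, max_iters=100):
    """Alternate row and column sorts until neither changes the matrix."""
    m = [row[:] for row in matrix]
    rows, cols = list(row_ids), list(col_ids)
    for _ in range(max_iters):
        # rows: leftmost column is most significant, descending from top to bottom
        r_order = sorted(range(len(m)), key=lambda r: tuple(m[r]), reverse=True)
        m = [m[r] for r in r_order]
        rows = [rows[r] for r in r_order]
        # columns: bottommost row is most significant, ascending from left to right
        n_cols = len(m[0])
        c_order = sorted(
            range(n_cols),
            key=lambda c: tuple(m[r][c] for r in reversed(range(len(m)))),
        )
        m = [[row[c] for c in c_order] for row in m]
        cols = [cols[c] for c in c_order]
        # fixed point: both sorts left everything in place
        if r_order == list(range(len(m))) and c_order == list(range(n_cols)):
            break
    return m, rows, cols


matrix = [[0, 1, 1, 0],
          [1, 1, 1, 0],
          [0, 1, 0, 0],
          [1, 1, 0, 0]]
sorted_m, row_order, col_order = lexical_sort(matrix, ["A", "B", "C", "D"], [1, 2, 3, 4])
print(row_order, col_order)
for row in sorted_m:
    print(row)
```

On the example matrix from the question this settles on row order B, D, A, C and column order 4, 1, 3, 2.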

Resources