Ceres Solver Evaluate(): Jacobian and residual evaluated only once ("Successful steps: 1")

I use Evaluate() to build the Jacobian. The problem shows up once the optimization iterations finish: only iteration 0 computes the Jacobian, the remaining iterations do not, and the initial cost equals the final cost. Only one of the iterative steps succeeds and the rest fail. I don't know what is causing this.
summary.BriefReport()
iter      cost    cost_change  |gradient|   |step|   tr_ratio  tr_radius
   0   1.2e+05         0        6.9e+04     0         0.        1e+4
   1   4.4e+05      -3.2e+05    0           3.83     -2.64      5e+3
   2   4.4e+05      -3.2e+05    0           3.83     -2.64      2.5e+3
   3   4.4e+05      -3.2e+05    0           3.83     -2.64      1.25e+3
   4   4.4e+05      -3.2e+05    0           3.83     -2.64      6.25e+2
   5   4.4e+05      -3.2e+05    0           3.31     -2.70      3.12e+2
   6   4.6e+05      -3.4e+05    0           3.13     -2.84      1.56e+2
   7   2.8e+05      -1.6e+05    0           1.93     -1.74      7.81e+1
   8   1.9e+05      -7.3e+04    0           9.7e-1   -1.30      3.91e+1
   9   1.5e+05      -3.4e+04    0           3.8e-1   -1.14      1.95e+1
  10   ...
summary.FullReport()
Solver Summary (v 1.14.0-eigen-(3.2.9)-lapack-suitesparse-(5.7.1)-cxsparse-(3.2.0)-eigensparse-openmp-no_tbb)
                                     Original                   Reduced
Parameter blocks                           15                        15
Parameters                                564                       564
Effective parameters                      561                       561
Residual blocks                             6                         6
Residuals                                  80                        80

Minimizer                        TRUST_REGION
Sparse linear algebra library    SUITE_SPARSE
Trust region strategy     LEVENBERG_MARQUARDT

                                        Given                      Used
Linear solver          SPARSE_NORMAL_CHOLESKY    SPARSE_NORMAL_CHOLESKY
Threads                                     1                         1
Linear solver ordering              AUTOMATIC                        15

Cost:
Initial                          3.682558e+04
Final                            3.682558e+04
Change                           0.000000e+00

Minimizer iterations                       13
Successful steps                            1
Unsuccessful steps                         12

Time (in seconds):
Preprocessor                         0.000045
Residual only evaluation             0.030806 (13)
Jacobian & residual evaluation       0.004554 (1)
Linear solver                        0.181772 (13)
Minimizer                            0.217993
Postprocessor                        0.000007
Total                                0.218046
Termination: CONVERGENCE (Function tolerance reached. |cost_change|/cost: 0.000000e+00 <= 1.000000e-16)

Related

H2o: Is there a way to fix threshold in H2ORandomForestEstimator performance during training and testing?

I have built a model with H2ORandomForestEstimator and the results show something like what's below.
The threshold keeps changing (0.5 from training and 0.313725489027 from validation) and I would like to fix the threshold in H2ORandomForestEstimator so I can compare runs during fine-tuning. Is there a way to set the threshold?
From http://h2o-release.s3.amazonaws.com/h2o/master/3484/docs-website/h2o-py/docs/modeling.html#h2orandomforestestimator, there is no such parameter.
If there is no way to set this, how do we know what threshold our model is built on?
rf_v1
** Reported on train data. **
MSE: 2.75013548238e-05
RMSE: 0.00524417341664
LogLoss:0.000494320913199
Mean Per-Class Error: 0.0188802936476
AUC: 0.974221763605
Gini: 0.948443527211
Confusion Matrix (Act/Pred) for max f1 # threshold = 0.5:
            0       1   Error   Rate
-----  ------  ------  ------  --------------
0      161692       1       0  (1.0/161693.0)
1           3      50  0.0566  (3.0/53.0)
Total  161695      51       0  (4.0/161746.0)
Maximum Metrics: Maximum metrics at their respective thresholds
metric threshold value idx
--------------------------- ----------- -------- -----
max f1 0.5 0.961538 19
max f2 0.25 0.955056 21
max f0point5 0.571429 0.983936 18
max accuracy 0.571429 0.999975 18
max precision 1 1 0
max recall 0 1 69
max specificity 1 1 0
max absolute_mcc 0.5 0.961704 19
max min_per_class_accuracy 0.25 0.962264 21
max mean_per_class_accuracy 0.25 0.98112 21
Gains/Lift Table: Avg response rate: 0.03 %
** Reported on validation data. **
MSE: 1.00535766226e-05
RMSE: 0.00317073755183
LogLoss: 4.53885183426e-05
Mean Per-Class Error: 0.0
AUC: 1.0
Gini: 1.0
Confusion Matrix (Act/Pred) for max f1 # threshold = 0.313725489027:
           0     1   Error   Rate
-----  -----  ----  ------  -------------
0      53715     0       0  (0.0/53715.0)
1          0    16       0  (0.0/16.0)
Total  53715    16       0  (0.0/53731.0)
Maximum Metrics: Maximum metrics at their respective thresholds
metric threshold value idx
--------------------------- ----------- ------- -----
max f1 0.313725 1 5
max f2 0.313725 1 5
max f0point5 0.313725 1 5
max accuracy 0.313725 1 5
max precision 1 1 0
max recall 0.313725 1 5
max specificity 1 1 0
max absolute_mcc 0.313725 1 5
max min_per_class_accuracy 0.313725 1 5
max mean_per_class_accuracy 0.313725 1 5
The reported threshold is the one that maximizes F1 (max-F1).
If you want to apply your own threshold, you will have to take the probability of the positive class and compare it yourself to produce the label you want.
If you use your web browser to connect to the H2O Flow Web UI inside of H2O-3, you can mouse over the ROC curve and visually browse the confusion matrix for each threshold, which is convenient.
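For illustration, here is a minimal Python sketch of that "compare it yourself" suggestion. It assumes a trained binomial model named rf_v1 (as in the output above), an H2OFrame called valid holding the data to score (a hypothetical name), and that the positive-class probability column of the prediction frame is "p1", which is how it appears for 0/1 labels; my_threshold is whatever fixed cutoff you want to hold constant across runs.

# Score the frame; the returned H2OFrame has columns like "predict", "p0", "p1".
preds = rf_v1.predict(valid)

# Pull the positive-class probability into pandas and apply a fixed threshold
# instead of the max-F1 threshold H2O picks automatically.
my_threshold = 0.5                               # hypothetical fixed cutoff
p1 = preds["p1"].as_data_frame()["p1"]
labels = (p1 >= my_threshold).astype(int)        # your own 0/1 labels

From these labels you can build your own confusion matrix at that threshold and compare models on an equal footing.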

Evaluating the model in WEKA

I have applied a classification algorithm to a dataset and got the statistics below:
Correctly Classified Instances 684 76.1693 %
Incorrectly Classified Instances 214 23.8307 %
Kappa statistic 0
Mean absolute error 0.1343
Root mean squared error 0.2582
Relative absolute error 100 %
Root relative squared error 100 %
Total Number of Instances 898
=== Detailed Accuracy By Class ===

               TP Rate  FP Rate  Precision  Recall  F-Measure  ROC Area  Class
               0        0        0          0       0          0.5       1
               0        0        0          0       0          0.5       2
               1        1        0.762      1       0.865      0.5       3
               0        0        0          0       0          ?         4
               0        0        0          0       0          0.5       5
               0        0        0          0       0          0.5       U
Weighted Avg.  0.762    0.762    0.58       0.762   0.659      0.5

=== Confusion Matrix ===

   a   b    c   d   e   f   <-- classified as
   0   0    8   0   0   0 |  a = 1
   0   0   99   0   0   0 |  b = 2
   0   0  684   0   0   0 |  c = 3
   0   0    0   0   0   0 |  d = 4
   0   0   67   0   0   0 |  e = 5
   0   0   40   0   0   0 |  f = U
I can understand much of the data; however, I have trouble interpreting the values since I am new to Weka:
1. Which error rate should I report overall?
2. How do I tell whether there is something interesting about the model?
1) Overall error measure
The triplet Precision, Recall and F-Measure together is reported quite often because each number represents a different aspect of the model.
If you would like a single number only, then take the percentage of (in)correctly classified instances or the weighted-average F-Measure.
The other error measures are also useful, but they require deeper knowledge of statistics (which I'm lacking :-))
2) Something interesting about the model
From the Detailed Accuracy By Class section and the Confusion Matrix you can see that the model is quite simple: it classifies everything as class 3. The error measures look quite good, but only because 76% of the instances in the dataset belong to class 3. The model corresponds to the commonly used baseline of predicting the most common class.
The ROC area is also useful in terms of evaluating accuracy and interpreting how interesting a model is. Simply speaking, the true positive rate is plotted against the false positive rate and the ROC area is calculated as the area underneath this curve. A high ROC area, say 0.9 to 1, indicates that the model is very good at classifying instances, whereas a ROC area of 0.5 (as in your model) means that the model is no better at classification than a random method like flipping coins.
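To make the arithmetic concrete, here is a small Python sketch (plain code, not WEKA) that recomputes the per-class precision, recall and F-measure and their weighted averages from the confusion matrix above, assuming rows are actual classes and columns are predicted classes in the order 1, 2, 3, 4, 5, U; it reproduces the 0.762 / 0.58 / 0.659 weighted averages from the report.

# Confusion matrix from the question: rows = actual class, columns = predicted class.
cm = [
    [0, 0,   8, 0, 0, 0],
    [0, 0,  99, 0, 0, 0],
    [0, 0, 684, 0, 0, 0],
    [0, 0,   0, 0, 0, 0],
    [0, 0,  67, 0, 0, 0],
    [0, 0,  40, 0, 0, 0],
]

n = len(cm)
total = sum(sum(row) for row in cm)

w_prec = w_rec = w_f1 = 0.0
for c in range(n):
    tp = cm[c][c]
    actual = sum(cm[c])                            # instances truly in class c
    predicted = sum(cm[r][c] for r in range(n))    # instances labelled as class c
    precision = tp / predicted if predicted else 0.0
    recall = tp / actual if actual else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    weight = actual / total                        # class weight for the weighted average
    w_prec += weight * precision
    w_rec += weight * recall
    w_f1 += weight * f1

print(round(w_rec, 3), round(w_prec, 3), round(w_f1, 3))   # 0.762 0.58 0.659

The accuracy (684/898 = 76.1693 %) is exactly the share of class-3 instances, which is another way to see that the classifier is doing nothing beyond the majority-class baseline.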

Can I calculate TP, TN, FPR and FNR in a multiclass problem?

If I classify data into 5 classes, I get the 5-class confusion matrix below, but I cannot calculate these metrics from it:
4822 18 9 0 40
0 1106 0 0 0
0 2 1990 0 0
0 0 1 2000 0
0 0 0 0 12
Can I calculate TP, TN, FPR and FNR in a multiclass problem?
Thank you!
You can calculate these values per class and then aggregate them if you wish. In the calculation for one class, treat that class as "true" and the union of the other classes as "false". To aggregate into an overall value, I would suggest using the median, which is less sensitive to outliers.
Yes, you can calculate these metrics using the following steps:
1- Convert your matrix to a 2 x 2 matrix as below:
   a. Suppose your first class is A and the second class is B for the new 2 x 2 matrix.
   b. The new 2 x 2 matrix should look like this:

            Predicted Class
              A        B
      A    4822       67    // 67 = 18 + 9 + 0 + 40 (the rest of the first row)
      B       0     5111    // 0  = 0 + 0 + 0 + 0 (the rest of the first column, under 4822)
                            // 5111 = sum of the remaining numbers
2- Calculate the TP, TN, FP and FN rates using the equations on this page: http://www2.cs.uregina.ca/~dbd/cs831/notes/confusion_matrix/confusion_matrix.html
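As a minimal Python sketch of this one-vs-rest reduction, applied to the 5 x 5 matrix from the question (rows = actual, columns = predicted): for the first class it reproduces the 4822 / 67 / 0 / 5111 split above, and the per-class FPR/FNR can then be aggregated, e.g. with the median as the first answer suggests.

from statistics import median

# Confusion matrix from the question: rows = actual class, columns = predicted class.
cm = [
    [4822,   18,    9,    0,   40],
    [   0, 1106,    0,    0,    0],
    [   0,    2, 1990,    0,    0],
    [   0,    0,    1, 2000,    0],
    [   0,    0,    0,    0,   12],
]

total = sum(sum(row) for row in cm)
fprs, fnrs = [], []

for c in range(len(cm)):
    tp = cm[c][c]
    fn = sum(cm[c]) - tp                               # actually c, predicted something else
    fp = sum(cm[r][c] for r in range(len(cm))) - tp    # predicted c, actually something else
    tn = total - tp - fn - fp                          # everything else
    fpr = fp / (fp + tn) if fp + tn else 0.0
    fnr = fn / (fn + tp) if fn + tp else 0.0
    fprs.append(fpr)
    fnrs.append(fnr)
    print("class", c, "TP", tp, "TN", tn, "FP", fp, "FN", fn,
          "FPR", round(fpr, 4), "FNR", round(fnr, 4))

print("median FPR:", median(fprs), "median FNR:", median(fnrs))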

Which heuristic cost is correct? Why is mine incorrect? Finding the optimal path in a graph

I have a simple graph and I need to find heuristic costs for it.
The graph is (adjacency-matrix representation):
0 1 2 0 0 0 0
1 0 0 3 3 2 0
3 0 0 2 0 0 0
0 3 1 0 1 0 0
0 3 0 1 0 6 0
0 2 0 0 6 0 2
0 0 0 0 0 2 0
[Image: the graph, with each vertex's heuristic cost for the current goal shown in brackets; the green vertex is the start and the red vertex is the goal.]
I created this heuristic-cost matrix:
0 2 6 3 1 9 5
9 0 2 4 6 4 1
1 3 0 5 2 9 4
3 1 5 0 1 7 8
0 6 2 1 0 10 14
2 1 6 3 7 0 5
1 4 3 2 1 3 0
To explain this matrix: for example, if the goal vertex is 7, we use the 7th row of the matrix; the value in column 1 is the heuristic cost from vertex 1 to vertex 7, the value in column 5 is the heuristic cost from vertex 5 to vertex 7, and so on. If 5 is the goal, we work with row 5, etc.
These heuristic costs are based on nothing; I don't know how to find good heuristic costs. That is the question.
To summarize:
Firstly, my algorithm found the wrong path (probably because of the wrong heuristics). It found 1-3-4-5 (length 5), but the best is 1-2-5 (length 4).
Also, my teacher said that my heuristic costs prevent the algorithm from finding a good path rather than helping it. I have trouble translating what he said into English, but it was something like: "your heuristic mustn't overestimate the best path". What does that mean?
So the question is: how do I find good heuristic costs in my case?
I am going to wrap my comments as an answer.
First, note that "overestimate best path" means that your shortest path from some node v to the goal is of length k, but h(v)=k' such that k'>k. In this case, the heuristic is overestimating the length of the path. A heuristic that does it for 1 or more nodes is called "inadmissible", and A* is not guaranteed to find the shortest path with such a heuristic.
An admissible heuristic function (never overestimating) is guaranteed to provide the optimal path for A*.
The simplest admissible heuristic is h(v) = 0 for all v. Note that in this case, A* will actually behave like Dijkstra's Algorithm (which is basically an uninformed A*).
You can find more informative heuristics; one example is to first pre-process the graph and find the shortest unweighted path from each node to the goal. This can be done efficiently with BFS. Denote this unweighted distance from some v to the goal as uwd(v).
Now, you can create a heuristic which is uwd(v) * MIN_WEIGHT, where MIN_WEIGHT is the smallest edge weight in the graph.
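Here is a short Python sketch of that idea, using the adjacency matrix from the question (a nonzero entry is an edge with that weight; its nonzero pattern is symmetric, so the BFS treats the graph as undirected) and vertex 7 as the goal, as in the question's explanation. Any real path from v needs at least uwd(v) edges, each of weight at least MIN_WEIGHT, so h(v) = uwd(v) * MIN_WEIGHT never overestimates and is admissible.

from collections import deque

# Adjacency matrix from the question; a nonzero entry is an edge with that weight.
W = [
    [0, 1, 2, 0, 0, 0, 0],
    [1, 0, 0, 3, 3, 2, 0],
    [3, 0, 0, 2, 0, 0, 0],
    [0, 3, 1, 0, 1, 0, 0],
    [0, 3, 0, 1, 0, 6, 0],
    [0, 2, 0, 0, 6, 0, 2],
    [0, 0, 0, 0, 0, 2, 0],
]

goal = 6  # vertex 7, 0-based
min_weight = min(w for row in W for w in row if w > 0)

# BFS from the goal over the unweighted graph gives uwd(v),
# the minimum number of edges from v to the goal.
uwd = [None] * len(W)
uwd[goal] = 0
queue = deque([goal])
while queue:
    u = queue.popleft()
    for v in range(len(W)):
        if (W[u][v] > 0 or W[v][u] > 0) and uwd[v] is None:
            uwd[v] = uwd[u] + 1
            queue.append(v)

# Admissible heuristic: never larger than the true weighted distance to the goal.
h = [d * min_weight for d in uwd]
print(h)   # one heuristic value per vertex for this goal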

Looking for failing test case to DP solution to MARTIAN on SPOJ

I am trying to solve the MARTIAN problem on SPOJ
My algorithm is as follows:
Define dp[i][j] = the maximum amount of minerals that can be mined in the rectangle from (0,0) to (i,j).
Use the recurrence
dp[i][j] = max(dp[i-1][j] + total amount of yeyenum in the i-th row up to the j-th column,
               dp[i][j-1] + total amount of bloggium in the j-th column up to the i-th row)
However, such an approach yields a WA (Wrong Answer). Can someone please provide me with a test case where this approach will not work?
I am not looking for the correct algorithm, just a test case where this approach fails, as I've been unable to find the bug myself.
Try this on your code (modified from the example given):
4 4
0 0 10 60
1 3 10 0
4 2 1 3
1 1 20 0
10 0 0 0
1 1 1 10
0 0 5 3
5 10 10 10
0 0
If you start by looking at [4][4], you'll choose Bloggium, because you can get 23 bloggium by going up, and only 22 Yeyenum from going left. However, you're going to miss a huge amount of Yeyenum.
Using your algorithm, you'll get 23 + 22 + 7 + 14 + 10 = 76.
If you choose the large Yeyenum, you'll get 70 + 14 + 10 + 22 = 116 (all Yeyenum, since the bloggium gets blocked).
