I want to plot multiple ROC curves with a matrix of predictions and labels. I have > 100 samples with a matrix of predictions and labels for each sample. The length of the samples is different. How could I get design a single matrix for all the samples and get multiple ROC curves in a single plot? I would appreciate any suggestions. Thanks
Is it possible to use AUC value as a threshold to build the confusion matrix?
I am comparing a boxplot and violin plot in seaborn and the violin plot has a different inner box than the boxplot itself. I would like to use a violin plot to visualize the distribution of the data at the same time, is it possible the violin plot has a different median due to the kernel density estimation? The boxplots have the correct median from the data.
EDIT:
when I set inner='quartiles' the violin plots show the appropriate scale. I would like to use inner='box' though for appearances if I can.
Here are the graphs themselves:
and now the boxplot:
I have a dataset which has 9 attributes out of which 2 are numerical and the rest are categorical.
I wish to plot as many attributes as possible within a scatter plot matrix. D3 examples have shown me scatter plot matrices with a majority of numerical values.
Are there any ways to plot multidimensional categorical data as scatter plots?
IF yes, are there any samples available on the wbe?
I'm learning Machine Learning. I was reading a topic called Linear Regression with one variable and I got confused while understanding Gradient Descent Algorithm.
Suppose we have given a problem with a Training Set such that pair $(x^{(i)},y^{(i)})$ represents (feature/Input Variable, Target/ Output Variable). Our goal is to create a hypothesis function for this training set, Which can do prediction.
Hypothesis Function:
$$h_{\theta}(x)=\theta_0 + \theta_1 x$$
Our target is to choose $(\theta_0,\theta_1)$ to best approximate our $h_{\theta}(x)$ which will predict values on the training set
Cost Function:
$$J(\theta_0,\theta_1)=\frac{1}{2m}\sum\limits_{i=1}^m (h_{\theta}(x^{(i)})-y^{(i)})^2$$
$$J(\theta_0,\theta_1)=\frac{1}{2}\times Mean Squared Error$$
We have to minimize $J(\theta_0,\theta_1)$ to get the values $(\theta_0,\theta_1)$ which we can put in our hypothesis function to minimize it. We can do that by applying Gradient Descent Algorithm on the plot $(\theta_0,\theta_1,J(\theta_0,\theta_1))$.
My question is how we can choose $(\theta_0,\theta_1)$ and plot the curve $(\theta_0,\theta_1,J(\theta_0,\theta_1))$. In the online lecture, I was watching. The instructor told everything but didn't mentioned from where the plot will come.
At each iteration you will have some h_\theta, and you will calculate the value of 1/2n * sum{(h_\theta(x)-y)^2 | for each x in train set}.
At each iteration h_\theta is known, and the values (x,y) for each train set sample is known, so it is easy to calculate the above.
For each iteration, you have a new value for \theta, and you can calculate the new MSE.
The plot itself will have the iteration number on x axis, and MSE on y axis.
As a side note, while you can use gradient descent - there is no reason. This cost function is convex and it has a singular minimum that is well known: $\theta = (X^T*X)^{-1)X^Ty$, where yis the values of train set (1xn dimension for train set of size n), and X is 2xn matrix where each line X_i=(1,x_i).