incorrect confusion matrix plot - seaborn

The top and bottom rows of my confusion matrix are cut in half along the y-axis when I plot it with seaborn from a pandas DataFrame. Why?
This is what I get:
I used the code from "How can I plot a confusion matrix?", with a pandas DataFrame:
import seaborn as sn
import pandas as pd
import matplotlib.pyplot as plt
array = [[13,1,1,0,2,0],
[3,9,6,0,1,0],
[0,0,16,2,0,0],
[0,0,0,13,0,0],
[0,0,0,0,15,0],
[0,0,1,0,0,15]]
df_cm = pd.DataFrame(array, range(6),range(6))
#plt.figure(figsize = (10,7))
sn.set(font_scale=1.4)#for label size
sn.heatmap(df_cm, annot=True,annot_kws={"size": 16})# font size

I solved the problem and I think this post explains why it happens.
Simply put, matplotlib 3.1.1 broke seaborn heatmaps; you can solve it by downgrading to matplotlib 3.1.0, or by upgrading to 3.1.2 or later, where the regression was fixed.

As suggested by sikisis, the following solved my problem:
pip install matplotlib==3.1.0
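If changing the matplotlib version is not an option, a commonly used workaround is to set the y-limits of the heatmap axes by hand. The sketch below reuses the array from the question; creating the axes with plt.subplots() and passing ax= explicitly is my own choice, not part of the original code.

```python
import pandas as pd
import seaborn as sn
import matplotlib.pyplot as plt

array = [[13, 1, 1, 0, 2, 0],
         [3, 9, 6, 0, 1, 0],
         [0, 0, 16, 2, 0, 0],
         [0, 0, 0, 13, 0, 0],
         [0, 0, 0, 0, 15, 0],
         [0, 0, 1, 0, 0, 15]]
df_cm = pd.DataFrame(array, range(6), range(6))

fig, ax = plt.subplots()
sn.heatmap(df_cm, annot=True, annot_kws={"size": 16}, ax=ax)

# Force the y-axis to span all rows (inverted, so row 0 is at the top).
# This restores the half-cut rows on matplotlib 3.1.1 and changes nothing
# on versions where the heatmap is already drawn correctly.
ax.set_ylim(len(df_cm), 0)
# Call plt.show() or fig.savefig(...) to view the result.
```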

Related

What is Seaborn's default figure size?

I can find lots of answers on how to change Seaborn's figure size. I want to know: what is the default figure size?
I believe seaborn uses matplotlib's default parameters when plotting axes, so you can check the figsize with matplotlib.rcParams.
import matplotlib
matplotlib.rcParams['figure.figsize']
# [12.0, 7.0] on this machine; an unmodified matplotlib install reports [6.4, 4.8]
import seaborn as sns
import pandas as pd
df = pd.DataFrame(columns=['x','y'])
ax = sns.scatterplot(data=df, x="x", y="y")
ax.figure.get_size_inches()
# array([12., 7.]), matching the rcParams value above
For the figure-level plotting functions, which create their own figure, the size is instead set by the function's own parameters.
For example, relplot, displot, and similar functions have default parameters of height=5 (inches) and aspect=1, which translates into a figsize of [5,5] for a single subplot/facet, and proportionally more for several cols/rows in the FacetGrid.
pairplot has a default of height=2.5 and jointplot of height=6.
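The arithmetic behind those defaults can be sketched in a few lines. The helper name facet_figsize is hypothetical (it is not a seaborn function); it only reproduces the width = ncols * height * aspect, height = nrows * height rule described above, before any extra room seaborn may add for a legend placed outside the grid.

```python
def facet_figsize(height=5.0, aspect=1.0, nrows=1, ncols=1):
    """Hypothetical helper: approximate figure size (width, height) in inches
    for a grid of facets, each `height` inches tall and `height * aspect` wide."""
    return (ncols * height * aspect, nrows * height)


# Single facet with the relplot/displot defaults.
single = facet_figsize()
# Three columns of facets, e.g. relplot(..., col=<variable with 3 levels>).
three_cols = facet_figsize(ncols=3)
# pairplot default height=2.5 with 4 variables gives a 4x4 grid.
pair_4vars = facet_figsize(height=2.5, nrows=4, ncols=4)

print(single, three_cols, pair_4vars)
```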

Read values from gauge with image processing

I am trying to read fluid levels in a column from a photograph I took, to see if I can automate the data collection.
So far, I have been able to use the code below to identify the outline of where the fluid level is.
import matplotlib.pyplot as plt
import skimage.io
from skimage.filters import roberts
# Jupyter/IPython magic; remove this line when running as a plain script
%matplotlib inline
# Raw string so the backslashes in the Windows path are not read as escapes
perm = skimage.io.imread(r'C:\Users\Spencer\Desktop\perm2_crop.jpg')
# Roberts edge detection on the blue channel (index 2)
edge_roberts = roberts(perm[:, :, 2])
plt.imshow(edge_roberts, cmap=plt.cm.gray)
What I need to figure out now is how to identify the break, and then how to translate that into a data value that I can scale to the values on the column.
Any ideas about what packages or methods I would use to do this? Any examples would also be appreciated.
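One possible approach, sketched here with NumPy only on a synthetic image rather than the real photograph: average each pixel row, find the row where the brightness changes most sharply, and map that row to a reading by linear interpolation between two calibration marks. The function names, the synthetic image, and the calibration values (row 10 = 100, row 90 = 0) are all assumptions made for illustration.

```python
import numpy as np


def find_level_row(gray):
    """Return the index of the first row after the sharpest row-to-row brightness jump."""
    row_profile = gray.mean(axis=1)        # mean brightness of each pixel row
    step = np.abs(np.diff(row_profile))    # change between adjacent rows
    return int(np.argmax(step)) + 1


def row_to_reading(row, row_a, value_a, row_b, value_b):
    """Linearly interpolate a scale value from two known (row, value) marks."""
    return value_a + (row - row_a) * (value_b - value_a) / (row_b - row_a)


# Synthetic 100x20 column: bright (empty) above row 60, dark (fluid) from row 60 down.
image = np.full((100, 20), 200.0)
image[60:, :] = 50.0

level_row = find_level_row(image)
# Hypothetical calibration: pixel row 10 is the 100 mark, pixel row 90 is the 0 mark.
reading = row_to_reading(level_row, 10, 100.0, 90, 0.0)
print(level_row, reading)
```

On a real photo the row profile would be noisy, so smoothing it first, or applying the same idea to the edge image from roberts, is likely to be necessary.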

math.ceil(0.5) returning different value from math.ceil(1/2)

import math
print(math.ceil(0.5))
Returns
1.0
But
import math
print(math.ceil(1/2))
Returns
0.0
What's going on here? Explanation would be nice.
It seems you're running that code with Python 2.x, where / between two ints is integer division (so 1/2 is 0) and you need to cast to float explicitly:
import math
print(math.ceil(0.5))
print(math.ceil(float(1) / float(2)))
If you run Python 3.x you don't need the explicit cast, because / is true division there, and both lines give the same result:
import math
print(math.ceil(0.5))
print(math.ceil(1 / 2))
In Python 2, casting only one of the operands to float is also enough:
import math
print(math.ceil(1/float(2)))
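The difference comes down to which division is performed before math.ceil ever sees the value. A short Python 3 sketch, where // reproduces what Python 2's / does with two ints (the variable names are mine):

```python
import math

floor_quotient = 1 // 2   # integer (floor) division, what Python 2 computes for 1/2
true_quotient = 1 / 2     # true division in Python 3

ceil_of_floor = math.ceil(floor_quotient)
ceil_of_true = math.ceil(true_quotient)

print(floor_quotient, ceil_of_floor)
print(true_quotient, ceil_of_true)
```

Note that math.ceil returns an int in Python 3 and a float in Python 2, which is why the question shows 1.0 and 0.0.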

Cross validation of dataset separated on files

The dataset I have is split across several files. Each file groups samples that are related to each other, i.e., they were created under similar conditions at a similar time.
The balance of the train/test split is important, so all samples from a file must go either to train or to test and cannot be split between them. That makes KFold awkward to use in my scikit-learn code.
Right now, I am using something similar to LOO, doing something like:
train ~> cat ./dataset/!(1.txt)
test ~> cat ./dataset/1.txt
This is not comfortable, and not very useful if I want test folds made of several files and a "real" CV.
How could I set up a good CV to check for real overfitting?
Looking at this answer, I realized that pandas can concatenate dataframes. I checked that the process is 15-20% slower than the cat command line, but it lets me build folds as I expected.
Anyway, I am quite sure there must be a better way than this one:
import glob
import pandas as pd
# sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection instead
from sklearn.model_selection import KFold
allFiles = glob.glob("./dataset/*.txt")
kf = KFold(n_splits=3, shuffle=True)
# Split the list of files, not the rows, so every file stays whole
for train_files, cv_files in kf.split(allFiles):
    dataTrain = pd.concat(pd.read_csv(allFiles[idTrain], header=None) for idTrain in train_files)
    dataTest = pd.concat(pd.read_csv(allFiles[idTest], header=None) for idTest in cv_files)
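An alternative worth considering is scikit-learn's GroupKFold, which keeps all samples that share a group label in the same fold. The sketch below is not the approach from the post: it assumes every file has already been loaded into one array, with a groups vector recording which file each row came from, and it uses toy data in place of the real dataset.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

# Toy stand-in for the real data: 6 "files" with 4 samples each.
n_files, per_file = 6, 4
X = np.arange(n_files * per_file, dtype=float).reshape(-1, 1)
y = np.tile([0, 1], n_files * per_file // 2)
# groups[i] is the index of the file that sample i was read from.
groups = np.repeat(np.arange(n_files), per_file)

gkf = GroupKFold(n_splits=3)
folds = list(gkf.split(X, y, groups=groups))

for train_idx, test_idx in folds:
    # No file index appears on both sides of a split.
    print("test files:", sorted(set(groups[test_idx].tolist())))
```

The same folds object can be passed as cv= to helpers such as cross_val_score, together with the groups argument.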

Matplotlib Python Stealing Screen Focus

My code takes serial data from an Arduino, processes it, and then plots it. I am using matplotlib as the graphics interface. Every time it "draws", though, the plot window forces itself into focus, and the user can't look at anything else. What is the best way to stop this? (The code works fine except for the focus stealing.) I tried matplotlib.use('Agg') after reading about it in another post, but it did not work. I am on Mac OS X.
The code shown below is a very simple graph of updating data that has the same problem. I'm not showing my real code because it is not copy-pastable without the right inputs.
Here is my code:
import matplotlib
from matplotlib import *
from pylab import *
# import math
x=[]
y=[]
def function(iteration):
    xValue=iteration #Assigns current x value
    yValue=(1./iteration)*34 #Assigns current y value
    x.extend([xValue]) #adds the current x value to the x list
    y.extend([yValue]) #adds the current y value to the y list
    clf() #clears the plot
    plot(x,y,color='green') #tells the plot what to do
    draw() #forces a draw
def main():
    for i in range(1,25): #runs my function 24 times (i = 1..24)
        function(i)
        pause(.1)
main()
Have you tried using matplotlib's interactive mode?
You can switch it on with ion() (see the documentation).
In interactive mode you do not need to call draw(), but you might need to clear your figure with clf(), depending on your desired output.
I find that using the TkAgg backend works:
import matplotlib
matplotlib.use('Tkagg')
credit to 457290092
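Another workaround that is often suggested is to avoid pause() altogether, since it re-shows the window on every call, and to drive the GUI event loop directly instead. The sketch below is an assumption-laden rewrite of the toy example above, not the original poster's code; whether it actually stops the focus stealing depends on the backend in use.

```python
import matplotlib.pyplot as plt

plt.ion()                          # interactive mode: figures are displayed without blocking
fig, ax = plt.subplots()
(line,) = ax.plot([], [], color="green")

x, y = [], []
for i in range(1, 25):
    x.append(i)
    y.append((1.0 / i) * 34)
    line.set_data(x, y)            # update the existing line instead of clf() + plot()
    ax.relim()                     # recompute data limits from the new data
    ax.autoscale_view()            # rescale the axes to fit
    fig.canvas.draw_idle()         # schedule a redraw
    fig.canvas.start_event_loop(0.1)   # let the GUI process events for 0.1 s, without show()
```

Updating the line with set_data also avoids rebuilding the whole plot on every iteration.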
