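I can find lots of answers on how to change Seaborn's figure size. I want to know: what is the default figure size?
I believe seaborn uses matplotlib's default parameters when plotting on axes, so you can check the figsize with matplotlib.rcParams. A stock matplotlib installation reports [6.4, 4.8]; the values in the output below presumably come from an environment whose rcParams were customized.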
import matplotlib
matplotlib.rcParams['figure.figsize']
# [12.0, 7.0]
import seaborn as sns
import pandas as pd
df = pd.DataFrame(columns=['x','y'])
ax = sns.scatterplot(data=df, x="x", y="y")
ax.figure.get_size_inches()
# array([12., 7.])
For the figure-level plotting functions, which create their own figure, the size is set by the function itself.
For example, relplot and displot have default parameters of height=5 (inches) and aspect=1, which translates into a figsize of [5, 5] for a single subplot/facet, and proportionally more for several cols/rows in the FacetGrid.
pairplot has a default of height=2.5 per subplot, and jointplot a default of height=6.
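As a rough illustration of the arithmetic above, the sketch below computes the nominal figure size of a facet grid from height, aspect and the grid shape. The helper name facet_figsize is hypothetical (it is not part of seaborn), and the result ignores any extra width that may be added for a legend placed outside the axes.

```python
def facet_figsize(n_rows=1, n_cols=1, height=5, aspect=1):
    """Nominal (width, height) in inches for a grid of equally sized facets.

    Hypothetical helper, not a seaborn function: each facet is assumed to be
    `height` inches tall and `height * aspect` inches wide.
    """
    width = n_cols * height * aspect
    total_height = n_rows * height
    return (width, total_height)


print(facet_figsize())                    # a single relplot/displot facet
print(facet_figsize(n_rows=2, n_cols=3))  # a 2 x 3 FacetGrid
print(facet_figsize(height=6))            # the jointplot default height
```

To confirm the value on a real plot, call get_size_inches() on the figure that the plotting function returns, as in the scatterplot example earlier.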
The top and bottom rows of the y-axis are cut in half when plotting a confusion matrix from a pandas DataFrame. Why?
This is what I get:
I used the code from the question "How can I plot a confusion matrix?", using a pandas DataFrame:
import seaborn as sn
import pandas as pd
import matplotlib.pyplot as plt
array = [[13, 1, 1, 0, 2, 0],
         [3, 9, 6, 0, 1, 0],
         [0, 0, 16, 2, 0, 0],
         [0, 0, 0, 13, 0, 0],
         [0, 0, 0, 0, 15, 0],
         [0, 0, 1, 0, 0, 15]]
df_cm = pd.DataFrame(array, range(6), range(6))
# plt.figure(figsize=(10, 7))
sn.set(font_scale=1.4)  # for label size
sn.heatmap(df_cm, annot=True, annot_kws={"size": 16})  # font size
I solved the problem, and I think this post explains why it happens.
Simply put, matplotlib 3.1.1 broke seaborn heatmaps. You can solve it by downgrading to matplotlib 3.1.0; later releases (3.1.2 onward) are reported to include a fix, so upgrading should work as well.
As suggested by sikisis, the following solved my problem:
pip install matplotlib==3.1.0
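Before downgrading, it may be worth checking which release is actually installed. The sketch below is a minimal version check, assuming (as the answer above states) that 3.1.1 is the affected release. The helper names parse_version and heatmap_rows_may_be_clipped are hypothetical, not part of matplotlib or seaborn.

```python
def parse_version(version):
    """Return the leading numeric components of a version string as ints."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def heatmap_rows_may_be_clipped(version):
    """True if `version` is the release named in the answer above (assumption)."""
    return parse_version(version)[:3] == (3, 1, 1)


try:
    import matplotlib
    print(matplotlib.__version__,
          heatmap_rows_may_be_clipped(matplotlib.__version__))
except ImportError:
    print("matplotlib is not installed")
```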
I followed the TensorFlow tutorial on GitHub on how to train on your own data: https://github.com/tensorflow/models/tree/master/inception#how-to-construct-a-new-dataset-for-retraining.
I split my data (training and validation), created the labels as suggested, and managed to create the TFRecords using bazel-bin. Everything works, and now I have my own data as TFRecords.
Now I want to train my image classifier from scratch using the inception-v3 model, and it seems I should use the script inception_train.py, but I am not sure. Is that right? https://github.com/tensorflow/models/blob/master/inception/inception/inception_train.py.
If so, I have two questions:
1-) How can I train it using my TFRecords? An example would be great.
2-) Can I run it on a CPU, or is it only possible on GPUs?
Thank you very much.
Try the following sample code to read images and labels from your TFRecords:
import os
import glob
import tensorflow as tf
from matplotlib import pyplot as plt

# Note: this uses the TensorFlow 1.x queue-based input API

def read_and_decode_file(filename_queue):
    # Create an instance of the TFRecord reader
    reader = tf.TFRecordReader()
    # Read the next serialized example from the filename queue
    _, serialized_reader = reader.read(filename_queue)
    # Extract the features you require from the TFRecord using their corresponding key
    # In my example, all images were written with the 'image' key
    features = tf.parse_single_example(
        serialized_reader, features={
            'image': tf.FixedLenFeature([], tf.string),
            # FixedLenFeature only supports tf.string, tf.int64 and tf.float32
            'labels': tf.FixedLenFeature([], tf.int64)
        })
    # Decode the stored bytes into an image tensor before resizing
    # (assumption: the images were written as JPEG-encoded bytes)
    img = tf.image.decode_jpeg(features['image'], channels=3)
    img_out = tf.image.resize_image_with_crop_or_pad(img, target_height=128, target_width=128)
    # Similarly extract the labels, be careful with the type
    label = features['labels']
    return img_out, label

if __name__ == "__main__":
    tf.reset_default_graph()
    # Path to your tfrecords
    path_to_tf_records = os.getcwd() + '/*.tfrecords'
    # Collect all tfrecords present in the records folder using glob
    list_of_tfrecords = sorted(glob.glob(path_to_tf_records))
    # Generate a tensorflow readable filename queue by supplying it with
    # a list of tfrecords; it is recommended to shuffle your data
    # before feeding it into the network
    filename_queue = tf.train.string_input_producer(list_of_tfrecords, shuffle=False)
    # Supply the tensorflow generated filename queue to the custom function above
    image, label = read_and_decode_file(filename_queue)
    # Create a new tf session to read the data
    sess = tf.Session()
    tf.train.start_queue_runners(sess=sess)
    # Arbitrary number of iterations
    for i in range(50):
        img = sess.run(image)
        # Show image
        plt.imshow(img)
        plt.show()
There is also a function called tf.train.shuffle_batch, which spawns multiple CPU threads that run this reading function and return images and labels in batches of a user-specified size. You would need to set up the data pipeline and the training pipeline so that they run concurrently.
To answer your second question: yes, you can train your model using the CPU alone, but it would be slow and might take several hours or even days to achieve decent results. Remove the with tf.device('/gpu:{0}'): block (it is a context manager, not a decorator) that wraps the creation of your inception model, and TensorFlow will create the model on your CPU.
Hope this explanation helps.
I am trying to read the fluid level in a column from a photograph I took, to see if I can automate the data collection.
So far, I have been able to use the code below to identify the outline of where the fluid level is.
import numpy as np
import matplotlib.pyplot as plt
from skimage import io
from skimage.feature import canny
from skimage.filters import roberts, sobel, scharr, prewitt
%matplotlib inline
# Raw string, so the backslashes in the Windows path are not treated as escapes
perm = io.imread(r'C:\Users\Spencer\Desktop\perm2_crop.jpg')
# Run the Roberts edge filter on the blue channel
edge_roberts = roberts(perm[:, :, 2])
plt.imshow(edge_roberts, cmap=plt.cm.gray)
What I need to figure out now is how to identify the break, and then how to translate that into a data value that I can scale to the markings on the column.
Any ideas about which packages or methods I could use to do this? Any examples would also be appreciated.
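One possible starting point, offered only as a sketch: sum the edge response along each image row, take the row with the strongest response as the fluid level, and convert that row to a reading by linear interpolation between two calibration marks. The function names and calibration values below are hypothetical. The code assumes a roughly horizontal fluid surface and uses a synthetic array in place of the real edge_roberts output.

```python
import numpy as np


def find_level_row(edge_image):
    """Return the index of the image row with the strongest total edge response."""
    row_strength = np.abs(edge_image).sum(axis=1)
    return int(np.argmax(row_strength))


def row_to_value(row, row_a, value_a, row_b, value_b):
    """Linearly map a pixel row to a scale reading using two calibration marks."""
    slope = (value_b - value_a) / (row_b - row_a)
    return value_a + (row - row_a) * slope


# Synthetic stand-in for an edge-filter output: one bright horizontal line
edges = np.zeros((100, 20))
edges[37, :] = 1.0

level_row = find_level_row(edges)
# Assumed calibration: pixel row 0 reads 100.0 and pixel row 100 reads 0.0
print(level_row, row_to_value(level_row, 0, 100.0, 100, 0.0))
```

On a real photograph the two calibration rows would come from known markings on the column, and the edge image would likely need cropping or smoothing first so that other horizontal features do not dominate the row sums.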
My dataset is split across several files. Each file groups samples that are related to each other, i.e., they were created under similar conditions at a similar time.
The train/test split has to respect these groups: all samples from a file must go to either train or test and cannot be separated. So KFold is not straightforward to use in my scikit-learn code.
Right now, I am using something similar to LOO, doing something like:
train ~> cat ./dataset/!(1.txt)
test ~> cat ./dataset/1.txt
This is not comfortable, and not very useful if I want test folds made of several files and a "real" CV.
How can I set up a good CV to check for real overfitting?
Looking at this answer, I realized that pandas can concatenate dataframes. I checked that the process is 15-20% slower than the cat command line, but it makes it possible to build the folds as I expected.
Anyway, I am quite sure that there should be a better way than this one:
import glob
import pandas as pd
# sklearn.cross_validation was removed in later scikit-learn releases;
# KFold now lives in sklearn.model_selection
from sklearn.model_selection import KFold

allFiles = glob.glob("./dataset/*.txt")
# Split the list of files (not the individual samples) into folds
kf = KFold(n_splits=3, shuffle=True)
for train_files, cv_files in kf.split(allFiles):
    dataTrain = pd.concat(pd.read_csv(allFiles[idTrain], header=None) for idTrain in train_files)
    dataTest = pd.concat(pd.read_csv(allFiles[idTest], header=None) for idTest in cv_files)
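The file-level splitting idea can also be sketched with the standard library alone, which makes it easy to see that every file ends up in exactly one test fold. The function name file_level_folds and the example file names are hypothetical. scikit-learn also provides GroupKFold in sklearn.model_selection, which splits by a per-sample group label and may be a better fit once all the samples are loaded into one array.

```python
import random


def file_level_folds(files, n_folds=3, seed=0):
    """Yield (train_files, test_files) pairs; each file is tested exactly once."""
    shuffled = list(files)
    random.Random(seed).shuffle(shuffled)
    # Deal the shuffled files round-robin into n_folds groups
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    for i, test_files in enumerate(folds):
        train_files = [f for j, fold in enumerate(folds) if j != i for f in fold]
        yield train_files, test_files


# Hypothetical file names, used only for illustration
example_files = ["./dataset/%d.txt" % n for n in range(1, 8)]
for train_files, test_files in file_level_folds(example_files):
    print(len(train_files), "train files,", len(test_files), "test files")
```

Each pair of file lists can then be read and concatenated with pandas exactly as in the loop above.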
My code takes serial data from an Arduino, processes it, and then plots it. I am using matplotlib as the graphics interface. Every time it draws, though, the plot window grabs focus, and the user can't look at anything else. What is the best way to stop this? (The code works fine apart from stealing focus.) I tried the matplotlib.use('Agg') approach after reading about it in another post, but it did not work. (I am on Mac OS X.)
The code shown below is a very simple graph of updating data, which has the same problem. I'm not showing my real code because it is not copy-pastable without the right inputs.
Here is my code:
import matplotlib
from matplotlib import *
from pylab import *
# import math

x = []
y = []

def function(iteration):
    xValue = iteration               # assigns current x value
    yValue = (1. / iteration) * 34   # assigns current y value
    x.extend([xValue])               # adds the current x value to the x list
    y.extend([yValue])               # adds the current y value to the y list
    clf()                            # clears the plot
    plot(x, y, color='green')        # tells the plot what to do
    draw()                           # forces a draw

def main():
    for i in range(1, 25):           # runs my function 24 times
        function(i)
        pause(.1)

main()
Have you tried using the interactive mode of matplotlib?
You can switch it on using ion() (see Documentation).
If you use interactive mode, you do not need to call draw(), but you might need to clear your figures using clf(), depending on your desired output.
I find that using the TkAgg backend works. Call matplotlib.use before importing pylab or pyplot:
import matplotlib
matplotlib.use('TkAgg')
credit to 457290092