to_rgba in Seaborn stripplot - Set different transparency for filling color and edgecolor - seaborn

I want to set different transparencies for filling color and edgecolor in seaborn stripplot:
import seaborn as sns
from matplotlib.colors import to_rgba
tips = sns.load_dataset("tips")
sns.stripplot(x="day", y="total_bill", hue="smoker",
data=tips,
palette={'Yes': to_rgba('darkgreen', 0.3), 'No': to_rgba('red', 0.3)},
edgecolor='black', linewidth=1,)
Why doesn't it work? I just want to keep the edgecolor black (alpha = 1.0) while making the fill colors transparent (darkgreen and red at alpha = 0.3). If I use alpha, it makes both transparent.
I can achieve something similar with scatterplot, but I hope I can use stripplot:
import seaborn as sns
from matplotlib.colors import to_rgba
tips = sns.load_dataset("tips")
color_dict = {'Yes': to_rgba('darkgreen', 0.1),
'No': to_rgba('red', 0.1)}
sns.scatterplot(x="day", y="total_bill", data=tips, hue = "smoker", palette=color_dict, edgecolor='black', linewidth=1)
The figure will be:

One way to set different alpha values based on smoker = Yes/No is to use the code below. You can set different colors and alpha values as required.
import seaborn as sns
tips = sns.load_dataset("tips")
ax=sns.stripplot(x="day", y="total_bill", data=tips[tips.smoker == "Yes"], alpha = 0.3, color = 'darkgreen', edgecolor='black', linewidth=1)
sns.stripplot(x="day", y="total_bill", data=tips[tips.smoker == "No"], alpha = 0.3, color = 'red', edgecolor='black', linewidth=1)
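If both hue levels should stay in a single stripplot call, another option is to post-process the drawn points. The sketch below is not part of the original answer. It assumes that stripplot draws its points as matplotlib PathCollection objects available in ax.collections (an implementation detail that could change between seaborn versions), and it uses a small made-up DataFrame named demo in place of tips so that it runs without downloading data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the tips dataset (same column names)
demo = pd.DataFrame({
    "day": ["Thur", "Thur", "Fri", "Fri", "Sat", "Sat"],
    "total_bill": [10.0, 20.0, 15.0, 25.0, 30.0, 12.0],
    "smoker": ["Yes", "No", "Yes", "No", "Yes", "No"],
})

ax = sns.stripplot(x="day", y="total_bill", hue="smoker", data=demo,
                   palette={"Yes": "darkgreen", "No": "red"},
                   edgecolor="black", linewidth=1)

# Assumption: each group of points is a PathCollection in ax.collections
for coll in ax.collections:
    face = coll.get_facecolors().copy()  # RGBA rows, one per point (or one shared row)
    if len(face) == 0:
        continue
    face[:, 3] = 0.3                     # make only the fill transparent
    coll.set_facecolors(face)
    coll.set_edgecolors("black")         # edges stay fully opaque
```

Because alpha is written into the face colors only, the alpha keyword of stripplot is not needed here.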
EDIT
To set the colors directly, you can set your custom palette as below and then use smoker as hue.
# Create an array with the colors you want to use
colors = ["red", "darkgreen"]
# Set your custom color palette
sns.set_palette(sns.color_palette(colors))
#Use hue for smoker to differentiate colors
sns.stripplot(x="day", y="total_bill", data=tips, alpha = 0.3, hue = "smoker", edgecolor='black', linewidth=1)
If your requirement is that you have to use to_rgba(), then you can use it when defining colors and set the alpha there instead. This is the other version:
# Create an array with the colors you want to use
colors = [to_rgba("red", 0.3), to_rgba("darkgreen", 0.3)]
# Set your custom color palette
sns.set_palette(sns.color_palette(colors))
#Use hue for smoker to differentiate colors
sns.stripplot(x="day", y="total_bill", data=tips, hue = "smoker", edgecolor='black', linewidth=1)
In both cases, plot will be as below.
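For reference, to_rgba simply returns an (r, g, b, a) tuple, so the alpha value travels with the color itself. The short check below is only an illustration; the names red_fill and green_fill are made up for it. Whether a given seaborn function keeps that fourth channel when the tuple is passed through a palette is a separate matter, as the difference between stripplot and scatterplot in the question shows.

```python
from matplotlib.colors import to_rgba

# Illustrative names, not part of the original answer
red_fill = to_rgba("red", 0.3)
green_fill = to_rgba("darkgreen", 0.3)

print(red_fill)       # (1.0, 0.0, 0.0, 0.3)
print(green_fill[3])  # 0.3
```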
New Requirement
As you mentioned in the update, you are ok with the colors from scatter plot. You can write a small function to add jitter to your figure, which will achieve what you are looking for. See the update to your scatterplot code below to achieve what you are looking for.
import random
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib.colors import to_rgba
tips = sns.load_dataset("tips")
color_dict = {'Yes': to_rgba('darkgreen', 0.1), 'No': to_rgba('red', 0.1)}
# The days are strings; map each one to a number for plotting
numDays = {'Thur': 1, 'Fri': 2, 'Sat': 3, 'Sun': 4}
def addJitter(x):  # Add uniform jitter in [-0.15, 0.15]; adjust the numbers to make the strip wider/narrower
    return x + random.uniform(0, .3) - .15
# Convert day -> numDays, which is a different number for each day
tips['numDays'] = tips['day'].astype("string").apply(lambda x: numDays[x])
# Use the jitter function to add jitter to each numDays value
tips['jitter'] = tips['numDays'].apply(addJitter)
sns.scatterplot(x=tips.jitter, y=tips.total_bill, hue=tips.smoker, palette=color_dict, edgecolor='black', linewidth=1)
# Reset the x-axis ticks so that they show the day names
plt.xticks([1, 2, 3, 4])
plt.gca().set_xticklabels(['Thur', 'Fri', 'Sat', 'Sun'])
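The per-row jitter call can also be written as a single vectorized draw with NumPy. This is only a sketch under stated assumptions: the small demo frame stands in for tips, and half_width, num_days and the fixed seed are illustrative choices rather than anything from the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the 'day' column of tips
demo = pd.DataFrame({"day": ["Thur", "Fri", "Sat", "Sun", "Sat"]})

num_days = {"Thur": 1, "Fri": 2, "Sat": 3, "Sun": 4}
half_width = 0.15                    # same spread as the addJitter helper
rng = np.random.default_rng(0)       # seeded so the jitter is reproducible

demo["numDays"] = demo["day"].map(num_days)
# One uniform offset per row, drawn in [-half_width, half_width)
demo["jitter"] = demo["numDays"] + rng.uniform(-half_width, half_width, size=len(demo))
```

The resulting jitter column can be passed as x to scatterplot in the same way as in the code above.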

Related

Seaborn PairGrid: pairplot two data set with different transparency

I'd like to make a PairGrid plot with the seaborn library.
I have two classes of data: a training set and a single target point.
I'd like to plot the target point as opaque, while the samples in the training set should be transparent.
I'd also like to plot the target point in the lower cells.
Here is my code and image:
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
data = pd.read_csv("data.csv")
g = sns.PairGrid(data, hue='type')
g.map_upper(sns.scatterplot, alpha=0.2, palette="husl")
g.map_lower(sns.kdeplot, lw=3, palette="husl")
g.map_diag(sns.kdeplot, lw=3, palette="husl")
g.add_legend()
plt.show()
The data.csv file looks like this:
logP tPSA QED HBA HBD type
0 -2.50000 200.00 0.300000 8 1 Target 1
1 1.68070 87.31 0.896898 3 2 Training set
2 3.72930 44.12 0.862259 4 0 Training set
3 2.29702 91.68 0.701022 6 3 Training set
4 -2.21310 102.28 0.646083 8 2 Training set
You can reassign the dataframe used after partial plotting. E.g. g.data = data[data['type'] == 'Target 1']. So, you can first plot the training dataset, change g.data and then plot the target with other parameters.
The following example supposes the first row of the iris dataset is the target point and the full dataset is the training set. A custom legend is added (this might trigger a warning, which can be ignored).
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
import seaborn as sns
iris = sns.load_dataset('iris')
g = sns.PairGrid(iris)
color_for_trainingset = 'paleturquoise'
# color_for_trainingset = sns.color_palette('husl', 2) [-1] # this is the color from the question
g.map_upper(sns.scatterplot, alpha=0.2, color=color_for_trainingset)
g.map_lower(sns.kdeplot, color=color_for_trainingset)
g.map_diag(sns.kdeplot, lw=3, color=color_for_trainingset)
g.data = iris.iloc[:1]
# g.data = data[data['type'] == 'Target 1']
g.map_upper(sns.scatterplot, alpha=1, color='red')
g.map_lower(sns.scatterplot, alpha=1, color='red', zorder=3)
handles = [Line2D([], [], color='red', ls='', marker='o', label='target'),
Line2D([], [], color=color_for_trainingset, lw=3, label='training set')]
g.add_legend(handles=handles)
plt.show()

Is there any way to change the legends in seaborn?

I would like to change the format of pIC50 in the legend box. I would like it to be "circle according to the size with no filled color". Any suggestions are welcome!
plt.figure(figsize=(7, 7))
sns.scatterplot(x='MW', y='LogP', data=df_2class, hue='class', size='pIC50', edgecolor='black', alpha=0.2)
sns.set_style("whitegrid", {"ytick.major.size": 100,"xtick.major.size": 2, 'grid.linestyle': 'solid'})
plt.xlabel('MW', fontsize=14, fontweight='bold')
plt.ylabel('LogP', fontsize=14, fontweight='bold')
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0)
In this case, you can loop through the last legend handles and change the color of the dots. Here is an example using the iris dataset:
import matplotlib.pyplot as plt
import seaborn as sns
iris = sns.load_dataset('iris')
ax = sns.scatterplot(data=iris, x='sepal_length', y='petal_length', hue='species', size='sepal_width')
handles, labels = ax.get_legend_handles_labels()
for h in handles[-5:]:  # changes the 5 last handles, this number might be different in your case
    h.set_facecolor('none')
ax.legend(handles=handles, labels=labels, bbox_to_anchor=[1.02, 1.02], loc='upper left')
plt.tight_layout()
plt.show()

Subplot Seaborn // White box failure

I'm trying to create seaborn subplots in a Kaggle environment with something like
import matplotlib.pyplot as plt
import seaborn as sns
f, axes = plt.subplots(2, 1)
sns.jointplot(data=df, x="sepal length", y="petal length", hue="target", kind="kde", ax=axes[0])
sns.jointplot(data=df, x="sepal width", y="petal width", hue="target", kind="kde", ax=axes[1])
The graphic output gives me two empty white boxes:
How should I correct this to achieve a clean subplot?

When using rasterize=True with datashader, how do I get transparency where count=0 to see the underlying tile?

Currently, when I do this:
import pandas as pd
import hvplot.pandas
df = pd.util.testing.makeDataFrame()
plot = df.hvplot.points('A', 'B', tiles=True, rasterize=True, geo=True,
aggregator='count')
I can't see the underlying tile source.
To see the underlying tile source, philippjfr suggested setting the color bar limits slightly higher than 0 and setting the min clipping_colors to transparent:
plot = plot.redim.range(**{'Count': (0.25, 1)})
plot = plot.opts('Image', clipping_colors={'min': 'transparent'})
Now the underlying tile source is viewable.
Full Code:
import pandas as pd
import hvplot.pandas
df = pd.util.testing.makeDataFrame()
plot = df.hvplot.points('A', 'B', tiles=True, rasterize=True, geo=True,
aggregator='count')
plot = plot.redim.range(**{'Count': (0.25, 1)})
plot = plot.opts('Image', clipping_colors={'min': 'transparent'})
plot

How to rotate ylabel of pairplot in seaborn? [duplicate]

I have a simple factorplot
import seaborn as sns
g = sns.factorplot("name", "miss_ratio", "policy", dodge=.2,
linestyles=["none", "none", "none", "none"], data=df[df["level"] == 2])
The problem is that the x labels all run together, making them unreadable. How do you rotate the text so that the labels are readable?
I had a problem with the answer by @mwaskorn, namely that
g.set_xticklabels(rotation=30)
fails, because this also requires the labels. A bit easier than the answer by @Aman is to just add
plt.xticks(rotation=45)
You can rotate tick labels with the tick_params method on matplotlib Axes objects. To provide a specific example:
ax.tick_params(axis='x', rotation=90)
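As a self-contained illustration of the tick_params approach, here is a minimal matplotlib-only sketch. The bar chart and its category names are made up for the example; any Axes returned by a seaborn axes-level function should behave the same way.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.bar(["alpha", "beta", "gamma"], [3, 1, 2])  # made-up categories and values

# Rotate the x tick labels without having to pass the labels themselves
ax.tick_params(axis="x", rotation=90)

fig.canvas.draw()  # make sure the ticks are laid out before inspecting them
rotations = [label.get_rotation() for label in ax.get_xticklabels()]
```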
This is still a matplotlib object. Try this:
# <your code here>
locs, labels = plt.xticks()
plt.setp(labels, rotation=45)
For figure-level plots that return a FacetGrid (e.g. catplot), the labels do not need to be passed:
g.set_xticklabels(rotation=30)
Axes-level functions such as barplot and countplot return a matplotlib Axes rather than a FacetGrid, and its set_xticklabels requires the labels. The following works for them:
g.set_xticklabels(g.get_xticklabels(), rotation=30)
Also, in case you have two graphs overlaid on top of each other, call set_xticklabels on the graph that supports it.
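To make the FacetGrid versus Axes distinction concrete, here is a small sketch. The demo frame and its values are invented for illustration (the column names are borrowed from the question), and kind="bar" is just one possible choice for catplot.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up data; column names borrowed from the question
demo = pd.DataFrame({"name": ["a", "b", "c"], "miss_ratio": [0.1, 0.4, 0.2]})

# Figure-level: catplot returns a FacetGrid, whose set_xticklabels
# can be called without labels
grid = sns.catplot(data=demo, x="name", y="miss_ratio", kind="bar")
grid.set_xticklabels(rotation=30)

# Axes-level: barplot draws on a matplotlib Axes, whose set_xticklabels
# needs the labels passed explicitly
fig, ax = plt.subplots()
sns.barplot(data=demo, x="name", y="miss_ratio", ax=ax)
ax.set_xticklabels(ax.get_xticklabels(), rotation=30)
```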
If anyone wonders how to do this for clustermap CorrGrids (part of a given seaborn example):
import seaborn as sns
import matplotlib.pyplot as plt
sns.set(context="paper", font="monospace")
# Load the dataset of correlations between cortical brain networks
df = sns.load_dataset("brain_networks", header=[0, 1, 2], index_col=0)
corrmat = df.corr()
# Set up the matplotlib figure
f, ax = plt.subplots(figsize=(12, 9))
# Draw the heatmap using seaborn
g=sns.clustermap(corrmat, vmax=.8, square=True)
rotation = 90
for i, ax in enumerate(g.fig.axes):  # getting all axes of the fig object
    ax.set_xticklabels(ax.get_xticklabels(), rotation=rotation)
g.fig.show()
You can also use plt.setp as follows:
import matplotlib.pyplot as plt
import seaborn as sns
plot=sns.barplot(data=df, x=" ", y=" ")
plt.setp(plot.get_xticklabels(), rotation=90)
to rotate the labels 90 degrees.
For a seaborn.heatmap, you can rotate these using (based on @Aman's answer)
pandas_frame = pd.DataFrame(data, index=names, columns=names)
heatmap = seaborn.heatmap(pandas_frame)
loc, labels = plt.xticks()
heatmap.set_xticklabels(labels, rotation=45)
heatmap.set_yticklabels(labels[::-1], rotation=45) # reversed order for y
One can do this with matplotlib.pyplot.xticks
import matplotlib.pyplot as plt
plt.xticks(rotation = 'vertical')
# Or use degrees explicitly
degrees = 70 # Adjust according to one's preferences/needs
plt.xticks(rotation=degrees)
Here one can see an example of how it works.
Use ax.tick_params(labelrotation=45). You can apply this to the Axes returned by the plot without having to provide labels. This is an alternative to using the FacetGrid if that's not the path you want to take.
If the labels have long names it may be hard to get it right. A solution that worked well for me using catplot was:
import matplotlib.pyplot as plt
fig = plt.gcf()
fig.autofmt_xdate()
