I use the following to create my subplots
fig, axs = plt.subplots(2,2)
sns.plotfunc(..., ax = axs[0])
but the pairplot function in seaborn does not support the ax argument. Any idea how to plot it as a subplot?
Thanks in advance.
You can use Seaborn's PairGrid to plot multiple pairplots like this:
g = sns.PairGrid(df, y_vars=['variable_a','variable_b'], x_vars=["variable_c", "variable_d"], height=4)
g.map(sns.regplot)
plt.show()
Another example on how to use PairGrid can be found here.
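For anyone who wants to try the PairGrid approach without the original data, here is a minimal sketch. The dataframe is synthetic and the column names simply mirror the snippet above; `ci=None` is only there to skip bootstrapping and keep it fast.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the asker's dataframe (values are random)
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.normal(size=(50, 4)),
    columns=["variable_a", "variable_b", "variable_c", "variable_d"],
)

# PairGrid builds its own figure with one Axes per (y_var, x_var) pair
g = sns.PairGrid(
    df,
    y_vars=["variable_a", "variable_b"],
    x_vars=["variable_c", "variable_d"],
    height=3,
)
g.map(sns.regplot, ci=None)

# The individual matplotlib Axes stay accessible for further tweaks
g.axes[0, 1].set_title("variable_a vs variable_d")
```

Since PairGrid owns the figure, you customise the grid's axes afterwards rather than passing an `ax` in.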
Actually, plt.subplots(2, 2) returns a 2x2 array of axes, so I should use sns.plotfunc(..., ax = axs[0][1]) instead.
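A small sketch of that indexing point, using synthetic data and two axes-level seaborn functions chosen only for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(1)
df = pd.DataFrame({"x": rng.normal(size=100), "y": rng.normal(size=100)})

fig, axs = plt.subplots(2, 2, figsize=(8, 6))
print(axs.shape)  # (2, 2)

# Axes-level functions accept ax=, so pick one cell of the 2x2 array
sns.scatterplot(data=df, x="x", y="y", ax=axs[0, 1])
sns.regplot(data=df, x="x", y="y", ci=None, ax=axs[1, 0])

# axs.flat iterates over all four Axes if you need to loop
for ax in axs.flat:
    ax.set_xlabel("x")
```

`axs[0]` on its own is a whole row (an array of two Axes), which is why passing it as `ax` fails.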
I am having trouble adding a basemap to my map. My geodataframe is created using X and Y coords of a bunch of points.
gdf = geo.GeoDataFrame(
df, geometry=gpd.points_from_xy(df['X'], df['Y']))
gdf.set_crs(epsg=3857)
Which looks like this:
After using contextily to get a basemap, I cannot get the basemap to show up properly. The coords should be showing the bottom of the Mississippi River Basin.
ax = gdf.plot(color="red", figsize=(9, 9))
cx.add_basemap(ax, zoom=0, crs= gdf.crs)
Let me know if anything in my code explains why the basemap is not showing up.
Thanks!
It looks like your data is in WGS84/EPSG:4326 (i.e. lat/lon) coordinates. So I think you're confusing geopandas.GeoDataFrame.set_crs, which tells geopandas what the CRS of the data is, with geopandas.GeoDataFrame.to_crs, which transforms the data from the current CRS to the new one you specify. Also note that neither of these operations is in-place by default, so you need to assign the result. I think you want:
gdf = geo.GeoDataFrame(
df, geometry=gpd.points_from_xy(df['X'], df['Y'])
)
gdf = gdf.set_crs("epsg:4326")
gdf_mercator = gdf.to_crs("epsg:3857")
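To see the difference between the two methods, here is a small sketch with two made-up lon/lat points (the coordinates are illustrative, not the asker's data). It assumes geopandas and pyproj are installed.

```python
import geopandas as gpd
import pandas as pd

# Hypothetical lon/lat points near the lower Mississippi
df = pd.DataFrame({"X": [-90.0, -91.2], "Y": [30.0, 32.3]})
gdf = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df["X"], df["Y"]))

# set_crs is not in-place by default: the result is discarded here
gdf.set_crs("epsg:4326")
crs_before = gdf.crs
print(crs_before)  # None

# set_crs only labels the coordinates; the numbers stay as they are
gdf = gdf.set_crs("epsg:4326")
lon_after_set = gdf.geometry.x.iloc[0]

# to_crs reprojects; the numbers change to Web Mercator metres
gdf_mercator = gdf.to_crs("epsg:3857")
x_mercator = gdf_mercator.geometry.x.iloc[0]

print(lon_after_set, x_mercator)
```

The longitude is untouched by set_crs, while to_crs turns it into a value of roughly minus ten million metres.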
This is really the same as @Michael Delgado's answer, but it's simpler to state the CRS at GeoDataFrame construction time. Also make sure you are using the correct CRS.
MWE
import geopandas as gpd
import geopandas as geo
import pandas as pd
import contextily as cx
# construct a dataframe with X and Y of some points in US
places = gpd.read_file(
gpd.datasets.get_path("naturalearth_cities"),
mask=gpd.read_file(gpd.datasets.get_path("naturalearth_lowres")).loc[
lambda d: d["iso_a3"].eq("USA")
],
)
df = pd.DataFrame({"X": places.geometry.x, "Y": places.geometry.y})
# user code, state CRS at construction time
gdf = geo.GeoDataFrame(
df, geometry=gpd.points_from_xy(df["X"], df["Y"]), crs="epsg:4326"
)
ax = gdf.plot(color="red", figsize=(9, 9))
cx.add_basemap(ax, zoom=0, crs=gdf.crs)
My dataframe has a column 'rideable_type' which has 3 unique values:
1.classic_bike
2.docked_bike
3.electric_bike
While plotting a barplot using the following code:
g = sns.FacetGrid(electric_casual_type_week, col='member_casual', hue='rideable_type', height=7, aspect=0.65)
g.map(sns.barplot, 'day_of_week', 'number_of_rides').add_legend()
I only get a plot showing 2 unique 'rideable_type' values.
Here is the plot:
As you can see only 'electric_bike' and 'classic_bike' are seen and not 'docked_bike'.
The main problem is that all the bars are drawn on top of each other. Seaborn's barplots don't easily support stacked bars. Also, this way of creating the barplot doesn't support the default "dodging" (barplot is called separately for each hue value, while it would be needed to call it in one go for dodging to work).
Therefore, the recommended way is to use catplot, a special version of FacetGrid for categorical plots.
g = sns.catplot(kind='bar', data=electric_casual_type_week, x='day_of_week', y='number_of_rides',
col='member_casual', hue='rideable_type', height=7, aspect=0.65)
Here is an example using Seaborn's 'tips' dataset:
import seaborn as sns
tips = sns.load_dataset('tips')
g = sns.FacetGrid(data=tips, col='time', hue='sex', height=7, aspect=0.65)
g.map_dataframe(sns.barplot, x='day', y='total_bill')
g.add_legend()
When comparing with sns.catplot, the coinciding bars are clear:
g = sns.catplot(kind='bar', data=tips, x='day', y='total_bill', col='time', hue='sex', height=7, aspect=0.65)
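If you would rather check the dodging without downloading a dataset, here is a sketch on a synthetic table. The column names follow the question; the ride counts are made up.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the asker's table: one row per member type, day and bike type
rows = []
for member in ["casual", "member"]:
    for day in ["Mon", "Tue", "Wed"]:
        for i, bike in enumerate(["classic_bike", "docked_bike", "electric_bike"]):
            rows.append({
                "member_casual": member,
                "day_of_week": day,
                "rideable_type": bike,
                "number_of_rides": 100 * (i + 1),
            })
rides = pd.DataFrame(rows)

g = sns.catplot(kind="bar", data=rides, x="day_of_week", y="number_of_rides",
                col="member_casual", hue="rideable_type", height=4, aspect=0.8)

# With dodging, every (day, bike type) pair gets its own bar position
ax = g.axes[0, 0]
left_edges = sorted({round(float(p.get_x()), 3) for p in ax.patches if p.get_height() > 0})
print(len(left_edges))
```

Three days times three bike types should give nine distinct bar positions per facet, so no bike type is hidden behind another.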
I would like to change the format of pIC50 in the legend box. I would like it to be "circle according to the size with no filled color". Any suggestions are welcome!
plt.figure(figsize=(7, 7))
sns.scatterplot(x='MW', y='LogP', data=df_2class, hue='class', size='pIC50', edgecolor='black', alpha=0.2)
sns.set_style("whitegrid", {"ytick.major.size": 100,"xtick.major.size": 2, 'grid.linestyle': 'solid'})
plt.xlabel('MW', fontsize=14, fontweight='bold')
plt.ylabel('LogP', fontsize=14, fontweight='bold')
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0)
In this case, you can loop through the last legend handles and change the color of the dots. Here is an example using the iris dataset:
import matplotlib.pyplot as plt
import seaborn as sns
iris = sns.load_dataset('iris')
ax = sns.scatterplot(data=iris, x='sepal_length', y='petal_length', hue='species', size='sepal_width')
handles, labels = ax.get_legend_handles_labels()
for h in handles[-5:]:  # changes the 5 last handles, this number might be different in your case
    h.set_facecolor('none')
ax.legend(handles=handles, labels=labels, bbox_to_anchor=[1.02, 1.02], loc='upper left')
plt.tight_layout()
plt.show()
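A variant that avoids hard-coding the number of handles, sketched on synthetic data with the column names from the question. Two assumptions: seaborn puts the size variable's name into the legend as a subtitle entry, with the size handles after it; and the handles may be either scatter collections or Line2D objects depending on the seaborn version, so both cases are handled.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for df_2class (values are random)
rng = np.random.default_rng(2)
df = pd.DataFrame({
    "MW": rng.uniform(200, 600, 60),
    "LogP": rng.uniform(0, 6, 60),
    "class": rng.choice(["active", "inactive"], 60),
    "pIC50": rng.uniform(4, 9, 60),
})

fig, ax = plt.subplots(figsize=(7, 7))
sns.scatterplot(data=df, x="MW", y="LogP", hue="class", size="pIC50",
                edgecolor="black", alpha=0.6, ax=ax)

handles, labels = ax.get_legend_handles_labels()

# Assumption: the entries after the "pIC50" subtitle are the size handles
start = labels.index("pIC50") + 1 if "pIC50" in labels else len(labels)
size_handles = handles[start:]

for h in size_handles:
    if hasattr(h, "set_markerfacecolor"):  # Line2D-style handle
        h.set_markerfacecolor("none")
        h.set_markeredgecolor("black")
    else:  # collection-style handle from ax.scatter
        h.set_facecolor("none")
        h.set_edgecolor("black")

ax.legend(handles=handles, labels=labels, bbox_to_anchor=(1.02, 1), loc="upper left")
fig.tight_layout()
```

If the subtitle is not found, the loop does nothing, so the plot is left unchanged rather than recolouring the wrong entries.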
I plotted a scatterplot with the seaborn library and I want to change the legend text, but I don't know how to do that.
example:
The following is iris dataset with species columns encoded in 0/1/2 as per species.
plt.figure(figsize=(8,8))
pl = sns.scatterplot(x='petal_length', y ='petal_width', hue='Species', data=data, s=40,
palette='Set1', legend='full')
I want to change the legend text from [0, 1, 2] to ['setosa', 'versicolor', 'virginica'].
Can anybody help?
First, Seaborn (and Matplotlib) usually picks up the labels to put into the legend for hue from the unique values of the array you provide as hue. So as a first step, check that the column Species in your dataframe actually contains the values "setosa", "versicolor", "virginica". If not, one solution is to temporarily map them to other values, for the purpose of plotting:
legend_map = {0: 'setosa',
1: 'versicolor',
2: 'virginica'}
plt.figure(figsize=(8,8))
ax = sns.scatterplot(x=data['petal_length'], y =data['petal_width'], hue=data['species'].map(legend_map),
s=40, palette='Set1', legend='full')
plt.show()
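Before plotting, it can help to check what the column actually holds and whether the mapping covers every code. A pandas-only sketch on a tiny made-up frame:

```python
import pandas as pd

# Tiny stand-in for the encoded iris frame
data = pd.DataFrame({"species": [0, 1, 2, 1, 0]})
legend_map = {0: "setosa", 1: "versicolor", 2: "virginica"}

print(data["species"].unique())  # [0 1 2]

# Series.map returns a new Series; the original column is left untouched
species_names = data["species"].map(legend_map)

# Codes missing from legend_map would become NaN, so count them before plotting
unmapped = int(species_names.isna().sum())
print(species_names.tolist(), unmapped)
```

If `unmapped` is not zero, some points would lose their hue in the plot, which is easy to miss.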
Alternatively, if you want to manipulate the plot directly rather than the underlying data, you can do so by editing the legend texts:
plt.figure(figsize=(8,8))
ax = sns.scatterplot(x='petal_length', y ='petal_width', hue='species', data=data, s=40,
palette='Set1', legend='full')
l = ax.legend()
l.get_texts()[0].set_text('Species') # You can also change the legend title
l.get_texts()[1].set_text('Setosa')
l.get_texts()[2].set_text('Versicolor')
l.get_texts()[3].set_text('Virginica')
plt.show()
This methodology allows you to also change the legend title, if need be.
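Indexing into get_texts() depends on how many entries the legend has, which seems to vary between seaborn versions. A sketch that relabels by value instead, rebuilding the legend from its handles; the data is synthetic and the `names` dict is just an illustrative mapping:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the encoded iris frame
rng = np.random.default_rng(3)
data = pd.DataFrame({
    "petal_length": rng.uniform(1, 7, 30),
    "petal_width": rng.uniform(0.1, 2.5, 30),
    "species": np.repeat([0, 1, 2], 10),
})

fig, ax = plt.subplots(figsize=(8, 8))
sns.scatterplot(x="petal_length", y="petal_width", hue="species", data=data,
                s=40, palette="Set1", legend="full", ax=ax)

# Legend labels are strings, so the mapping is keyed by "0", "1", "2"
names = {"0": "setosa", "1": "versicolor", "2": "virginica"}
handles, labels = ax.get_legend_handles_labels()
new_labels = [names.get(label, label) for label in labels]  # unknown labels pass through

ax.legend(handles=handles, labels=new_labels, title="Species")
legend_texts = [t.get_text() for t in ax.get_legend().get_texts()]
```

Any entry that is not in `names` keeps its original label, so this does not depend on the entry order.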
I have a simple factorplot
import seaborn as sns
g = sns.factorplot("name", "miss_ratio", "policy", dodge=.2,
linestyles=["none", "none", "none", "none"], data=df[df["level"] == 2])
The problem is that the x labels all run together, making them unreadable. How do you rotate the text so that the labels are readable?
I had a problem with the answer by @mwaskorn, namely that
g.set_xticklabels(rotation=30)
fails, because this also requires the labels. A bit easier than the answer by @Aman is to just add
plt.xticks(rotation=45)
You can rotate tick labels with the tick_params method on matplotlib Axes objects. To provide a specific example:
ax.tick_params(axis='x', rotation=90)
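A self-contained sketch of the tick_params route on a plain matplotlib bar chart (the category names are made up), including a check that the rotation was applied:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.bar(["alpha_policy", "beta_policy", "gamma_policy"], [3, 5, 2])

# No labels need to be passed; only the rotation of the existing ones changes
ax.tick_params(axis="x", rotation=90)

fig.canvas.draw()  # make sure the ticks are laid out before inspecting them
rotations = [label.get_rotation() for label in ax.get_xticklabels()]
print(rotations)
```

The same call works on the Axes returned by axes-level seaborn functions such as barplot.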
This is still a matplotlib object. Try this:
# <your code here>
locs, labels = plt.xticks()
plt.setp(labels, rotation=45)
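Which call works depends on what the plotting function returns. Seaborn plots built on FacetGrid (e.g. catplot) return a FacetGrid, whose set_xticklabels accepts the rotation without the labels:
g.set_xticklabels(rotation=30)
However, barplot, countplot, etc. return a matplotlib Axes rather than a FacetGrid, and its set_xticklabels requires the labels, so the call above fails for them. The following works for them: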
g.set_xticklabels(g.get_xticklabels(), rotation=30)
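As far as I can tell from current seaborn and matplotlib, the difference comes down to the object you hold: FacetGrid.set_xticklabels takes the labels as optional, while Axes.set_xticklabels needs them. A sketch on synthetic data to check both cases (column names borrowed from the question):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

df = pd.DataFrame({
    "name": ["first_long_label", "second_long_label", "third_long_label"] * 2,
    "policy": ["a"] * 3 + ["b"] * 3,
    "miss_ratio": [0.1, 0.4, 0.3, 0.2, 0.5, 0.6],
})

# Figure-level function: returns a FacetGrid, labels are optional
g = sns.catplot(data=df, x="name", y="miss_ratio", hue="policy", kind="bar")
g.set_xticklabels(rotation=30)

# Axes-level function: returns a matplotlib Axes, labels must be passed
fig, ax = plt.subplots()
sns.barplot(data=df, x="name", y="miss_ratio", ax=ax)
ax.set_xticks(ax.get_xticks())  # pin tick positions to avoid a matplotlib warning
ax.set_xticklabels(ax.get_xticklabels(), rotation=30)

print(type(g).__name__, type(ax).__name__)
```

Printing the type of the returned object is a quick way to tell which form applies.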
Also, if you have two graphs overlaid on top of each other, call set_xticklabels on the one that supports it.
If anyone wonders how to do this for clustermap CorrGrids (part of a given seaborn example):
import seaborn as sns
import matplotlib.pyplot as plt
sns.set(context="paper", font="monospace")
# Load the dataset of correlations between cortical brain networks
df = sns.load_dataset("brain_networks", header=[0, 1, 2], index_col=0)
corrmat = df.corr()
# clustermap creates its own figure, so no separate plt.subplots call is needed
# Draw the clustered heatmap using seaborn
g=sns.clustermap(corrmat, vmax=.8, square=True)
rotation = 90
for ax in g.fig.axes: ## getting all axes of the fig object
    ax.set_xticklabels(ax.get_xticklabels(), rotation = rotation)
g.fig.show()
You can also use plt.setp as follows:
import matplotlib.pyplot as plt
import seaborn as sns
plot=sns.barplot(data=df, x=" ", y=" ")
plt.setp(plot.get_xticklabels(), rotation=90)
to rotate the labels 90 degrees.
For a seaborn.heatmap, you can rotate these using (based on @Aman's answer)
pandas_frame = pd.DataFrame(data, index=names, columns=names)
heatmap = seaborn.heatmap(pandas_frame)
loc, labels = plt.xticks()
heatmap.set_xticklabels(labels, rotation=45)
heatmap.set_yticklabels(labels[::-1], rotation=45) # reversed order for y
One can do this with matplotlib.pyplot.xticks
import matplotlib.pyplot as plt
plt.xticks(rotation = 'vertical')
# Or use degrees explicitly
degrees = 70 # Adjust according to one's preferences/needs
plt.xticks(rotation=degrees)
Here one can see an example of how it works.
Use ax.tick_params(labelrotation=45). You can apply this to the Axes object returned by the plot without having to provide labels. This is an alternative to using the FacetGrid if that's not the path you want to take.
If the labels have long names it may be hard to get it right. A solution that worked well for me using catplot was:
import matplotlib.pyplot as plt
fig = plt.gcf()
fig.autofmt_xdate()
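autofmt_xdate also takes rotation and ha arguments, which helps with long names because right-aligned labels end under their tick. A sketch on a plain matplotlib bar chart with made-up labels:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.bar(
    ["a_rather_long_category_name", "another_long_category_name", "short"],
    [4, 7, 1],
)

# Rotates the x tick labels and right-aligns them
fig.autofmt_xdate(rotation=45, ha="right")

label_info = [(lbl.get_rotation(), lbl.get_ha()) for lbl in ax.get_xticklabels()]
print(label_info)
```

Despite the name, nothing here requires the x axis to hold dates.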