Scatterplot with x axis only - scatter-plot

I have a dataframe 'Spreads' where one of the columns is 'HY_OAS'. My goal is to draw a horizontal line (basically representing a range of values for 'HY_OAS') and plot the column mean on that line. In addition, I wanted the x axis min/max to be the min/max for that column and I'd like to include text boxes annotating the min/max. The problem is I'm not sure how to proceed because all I have is the below. Thanks for any and all help. The goal is the second image and the current code is the first image.
fig8 = px.scatter(x=[Spreads['HY_OAS'].mean()], y=[0])
fig8.update_xaxes(visible=True,showticklabels=False,range=[Spreads['HY_OAS'].min(),Spreads['HY_OAS'].max()])
fig8.update_yaxes(visible=True,showticklabels=False, range=[0,0])

Following what you describe and what you have coded, the steps are: generate some sample data in a dataframe; scatter the values along the x-axis, using a constant for the y-axis; add a marker for the mean; format the figure; add the required annotations.
import numpy as np
import plotly.express as px
import pandas as pd
# simulate some data
Spreads = pd.DataFrame({"HY_OAS": np.sin(np.random.uniform(0, np.pi * 2, 50))})
# scatter values along x-axis and add a larger point for the mean
fig = px.scatter(Spreads, x="HY_OAS", y=np.full(len(Spreads), 0)).add_traces(
    # use the column mean (a scalar), not Spreads.mean() (a Series)
    px.scatter(x=[Spreads["HY_OAS"].mean()], y=[0])
    .update_traces(marker={"color": "red", "size": 20})
    .data
)
# fix up figure config
fig.update_layout(
xaxis_visible=False,
yaxis_visible=False,
showlegend=False,
paper_bgcolor="rgba(0,0,0,0)",
plot_bgcolor="rgba(0,0,0,0)",
)
# finally, the required annotations
fig.add_annotation(x=Spreads["HY_OAS"].mean(), y=0, text=Spreads["HY_OAS"].mean().round(4))
fig.add_annotation(x=Spreads["HY_OAS"].min(), y=0, text=Spreads["HY_OAS"].min().round(2), showarrow=False, xshift=-20)
fig.add_annotation(x=Spreads["HY_OAS"].max(), y=0, text=Spreads["HY_OAS"].max().round(2), showarrow=False, xshift=20)
For a straight line instead, build the base figure as follows, then use the same code as above to add the annotations and configure the layout:
fig = px.line(x=[Spreads["HY_OAS"].min(), Spreads["HY_OAS"].max()], y=[0,0]).add_traces(
    # use the column mean (a scalar), not Spreads.mean() (a Series)
    px.scatter(x=[Spreads["HY_OAS"].mean()], y=[0])
    .update_traces(marker={"color": "red", "size": 20})
    .data
)

Related

Add location marker on plotted Geopandas Dataframe using Folium

Context
I have a merged geodataframe, called results, of 1) postal code areas and 2) the total number of deliveries within each postal code area in the city of Groningen. The geodataframe's geometry consists of Polygons and MultiPolygons representing the different postal code areas within the city.
I am new to GeoPandas and have therefore tried different tutorials, including this one from the official geopandas website, where I was introduced to interactive Folium maps, which I really like. I was able to plot my geodataframe using result.explore(), which resulted in the following map
The problem
So far so good, but now I simply want to place a marker using the folium library, with the goal of calculating the distance between the marker and the postal code areas. After some searching online I found in the quickstart guide that you need to create a folium.Map, then a folium.Choropleth for my geodataframe and a folium.Marker, and add them to the folium.Map.
m = folium.Map(location=[53.21917, 6.56667], zoom_start=15)
folium.Marker(
[53.210903, 6.598276],
popup="My marker"
).add_to(m)
folium.Choropleth(results, data=results, columns="Postcode", fill_color='OrRd', name="Postalcode areas").add_to(m)
folium.LayerControl().add_to(m)
m
But when I try to run the above code I get the following error:
What is the (possible) best way?
Besides fixing my failing code (it would be great if someone could help me out), I am curious whether this is the way to do it (Folium map + marker + choropleth). Is it not possible to call geodataframe.explore(), which results in the map in the second picture, and then just add a marker to the same map? I have the feeling that I am making this too difficult; there must be a better solution using GeoPandas.
You have not provided the geometry, so I found geometry for areas of the Netherlands and used that instead.
explore() will draw a point as a marker when given the appropriate parameters.
Hence there are two layers: one is the postal areas coloured by number of deliveries; the second is the point, with the distance to each area calculated.
import geopandas as gpd
import shapely.geometry
import pandas as pd
import numpy as np
geo_url = "https://geodata.nationaalgeoregister.nl/cbsgebiedsindelingen/wfs?request=GetFeature&service=WFS&version=2.0.0&typeName=cbs_provincie_2017_gegeneraliseerd&outputFormat=json"
gdf = gpd.read_file(geo_url).assign(
deliveries=lambda d: np.random.randint(10**4, 10**6, len(d))
)
p = gpd.GeoSeries(shapely.geometry.Point(6.598276, 53.210903), crs="epsg:4326")  # lon/lat in WGS 84
# calc distances to point
gdf["distance"] = gdf.distance(p.to_crs(gdf.crs).values[0])
# dataframe of flattened distances
dfp = pd.DataFrame(
[
"<br>".join(
[f"{a} - {b:.2f}" for a, b in gdf.loc[:, ["statcode", "distance"]].values]
)
],
columns=["info"],
)
# generate colored choropleth
m = gdf.explore(
column="deliveries", categorical=True, legend=False, height=400, width=400
)
# add marker with distances
gpd.GeoDataFrame(
geometry=p,
data=dfp,
).explore(m=m, marker_type="marker")
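The distance step is worth isolating, because the result is only meaningful when the point and the areas share a projected CRS; in a geographic CRS such as EPSG:4326 the values would come out in degrees. Here is a toy sketch with two made-up square areas, assuming coordinates already expressed in EPSG:28992 (the Dutch RD New grid, in metres). The area names and coordinates are hypothetical.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical toy areas, already in a projected CRS with metre units
areas = gpd.GeoDataFrame(
    {"name": ["A", "B"]},
    geometry=[box(0, 0, 10, 10), box(20, 0, 30, 10)],
    crs="EPSG:28992",
)

# Marker location in the same CRS as the areas
marker = Point(12, 5)

# Shortest distance from the marker to each polygon (0 if the point is inside)
areas["distance_m"] = areas.distance(marker)
```

With real data the marker usually starts out as lon/lat, so it needs a to_crs() call to the areas' CRS first, as in the answer above.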

How to customize seaborn.scatterplot legends?

I plotted a scatterplot with the seaborn library and I want to change the legend text, but I don't know how to do that.
example:
The following is iris dataset with species columns encoded in 0/1/2 as per species.
plt.figure(figsize=(8,8))
pl = sns.scatterplot(x='petal_length', y ='petal_width', hue='Species', data=data, s=40,
palette='Set1', legend='full')
I want to change the legend text from [0, 1, 2] to ['setosa', 'versicolor', 'virginica'].
Can anybody help?
First, Seaborn (and Matplotlib) usually picks up the labels to put into the legend for hue from the unique values of the array you provide as hue. So as a first step, check that the column Species in your dataframe actually contains the values "setosa", "versicolor", "virginica". If not, one solution is to temporarily map them to other values, for the purpose of plotting:
legend_map = {0: 'setosa',
1: 'versicolor',
2: 'virginica'}
plt.figure(figsize=(8,8))
ax = sns.scatterplot(x=data['petal_length'], y =data['petal_width'], hue=data['species'].map(legend_map),
s=40, palette='Set1', legend='full')
plt.show()
Alternatively, if you want to directly manipulate the plot information and not the underlying data, you can do so by accessing the legend texts directly:
plt.figure(figsize=(8,8))
ax = sns.scatterplot(x='petal_length', y ='petal_width', hue='species', data=data, s=40,
palette='Set1', legend='full')
l = ax.legend()
l.get_texts()[0].set_text('Species') # You can also change the legend title
l.get_texts()[1].set_text('Setosa')
l.get_texts()[2].set_text('Versicolor')
l.get_texts()[3].set_text('Virginica')
plt.show()
This methodology allows you to also change the legend title, if need be.
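Editing legend texts by position depends on how many entries the legend holds, which I believe varies between seaborn versions (the hue name is not always the first text entry). A less position-dependent variant is to fetch the handles and rebuild the legend with new labels. The sketch below uses plain matplotlib with made-up points so that it runs on its own; an Axes returned by seaborn is a matplotlib Axes, so the same two calls should apply, but treat that as an assumption to verify against your version.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
# Three made-up groups labelled with their numeric codes
for code, xs in {0: [1, 2], 1: [3, 4], 2: [5, 6]}.items():
    ax.scatter(xs, xs, label=str(code))

# Map the numeric codes to readable names and rebuild the legend
names = {"0": "setosa", "1": "versicolor", "2": "virginica"}
handles, labels = ax.get_legend_handles_labels()
legend = ax.legend(handles, [names[label] for label in labels], title="Species")
```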

How to rotate ylabel of pairplot in seaborn? [duplicate]

I have a simple factorplot
import seaborn as sns
g = sns.factorplot("name", "miss_ratio", "policy", dodge=.2,
linestyles=["none", "none", "none", "none"], data=df[df["level"] == 2])
The problem is that the x labels all run together, making them unreadable. How do you rotate the text so that the labels are readable?
I had a problem with the answer by @mwaskom, namely that
g.set_xticklabels(rotation=30)
fails, because this also requires the labels. A bit easier than the answer by @Aman is to just add
plt.xticks(rotation=45)
You can rotate tick labels with the tick_params method on matplotlib Axes objects. To provide a specific example:
ax.tick_params(axis='x', rotation=90)
This is still a matplotlib object. Try this:
# <your code here>
locs, labels = plt.xticks()
plt.setp(labels, rotation=45)
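Calling set_xticklabels without labels only works on a FacetGrid, i.e. on the object returned by figure-level functions such as catplot:
g.set_xticklabels(rotation=30)
Axes-level functions such as barplot and countplot return a matplotlib Axes rather than a FacetGrid, so the labels have to be passed explicitly. The following works for them: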
g.set_xticklabels(g.get_xticklabels(), rotation=30)
Also, in case you have 2 graphs overlaid on top of each other, try set_xticklabels on the graph which supports it.
If anyone wonders how to do this for clustermap CorrGrids (part of a given seaborn example):
import seaborn as sns
import matplotlib.pyplot as plt
sns.set(context="paper", font="monospace")
# Load the dataset of correlations between cortical brain networks
df = sns.load_dataset("brain_networks", header=[0, 1, 2], index_col=0)
corrmat = df.corr()
# Set up the matplotlib figure
f, ax = plt.subplots(figsize=(12, 9))
# Draw the heatmap using seaborn
g=sns.clustermap(corrmat, vmax=.8, square=True)
rotation = 90
for i, ax in enumerate(g.fig.axes): ## getting all axes of the fig object
    ax.set_xticklabels(ax.get_xticklabels(), rotation = rotation)
g.fig.show()
You can also use plt.setp as follows:
import matplotlib.pyplot as plt
import seaborn as sns
plot=sns.barplot(data=df, x=" ", y=" ")
plt.setp(plot.get_xticklabels(), rotation=90)
to rotate the labels 90 degrees.
For a seaborn.heatmap, you can rotate these using (based on @Aman's answer)
pandas_frame = pd.DataFrame(data, index=names, columns=names)
heatmap = seaborn.heatmap(pandas_frame)
loc, labels = plt.xticks()
heatmap.set_xticklabels(labels, rotation=45)
heatmap.set_yticklabels(labels[::-1], rotation=45) # reversed order for y
One can do this with matplotlib.pyplot.xticks
import matplotlib.pyplot as plt
plt.xticks(rotation = 'vertical')
# Or use degrees explicitly
degrees = 70 # Adjust according to one's preferences/needs
plt.xticks(rotation=degrees)
Here one can see an example of how it works.
Use ax.tick_params(labelrotation=45). You can apply this to the Axes object returned by the plot without having to provide labels. This is an alternative to using the FacetGrid if that's not the path you want to take.
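As a small self-contained check of the tick_params route, here is a sketch on a plain matplotlib bar chart with made-up categories. An Axes returned by an axes-level seaborn function can be passed through the same call.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.bar(["alpha", "beta", "gamma"], [3, 1, 2])

# Rotate only the x tick labels; no label list has to be supplied
ax.tick_params(axis="x", labelrotation=45)

fig.canvas.draw()  # make sure ticks are laid out before inspecting them
rotations = [label.get_rotation() for label in ax.get_xticklabels()]
```

For a FacetGrid, the same call can be applied to each Axes in g.axes.flat.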
If the labels have long names it may be hard to get it right. A solution that worked well for me using catplot was:
import matplotlib.pyplot as plt
fig = plt.gcf()
fig.autofmt_xdate()

Setting correct limits with imshow if image data shape changes

I have a 3D array, of which the first two dimensions are spatial, so say (x,y). The third dimension contains point-specific information.
print(H.shape) # --> (200, 480, 640) spatial extents (200,480)
Now, by selecting a certain plane in the third dimension, I can display an image with
imdat = H[:,:,100] # shape (200, 480)
img = ax.imshow(imdat, cmap='jet',vmin=imdat.min(),vmax=imdat.max(), animated=True, aspect='equal')
I want to now rotate the cube, so that I switch from (x,y) to (y,x).
H = np.rot90(H) # could also use H.swapaxes(0,1) or H.transpose((1,0,2))
print(H.shape) # --> (480, 200, 640)
Now, when I call:
imdat = H[:,:,100] # shape (480,200)
img.set_data(imdat)
ax.relim()
ax.autoscale_view(tight=True)
I get weird behavior. The image along the rows displays the data till 200th row, and then it is black until the end of the y-axis (480). The x-axis extends from 0 to 200 and shows the rotated data. Now on, another rotation by 90-degrees, the image displays correctly (just rotated 180 degrees of course)
It seems to me like after rotating the data, the axis limits, (or image extents?) or something is not refreshing correctly. Can somebody help?
PS: to indulge in bad hacking, I also tried to regenerate a new image (by calling ax.imshow) after each rotation, but I still get the same behavior.
Below I include a solution to your problem. The method resetExtent uses the data and the image to explicitly set the extent to the desired values. Hopefully I correctly emulated the intended outcome.
import matplotlib.pyplot as plt
import numpy as np
def resetExtent(data,im):
    """
    Using the data and axes from an AxesImage, im, force the extent and
    axis values to match shape of data.
    """
    ax = im.axes  # im.get_axes() is not available in newer Matplotlib releases
    dataShape = data.shape
    if im.origin == 'upper':
        im.set_extent((-0.5,dataShape[0]-.5,dataShape[1]-.5,-.5))
        ax.set_xlim((-0.5,dataShape[0]-.5))
        ax.set_ylim((dataShape[1]-.5,-.5))
    else:
        im.set_extent((-0.5,dataShape[0]-.5,-.5,dataShape[1]-.5))
        ax.set_xlim((-0.5,dataShape[0]-.5))
        ax.set_ylim((-.5,dataShape[1]-.5))
def main():
    fig = plt.gcf()
    ax = fig.gca()
    H = np.zeros((200,480,10))
    # make distinguishing corner of data
    H[100:,...] = 1
    H[100:,240:,:] = 2
    imdat = H[:,:,5]
    datShape = imdat.shape
    im = ax.imshow(imdat,cmap='jet',vmin=imdat.min(),
                   vmax=imdat.max(),animated=True,
                   aspect='equal',
                   # origin='lower'
                   )
    resetExtent(imdat,im)
    fig.savefig("img1.png")
    H = np.rot90(H)
    imdat = H[:,:,0]
    im.set_data(imdat)
    resetExtent(imdat,im)
    fig.savefig("img2.png")
if __name__ == '__main__':
    main()
This script produces two images:
First un-rotated:
Then rotated:
I thought just explicitly calling set_extent would do everything resetExtent does, because it should adjust the axes limits if 'autoscale' is True. But for some unknown reason, calling set_extent alone does not do the job.
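For comparison, here is a variant that follows imshow's default convention, where rows run along the y-axis and columns along the x-axis. Note that this differs from resetExtent above, which maps the first array dimension to x. The helper name reset_extent and the random data are placeholders of mine, so treat this as a sketch rather than a drop-in replacement.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

def reset_extent(im, data):
    """Hypothetical helper: swap in new data and resize extent and limits to match."""
    nrows, ncols = data.shape
    im.set_data(data)
    if im.origin == "upper":
        extent = (-0.5, ncols - 0.5, nrows - 0.5, -0.5)
    else:
        extent = (-0.5, ncols - 0.5, -0.5, nrows - 0.5)
    im.set_extent(extent)
    # set the limits explicitly rather than relying on autoscaling
    ax = im.axes
    ax.set_xlim(extent[0], extent[1])
    ax.set_ylim(extent[2], extent[3])

fig, ax = plt.subplots()
H = np.random.default_rng(0).random((200, 480, 10))
im = ax.imshow(H[:, :, 5], cmap="jet")

H = np.rot90(H)  # first two axes rotated: shape is now (480, 200, 10)
reset_extent(im, H[:, :, 5])
```

After the call, the x-axis spans the 200 columns and the y-axis the 480 rows of the rotated slice.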

In Matplotlib, how do you add an Imagedraw object to a PyPlot?

I need to add a shape to a preexisting image generated using a pyplot (plt). The best way I know of to generate basic shapes quickly is using Imagedraw's predefined shapes. The original data has points with corresponding colors in line_holder and colorholder. I need to add a bounding box (or in this case ellipse) to the plot to make it obvious to the user whether the data is in an acceptable range.
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
from PIL import Image
...
lines = LineCollection(mpl.line_holder, colors=mpl.colorholder , linestyle='solid')
plt.axes().add_collection(lines)
plt.axes().set_aspect('equal', 'datalim')
plt.axes().autoscale_view(True,True,True)
plt.draw()
plt.show()
I tried inserting this before the show():
image = Image.new('1',(int(ceil(disc/conv))+2,int(ceil(disc/conv))+1), 1)
draw = ImageDraw.Draw(image)
box=(1, 1, int(ceil(disc/conv)), int(ceil(disc/conv))) #create bounding box
draw.ellipse(box, 1, 0) #draw circle in black
but I cannot find a way to then add this ellipse to the pyplot. Does anyone know how one would go about getting the images together? If it is not possible to add an imagedraw object to a pyplot, are there good alternatives for performing this type of operation?
Matplotlib has several patches (shapes) that appear to meet your needs (and remove PIL as a dependency). They are documented here. A helpful example using shapes is here.
To add an ellipse to a plot, you first create an Ellipse patch and then add that patch to the axes you're currently working on. Beware that Circles (or Ellipses with equal width and height) will appear elliptical if your aspect ratio is not equal.
In your snippet you call plt.axes() several times. This is unnecessary, as it is just returning the current axes object. I think it is clearer to keep the axes object and directly operate on it rather than repeatedly getting the same object via plt.axes(). As far as axes() is used in your snippet, gca() does the same thing. The end of my script demonstrates this.
I've also replaced your add_collection() line by plotting a single line. The two essentially do the same thing, and this allows my snippet to be executed as a standalone script.
import matplotlib.pyplot as plt
import matplotlib as mpl
# set up your axes object
ax = plt.axes()
ax.set_aspect('equal', 'datalim')
ax.autoscale_view(True, True, True)
# adding a LineCollection is equivalent to plotting a line
# this will run as a stand alone script
x = range(10)
plt.plot( x, x, 'x-')
# add an ellipse to the axes
c = mpl.patches.Ellipse( (5, 5), 1, 6, angle=45)
ax.add_patch(c)
# you can get the current axes a few ways
ax2 = plt.axes()
c2 = mpl.patches.Ellipse( (7, 7), 1, 6, angle=-45, color='green')
ax2.add_patch(c2)
ax3 = plt.gca()
c3 = mpl.patches.Ellipse( (0, 2), 3, 3, color='black')
ax3.add_patch(c3)
print(id(ax), id(ax2), id(ax3))
plt.show()
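A shorter sketch of the same idea keeps one Axes reference from plt.subplots() and draws an unfilled ellipse as the bounding shape. As far as I know, recent Matplotlib releases create a new Axes on each bare plt.axes() call instead of returning the current one, so holding on to the reference is the safer habit. The centre, size and colour below are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse

fig, ax = plt.subplots()
ax.plot(range(10), range(10), "x-")
ax.set_aspect("equal")  # otherwise circles and ellipses look distorted

# Unfilled ellipse used as a visual "acceptable range" boundary
bounds = Ellipse((5, 5), width=8, height=4, angle=45, fill=False, edgecolor="red")
ax.add_patch(bounds)
# plt.show() or fig.savefig(...) to view the result
```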

Resources