I have a working sns.heatmap showing the volatility of currency pairs. What I want is to retain the color from the volatility but superimpose the numeric spot rate in the relevant cell.
This way each cell displays two values: the spot rate as a number and the volatility as a color. Is there a way to do this directly with sns.heatmap or, failing that, to grab the graphical info and superimpose the spot data?
Never mind, I've found an updated part of the seaborn documentation: annot can now be set to an array of values separate from the data that drives the color map.
Simple example:
import numpy as np
import pandas as pd
import seaborn as sns
# numeric data drives the cell colors
data_number = pd.DataFrame(np.random.randint(0, 3, (11,5)), columns=['A', 'B', 'C', 'D', 'E'], index = range(2000, 2011, 1))
# same-shaped frame of strings supplies the text shown in each cell
data_string = data_number.replace(0,"").replace(1,"Some").replace(2,"Lots")
sns.heatmap(data_number, annot=data_string,
xticklabels=data_number.columns,
yticklabels=data_number.index,
fmt="s", cmap = sns.cm.rocket_r)
Results in:
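Applied to the original question, the same idea might look like the sketch below: one frame of volatilities drives the colors and a second, same-shaped frame of spot rates supplies the annotations. The currency labels and all numbers are made up for illustration, and the "viridis" colormap is an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

base = ["EUR", "GBP"]    # row labels: base currency (illustrative)
quote = ["USD", "JPY"]   # column labels: quote currency (illustrative)

# Made-up volatilities: these values decide the cell colors
volatility = pd.DataFrame([[7.1, 9.3], [8.2, 10.6]], index=base, columns=quote)
# Made-up spot rates: these values are printed inside the cells
spot = pd.DataFrame([[1.08, 162.50], [1.27, 191.30]], index=base, columns=quote)

# Color comes from `volatility`, text comes from `spot`
ax = sns.heatmap(volatility, annot=spot, fmt=".2f", cmap="viridis")
cell_texts = [t.get_text() for t in ax.texts]
```

Because annot only has to match the shape of the plotted data, the two frames can come from entirely different sources as long as their rows and columns line up.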
My dataframe has a column 'rideable_type' which has 3 unique values:
1.classic_bike
2.docked_bike
3.electric_bike
While plotting a barplot using the following code:
g = sns.FacetGrid(electric_casual_type_week, col='member_casual', hue='rideable_type', height=7, aspect=0.65)
g.map(sns.barplot, 'day_of_week', 'number_of_rides').add_legend()
I only get a plot showing 2 unique 'rideable_type' values.
Here is the plot:
As you can see only 'electric_bike' and 'classic_bike' are seen and not 'docked_bike'.
The main problem is that all the bars are drawn on top of each other. Seaborn's barplots don't easily support stacked bars. Also, this way of creating the barplot doesn't support the default "dodging" (barplot is called separately for each hue value, while it would be needed to call it in one go for dodging to work).
Therefore, the recommended way is to use catplot, a special version of FacetGrid for categorical plots.
g = sns.catplot(kind='bar', data=electric_casual_type_week, x='day_of_week', y='number_of_rides',
col='member_casual', hue='rideable_type', height=7, aspect=0.65)
Here is an example using Seaborn's 'tips' dataset:
import seaborn as sns
tips = sns.load_dataset('tips')
g = sns.FacetGrid(data=tips, col='time', hue='sex', height=7, aspect=0.65)
g.map_dataframe(sns.barplot, x='day', y='total_bill')
g.add_legend()
When comparing with sns.catplot, the coinciding bars are clear:
g = sns.catplot(kind='bar', data=tips, x='day', y='total_bill', col='time', hue='sex', height=7, aspect=0.65)
I've followed this tutorial:
https://towardsdatascience.com/creating-beautiful-maps-with-python-6e1aae54c55c
and the one it was derived from.
They pass a list of edge colors to the plot_graph function
like so:
fig, ax = ox.plot_graph(gdf, node_size=0, bbox = (north, south, east, west),figsize=(height, width),
dpi = 96,bgcolor = bgcolor,
save = False, edge_color=roadColors,
edge_linewidth=roadWidths, edge_alpha=1)
I don't think they're assigned the way that the tutorial indicates.
On GitHub I found get_edge_colors_by_attr, which seems to take attributes into account.
How are the colors assigned?
Specifically I am asking because I'd like to plot "highways" in different colors based on their openstreetmap tag.
How does osmnx.plot_graph determine which edges get which colors?
You can see how it does it here. Essentially, it either applies a single color to all edges or, if you passed it a list of colors, it assigns the first color in the list to the first edge in the graph, the second to the second, the third to the third, and so on.
Specifically I am asking because I'd like to plot "highways" in different colors based on their openstreetmap tag.
You can create a list of colors based on the edges' highway attribute values:
import osmnx as ox
G = ox.graph_from_place('Piedmont, California, USA', network_type='drive')
# assign colors to edges based on "highway" value
hwy_color = {'residential': 'gray',
'secondary': 'r',
'tertiary': 'y',
'tertiary_link': 'b',
'unclassified': 'm'}
edges = ox.graph_to_gdfs(G, nodes=False)['highway']
ec = edges.replace(hwy_color)
# plot graph using these colors
fig, ax = ox.plot_graph(G, edge_color=ec)
Also, you mentioned get_edge_colors_by_attr but note that per the docs the attribute must be numeric.
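One caveat with a plain replace is that any highway value missing from the dictionary stays as the raw tag, which is not a valid color, and simplified graphs can store a list of tags on a single edge. A minimal sketch of a more defensive mapping follows. It uses a hand-made pandas Series as a stand-in for the edges' highway column, and the fallback color and the choice to use the first tag of a list are assumptions for illustration.

```python
import pandas as pd

hwy_color = {'residential': 'gray',
             'secondary': 'r',
             'tertiary': 'y',
             'tertiary_link': 'b',
             'unclassified': 'm'}
DEFAULT_COLOR = 'w'  # assumption: fallback for tags not listed above

def edge_color(highway):
    """Return a color for one edge's highway value (a string or a list of strings)."""
    if isinstance(highway, list):
        highway = highway[0]  # assumption: the first tag decides the color
    return hwy_color.get(highway, DEFAULT_COLOR)

# Stand-in for ox.graph_to_gdfs(G, nodes=False)['highway']
highway = pd.Series(['residential', 'secondary', ['tertiary', 'residential'], 'service'])
ec = highway.map(edge_color)
```

The resulting ec has one color per edge in edge order, which is the form that edge_color expects in the plotting call above.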
I plotted a scatterplot with the seaborn library and I want to change the legend text, but I don't know how to do that.
Example:
The following uses the iris dataset, with the species column encoded as 0/1/2.
plt.figure(figsize=(8,8))
pl = sns.scatterplot(x='petal_length', y ='petal_width', hue='Species', data=data, s=40,
palette='Set1', legend='full')
I want to change the legend text from [0, 1, 2] to ['setosa', 'versicolor', 'virginica'].
Can anybody help?
First, Seaborn (and Matplotlib) usually picks up the labels to put into the legend for hue from the unique values of the array you provide as hue. So as a first step, check that the column Species in your dataframe actually contains the values "setosa", "versicolor", "virginica". If not, one solution is to temporarily map them to other values, for the purpose of plotting:
legend_map = {0: 'setosa',
1: 'versicolor',
2: 'virginica'}
plt.figure(figsize=(8,8))
ax = sns.scatterplot(x=data['petal_length'], y =data['petal_width'], hue=data['species'].map(legend_map),
s=40, palette='Set1', legend='full')
plt.show()
Alternatively, if you want to directly manipulate the plot information and not the underlying data, you can do so by accessing the legend texts directly:
plt.figure(figsize=(8,8))
ax = sns.scatterplot(x='petal_length', y ='petal_width', hue='species', data=data, s=40,
palette='Set1', legend='full')
l = ax.legend()
l.get_texts()[0].set_text('Species') # You can also change the legend title
l.get_texts()[1].set_text('Setosa')
l.get_texts()[2].set_text('Versicolor')
l.get_texts()[3].set_text('Virginica')
plt.show()
This methodology allows you to also change the legend title, if need be.
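Indexing the legend texts by position depends on whether the seaborn version in use places the hue name as the first legend entry or as the legend title. A sketch that avoids fixed positions is to look each existing legend text up in a mapping and leave anything unknown untouched. The tiny data frame below is a made-up stand-in for the iris data, and the mapping keys are strings because legend texts are strings.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up stand-in for the iris data with species encoded as 0/1/2
data = pd.DataFrame({'petal_length': [1.4, 4.7, 6.0],
                     'petal_width': [0.2, 1.4, 2.5],
                     'species': [0, 1, 2]})
legend_map = {'0': 'setosa', '1': 'versicolor', '2': 'virginica'}

fig, ax = plt.subplots(figsize=(8, 8))
sns.scatterplot(x='petal_length', y='petal_width', hue='species', data=data,
                s=40, palette='Set1', legend='full', ax=ax)

# Rename only the entries found in the mapping; a title-like entry is left as is
for text in ax.get_legend().get_texts():
    text.set_text(legend_map.get(text.get_text(), text.get_text()))

legend_labels = [t.get_text() for t in ax.get_legend().get_texts()]
```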
Suppose I have loaded a dataset as follows:
ds = yt.load('pltxxx')
The dataset includes the following fields
density, mag_vort, tracer, x_velocity, y_velocity
One can simply plot the mag_vort which is the magnitude of vorticity in 2D domain in this case, by means of:
slc = yt.SlicePlot(ds, 'z', 'mag_vort')
I would like to export the x-coordinates, y-coordinates and vorticity magnitude to a txt file (or numpy array), or plot them via a matplotlib scatter plot:
plt.scatter(x_coor, y_coor, c=mag_vort)
Is there an easy way to extract that information from the dataset?
You can use a data object (in this case we use the all_data data object) to access the field values for the 'x', 'y', and 'mag_vort' fields:
ad = ds.all_data()
x = ad['x']
y = ad['y']
mag_vort = ad['mag_vort']
The arrays you get back from accessing a data object are YTArray instances. YTArray is a subclass of numpy's ndarray that has units attached.
Before you pass these arrays to matplotlib, convert them to whichever units you want to do the plot in, then cast them to numpy arrays:
x_plot = np.array(x.to('km'))
y_plot = np.array(y.to('km'))
plt.scatter(x_plot, y_plot, c=np.array(mag_vort))
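For the text-file part of the question, one option is to stack the converted arrays into columns and write them with numpy. The sketch below uses small made-up arrays as stand-ins for x_plot, y_plot and the vorticity values, assumes all three are 1-D and of equal length, and writes to a temporary file whose name is arbitrary.

```python
import os
import tempfile

import numpy as np

# Made-up stand-ins for the plain numpy arrays obtained from the data object
x_plot = np.array([0.0, 1.0, 0.0, 1.0])
y_plot = np.array([0.0, 0.0, 1.0, 1.0])
vort = np.array([0.5, 1.5, 2.5, 3.5])

# One row per cell, columns: x, y, mag_vort
table = np.column_stack([x_plot, y_plot, vort])

out_path = os.path.join(tempfile.mkdtemp(), "mag_vort.txt")
np.savetxt(out_path, table, header="x y mag_vort")

# Reading the file back returns the same (n_cells, 3) array
reloaded = np.loadtxt(out_path)
```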
I am using scikit-image to get the "regionprops" of a segmented image. I then wish to replace each of the segment labels with its corresponding statistic (e.g. eccentricity).
from skimage import segmentation
from skimage.measure import regionprops
#a segmented image
labels = segmentation.slic(img1, compactness=10, n_segments=200)
propimage = labels
#props loop
for region in regionprops(labels1, properties ='eccentricity') :
eccentricity = region.eccentricity
propimage[propimage==region] = eccentricity
This runs, but the propimage values do not change from their original labels
I have also tried:
for i in range(0,max(labels)):
prop = regions[i].eccentricity #the way to cal a single prop
propimage[i]= prop
This delivers this error
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
I am a recent migrant from MATLAB, where I have implemented this, but the data structures used are completely different.
Can anyone help me with this?
Thanks
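Use ndimage from SciPy: its sum() function can operate on your label array.
import numpy as np
from scipy import ndimage as nd
# label_file is assumed to be the (labeled array, number of labels) tuple returned by nd.label
# labels run from 1 to label_file[1]; 0 is the background
sizes = nd.sum(label_file[0]>0, labels=label_file[0], index=np.arange(1, label_file[1] + 1))
You can then evaluate the distribution with numpy.histogram and so on.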
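For the eccentricity case specifically, a minimal sketch that stays within scikit-image is shown below. It assumes three things about the original loop: that the output must be a separate float array (assigning `propimage = labels` only aliases the integer label array, so fractional values would be truncated), that the mask should compare against `region.label` rather than the region object, and that label 0 can be treated as background, since regionprops skips it. The small hand-made label image stands in for the slic output.

```python
import numpy as np
from skimage.measure import regionprops

# Hand-made label image standing in for the slic output (0 = background)
labels = np.zeros((6, 8), dtype=int)
labels[0:2, 0:6] = 1   # elongated 2x6 rectangle
labels[3:6, 0:3] = 2   # 3x3 square
labels[3:6, 5:7] = 3   # 3x2 rectangle

# Separate float image so eccentricities are not truncated to integers
propimage = np.zeros(labels.shape, dtype=float)

for region in regionprops(labels):
    # Select the pixels of this segment by its integer label
    propimage[labels == region.label] = region.eccentricity
```

Every pixel of a segment now holds that segment's eccentricity, so propimage can be displayed or thresholded like any other image.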