Adding a basemap to a plot in Geopandas using X,Y coords - geopandas

I am having trouble adding a basemap to my map. My geodataframe is created using X and Y coords of a bunch of points.
gdf = geo.GeoDataFrame(
df, geometry=gpd.points_from_xy(df['X'], df['Y']))
gdf.set_crs(epsg=3857)
Which looks like this:
After using contextily to get a basemap, I cannot get the basemap to show up properly. The coords should be showing the bottom of the Mississippi River Basin.
ax = gdf.plot(color="red", figsize=(9, 9))
cx.add_basemap(ax, zoom=0, crs= gdf.crs)
Let me know if there is anything wrong with my code as to why it is not showing up.
Thanks!

It looks like your data is in WGS84/EPSG:4326 (i.e. lat/lon) coordinates. So I think you're confusing geopandas.GeoDataFrame.set_crs, which tells geopandas what the CRS of the data is, with geopandas.GeoDataFrame.to_crs, which transforms the data from the current CRS to the new one you specify. Also note that neither of these operations is in-place by default. So I think you want:
gdf = geo.GeoDataFrame(
df, geometry=gpd.points_from_xy(df['X'], df['Y'])
)
gdf = gdf.set_crs("epsg:4326")
gdf_mercator = gdf.to_crs("epsg:3857")
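To see why labelling and transforming differ, here is a minimal sketch using only the standard library. It applies the spherical Mercator formula by hand to one point; the function name and the sample point are made up for illustration, and this is not how geopandas implements to_crs internally.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator (EPSG:3857)


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Forward spherical Mercator: degrees in, metres out (illustrative helper)."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# Hypothetical point near the lower Mississippi (lon, lat in degrees).
lon, lat = -90.0, 30.0

# Relabelling only (what set_crs does): the numbers stay (-90, 30), which
# EPSG:3857 reads as metres, i.e. a spot within 100 m of the CRS origin
# in the Gulf of Guinea.
mislabelled = (lon, lat)

# Transforming (what to_crs does): the numbers change to Web Mercator metres.
x, y = lonlat_to_web_mercator(lon, lat)

print(f"mislabelled: {mislabelled}")
print(f"transformed: ({x:.0f}, {y:.0f})")
```

The transformed coordinates are in the millions of metres, which is why lat/lon values declared as EPSG:3857 end up as a dot near (0, 0) on the basemap.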

This is really the same as @Michael Delgado's answer. It's simpler to state the CRS at GeoDataFrame construction time. Also make sure you are using the correct CRS.
MWE
import geopandas as gpd
import geopandas as geo
import pandas as pd
import contextily as cx
# construct a dataframe with X and Y of some points in US
places = gpd.read_file(
gpd.datasets.get_path("naturalearth_cities"),
mask=gpd.read_file(gpd.datasets.get_path("naturalearth_lowres")).loc[
lambda d: d["iso_a3"].eq("USA")
],
)
df = pd.DataFrame({"X": places.geometry.x, "Y": places.geometry.y})
# user code, state CRS at construction time
gdf = geo.GeoDataFrame(
df, geometry=gpd.points_from_xy(df["X"], df["Y"]), crs="epsg:4326"
)
ax = gdf.plot(color="red", figsize=(9, 9))
cx.add_basemap(ax, zoom=0, crs=gdf.crs)

Related

Add location marker on plotted Geopandas Dataframe using Folium

Context
I have a merged geodataframe, called results, of (1) postal code areas and (2) the total number of deliveries within each postal code area in the city of Groningen. Its geometry column contains Polygons and MultiPolygons representing the different postal code areas within the city.
I am new to GeoPandas, so I've tried different tutorials, including one from the official geopandas website that introduced me to interactive Folium maps, which I really like. I was able to plot my geodataframe using results.explore(), which resulted in the following map:
The problem
So far so good, but now I simply want to place a marker using the folium library, with the goal of calculating the distance between the marker and the postal code areas. After some searching on the internet, I found in the quickstart guide that you need to create a folium.Map, then create a folium.Choropleth for the geodataframe and a folium.Marker, and add them to the folium.Map.
m = folium.Map(location=[53.21917, 6.56667], zoom_start=15)
folium.Marker(
[53.210903, 6.598276],
popup="My marker"
).add_to(m)
folium.Choropleth(results, data=results, columns="Postcode", fill_color='OrRd', name="Postalcode areas").add_to(m)
folium.LayerControl().add_to(m)
m
But when I try to run the above code, I get the following error:
What is the (possible) best way?
Besides help with my failing code (which would be great), I am curious whether this is the right way to do it (Folium map + marker + choropleth). Is it not possible to call geodataframe.explore(), which results in the map in the second picture, and then just add a marker to the same map? I have the feeling that I am making this too difficult; there must be a better solution using GeoPandas.
You have not provided the geometry, so I found postal districts of the Netherlands and used those instead.
explore() will draw a point as a marker when given the appropriate parameters.
Hence there are two layers:
one is the postal areas, coloured by number of deliveries
the second is the point, with the distance to each area calculated
import geopandas as gpd
import shapely.geometry
import pandas as pd
import numpy as np
geo_url = "https://geodata.nationaalgeoregister.nl/cbsgebiedsindelingen/wfs?request=GetFeature&service=WFS&version=2.0.0&typeName=cbs_provincie_2017_gegeneraliseerd&outputFormat=json"
gdf = gpd.read_file(geo_url).assign(
deliveries=lambda d: np.random.randint(10**4, 10**6, len(d))
)
p = gpd.GeoSeries(shapely.geometry.Point(6.598276, 53.210903), crs="epsg:4326")
# calc distances to point
gdf["distance"] = gdf.distance(p.to_crs(gdf.crs).values[0])
# dataframe of flattened distances
dfp = pd.DataFrame(
[
"<br>".join(
[f"{a} - {b:.2f}" for a, b in gdf.loc[:, ["statcode", "distance"]].values]
)
],
columns=["info"],
)
# generate colored choropleth
m = gdf.explore(
column="deliveries", categorical=True, legend=False, height=400, width=400
)
# add marker with distances
gpd.GeoDataFrame(
geometry=p,
data=dfp,
).explore(m=m, marker_type="marker")
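The distances from gdf.distance are in the units of the layer's CRS, so a rough independent check can be useful. The sketch below is a swapped-in technique, not part of the answer: it computes the great-circle (haversine) distance between two lat/lon points with the standard library. The helper name is hypothetical; the two points are the map centre and the marker from the question.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2, radius_m=6371008.8):
    """Great-circle distance in metres on a sphere of mean Earth radius."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


map_centre = (53.21917, 6.56667)   # (lat, lon) passed to folium.Map in the question
marker = (53.210903, 6.598276)     # (lat, lon) passed to folium.Marker in the question

d = haversine_m(*map_centre, *marker)
print(f"{d:.0f} m")  # a little over 2 km
```

This measures point-to-point distance only; distance to a polygon, as in the answer, still needs gdf.distance in a projected CRS.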

Annotate a geoplot when using a projection

I have a dataframe with the following columns: Name (string), size (num), latitude (num), longitude (num), geometry (shapely.geometry.point.Point).
When I plot my points on a map and try to annotate each point, the annotation is not shown at all. My guess is that this is due to the projection I'm using.
Here are the lines of code I'm running:
import geopandas as gpd
import geoplot as gplt
import matplotlib.pyplot as plt
proj = gplt.crs.AlbersEqualArea()
fig, ax = plt.subplots(figsize=(10, 10), subplot_kw={'projection': proj})
gdf = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df.longitude, df.latitude))
gplt.pointplot(gdf, hue='size', s=15, ax=ax, cmap=palette, legend=True, zorder=10)
for idx, row in gdf.iterrows():
    plt.annotate(s=row['Name'], xy=[row['latitude'],row['longitude']])
plt.show()
You need a coordinate transformation in
plt.annotate(s=row['Name'], xy=[row['latitude'],row['longitude']])
because xy must be given in the projected coordinates of the axes, not in latitude/longitude. The transformation should be
xtran = gplt.crs.ccrs.AlbersEqualArea()
Replace that line with
x, y = xtran.transform_point(row['longitude'], row['latitude'], gplt.crs.ccrs.PlateCarree())
plt.annotate(row['Name'], xy=(x, y))

contextily making weird background maps

This is my code:
import pandas as pd
import geoplot as gplt
import geopandas as gpd
import geoplot.crs as gcrs
import contextily
df = pd.read_csv('dataframe_master.csv', index_col='id')
crs = {'init': 'epsg:4326'}
geometry = [geometry.Point(xy) for xy in zip(df['latitude'], df['longitude'])]
df_geo = gpd.GeoDataFrame(df_geo, crs=crs, geometry=geometry)
test = df_geo[:200000]
test = test.to_crs(epsg=3857)
ax = test.plot(marker='o', markersize=1)
contextily.add_basemap(ax)
plt.show()
And it generates an image that doesn't show a background map and seems a little distorted.
My coordinate data was originally made with the RD-coordinaten standard (EPSG:28992), which I converted to EPSG:4326 with this code:
lon_l = []
lat_l = []
p1 = Proj(init='epsg:28992')
p2 = Proj(proj='latlong',datum='WGS84')
for row in range(len(df)):
    lon, lat, z = transform(p1, p2, df.iloc[row, 7], df.iloc[row, 8], 0.0)
    lon_l.append(lon)
    lat_l.append(lat)
I did a sanity check on the longitude latitude output by comparing to some online converters, and the output points to the correct locations.
I tried following this solution: https://gis.stackexchange.com/questions/348339/using-crs-epsg3857-but-misalignment-between-stamen-background-and-coordinates-o in case my conversion was missing the "towgs84" part, but the image still looked the same, with a slightly different colour.
I figured it out! I should've listed longitude before latitude when building the geometry.
geometry = [geometry.Point(xy) for xy in zip(df['longitude'], df['latitude'])]
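A cheap guard against this kind of axis swap is to check a few points against a rough bounding box before plotting. The sketch below is only an illustration: the box limits are approximate values I am assuming for the European Netherlands, and the helper and sample record are hypothetical.

```python
# Approximate bounding box of the European Netherlands in degrees (assumed values).
NL_LON_RANGE = (3.0, 7.5)
NL_LAT_RANGE = (50.5, 54.0)


def in_netherlands_bbox(x, y):
    """Return True if (x, y), read as (longitude, latitude), falls inside the box."""
    return (NL_LON_RANGE[0] <= x <= NL_LON_RANGE[1]
            and NL_LAT_RANGE[0] <= y <= NL_LAT_RANGE[1])


# Hypothetical record: a point in the north of the Netherlands.
record = {"latitude": 53.2194, "longitude": 6.5665}

correct = (record["longitude"], record["latitude"])  # x = lon, y = lat
swapped = (record["latitude"], record["longitude"])  # the ordering bug from the question

print(in_netherlands_bbox(*correct))  # True
print(in_netherlands_bbox(*swapped))  # False
```

Note that a plain range check on valid longitudes and latitudes would not catch the swap here, because both values are legal in either position; a region-specific box does.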

Geoview and geopandas groupby projection error

I’m experiencing projection errors following a groupby on geodataframe. Below you will find the libraries that I am using:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
import holoviews as hv
from holoviews import opts
import panel as pn
from bokeh.resources import INLINE
import geopandas as gpd
import geoviews as gv
from cartopy import crs
hv.extension('bokeh', 'matplotlib')
gv.extension('bokeh')
pd.options.plotting.backend = 'holoviews'
Whilst these are the versions of some key libraries:
bokeh 2.1.1
geopandas 0.6.1
geoviews 1.8.1
holoviews 1.13.3
I have concatenated 3 shapefiles to build a polygon picture of UK healthcare boundaries (links to files provided if needed). Unfortunately, from what I have found, the UK doesn't produce one file that combines all of those, so I have had to merge the shapefiles from the 3 individual countries I'm interested in. The 3 shape files have a size of:
shape file 1 = (https://www.opendatani.gov.uk/dataset/department-of-health-trust-boundaries)
shape file 2 = (https://geoportal.statistics.gov.uk/datasets/5252644ec26e4bffadf9d3661eef4826_4)
shape file 3 = (https://data.gov.uk/dataset/31ab16a2-22da-40d5-b5f0-625bafd76389/local-health-boards-december-2016-ultra-generalised-clipped-boundaries-in-wales)
My code to concat them together is below:
England_CCG.drop(['objectid', 'bng_e', 'bng_n', 'long', 'lat', 'st_areasha', 'st_lengths'], inplace = True, axis = 1 )
Wales_HB.drop(['objectid', 'bng_e', 'bng_n', 'long', 'lat', 'st_areasha', 'st_lengths', 'lhb16nmw'], inplace = True, axis = 1 )
Scotland_HB.drop(['Shape_Leng', 'Shape_Area'], inplace = True, axis = 1)
#NI_HB.drop(['Shape_Leng', 'Shape_Area'], inplace = True, axis = 1 )
England_CCG.rename(columns={'ccg20cd': 'CCG_Code', 'ccg20nm': 'CCG_Name'}, inplace = True )
Wales_HB.rename(columns={'lhb16cd': 'CCG_Code', 'lhb16nm': 'CCG_Name'}, inplace = True )
Scotland_HB.rename(columns={'HBCode': 'CCG_Code', 'HBName': 'CCG_Name'}, inplace = True )
#NI_HB.rename(columns={'TrustCode': 'CCG_Code', 'TrustName': 'CCG_Name'}, inplace = True )
UK_shape = [England_CCG, Wales_HB, Scotland_HB]
Merged_Shapes = gpd.GeoDataFrame(pd.concat(UK_shape))
Each of the files has the same esri projection once joined, and the shape plots perfectly as one when I run:
Test= gv.Polygons(Merged_Shapes, vdims=[('CCG_Name')], crs=crs.OSGB())
This gives me a polygon plot of the UK, with all the area boundaries for each ccg.
To my geodataframe, I then add a new column, called ‘Country’ which attributes each CCG to whatever the country they belong to. So, all the Welsh CCGs are attributed to Wales, all the English ones to England and all the Scottish ones to Scotland. Just a simple additional grouping of the data really.
What I want to achieve is to have a dropdown next to the polygon map I am making that will show all the CCGs in a particular country when it is selected from the dropdown widget. I understand that the way to do this is with a groupby. However, when I use the following code to achieve this:
c1 = gv.Polygons(Merged_Shapes, vdims=[('CCG_Name','Country')], crs=crs.OSGB()).groupby(['Country'])
I get a long list of projection errors stating:
“WARNING:param.project_path: While projecting a Polygons element from a PlateCarree coordinate reference system (crs) to a Mercator projection none of the projected paths were contained within the bounds specified by the projection. Ensure you have specified the correct coordinate system for your data.”
As a result, I am left without a map, though I retain the widget. Does anyone know what is going wrong here and what a possible solution would be? It's been driving me crazy!
Kind regards,
For some reason geoviews doesn't like the OSGB projection followed by a groupby, as it tries to default back to the PlateCarree projection.
The way I fixed it was to reproject the entire dataset to EPSG:4326. For anyone who also runs into this problem, the code is below (it is a well-documented solution):
Merged_Shapes.to_crs({'init': 'epsg:4326'},inplace=True)
gv.Polygons(Merged_Shapes, vdims=[('CCG_Name'),('Country')]).groupby('Country')
The groupby works fine after this.

How to rotate ylabel of pairplot in seaborn? [duplicate]

I have a simple factorplot
import seaborn as sns
g = sns.factorplot("name", "miss_ratio", "policy", dodge=.2,
linestyles=["none", "none", "none", "none"], data=df[df["level"] == 2])
The problem is that the x labels all run together, making them unreadable. How do you rotate the text so that the labels are readable?
I had a problem with the answer by @mwaskom, namely that
g.set_xticklabels(rotation=30)
fails, because this also requires the labels. A bit easier than the answer by @Aman is to just add
plt.xticks(rotation=45)
You can rotate tick labels with the tick_params method on matplotlib Axes objects. To provide a specific example:
ax.tick_params(axis='x', rotation=90)
This is still a matplotlib object. Try this:
# <your code here>
locs, labels = plt.xticks()
plt.setp(labels, rotation=45)
For seaborn plots built on FacetGrid (e.g. catplot), the following works without passing labels:
g.set_xticklabels(rotation=30)
However, barplot, countplot, etc. are not built on FacetGrid and return a matplotlib Axes, whose set_xticklabels requires the labels. The following will work for them:
g.set_xticklabels(g.get_xticklabels(), rotation=30)
Also, in case you have 2 graphs overlaid on top of each other, try set_xticklabels on the graph which supports it.
If anyone wonders how to do this for clustermap CorrGrids (part of a given seaborn example):
import seaborn as sns
import matplotlib.pyplot as plt
sns.set(context="paper", font="monospace")
# Load the dataset of correlations between cortical brain networks
df = sns.load_dataset("brain_networks", header=[0, 1, 2], index_col=0)
corrmat = df.corr()
# Set up the matplotlib figure
f, ax = plt.subplots(figsize=(12, 9))
# Draw the heatmap using seaborn
g=sns.clustermap(corrmat, vmax=.8, square=True)
rotation = 90
for i, ax in enumerate(g.fig.axes):  # getting all axes of the fig object
    ax.set_xticklabels(ax.get_xticklabels(), rotation=rotation)
g.fig.show()
You can also use plt.setp as follows:
import matplotlib.pyplot as plt
import seaborn as sns
plot=sns.barplot(data=df, x=" ", y=" ")
plt.setp(plot.get_xticklabels(), rotation=90)
to rotate the labels 90 degrees.
For a seaborn.heatmap, you can rotate these using (based on @Aman's answer)
pandas_frame = pd.DataFrame(data, index=names, columns=names)
heatmap = seaborn.heatmap(pandas_frame)
loc, labels = plt.xticks()
heatmap.set_xticklabels(labels, rotation=45)
heatmap.set_yticklabels(labels[::-1], rotation=45) # reversed order for y
One can do this with matplotlib.pyplot.xticks
import matplotlib.pyplot as plt
plt.xticks(rotation = 'vertical')
# Or use degrees explicitly
degrees = 70 # Adjust according to one's preferences/needs
plt.xticks(rotation=degrees)
Here one can see an example of how it works.
Use ax.tick_params(labelrotation=45). You can apply this to the axes figure from the plot without having to provide labels. This is an alternative to using the FacetGrid if that's not the path you want to take.
If the labels have long names it may be hard to get it right. A solution that worked well for me using catplot was:
import matplotlib.pyplot as plt
fig = plt.gcf()
fig.autofmt_xdate()
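The Axes-level approaches from the answers above can be sketched with plain matplotlib, since seaborn's Axes-level functions return an ordinary matplotlib Axes. This is a minimal illustration with made-up data and a headless backend, not code from any of the answers.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up categories with long names that would otherwise run together.
names = ["first_long_label", "second_long_label", "third_long_label"]
values = [3, 1, 2]

fig, ax = plt.subplots()
ax.bar(names, values)

# Option 1: rotate through tick parameters; no label list is needed.
ax.tick_params(axis="x", rotation=45)

# Option 2: set rotation (and alignment) on the existing tick label objects.
plt.setp(ax.get_xticklabels(), rotation=45, ha="right")

fig.tight_layout()
fig.canvas.draw()

rotations = [label.get_rotation() for label in ax.get_xticklabels()]
print(rotations)
```

Either option alone is enough; setting ha="right" in the second one keeps the end of each rotated label under its tick.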

Resources