Is there a simpler way to plot GeoPandas data in an Altair chart? - geopandas

The basic way to display a GeoDataFrame in Altair:
import altair as alt
import geopandas as gpd
alt.renderers.enable('notebook')
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
data = alt.InlineData(values = world[world.continent=='Africa'].__geo_interface__, #geopandas to geojson
# root object type is "FeatureCollection" but we need its features
format = alt.DataFormat(property='features',type='json'))
alt.Chart(data).mark_geoshape(
).encode(
color='properties.pop_est:Q', # GeoDataFrame fields are accessible through a "properties" object
tooltip=['properties.name:N','properties.pop_est:Q']
).properties(
width=500,
height=300
)
But it will crash if I add a column with NaN or datetime values.

First, you can use world = alt.utils.sanitize_dataframe(world) to convert columns with JSON-incompatible types.
Alternatively, you can use the gpdvega module to simplify the code.
import altair as alt
import geopandas as gpd
import gpdvega
alt.renderers.enable('notebook')
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
alt.Chart(world[world.continent=='Africa']).mark_geoshape(
).encode(
color='pop_est',
tooltip=['name','pop_est']
).properties(
width=500,
height=300
)
Just pip install gpdvega and import gpdvega. Altair will then work with a GeoDataFrame as it does with a regular DataFrame. See the gpdvega documentation for details.
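To illustrate the kind of conversion that JSON-incompatible columns need, here is a minimal standard-library sketch. It is not how alt.utils.sanitize_dataframe or gpdvega work internally; json_safe and sanitize_features are hypothetical helper names, and the single feature is made-up data. It only handles float NaN and Python date/datetime values in GeoJSON-style "properties".

```python
import json
import math
from datetime import date, datetime


def json_safe(value):
    """Return a JSON-serialisable stand-in for value (hypothetical helper)."""
    if isinstance(value, float) and math.isnan(value):
        return None  # NaN is not valid JSON, so emit null instead
    if isinstance(value, (datetime, date)):
        return value.isoformat()  # dates become ISO 8601 strings
    return value


def sanitize_features(features):
    """Copy GeoJSON-like features, cleaning only their "properties" dict."""
    return [
        {
            **feature,
            "properties": {
                key: json_safe(val) for key, val in feature["properties"].items()
            },
        }
        for feature in features
    ]


# Made-up feature with one NaN and one datetime property.
features = [
    {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [30.0, -1.0]},
        "properties": {
            "name": "A",
            "pop_est": float("nan"),
            "updated": datetime(2020, 1, 2),
        },
    }
]

clean = sanitize_features(features)
# allow_nan=False makes json.dumps raise ValueError if a NaN survived.
text = json.dumps(clean, allow_nan=False)
```

The original features list is left untouched, so the cleaned copy can be passed to a chart while the source data keeps its native types.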

Related

Adding a basemap to a plot in Geopandas using X,Y coords

I am having trouble adding a basemap to my map. My geodataframe is created using X and Y coords of a bunch of points.
gdf = geo.GeoDataFrame(
df, geometry=gpd.points_from_xy(df['X'], df['Y']))
gdf.set_crs(epsg=3857)
Which looks like this:
After using contextily to get a basemap, I cannot get the basemap to show up properly. The coords should show the bottom of the Mississippi River Basin.
ax = gdf.plot(color="red", figsize=(9, 9))
cx.add_basemap(ax, zoom=0, crs= gdf.crs)
Let me know if there is anything wrong with my code that would explain why it is not showing up.
Thanks!
It looks like your data is in WGS84/EPSG:4326 (i.e. lat/lon) coordinates. So I think you're confusing geopandas.GeoDataFrame.set_crs, which tells geopandas what the CRS of the data is, with geopandas.GeoDataFrame.to_crs, which transforms the data from the current CRS to the new one you specify. Also note that neither of these operations is in-place by default, so the result has to be assigned. So I think you want:
gdf = geo.GeoDataFrame(
df, geometry=gpd.points_from_xy(df['X'], df['Y'])
)
gdf = gdf.set_crs("epsg:4326")
gdf_mercator = gdf.to_crs("epsg:3857")
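The difference between labelling and transforming can be seen numerically. The sketch below applies the standard spherical Web Mercator formulas (the projection behind EPSG:3857) by hand, only to show roughly what a to_crs call computes; lonlat_to_web_mercator is a hypothetical helper, and the sample point near the Mississippi delta is an assumed value, not taken from the question's data.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator (EPSG:3857)


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Convert lon/lat in degrees to Web Mercator x/y in metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return x, y


# Assumed sample point (lon, lat) roughly at the Mississippi delta.
lon, lat = -89.25, 29.15
x, y = lonlat_to_web_mercator(lon, lat)

# set_crs(epsg=3857) would keep (-89.25, 29.15) and merely call it metres,
# which places the point about 90 m from the origin of the projection.
# to_crs("epsg:3857") moves it to values in the millions of metres.
print(f"degrees: ({lon}, {lat}) -> metres: ({x:.0f}, {y:.0f})")
```

This is why degrees labelled as EPSG:3857 end up as a tiny cluster next to (0, 0) on a basemap instead of in the southern United States.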
This is really the same as @Michael Delgado's answer. It's simpler to state the CRS at GeoDataFrame construction time. Also make sure you are using the correct CRS.
MWE
import geopandas as gpd
import geopandas as geo
import pandas as pd
import contextily as cx
# construct a dataframe with X and Y of some points in US
places = gpd.read_file(
gpd.datasets.get_path("naturalearth_cities"),
mask=gpd.read_file(gpd.datasets.get_path("naturalearth_lowres")).loc[
lambda d: d["iso_a3"].eq("USA")
],
)
df = pd.DataFrame({"X": places.geometry.x, "Y": places.geometry.y})
# user code, state CRS at construction time
gdf = geo.GeoDataFrame(
df, geometry=gpd.points_from_xy(df["X"], df["Y"]), crs="epsg:4326"
)
ax = gdf.plot(color="red", figsize=(9, 9))
cx.add_basemap(ax, zoom=0, crs=gdf.crs)

How to convert a xarray DataArray to a geopandas GeoDataFrame

In xarray there is a method called to_dataframe(), see:
http://xarray.pydata.org/en/stable/pandas.html
With this method a DataArray can be converted to a pandas DataFrame.
How can I convert an xarray DataArray to a geopandas GeoDataFrame, so like the above but with polygons of the gridcells included?
This code will create a GDF with a geometry column containing a series of POINT objects corresponding to the lat/lon coordinates in my xarray DataArray.
import geopandas as gpd
import xarray as xr
xds = xr.open_dataset('yourfile.nc')
xarr = xds['value_column']  # select the data variable as a DataArray
df = xarr.to_dataframe().reset_index()  # one row per (lat, lon) grid point
gdf = gpd.GeoDataFrame(
df[['value_column']], geometry=gpd.points_from_xy(df.lon, df.lat))
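The question asks for gridcell polygons rather than points. One possible sketch, assuming a regular grid whose lon/lat values are cell centres with known spacing, builds the corner ring of each cell in plain Python. cell_ring, the two centres, and the 0.5 degree spacing are hypothetical and not taken from the original post.

```python
def cell_ring(lon, lat, dlon, dlat):
    """Return the closed corner ring of a grid cell centred on (lon, lat)."""
    west, east = lon - dlon / 2, lon + dlon / 2
    south, north = lat - dlat / 2, lat + dlat / 2
    # Counter-clockwise ring that ends where it starts, as GeoJSON expects.
    return [
        (west, south),
        (east, south),
        (east, north),
        (west, north),
        (west, south),
    ]


# Hypothetical cell centres on a 0.5 degree grid.
centres = [(10.0, 50.0), (10.5, 50.0)]
rings = [cell_ring(lon, lat, 0.5, 0.5) for lon, lat in centres]
```

Each ring can be passed to shapely.geometry.Polygon, and the resulting list used as the geometry argument of gpd.GeoDataFrame in place of gpd.points_from_xy. For an irregular grid the cell edges would have to come from the coordinate bounds instead of a fixed spacing.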

Geoview and geopandas groupby projection error

I’m experiencing projection errors following a groupby on geodataframe. Below you will find the libraries that I am using:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
import holoviews as hv
from holoviews import opts
import panel as pn
from bokeh.resources import INLINE
import geopandas as gpd
import geoviews as gv
from cartopy import crs
hv.extension('bokeh', 'matplotlib')
gv.extension('bokeh')
pd.options.plotting.backend = 'holoviews'
Whilst these are the versions of some key libraries:
bokeh 2.1.1
geopandas 0.6.1
geoviews 1.8.1
holoviews 1.13.3
I have concatenated 3 shapefiles to build a polygon picture of UK healthcare boundaries (links to files provided if needed). Unfortunately, from what I have found, the UK doesn't produce one file that combines all of those, so I have had to merge the shapefiles from the 3 individual countries I'm interested in. The 3 shape files have a size of:
shape file 1 = (https://www.opendatani.gov.uk/dataset/department-of-health-trust-boundaries)
shape file 2 = (https://geoportal.statistics.gov.uk/datasets/5252644ec26e4bffadf9d3661eef4826_4)
shape file 3 = (https://data.gov.uk/dataset/31ab16a2-22da-40d5-b5f0-625bafd76389/local-health-boards-december-2016-ultra-generalised-clipped-boundaries-in-wales)
My code to concat them together is below:
England_CCG.drop(['objectid', 'bng_e', 'bng_n', 'long', 'lat', 'st_areasha', 'st_lengths'], inplace = True, axis = 1 )
Wales_HB.drop(['objectid', 'bng_e', 'bng_n', 'long', 'lat', 'st_areasha', 'st_lengths', 'lhb16nmw'], inplace = True, axis = 1 )
Scotland_HB.drop(['Shape_Leng', 'Shape_Area'], inplace = True, axis = 1)
#NI_HB.drop(['Shape_Leng', 'Shape_Area'], inplace = True, axis = 1 )
England_CCG.rename(columns={'ccg20cd': 'CCG_Code', 'ccg20nm': 'CCG_Name'}, inplace = True )
Wales_HB.rename(columns={'lhb16cd': 'CCG_Code', 'lhb16nm': 'CCG_Name'}, inplace = True )
Scotland_HB.rename(columns={'HBCode': 'CCG_Code', 'HBName': 'CCG_Name'}, inplace = True )
#NI_HB.rename(columns={'TrustCode': 'CCG_Code', 'TrustName': 'CCG_Name'}, inplace = True )
UK_shape = [England_CCG, Wales_HB, Scotland_HB]
Merged_Shapes = gpd.GeoDataFrame(pd.concat(UK_shape))
Each of the files has the same ESRI projection once joined, and the shapes plot perfectly as one when I run:
Test= gv.Polygons(Merged_Shapes, vdims=[('CCG_Name')], crs=crs.OSGB())
This gives me a polygon plot of the UK, with all the area boundaries for each ccg.
To my geodataframe, I then add a new column called 'Country', which attributes each CCG to the country it belongs to. So all the Welsh CCGs are attributed to Wales, all the English ones to England and all the Scottish ones to Scotland. Just a simple additional grouping of the data really.
What I want to achieve is a dropdown next to the polygon map that shows all the CCGs in a particular country when that country is selected from the dropdown widget. I understand that the way to do this is with a groupby. However, when I use the following code to achieve this:
c1 = gv.Polygons(Merged_Shapes, vdims=[('CCG_Name','Country')], crs=crs.OSGB()).groupby(['Country'])
I get a long list of projection errors stating:
“WARNING:param.project_path: While projecting a Polygons element from a PlateCarree coordinate reference system (crs) to a Mercator projection none of the projected paths were contained within the bounds specified by the projection. Ensure you have specified the correct coordinate system for your data.”
As a result I am left without a map, although the widget remains. Does anyone know what is going wrong here and what a possible solution would be? It's been driving me crazy!
Kind regards,
For some reason geoviews doesn't like the OSGB projection followed by a groupby, as it tries to default back to the PlateCarree projection.
The way I fixed it was to reproject the entire dataset to epsg:4326. For anyone who also runs into this problem, the code is below (it is a well-documented solution):
Merged_Shapes.to_crs({'init': 'epsg:4326'},inplace=True)
gv.Polygons(Merged_Shapes, vdims=[('CCG_Name'),('Country')]).groupby('Country')
The groupby works fine after this.

How to annotate each dot of a seaborn scatter plot with text in Python?

I am making a scatter plot in seaborn and I want to add some text to each point of the scatter plot according to my data (the "Countries" column in the hap_educ and hap_rel tables). I think I need a loop to do this but cannot figure out how to do it for seaborn. Here is the code I use:
https://ibb.co/hZ9NBV0
https://ibb.co/ZYLdgkt
import pandas as pd
import os
import seaborn as sns
import matplotlib.pyplot as plt
# Set up working directory
os.chdir(r'D:/PROJECT CSS/')
#importing data from xlsx files
educ = pd.read_excel(r'D:\PROJECT CSS\educ.xlsx')
happiness= pd.read_excel(r'D:\PROJECT CSS\happiness edited.xlsx')
religious=pd.read_excel(r'D:\PROJECT CSS\religious edited.xlsx')
#Merging data into 2 tables
hap_rel = pd.merge(religious, happiness, on ='Country')
hap_educ= pd.merge(educ, happiness, on ='Country')
p1=sns.regplot(x =hap_educ['Score'], y =hap_educ['Pupil teacher ratio'], data=hap_educ, label='Countries')
plt.xlabel("Index of happiness")
plt.ylabel("Pupil / teacher ratio")
p2=sns.regplot(x=hap_rel['Score'], y=hap_rel['Yes'], data=hap_rel)
plt.xlabel("Index of happiness")
plt.ylabel("Percent of religious people(1=100%)")
I expect to see each point annotated with the country name from my table.
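One common way to label points is to loop over the rows and call the Axes method annotate for each one. The sketch below uses plain matplotlib with made-up values instead of the Excel files; the three country names and numbers are hypothetical. Because sns.regplot returns the matplotlib Axes it drew on, the same loop should work on p1 or p2 with the rows of hap_educ or hap_rel.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up stand-ins for the 'Country', 'Score' and 'Pupil teacher ratio' columns.
countries = ["A", "B", "C"]
score = [7.1, 5.4, 6.3]
ratio = [12.0, 25.0, 18.0]

fig, ax = plt.subplots()
ax.scatter(score, ratio)

# One annotation per point, nudged 4 points up and right so the label
# does not sit on top of the marker.
for name, x, y in zip(countries, score, ratio):
    ax.annotate(name, (x, y), xytext=(4, 4), textcoords="offset points")

ax.set_xlabel("Index of happiness")
ax.set_ylabel("Pupil / teacher ratio")
```

With a DataFrame, the zip would run over the three columns, for example zip(hap_educ['Country'], hap_educ['Score'], hap_educ['Pupil teacher ratio']).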

How to select irregular shapes in a image

Using Python code we are able to create image segments as shown in the screenshot. Our requirement is to select a specific segment in the image and apply a different color to it. How can we do this?
The following is our Python snippet:
from skimage.segmentation import felzenszwalb, slic,quickshift
from skimage.segmentation import mark_boundaries
from skimage.util import img_as_float
import matplotlib.pyplot as plt
from skimage import measure
from skimage import restoration
from skimage import io
image = img_as_float(io.imread("leaf.jpg"))
segments = quickshift(image, ratio=1.0, kernel_size=20, max_dist=10,return_tree=False, sigma=0, convert2lab=True, random_seed=42)
fig = plt.figure("Superpixels -- %d segments" % (500))
ax = fig.add_subplot(1, 1, 1)
ax.imshow(mark_boundaries(image, segments))
plt.axis("off")
plt.show()
Do this:
import numpy as np
seg_num = 64 # desired segment to be colored
color = np.array([1.0, 0.0, 0.0]) # red, as RGB floats in [0, 1]
image[segments == seg_num] = color # assign color to the segment
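The mask-and-assign idea can be checked on a tiny synthetic example. The 4x4 image and label array below are made up so that the result is easy to inspect; in the real case, segments would be the label array returned by quickshift and image the float RGB image.

```python
import numpy as np

# Synthetic 4x4 black RGB image and a label array with four 2x2 segments.
image = np.zeros((4, 4, 3), dtype=np.float64)
segments = np.array(
    [
        [0, 0, 1, 1],
        [0, 0, 1, 1],
        [2, 2, 3, 3],
        [2, 2, 3, 3],
    ]
)

seg_num = 1                   # label of the segment to recolor
mask = segments == seg_num    # boolean mask, True where the label matches

recolored = image.copy()      # keep the original image unchanged
recolored[mask] = [1.0, 0.0, 0.0]  # broadcast red over every masked pixel
```

The boolean mask has the same height and width as the image, so indexing with it selects whole pixels and the three-element color is broadcast across them.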
You can use OpenCV python module - example:
