Nearest polygon to a point in GeoPandas

I have a point and I want to figure out which polygon is nearest to it.
I have geodata for both the points and the polygons.

You can use GeoDataFrame.distance to compute the distance from the point to each polygon, then sort the results to retrieve the nearest one.
Example:
>>> from shapely.geometry import Point, Polygon
>>> import geopandas as gpd
>>> d = {'geometry': [Polygon([(0, 0), (1, 1), (1, 0)]), Polygon([(3, 3), (4, 3), (4, 4)])]}
>>> gdf = gpd.GeoDataFrame(d)
>>> red_point = Point(1,2)
>>> polygon_index = gdf.distance(red_point).sort_values().index[0]
>>> gdf.loc[polygon_index]
geometry POLYGON ((0.00000 0.00000, 1.00000 1.00000, 1....
Name: 0, dtype: geometry
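Note: remember to set the CRS of the GeoDataFrame. distance returns values in the units of that CRS, so use a projected CRS if you need the result in metres.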
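As a variation on the answer above, here is a minimal sketch that sets a CRS and uses idxmin instead of sorting. The CRS (EPSG:3857) and the "name" column are assumptions made for illustration; they are not part of the original question.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Same two polygons as above; EPSG:3857 is an assumed projected CRS (units: metres).
gdf = gpd.GeoDataFrame(
    {"name": ["a", "b"]},  # hypothetical attribute column
    geometry=[
        Polygon([(0, 0), (1, 1), (1, 0)]),
        Polygon([(3, 3), (4, 3), (4, 4)]),
    ],
    crs="EPSG:3857",
)
red_point = Point(1, 2)

# Distance from the point to every polygon, expressed in CRS units.
distances = gdf.distance(red_point)

# idxmin returns the index label of the smallest distance without a full sort.
nearest_index = distances.idxmin()
nearest_row = gdf.loc[nearest_index]
print(nearest_index, distances[nearest_index])
```

For many points against many polygons, looping over this pattern gets slow; a spatial join is usually a better fit in that case.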

Related

shapely nearest_points between two polygons returns expected result for simple polygons, unexpected result for complex polygons

I am using nearest_points from shapely to retrieve the nearest points between two polygons.
I get the expected result for two simple polygons:
However for more complex polygons, the points are not the expected nearest points between the polygons.
Note: I added ax.set_aspect('equal') so that the line between the nearest points would have to be at a right angle (right?)
What is wrong with my code or my polygons (or me)?
from shapely.geometry import Point, Polygon, LineString, MultiPolygon
from shapely.ops import nearest_points
import matplotlib.pyplot as plt
import shapely.wkt as wkt
#Set of 2 Polygons A where the nearest points don't seem right
poly1=wkt.loads('POLYGON((-0.136319755454978 51.460464712623626, -0.1363352511419218 51.46042713866513, -0.1363348393705439 51.460425, -0.1365967582352347 51.460425, -0.1363077932125138 51.4605825392028, -0.136237298707157 51.46052697162038, -0.136319755454978 51.460464712623626))')
poly2=wkt.loads('POLYGON ((-0.1371553266140889 51.46046700960882, -0.1371516327997412 51.46046599134276, -0.1371478585043985 51.46046533117243, -0.1371440383598866 51.46046503515535, -0.1371402074187299 51.460465106007696, -0.1371364008325196 51.460465543079344, -0.137132653529373 51.46046634235985, -0.1371289998934435 51.46046749651525, -0.1371254734494216 51.46046899495536, -0.1371221065549237 51.46047082393093, -0.1366012836405492 51.460786954965236, -0.1365402944168757 51.46074798846902, -0.1370125055334012 51.46045400071198, -0.1371553266140889 51.46046700960882))')
#Set of 2 polygons B where the nearest points seem right
#poly1 = Polygon([(0, 0), (2, 8), (14, 10), (6, 1)])
#poly2 = Polygon([(10, 0),(13,5),(14,2)])
p1, p2 = nearest_points(poly1, poly2)
fig,ax= plt.subplots()
ax.set_aspect('equal')
x1,y1=poly1.exterior.xy
x2,y2=poly2.exterior.xy
#Plot Polygons
plt.plot(x1,y1)
plt.plot(x2,y2)
#Plot LineString connecting the nearest points
plt.plot([p1.x, p2.x],[p1.y,p2.y], color='green')
fig.show()
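One way to check the result numerically, independent of how the plot is scaled, is to compare the distance between the two returned points with the distance between the polygons themselves. The sketch below uses the simple polygons (set B) from the question; it does not diagnose the complex case, it only shows the check.

```python
from shapely.geometry import Polygon
from shapely.ops import nearest_points

# Set B from the question: the simple polygons.
poly1 = Polygon([(0, 0), (2, 8), (14, 10), (6, 1)])
poly2 = Polygon([(10, 0), (13, 5), (14, 2)])

p1, p2 = nearest_points(poly1, poly2)

# If p1 and p2 really are the nearest points, the gap between them
# equals the minimum distance between the two polygons.
gap = p1.distance(p2)
minimum = poly1.distance(poly2)
print(gap, minimum)
```

If the two numbers agree for the complex polygons as well, the points are correct in coordinate space and any surprise is more likely to come from the plot than from shapely.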

Discrete logarithmic colorbar in matplotlib

I want to create a pcolormesh plot with a discrete logarithmic colorbar. Some resolution is lost, but the matching between colors and values seems to be easier (at least for me) if the colormap is discrete.
The code snippet below produces a continuous log colormap with the preferred value range. How can I make it discrete? Here I found how to create a discrete linear colormap, but I couldn't extend it to log scale.
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
# X, Y and Z are the grid coordinates and the data (not shown here)
plt.pcolormesh(X,Y,Z,norm=mcolors.LogNorm(vmin=0.01, vmax=100.))
plt.colorbar()
fig = plt.gcf()
fig.set_size_inches(4*2.5, 3*2.5)
plt.xlabel("X", horizontalalignment='right', x=1.0)
plt.ylabel("Y", horizontalalignment='right', y=1.0)
plt.tight_layout()
I've managed to create a logarithmic colorbar with even spacing. However, I couldn't figure out how to create a discrete logarithmic colorbar with a logarithmic spacing of the colorbar. I hope this helps!
import matplotlib.pyplot as plt
import matplotlib as mpl
import numpy as np
X = np.arange(0, 50)
Y = np.arange(0, 50)
Z = np.random.rand(50, 50)*10
bounds = [0, 0.1, 0.2, 0.3, 0.4, 0.5, .6, .7, .8, .9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
num_c = len(bounds)
cmap = mpl.colormaps['viridis'].resampled(num_c)
norm = mpl.colors.BoundaryNorm(bounds, cmap.N)
fig, ax = plt.subplots()
fig.set_size_inches(4*2.5, 3*2.5)
ax.pcolormesh(X, Y, Z, norm=norm, cmap=cmap)
plt.xlabel("X", horizontalalignment='right', x=1.0)
plt.ylabel("Y", horizontalalignment='right', y=1.0)
fig.colorbar(mpl.cm.ScalarMappable(cmap=cmap, norm=norm))
plt.tight_layout()
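A possible way to get logarithmically spaced bins is to generate the boundaries with np.logspace and pass them to BoundaryNorm. This is only a sketch under assumptions: the value range 0.01 to 100 is taken from the question, while the choice of two bins per decade and the random sample data are mine.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
import numpy as np

rng = np.random.default_rng(0)
Z = 10 ** rng.uniform(-2, 2, size=(50, 50))  # sample data spanning 0.01..100

# Two bins per decade between 0.01 and 100: 9 edges, 8 bins (assumed choice).
bounds = np.logspace(-2, 2, 9)
cmap = plt.get_cmap("viridis", len(bounds) - 1)  # one colour per bin
norm = mcolors.BoundaryNorm(bounds, cmap.N)

fig, ax = plt.subplots(figsize=(4 * 2.5, 3 * 2.5))
mesh = ax.pcolormesh(Z, norm=norm, cmap=cmap)
# Each bin gets an equal-length colour block, labelled with its log-spaced edges.
cbar = fig.colorbar(mesh, ax=ax, ticks=bounds, format="%.3g")
fig.tight_layout()
# Call plt.show() or fig.savefig(...) to view the result.
```

Because the bin edges are evenly spaced in log space and the colorbar draws every bin at the same length, the colorbar reads like a discrete logarithmic axis.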

How to calculate distance using geopandas

I wanted to calculate the distance from Manila to cities in the Philippines using geopandas GeoSeries.distance(self, other) function.
Steps:
# So I start with the dataset, which should produce a geopandas dataframe consisting basically of cities and a polygon of its boundaries in latlong.
url = 'https://raw.githubusercontent.com/macoymejia/geojsonph/master/MuniCities/MuniCities.minimal.json'
df1 = gpd.read_file(url)
# then I define a centroid column
df1['Centroid'] = df1.geometry.centroid
# then I define Manila location as a shapely point geometry, which produces a DataFrame with point geometry and address as columns
manila_loc = gpd.tools.geocode('Manila')
# then I try to calculate the distance
df1.Centroid.distance(manila_loc.geometry)
But I'm getting this error:
AttributeError Traceback (most recent call last)
<ipython-input-30-76585915942f> in <module>
----> 1 df1.Centroid.distance(manila_loc.geometry)
~/opt/anaconda3/envs/Coursera/lib/python3.8/site-packages/pandas/core/generic.py in __getattr__(self, name)
5137 if self._info_axis._can_hold_identifiers_and_holds_name(name):
5138 return self[name]
-> 5139 return object.__getattribute__(self, name)
5140
5141 def __setattr__(self, name: str, value) -> None:
AttributeError: 'Series' object has no attribute 'distance'
I'm new to GeoPandas, but I understood from the documentation that the distance method can act on a GeoSeries, and that df1.Centroid and manila_loc.geometry are valid shapely geometry objects. So I don't know what I am missing. Help pls.
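Try this:
# relevant code only
dists = []
# items() replaces iteritems(), which was removed in pandas 2.0
for i, centr in df1.Centroid.items():
    # manila_loc is the geocoded GeoDataFrame from the question
    dist = centr.distance(manila_loc.geometry[0])
    dists.append(dist)
    print("Dist2Manila: ", dist)
To create a new column for the distances:
df1["Dist2Manila"] = dists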
You need to pass a single Point object to the distance method:
from shapely.geometry import Point
from geopandas import GeoDataFrame
destination = Point(5, 5)
geoms = [Point(x, y) for x, y in [(0, 0), (3, 3), (4, 1), (8, 2), (1, 10)]]
departures = GeoDataFrame({'city': list('ABCDE'), 'geometry': geoms})
print(departures.assign(dist_to_dest=departures.distance(destination)))
Which gives me:
city geometry dist_to_dest
0 A POINT (0.00000 0.00000) 7.071068
1 B POINT (3.00000 3.00000) 2.828427
2 C POINT (4.00000 1.00000) 4.123106
3 D POINT (8.00000 2.00000) 4.242641
4 E POINT (1.00000 10.00000) 6.403124
I was able to solve it. I mistakenly assumed that since it is a geopandas data frame, the centroid column will already be treated as such. But I think you have to explicitly define it. So it's what I did. A minor revision on the original code from:
df1.Centroid.distance(manila_loc.geometry)
to:
gpd.GeoSeries(df1.Centroid).distance(manila_loc.iloc[0,0])
It worked.
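One caveat worth illustrating: distance works in the units of the CRS, so on latitude/longitude data the result is in degrees. The sketch below reprojects to a metric CRS first. The CRS (EPSG:32651, UTM zone 51N) is my assumption for the Philippines, and the coordinates are approximate values hard-coded for illustration rather than taken from the dataset or the geocoder.

```python
import geopandas as gpd
from shapely.geometry import Point

# Approximate lon/lat coordinates, hard-coded for illustration only.
cities = gpd.GeoDataFrame(
    {"city": ["Cebu City", "Davao City"]},
    geometry=[Point(123.8854, 10.3157), Point(125.6128, 7.0731)],
    crs="EPSG:4326",
)
manila = gpd.GeoSeries([Point(120.9842, 14.5995)], crs="EPSG:4326")

# Reproject both to a metric CRS (assumed: UTM zone 51N) before measuring.
cities_utm = cities.to_crs(epsg=32651)
manila_utm = manila.to_crs(epsg=32651).iloc[0]  # a single shapely Point

# distance() now returns metres; convert to kilometres.
cities["dist_km"] = cities_utm.distance(manila_utm) / 1000
print(cities[["city", "dist_km"]])
```

A single UTM zone is a reasonable approximation at this scale, but for very long distances a geodesic calculation would be more accurate.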

Geopandas: Converting single polygons to multipolygon, keeping individual polygonal topology?

I assume this is possible. Example: I have polygons in a geodataframe, some polygons have the same attribute data, they are just separate individual polygons with the same data, each polygon has its own row in the gdf.
I would like to combine the polygons into a multipolygon so they take up only 1 row in the gdf.
The two polygons overlap, I do not want to dissolve them together, I want them to remain 2 separate entities.
There are single polygons, I assume they will also have to be converted to multipolygons even though they are in the singular as ultimately they will be exported for use in GIS software, one geom type per dataset.
I have tried .dissolve(by='ID') but, as stated above, I do not want to change the polygons' geometry.
Suggestions?
You can adapt geopandas' dissolve to generate MultiPolygon instead of unary union. The original code I adapted is here.
import geopandas as gpd
from shapely.geometry import Polygon, MultiPolygon
def groupby_multipoly(df, by, aggfunc="first"):
    data = df.drop(labels=df.geometry.name, axis=1)
    aggregated_data = data.groupby(by=by).agg(aggfunc)
    # Process spatial component
    def merge_geometries(block):
        return MultiPolygon(block.values)
    g = df.groupby(by=by, group_keys=False)[df.geometry.name].agg(
        merge_geometries
    )
    # Aggregate
    aggregated_geometry = gpd.GeoDataFrame(g, geometry=df.geometry.name, crs=df.crs)
    # Recombine
    aggregated = aggregated_geometry.join(aggregated_data)
    return aggregated
df = gpd.GeoDataFrame(
    {"a": [0, 0, 1], "b": [1, 2, 3]},
    geometry=[
        Polygon([(0, 0), (1, 0), (1, 1)]),
        Polygon([(1, 0), (1, 0), (1, 1)]),
        Polygon([(0, 2), (1, 0), (1, 1)]),
    ],
)
grouped = groupby_multipoly(df, by='a')
grouped
geometry b
a
0 MULTIPOLYGON (((0.00000 0.00000, 1.00000 0.000... 1
1 MULTIPOLYGON (((0.00000 2.00000, 1.00000 0.000... 3
If you change MultiPolygon within merge_geometries to GeometryCollection, you should be able to combine any type of geometry to a single row. But that might not be supported by certain file formats.
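To make the difference from dissolve concrete, here is a small shapely-only sketch with two overlapping squares. The squares are made-up sample data; the point is only that MultiPolygon keeps the members separate while a unary union merges them.

```python
from shapely.geometry import MultiPolygon, Polygon
from shapely.ops import unary_union

# Two overlapping 2x2 squares; they share the unit square from (1, 1) to (2, 2).
a = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])
b = Polygon([(1, 1), (3, 1), (3, 3), (1, 3)])

multi = MultiPolygon([a, b])  # keeps both squares as separate members
merged = unary_union([a, b])  # dissolves them into a single outline

print(len(multi.geoms), merged.geom_type, merged.area)
print(multi.is_valid)  # False: overlapping members make the MultiPolygon invalid
```

Since a MultiPolygon with overlapping members is not valid, some GIS software may warn about it or try to repair it on import, so it is worth checking how the target application treats such geometries.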

swap coordinates in numpy 3d array

I have a numpy array with shape (3,10,10).
How can I change the array such that the first coordinate (column) will be the last, and the last will be the first (e.g. the shape of the array will be (10, 10, 3))?
I tried:
arr.flatten().reshape((10,10,3))
Is there a more elegant/efficient way?
Use the transpose function with the axes parameter:
>>> import numpy as np
>>> x = np.ones((3,10,10))
>>> tx = np.transpose(x, (2, 1, 0))
>>>tx.shape
(10, 10, 3)
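For comparison, the sketch below puts transpose next to np.moveaxis and the flatten/reshape attempt from the question. All three give shape (10, 10, 3), but they arrange the elements differently; which one is wanted depends on whether the two length-10 axes should also trade places.

```python
import numpy as np

x = np.arange(3 * 10 * 10).reshape(3, 10, 10)

swapped = np.transpose(x, (2, 1, 0))       # reverses the axis order (swaps first and last)
moved = np.moveaxis(x, 0, -1)              # moves axis 0 to the end, keeps the others in order
reshaped = x.flatten().reshape(10, 10, 3)  # same shape, but elements are not regrouped by axis

print(swapped.shape, moved.shape, reshaped.shape)
print(swapped[4, 5, 1] == x[1, 5, 4])  # indices are reversed
print(moved[4, 5, 1] == x[1, 4, 5])    # only the first index moved to the end
```

Both transpose and moveaxis return views where possible, so they avoid the copy that flatten makes.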
