Geopandas: Converting single polygons to multipolygon, keeping individual polygonal topology?

I assume this is possible. Example: I have polygons in a GeoDataFrame. Some polygons share the same attribute data; they are separate polygons with identical attributes, and each has its own row in the gdf.
I would like to combine the polygons into a multipolygon so they take up only 1 row in the gdf.
The two polygons overlap. I do not want to dissolve them together; I want them to remain 2 separate entities.
There are also single polygons. I assume they will have to be converted to multipolygons as well, even though each contains only one polygon, because the data will ultimately be exported for use in GIS software that expects one geometry type per dataset.
I have managed a .dissolve(by='ID'), but as stated above, I do not want to change the polygons' geometry.
Suggestions?

You can adapt geopandas' dissolve to generate MultiPolygon instead of unary union. The original code I adapted is here.
import geopandas as gpd
from shapely.geometry import Polygon, MultiPolygon
def groupby_multipoly(df, by, aggfunc="first"):
    data = df.drop(labels=df.geometry.name, axis=1)
    aggregated_data = data.groupby(by=by).agg(aggfunc)
    # Process spatial component
    def merge_geometries(block):
        return MultiPolygon(block.values)
    g = df.groupby(by=by, group_keys=False)[df.geometry.name].agg(
        merge_geometries
    )
    # Aggregate
    aggregated_geometry = gpd.GeoDataFrame(g, geometry=df.geometry.name, crs=df.crs)
    # Recombine
    aggregated = aggregated_geometry.join(aggregated_data)
    return aggregated
df = gpd.GeoDataFrame(
    {"a": [0, 0, 1], "b": [1, 2, 3]},
    geometry=[
        Polygon([(0, 0), (1, 0), (1, 1)]),
        Polygon([(1, 0), (1, 0), (1, 1)]),
        Polygon([(0, 2), (1, 0), (1, 1)]),
    ],
)
grouped = groupby_multipoly(df, by='a')
grouped
                                            geometry  b
a
0  MULTIPOLYGON (((0.00000 0.00000, 1.00000 0.000...  1
1  MULTIPOLYGON (((0.00000 2.00000, 1.00000 0.000...  3
If you change MultiPolygon within merge_geometries to GeometryCollection, you should be able to combine any type of geometry into a single row. Some file formats might not support that, though.
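To illustrate the difference outside of geopandas, here is a minimal shapely-only sketch. The two squares and the variable names are made up for illustration. It shows that MultiPolygon keeps overlapping parts as separate members, whereas unary_union (which dissolve relies on) merges them, and that GeometryCollection accepts mixed geometry types.

```python
from shapely.geometry import GeometryCollection, LineString, MultiPolygon, Polygon
from shapely.ops import unary_union

# Two illustrative squares that overlap in a 1 x 1 area
a = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])
b = Polygon([(1, 1), (3, 1), (3, 3), (1, 3)])

# Grouping: both polygons survive as separate members
multi = MultiPolygon([a, b])

# Dissolving: the overlap is merged into one outline
merged = unary_union([a, b])

# A GeometryCollection can hold different geometry types side by side
mixed = GeometryCollection([a, LineString([(0, 0), (1, 1)])])

print(len(multi.geoms))   # number of member polygons kept
print(merged.geom_type)   # type of the dissolved result
print(multi.is_valid)     # overlapping members make the MultiPolygon invalid
```

Note that a MultiPolygon with overlapping members is not valid under the simple features rules, so some GIS tools may complain about it or try to repair it on import.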

Related

shapely nearest_points between two polygons returns expected result for simple polygons, unexpected result for complex polygons

I am using nearest_points from shapely to retrieve the nearest points between two polygons.
I get the expected result for two simple polygons:
However for more complex polygons, the points are not the expected nearest points between the polygons.
Note: I added ax.set_aspect('equal') so that the line connecting the nearest points would have to be at a right angle (right?)
What is wrong with my code or my polygons (or me)?
from shapely.geometry import Point, Polygon, LineString, MultiPolygon
from shapely.ops import nearest_points
import matplotlib.pyplot as plt
import shapely.wkt as wkt
#Set of 2 polygons A where the nearest points don't seem right
poly1=wkt.loads('POLYGON((-0.136319755454978 51.460464712623626, -0.1363352511419218 51.46042713866513, -0.1363348393705439 51.460425, -0.1365967582352347 51.460425, -0.1363077932125138 51.4605825392028, -0.136237298707157 51.46052697162038, -0.136319755454978 51.460464712623626))')
poly2=wkt.loads('POLYGON ((-0.1371553266140889 51.46046700960882, -0.1371516327997412 51.46046599134276, -0.1371478585043985 51.46046533117243, -0.1371440383598866 51.46046503515535, -0.1371402074187299 51.460465106007696, -0.1371364008325196 51.460465543079344, -0.137132653529373 51.46046634235985, -0.1371289998934435 51.46046749651525, -0.1371254734494216 51.46046899495536, -0.1371221065549237 51.46047082393093, -0.1366012836405492 51.460786954965236, -0.1365402944168757 51.46074798846902, -0.1370125055334012 51.46045400071198, -0.1371553266140889 51.46046700960882))')
#Set of 2 polygons B where the nearest points seem right
#poly1 = Polygon([(0, 0), (2, 8), (14, 10), (6, 1)])
#poly2 = Polygon([(10, 0),(13,5),(14,2)])
p1, p2 = nearest_points(poly1, poly2)
fig,ax= plt.subplots()
ax.set_aspect('equal')
x1,y1=poly1.exterior.xy
x2,y2=poly2.exterior.xy
#Plot polygons
plt.plot(x1,y1)
plt.plot(x2,y2)
#Plot LineString connecting the nearest points
plt.plot([p1.x, p2.x],[p1.y,p2.y], color='green')
fig.show()
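One way to sanity-check the result independently of the plot, sketched here with the simple polygons from set B: the distance between the two returned points should equal the distance between the polygons themselves. Keep in mind that shapely measures in planar coordinate units, so for the lat/long polygons in set A the distances are in degrees.

```python
from shapely.geometry import Polygon
from shapely.ops import nearest_points

# Simple polygons from set B in the question
poly1 = Polygon([(0, 0), (2, 8), (14, 10), (6, 1)])
poly2 = Polygon([(10, 0), (13, 5), (14, 2)])

p1, p2 = nearest_points(poly1, poly2)

# The gap between the returned points should match the polygon-to-polygon distance
gap = p1.distance(p2)
expected = poly1.distance(poly2)
print(gap, expected)
```

If the two numbers agree for the complex polygons as well, the points returned are the nearest ones in the coordinate system shapely sees.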

Geopandas: How to relate the length of a linestring to the linestring point used to find distance to polygon

I’m trying to find the length along a linestring from its starting point to the point that is used to find the nearest distance to a polygon.
So I used the following code to get the minimum distance between the linestring and some polygons.
gdf['MinDistToTrack'] = gdf.geometry.apply(lambda l: min(rail_or.distance(l)))
and I would also like to get the distance from the start of the linestring to the point used by the above code.
I would like a dataframe containing the polygons with the value 'MinDistToTrack' (which I already have) and also a value ‘Length_Of_Linestring_Up_To_Location_Of_Polygon’.
So, let’s say that from the start of the linestring to the polygon there are 22 meters following the path of the linestring, then this is the value I would like to save together with the 'MinDistToTrack'
Polygon ID : 1
'MinDistToTrack' : 1m
'LengthOfLinestringUpToLocationOfPolygon' : 22m
Is this possible or do I need to split the linestring up into small elements and then look at all elements and the length of all the preceding elements in relation to the linestring elements which is nearest to the polygon?
Picture showing the problem
You may use the following concepts from shapely:
The nearest_points() function in shapely.ops calculates the nearest points in a pair of geometries.
shapely.ops.nearest_points(geom1, geom2)
Returns a tuple of the nearest points in the input geometries. The points are returned in the same order as the input geometries.
https://shapely.readthedocs.io/en/stable/manual.html#shapely.ops.nearest_points
from shapely.geometry import LineString, Polygon
from shapely.ops import nearest_points
P = Polygon([(0, 0), (1, 0), (0.5, 1), (0, 0)])
Lin = LineString([(0, 2), (1, 2), (1, 3), (0, 3)])
# Keep the results as Point geometries; project() needs a geometry, not a WKT string
np_poly, np_lin = nearest_points(P, Lin)
# [o.wkt for o in (np_poly, np_lin)] == ['POINT (0.5 1)', 'POINT (0.5 2)']
You can then take the point np_lin and project it onto Lin to get the distance along the line:
d = Lin.project(np_lin)
d is the distance along Lin from its start to np_lin, i.e. to the point on Lin nearest to P.
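Putting both values together for the case in the question, a minimal sketch might look like this. The helper name distance_and_length_along and the geometries are hypothetical, and the coordinates are assumed to be in a projected CRS with metres as units.

```python
from shapely.geometry import LineString, Polygon
from shapely.ops import nearest_points


def distance_and_length_along(line, polygon):
    """Return (minimum distance to polygon, length along line to the nearest point)."""
    point_on_line, _ = nearest_points(line, polygon)
    return line.distance(polygon), line.project(point_on_line)


# Hypothetical track and building, chosen so the nearest point is unambiguous
track = LineString([(0, 0), (30, 0)])
building = Polygon([(22, 1), (21, 3), (23, 3)])

min_dist, length_along = distance_and_length_along(track, building)
print(min_dist, length_along)
```

With a GeoDataFrame of polygons, the same helper could be applied per row, for example via gdf.geometry.apply, storing the two returned values in separate columns.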

How to calculate distance using geopandas

I wanted to calculate the distance from Manila to cities in the Philippines using geopandas GeoSeries.distance(self, other) function.
Steps:
# So I start with the dataset, which should produce a geopandas dataframe consisting basically of cities and a polygon of its boundaries in latlong.
url = 'https://raw.githubusercontent.com/macoymejia/geojsonph/master/MuniCities/MuniCities.minimal.json'
df1 = gpd.read_file(url)
# then I define a centroid column
df1['Centroid'] = df1.geometry.centroid
# then I define Manila location as a shapely point geometry, which produces a DataFrame with point geometry and address as columns
manila_loc = gpd.tools.geocode('Manila')
# then I try to calculate the distance
df1.Centroid.distance(manila_loc.geometry)
But I'm getting this error:
AttributeError Traceback (most recent call last)
<ipython-input-30-76585915942f> in <module>
----> 1 df1.Centroid.distance(manila_loc.geometry)
~/opt/anaconda3/envs/Coursera/lib/python3.8/site-packages/pandas/core/generic.py in __getattr__(self, name)
5137 if self._info_axis._can_hold_identifiers_and_holds_name(name):
5138 return self[name]
-> 5139 return object.__getattribute__(self, name)
5140
5141 def __setattr__(self, name: str, value) -> None:
AttributeError: 'Series' object has no attribute 'distance'
I'm new to GeoPandas but I thought from the documentation that the distance method can act on a GeoSeries and that df1.Centroid and manila_loc.geometry are valid shapely geometry objects. So I don't know what I am missing. Help pls.
Try this
# relevant code only
dists = []
# items() replaces iteritems(), which was removed in pandas 2.0
for i, centr in df1.Centroid.items():
    dist = centr.distance(manila_loc.geometry[0])
    dists.append(dist)
    print("Dist2Manila: ", dist)
To create new column for the distances:
df1["Dist2Manila"] = dists
You need to feed a single Point object to the distance method:
from shapely.geometry import Point
from geopandas import GeoDataFrame
destination = Point(5, 5)
geoms = [Point(x, y) for x, y in [(0, 0), (3, 3), (4, 1), (8, 2), (1, 10)]]
departures = GeoDataFrame({'city': list('ABCDE'), 'geometry': geoms})
print(departures.assign(dist_to_dest=departures.distance(destination)))
Which gives me:
  city                  geometry  dist_to_dest
0    A   POINT (0.00000 0.00000)      7.071068
1    B   POINT (3.00000 3.00000)      2.828427
2    C   POINT (4.00000 1.00000)      4.123106
3    D   POINT (8.00000 2.00000)      4.242641
4    E  POINT (1.00000 10.00000)      6.403124
I was able to solve it. I mistakenly assumed that since it is a geopandas data frame, the centroid column will already be treated as such. But I think you have to explicitly define it. So it's what I did. A minor revision on the original code from:
df1.Centroid.distance(manila_loc.geometry)
to:
gpd.GeoSeries(df1.Centroid).distance(manila_loc.iloc[0,0])
It worked.
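One caveat worth checking: if the geometries are in latitude/longitude (EPSG:4326), distance returns values in degrees rather than metres. The sketch below reprojects first. It assumes UTM zone 51N (EPSG:32651) is an acceptable projection for the area and uses approximate, illustrative coordinates rather than the dataset from the question.

```python
import geopandas as gpd
from shapely.geometry import Point

# Approximate lon/lat positions, for illustration only
cities = gpd.GeoSeries(
    [Point(120.98, 14.60), Point(123.89, 10.32)],
    index=["Manila", "Cebu City"],
    crs="EPSG:4326",
)

# Reproject to a metric CRS (assumption: UTM zone 51N suits this area)
projected = cities.to_crs("EPSG:32651")

# distance() accepts a single shapely geometry and broadcasts over the series
manila = projected.loc["Manila"]
dist_km = projected.distance(manila) / 1000.0
print(dist_km.round(1))
```

For distances spanning more than one UTM zone, a geodesic calculation would likely be more accurate than a single projected CRS.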

swap coordinates in numpy 3d array

I have a numpy array with shape (3,10,10).
How can I change the array such that the first coordinate (column) will be the last, and the last will be the first (e.g. the shape of the array will be (10, 10, 3))?
I tried:
arr.flatten().reshape((10,10,3))
Is there a more elegant/efficient way?
You should use the transpose function with the axes parameter:
>>> import numpy as np
>>> x = np.ones((3,10,10))
>>> tx = np.transpose(x, (2, 1, 0))
>>> tx.shape
(10, 10, 3)
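It may help to see why the flatten().reshape(...) attempt is not equivalent. It produces the right shape but only reinterprets the memory layout and does not move any axis. The sketch below compares it with np.transpose and with np.moveaxis, which moves the first axis to the end while keeping the order of the other two. The array contents are illustrative.

```python
import numpy as np

# Distinct values so that element positions can be compared
x = np.arange(3 * 10 * 10).reshape(3, 10, 10)

swapped = np.transpose(x, (2, 1, 0))        # swaps the first and last axes
moved = np.moveaxis(x, 0, -1)               # moves axis 0 to the end
reshaped = x.flatten().reshape(10, 10, 3)   # same shape, different element layout

print(swapped.shape, moved.shape, reshaped.shape)
print(swapped[4, 5, 2] == x[2, 5, 4])       # element follows the swapped indices
print(moved[5, 4, 2] == x[2, 5, 4])         # element follows the moved axis
print(np.array_equal(reshaped, swapped))    # reshape did not reorder the axes
```

For image data stored as (channels, height, width), np.moveaxis(x, 0, -1) is often what is wanted, since it leaves height and width in their original order.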

Rearranging RGB values to GRB, GBR, BRG, BGR, and RBG through an entire directory

I have a directory of images for a CNN. I would like to be able to rearrange each band in a different order to help better train my model to allow it to recognize my objects. I so far have some code working with cv2. It is separating the bands, but I am having trouble rearranging the bands.
import cv2
import numpy
img = cv2.imread("IMG_4540.jpg")
b, g, r = cv2.split(img)  # OpenCV loads images in BGR order
cv2.imwrite('green_channel.jpg', g)
I would like to have 6 separate images all with different band combinations from one singular image if possible.
You can just form all reorderings with numpy's indexing capabilities.
import cv2
from itertools import permutations
img = cv2.imread("IMG_4540.jpg")  # channels are in BGR order
# first generate all the rearrangements you'd like to make
orderings = [p for p in permutations(range(3)) if p != (0, 1, 2)]
# [(0, 2, 1), (1, 0, 2), (1, 2, 0), (2, 0, 1), (2, 1, 0)]
# relative to OpenCV's BGR order: BRG, GBR, and so on
# then reorder along the last axis: (0, 1, 2) --> (0, 2, 1) and so on
for order in orderings:
    reordered = img[..., list(order)]
    # then save each under its own filename, e.g. reordered_021.jpg
    suffix = "".join(str(i) for i in order)
    cv2.imwrite(f"reordered_{suffix}.jpg", reordered)
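To cover a whole directory, as the question title asks, the same indexing can be wrapped in a loop over files. This is a sketch under assumptions: the function names channel_permutations and reorder_directory, the "*.jpg" pattern, and the output naming scheme are all hypothetical. OpenCV is imported inside the directory function so the permutation helper can be tried with numpy alone.

```python
from itertools import permutations
from pathlib import Path

import numpy as np


def channel_permutations(img):
    """Return {order: reordered image} for the five non-identity channel orders."""
    return {
        order: img[..., list(order)]
        for order in permutations(range(3))
        if order != (0, 1, 2)
    }


def reorder_directory(src_dir, dst_dir):
    """Write every channel permutation of each .jpg in src_dir into dst_dir."""
    import cv2  # imported here so the helper above works without OpenCV

    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for path in sorted(Path(src_dir).glob("*.jpg")):
        img = cv2.imread(str(path))
        if img is None:  # skip files OpenCV cannot decode
            continue
        for order, reordered in channel_permutations(img).items():
            suffix = "".join(str(i) for i in order)
            cv2.imwrite(str(dst / f"{path.stem}_{suffix}.jpg"), reordered)


# Small in-memory check with a 2 x 2 image whose channels hold 10, 20 and 30
demo = np.zeros((2, 2, 3), dtype=np.uint8)
demo[..., 0], demo[..., 1], demo[..., 2] = 10, 20, 30
variants = channel_permutations(demo)
print(len(variants))
print(variants[(2, 1, 0)][0, 0])
```

Together with the untouched original, the five permutations give the six band combinations mentioned in the question.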
