Split lines in GeoPandas - geopandas

I can find how to split a single line using geopandas/shapely:
def split_line_by_point(line, point, tolerance: float=1.0e-12):
    return split(snap(line, point, tolerance), point)
However, I can't figure out how to apply this over an entire geometry column while maintaining the other values.
Applying the above function to df.geometry loses a lot of information.
How do I split the LineStrings below so that the result 'explodes' while maintaining the 'type' and 'properties' columns?
{"Feature":'Hi',"ID":1,Linestring([1,1],[2,2],[3,3])},
{"Feature":'bye',"ID":2,Linestring([10,10],[20,20],[30,30])}
To
{"Feature":'Hi',"ID":1,Linestring([1,1],[2,2])},
{"Feature":'Hi',"ID":1,Linestring([2,2],[3,3])},
{"Feature":'bye',"ID":2,Linestring([10,10],[20,20])}
{"Feature":'bye',"ID":2,Linestring([20,20],[30,30])}
The lines need to be split into smaller pieces wherever length > x.

This works for me:
import geopandas
from shapely import geometry
from shapely.ops import split, snap
gdf = geopandas.GeoDataFrame([
    {"Feature": 'Hi', "ID": 1, "geometry": geometry.LineString(([1,1],[2,2],[3,3]))},
    {"Feature": 'bye', "ID": 2, "geometry": geometry.LineString(([10,10],[20,20],[30,30]))}
])
def split_line_by_point(line, point, tolerance: float=1.0e-12):
    return split(snap(line, point, tolerance), point)
result = (
    gdf
    .assign(geometry=gdf.apply(
        lambda x: split_line_by_point(
            x.geometry,
            geometry.Point(x.geometry.coords[1])
        ), axis=1
    ))
    .explode()
    .reset_index(drop=True)
)
Output:
  Feature  ID                                           geometry
0      Hi   1      LINESTRING (1.00000 1.00000, 2.00000 2.00000)
1      Hi   1      LINESTRING (2.00000 2.00000, 3.00000 3.00000)
2     bye   2  LINESTRING (10.00000 10.00000, 20.00000 20.00000)
3     bye   2  LINESTRING (20.00000 20.00000, 30.00000 30.00000)
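The question also asks for lines to be made smaller where length > x. One possible reading is to cut every line longer than a threshold into equal pieces that each fit under it. The sketch below assumes that reading; `cut_to_max_length` is a hypothetical helper, not part of geopandas or shapely, and the sample lines and the threshold of 4.0 are made up for illustration.

```python
import geopandas as gpd
from shapely.geometry import LineString, MultiLineString
from shapely.ops import substring


def cut_to_max_length(line, max_length):
    """Hypothetical helper: cut `line` into equal pieces no longer than max_length."""
    if line.length <= max_length:
        return line
    # Ceiling division: the smallest number of equal pieces that fit the limit.
    n_pieces = int(-(-line.length // max_length))
    step = line.length / n_pieces
    pieces = [
        substring(line, i * step, min((i + 1) * step, line.length))
        for i in range(n_pieces)
    ]
    return MultiLineString(pieces)


gdf = gpd.GeoDataFrame(
    {
        "Feature": ["Hi", "bye"],
        "ID": [1, 2],
        "geometry": [
            LineString([(0, 0), (10, 0)]),  # length 10, gets cut
            LineString([(0, 0), (0, 3)]),   # length 3, left as it is
        ],
    }
)

# Replace each geometry by its pieces, then explode into one row per piece.
pieces = gdf.geometry.apply(lambda geom: cut_to_max_length(geom, 4.0))
result = gdf.set_geometry(pieces).explode(index_parts=False).reset_index(drop=True)
print(result)
```

Because explode() repeats the non-geometry columns for every part, "Feature" and "ID" are carried along in the same way as in the answer above.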

Related

shapely nearest_points between two polygons returns expected result for simple polygons, unexpected result for complex polygons

I am using nearest_points from shapely to retrieve the nearest points between two polygons.
I get the expected result for two simple polygons:
However for more complex polygons, the points are not the expected nearest points between the polygons.
Note: I added ax.set_aspect('equal') so that the line between the nearest points would have to be at a right angle (right?)
What is wrong with my code or my polygons (or me)?
from shapely.geometry import Point, Polygon, LineString, MultiPolygon
from shapely.ops import nearest_points
import matplotlib.pyplot as plt
import shapely.wkt as wkt
#Set of 2 polygons A where the nearest points don't seem right
poly1=wkt.loads('POLYGON((-0.136319755454978 51.460464712623626, -0.1363352511419218 51.46042713866513, -0.1363348393705439 51.460425, -0.1365967582352347 51.460425, -0.1363077932125138 51.4605825392028, -0.136237298707157 51.46052697162038, -0.136319755454978 51.460464712623626))')
poly2=wkt.loads('POLYGON ((-0.1371553266140889 51.46046700960882, -0.1371516327997412 51.46046599134276, -0.1371478585043985 51.46046533117243, -0.1371440383598866 51.46046503515535, -0.1371402074187299 51.460465106007696, -0.1371364008325196 51.460465543079344, -0.137132653529373 51.46046634235985, -0.1371289998934435 51.46046749651525, -0.1371254734494216 51.46046899495536, -0.1371221065549237 51.46047082393093, -0.1366012836405492 51.460786954965236, -0.1365402944168757 51.46074798846902, -0.1370125055334012 51.46045400071198, -0.1371553266140889 51.46046700960882))')
#Set of 2 polygons B where the nearest points seem right
#poly1 = Polygon([(0, 0), (2, 8), (14, 10), (6, 1)])
#poly2 = Polygon([(10, 0),(13,5),(14,2)])
p1, p2 = nearest_points(poly1, poly2)
fig,ax= plt.subplots()
ax.set_aspect('equal')
x1,y1=poly1.exterior.xy
x2,y2=poly2.exterior.xy
#Plot polygons
plt.plot(x1,y1)
plt.plot(x2,y2)
#Plot LineString connecting the nearest points
plt.plot([p1.x, p2.x],[p1.y,p2.y], color='green')
fig.show()
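A check that does not depend on the plot is to compare the length of the segment between the two returned points with the distance between the polygons themselves. This is only a sanity-check sketch, shown on the simple polygon set B from the question; shapely treats the coordinates as planar, so for longitude/latitude input both numbers are in degrees.

```python
from shapely.geometry import Polygon
from shapely.ops import nearest_points

# Polygon set B from the question
poly1 = Polygon([(0, 0), (2, 8), (14, 10), (6, 1)])
poly2 = Polygon([(10, 0), (13, 5), (14, 2)])

p1, p2 = nearest_points(poly1, poly2)

# If p1 and p2 really are the nearest points, the segment between them
# is as long as the polygon-to-polygon distance.
segment_length = p1.distance(p2)
polygon_distance = poly1.distance(poly2)
print(segment_length, polygon_distance)
```

If the two numbers agree for the complex polygons as well, the points are the nearest ones in coordinate space, and any surprise is more likely to come from how the figure is drawn than from nearest_points.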

how to get valid latitude and longitude from linestring

I want a list of latitudes and longitudes of North Carolina roads for my research work. So I got the .shp file from here
https://xfer.services.ncdot.gov/gisdot/DistDOTData/NCRoutes_SHP.zip
and I loaded the file using geopandas.
import geopandas as gpd
graph = gpd.read_file("NCRoutes.shp")
Here is the geometry column of the shapefile.
graph['geometry']
Output:
0 MULTILINESTRING Z ((950413.442 761781.527 0.00...
1 MULTILINESTRING Z ((947047.370 633980.630 2181...
2 MULTILINESTRING Z ((946481.250 821340.560 3756...
3 LINESTRING Z (1000455.242 564424.433 1854.400,...
4 LINESTRING Z (1840729.024 842228.554 588.800, ...
...
373365 LINESTRING Z (2474108.250 658112.370 67.400, 2...
373366 LINESTRING Z (2331115.610 180293.340 37.400, 2...
373367 LINESTRING Z (2398439.990 968349.560 156.400, ...
373368 LINESTRING Z (1465953.417 567437.417 810.200, ...
373369 LINESTRING Z (1782694.744 871896.463 833.000, ...
Name: geometry, Length: 373370, dtype: geometry
When I print a single linestring, it looks like this:
graph['geometry'][3].coords.xy
Output:
(array('d', [1000455.2419360131, 1000541.414176017, 1000666.0802560151, 1000866.2999680042, 1001138.8699360043, 1001250.1976800114, 1001361.2661760151, 1001444.3955520093, 1001527.3755040169, 1001610.2039200068, 1001692.8797440082,.....
How do I convert these MultiLineStrings and LineStrings to latitudes and longitudes?
You need to transform your geometries to a geographic coordinate system. Let's use the popular WGS 84. But first of all, let's check that the data has a CRS defined. Printing the gdf.crs attribute on our gdf tells us that the data are currently in an NC State Plane projected coordinate system. Hint: this can also be deduced by looking at the .prj file, where the CRS is stored in WKT (well-known text) format. You can also find this information in QGIS, ArcGIS, etc.
import geopandas as gpd
gdf = gpd.read_file('NCRoutes.shp')
gdf.crs
Next we can use gpd's .to_crs() to convert to WGS 84, which uses EPSG code 4326.
gdf_wgs84 = gdf.to_crs(4326)
Let's look at the first few geometry rows. As you can see the coordinates are no longer measured in feet, but are now longitude and latitude.
gdf_wgs84['geometry'].head()
Output:
0 MULTILINESTRING ((-82.53992 35.79168, -82.5399...
1 MULTILINESTRING ((-82.53591 35.44045, -82.5358...
2 MULTILINESTRING ((-82.56037 35.95481, -82.5603...
3 LINESTRING (-82.34883 35.25454, -82.34854 35.2...
4 LINESTRING (-79.53887 36.06291, -79.53876 36.0...
Name: geometry, dtype: geometry
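To get from the reprojected geometries to plain lists of longitude/latitude pairs, one option is to explode the MultiLineStrings into their parts and read the coordinates of each part. The sketch below uses a tiny made-up GeoDataFrame instead of the real shapefile, and it assumes EPSG:2264 (NAD83 / North Carolina, US survey feet) as a stand-in for whatever gdf.crs actually reports.

```python
import geopandas as gpd
from shapely.geometry import LineString, MultiLineString

# Assumption: EPSG:2264 stands in for the CRS of the real shapefile.
gdf = gpd.GeoDataFrame(
    {"route": ["a", "b"]},
    geometry=[
        LineString([(1000455.242, 564424.433, 1854.4),
                    (1000541.414, 564500.0, 1850.0)]),
        MultiLineString([
            [(950413.442, 761781.527, 0.0), (950500.0, 761800.0, 0.0)],
            [(950600.0, 761900.0, 0.0), (950700.0, 762000.0, 0.0)],
        ]),
    ],
    crs="EPSG:2264",
)

gdf_wgs84 = gdf.to_crs(4326)

# One row per LineString, so that every geometry has a .coords sequence.
parts = gdf_wgs84.explode(index_parts=False).reset_index(drop=True)

# After to_crs(4326), x is the longitude and y is the latitude.
lonlat = [[(c[0], c[1]) for c in line.coords] for line in parts.geometry]
print(lonlat[0])
```

Each entry of `lonlat` is the list of (longitude, latitude) pairs for one line; the Z value, when present, is simply skipped.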

Access Z coordinate in a LINESTRING Z in geopandas?

I have a GeoDataFrame with a LINESTRING Z geometry where Z is my altitude for the lat/long. (There are other columns in the dataframe that I deleted for ease of sharing but are relevant when displaying the resulting track)
TimeUTC Latitude Longitude AGL geometry
0 2021-06-16 00:34:04+00:00 42.835413 -70.919610 82.2 LINESTRING Z (-70.91961 42.83541 82.20000, -70...
I would like to find the maximum Z value in that linestring but I am unable to find a way to access it or extract the x,y,z values in a way that I can determine the maximum value outside of the linestring.
line.geometry.bounds only returns the x,y min/max.
The best solution I could come up with was to turn all the points into a list of tuples:
points = line.apply(lambda x: [y for y in x['geometry'].coords], axis=1)
And then find the maximum value of the third element:
from operator import itemgetter
max(ft2,key=itemgetter(2))[2]
I hope there is a better solution available.
Thank you.
You can take your lambda function approach and just take it one step further:
import numpy as np
line['geometry'].apply(lambda geom: np.max([coord[2] for coord in geom.coords]))
Here's a fully reproducible example from start to finish:
import shapely
import numpy as np
import geopandas as gpd
linestring = shapely.geometry.LineString([[0,0,0],
                                          [1,1,1],
                                          [2,2,2]])
gdf = gpd.GeoDataFrame({'id':[1,2,3],
                        'geometry':[linestring,
                                    linestring,
                                    linestring]})
gdf['max_z'] = (gdf['geometry']
                .apply(lambda geom:
                       np.max([coord[2] for coord in geom.coords])))
In the example above, I create a new column called "max_z" that stores the maximum Z value for each row.
Important note
This solution will only work if you exclusively have LineStrings in your geometries. If, for example, you have MultiLineStrings, you'll have to adapt the function I wrote to take care of that.
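One way to adapt it, sketched here under the assumption that every geometry has Z values, is to loop over the parts of a multi-part geometry before reading the coordinates. `max_z` is a hypothetical helper name, not a geopandas or shapely function.

```python
from shapely.geometry import LineString, MultiLineString


def max_z(geom):
    """Hypothetical helper: largest Z of a LineString or MultiLineString."""
    # Multi-part geometries expose their parts through .geoms;
    # a plain LineString is treated as a single part.
    lines = geom.geoms if hasattr(geom, "geoms") else [geom]
    return max(coord[2] for line in lines for coord in line.coords)


single = LineString([(0, 0, 0), (1, 1, 1), (2, 2, 2)])
multi = MultiLineString([
    [(0, 0, 5), (1, 1, 7)],
    [(2, 2, 3), (3, 3, 9)],
])

print(max_z(single))  # 2.0
print(max_z(multi))   # 9.0
```

It can be applied to a column in the same way as the lambda above, for example gdf['geometry'].apply(max_z).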

How to calculate distance using geopandas

I wanted to calculate the distance from Manila to cities in the Philippines using geopandas GeoSeries.distance(self, other) function.
Steps:
# So I start with the dataset, which should produce a geopandas dataframe consisting basically of cities and a polygon of its boundaries in latlong.
url = 'https://raw.githubusercontent.com/macoymejia/geojsonph/master/MuniCities/MuniCities.minimal.json'
df1 = gpd.read_file(url)
# then I define a centroid column
df1['Centroid'] = df1.geometry.centroid
# then I define Manila location as a shapely point geometry, which produces a DataFrame with point geometry and address as columns
manila_loc = gpd.tools.geocode('Manila')
# then I try to calculate the distance
df1.Centroid.distance(manila_loc.geometry)
But I'm getting this error:
AttributeError Traceback (most recent call last)
<ipython-input-30-76585915942f> in <module>
----> 1 df1.Centroid.distance(manila_loc.geometry)
~/opt/anaconda3/envs/Coursera/lib/python3.8/site-packages/pandas/core/generic.py in __getattr__(self, name)
5137 if self._info_axis._can_hold_identifiers_and_holds_name(name):
5138 return self[name]
-> 5139 return object.__getattribute__(self, name)
5140
5141 def __setattr__(self, name: str, value) -> None:
AttributeError: 'Series' object has no attribute 'distance'
I'm new to GeoPandas, but I understood from the documentation that the distance method can act on a GeoSeries, and that df1.Centroid and manila_loc.geometry are valid shapely geometry objects. So I don't know what I am missing. Help, please.
Try this:
# relevant code only
dists = []
for i, centr in df1.Centroid.items():  # items() replaces iteritems(), which was removed in pandas 2.0
    dist = centr.distance(manila_loc.geometry[0])
    dists.append(dist)
    print("Dist2Manila: ", dist)
To create new column for the distances:
df1["Dist2Manila"] = dists
You need to feed a singular Point object to the distance method:
from shapely.geometry import Point
from geopandas import GeoDataFrame
destination = Point(5, 5)
geoms = map(lambda x: Point(*x), [(0, 0), (3, 3), (4, 1), (8, 2), (1, 10)])
departures = GeoDataFrame({'city': list('ABCDE'), 'geometry': geoms})
print(departures.assign(dist_to_dest=departures.distance(destination)))
Which gives me:
  city                  geometry  dist_to_dest
0    A   POINT (0.00000 0.00000)      7.071068
1    B   POINT (3.00000 3.00000)      2.828427
2    C   POINT (4.00000 1.00000)      4.123106
3    D   POINT (8.00000 2.00000)      4.242641
4    E  POINT (1.00000 10.00000)      6.403124
I was able to solve it. I mistakenly assumed that, since it is a geopandas data frame, the centroid column would already be treated as a GeoSeries. But it seems you have to define it explicitly, so that's what I did. A minor revision of the original code, from:
df1.Centroid.distance(manila_loc.geometry)
to:
gpd.GeoSeries(df1.Centroid).distance(manila_loc.iloc[0,0])
It worked.
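Note that with longitude/latitude data the numbers returned by distance are in degrees, not in metres. A minimal sketch of one way around that is to reproject both sides to a projected CRS first. The coordinates below are rough illustrative values rather than rows from the dataset in the question, and UTM zone 51N (EPSG:32651) is an assumption about which metric CRS is acceptable.

```python
import geopandas as gpd
from shapely.geometry import Point

# Illustrative lon/lat values, not taken from the dataset in the question.
cities = gpd.GeoDataFrame(
    {"city": ["Cebu City", "Davao City"]},
    geometry=[Point(123.89, 10.32), Point(125.61, 7.07)],
    crs="EPSG:4326",
)
manila = gpd.GeoSeries([Point(120.98, 14.60)], crs="EPSG:4326")

# Assumption: UTM zone 51N (EPSG:32651) is a suitable metric CRS here.
cities_m = cities.to_crs(32651)
manila_m = manila.to_crs(32651).iloc[0]  # a single shapely Point

# distance() against a single geometry returns one value per row, in metres.
cities["dist_km"] = cities_m.distance(manila_m) / 1000
print(cities[["city", "dist_km"]])
```

A single UTM zone only covers part of the country, so this is a rough planar approximation rather than a geodesic distance.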

Geopandas: Get a box that covers area of a geopandas GeoDataFrame to use it to invert a map

I'm trying to invert a map.
import geopandas as gpd
import geoplot as gplt
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
denmark = world[world.name == 'Denmark']
I would like to find out the boundaries of the "denmark" dataframe, so that I can create a box-shaped GeoDataFrame that covers all of Denmark.
I'd then intersect that with "denmark" to get a shape of all that is not denmark, which I can later use to cover parts of a map I don't want to show.
I tried looking through the GeoDataFrame to create this box manually, but that doesn't work well.
cords = [c3
         for c in mapping(denmark['geometry'])['features']
         for c2 in c['geometry']['coordinates']
         for c3 in c2
         ]
xcords = [x[0] for x in cords if isinstance(x[0], float)]
ycords = [y[1] for y in cords if isinstance(y[1], float)]
w3 = gpd.GeoDataFrame(
    [Polygon([[max(xcords), max(ycords)],
              [max(xcords), min(ycords)],
              [min(xcords), min(ycords)],
              [min(xcords), max(ycords)]
              ])],
    columns = ['geometry'],
    geometry='geometry')
Is there an easy, quick way to get this box?
Or is there a way to invert a GeoDataFrame?
A GeoDataFrame has the total_bounds attribute, which returns the minx, miny, maxx, maxy of all geometries (the min/max of the bounds of all geometries).
And to create a Polygon of this, you can then pass those values to the shapely.geometry.box function:
>>> denmark.total_bounds
array([ 8.08997684, 54.80001455, 12.69000614, 57.73001659])
>>> from shapely.geometry import box
>>> box(*denmark.total_bounds)
<shapely.geometry.polygon.Polygon at 0x7f06be3e7668>
>>> print(box(*denmark.total_bounds))
POLYGON ((12.6900061377556 54.80001455343792, 12.6900061377556 57.73001658795485, 8.089976840862221 57.73001658795485, 8.089976840862221 54.80001455343792, 12.6900061377556 54.80001455343792))
Looks like a GeoDataFrame has a property "total_bounds"
So it's
denmark.total_bounds
which returns
array([ 8.08997684, 54.80001455, 12.69000614, 57.73001659])
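For the second half of the question, getting everything in the box that is not the country, a geometric difference of the box and the country shape is probably what is wanted rather than an intersection. The sketch below uses a made-up triangle in place of the Denmark geometry, so the names and coordinates are illustrative only.

```python
import geopandas as gpd
from shapely.geometry import Polygon, box
from shapely.ops import unary_union

# A made-up triangle stands in for the Denmark geometry from the question.
country = gpd.GeoDataFrame(
    {"name": ["example"]},
    geometry=[Polygon([(1, 1), (5, 1), (3, 4)])],
    crs="EPSG:4326",
)

# Bounding box of all geometries: minx, miny, maxx, maxy
bbox = box(*country.total_bounds)

# Everything inside the box that is NOT covered by the country.
inverse = bbox.difference(unary_union(list(country.geometry)))

inverse_gdf = gpd.GeoDataFrame(
    {"name": ["mask"]}, geometry=[inverse], crs=country.crs
)
print(inverse_gdf)
```

The resulting `inverse_gdf` can then be plotted on top of a map to hide the parts that should not be shown.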
