how to get valid latitude and longitude from linestring - geopandas

I want a list of latitudes and longitudes of North Carolina roads for my research work. So I got the .shp file from here
https://xfer.services.ncdot.gov/gisdot/DistDOTData/NCRoutes_SHP.zip
and I loaded the file using geopandas.
import geopandas as gpd
graph = gpd.read_file("NCRoutes.shp")
Here is the geometry column of the shapefile.
graph['geometry']
Output:
0 MULTILINESTRING Z ((950413.442 761781.527 0.00...
1 MULTILINESTRING Z ((947047.370 633980.630 2181...
2 MULTILINESTRING Z ((946481.250 821340.560 3756...
3 LINESTRING Z (1000455.242 564424.433 1854.400,...
4 LINESTRING Z (1840729.024 842228.554 588.800, ...
...
373365 LINESTRING Z (2474108.250 658112.370 67.400, 2...
373366 LINESTRING Z (2331115.610 180293.340 37.400, 2...
373367 LINESTRING Z (2398439.990 968349.560 156.400, ...
373368 LINESTRING Z (1465953.417 567437.417 810.200, ...
373369 LINESTRING Z (1782694.744 871896.463 833.000, ...
Name: geometry, Length: 373370, dtype: geometry
When I print a single linestring, it looks like this-
graph['geometry'][3].coords.xy
Output:
(array('d', [1000455.2419360131, 1000541.414176017, 1000666.0802560151, 1000866.2999680042, 1001138.8699360043, 1001250.1976800114, 1001361.2661760151, 1001444.3955520093, 1001527.3755040169, 1001610.2039200068, 1001692.8797440082,.....
How do I convert these MultiLineStrings and LineStrings to latitudes and longitudes?

You need to transform your geometries to a geographic coordinate system. Let's use the popular WGS 84. But first, let's check that the data has a CRS defined. Printing the gdf.crs attribute of our GeoDataFrame tells us that the data are currently in an NC State Plane projected coordinate system. Hint: this can also be deduced from the .prj file, where the CRS is stored in WKT (well-known text) format. You can also find this information in QGIS, ArcGIS, etc.
import geopandas as gpd
gdf = gpd.read_file('NCRoutes.shp')
gdf.crs
Next, we can use the GeoDataFrame's .to_crs() method to convert to WGS 84, which has EPSG code 4326.
gdf_wgs84 = gdf.to_crs(4326)
Let's look at the first few geometry rows. As you can see the coordinates are no longer measured in feet, but are now longitude and latitude.
gdf_wgs84['geometry'].head()
Output:
0 MULTILINESTRING ((-82.53992 35.79168, -82.5399...
1 MULTILINESTRING ((-82.53591 35.44045, -82.5358...
2 MULTILINESTRING ((-82.56037 35.95481, -82.5603...
3 LINESTRING (-82.34883 35.25454, -82.34854 35.2...
4 LINESTRING (-79.53887 36.06291, -79.53876 36.0...
Name: geometry, dtype: geometry
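To get from the reprojected geometries to plain lists of longitudes and latitudes, something like the following sketch may help. It uses a small synthetic GeoDataFrame instead of the real shapefile, and it assumes the source CRS is EPSG:2264 (NAD83 / North Carolina, US survey feet); check gdf.crs on the real data. The helper name lon_lat_pairs is made up for this illustration.

```python
import geopandas as gpd
from shapely.geometry import LineString, MultiLineString

# Synthetic stand-in for the road data. EPSG:2264 is an assumption,
# not something read from the real NCRoutes.shp file.
gdf = gpd.GeoDataFrame(
    {"name": ["a", "b"]},
    geometry=[
        LineString([(1000455.242, 564424.433, 1854.4),
                    (1000541.414, 564500.0, 1850.0)]),
        MultiLineString([
            [(950413.442, 761781.527, 0.0), (950500.0, 761800.0, 0.0)],
            [(950600.0, 761900.0, 0.0), (950700.0, 762000.0, 0.0)],
        ]),
    ],
    crs="EPSG:2264",
)

gdf_wgs84 = gdf.to_crs(4326)


def lon_lat_pairs(geom):
    """Return a flat list of (lon, lat) tuples for a LineString or MultiLineString."""
    # A MultiLineString is a collection of LineStrings; treat a single
    # LineString as a one-element collection so both cases share one loop.
    parts = geom.geoms if geom.geom_type == "MultiLineString" else [geom]
    # Index 0 is x (longitude) and index 1 is y (latitude); any Z value is ignored.
    return [(c[0], c[1]) for part in parts for c in part.coords]


coords = gdf_wgs84.geometry.apply(lon_lat_pairs)
```

Each entry of coords is then a list of (longitude, latitude) tuples for one road; note that x comes first, so the order is longitude, then latitude.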

Related

Split lines in GeoPandas

I can find how to split 1 line using geopandas/shapely
def split_line_by_point(line, point, tolerance: float=1.0e-12):
    return split(snap(line, point, tolerance), point)
However I can't figure out how to apply this over an entire geometry column while maintaining other values.
Applying the above function to df.geometry loses a bunch of information
How do I split the linestrings below so that each one 'explodes' into its segments while maintaining the 'Feature' and 'ID' columns?
{"Feature":'Hi',"ID":1,Linestring([1,1],[2,2],[3,3])},
{"Feature":'bye',"ID":2,Linestring([10,10],[20,20],[30,30])}
To
{"Feature":'Hi',"ID":1,Linestring([1,1],[2,2])},
{"Feature":'Hi',"ID":1,Linestring([2,2],[3,3])},
{"Feature":'bye',"ID":2,Linestring([10,10],[20,20])}
{"Feature":'bye',"ID":2,Linestring([20,20],[30,30])}
Lines need to be smaller where length > x
This works for me:
import geopandas
from shapely import geometry
from shapely.ops import split, snap
gdf = geopandas.GeoDataFrame([
{"Feature":'Hi',"ID":1,"geometry": geometry.LineString(([1,1],[2,2],[3,3]))},
{"Feature":'bye',"ID":2, "geometry": geometry.LineString(([10,10],[20,20],[30,30]))}
])
def split_line_by_point(line, point, tolerance: float=1.0e-12):
    return split(snap(line, point, tolerance), point)
result = (
    gdf
    .assign(geometry=gdf.apply(
        lambda x: split_line_by_point(
            x.geometry,
            geometry.Point(x.geometry.coords[1])
        ), axis=1
    ))
    .explode()
    .reset_index(drop=True)
)
Output:
Feature ID geometry
0 Hi 1 LINESTRING (1.00000 1.00000, 2.00000 2.00000)
1 Hi 1 LINESTRING (2.00000 2.00000, 3.00000 3.00000)
2 bye 2 LINESTRING (10.00000 10.00000, 20.00000 20.00000)
3 bye 2 LINESTRING (20.00000 20.00000, 30.00000 30.00000)
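The answer above splits each line at its second vertex only, which is enough for three-point lines. For lines with more vertices, one possible alternative (a sketch, not the answerer's method) is to build one two-point LineString per consecutive vertex pair and repeat the other columns for each piece. The helper name to_segments and the sample data are made up for this illustration.

```python
import geopandas as gpd
from shapely.geometry import LineString

gdf = gpd.GeoDataFrame(
    {"Feature": ["Hi", "bye"], "ID": [1, 2]},
    geometry=[
        LineString([(1, 1), (2, 2), (3, 3)]),
        LineString([(10, 10), (20, 20), (30, 30), (40, 40)]),
    ],
)


def to_segments(line):
    """Split a LineString into two-point LineStrings, one per consecutive vertex pair."""
    coords = list(line.coords)
    return [LineString(pair) for pair in zip(coords[:-1], coords[1:])]


# Build one output row per segment, copying the non-geometry columns.
rows = [
    {**row.drop("geometry").to_dict(), "geometry": segment}
    for _, row in gdf.iterrows()
    for segment in to_segments(row.geometry)
]
segments = gpd.GeoDataFrame(rows, geometry="geometry", crs=gdf.crs)
```

Iterating row by row is slow on very large frames, so this is mainly meant to show the idea; a length threshold could be added inside to_segments to split only lines longer than some value x.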

Geopandas: How to relate the length of a linestring to the linestring point used to find distance to polygon

I'm trying to find the length of the linestring between its starting point and the point that is used to find the nearest distance to a polygon.
So I used the following code to get the minimum distance between the linestring and some polygons.
gdf['MinDistToTrack'] = gdf.geometry.apply(lambda l: min(rail_or.distance(l)))
and I would also like to get the distance from the start of the linestring to the point used by the above code.
I would like a dataframe containing the polygons with the value 'MinDistToTrack' (which I have now) and also a value 'Length_Of_Linestring_Up_To_Location_Of_Polygon'.
So, let’s say that from the start of the linestring to the polygon there are 22 meters following the path of the linestring, then this is the value I would like to save together with the 'MinDistToTrack'
Polygon ID : 1
'MinDistToTrack' : 1m
'LengthOfLinestringUpToLocationOfPolygon' : 22m
Is this possible or do I need to split the linestring up into small elements and then look at all elements and the length of all the preceding elements in relation to the linestring elements which is nearest to the polygon?
You may use the following concepts from shapely:
The nearest_points() function in shapely.ops calculates the nearest points in a pair of geometries.
shapely.ops.nearest_points(geom1, geom2)
Returns a tuple of the nearest points in the input geometries. The points are returned in the same order as the input geometries.
https://shapely.readthedocs.io/en/stable/manual.html#shapely.ops.nearest_points
from shapely.geometry import LineString, Polygon
from shapely.ops import nearest_points
P = Polygon([(0, 0), (1, 0), (0.5, 1), (0, 0)])
Lin = LineString([(0, 2), (1, 2), (1, 3), (0, 3)])
# keep the results as Point objects so they can be passed to project()
nps = nearest_points(P, Lin)
# [o.wkt for o in nps] gives ['POINT (0.5 1)', 'POINT (0.5 2)']
np_lin = nps[1]
You can then project the point np_lin onto Lin to get the distance along the line:
d = Lin.project(np_lin)
d will be the distance along Lin from its start to np_lin, i.e. the point on Lin nearest to P.
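Putting both values from the question together, a minimal sketch could look like the following. The helper name distance_and_position and the sample geometries are hypothetical, and all distances are in the units of the geometries' coordinate system.

```python
from shapely.geometry import LineString, Polygon
from shapely.ops import nearest_points


def distance_and_position(line, polygon):
    """Return (shortest distance to polygon, length along line to the nearest point)."""
    # nearest_points returns the points in the same order as its arguments,
    # so the first one lies on the line.
    point_on_line, _ = nearest_points(line, polygon)
    return line.distance(polygon), line.project(point_on_line)


track = LineString([(0, 2), (1, 2), (1, 3), (0, 3)])
polygons = {
    1: Polygon([(0, 0), (1, 0), (0.5, 1), (0, 0)]),
    2: Polygon([(2, 2.5), (3, 2), (3, 3)]),
}

results = {
    pid: distance_and_position(track, poly) for pid, poly in polygons.items()
}
```

With a GeoDataFrame of polygons, the same helper could be applied to the geometry column, which avoids cutting the linestring into small elements by hand.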

Access Z coordinate in a LINESTRING Z in geopandas?

I have a GeoDataFrame with a LINESTRING Z geometry where Z is my altitude for the lat/long. (There are other columns in the dataframe that I deleted for ease of sharing but are relevant when displaying the resulting track)
TimeUTC Latitude Longitude AGL geometry
0 2021-06-16 00:34:04+00:00 42.835413 -70.919610 82.2 LINESTRING Z (-70.91961 42.83541 82.20000, -70...
I would like to find the maximum Z value in that linestring but I am unable to find a way to access it or extract the x,y,z values in a way that I can determine the maximum value outside of the linestring.
line.geometry.bounds only returns the x,y min/max.
The best solution I could come up with was to turn all the points into a list of tuples:
points = line.apply(lambda x: [y for y in x['geometry'].coords], axis=1)
And then find the maximum value of the third element:
from operator import itemgetter
max(ft2,key=itemgetter(2))[2]
I hope there is a better solution available.
Thank you.
You can take your lambda function approach and just take it one step further:
import numpy as np
line['geometry'].apply(lambda geom: np.max([coord[2] for coord in geom.coords]))
Here's a fully reproducible example from start to finish:
import shapely
import numpy as np
import geopandas as gpd
linestring = shapely.geometry.LineString([[0,0,0],
[1,1,1],
[2,2,2]])
gdf = gpd.GeoDataFrame({'id':[1,2,3],
'geometry':[linestring,
linestring,
linestring]})
gdf['max_z'] = (gdf['geometry']
.apply(lambda geom:
np.max([coord[2] for coord in geom.coords])))
In the example above, I create a new column called "max_z" that stores the maximum Z value for each row.
Important note
This solution will only work if you exclusively have LineStrings in your geometries. If, for example, you have MultiLineStrings, you'll have to adapt the function I wrote to take care of that.
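One way to adapt the function for MultiLineStrings is sketched below. The helper name max_z is made up for this illustration, and it returns None for geometries without Z values, which is a design choice rather than anything prescribed by geopandas or shapely.

```python
from shapely.geometry import LineString, MultiLineString


def max_z(geom):
    """Maximum Z over all vertices of a LineString or MultiLineString, or None without Z."""
    if geom is None or geom.is_empty or not geom.has_z:
        return None
    # Treat a single LineString as a one-part collection.
    parts = geom.geoms if geom.geom_type == "MultiLineString" else [geom]
    return max(coord[2] for part in parts for coord in part.coords)


line = LineString([(0, 0, 0), (1, 1, 1), (2, 2, 2)])
multi = MultiLineString([[(0, 0, 5), (1, 1, 7)], [(2, 2, 3), (3, 3, 4)]])
flat = LineString([(0, 0), (1, 1)])
```

It can then be used in the same way as the lambda above, for example gdf['max_z'] = gdf['geometry'].apply(max_z).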

Geopandas: Get a box that covers area of a geopandas GeoDataFrame to use it to invert a map

I'm trying to invert a map.
import geopandas as gpd
import geoplot as gplt
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
denmark = world[world.name == 'Denmark']
I would like to find out the boundaries of the "denmark" dataframe, so that I can create a box shaped GeoDataFrame that covers all of Denmark.
I'd then intersect that with "denmark" to get a shape of all that is not denmark, which I can later use to cover parts of a map I don't want to show.
I tried looking through the GeoDataFrame to create this box manually, but that doesn't work well.
cords = [c3
for c in mapping(denmark['geometry'])['features']
for c2 in c['geometry']['coordinates']
for c3 in c2
]
xcords = [x[0] for x in cords if isinstance(x[0], float)]
ycords = [y[1] for y in cords if isinstance(y[1], float)]
w3 = gpd.GeoDataFrame(
[Polygon([[max(xcords), max(ycords)],
[max(xcords), min(ycords)],
[min(xcords), min(ycords)],
[min(xcords), max(ycords)]
])],
columns = ['geometry'],
geometry='geometry')
Is there an easy, quick way to get this box?
Or is there a way to invert a GeoDataFrame?
A GeoDataFrame has the total_bounds attribute, which returns the minx, miny, maxx, maxy of all geometries (the min/max of the bounds of all geometries).
And to create a Polygon of this, you can then pass those values to the shapely.geometry.box function:
>>> denmark.total_bounds
array([ 8.08997684, 54.80001455, 12.69000614, 57.73001659])
>>> from shapely.geometry import box
>>> box(*denmark.total_bounds)
<shapely.geometry.polygon.Polygon at 0x7f06be3e7668>
>>> print(box(*denmark.total_bounds))
POLYGON ((12.6900061377556 54.80001455343792, 12.6900061377556 57.73001658795485, 8.089976840862221 57.73001658795485, 8.089976840862221 54.80001455343792, 12.6900061377556 54.80001455343792))
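For the "invert" part of the question, a possible sketch is to subtract the union of all geometries from that bounding box. The two squares below are a synthetic stand-in for the denmark GeoDataFrame, and the variable names are made up for this illustration.

```python
import geopandas as gpd
from shapely.geometry import Polygon, box
from shapely.ops import unary_union

# Synthetic stand-in for a country made of two separate pieces.
region = gpd.GeoDataFrame(
    geometry=[
        Polygon([(1, 1), (3, 1), (3, 3), (1, 3)]),
        Polygon([(4, 0), (5, 0), (5, 1), (4, 1)]),
    ]
)

# Bounding box of all geometries: minx, miny, maxx, maxy.
bbox = box(*region.total_bounds)

# Merge every geometry into one shape, then cut it out of the box.
covered = unary_union(list(region.geometry))
inverted = bbox.difference(covered)

mask = gpd.GeoDataFrame(geometry=[inverted], crs=region.crs)
```

The resulting mask contains everything inside the box that is not part of the original shapes, so it can be plotted on top of a map to hide the unwanted parts.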
Looks like a GeoDataFrame has a property "total_bounds"
So it's
denmark.total_bounds
which returns
array([ 8.08997684, 54.80001455, 12.69000614, 57.73001659])

inconsistent definition of longitude and latitude for healpy.pixelfunc.get_interp_val() or healpy.mollview()?

When I rotate a Healpix map along longitude or latitude, I get the wrong behavior.
I'm probably missing something obvious here but so far, I failed to find what.
See demo:
import numpy as np
import healpy as hp
import matplotlib.pyplot as plt
nside = 4
npix = hp.nside2npix(nside)
idx = 70
offset = 1 # rad
# set one pixel to 1 in the map
data = np.array(np.equal(np.arange(npix), idx), dtype=float)
hp.mollview(data, nest=True, title='original')
# longitude and co-latitude in radians
theta, phi = hp.pix2ang(nside, np.arange(npix), nest=True)
# rotate: offset on longitude, keep co-latitude the same
rotated = hp.get_interp_val(data, theta + offset, phi, nest=True)
hp.mollview(rotated, nest=True, title='rotated longitude')
# rotate: keep longitude the same, offset on co-latitude
rotated = hp.get_interp_val(data, theta, phi+offset, nest=True)
hp.mollview(rotated, nest=True, title='rotated latitude')
and results:
[Figures: original map; rotated longitude; rotated latitude]
The dot in the map rotated along longitude is translated vertically, while it is translated horizontally for the rotation along latitude. I'd expect the reverse.
Any hint about what's wrong here?
E.
Theta is co-latitude, Phi is longitude.
This is confusing because the order is the reverse of what we usually expect. Even within healpy, for example, pix2ang with lonlat set to True returns longitude first and then latitude.
Unfortunately this is the convention and we have to stick to this.
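To make the convention concrete, here is a small numpy-only sketch of how (theta, phi) in radians relate to longitude and latitude in degrees. The function name colat_lon_to_lonlat is made up for this illustration and is not part of healpy.

```python
import numpy as np


def colat_lon_to_lonlat(theta, phi):
    """Convert co-latitude theta and longitude phi (radians) to (lon, lat) in degrees."""
    lon = np.degrees(phi)
    # Co-latitude is measured from the north pole, latitude from the equator.
    lat = 90.0 - np.degrees(theta)
    return lon, lat


theta = np.array([0.0, np.pi / 2, np.pi])  # north pole, equator, south pole
phi = np.array([0.0, np.pi / 2, np.pi])

lon, lat = colat_lon_to_lonlat(theta, phi)

# Adding an offset to phi changes only the longitude.
lon_shifted, lat_shifted = colat_lon_to_lonlat(theta, phi + 1.0)
```

In the demo from the question, this means the offset has to be added to phi to shift in longitude and to theta to shift in co-latitude, which is the reverse of what the comments there assume.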

Resources