How do I index a document with the data below in Elasticsearch (geo datatype)?
<west>5.8663152683722</west>
<north>55.0583836008072</north>
<east>15.0418156516163</east>
<south>47.2701236047002</south>
I tried geo_point and it works for lon/lat pairs, but I'm not sure how to save this data. Any help is highly appreciated.
You'll have to use the geo_shape datatype and convert your XML (I assume) semi-points into a line string or polygon before sync.
I'm gonna go with a polygon here.
Let's visualise the conventional cardinal directions:
North (+90)
|
(-180) West ——+—— East (+180)
|
South (-90)
geo_shape expects GeoJSON-like inputs so you'll need five coordinate points, the first and last of which are identical (according to the GeoJSON spec).
Therefore, borrowing from TurfJS and going from bottom left counter-clockwise,
const lowLeft = [west, south];
const topLeft = [west, north];
const topRight = [east, north];
const lowRight = [east, south];
return [
  [
    lowLeft,
    lowRight,
    topRight,
    topLeft,
    lowLeft
  ]
];
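The same construction can be sketched in Python. This is only an illustration: the function name bbox_to_polygon and its argument order are assumptions made here, not part of TurfJS or Elasticsearch.

```python
def bbox_to_polygon(west, south, east, north):
    """Return GeoJSON-style polygon coordinates for a bounding box.

    Positions are [lon, lat]. The ring runs counter-clockwise from the
    lower-left corner and is closed by repeating the first position.
    """
    low_left = [west, south]
    low_right = [east, south]
    top_right = [east, north]
    top_left = [west, north]
    return [[low_left, low_right, top_right, top_left, low_left]]


# Values taken from the <west>/<north>/<east>/<south> elements in the question.
location = {
    "type": "polygon",
    "coordinates": bbox_to_polygon(
        west=5.8663152683722,
        south=47.2701236047002,
        east=15.0418156516163,
        north=55.0583836008072,
    ),
}
```

The resulting location dict has the same shape as the document body indexed in this answer.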
Finally, let's create our index and plug your numbers in:
PUT /example
{
"mappings": {
"properties": {
"location": {
"type": "geo_shape"
}
}
}
}
POST /example/_doc
{
"location":{
"type":"polygon",
"coordinates":[
[
[
5.8663152683722,
47.2701236047002
],
[
15.0418156516163,
47.2701236047002
],
[
15.0418156516163,
55.0583836008072
],
[
5.8663152683722,
55.0583836008072
],
[
5.8663152683722,
47.2701236047002
]
]
]
}
}
Then verify that the center of your rectangle is indeed inside your indexed polygon:
GET example/_search
{
"query": {
"geo_shape": {
"location": {
"shape": {
"type": "point",
"coordinates": [
10.45406545999425,
51.1642536027537
]
},
"relation": "intersects"
}
}
}
}
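The center point used in that query is just the midpoint of the bounding box. A minimal Python sketch of how it could be computed and wrapped into the query body follows; the helper names bbox_center and point_intersects_query are hypothetical, and the midpoint ignores boxes that cross the antimeridian.

```python
def bbox_center(west, south, east, north):
    """Midpoint of a bounding box as [lon, lat] (no antimeridian handling)."""
    return [(west + east) / 2, (south + north) / 2]


def point_intersects_query(field, point):
    """Build a geo_shape query body matching shapes that intersect `point`."""
    return {
        "query": {
            "geo_shape": {
                field: {
                    "shape": {"type": "point", "coordinates": point},
                    "relation": "intersects",
                }
            }
        }
    }


center = bbox_center(
    west=5.8663152683722,
    south=47.2701236047002,
    east=15.0418156516163,
    north=55.0583836008072,
)
body = point_intersects_query("location", center)
```

The body dict mirrors the JSON sent to GET example/_search above.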
I'm trying to insert a location that is using a circle processor to generate the circle points. Location [30, 10] in the document is generated with the circle points properly.
docs: Circle processor | Elasticsearch Guide [8.5] | Elastic
PUT _ingest/pipeline/polygonize_circles
{
"description": "translate circle to polygon",
"processors": [
{
"circle": {
"field": "circle",
"error_distance": 1,
"shape_type": "geo_shape"
}
}
]
}
PUT circles
{
"mappings": {
"properties": {
"circle": {
"type": "geo_shape"
}
}
}
}
PUT circles/_doc/2?pipeline=polygonize_circles
{
"circle": {
"type": "circle",
"radius": "40m",
"coordinates": [35.539917, -78.472000]
}
}
GET circles/_doc/2
But if I use another location, the generated coordinates look like an oval with the wrong radius.
my location [35.54171753710938, -78.472]
created coordinates:
"circle": {
"coordinates": [
[
[
35.54171753710938,
-78.472
],
[
35.54112406581135,
-78.47173430472324
],
[
35.540630197847186,
-78.47167003953963
],
[
35.540375564960186,
-78.47165140797998
],
[
35.54021828908823,
-78.47164529506406
],
[
35.54010640465818,
-78.47164274491611
],
[
35.54001650261309,
-78.47164162397662
],
[
35.53993651647515,
-78.47164003979641
],
[
35.539858062238046,
-78.47164049555624
],
[
35.53977439153409,
-78.47164207973975
],
[
35.5396750942597,
-78.47164321572573
],
[
35.5395458932956,
-78.47164846905713
],
[
35.539348254773515,
-78.47165779735127
],
[
35.53899878746994,
-78.47169061817682
],
[
35.53833938849573,
-78.47182842440924
],
[
35.53833938849573,
-78.47217157559075
],
[
35.53899878746994,
-78.47230938182317
],
[
35.539348254773515,
-78.47234220264872
],
[
35.5395458932956,
-78.47235153094286
],
[
35.5396750942597,
-78.47235678427425
],
[
35.53977439153409,
-78.47235792026024
],
[
35.539858062238046,
-78.47235950444374
],
[
35.53993651647515,
-78.47235996020358
],
[
35.54001650261309,
-78.47235837602337
],
[
35.54010640465818,
-78.47235725508388
],
[
35.54021828908823,
-78.47235470493592
],
[
35.540375564960186,
-78.47234859202001
],
[
35.540630197847186,
-78.47232996046036
],
[
35.54112406581135,
-78.47226569527675
],
[
35.54171753710938,
-78.472
]
]
],
"type": "Polygon"
}
coordinates mapping on google maps
Is this a bug, or is it working as expected? The coordinates do not form a circle, so it is affecting the search results.
You need to specify your coordinate array with longitude first and then latitude. I think you did the opposite, so your circle is in the middle of Antarctica.
If you do it like this:
PUT circles/_doc/2?pipeline=polygonize_circles
{
"circle": {
"type": "circle",
"radius": "40m",
"coordinates": [-78.472000, 35.539917]
}
}
Then your circle doesn't look oval anymore:
From the official doc:
In GeoJSON and WKT, and therefore Elasticsearch, the correct coordinate order is longitude, latitude (X, Y) within coordinate arrays. This differs from many Geospatial APIs (e.g., Google Maps) that generally use the colloquial latitude, longitude (Y, X).
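One way to make the order hard to get wrong is to build the array from named latitude and longitude values. The sketch below is an illustration only; to_geojson_position is a hypothetical helper, not an Elasticsearch API.

```python
def to_geojson_position(lat, lon):
    """Convert a (lat, lon) pair to GeoJSON [lon, lat] order, validating ranges."""
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    return [lon, lat]


# The circle from the question, with the values passed by name.
circle = {
    "type": "circle",
    "radius": "40m",
    "coordinates": to_geojson_position(lat=35.539917, lon=-78.472000),
}
```

Note that the range check alone would not have caught the swap in this question, because both values lie within ±90. Passing the values by name is what prevents the mix-up.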
The Elasticsearch index contains JSON like the following; only the relevant element is shown:
"geoLocation": {
"coordinates": [ [ -90.66487121582031, 42.49201965332031 ], [ -90.66487884521484, 42.49202346801758 ], [ -90.6648941040039, 42.492034912109375 ], [ -90.66490936279297, 42.49203872680664 ], [ -90.66492462158203, 42.492042541503906 ], [ -90.6649398803711, 42.49204635620117 ], [ -90.66495513916016, 42.49205017089844 ], [ -90.66497039794922, 42.4920539855957 ], [ -90.66498565673828, 42.492061614990234 ], [ -90.66500854492188, 42.492061614990234 ], [ -90.66502380371094, 42.49207305908203 ], [ -90.6650390625, 42.4920654296875 ] ],
"type": "linestring"
},
The template used to generate the mapping is as follows:
PUT _template/template_1?include_type_name=true
{
"index_patterns": ["metromind-its-alerts-day2-*"],
"settings": {
"number_of_shards": 2
},
"mappings": {
"logs": {
"properties": {
"geoLocation": {
"type": "geo_shape"
}
}
}
}
}
The generated mapping is shown below:
mapping showing the geoLocation type
When Kibana Maps is used, it detects the geo_shape field:
Kibana Map to render Linestring
However, no LineString is rendered. Please suggest a resolution.
The line string is there, in Dubuque, IA; it's just extremely small at the scale of the Earth.
Just click on the following icon and Elastic Map will focus on it and you'll see it:
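To get a feel for how small the line is, its length can be estimated with the haversine formula. This is a rough sketch that assumes a spherical Earth with the mean radius given below; the function names are made up for illustration.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; a spherical approximation


def haversine_m(a, b):
    """Great-circle distance in metres between two [lon, lat] positions."""
    lon1, lat1, lon2, lat2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def linestring_length_m(coords):
    """Sum of the segment lengths of a LineString, in metres."""
    return sum(haversine_m(p, q) for p, q in zip(coords, coords[1:]))


# Coordinates copied from the geoLocation field in the question.
coords = [
    [-90.66487121582031, 42.49201965332031],
    [-90.66487884521484, 42.49202346801758],
    [-90.6648941040039, 42.492034912109375],
    [-90.66490936279297, 42.49203872680664],
    [-90.66492462158203, 42.492042541503906],
    [-90.6649398803711, 42.49204635620117],
    [-90.66495513916016, 42.49205017089844],
    [-90.66497039794922, 42.4920539855957],
    [-90.66498565673828, 42.492061614990234],
    [-90.66500854492188, 42.492061614990234],
    [-90.66502380371094, 42.49207305908203],
    [-90.6650390625, 42.4920654296875],
]
length_m = linestring_length_m(coords)
```

Under these assumptions length_m is a few tens of metres at most, which is far too short to be visible on a world-scale map.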
I have the following mapping in Elasticsearch. I am able to PUT documents using the Sense plugin, but I cannot set the geo_shape field value using XContentBuilder. I am getting the following error:
error:
[106]: index [streets], type [street], id [{dc872755-f307-4c5e-93f6-bba9c95791c7}], message [MapperParsingException[failed to parse [shape]]; nested: ElasticsearchParseException[shape must be an object consisting of type and coordinates];]
mapping:
PUT /streets
{
"mappings": {
"street": {
"properties": {
"id": {
"type": "string"
},
"shape": {
"type": "geo_shape",
"tree": "quadtree"
}
}
}
}
}
code:
val bulkRequest:BulkRequestBuilder = esClient.prepareBulk()
//inloop
xb = jsonBuilder().startObject()
xb.field("id", guid)
xb.field("shape", jsonString) // removing this line creates the index OK but without the geo_shape
xb.endObject()
bulkRequest.add(esClient.prepareIndex("streets", "street", guid).setSource(xb))
//end loop
val bulkResponse:BulkResponse = bulkRequest.execute().actionGet()
if(bulkResponse.hasFailures){
println(bulkResponse.buildFailureMessage())
}
jsonString:
{
"id": "{98b8fd8d-074c-4349-a83b-6e892bf2d0ef}",
"shape": {
"type": "LineString",
"coordinates": [
[-70.81866815832467, 43.12187109162505],
[-70.83054813653018, 43.15917412985851],
[-70.81320737213957, 43.23522269547419],
[-70.90108590067649, 43.28102004268419]
],
"crs": {
"type": "name",
"properties": {
"name": "EPSG:4326"
}
}
}
}
Appreciate any feedback?
Thanks
It might be a bit late for you, but this could help someone facing a similar issue even nowadays.
Following your index mapping for the document streets, we have these properties: id and shape.
In your error message, it's described that:
shape must be an object consisting of type and coordinates
So for your concrete case, the crs object is just not accepted (I don't know exactly why you can't add extra parameters).
This is an example of how to add a document to the streets index using curl:
curl -X POST "localhost:9200/streets/_doc?pretty" -H 'Content-Type: application/json' -d '
{
"id": 123,
"shape": {
"type": "Polygon",
"coordinates": [
[
[
32.85444259643555,
39.928694653732364
],
[
32.847232818603516,
39.9257985682691
],
[
32.837791442871094,
39.91947941109337
],
[
32.837276458740234,
39.91579296675271
],
[
32.85392761230469,
39.913423004886894
],
[
32.86937713623047,
39.91329133793421
],
[
32.88036346435547,
39.91539797880347
],
[
32.85444259643555,
39.928694653732364
]
]
]
}
}'
If you need to add a LineString instead of a Polygon, change the type attribute of the shape and supply a flat list of coordinate pairs instead of the nested list of rings.
I hope this helps people who need to add documents with shapes to an Elasticsearch index.
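A further possible cause in the question's code, which the answer above does not confirm, is that jsonString (text that itself contains both id and shape) is passed as the value of the shape field. If that text ends up as a JSON string value rather than a nested object, the same "shape must be an object" error would be expected. The Python sketch below illustrates the difference under that assumption and also drops crs; build_street_doc is a hypothetical helper, and the coordinate list is shortened.

```python
import json

# Shortened copy of the jsonString from the question.
json_string = """
{
  "id": "{98b8fd8d-074c-4349-a83b-6e892bf2d0ef}",
  "shape": {
    "type": "LineString",
    "coordinates": [
      [-70.81866815832467, 43.12187109162505],
      [-70.83054813653018, 43.15917412985851]
    ],
    "crs": {"type": "name", "properties": {"name": "EPSG:4326"}}
  }
}
"""


def build_street_doc(raw):
    """Parse the raw JSON and keep only id plus the shape's type and coordinates."""
    parsed = json.loads(raw)
    shape = parsed["shape"]
    return {
        "id": parsed["id"],
        "shape": {"type": shape["type"], "coordinates": shape["coordinates"]},
    }


doc = build_street_doc(json_string)

# Embedding the raw text instead makes "shape" a JSON string, not an object.
wrong = {"id": "x", "shape": json_string}
```

In doc, shape is an object consisting only of type and coordinates, which is what the error message asks for.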
I am trying to store spatial data, in the form of GeoJSON, CSV files and shapefiles, in Elasticsearch using Python. I am new to Elasticsearch and, even after following the documentation, I am not able to index it successfully. Any help would be appreciated.
Sample GeoJSON file:
{
"type": "FeatureCollection",
"features": [
{
"type": "Feature",
"properties": {
"ID_0": 105,
"ISO": "IND",
"NAME_0": "India",
"ID_1": 1288,
"NAME_1": "Telangana",
"ID_2": 15715,
"NAME_2": "Telangana",
"VARNAME_2": null,
"NL_NAME_2": null,
"HASC_2": "IN.TS.AD",
"CC_2": null,
"TYPE_2": "State",
"ENGTYPE_2": "State",
"VALIDFR_2": "Unknown",
"VALIDTO_2": "Present",
"REMARKS_2": null,
"Shape_Leng": 8.103535,
"Shape_Area": 127258717496
},
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
79.14429367552918,
19.500257885106404
],
[
79.14582245808431,
19.498859172536427
],
[
79.14600496956801,
19.498823981691853
],
[
79.14966523737327,
19.495821705263914
]
]
]
}
}
]
}
Code
import geojson
from datetime import datetime
from elasticsearch import Elasticsearch, helpers

def geojson_to_es(gj):
    for feature in gj['features']:
        date = datetime.strptime("-".join(feature["properties"]["event_date"].split('-')[0:2]) + "-" + feature["properties"]["year"], "%d-%b-%Y")
        feature["properties"]["timestamp"] = int(date.timestamp())
        feature["properties"]["event_date"] = date.strftime('%Y-%m-%d')
        yield feature

with open("GeoObs.json") as f:
    gj = geojson.load(f)

es = Elasticsearch(hosts=[{'host': 'localhost', 'port': 9200}])
k = ({
    "_index": "YOUR_INDEX",
    "_source": feature,
} for feature in geojson_to_es(gj))
helpers.bulk(es, k)
Explanation
with open("GeoObs.json") as f:
    gj = geojson.load(f)
es = Elasticsearch(hosts=[{'host': 'localhost', 'port': 9200}])
This portion of the code loads an external geojson file, then connects to Elasticsearch.
k = ({
    "_index": "YOUR_INDEX",
    "_source": feature,
} for feature in geojson_to_es(gj))
helpers.bulk(es, k)
The parentheses here create a generator expression, which we feed to helpers.bulk(es, k). Remember that _source is, in Elasticsearch speak, the original data as is, i.e. our raw JSON. _index is just the index in which we want to put our data. You'll see other examples with _doc here. This is part of mapping types, which were deprecated in Elasticsearch 7.x and removed in 8.0.
def geojson_to_es(gj):
    for feature in gj['features']:
        date = datetime.strptime("-".join(feature["properties"]["event_date"].split('-')[0:2]) + "-" + feature["properties"]["year"], "%d-%b-%Y")
        feature["properties"]["timestamp"] = int(date.timestamp())
        feature["properties"]["event_date"] = date.strftime('%Y-%m-%d')
        yield feature
The function geojson_to_es is a generator that produces events. Instead of returning and finishing, a generator function resumes at the yield keyword each time the next value is requested. In this case, we are generating our GeoJSON features. In my code you also see:
date = datetime.strptime("-".join(feature["properties"]["event_date"].split('-')[0:2]) + "-" + feature["properties"]["year"], "%d-%b-%Y")
feature["properties"]["timestamp"] = int(date.timestamp())
feature["properties"]["event_date"] = date.strftime('%Y-%m-%d')
This is just an example of manipulating the data in the JSON before sending it out to Elasticsearch.
The key is that your mapping must have a field typed as geo_point or geo_shape. These data types are how Elasticsearch recognizes geo data. Example from my mapping file:
...
{
"properties": {
"geometry": {
"properties": {
"coordinates": {
"type": "geo_point"
},
"type": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
...
That is to say, before uploading your GeoJSON data with Python, you need to create your index, and then apply a mapping file which includes either geo_shape or geo_point using something like:
curl -X PUT "localhost:9200/YOUR_INDEX?pretty"
curl -X PUT "localhost:9200/YOUR_INDEX/_mapping?pretty" -H "Content-Type: application/json" -d @mapping.json
You must separate the GeoJSON features into (1) geometry and (2) properties/attributes parts. You cannot index GeoJSON features and feature collections directly (see documentation); only the geometry part is supported as a field type.
So your final indexable document would look somewhat flattened:
{
"ID_0": 105,
"ISO": "IND",
"NAME_0": "India",
"ID_1": 1288,
"NAME_1": "Telangana",
"ID_2": 15715,
"NAME_2": "Telangana",
"VARNAME_2": null,
"NL_NAME_2": null,
"HASC_2": "IN.TS.AD",
"CC_2": null,
"TYPE_2": "State",
"ENGTYPE_2": "State",
"VALIDFR_2": "Unknown",
"VALIDTO_2": "Present",
"REMARKS_2": null,
"Shape_Leng": 8.103535,
"Shape_Area": 127258717496,
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
79.14429367552918,
19.500257885106404
],
[
79.14582245808431,
19.498859172536427
],
[
79.14600496956801,
19.498823981691853
],
[
79.14966523737327,
19.495821705263914
]
]
]
}
}
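The flattening described above can be sketched in Python as follows. The helper names feature_to_doc and bulk_actions are hypothetical, and the sample uses a single Point geometry with only two of the properties to keep it short.

```python
def feature_to_doc(feature):
    """Flatten one GeoJSON Feature: properties at top level plus a geometry field."""
    doc = dict(feature.get("properties") or {})
    doc["geometry"] = feature["geometry"]
    return doc


def bulk_actions(feature_collection, index_name):
    """Yield one bulk action (a dict with _index and _source) per feature."""
    for feature in feature_collection["features"]:
        yield {"_index": index_name, "_source": feature_to_doc(feature)}


# Shortened sample; a real file would carry the full properties and geometry.
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"ISO": "IND", "NAME_1": "Telangana"},
            "geometry": {
                "type": "Point",
                "coordinates": [79.14429367552918, 19.500257885106404],
            },
        }
    ],
}
actions = list(bulk_actions(sample, "YOUR_INDEX"))
```

With the Python client, the generator could then be passed to the bulk helper, for example helpers.bulk(es, bulk_actions(gj, "YOUR_INDEX")), assuming the index already has geometry mapped as geo_shape.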
Using Elasticsearch 2.3.1 and the Python client, I'm getting unexpected results with geo search. In the code below I create a simple index. I then index two shapes: a big square and a small square completely contained in it (look at them here). I then query the index with the small square using all available strategies {'intersects', 'disjoint', 'within', 'contains'}.
from elasticsearch import Elasticsearch
es_client = Elasticsearch()
# CREATE INDEX
INDEX = "geo_shapes_2"
DOC_TYPE = "metros"
es_client.indices.delete(INDEX, ignore=404)
body = {
"mappings": {
"metro": {
"properties": {
"id": {
"type": "string",
"index": "not_analyzed"
},
"geometry": {
"type": "geo_shape",
"tree": "quadtree",
"precision": "50m",
"distance_error_pct": 0.025
}
}
}
}
}
es_client.indices.create(INDEX, body)
# INDEX TWO DOCS
# small_square is completely contained within big_square
small_square = 'small square'
body = {
'id': small_square,
'geometry': {
"type": "Polygon",
"coordinates": [
[
[
-106.248779296875,
35.12889434101051
],
[
-106.34765625,
35.88014896488361
],
[
-105.062255859375,
35.8356283888737
],
[
-105.13916015625,
35.05698043137265
],
[
-106.248779296875,
35.12889434101051
]
]
]
}
}
es_client.create(INDEX, DOC_TYPE, body=body, id=small_square)
big_square = 'big square'
body = {
'id': big_square,
'geometry': {
"type": "Polygon",
"coordinates": [
[
[
-108.97338867187499,
32.93492866908233
],
[
-108.69873046875,
38.19502155795573
],
[
-102.095947265625,
38.12591462924157
],
[
-102.39257812499999,
32.87036022808355
],
[
-108.97338867187499,
32.93492866908233
]
]
]
}
}
es_client.create(INDEX, DOC_TYPE, body=body, id=big_square)
# ISSUE ALL 4 STRATEGY QUERIES FOR SMALL SQUARE
body = {
'filter': {
'geo_shape': {
'geometry': {
'indexed_shape': {
'id': small_square,
'type': DOC_TYPE,
'index': INDEX,
'path': 'geometry',
},
}
}
}
}
for strategy in ['intersects', 'disjoint', 'within', 'contains']:
    body['filter']['geo_shape']['geometry']['relation'] = strategy
    response = es_client.search(INDEX, body=body)
    print strategy + ': ' + (', '.join([r['_source']['id'] for r in response['hits']['hits']]) or 'none')
Here I would expect the response to be:
intersects: big square, small square
disjoint: none
within: none (or maybe include small square depending upon the semantics of "within")
contains: big square, small square (or maybe omit small square depending upon the semantics of "contains")
Instead I see:
intersects: none
disjoint: big square, small square
within: none
and this is followed by an error
...
/Library/Python/2.7/site-packages/elasticsearch/connection/base.pyc in _raise_error(self, status_code, raw_data)
106 pass
107
--> 108 raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
109
110
RequestError: TransportError(400, u'search_phase_execution_exception', u'')
Am I doing something wrong? Or is this a bug?
Figured it out mostly. The above code has a typo in the mapping where the doc type is 'metro' but everywhere else I use DOC_TYPE of 'metros'.
But contains still returns an error.
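The geometric premise of the question, that the small square lies entirely inside the big one, can be checked independently of Elasticsearch. The sketch below uses a plain ray-casting test in planar lon/lat coordinates. It assumes the big polygon is convex, so testing the small polygon's vertices is enough. It says nothing about how Elasticsearch evaluates the relations.

```python
def point_in_ring(point, ring):
    """Ray-casting test: is [x, y] inside the closed ring (planar coordinates)?"""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Rings copied from the two documents indexed in the question.
small = [
    [-106.248779296875, 35.12889434101051],
    [-106.34765625, 35.88014896488361],
    [-105.062255859375, 35.8356283888737],
    [-105.13916015625, 35.05698043137265],
    [-106.248779296875, 35.12889434101051],
]
big = [
    [-108.97338867187499, 32.93492866908233],
    [-108.69873046875, 38.19502155795573],
    [-102.095947265625, 38.12591462924157],
    [-102.39257812499999, 32.87036022808355],
    [-108.97338867187499, 32.93492866908233],
]

small_inside_big = all(point_in_ring(p, big) for p in small)
```

Here small_inside_big comes out true, which supports the expected results listed in the question.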