In Elasticsearch I have stored a series of documents with the following structure:
{
"_index": "logstash-2018.05.10",
"_type": "doc",
"_id": "VM-QSWMBq8te9tFe-bcj",
"_version": 1,
"_score": null,
"_source": {
"location": {
"lat": 42,
"lon": 12.5
},
"#timestamp": "2018-05-10T10:20:24.988Z",
"port": 53950,
"tags": [
"geoinfo"
],
"host": "gateway",
"#version": "1",
"message": "{\"#version\":1,\"level\":\"INFO\",\"logger_name\":\"it.test.elk.ELKTestApplication\",\"appName\":\"docker-elk-master\",\"thread_name\":\"main\",\"message\":\"LAT: 42, LON: 12.5\"}\r",
"type": "java"
},
"fields": {
"#timestamp": [
"2018-05-10T10:20:24.988Z"
]
},
"sort": [
1525947624988
]
}
The corresponding index mapping looks like this (I'm only showing a small part):
{
"logstash-2018.05.10": {
"aliases": {},
"mappings": {
"doc": {
...
"properties": {
...
"geoip": {
"dynamic": "true",
"properties": {
"ip": {
"type": "ip"
},
"latitude": {
"type": "half_float"
},
"location": {
"type": "geo_point"
},
"longitude": {
"type": "half_float"
}
}
},
...
"location": {
"properties": {
"lat": {
"type": "float"
},
"lon": {
"type": "float"
}
}
},
...
}
}
},
...
}
}
}
On Kibana -> Coordinates Map, I tried to show the geospatial information contained in the location field, but nothing is displayed.
Maybe location should be a geo_point?
What am I doing wrong?
In the Discover view, if the field appears with a ? next to it (or it doesn't have a globe icon next to it), it means that Kibana doesn't know it's a geo_point (even if Elasticsearch does). To fix that, you need to go into the Kibana index pattern settings (for logstash-*) and hit the refresh button.
After you've hit refresh, type location into the search box on the index pattern screen and make sure it shows up as a geo_point.
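If you want to double-check the Elasticsearch side first, you can ask for the mapping directly. A quick sketch, assuming the default host and port and the index name from the document above:
# default host/port assumed; index name taken from the question
curl -XGET 'localhost:9200/logstash-2018.05.10/_mapping/doc?pretty'
In the response, a field Kibana can plot must report "type": "geo_point". Note that in the mapping excerpt above, location is a plain object with float lat/lon sub-fields, while geoip.location is the actual geo_point; refreshing the index pattern only helps if the field really is a geo_point on the Elasticsearch side.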
I have a weird problem with Elasticsearch 6.0.
I have an index with the following mapping:
{
"cities": {
"mappings": {
"cities": {
"properties": {
"city": {
"properties": {
"id": {
"type": "long"
},
"name": {
"properties": {
"en": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"it": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"slug": {
"properties": {
"en": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"it": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
},
"doctype": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"suggest": {
"type": "completion",
"analyzer": "accents",
"search_analyzer": "simple",
"preserve_separators": true,
"preserve_position_increments": false,
"max_input_length": 50
},
"weight": {
"type": "long"
}
}
}
}
}
}
I have these documents in my index:
{
"_index": "cities",
"_type": "cities",
"_id": "991-city",
"_version": 128,
"found": true,
"_source": {
"doctype": "city",
"suggest": {
"input": [
"nazaré",
"nazare",
"나자레",
"najare",
"najale",
"ナザレ",
"Ναζαρέ"
],
"weight": 1807
},
"weight": 3012,
"city": {
"id": 991,
"name": {
"en": "Nazaré",
"it": "Nazaré"
},
"slug": {
"en": "nazare",
"it": "nazare"
}
}
}
}
{
"_index": "cities",
"_type": "cities",
"_id": "1085-city",
"_version": 128,
"found": true,
"_source": {
"doctype": "city",
"suggest": {
"input": [
"nazareth",
"nazaret",
"拿撒勒",
"na sa le",
"sa le",
"le",
"na-sa-lei",
"나사렛",
"nasares",
"nasales",
"ナザレス",
"nazaresu",
"नज़ारेथ",
"nj'aareth",
"aareth",
"najaratha",
"Назарет",
"Ναζαρέτ",
"názáret",
"nazaretas"
],
"weight": 1809
},
"weight": 3015,
"city": {
"id": 1085,
"name": {
"en": "Nazareth",
"it": "Nazareth"
},
"slug": {
"en": "nazareth",
"it": "nazareth"
}
}
}
}
Now, when I search using the suggester, with the following query:
POST /cities/_search
{
"suggest":{
"suggest":{
"prefix":"nazare",
"completion":{
"field":"suggest"
}
}
}
}
I expect to have both documents in my results, but I only get the second one (nazareth) back:
{
"took": 0,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 0,
"max_score": 0.0,
"hits": []
},
"suggest": {
"suggest": [
{
"text": "nazare",
"offset": 0,
"length": 6,
"options": [
{
"text": "nazaresu",
"_index": "cities",
"_type": "cities",
"_id": "1085-city",
"_score": 1809.0,
"_source": {
"doctype": "city",
"suggest": {
"input": [
"nazareth",
"nazaret",
"拿撒勒",
"na sa le",
"sa le",
"le",
"na-sa-lei",
"나사렛",
"nasares",
"nasales",
"ナザレス",
"nazaresu",
"नज़ारेथ",
"nj'aareth",
"aareth",
"najaratha",
"Назарет",
"Ναζαρέτ",
"názáret",
"nazaretas"
],
"weight": 1809
},
"weight": 3015,
"city": {
"id": 1085,
"name": {
"en": "Nazareth",
"it": "Nazareth"
},
"slug": {
"en": "nazareth",
"it": "nazareth"
}
}
}
}
]
}
]
}
}
This is unexpected, because in the suggester input of the first document, the term I searched for, "nazare", appears exactly as I typed it.
Another fun fact: if I search for "najare" instead of "nazare", I get the correct results.
Any hint would be really appreciated!
For a quick solution, use the size parameter in the completion object of your query.
GET /cities/_search
{
"suggest":{
"suggest":{
"prefix":"nazare",
"completion":{
"field":"suggest",
"size": 100 <- HERE
}
}
}
}
The size parameter defaults to 5, so once Elasticsearch has found 5 terms (not documents) with the matching prefix, it stops looking for more terms (and consequently documents).
This limit is per term, not per document. So if one document contains 5 terms with the matching prefix and you use the default value of 5, the other documents may not be returned at all.
I strongly believe that this is what's happening in your case: the returned document has at least 5 suggest terms with the prefix nazare, so only that one is returned.
As for your fun fact: when you search for najare, there is only one term with the matching prefix, so you get the correct result.
The tricky part is that the results depend on the order in which Elasticsearch retrieves the documents. If the first document had been retrieved first, it would not have reached the size threshold (only 2 or 3 prefix occurrences), the next document would also have been retrieved, and you would have gotten the correct result.
Also, unless necessary, avoid a very high value (e.g. > 1000) for the size parameter. It might impact performance, particularly for short or common prefixes.
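One related detail: each suggestion option carries the full _source of its document, so a large size also inflates the response payload. If you only need the suggestion text and the document ids, you can disable source fetching in the same request. A minimal sketch, based on the query above:
POST /cities/_search
{
  "_source": false,
  "suggest": {
    "suggest": {
      "prefix": "nazare",
      "completion": {
        "field": "suggest",
        "size": 100
      }
    }
  }
}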
I am reindexing my index data from ES 5.0 (parent-child) to ES 6.2 (join type).
Data in the ES 5.0 index is stored as parent-child documents in separate types, and for the reindex I have created a new index/mapping based on 6.2 in my new cluster.
The parent documents reindex flawlessly to the new index, but the child documents throw the error below:
{
"index": "index_two",
"type": "_doc",
"id": "AVpisCkMuwDYFnQZiFXl",
"cause": {
"type": "mapper_parsing_exception",
"reason": "failed to parse",
"caused_by": {
"type": "illegal_argument_exception",
"reason": "[routing] is missing for join field [field_relationship]"
}
},
"status": 400
}
The script I am using to reindex the data:
{
"source": {
"remote": {
"host": "http://myescluster.com:9200",
"socket_timeout": "1m",
"connect_timeout": "20s"
},
"index": "index_two",
"type": ["actions"],
"size": 5000,
"query":{
"bool":{
"must":[
{"term": {"client_id.raw": "cl14ous0ydao"}}
]
}
}
},
"dest": {
"index": "index_two",
"type": "_doc"
},
"script": {
"params": {
"jdata": {
"name": "actions"
}
},
"source": "ctx._routing=ctx._routing;ctx.remove('_parent');params.jdata.parent=ctx._source.user_id;ctx._source.field_relationship=params.jdata"
}
}
I have set the routing field in the Painless script, since the documents come dynamically from the source index.
Mapping of the destination index:
{
"index_two": {
"mappings": {
"_doc": {
"dynamic_templates": [
{
"template_actions": {
"match_mapping_type": "string",
"mapping": {
"fields": {
"raw": {
"index": true,
"ignore_above": 256,
"type": "keyword"
}
},
"type": "text"
}
}
}
],
"date_detection": false,
"properties": {
"attributes": {
"type": "nested"
},
"cl_other_params": {
"type": "nested"
},
"cl_triggered_ts": {
"type": "date"
},
"cl_utm_params": {
"type": "nested"
},
"end_ts": {
"type": "date"
},
"field_relationship": {
"type": "join",
"eager_global_ordinals": true,
"relations": {
"users": [
"actions",
"segments"
]
}
},
"ip_address": {
"type": "ip"
},
"location": {
"type": "geo_point"
},
"processed_ts": {
"type": "date"
},
"processing_time": {
"type": "date"
},
"products": {
"type": "nested",
"properties": {
"traits": {
"type": "nested"
}
}
},
"segment_id": {
"type": "integer"
},
"start_ts": {
"type": "date"
}
}
}
}
}
}
My sample source document:
{
"_index": "index_two",
"_type": "actions",
"_id": "AVvKUYcceQCc2OyLKWZ9",
"_score": 7.4023576,
"_routing": "cl14ous0ydaob71ab2a1-837c-4904-a755-11e13410fb94",
"_parent": "cl14ous0ydaob71ab2a1-837c-4904-a755-11e13410fb94",
"_source": {
"user_id": "cl14ous0ydaob71ab2a1-837c-4904-a755-11e13410fb94",
"client_id": "cl14ous0ydao",
"session_id": "CL-e0ec3941-6dad-4d2d-bc9b",
"source": "betalist",
"action": "pageview",
"action_type": "pageview",
"device": "Desktop",
"ip_address": "49.35.14.224",
"location": "20.7333 , 77",
"attributes": [
{
"key": "url",
"value": "https://www.google.com/",
"type": "string"
}
],
"products": []
}
}
I had the same issue, and searching the Elasticsearch discussion forums I found this, which works:
POST _reindex
{
"source": {
"index": "old_index",
"type": "actions"
},
"dest": {
"index": "index_two"
},
"script": {
"source": """
ctx._type = "_doc";
String routingCode = ctx._source.user_id;
Map join = new HashMap();
join.put('name', 'actions');
join.put('parent', routingCode);
ctx._source.put('field_relationship', join);
ctx._parent = null;
ctx._routing = routingCode;"""
}
}
Hope this helps :)
I'd like to point out that routing is generally not required for the parent documents of a join field; however, if you're creating a child before its parent exists, you're going to face this problem.
It's advisable to reindex all the parents first, then the children.
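For reference, a correctly routed child document looks like the sketch below when indexed by hand (ids taken from the sample document in the question, body trimmed to the relevant fields); the routing value must be the parent's id so that parent and child land on the same shard:
PUT index_two/_doc/AVvKUYcceQCc2OyLKWZ9?routing=cl14ous0ydaob71ab2a1-837c-4904-a755-11e13410fb94
{
  "user_id": "cl14ous0ydaob71ab2a1-837c-4904-a755-11e13410fb94",
  "client_id": "cl14ous0ydao",
  "field_relationship": {
    "name": "actions",
    "parent": "cl14ous0ydaob71ab2a1-837c-4904-a755-11e13410fb94"
  }
}
This is exactly what the reindex script above produces: _routing set to user_id, and field_relationship carrying the join name plus the parent id.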
I'd like to show my dataset on a tile map. I'm using Kibana 4.1.1.
My data looks like this:
{
"_index": "business-data",
"_type": "users",
"_id": "AVkRMFztZOUsFUpKvZ-0",
"_score": 1,
"_source": {
"first_name": "Nessa",
"gender": "female",
"location": {
"lat": 48.8668481401949,
"lon": 2.19256871957155
}
}
}
The mapping:
{
"mappings": {
"user": {
"properties": {
"first_name": {
"type": "string"
},
"gender": {
"type": "string"
},
"location": {
"type": "geo_point"
}
}
}
}
}
Location is a valid geo_point.
The tile map is shown while the visualisation is being created, but it says "No result" when the basic geohash aggregation by the location field is requested.
I managed to do it with your advice, thanks. Anyway, the Kibana index is not event-based in my case.
New mapping:
{
"mappings": {
"user": {
"properties": {
"#timestamp" : {
"type" : "date",
"format" : "dateOptionalTime"
},
"user_id": {
"type": "integer"
},
"first_name": {
"type": "string"
},
"gender": {
"type": "string"
},
"age": {
"type": "integer"
},
"location": {
"type": "geo_point"
}
}
}
}
}
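To rule Kibana out, you can also run the same kind of geohash bucketing the tile map uses directly against Elasticsearch. A sketch, with the index name business-data taken from the sample document and an arbitrary precision:
# index name from the question; precision chosen arbitrarily
curl -XGET 'localhost:9200/business-data/_search?pretty' -d '{
  "size": 0,
  "aggs": {
    "points": {
      "geohash_grid": {
        "field": "location",
        "precision": 3
      }
    }
  }
}'
If the buckets come back non-empty here, the mapping is fine and the problem is on the Kibana side (typically the index pattern or the time filter).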
I created the index using:
curl -XPUT localhost:9200/mobapp -d '{
"mappings": {
"publish_messages": {
"properties": {
"title": {
"type": "string"
},
"location": {
"type": "nested",
"position": {
"type": "geo_point"
},
"name": {
"type": "string"
},
"state": {
"type": "string"
},
"country": {
"type": "string"
},
"city": {
"type": "integer"
}
},
"time": {
"type": "date",
"format": "dd-MM-YYYY"
}
}
}
}
}'
My index contains documents like this:
"hits": [
{
"_index": "mobapp",
"_type": "publish_messages",
"_id": "184123e0-6123-11e5-83d5-7bdc2a9aa3c7",
"_score": 1,
"_source": {
"title": "Kolkata rocka",
"tags": [
"Tag5",
"Tag4"
],
"date": "2015-09-22T12:11:46.335Z",
"location": {
"position": {
"lat": 11.81776,
"lon": 10.9376
},
"country": "India",
"locality": "Bengaluru",
"sublocality_level_1": "Koramangala"
}
}
}
]
I am trying to do this query:
FilterBuilder filter = geoDistanceFilter("location")
.point(lat, lon)
.distance(distanceRangeInkm, DistanceUnit.KILOMETERS)
.optimizeBbox("memory")
.geoDistance(GeoDistance.ARC);
FilterBuilder boolFilter = boolFilter()
.must(termFilter("tags", tag))
.must(filter);
GeoDistanceSortBuilder geoSort = SortBuilders.geoDistanceSort("location").point(lat, lon).order(SortOrder.ASC);
SearchResponse searchResponse
= client.prepareSearch(AppConstants.ES_INDEX)
.setTypes("publish_messages")
.addSort("time", SortOrder.DESC)
.addSort(geoSort)
.setSearchType(SearchType.DFS_QUERY_THEN_FETCH)
.setPostFilter(boolFilter)
.setFrom(startPage).setSize(AppConstants.DEFAULT_PAGINATION_SIZE)
.execute()
.actionGet();
I am getting: QueryParsingException[[mobapp] failed to find geo_point field [location.position]]
If you only want to keep your location data together, you don't need to use the nested type; simply use a normal object type (i.e. the default), like this:
curl -XPUT localhost:9200/mobapp -d '{
"mappings": {
"publish_messages": {
"properties": {
"title": {
"type": "string"
},
"location": {
"type": "object", <--- use object here
"properties": { <--- and don't forget properties here
"position": {
"type": "geo_point"
},
"name": {
"type": "string"
},
"state": {
"type": "string"
},
"country": {
"type": "string"
},
"city": {
"type": "integer"
}
}
},
"time": {
"type": "date",
"format": "dd-MM-YYYY"
}
}
}
}
}'
Note that you first need to wipe out your current index using curl -XDELETE localhost:9200/mobapp and then recreate it with the above command and reindex your data. Your query should work afterwards.
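One more detail worth checking: with this mapping the full path of the geo field is location.position, so the distance filter and the geo sort must reference that path instead of location. A quick way to verify from the command line, as a sketch in the 1.x query DSL (the distance is a placeholder; the coordinates are taken from the sample document):
# field path is location.position, not location; distance is a placeholder
curl -XGET 'localhost:9200/mobapp/_search?pretty' -d '{
  "query": {
    "filtered": {
      "filter": {
        "geo_distance": {
          "distance": "50km",
          "location.position": {
            "lat": 11.81776,
            "lon": 10.9376
          }
        }
      }
    }
  }
}'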
I'm using XDCR replication to sync the data between Couchbase and Elasticsearch, using the Couchbase transport plugin for Elasticsearch.
As far as I understand, all documents coming from Couchbase get the type "couchbaseDocument". But I have different document types, with a specific mapping for each one.
Is there a way to get a specific dynamic type instead of the default "couchbaseDocument"?
(i.e., if the JSON document has a field "type":"beer" it will be indexed in ES as _type:"beer", and if "type":"wine" it will be indexed as _type:"wine")
What I have in Couchbase:
bucket: "drinks",
beer_1234:
{
"type": "beer",
"name": "leffe"
}
How it's indexed in Elasticsearch:
{
"_index": "drinks",
"_type": "couchbaseDocument", // <======================== ????
"_id": "beer_1234",
"_version": 1,
"_source": {
"doc": {
"type": "beer",
"name": "leffe"
},
"meta": {
"id": "beer_1234",
"rev": "9-000049e945bd62fa0000000000000000",
"expiration": 0,
"flags": 0
}
}
}
What I need:
{
"_index": "drinks",
"_type": "beer", // <======================== NICE TYPE
"_id": "beer_1234",
"_version": 1,
"_source": {
"doc": {
"type": "beer",
"name": "leffe"
},
"meta": {
"id": "beer_1234",
"rev": "9-000049e945bd62fa0000000000000000",
"expiration": 0,
"flags": 0
}
}
}
Thanks
The idea is to modify the default transport mapping so that your type field gets indexed. For example:
curl -XPUT 'http://localhost:9200/drinks/' -d '{
"mappings": {
"couchbaseCheckpoint": {
"dynamic": "true",
"_source": {
"includes": [
"doc.*"
]
},
"dynamic_templates": [
{
"store_no_index": {
"match": "*",
"mapping": {
"store": "no",
"index": "no",
"include_in_all": false
}
}
}
]
},
"couchbaseDocument": {
"_all": {
"enabled": false
},
"dynamic": "true",
"_source": {
"includes": [
"meta.*"
]
},
"dynamic_templates": [
{
"all_strings_to_avoid_collisions": {
"match": "*",
"mapping": {
"store": "no",
"index": "not_analyzed",
"include_in_all": false,
"type": "string",
"analyzer": "whitespace"
}
}
}
],
"properties": {
"doc": {
"properties": {
"type": {
"type": "string"
}
}
},
"meta": {
"properties": {
"id": {
"type": "string",
"analyzer": "whitespace"
}
}
}
}
}
}
}'
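With the type field indexed like this, you can at least filter on it explicitly, even though _type stays couchbaseDocument. A sketch using the bucket and field names from the question:
# doc.type is the field indexed by the mapping above
curl -XGET 'localhost:9200/drinks/_search?pretty' -d '{
  "query": {
    "term": {
      "doc.type": "beer"
    }
  }
}'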