Date range search in Elassandra - elasticsearch

I have created a index like below.
curl -XPUT -H 'Content-Type: application/json' 'http://x.x.x.x:9200/date_index' -d '{
"settings" : { "keyspace" : "keyspace1"},
"mappings" : {
"table1" : {
"discover":"sent_date",
"properties" : {
"sent_date" : { "type": "date", "format": "yyyy-MM-dd HH:mm:ssZZ" }
}
}
}
}'
I need to search the results pertaining to date range, example "from" : "2039-05-07 11:22:34+0000", "to" : "2039-05-07 11:22:34+0000" both inclusive.
I am trying like this,
curl -XGET -H 'Content-Type: application/json' 'http://x.x.x.x:9200/date_index/_search?pretty=true' -d '
{
"query" : {
"aggregations" : {
"date_range" : {
"sent_date" : {
"from" : "2039-05-07 11:22:34+0000",
"to" : "2039-05-07 11:22:34+0000"
}
}
}
}
}'
I am getting error as below.
"error" : {
"root_cause" : [
{
"type" : "parsing_exception",
"reason" : "no [query] registered for [aggregations]",
"line" : 4,
"col" : 22
}
],
"type" : "parsing_exception",
"reason" : "no [query] registered for [aggregations]",
"line" : 4,
"col" : 22
},
"status" : 400
Please advise.

The query seems to be malformed. Please see the date range aggregation documentation at https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-daterange-aggregation.html and note the differences:
you're introducing a query without defining any - do you need one?
you should use aggs instead of aggregations
you should name your aggregation

Related

Elasticsearch: strict_dynamic_mapping_exception

Hi,
I am trying to modify the date format in an elasticsearch index (operate-operation-0.26.0_). But I get the following error.
{
"took" : 148,
"errors" : true,
"items" : [
{
"index" : {
"_index" : "operate-operation-0.26.0_",
"_type" : "_doc",
"_id" : "WBGhSXcB_hD8-yfn-Rh5",
"status" : 400,
"error" : {
"type" : "strict_dynamic_mapping_exception",
"reason" : "mapping set to strict, dynamic introduction of [dynamic] within [_doc] is not allowed"
}
}
}
]
}
The json file I am using is bulk6.json:
{"index":{}}
{"dynamic":"strict","properties":{"date":{"type":"date","format":"yyyy-MM-dd'T'HH:mm:ss.SSSZZ"}}}
The command I am running is
curl -H "Content-Type: application/x-ndjson" -XPOST 'localhost:9200/operate-operation-0.26.0_/_bulk?pretty&refresh' --data-binary #"bulk6.json"
The _bulk API endpoint is not meant for changing mappings. You need to use the _mapping API endpoint like this:
The JSON file mapping.json should contain:
{
"dynamic": "strict",
"properties": {
"date": {
"type": "date",
"format": "yyyy-MM-dd'T'HH:mm:ss.SSSZZ"
}
}
}
And then the call can be made like this:
curl -H "Content-Type: application/json" -XPUT 'localhost:9200/operate-operation-0.26.0_/_mapping?pretty&refresh' --data-binary #"mapping.json"
However, this is still not going to work as you're not allowed to change the date format after the index has been created. You're going to get the following error:
{
"error" : {
"root_cause" : [
{
"type" : "illegal_argument_exception",
"reason" : "Mapper for [date] conflicts with existing mapper:\n\tCannot update parameter [format] from [strict_date_optional_time||epoch_millis] to [yyyy-MM-dd'T'HH:mm:ss.SSSZZ]"
}
],
"type" : "illegal_argument_exception",
"reason" : "Mapper for [date] conflicts with existing mapper:\n\tCannot update parameter [format] from [strict_date_optional_time||epoch_millis] to [yyyy-MM-dd'T'HH:mm:ss.SSSZZ]"
},
"status" : 400
}
You need to create a new index with the desired correct mapping and reindex your data.

'Unknown BaseAggregationBuilder [composite] error' when running elasticsearch composite aggregation

I'm trying to create a composite aggregation per the documentation here:
https://www.elastic.co/guide/en/elasticsearch/reference/6.8/search-aggregations-bucket-composite-aggregation.html
I'm basically following this example:
curl -X GET "localhost:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
"aggs" : {
"my_buckets": {
"composite" : {
"sources" : [
{ "product": { "terms" : { "field": "product" } } }
]
}
}
}
}
'
but every time I try to run the code I get the below error regardless of which field I try to aggregate on:
{
"error" : {
"root_cause" : [
{
"type" : "unknown_named_object_exception",
"reason" : "Unknown BaseAggregationBuilder [composite]",
"line" : 5,
"col" : 27
}
],
"type" : "unknown_named_object_exception",
"reason" : "Unknown BaseAggregationBuilder [composite]",
"line" : 5,
"col" : 27
},
"status" : 400
}
I did some digging around and haven't seen the error 'Unknown BaseAggregationBuilder [composite]' come up anywhere else so I thought I'd post this question here to see if anyone has run into a similar issue. Cardinality and regular terms aggregation work fine. Also to clarify, I'm running on v6.8
Composite aggs were released in 6.1.0. The error sounds like you cannot possibly be using >=6.1 but some older ver.
What's your version.number when you run curl -X GET "localhost:9200"?

Elasticsearch How do I get a metadata using Image Plugin

I defined matadata by the mapping of the Elasticsearch image Plugin.
Mapping:
"photo" : {
"mappings" : {
"scenery" : {
"properties" : {
"my_img" : {
"type" : "image",
"feature" : {"FCTH" : { }, ... },
"metadata" : {
"jpeg.image_height" : {"type" : "string","store" : true},
"jpeg.image_width" : {"type" : "string","store" : true}
}
}
}
}
}
}
After an index, although searched, metadata does not return.
How do I get a metadata?
I tried:
curl -XPOST 'localhost:9200/photo/scenery/_search' -d '{
"query":{
"image":{
"my_img":{
"feature":"CEDD",
"index":"photo",
"type":"scenery",
"id":"0",
"path":"my_img",
"hash":"BIT_SAMPLING"
}
}
}
}'
Result:
{"took":14,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":5,"max_score":1.0,"hits":[{"_index":"photo","_type":"scenery","_id":"0","_score":1.0, "_source" : {"file_name": "376423.jpg", "my_img": "/9j/4AAQSkZJRgABAQ...
Perhaps, the original data (base64 encoded image) will be returned _source field. You can use that instead, the fields option.
Try this query.
curl -XPOST 'localhost:9200/photo/scenery/_search' -d '{
"query":{
...
},
"fields": ["my_img.metadata.jpeg.image_height","my_img.metadata.jpeg.image_width" ]
}'

Geo Distance Filter with ElasticSearch always returns 0 results

I am using the Geo Distance Filter with ElasticSearch and no matter what distance I search for, elasticsearch 0.90.11 returns zero results.
Here's what I did: First, delete/create a new index with the geo mapping:
curl -XDELETE 'http://localhost:9200/photos'
curl -XPOST 'http://localhost:9200/photos' -d '
{
"mappings": {
"pin" : {
"properties" : {
"location" : {
"type" : "geo_point"
}
}
}
}
}
'
Then, add a document:
curl -XPOST 'http://localhost:9200/photos/photo?pretty=1' -d '
{
"pin" : {
"location" : {
"lat" : 46.8,
"lon" : -71.2
}
},
"file" : "IMG_2115.JPG"
}
'
Then search:
curl -XGET 'http://localhost:9200/photos/_search?pretty=1&size=20' -d '
{
"query" : {
"match_all" : {}
},
"filter" : {
"geo_distance" : {
"distance" : "10km",
"pin.location" : {
"lat" : "46.8",
"lon" : "-71.2"
}
}
}
}
'
But the search yields zero hits:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 0,
"max_score" : null,
"hits" : [ ]
}
}
Anyone know why it didn't find the document within the radius? As an interesting side note, the "filtered" syntax having both "query" and "filter" as subfields as described in the document referenced above seems to no longer work at all, seems like now "query" and "filter" need to be at the top level of the json query.
Any help appreciated ...
Turns out that the offical manual page is not accurate, the mapping needs to be
[Sun Feb 9 11:01:55 2014] # Request to: http://localhost:9200
curl -XPUT 'http://localhost:9200/photos?pretty=1' -d '
{
"mappings" : {
"photo" : {
"properties" : {
"Location" : {
"type" : "geo_point"
}
}
}
}
}
'
and the records with the GPS data then need to be added like this:
[Sun Feb 9 11:01:56 2014] # Request to: http://localhost:9200
curl -XPOST 'http://localhost:9200/photos/photo?pretty=1' -d '
{
"file" : "/home/mschilli/iphone/IMG_2115.JPG",
"Location" : [
46.8,
-71.2
]
}
'
Then later the query
[Sun Feb 9 11:05:00 2014] # Request to: http://localhost:9200
curl -XGET 'http://localhost:9200/photos/_search?pretty=1&size=100' -d '
{
"filter" : {
"geo_distance" : {
"distance" : "1km",
"Location" : [
36.986,
-121.443333333333
]
}
},
"query" : {
"match_all" : {}
}
}
'
will show the desired result, correctly filtering out results outside the selected GPS distance.
I have similar problem and spent lot of time to solve it. So maybe it can be helpful to someone.
The problem may be that if you want search by distance on nested filed property like:
something:
properties:
name:
type: text
city:
type: nested
properties:
location:
type: geo_point
You need add nested query inside main query: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-nested-query.html

Sort the basic of Number of documents in ElasticSearch

I am saving user relations in ES Index
i.e
{'id' => 1, 'User_id_1' => '2001', 'relation' => 'friend', 'User_id_2' => '1002'}
{'id' => 2, 'User_id_1' => '2002', 'relation' => 'friend', 'User_id_2' => '1002'}
{'id' => 3, 'User_id_1' => '2002', 'relation' => 'friend', 'User_id_2' => '1001'}
{'id' => 4, 'User_id_1' => '2003', 'relation' => 'friend', 'User_id_2' => '1003'}
no suppose i want to get the user_id_2 who has most friends,
in above case its 1002 as 2001, and 2002 are its friends. (Count = 2)
I just can't figure out the query
Thanks.
EDIT:
Well as suggested by #imotov, term facets is very good choice, but
The problem I have is 2 Indexes
1st index is for saving the main docs and 2nd index for saving the relations
now problem is
Suppose I have 100 USER Docs in my main index, only 50 of them has made relations, so I'll have only 50 USER Docs in my relationship index
So when i implement the "term facet", it sorts the results and gives the correct output i want, but I am missing those left 50 users who don't have any relations yet, i need them in my final output after the 50 sorted users.
First of all, we need to ensure that relationships saved in ES are unique. It can be done by replacing arbitrary ids with ids constructed from user_id_1, relation and user_id_2. We also need to make sure that analyzer for user_ids doesn't produce multiple tokens. If ids are strings, they have to be indexed not_analyzed. With these two conditions satisfied, we can simply use terms facet query for the field user_id_2 on the result list limited by relation:friend. This query will retrieve top user_id_2 ids sorted by number of occurrences in the index. All together it could look something like this:
curl -XPUT http://localhost:9200/relationships -d '{
"mappings" : {
"relation" : {
"_source" : {"enabled" : false },
"properties" : {
"user_id_1": { "type": "string", "index" : "not_analyzed"},
"relation": { "type": "string", "index" : "not_analyzed"},
"user_id_2": { "type": "string", "index" : "not_analyzed"}
}
}
}
}'
curl -XPUT http://localhost:9200/relationships/relation/2001-friend-1002 -d '{"user_id_1": "2001", "relation":"friend", "user_id_2": "1002"}'
curl -XPUT http://localhost:9200/relationships/relation/2002-friend-1002 -d '{"user_id_1": "2002", "relation":"friend", "user_id_2": "1002"}'
curl -XPUT http://localhost:9200/relationships/relation/2002-friend-1001 -d '{"user_id_1": "2002", "relation":"friend", "user_id_2": "1001"}'
curl -XPUT http://localhost:9200/relationships/relation/2003-friend-1003 -d '{"user_id_1": "2003", "relation":"friend", "user_id_2": "1003"}'
curl -XPOST http://localhost:9200/relationships/_refresh
echo
curl -XGET 'http://localhost:9200/relationships/relation/_search?pretty=true&search_type=count' -d '{
"query": {
"term" : {
"relation" : "friend"
}
},
"facets" : {
"popular" : {
"terms" : {
"field" : "user_id_2"
}
}
}
}'
Please, note that due to distributed nature of facets calculation, counts reported by the facet query might be lower than the actual number of records if multiple shards are used. See elasticsearch issue 1832
EDIT:
There are two solutions for the edited question. One solution is to use facet on two fields:
curl -XPUT http://localhost:9200/relationships -d '{
"mappings" : {
"relation" : {
"_source" : {"enabled" : false },
"properties" : {
"user_id_1": { "type": "string", "index" : "not_analyzed"},
"relation": { "type": "string", "index" : "not_analyzed"},
"user_id_2": { "type": "string", "index" : "not_analyzed"}
}
}
}
}'
curl -XPUT http://localhost:9200/users -d '{
"mappings" : {
"user" : {
"_source" : {"enabled" : false },
"properties" : {
"user_id": { "type": "string", "index" : "not_analyzed"}
}
}
}
}'
curl -XPUT http://localhost:9200/users/user/1001 -d '{"user_id": 1001}'
curl -XPUT http://localhost:9200/users/user/1002 -d '{"user_id": 1002}'
curl -XPUT http://localhost:9200/users/user/1003 -d '{"user_id": 1003}'
curl -XPUT http://localhost:9200/users/user/1004 -d '{"user_id": 1004}'
curl -XPUT http://localhost:9200/users/user/1005 -d '{"user_id": 1005}'
curl -XPUT http://localhost:9200/relationships/relation/2001-friend-1002 -d '{"user_id_1": "2001", "relation":"friend", "user_id_2": "1002"}'
curl -XPUT http://localhost:9200/relationships/relation/2002-friend-1002 -d '{"user_id_1": "2002", "relation":"friend", "user_id_2": "1002"}'
curl -XPUT http://localhost:9200/relationships/relation/2002-friend-1001 -d '{"user_id_1": "2002", "relation":"friend", "user_id_2": "1001"}'
curl -XPUT http://localhost:9200/relationships/relation/2003-friend-1003 -d '{"user_id_1": "2003", "relation":"friend", "user_id_2": "1003"}'
curl -XPOST http://localhost:9200/relationships/_refresh
curl -XPOST http://localhost:9200/users/_refresh
echo
curl -XGET 'http://localhost:9200/relationships,users/_search?pretty=true&search_type=count' -d '{
"query": {
"indices" : {
"indices" : ["relationships"],
"query" : {
"filtered" : {
"query" : {
"term" : {
"relation" : "friend"
}
},
"filter" : {
"type" : {
"value" : "relation"
}
}
}
},
"no_match_query" : {
"filtered" : {
"query" : {
"match_all" : { }
},
"filter" : {
"type" : {
"value" : "user"
}
}
}
}
}
},
"facets" : {
"popular" : {
"terms" : {
"fields" : ["user_id", "user_id_2"]
}
}
}
}'
Another solution is to add "self" relation to the relationships index for every user when user is created. I would prefer the second solution since it seems to be less complicated.

Resources