Need help with a complicated aggregation in an Elasticsearch query

I am trying to write an aggregation query with Elasticsearch 6.8.
I want to find the last document on a specific date and the last document before that date, grouped by a document field.
curl -X PUT "http://localhost:9200/test/xxx/1" -H 'Content-Type: application/json' -d'{"c1" : "1","c2": "1-1","ts": "2020-01-01T06:00:00.000+0000", "rec_type": "t1"}'
curl -X PUT "http://localhost:9200/test/xxx/2" -H 'Content-Type: application/json' -d'{"c1" : "1","c2": "1-2","ts": "2020-01-02T06:00:00.000+0000", "rec_type": "t1"}'
curl -X PUT "http://localhost:9200/test/xxx/3" -H 'Content-Type: application/json' -d'{"c1" : "1","c2": "1-3","ts": "2020-03-16T06:00:00.000+0000", "rec_type": "t1"}'
curl -X PUT "http://localhost:9200/test/xxx/4" -H 'Content-Type: application/json' -d'{"c1" : "1","c2": "1-4","ts": "2020-03-16T09:00:00.000+0000", "rec_type": "t1"}'
curl -X PUT "http://localhost:9200/test/xxx/5" -H 'Content-Type: application/json' -d'{"c1" : "2","c2": "2-1","ts": "2020-01-01T06:00:00.000+0000", "rec_type": "t1"}'
curl -X PUT "http://localhost:9200/test/xxx/6" -H 'Content-Type: application/json' -d'{"c1" : "2","c2": "2-2","ts": "2020-01-02T06:00:00.000+0000", "rec_type": "t1"}'
curl -X PUT "http://localhost:9200/test/xxx/7" -H 'Content-Type: application/json' -d'{"c1" : "2","c2": "2-3","ts": "2020-03-16T06:00:00.000+0000", "rec_type": "t1"}'
curl -X PUT "http://localhost:9200/test/xxx/8" -H 'Content-Type: application/json' -d'{"c1" : "2","c2": "2-4","ts": "2020-03-16T09:00:00.000+0000", "rec_type": "t1"}'
curl -X PUT "http://localhost:9200/test/_mapping/_doc" -H 'Content-Type: application/json' -d'{"properties" : {"c2": {"type": "text", "fielddata": true}}}'
curl -X PUT "http://localhost:9200/test/_mapping/_doc" -H 'Content-Type: application/json' -d'{"properties" : {"ts": {"type": "date"}}}'
The data looks like this:
c1  c2   ts                   rec_type
1   1-1  2020-01-01T06:00:00  t1
1   1-2  2020-01-02T06:00:00  t1
1   1-3  2020-03-16T06:00:00  t1
1   1-4  2020-03-16T09:00:00  t1
2   2-1  2020-01-01T06:00:00  t1
2   2-2  2020-01-02T06:00:00  t1
2   2-3  2020-03-16T06:00:00  t1
2   2-4  2020-03-16T09:00:00  t1
My query returns only the last record on the specific date (1-4 and 2-4 in the table above), but I also want 1-2 and 2-2, the last records before that date, in the same buckets:
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "must": [
              {
                "term": {
                  "rec_type.keyword": {
                    "value": "t1",
                    "boost": 1.0
                  }
                }
              },
              {
                "range": {
                  "ts": {
                    "from": "2020-03-16T00:00:00.000+0000",
                    "to": "2020-03-16T23:59:59.000+0000",
                    "include_lower": true,
                    "include_upper": true,
                    "boost": 1.0
                  }
                }
              }
            ],
            "adjust_pure_negative": true,
            "boost": 1.0
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1.0
    }
  },
  "aggregations": {
    "last_records": {
      "composite": {
        "size": 100,
        "sources": [
          {
            "record": {
              "terms": {
                "field": "c1.keyword",
                "order": "asc"
              }
            }
          }
        ]
      },
      "aggregations": {
        "top_hits": {
          "top_hits": {
            "from": 0,
            "size": 1,
            "version": false,
            "explain": false,
            "sort": [
              {
                "ts": {
                  "order": "desc"
                }
              },
              {
                "c2": {
                  "order": "desc"
                }
              }
            ]
          }
        }
      }
    }
  }
}
So my question is: is this even possible, and if so, how?
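One possible approach, sketched here under assumptions and not verified against a 6.8 cluster: drop the lower bound of the top-level range filter so earlier documents also match, then split each c1 bucket with a `filters` sub-aggregation into two named buckets, each with its own `top_hits` of size 1. The names `by_period`, `on_date`, `before_date` and `newest` are made up for this sketch, and `filter` is used in place of the original `must` because scoring is not needed with `"size": 0`. The Python below only builds and prints the request body.

```python
import json


def build_last_records_query(day_start, day_end, rec_type="t1"):
    """Build a search body that asks, per c1 value, for the newest document
    on the target day and the newest document before that day.

    Bucket and aggregation names are hypothetical, chosen for this sketch.
    """
    newest_first = {
        "top_hits": {"size": 1, "sort": [{"ts": {"order": "desc"}}]}
    }
    return {
        "size": 0,
        "query": {
            "bool": {
                "filter": [
                    {"term": {"rec_type.keyword": rec_type}},
                    # No lower bound, so documents before the day stay in scope.
                    {"range": {"ts": {"lte": day_end}}},
                ]
            }
        },
        "aggregations": {
            "last_records": {
                "composite": {
                    "size": 100,
                    "sources": [
                        {"record": {"terms": {"field": "c1.keyword", "order": "asc"}}}
                    ],
                },
                "aggregations": {
                    "by_period": {
                        "filters": {
                            "filters": {
                                "on_date": {
                                    "range": {"ts": {"gte": day_start, "lte": day_end}}
                                },
                                "before_date": {"range": {"ts": {"lt": day_start}}},
                            }
                        },
                        # One top_hits per named filter bucket.
                        "aggregations": {"newest": newest_first},
                    }
                },
            }
        },
    }


body = build_last_records_query(
    "2020-03-16T00:00:00.000+0000", "2020-03-16T23:59:59.000+0000"
)
print(json.dumps(body, indent=2))
```

If the aggregation behaves as intended, each c1 bucket should contain an `on_date` hit (1-4 or 2-4 in the sample data) and a `before_date` hit (1-2 or 2-2). The printed JSON can be sent with the same kind of curl call used to index the documents.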

Related

How to sort analytics/queries data in Elasticsearch?

Is there a way to sort the results by date, newest first? I want to see the search history per user.
I followed the documentation but got stuck:
curl -X POST '<ENTERPRISE_SEARCH_BASE_URL>/api/as/v1/engines/national-parks-demo/analytics/queries' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer private-xxxxxxxxxxxxxxxxxxxxxxxx' \
-d '{
  "filters": {
    "all": [
      {
        "date": {
          "from": "2022-02-01T12:00:00+00:00",
          "to": "2022-12-31T00:00:00+00:00"
        }
      },
      {
        "tag": "C001"
      }
    ]
  },
  "page": {
    "size": 20
  }
}'
Sorry for my bad English.

Elasticsearch error with null value for dense vector datatype

I created an index with a dense_vector:
curl -X PUT "localhost:9200/my_index?pretty" -H 'Content-Type: application/json' -d'
{
"mappings": {
"properties": {
"my_vector": {
"type": "dense_vector",
"dims": 3
}
}
}
}
'
When I index a document with a vector it works well:
curl -X PUT "localhost:9200/my_index/_doc/1?pretty" -H 'Content-Type: application/json' -d'
{
"my_vector" : [0.5, 10, 6]
}
'
But when I index a document with a null value for the vector, it returns an error:
curl -X PUT "localhost:9200/my_index/_doc/2?pretty" -H 'Content-Type: application/json' -d'
{
"my_vector" : null
}
'
The error is:
{
"error" : {
"root_cause" : [
{
"type" : "parsing_exception",
"reason" : "Failed to parse object: expecting token of type [VALUE_NUMBER] but found [END_OBJECT]",
"line" : 5,
"col" : 1
}
],
"type" : "mapper_parsing_exception",
"reason" : "failed to parse",
"caused_by" : {
"type" : "parsing_exception",
"reason" : "Failed to parse object: expecting token of type [VALUE_NUMBER] but found [END_OBJECT]",
"line" : 5,
"col" : 1
}
},
"status" : 400
}
How can I handle a null value for the vector type in ES?
Instead of setting the field to null, you can remove it from that particular document, which is equivalent to setting it to null. Use the following request:
curl --location --request POST 'http://{ip}:9200/my_index/_doc/{docId}/_update' \
--header 'Content-Type: application/json' \
--data-raw '{
  "script" : "ctx._source.remove(\"my_vector\")"
}'
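The same idea can also be applied on the client side before indexing: leave the key out of the document when there is no vector. The helper below is a minimal sketch of that; `drop_null_fields` and the `title` field are hypothetical names, not part of any Elasticsearch client.

```python
def drop_null_fields(doc, fields=("my_vector",)):
    """Return a copy of doc without the listed fields when their value is None."""
    return {
        key: value
        for key, value in doc.items()
        if not (key in fields and value is None)
    }


# The vector is missing, so the key is dropped before the document is sent.
print(drop_null_fields({"title": "no vector yet", "my_vector": None}))
# A real vector is kept as-is.
print(drop_null_fields({"my_vector": [0.5, 10, 6]}))
```

The resulting dictionary can then be serialized to JSON and indexed as usual.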

Elasticsearch 7 unable to create index

I am trying to create an index using the following syntax:
curl -H "Content-Type: application/json" -XPUT 127.0.0.1:9200/movies -d '
{
"mappings": {
"movie": {
"properties": {
"year": {"type":"date"}
}
}
}
}'
I guess "movie" cannot be a child of "mappings". Can someone please help me transform this into Elasticsearch 7 compatible syntax?
I tried using "movie.year" : {"type":"date"}, but then it fails on the following insert statement:
curl -H "Content-Type: application/json" -XPUT 127.0.0.1:9200/movies/movie/109487 -d '
{
"genre":["IMAX", "Sci-Fi"],
"title":"Intersteller",
"year":2014
}'
I copied this from an Elasticsearch 6 tutorial. The error I get is:
"Rejecting mapping update to [movies] as the final mapping would have
more than 1 type: [_doc, movie]"
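In ES 7, mapping types are gone from the default API, so "properties" goes directly under "mappings" and documents are indexed under _doc. You need to do it like this.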
First, create the index:
curl -H "Content-Type: application/json" -XPUT 127.0.0.1:9200/movies -d '
{
"mappings": {
"properties": {
"year": {"type":"date"}
}
}
}'
Then, index your document:
curl -H "Content-Type: application/json" -XPUT 127.0.0.1:9200/movies/_doc/109487 -d '
{
"genre":["IMAX", "Sci-Fi"],
"title":"Intersteller",
"year":2014
}'
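When porting several index definitions from a 6.x tutorial, the unwrapping step can be scripted. The following is a small sketch, assuming each body declares exactly one mapping type; `to_typeless_mappings` is a hypothetical helper name.

```python
def to_typeless_mappings(body):
    """Unwrap a single mapping type from an Elasticsearch 6-style body.

    {"mappings": {"movie": {"properties": ...}}} becomes
    {"mappings": {"properties": ...}}. Bodies that are already typeless,
    or that do not look like a single typed mapping, are returned unchanged.
    """
    mappings = body.get("mappings", {})
    if "properties" in mappings or len(mappings) != 1:
        return body
    type_body = next(iter(mappings.values()))
    if not isinstance(type_body, dict) or "properties" not in type_body:
        return body
    return {**body, "mappings": type_body}


old_body = {"mappings": {"movie": {"properties": {"year": {"type": "date"}}}}}
print(to_typeless_mappings(old_body))
```

Applying the function to its own output leaves the body unchanged, so it is safe to run on definitions that have already been converted.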

Add an attribute to existing documents if the attribute doesn't exist in Elasticsearch

I have a specific requirement where I have to add an additional attribute to an Elasticsearch index that has n documents. This has to be done only for documents that don't already contain the attribute. The task basically involves 2 steps:
1) searching
2) updating
I know how to do this with multiple queries, but it would be great if I could do it in a single query. Is it possible? If yes, can someone tell me how this can be done?
You can use update by query combined with the exists query to add the new field only to those documents that do not contain the attribute.
For example, suppose only one document contains the field attrib2 and the others don't have it.
curl -XPUT "http://localhost:9200/my_test_index/doc/1" -H 'Content-Type: application/json' -d'
{
"attrib1": "value1"
}'
curl -XPUT "http://localhost:9200/my_test_index/doc/2" -H 'Content-Type: application/json' -d'
{
"attrib1": "value21"
}'
curl -XPUT "http://localhost:9200/my_test_index/doc/3" -H 'Content-Type: application/json' -d'
{
"attrib1": "value31",
"attrib2": "value32"
}'
The following update by query will do the job.
curl -XPOST "http://localhost:9200/my_test_index/_update_by_query" -H 'Content-Type: application/json' -d'
{
"script": {
"lang": "painless",
"source": "ctx._source.attrib2 = params.attrib2",
"params": {
"attrib2": "new_value_for_attrib2"
}
},
"query": {
"bool": {
"must_not": [
{
"exists": {
"field": "attrib2"
}
}
]
}
}
}'
It will set the new value new_value_for_attrib2 to the field attrib2 on only those documents which don't already have that field.
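To make the effect concrete, here is a local simulation of that request on plain Python dictionaries. It is only a sketch of the intended semantics (a null or empty-array value is treated as missing, as the exists query does), not a call to Elasticsearch; `add_missing_field` is a hypothetical name.

```python
def add_missing_field(docs, field, value):
    """Mimic the update by query on plain dicts: set field only where it is absent."""
    updated = {}
    for doc_id, source in docs.items():
        # None and [] count as "field does not exist".
        if source.get(field) in (None, []):
            source = {**source, field: value}
        updated[doc_id] = source
    return updated


docs = {
    "1": {"attrib1": "value1"},
    "2": {"attrib1": "value21"},
    "3": {"attrib1": "value31", "attrib2": "value32"},
}
result = add_missing_field(docs, "attrib2", "new_value_for_attrib2")
for doc_id, source in result.items():
    print(doc_id, source)
```

Documents 1 and 2 gain attrib2, while document 3 keeps its existing value32.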

Elasticsearch Delete Query By Date

I'm running the following query:
q='{
  "filtered": {
    "query": {
      "match_all": {}
    },
    "filter": {
      "and": [
        {
          "range": {
            "creation_time": {
              "from": "2012-08-30",
              "to": "2012-08-31",
              "include_lower": true,
              "include_upper": true
            }
          }
        }
      ]
    }
  }
}'
My domain is an EC2 server:
curl -XDELETE "http://#{mydomain}:9200/monitoring/mention_reports/_query?q=#{q}"
When I run this request, curl gives me:
curl: (3) [globbing] nested braces not supported at pos 118
Please help, thanks.
If you're running curl from the command line, pass the query as the request body instead of putting it in the URL. It should look like this:
q='YOUR_QUERY_CODE_GOES_HERE'
curl -v -H "Content-type: application/json" -H "Accept: application/json" \
-XDELETE -d "$q" http://localhost:9200/monitoring/mention_reports/_query
If you run it from Ruby, you can keep building the request as you do, but the key is still the headers:
-H "Content-type: application/json" -H "Accept: application/json"
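The point of the fix is that the JSON travels in the request body, so no braces end up in the URL for curl to interpret as glob patterns. The sketch below shows the same request shape with Python's standard library; it only builds the request and does not send it, and `build_delete_request` is a hypothetical helper. The `_query` endpoint is the one from the question and belongs to old Elasticsearch versions; newer versions expose a separate `_delete_by_query` endpoint instead.

```python
import json
import urllib.request


def build_delete_request(base_url, query):
    """Create (but do not send) a DELETE request that carries the query as a JSON body."""
    return urllib.request.Request(
        url=f"{base_url}/monitoring/mention_reports/_query",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="DELETE",
    )


query = {
    "filtered": {
        "query": {"match_all": {}},
        "filter": {
            "range": {
                "creation_time": {
                    "from": "2012-08-30",
                    "to": "2012-08-31",
                    "include_lower": True,
                    "include_upper": True,
                }
            }
        },
    }
}
request = build_delete_request("http://localhost:9200", query)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# urllib.request.urlopen(request) would send it; omitted because it needs a running cluster.
```

The URL stays free of JSON, and the body is serialized by `json.dumps`, which also avoids hand-written mistakes such as trailing commas.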
