How to sort data analytics/queries elasticsearch? - elasticsearch

Is there any way to sort the results by newest date? I want to see the search history per user. I followed the documentation but got stuck:
curl -X POST '<ENTERPRISE_SEARCH_BASE_URL>/api/as/v1/engines/national-parks-demo/analytics/queries' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer private-xxxxxxxxxxxxxxxxxxxxxxxx' \
-d '{
  "filters": {
    "all": [
      {
        "date": {
          "from": "2022-02-01T12:00:00+00:00",
          "to": "2022-12-31T00:00:00+00:00"
        }
      },
      {
        "tag": "C001"
      }
    ]
  },
  "page": {
    "size": 20
  }
}'
Apologies for my English.
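As a side note (not from the original post), one way to avoid malformed request bodies is to build them with a JSON serializer and only then hand them to curl or an HTTP client. A minimal Python sketch, using the filter values from the question above; the endpoint and auth handling are unchanged:

```python
import json

def analytics_body(date_from, date_to, tag, size=20):
    # Building the body with json.dumps guarantees well-formed JSON;
    # hand-edited bodies are easy to break with a stray brace.
    return json.dumps({
        "filters": {
            "all": [
                {"date": {"from": date_from, "to": date_to}},
                {"tag": tag},
            ]
        },
        "page": {"size": size},
    })

body = analytics_body("2022-02-01T12:00:00+00:00",
                      "2022-12-31T00:00:00+00:00", "C001")
```

The resulting string can be passed directly as the `-d` argument.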

Related

Need help with a complicated aggregation in an Elasticsearch query

I'm trying to build an aggregation query with Elasticsearch 6.8:
I want to find the last documents for a specific date, and the last documents before that date, grouped by document fields.
curl -X PUT "http://localhost:9200/test/xxx/1" -H 'Content-Type: application/json' -d'{"c1" : "1","c2": "1-1","ts": "2020-01-01T06:00:00.000+0000", "rec_type": "t1"}'
curl -X PUT "http://localhost:9200/test/xxx/2" -H 'Content-Type: application/json' -d'{"c1" : "1","c2": "1-2","ts": "2020-01-02T06:00:00.000+0000", "rec_type": "t1"}'
curl -X PUT "http://localhost:9200/test/xxx/3" -H 'Content-Type: application/json' -d'{"c1" : "1","c2": "1-3","ts": "2020-03-16T06:00:00.000+0000", "rec_type": "t1"}'
curl -X PUT "http://localhost:9200/test/xxx/4" -H 'Content-Type: application/json' -d'{"c1" : "1","c2": "1-4","ts": "2020-03-16T09:00:00.000+0000", "rec_type": "t1"}'
curl -X PUT "http://localhost:9200/test/xxx/5" -H 'Content-Type: application/json' -d'{"c1" : "2","c2": "2-1","ts": "2020-01-01T06:00:00.000+0000", "rec_type": "t1"}'
curl -X PUT "http://localhost:9200/test/xxx/6" -H 'Content-Type: application/json' -d'{"c1" : "2","c2": "2-2","ts": "2020-01-02T06:00:00.000+0000", "rec_type": "t1"}'
curl -X PUT "http://localhost:9200/test/xxx/7" -H 'Content-Type: application/json' -d'{"c1" : "2","c2": "2-3","ts": "2020-03-16T06:00:00.000+0000", "rec_type": "t1"}'
curl -X PUT "http://localhost:9200/test/xxx/8" -H 'Content-Type: application/json' -d'{"c1" : "2","c2": "2-4","ts": "2020-03-16T09:00:00.000+0000", "rec_type": "t1"}'
curl -X PUT "http://localhost:9200/test/_mapping/_doc" -H 'Content-Type: application/json' -d'{"properties" : {"c2": {"type": "text", "fielddata": true}}}'
curl -X PUT "http://localhost:9200/test/_mapping/_doc" -H 'Content-Type: application/json' -d'{"properties" : {"ts": {"type": "date"}}}'
Data looks like:
c1 | c2  | ts                  | rec_type
---+-----+---------------------+---------
1  | 1-1 | 2020-01-01T06:00:00 | t1
1  | 1-2 | 2020-01-02T06:00:00 | t1
1  | 1-3 | 2020-03-16T06:00:00 | t1
1  | 1-4 | 2020-03-16T09:00:00 | t1
2  | 2-1 | 2020-01-01T06:00:00 | t1
2  | 2-2 | 2020-01-02T06:00:00 | t1
2  | 2-3 | 2020-03-16T06:00:00 | t1
2  | 2-4 | 2020-03-16T09:00:00 | t1
My query returns only the last record for the specific date (1-4 and 2-4 in the table above), but I also want the last records before that date (1-2 and 2-2) in the same bucket:
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "must": [
              {
                "term": {
                  "rec_type.keyword": {
                    "value": "t1",
                    "boost": 1.0
                  }
                }
              },
              {
                "range": {
                  "ts": {
                    "from": "2020-03-16T00:00:00.000+0000",
                    "to": "2020-03-16T23:59:59.000+0000",
                    "include_lower": true,
                    "include_upper": true,
                    "boost": 1.0
                  }
                }
              }
            ],
            "adjust_pure_negative": true,
            "boost": 1.0
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1.0
    }
  },
  "aggregations": {
    "last_records": {
      "composite": {
        "size": 100,
        "sources": [
          {
            "record": {
              "terms": {
                "field": "c1.keyword",
                "order": "asc"
              }
            }
          }
        ]
      },
      "aggregations": {
        "top_hits": {
          "top_hits": {
            "from": 0,
            "size": 1,
            "version": false,
            "explain": false,
            "sort": [
              { "ts": { "order": "desc" } },
              { "c2": { "order": "desc" } }
            ]
          }
        }
      }
    }
  }
}
So the question: is it even possible, and how?
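One likely direction (an assumption, not a confirmed answer): widen the query's date range to include earlier documents, and add a second, filtered top_hits next to the existing one (a filter sub-aggregation on ts before the day, wrapping its own top_hits). To make the intended buckets concrete, here is the same "last record on a day, plus last record before it, per group" logic over the sample rows in plain Python; this is a sketch of the desired result, not of the aggregation itself:

```python
from datetime import datetime

# Sample rows from the question: (c1, c2, ts)
rows = [
    ("1", "1-1", "2020-01-01T06:00:00"), ("1", "1-2", "2020-01-02T06:00:00"),
    ("1", "1-3", "2020-03-16T06:00:00"), ("1", "1-4", "2020-03-16T09:00:00"),
    ("2", "2-1", "2020-01-01T06:00:00"), ("2", "2-2", "2020-01-02T06:00:00"),
    ("2", "2-3", "2020-03-16T06:00:00"), ("2", "2-4", "2020-03-16T09:00:00"),
]

def last_per_group(rows, day):
    """For each c1, keep the latest record on `day` and the latest
    record strictly before `day` (the two buckets the question wants)."""
    day_start = datetime.fromisoformat(day)
    out = {}
    for c1, c2, ts in rows:
        t = datetime.fromisoformat(ts)
        bucket = out.setdefault(c1, {"on_day": None, "before_day": None})
        if t.date() == day_start.date():
            if bucket["on_day"] is None or t > bucket["on_day"][1]:
                bucket["on_day"] = (c2, t)
        elif t < day_start:
            if bucket["before_day"] is None or t > bucket["before_day"][1]:
                bucket["before_day"] = (c2, t)
    return out

result = last_per_group(rows, "2020-03-16T00:00:00")
```

For c1=1 this yields 1-4 on the day and 1-2 before it, matching the buckets described above.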

escape triple quotes in curl correctly

I have the following curl request
curl -H "Content-Type: application/json" -X POST http://localhost:9200/_reindex\?wait_for_completion\=true -d '{"source": {"index": "analytics-prod-2019.12.30", "size":1000 }, "dest": {"index": "analytics-prod-2019.12"}, "conflicts": "proceed", "script": { "lang": "painless","source: """ctx._source.index = ctx._index; def eventData = ctx._source["event.data"]; if(eventData != null) { eventData.remove("realmDb.size"); eventData.remove("realmDb.format"); eventData.remove("realmDb.contents"); }""" } }'
but this fails with the following error:
{"error":{"root_cause":[{"type":"x_content_parse_exception","reason":"[1:166] [script] failed to parse object"}],"type":"x_content_parse_exception","reason":"[1:166] [reindex] failed to parse field [script]","caused_by":{"type":"x_content_parse_exception","reason":"[1:166] [script] failed to parse object","caused_by":{"type":"json_parse_exception","reason":"Unexpected character ('\"' (code 34)): was expecting a colon to separate field name and value\n at [Source: org.elasticsearch.common.bytes.BytesReference$MarkSupportingStreamInputWrapper#51c48433; line: 1, column: 177]"}}},"status":400}
if i remove the script field from the request this works just fine:
curl -H "Content-Type: application/json" -X POST http://localhost:9200/_reindex\?wait_for_completion\=true -d '{"source":{"index":"analytics-prod-2019.12.30","size":1000},"dest":{"index":"test-index"},"conflicts":"proceed"}}'
using the kibana UI works fine.
what is the correct way to run this in curl?
Use a single pair of double quotes (") to surround your script value, and \u0027 to escape the single quotes inside your Painless script.
curl -H "Content-Type: application/json" -X POST http://localhost:9200/_reindex\?wait_for_completion\=true -d '
{
  "source": {
    "index": "analytics-prod-2019.12.30",
    "size": 1000
  },
  "dest": {
    "index": "analytics-prod-2019.12"
  },
  "conflicts": "proceed",
  "script": {
    "lang": "painless",
    "source": "ctx._source.index = ctx._index; def eventData = ctx._source[\u0027event.data\u0027]; if (eventData != null) { eventData.remove(\u0027realmDb.size\u0027); eventData.remove(\u0027realmDb.format\u0027); eventData.remove(\u0027realmDb.contents\u0027); }"
  }
}
'
You can also see an example of this here; click on the Copy as cURL link and review the example in that format.
Your source was missing a double quote:
Corrected:
curl -H "Content-Type: application/json" \
-X POST http://localhost:9200/_reindex\?wait_for_completion\=true \
-d '{"source": {"index": "analytics-prod-2019.12.30", "size":1000 }, "dest": {"index": "analytics-prod-2019.12"}, "conflicts": "proceed", "script": { "lang": "painless","source": "ctx._source.index = ctx._index; def eventData = ctx._source[\"event.data\"]; if (eventData != null) { eventData.remove(\"realmDb.size\"); eventData.remove(\"realmDb.format\"); eventData.remove(\"realmDb.contents\"); }" } }'
You can use single quotes with \u0027 escapes as @Zsolt pointed out, but even Kibana itself, when you click "Copy as cURL", uses escaped double quotes.
curl -XPOST "http://elasticsearch:9200/_reindex?requests_per_second=115&wait_for_completion=true" -H 'Content-Type: application/json' -d'
{
  "source": {
    "index": "analytics-prod-2019.12.30",
    "size": 1000
  },
  "dest": {
    "index": "analytics-prod-2019.12"
  },
  "script": {
    "lang": "painless",
    "source": " ctx._source.index = ctx._index;\n def eventData = ctx._source[\"event.data\"];\n if (eventData != null) {\n eventData.remove(\"realmDb.size\");\n eventData.remove(\"realmDb.format\");\n eventData.remove(\"realmDb.contents\");\n }"
  }
}'
Note that the double quotes inside the script had to be escaped as \".
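A way to sidestep the escaping problem entirely (a general technique, not from the original answers) is to let a JSON serializer produce the body, so quotes inside the Painless script never need manual escaping. A sketch in Python:

```python
import json

# The Painless script as a plain Python string; json.dumps escapes the
# inner double quotes correctly when serializing the request body.
script_source = (
    'ctx._source.index = ctx._index; '
    'def eventData = ctx._source["event.data"]; '
    'if (eventData != null) { '
    'eventData.remove("realmDb.size"); '
    'eventData.remove("realmDb.format"); '
    'eventData.remove("realmDb.contents"); }'
)

body = json.dumps({
    "source": {"index": "analytics-prod-2019.12.30", "size": 1000},
    "dest": {"index": "analytics-prod-2019.12"},
    "conflicts": "proceed",
    "script": {"lang": "painless", "source": script_source},
})
```

The resulting string is valid JSON and can be written to a file and sent with `curl -d @body.json`.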

Add additional attribute to an existing document if the attribute doesn't exist elasticsearch

I have a specific requirement where I have to add an additional attribute to an Elasticsearch index that has n documents. This has to be done only for documents that don't already contain the attribute. The task basically involves two steps:
1) searching
2) updating
I know how to do this with multiple queries, but it would be great if I could manage it in a single query. Is that possible? If yes, can someone tell me how it can be done?
You can use update by query combined with the exists query to add the new field only to those documents that don't contain the attribute.
For example, suppose only one document contains the field attrib2 and the others don't:
curl -XPUT "http://localhost:9200/my_test_index/doc/1" -H 'Content-Type: application/json' -d'
{
  "attrib1": "value1"
}'
curl -XPUT "http://localhost:9200/my_test_index/doc/2" -H 'Content-Type: application/json' -d'
{
  "attrib1": "value21"
}'
curl -XPUT "http://localhost:9200/my_test_index/doc/3" -H 'Content-Type: application/json' -d'
{
  "attrib1": "value31",
  "attrib2": "value32"
}'
The following update by query will do the job.
curl -XPOST "http://localhost:9200/my_test_index/_update_by_query" -H 'Content-Type: application/json' -d'
{
  "script": {
    "lang": "painless",
    "source": "ctx._source.attrib2 = params.attrib2",
    "params": {
      "attrib2": "new_value_for_attrib2"
    }
  },
  "query": {
    "bool": {
      "must_not": [
        {
          "exists": {
            "field": "attrib2"
          }
        }
      ]
    }
  }
}'
It sets the value new_value_for_attrib2 on the field attrib2, but only on those documents that don't already have that field.
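The selection the must_not/exists clause makes can be illustrated in plain Python over the three sample documents; this is a sketch of the matching logic only, not of Elasticsearch itself:

```python
# The three sample documents from above, keyed by id.
docs = {
    "1": {"attrib1": "value1"},
    "2": {"attrib1": "value21"},
    "3": {"attrib1": "value31", "attrib2": "value32"},
}

def add_if_missing(docs, field, value):
    """update_by_query semantics: run the script only where the field
    is absent (bool/must_not + exists), leaving other docs untouched."""
    updated = []
    for doc_id, doc in docs.items():
        if field not in doc:        # must_not + exists
            doc[field] = value      # ctx._source.attrib2 = params.attrib2
            updated.append(doc_id)
    return updated

changed = add_if_missing(docs, "attrib2", "new_value_for_attrib2")
```

Only documents 1 and 2 are touched; document 3 keeps its existing attrib2 value.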

elasticsearch: getMinuteOfDay() applied to time() in date range filter

I'm trying to build an elasticsearch query to return documents for times between midnight and the current time of day, for all dates. For example, if I run the query at 09:00:00, then the query will return any document with a timestamp between midnight and 09:00:00 regardless of the date.
Here's an example dataset:
curl -XPUT localhost:9200/test/dt/_mapping -d '{"dt" : {"properties" : {"created_at" : {"type" : "date", "format": "yyyy-MM-dd HH:mm:ss" }}}}'
curl -XPOST localhost:9200/test/dt/1 -d '{ "created_at": "2014-10-09 07:00:00" }'
curl -XPOST localhost:9200/test/dt/2 -d '{ "created_at": "2014-10-09 14:00:00" }'
curl -XPOST localhost:9200/test/dt/3 -d '{ "created_at": "2014-10-08 08:00:00" }'
curl -XPOST localhost:9200/test/dt/4 -d '{ "created_at": "2014-10-08 15:00:00" }'
curl -XPOST localhost:9200/test/dt/5 -d '{ "created_at": "2014-10-07 09:00:00" }'
curl -XPOST localhost:9200/test/dt/6 -d '{ "created_at": "2014-10-07 16:00:00" }'
and an example filter:
curl -XPOST localhost:9200/test/dt/_search?pretty -d '{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "script": {
                "script": "doc[\"created_at\"].date.getMinuteOfDay() < 600"
              }
            }
          ]
        }
      }
    }
  }
}'
where 600 is a static parameter that I want to replace with a dynamic minutes parameter depending on the time of day when the query is run.
My best attempt is to use the getMinuteOfDay() method to filter each doc, but my problem is how to get getMinuteOfDay() for the current time. I've tried variations using time() instead of the hard-coded parameter 600 in the above query, but can't figure it out.
Any ideas?
This is pretty late, but it can be done with DateTime.now(). Change your query to:
GET test/dt/_search
{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "script": {
                "script": "doc[\"created_at\"].date.getMinuteOfDay() < DateTime.now().getMinuteOfDay()"
              }
            }
          ]
        }
      }
    }
  }
}
Hope this helps!
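The minute-of-day comparison the script makes is easy to sanity-check outside Elasticsearch. A small Python sketch over the sample timestamps, with "now" pinned to 09:00 so the result is deterministic:

```python
from datetime import datetime

def minute_of_day(dt):
    # Same value getMinuteOfDay() returns: minutes since midnight.
    return dt.hour * 60 + dt.minute

samples = [
    "2014-10-09 07:00:00", "2014-10-09 14:00:00",
    "2014-10-08 08:00:00", "2014-10-08 15:00:00",
    "2014-10-07 09:00:00", "2014-10-07 16:00:00",
]

# Running "now" at 09:00 means 540 minutes; only documents whose
# time-of-day is strictly before 09:00 match, regardless of the date.
now_minutes = minute_of_day(datetime(2014, 10, 9, 9, 0, 0))
matches = [s for s in samples
           if minute_of_day(datetime.strptime(s, "%Y-%m-%d %H:%M:%S")) < now_minutes]
```

Only the 07:00 and 08:00 documents match; 09:00 itself is excluded because the comparison is strict.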

Elasticsearch Delete Query By Date

I'm running the following query :
q='{
  "filtered": {
    "query": {
      "match_all": {}
    },
    "filter": {
      "and": [
        {
          "range": {
            "creation_time": {
              "from": "2012-08-30",
              "to": "2012-08-31",
              "include_lower": true,
              "include_upper": true
            }
          }
        }
      ]
    }
  }
}'
My domain is an ec2 server
curl -XDELETE "http://#{mydomain}:9200/monitoring/mention_reports/_query?q=#{q}"
When I hit this query, it gives me:
curl: (3) [globbing] nested braces not supported at pos 118
Please help me, thanks.
If you're trying to run curl from the command line, it should look like:
q='YOUR_QUERY_CODE_GOES_HERE'
curl -v -H "Content-Type: application/json" -H "Accept: application/json" \
-XDELETE -d "$q" http://localhost:9200/monitoring/mention_reports/_query
If you're executing this from inside Ruby, format the request as you already do; the silver bullet is still in the headers:
-H "Content-Type: application/json" -H "Accept: application/json"
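As a general aid (not part of the original answer), validating the body with a JSON parser before handing it to curl catches syntax slips, such as trailing commas, before the server ever sees them. A minimal Python sketch:

```python
import json

def validate(body):
    """Return (parsed_query, None) on success, or (None, error) on failure."""
    try:
        return json.loads(body), None
    except json.JSONDecodeError as e:
        return None, str(e)

good = '{"filtered": {"query": {"match_all": {}}}}'
bad = '{"and": [{"range": {"a": 1}},]}'   # trailing comma, a common slip

parsed, err = validate(good)
_, bad_err = validate(bad)
```

Running the check in a wrapper script before each curl call turns a cryptic server-side 400 into an immediate, local error message.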
