Elasticsearch query to compare two date fields while fetching - elasticsearch

I have two date fields, updated_date and creation date. I'm trying to build a query, using a bool query or a search builder, that returns all records where updated_date is later than the creation date.
I tried a range query, but it didn't work.

Use a script filter. In a Painless script, use isAfter() for the greater-than comparison:
{
  "query": {
    "bool": {
      "filter": {
        "script": {
          "script": "doc['update_date'].date.isAfter(doc['created_date'].date)"
        }
      }
    }
  }
}
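As a rough sketch, the same filter can be assembled as a Python dictionary before it is sent to the search endpoint. The helper name date_after_query is hypothetical, and the field names are the ones used in the answer above. The accessor is a parameter because, as far as I know, Elasticsearch 7 and later expect doc['field'].value instead of doc['field'].date; treat that as an assumption to check against your version.

```python
import json


def date_after_query(later_field, earlier_field, accessor="date"):
    """Build a bool/filter/script query body.

    Matches documents where `later_field` is after `earlier_field`.
    `accessor` is the doc-values accessor used inside the script
    ("date" as in the answer above; "value" is an assumption for newer versions).
    """
    script = "doc['{later}'].{acc}.isAfter(doc['{earlier}'].{acc})".format(
        later=later_field, earlier=earlier_field, acc=accessor
    )
    return {"query": {"bool": {"filter": {"script": {"script": script}}}}}


# Print the body so it can be pasted into a search request.
body = date_after_query("update_date", "created_date")
print(json.dumps(body, indent=2))
```

The printed JSON has the same structure as the query in the answer, so it can be used as the request body of a _search call.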

Not sure if you are aware of the head plugin: https://github.com/mobz/elasticsearch-head
Use the Structured Query tab to build the query, and select the 'Show query source' check box to see the generated query.

Related

Getting a specific date in elasticsearch?

I have searched a lot of sites, and this is the code they give. But with it, all the entries containing "2021" are displayed, when I need only the entries whose date is "10-10-2021". Please advise what to do.
{ "query": { "term": { "date": { "value": "10-10-2020" } } } }
Is your field indexed as a date or as a keyword? Index it as a date and, instead of a term query, use a range query to get documents within a specific span of time: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-range-query.html
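A minimal sketch of such a range query for one calendar day, assuming the field is mapped as a date. The helper name single_day_range is hypothetical. It takes the day in the dd-MM-yyyy form used in the question and builds a half-open range, from the start of that day up to (but not including) the start of the next day.

```python
from datetime import datetime, timedelta


def single_day_range(field, day, input_format="%d-%m-%Y"):
    """Build a range query body covering exactly one calendar day.

    `day` is parsed with `input_format`; the bounds are written as
    yyyy-MM-dd and the same pattern is passed to Elasticsearch as "format".
    """
    start = datetime.strptime(day, input_format)
    end = start + timedelta(days=1)
    return {
        "query": {
            "range": {
                field: {
                    "gte": start.strftime("%Y-%m-%d"),  # inclusive lower bound
                    "lt": end.strftime("%Y-%m-%d"),     # exclusive upper bound
                    "format": "yyyy-MM-dd",
                }
            }
        }
    }


body = single_day_range("date", "10-10-2021")
```

Using gte together with lt avoids having to reason about how the upper bound is rounded when the time of day is missing.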

Filtering documents by an unknown value of a field

I'm trying to create a query that filters my documents by a single value (any one will do) of a field (in my case "host.name"). The catch is that I don't know the unique values of this field in advance. I need to find them and pick one to use in the query.
I tried the query below, which uses a Painless script, but I haven't been able to achieve the goal.
{
  "sort": [{"@timestamp": "desc"}, {"host.name": "asc"}],
  "query": {
    "bool": {
      "filter": {
        "script": {
          "script": {
            "source": """
              String k = doc['host.name'][0];
              return doc['host.name'].value == k;
            """,
            "lang": "painless"
          }
        }
      }
    }
  }
}
I'd appreciate it if anyone could help me improve this idea or suggest a new one.
TL;DR you can't.
The script query context operates on one document at a time, so you won't have access to the other docs' field values. You could use a scripted_metric aggregation, which does allow iterating through all docs, but it's just that, an aggregation, and not a query.
I'd suggest first running a simple terms aggregation to find out which values you're working with, and then building your queries accordingly.
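The two-step approach can be sketched as follows: one request body for the terms aggregation, a small function that picks a value from the response, and a second body that filters on that value. All function names and the aggregation name "hosts" are hypothetical, the sample response is hand-written for illustration, and the sketch assumes host.name is a keyword field (or otherwise aggregatable).

```python
def host_names_agg(field="host.name", size=100):
    """Step 1: request body that lists up to `size` distinct values of `field`."""
    return {"size": 0, "aggs": {"hosts": {"terms": {"field": field, "size": size}}}}


def pick_first_bucket(response, agg_name="hosts"):
    """Return the key of the first bucket in a terms aggregation response, or None."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return buckets[0]["key"] if buckets else None


def filter_by_value(field, value):
    """Step 2: query body that keeps only documents whose `field` equals `value`."""
    return {"query": {"bool": {"filter": {"term": {field: value}}}}}


# Hand-written sample of what the aggregation part of a response looks like.
sample_response = {
    "aggregations": {
        "hosts": {
            "buckets": [
                {"key": "web-01", "doc_count": 12},
                {"key": "db-01", "doc_count": 7},
            ]
        }
    }
}

chosen = pick_first_bucket(sample_response)
second_query = filter_by_value("host.name", chosen)
```

Any selection rule can replace pick_first_bucket; the point is that the choice happens on the client, between two separate requests.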

How to delete data from a specific index in elasticsearch after a certain period?

I have an index in Elasticsearch that is populated with JSON documents, each with a timestamp.
I want to delete data from that index.
curl -XDELETE http://localhost:9200/index_name
The command above deletes the whole index. My requirement is to delete certain data after a time period (for example, after one week). Can I automate the deletion process?
I tried deleting with Curator.
But I think it deletes whole indexes that were created by timestamp, not data within an index. Can Curator delete data within an index?
I'd be glad to know whether any of the following would work:
Can curl be automated to delete data from an index after a period?
Can Curator be automated to delete data from an index after a period?
Is there any other way, such as Python scripting, to do the job?
References are taken from the official site of elasticsearch.
Thanks a lot in advance.
You can use the DELETE BY QUERY API: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html
Basically it will delete all the documents matching the provided query:
POST twitter/_delete_by_query
{
  "query": {
    "match": {
      "message": "some message"
    }
  }
}
But the suggested way is to implement indexes for different periods (days for example) and use curator to drop them periodically, based on the age:
...
logs_2019.03.11
logs_2019.03.12
logs_2019.03.13
logs_2019.03.14
Simple example using Delete By Query API:
POST index_name/_delete_by_query
{
  "query": {
    "bool": {
      "filter": {
        "range": {
          "timestamp": {
            "lte": "2019-06-01 00:00:00.0",
            "format": "yyyy-MM-dd HH:mm:ss.S"
          }
        }
      }
    }
  }
}
This deletes records whose "timestamp" field, the date/time (stored in the record) at which they occurred, is on or before the given date. You can run the same query as a search first to get a count of what will be deleted.
GET index_name/_search
{
  "size": 1,
  "query": {
    -- as above --
It is also convenient to use relative dates (date math), for example
"lte": "now-30d",
which would delete all records older than 30 days.
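A minimal sketch of building that delete-by-query body with a relative cutoff, assuming the documents carry a date field named timestamp as in the example above. The helper name delete_older_than is hypothetical.

```python
def delete_older_than(field, days):
    """Build a _delete_by_query body matching documents at least `days` days old.

    Uses Elasticsearch date math ("now-30d") so the cutoff is evaluated
    by the server at the time the request runs.
    """
    if days <= 0:
        raise ValueError("days must be a positive integer")
    return {
        "query": {
            "bool": {
                "filter": {
                    "range": {field: {"lte": "now-{}d".format(days)}}
                }
            }
        }
    }


body = delete_older_than("timestamp", 30)
```

Because the cutoff is relative, the same body can be reused unchanged by a scheduled job such as a daily cron entry.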
You can always delete single documents with the HTTP DELETE method.
To find out which IDs to delete, you need to query your data, probably with a range filter/query on your timestamp.
Since you are interacting with the REST API, you can do this with Python or any other language. There is also a Java client if you prefer a more direct API.
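For the Python route, a rough sketch using only the standard library might look like this. It only builds the HTTP request for the _delete_by_query endpoint; the function name delete_by_query_request is hypothetical, and the base URL and index name are the ones from the question. Actually sending it (for example with urllib.request.urlopen) assumes a cluster is reachable at that address.

```python
import json
from urllib.request import Request


def delete_by_query_request(base_url, index, query):
    """Prepare (but do not send) a POST request to <index>/_delete_by_query."""
    url = "{}/{}/_delete_by_query".format(base_url.rstrip("/"), index)
    payload = json.dumps({"query": query}).encode("utf-8")
    return Request(
        url,
        data=payload,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Range query on the timestamp field: everything up to one week ago.
week_old = {"range": {"timestamp": {"lte": "now-7d"}}}
req = delete_by_query_request("http://localhost:9200", "index_name", week_old)
# To execute against a running cluster: urllib.request.urlopen(req)
```

Running a script like this from a scheduler is one way to automate the periodic cleanup asked about in the question.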

ElasticSearch - Delete documents by specific field

This seemingly simple task is not well documented in the ElasticSearch documentation:
We have an ElasticSearch instance with an index that has a field called sourceId. What API calls would I make to first GET all documents with 100 in the sourceId field (to verify the results before deletion) and then DELETE those same documents?
You probably need two API calls here: the first to view the count of documents, the second to perform the deletion.
The query is the same in both; only the endpoints differ. I'm also assuming that sourceId is of type keyword.
Query to Verify
POST <your_index_name>/_search
{
  "size": 0,
  "query": {
    "term": {
      "sourceId": "100"
    }
  }
}
Execute the term query above and take note of hits.total in the response.
Remove "size": 0 from the query if you want to see the full documents in the response.
Once you have the details, you can perform the deletion using the same query, as shown below. Note the different endpoint, though.
Query to Delete
POST <your_index_name>/_delete_by_query
{
  "query": {
    "term": {
      "sourceId": "100"
    }
  }
}
Once you execute the delete by query, check the deleted field in the response. It should show the same number.
I've used a term query, but you can also use a match query or a more complex bool query. Just make sure that the query is correct.
Hope it helps!
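The verify-then-delete flow above can be sketched in Python by building both request bodies from one shared query, so the two calls cannot drift apart. The function names are hypothetical and the two sample responses are hand-written for illustration. The check accepts hits.total either as a plain number or as an object with a value key, which is how newer Elasticsearch versions report it.

```python
def term_query(field, value):
    """The shared query used by both the search and the delete call."""
    return {"term": {field: value}}


def verify_body(query):
    """Body for <index>/_search: count only, return no documents."""
    return {"size": 0, "query": query}


def delete_body(query):
    """Body for <index>/_delete_by_query."""
    return {"query": query}


def counts_match(search_response, delete_response):
    """Compare hits.total from the search with deleted from the delete call."""
    total = search_response["hits"]["total"]
    if isinstance(total, dict):  # object form: {"value": n, "relation": "eq"}
        total = total["value"]
    return total == delete_response["deleted"]


query = term_query("sourceId", "100")
# Hand-written sample responses, not output from a real cluster.
sample_search = {"hits": {"total": {"value": 3, "relation": "eq"}}}
sample_delete = {"deleted": 3}
ok = counts_match(sample_search, sample_delete)
```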
Delete all the documents of an index without deleting the mapping and settings:
POST /my_index/_delete_by_query?conflicts=proceed&pretty
{
  "query": {
    "match_all": {}
  }
}
See: https://opster.com/guides/elasticsearch/search-apis/elasticsearch-delete-by-query/

Update all documents of Elastic Search using existing column value

I have a field "published_date" in Elasticsearch that holds a full date in the form yyyy-MM-dd'T'HH:mm:ss.
I want to create three more fields for year, month and day, populated from the existing published_date.
Is there a built-in API for this kind of work in Elasticsearch? I am using Elasticsearch 5.
You can use the update-by-query API in order to do this. It would simply boil down to running something like this:
POST your_index/_update_by_query
{
  "script": {
    "inline": "ctx._source.year = ctx._source.published_date.date.getYear(); ctx._source.month = ctx._source.published_date.date.getMonthOfYear(); ctx._source.day = ctx._source.published_date.date.getDayOfMonth(); ",
    "lang": "groovy"
  },
  "query": {
    "match_all": {}
  }
}
Also note that you need to enable dynamic scripting in order for this to work.
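If server-side scripting is not an option, the same values can be derived on the client. The sketch below is an alternative to the script above, not part of the original answer: it parses a published_date string in the format from the question and builds the partial document that an Update API call would carry. The function names are hypothetical, and the new fields are named year, month and day as in the script.

```python
from datetime import datetime


def date_parts(published_date):
    """Split a yyyy-MM-dd'T'HH:mm:ss string into year, month and day of month."""
    parsed = datetime.strptime(published_date, "%Y-%m-%dT%H:%M:%S")
    return {"year": parsed.year, "month": parsed.month, "day": parsed.day}


def partial_update(published_date):
    """Body for a partial document update: only the three new fields are sent."""
    return {"doc": date_parts(published_date)}


# Sample value made up for illustration.
update = partial_update("2017-03-15T10:30:00")
```

Note that the day here is the day of the month (15 for the sample value), which is different from the day of the year (74 for the same date).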
