Update all documents of Elastic Search using existing column value - elasticsearch

I have a field "published_date" in Elasticsearch that holds a full date in the format yyyy-MM-dd'T'HH:mm:ss.
I want to create three more fields for year, month and day, populated from the existing published_date value.
Is there a built-in API for this kind of work in Elasticsearch? I am using Elasticsearch 5.

You can use the update-by-query API in order to do this. It would simply boil down to running something like this:
POST your_index/_update_by_query
{
"script": {
"inline": "ctx._source.year = ctx._source.published_date.date.getYear(); ctx._source.month = ctx._source.published_date.date.getMonthOfYear(); ctx._source.day = ctx._source.published_date.date.getDayOfMonth(); ",
"lang": "groovy"
},
"query": {
"match_all": {}
}
}
Also note that you need to enable dynamic scripting in order for this to work.
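Groovy scripting was removed in later Elasticsearch releases, so here is a hedged sketch of the same idea in Painless. Python is used only to build and print the request body. It assumes published_date is stored in _source as an ISO string such as 2017-03-15T10:30:00; the helper names build_update_body and split_date are made up for illustration. split_date mirrors in plain Python what the script is expected to compute, so the logic can be checked without a cluster. Older 5.x releases may expect the key "inline" instead of "source".

```python
import json
from datetime import datetime

# Painless script (assumption: published_date is an ISO-8601 string in _source).
PAINLESS = (
    "def d = LocalDateTime.parse(ctx._source.published_date); "
    "ctx._source.year = d.getYear(); "
    "ctx._source.month = d.getMonthValue(); "
    "ctx._source.day = d.getDayOfMonth();"
)


def build_update_body():
    """Request body for POST your_index/_update_by_query."""
    return {
        "script": {"source": PAINLESS, "lang": "painless"},
        "query": {"match_all": {}},
    }


def split_date(published_date):
    """Plain-Python mirror of the script: return (year, month, day)."""
    d = datetime.strptime(published_date, "%Y-%m-%dT%H:%M:%S")
    return d.year, d.month, d.day


if __name__ == "__main__":
    print(json.dumps(build_update_body(), indent=2))
    print(split_date("2017-03-15T10:30:00"))
```

Note that day here means day of month; a day-of-year getter would return 74 for 15 March, which is probably not what the question wants.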

Related

Insert data when no match by update_by_query in elastic search

I have this request, which doesn't match any document in Elasticsearch. When nothing matches, I want the document to be inserted instead.
//localhost:9200/my_index/my_topic/_update_by_query
{
"script": {
"source": "ctx._source.NAME = params.NAME",
"lang": "painless",
"params": {
"NAME": "kevin"
}
},
"query": {
"terms": {
"_id": [
999
]
}
}
}
I tried using upsert, but it returns the error: Unknown key for a START_OBJECT in [upsert].
I don't want to use update + doc_as_upsert, because in some cases I won't send an id in my update query.
How can I insert this document with update_by_query? Thank you.
If Elasticsearch doesn't support this, I think I will check whether I have an id or not, and use the Index API to create and the Update API to update.
_update_by_query runs on existing documents contained in an existing index. What _update_by_query does is scroll over all documents in your index (that optionally match a query) and perform some logic on each of them via a script or an ingest pipeline.
Hence, logically, you cannot create/upsert data that doesn't already exist in the index. The Index API will always overwrite your document. Upsert only works in conjunction with the _update endpoint, which is probably what you should use.
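A minimal sketch of the "check whether I have an id" approach from the question, written in Python just to build the requests. The function name build_write_request is hypothetical, and the paths assume the typeless URL layout of Elasticsearch 7 and later; versions with mapping types (such as my_topic above) use a different layout.

```python
def build_write_request(index, name, doc_id=None):
    """Return (method, path, body) for a create-or-update of NAME.

    Assumption: typeless ES 7+ style paths.
    """
    if doc_id is None:
        # No id known: let the Index API create a document with a generated id.
        return "POST", f"/{index}/_doc", {"NAME": name}

    # Id known: the Update API runs the script if the document exists,
    # otherwise it indexes the "upsert" document as-is.
    body = {
        "script": {
            "source": "ctx._source.NAME = params.NAME",
            "lang": "painless",
            "params": {"NAME": name},
        },
        "upsert": {"NAME": name},
    }
    return "POST", f"/{index}/_update/{doc_id}", body


if __name__ == "__main__":
    print(build_write_request("my_index", "kevin", doc_id=999))
    print(build_write_request("my_index", "kevin"))
```

The "upsert" key is accepted by the _update endpoint only, which is why _update_by_query rejects it with the Unknown key error.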

Incrementing Datetime field by one day in Elasticsearch Production Cluster using painless script

I am having difficulty with a date field in a production Elasticsearch index. The field is in yyyy-MM-dd format and I want to increment it by one day for all documents in the index, e.g. 2009-07-01 should become 2009-07-02.
I also want to know whether I have to use the reindex API or the update-by-query API.
So far I have tried to update the documents using the following Painless script, but it is not working:
POST klaprod-11042022/_update_by_query
{
"script": {
"source": "def df = DateTimeFormatter.ofPattern('yyyy-MM-dd');def tmp = LocalDateTime.parse(ctx._source.debate_section_date,df);ctx._source.debate_section_date=tmp.plusDays(1);",
"lang": "painless"
},
"query": {"match_all":{}}
}
Any advice from the community is appreciated.
After going through the documentation, the following worked for me. You don't have to use DateTimeFormatter in Painless, because LocalDate can parse the date string directly when it is in raw yyyy-MM-dd format:
POST klaprod-11042022/_update_by_query
{
"script": {
"source": "def tmp = LocalDate.parse(ctx._source.debate_section_date);ctx._source.debate_section_date=tmp.plusDays(1);",
"lang": "painless"
},
"query": {"match_all":{}}
}
You are getting an exception because your date does not contain a time, so it cannot be parsed as a LocalDateTime. You can parse it as a LocalDate and use its atStartOfDay() method to get a LocalDateTime before adding one day.
POST datecheck/_update_by_query
{
"script": {
"source": "def df = DateTimeFormatter.ofPattern('yyyy-MM-dd');def tmp = LocalDate.parse(ctx._source.date,df).atStartOfDay();ctx._source.date=tmp.plusDays(1);",
"lang": "painless"
},
"query": {"match_all":{}}
}
This will generate the date plus one day, with a time component added. If you don't need the time, you can format the date again before assigning it to ctx._source.date in the script.
Regarding whether to use the reindex API or the update-by-query API: if you want to copy the data to a new index, use the reindex API; otherwise use _update_by_query to update the documents in the same index.
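If the stored value should stay a plain yyyy-MM-dd string, one option is to call toString() on the LocalDate before assigning it. The sketch below is an assumption-laden illustration, not a tested production script: Python only builds the request body, the field name debate_section_date is taken from the question, and next_day is a hypothetical plain-Python mirror of what the Painless script should produce.

```python
import json
from datetime import date, timedelta

# Painless script; toString() on a LocalDate yields an ISO yyyy-MM-dd string.
# Assumption: the field is called debate_section_date, as in the question.
PAINLESS = (
    "ctx._source.debate_section_date = "
    "LocalDate.parse(ctx._source.debate_section_date).plusDays(1).toString();"
)


def build_body():
    """Request body for POST <index>/_update_by_query."""
    return {
        "script": {"source": PAINLESS, "lang": "painless"},
        "query": {"match_all": {}},
    }


def next_day(value):
    """Plain-Python mirror of the script for a yyyy-MM-dd string."""
    return (date.fromisoformat(value) + timedelta(days=1)).isoformat()


if __name__ == "__main__":
    print(json.dumps(build_body(), indent=2))
    print(next_day("2009-07-01"))
```

On a production index it is probably worth trying the script against a copy or a narrow query first, since every matching document is rewritten.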

Filtering documents by an unknown value of a field

I'm trying to create a query that filters my documents by one value (it can be any one) of a field (in my case "host.name"). The catch is that I don't know the unique values of this field beforehand. I need to find them and choose one to use in the query.
I tried the query below, which uses a Painless script, but I have not been able to achieve the goal.
{
"sort" : [{"@timestamp": "desc"}, {"host.name": "asc"}],
"query": {
"bool": {
"filter": {
"script": {
"script": {
"source": """
String k = doc['host.name'][0];
return doc['host.name'].value == k;
""",
"lang": "painless"
}
}
}
}
}
}
I'd appreciate it if anyone can help me improve this idea or suggest a new one.
TL;DR you can't.
The script query context operates on one document at a time, so you won't have access to the other docs' field values. You could use a scripted_metric aggregation, which does allow iterating through all docs, but it's just that (an aggregation) and not a query.
I'd suggest to first run a simple terms agg to figure out what values you're working with and then build your queries accordingly.
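The two-step approach can be sketched roughly as follows. Python is used only to build the request bodies and to read the aggregation response; the function names and the sample response are hypothetical, and the sketch assumes host.name is mapped as a keyword field so that a terms aggregation and a term filter work on it.

```python
def build_terms_agg(field="host.name", size=100):
    """Step 1: ask only for the distinct values of the field (no hits)."""
    return {"size": 0, "aggs": {"values": {"terms": {"field": field, "size": size}}}}


def pick_value(response, agg_name="values"):
    """Pick the first bucket key from a terms aggregation response, if any."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return buckets[0]["key"] if buckets else None


def build_filter_query(field, value):
    """Step 2: filter documents by the chosen value."""
    return {"query": {"bool": {"filter": {"term": {field: value}}}}}


if __name__ == "__main__":
    # Hypothetical response shape, trimmed to the part that is used here.
    fake_response = {
        "aggregations": {
            "values": {
                "buckets": [
                    {"key": "web-01", "doc_count": 42},
                    {"key": "web-02", "doc_count": 7},
                ]
            }
        }
    }
    chosen = pick_value(fake_response)
    print(build_terms_agg())
    print(build_filter_query("host.name", chosen))
```

Terms buckets are ordered by document count by default, so taking the first bucket picks the most frequent value; any other bucket would do equally well if "any one" is really all that is needed.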

elastic search query to compare two date fields while fetching

I have two date fields, updated_date and creation date. I'm trying to build a query using BoolQuery or SearchBuilder that returns all records where updation_date is greater than creation date.
I tried a range query, but it didn't work.
Use a script filter, with isAfter() for the greater-than comparison in a Painless script:
"query": {
"bool": {
"filter": {
"script": {
"script": "doc['update_date'].date.isAfter(doc['created_date'].date)"
}
}
}
}
Not sure if you are aware of the head plugin: https://github.com/mobz/elasticsearch-head
Use its structured query tab to create the query. Select the 'Show query source' check box to see the generated query.
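For the script filter itself, here is a hedged variant that guards against documents missing either field and uses .value rather than .date (an assumption about newer Elasticsearch versions, where .value returns a date object that supports isAfter()). Python only builds the request body, and build_date_after_query is a made-up helper name.

```python
def build_date_after_query(later_field, earlier_field):
    """Script filter: keep docs where later_field is after earlier_field."""
    source = (
        "doc[params.later].size() != 0 && doc[params.earlier].size() != 0 && "
        "doc[params.later].value.isAfter(doc[params.earlier].value)"
    )
    return {
        "query": {
            "bool": {
                "filter": {
                    "script": {
                        "script": {
                            "source": source,
                            "lang": "painless",
                            # Field names are passed as params so the script
                            # text stays the same for any pair of fields.
                            "params": {"later": later_field, "earlier": earlier_field},
                        }
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    print(build_date_after_query("update_date", "created_date"))
```

A range query cannot do this because it compares a field with a constant, not with another field of the same document.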

Elasticsearch query based on two values

I am trying to use elasticsearch in order to find documents with a rule based on two doc properties.
Let's say the documents have the following structure:
{
"customer_payment_timestamp" : 14387930787,
"customer_delivery_timestamp" : 14387230787
}
I would like to query these kinds of documents and find all documents where customer_payment_timestamp is greater than customer_delivery_timestamp.
Tried the official documentation, but I couldn't find any relevant example regarding the query itself or a pre-mapped field... is it even possible?
You can achieve this with a script filter like this:
POST index/_search
{
"query": {
"bool": {
"filter": {
"script": {
"script": "doc.customer_payment_timestamp.value > doc.customer_delivery_timestamp.value"
}
}
}
}
}
Note: you need to make sure that dynamic scripting is enabled
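A rough sketch of the same filter with an explicit Painless script and a guard for documents that lack one of the fields. Python is used only to build the body; build_greater_than_query and matches are hypothetical helpers, and matches simply mirrors the script's logic for one document so it can be checked against the sample from the question.

```python
def build_greater_than_query(left, right):
    """Script filter: keep docs where doc[left] > doc[right]."""
    source = (
        "doc[params.left].size() != 0 && doc[params.right].size() != 0 && "
        "doc[params.left].value > doc[params.right].value"
    )
    script = {
        "source": source,
        "lang": "painless",
        "params": {"left": left, "right": right},
    }
    return {"query": {"bool": {"filter": {"script": {"script": script}}}}}


def matches(doc, left, right):
    """Plain-Python mirror of the script for a single document."""
    return left in doc and right in doc and doc[left] > doc[right]


if __name__ == "__main__":
    sample = {
        "customer_payment_timestamp": 14387930787,
        "customer_delivery_timestamp": 14387230787,
    }
    print(build_greater_than_query("customer_payment_timestamp", "customer_delivery_timestamp"))
    print(matches(sample, "customer_payment_timestamp", "customer_delivery_timestamp"))
```

Script filters are evaluated per document, so on a large index it may be cheaper to store the comparison result as its own field at index time and filter on that.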
