How to conditionally update an Elasticsearch document?

I'm using the update endpoint for a document as follows:
POST /<indexname>/_update/<id>
{
  "doc": {
    // the doc here
  }
}
Is there a way I can add a condition, e.g. only apply the update if the created_at field in the existing document is less than the created_at we are passing?

You are probably looking for the update by query API.
e.g.
POST <index>/_update_by_query
{
  "query": {
    "range": {
      "created_at": {
        "lte": "2019-01-10"
      }
    }
  }
}
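A minimal sketch of combining the two, assuming you actually want to overwrite created_at itself (the field name and date are taken from the question): the range clause restricts the update to documents whose created_at is older than the value you pass, and the script applies the change, since _update_by_query without a script only reindexes the matching documents as-is.
POST <index>/_update_by_query
{
  "script": {
    "lang": "painless",
    // overwrite created_at with the value passed in params (assumed goal)
    "source": "ctx._source.created_at = params.new_created_at",
    "params": {
      "new_created_at": "2019-01-10"
    }
  },
  "query": {
    // only touch documents whose created_at is older than the new value
    "range": {
      "created_at": {
        "lt": "2019-01-10"
      }
    }
  }
}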

Related

Example of searching in a date range

I'm looking for a simple example of a query from date to date. For Elasticsearch I have only found complex examples of queries that handle date ranges.
You should always refer to the documentation. The queries look complex, but they actually aren't.
Sample from the documentation:
{
  "query": {                          // query clause
    "range": {                        // type of query
      "timestamp": {                  // field name
        "gte": "2020-01-01T00:00:00", // on or after Jan 1
        "lte": "2020-01-21T00:00:00"  // on or before Jan 21
      }
    }
  }
}
See the Elasticsearch range query documentation.
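The same range query also accepts date math if you prefer relative bounds; a sketch, assuming the field is still called timestamp and your_index is a placeholder:
GET your_index/_search
{
  "query": {
    "range": {
      "timestamp": {
        "gte": "now-21d/d",  // from the start of the day 21 days ago (gte rounds down)
        "lte": "now/d"       // through the end of today (lte rounds up)
      }
    }
  }
}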

Nested query in Strapi GraphQL

I have a document structured as follows, more or less:
post {
  _id
  title
  isPublished
}
user {
  _id
  username
  name
  [posts]
}
I know I can query fields like postConnection and userConnection with the aggregate subfield in order to query a count of all objects. But how do I get the total count of all posts by a given user?
I was able to come up with this:
{
  postsConnection(where: { isPublished: true }) {
    groupBy {
      author {
        key
        connection {
          aggregate {
            count
          }
        }
      }
    }
  }
}
But this returns (expectedly) something like this:
{
  "data": {
    "postsConnection": {
      "groupBy": {
        "author": [
          {
            "key": "5c9136976238de2cc029b5d3",
            "connection": {
              "aggregate": {
                "count": 5
              }
            }
          },
          {
            "key": "5c99d5d5fcf70010b75c07d5",
            "connection": {
              "aggregate": {
                "count": 3
              }
            }
          }
        ]
      }
    }
  }
}
As you can see, it returns post counts for all authors in an array. What I need is to be able to return the count for only one specific user and not by _id (which is what the key field seems to map to) but by another unique field I have in the users collection, i.e. username.
Is that possible?
You need to pass a parameter to either the query or the field so that it returns only the specific data you want.
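A sketch of that idea, assuming Strapi's deep filtering on relations is available and the author relation exposes username (the username value is a placeholder): filter the connection down to one user's published posts and aggregate the count.
{
  # count only published posts whose author has the given username
  postsConnection(where: { isPublished: true, author: { username: "some-user" } }) {
    aggregate {
      count
    }
  }
}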

ElasticSearch remove all documents with over 1000 fields

I'm getting a total-fields-limit error while updating a dev Elasticsearch DB from a LIVE one. I believe it is caused by the live DB sending documents with over 1000 fields in them, while the dev DB's index.mapping.total_fields.limit is set to 1000.
I know I can raise the field limit, but for now I would like to just remove all documents with 1000 or more fields.
I'm guessing I should make a Postman call to the _delete_by_query API with something like:
{
  "query": {
    "range": {
      "fields": {
        "gt": 1000
      }
    }
  }
}
Does anyone know of a simple query that can accomplish this?
You can run a query like this against the LIVE cluster:
POST logger/_delete_by_query
{
  "query": {
    "script": {
      "script": {
        "source": "params._source.size() > 1000"
      }
    }
  }
}
Provided you don't have nested fields/objects, this will delete all documents having more than 1000 fields.
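Before running the delete, it may be worth previewing how many documents would match; a sketch reusing the same script query against _search (logger is the index name from the answer above):
GET logger/_search
{
  "size": 0,                 // no hits needed, just the total
  "track_total_hits": true,  // report the exact count instead of capping at 10000
  "query": {
    "script": {
      "script": {
        "source": "params._source.size() > 1000"
      }
    }
  }
}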

How to search for exact date match in Elasticsearch

I have a couple of items in my ES database with fields containing values like 2020-02-26T05:24:55.757Z. Is it possible, with the URI search (_search?q=...), to search for exact dates? In this case, for example, I would like to find items from 2020-02-26.
Yes, it is possible. You can refer to the query string documentation for more info.
curl localhost:9200/your_index_name/_search?q=your_date_field:%5B2020-02-26%20TO%20*%5D
You need to URL-encode the query. Decoded, the query part looks like q=your_date_field:[2020-02-26 TO *] (square brackets make the lower bound inclusive).
The above query in the REST API would look like:
{
  "query": {
    "range": {
      "your_date_field": {
        "gte": "2020-02-26"
      }
    }
  }
}
For exact dates, the following would work:
curl localhost:9200/your_index_name/_search?q=your_date_field:2020-02-26
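For completeness, the same exact-date search as a request body; the q parameter of URI search maps to a query_string query:
GET your_index_name/_search
{
  "query": {
    "query_string": {
      "query": "your_date_field:2020-02-26"
    }
  }
}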
Although this question is old, I came across it, so maybe others will do so too.
If you want to only work in UTC, you can use a match query, like:
{
  "query": {
    "match": {
      "your_date_field": {
        "query": "2020-02-26"
      }
    }
  }
}
If you need to consider things matching on a particular date in a different timezone, you have to use a range query, like:
{
  "query": {
    "range": {
      "your_date_field": {
        "gte": "2020-02-26",
        "lte": "2020-02-26",
        "time_zone": "-08:00"
      }
    }
  }
}
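As a rough walk-through of what those bounds resolve to, going by the documented rounding rules for dates without a time component (gte rounds down, lte rounds up):
"gte": "2020-02-26" with "time_zone": "-08:00"  →  2020-02-26T00:00:00-08:00  (2020-02-26T08:00:00Z)
"lte": "2020-02-26" with "time_zone": "-08:00"  →  2020-02-26T23:59:59.999-08:00  (2020-02-27T07:59:59.999Z)
so the query covers the whole calendar day in that time zone.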

Elastic Search Date Range

I have a query that properly parses date ranges. However, my database has a default value such that all dates have a timestamp of 00:00:00. This means that items that are still valid today are shown as expired, even though they should still be valid. How can I adjust the following to look at just the date, and not the time, of the item (expirationDate)?
{
  "range": {
    "expirationDate": {
      "gte": "now"
    }
  }
}
An example of the data is:
"expirationDate": "2014-06-24T00:00:00.000Z",
Did you look into the different format options for dates stored in Elasticsearch? If that does not work for you, or you don't want to store dates without the time, you can try this query, which should work for your exact use case:
{
  "range": {
    "expirationDate": {
      "gt": "now-1d"
    }
  }
}
You can also round down the time so that your query returns anything that occurred since the beginning of the day:
Assuming that now is 2017-03-07T07:00:00.000 and now/d is 2017-03-07T00:00:00.000, your query would be:
{
  "range": {
    "expirationDate": {
      "gte": "now/d"
    }
  }
}
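As a complete request (your_index is a placeholder), since the snippets in this answer show only the bare range clause:
GET your_index/_search
{
  "query": {
    "range": {
      "expirationDate": {
        "gte": "now/d"
      }
    }
  }
}
With gte "now/d", items whose expirationDate is today's midnight still count as valid, which is exactly the asker's case.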
See the Elasticsearch documentation on rounding times.
