Insert data when update_by_query matches nothing in Elasticsearch

I have this command that doesn't match any document in Elasticsearch, and when that happens I want to insert the document instead.
//localhost:9200/my_index/my_topic/_update_by_query
{
  "script": {
    "source": "ctx._source.NAME = params.NAME",
    "lang": "painless",
    "params": {
      "NAME": "kevin"
    }
  },
  "query": {
    "terms": {
      "_id": [
        999
      ]
    }
  }
}
I tried using upsert, but it returns the error Unknown key for a START_OBJECT in [upsert].
I don't want to use update with doc_as_upsert, because in some cases I won't send an id in my update query.
How can I insert this document with update_by_query? Thank you.
If Elasticsearch doesn't support this, I think I will check whether an id is present, and use the Index API to create and the Update API to update.

_update_by_query runs on existing documents in an existing index. It scrolls over all documents in your index (optionally only those that match a query) and performs some logic on each of them via a script or an ingest pipeline.
Hence, logically, you cannot create or upsert data that doesn't already exist in the index. The Index API will always overwrite your document. Upsert only works in conjunction with the _update endpoint, which is probably what you should use.
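The fallback the asker describes (upsert through _update when an id is known, a plain index call otherwise) could be sketched as a small request builder. This is only a sketch under stated assumptions: build_write_request is a hypothetical helper, and the paths assume the typeless endpoints of Elasticsearch 7 and later (/{index}/_update/{id} and /{index}/_doc), not the my_index/my_topic layout used in the question. It builds the request and sends nothing.

```python
from typing import Any, Dict, Optional, Tuple
from urllib.parse import quote


def build_write_request(
    index: str, doc: Dict[str, Any], doc_id: Optional[str] = None
) -> Tuple[str, str, Dict[str, Any]]:
    """Return (HTTP method, path, JSON body) for writing `doc`.

    Hypothetical helper, assuming Elasticsearch 7+ typeless endpoints.
    """
    if doc_id is not None:
        # doc_as_upsert: merge `doc` into the existing document, or
        # create the document from `doc` if the id does not exist yet.
        path = f"/{index}/_update/{quote(str(doc_id), safe='')}"
        return "POST", path, {"doc": doc, "doc_as_upsert": True}
    # No id supplied: POST to _doc so Elasticsearch generates a new id.
    return "POST", f"/{index}/_doc", doc
```

Calling build_write_request("my_index", {"NAME": "kevin"}, "999") yields an _update request with doc_as_upsert set, while omitting the id yields an index request that always creates a new document.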

Related

Filtering documents by an unknown value of a field

I'm trying to create a query that filters my documents by one value (any one will do) of a field, in my case "host.name". The catch is that I don't know the unique values of this field in advance. I need to find them and pick one to use in the query.
I tried the query below, which uses a Painless script, but I have not been able to achieve the goal.
{
  "sort": [{"@timestamp": "desc"}, {"host.name": "asc"}],
  "query": {
    "bool": {
      "filter": {
        "script": {
          "script": {
            "source": """
              String k = doc['host.name'][0];
              return doc['host.name'].value == k;
            """,
            "lang": "painless"
          }
        }
      }
    }
  }
}
I'd appreciate it if anyone could help me improve this idea or suggest a new one.
TL;DR you can't.
The script query context operates on one document at a time, so you won't have access to the other docs' field values. You could use a scripted_metric aggregation, which does allow iterating through all docs, but it's just that: an aggregation, not a query.
I'd suggest first running a simple terms agg to figure out what values you're working with and then building your queries accordingly.
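The two-step approach suggested above might look like the following sketch. The helper names and the aggregation name distinct_values are made up for illustration, and it assumes host.name is mapped as a keyword field so that a terms aggregation and a term filter both work on it. The functions only build request bodies and read a response dict; sending the requests is left to whatever client you use.

```python
from typing import Any, Dict, List


def build_terms_agg_body(field: str, size: int = 100) -> Dict[str, Any]:
    """Step 1 body: return no hits, only up to `size` distinct values of `field`."""
    return {
        "size": 0,
        "aggs": {"distinct_values": {"terms": {"field": field, "size": size}}},
    }


def extract_values(response: Dict[str, Any]) -> List[Any]:
    """Read the bucket keys of the terms aggregation from a search response."""
    buckets = response["aggregations"]["distinct_values"]["buckets"]
    return [bucket["key"] for bucket in buckets]


def build_filter_body(field: str, value: Any) -> Dict[str, Any]:
    """Step 2 body: keep only documents whose `field` equals the chosen value."""
    return {"query": {"bool": {"filter": {"term": {field: value}}}}}
```

A caller would send the first body, pick any entry from extract_values(response), and pass it to build_filter_body for the second search.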

Return which field got matched in Elastic Search

I am trying to find out which field actually matched for each document that a search returns.
For example, I have a table index with fields called table_name and column_name.
My search query looks at both of those fields. When I run it and either one of them matches, I want to know which one matched: column_name or table_name.
I am aware of the Explain API, but that would require me to call another API.
You don't need to call the Explain API. The search API supports the explain flag:
GET stackoverflow/_search?explain=true
This will return the _explanation section along with the _source section.
Update
Another solution would be to use highlight. I've used this before for manually evaluating queries. It's an easy way to get some feedback on what matched:
GET stackoverflow/_search
{
  "query": {
    "match": {
      "FIELD": "TEXT"
    }
  },
  "highlight": {
    "fields": {
      "*": {}
    }
  }
}
Of course, you can set the explain flag as well.
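With highlighting enabled, each hit in the response carries a highlight object keyed by the fields that produced a match, so the matched fields can be read off on the client side. A minimal sketch, assuming the response has already been parsed into a dict; matched_fields is a hypothetical helper, not part of any Elasticsearch client:

```python
from typing import Any, Dict, List


def matched_fields(response: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each hit's _id to the sorted names of the fields that were highlighted."""
    result: Dict[str, List[str]] = {}
    for hit in response.get("hits", {}).get("hits", []):
        # A hit with no highlighted field has no "highlight" key at all.
        result[hit["_id"]] = sorted(hit.get("highlight", {}).keys())
    return result
```

For the table index from the question, a hit that matched only on column_name would map to ["column_name"], and a hit that matched on both fields would list both.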

$pull and $push Query in Elastic Search

MongoDB supports the $pull and $push operators to remove an element from an array or add one to it. Is there a similar kind of query in Elasticsearch for adding or removing an array element?
You can use a script for this and apply it with the _update_by_query API.
{
  "script": {
    "source": "ctx._source.<name_of_array>.add(params.<name_of_variable>)",
    "params": {"<name_of_variable>": 1}
  },
  "query": {
    // specify a query here if the update should apply only to matching documents
  }
}
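The answer above only shows the $push side. A sketch of both directions as _update_by_query bodies follows, with the $pull side using removeIf on the list inside the Painless script. The helper names are hypothetical, the array field name is interpolated into the script and so must be a trusted, simple field name, and the default match_all query is an assumption for when no filter is given.

```python
from typing import Any, Dict, Optional


def _body(source: str, value: Any, query: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Wrap a Painless script and a query into an _update_by_query body."""
    return {
        "script": {"source": source, "lang": "painless", "params": {"value": value}},
        # Assumption: update every document when no query is supplied.
        "query": query if query is not None else {"match_all": {}},
    }


def build_push_body(
    array_field: str, value: Any, query: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Rough $push equivalent: append `value` to the array."""
    return _body(f"ctx._source.{array_field}.add(params.value)", value, query)


def build_pull_body(
    array_field: str, value: Any, query: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Rough $pull equivalent: remove every element equal to `value`."""
    source = f"ctx._source.{array_field}.removeIf(item -> item == params.value)"
    return _body(source, value, query)
```

Both scripts assume the array field already exists on every matched document; a document without it would make the script fail, so the query should exclude such documents.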

ElasticSearch - Delete documents by specific field

This seemingly simple task is not well-documented in the ElasticSearch documentation:
We have an ElasticSearch instance with an index that has a field in it called sourceId. What API calls would I make to first GET all documents with 100 in the sourceId field (to verify the results before deletion) and then DELETE those same documents?
You probably need to make two API calls here: the first to view the count of documents, the second to perform the deletion.
The query is the same in both calls, but the endpoints are different. I'm also assuming that sourceId is of type keyword.
Query to Verify
POST <your_index_name>/_search
{
  "size": 0,
  "query": {
    "term": {
      "sourceId": "100"
    }
  }
}
Execute the above term query and take note of hits.total in the response.
Remove "size": 0 from the above query if you want to see the full documents in the response.
Once you have the details, you can go ahead and perform the deletion using the same query, as shown below. Note the different endpoint.
Query to Delete
POST <your_index_name>/_delete_by_query
{
  "query": {
    "term": {
      "sourceId": "100"
    }
  }
}
Once you execute the delete by query, check the deleted field in the response. It should show the same number.
I've used term queries here, but you can also use a match query or a more complex bool query. Just make sure that the query is correct.
Hope it helps!
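The comparison between hits.total and deleted can also be done in code. A sketch under a few assumptions: the function names are hypothetical, both responses are already parsed into dicts, and the verify body sets track_total_hits so that Elasticsearch 7 and later report an exact count. The helper accepts both shapes of hits.total (a plain number in older versions, an object with a value key in 7 and later).

```python
from typing import Any, Dict


def build_verify_body(field: str, value: Any) -> Dict[str, Any]:
    """Count-only search body for the documents that are about to be deleted."""
    return {"size": 0, "track_total_hits": True, "query": {"term": {field: value}}}


def build_delete_body(field: str, value: Any) -> Dict[str, Any]:
    """Body for _delete_by_query, using the same term query as the verify step."""
    return {"query": {"term": {field: value}}}


def total_hits(search_response: Dict[str, Any]) -> int:
    """Read hits.total whether it is a plain number or {"value": N, ...}."""
    total = search_response["hits"]["total"]
    return total["value"] if isinstance(total, dict) else total


def deletion_is_complete(
    search_response: Dict[str, Any], delete_response: Dict[str, Any]
) -> bool:
    """True when the delete removed exactly as many documents as the search counted."""
    return total_hits(search_response) == delete_response["deleted"]
```

If documents are indexed or removed between the two calls, the numbers can legitimately differ, so a mismatch is a prompt to investigate and not necessarily an error.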
To delete all the documents of an index without deleting the mapping and settings:
POST /my_index/_delete_by_query?conflicts=proceed&pretty
{
  "query": {
    "match_all": {}
  }
}
See: https://opster.com/guides/elasticsearch/search-apis/elasticsearch-delete-by-query/

Elasticsearch DSL Query for Update

I understand that I can update a particular document via http://localhost:9200/[index_name]/[index_type]/[_id], but I have documents whose _id contains # symbols, and Sense can't find them.
I understand that with the Query DSL I can run a search that specifies the _id in the request body instead of in the URL.
Resource: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-ids-query.html
How can I do the same when updating a document?
If you don't want to put the ID in the URL, the only option you have is to use the update by query API, like this:
POST index/_update_by_query
{
  "query": {
    "ids": {
      "values": ["2323#23423"]
    }
  },
  "script": {
    "source": "do some update here"
  }
}
Alternatively, keep the ID in the URL and percent-encode the # as %23, as in localhost:9200/index/type/ID%23.
So if the _id is 10#, the URL will look like localhost:9200/index/type/10%23
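The percent-encoding does not have to be done by hand. A small sketch using Python's standard urllib.parse.quote, with a hypothetical document_path helper that follows the index/type/ID layout used above:

```python
from urllib.parse import quote


def document_path(index: str, doc_type: str, doc_id: str) -> str:
    """Build /index/type/id with the id percent-encoded for use in a URL."""
    # safe="" also encodes "/", so an id can never add extra path segments.
    return f"/{index}/{doc_type}/{quote(doc_id, safe='')}"


print(document_path("index", "type", "2323#23423"))  # /index/type/2323%2323423
```

Encoding the whole id this way also covers other reserved characters such as ? and /, not only #.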
