Elasticsearch Update by Query

I am trying to update several documents based on a search query with ES version 2.3.4. My use case is to search for documents where two fields match certain values and then add a new field with a certain value. So let's say I want to search all employees with first name "John" and last name "Smith" and add a new field "job" to their profiles with the value "Engineer" in it.
So my first question is whether it is possible to do this using the "doc" option with the update_by_query API (the same way as with the update API).
If not, and a script must be used (which is the way I'm doing it now), then maybe somebody can help me get rid of the following error:
{"error":{"root_cause":[{"type":"class_cast_exception","reason":"java.lang.String cannot be cast to java.util.Map"}],"type":"class_cast_exception","reason":"java.lang.String cannot be cast to java.util.Map"},"status":500}
The code I'm using looks as follows:
curl -XPOST -s 'http://localhost:9200/test_index/_update_by_query?conflicts=proceed' -d'
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "bool": {
          "must": [
            {
              "match": { "first_name" : "John" }
            },
            {
              "match": { "last_name" : "Smith" }
            }
          ]
        }
      }
    }
  },
  "script" : "ctx._source.job = \"Engineer\""
}'
When sending the same query (without the "script" field) using the count API, no error is reported and the correct number of documents is returned.

The script has to be passed as an object with an "inline" key rather than as a plain string. The correct syntax is this:
"script": {
  "inline": "ctx._source.job = \"Engineer\""
}
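As a rough illustration of the fix, here is a minimal Python sketch that builds the full request body with the script wrapped in an object. It only constructs and prints the JSON; it does not talk to a cluster. The helper name with_inline_script is made up for this example, and the query is a bool/must rewrite of the one in the question, so treat it as an assumption rather than the only valid form.

```python
import json


def with_inline_script(query, inline):
    """Combine a query with an inline script in the object form.

    Hypothetical helper: the point is that "script" is a dict with an
    "inline" key, not a bare string.
    """
    return {"query": query, "script": {"inline": inline}}


# Same intent as the question: John Smith gets job = "Engineer".
query = {
    "bool": {
        "must": [
            {"match": {"first_name": "John"}},
            {"match": {"last_name": "Smith"}},
        ]
    }
}

body = with_inline_script(query, 'ctx._source.job = "Engineer"')

# This JSON string is what would go after -d in the curl command.
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into the curl call from the question in place of the original body.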

Related

Problem searching domain with elastic search

I have registered the following document
"ownDomainValue":"catalogonuevo1.com"
When I perform the following query the document is found, value is "catalogonuevo1"
[
  {
    "query": {
      "bool": {
        "filter": [
          {
            "term": {
              "valor_dominio_propio": "catalogonuevo1"
            }
          }
        ]
      }
    },
    "from": 0,
    "size": 1
  }
]
However, when the search value is "catalogonuevo1.com"
[
  {
    "query": {
      "bool": {
        "filter": [
          {
            "term": {
              "valor_dominio_propio": "catalogonuevo1.com"
            }
          }
        ]
      }
    },
    "from": 0,
    "size": 1
  }
]
it does not return any results. With a match query the opposite happens: it always finds a wrong document, such as one with the value "catalogonuevo2.com", which is not what I am looking for, since I need the search to be exact.
It sounds like the problem is not the "term" query itself but how the field was indexed. A "term" query does not analyze its input; it looks for the exact value "catalogonuevo1.com" among the indexed tokens.
If the field is an analyzed text field, the stored value was most likely split at the "." at index time into the tokens "catalogonuevo1" and "com", so the full string "catalogonuevo1.com" matches no token while "catalogonuevo1" does. A "match" query analyzes the search value as well, which would explain why it also finds "catalogonuevo2.com": both values share the token "com".
You can work around this by using the "match_phrase" query instead of "term", since "match_phrase" requires all tokens to appear in the same order rather than any single token.
For a truly exact match, store the domain values in a keyword field; keyword values are not tokenized, so the "term" query will work as expected.
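To make the difference concrete, here is a toy Python model of the three cases. It is only a sketch: toy_analyze is a crude stand-in that lowercases and splits on anything that is not an ASCII letter or digit, which is an assumption and not how the real analyzers are implemented. All function names are made up for the illustration.

```python
import re


def toy_analyze(text):
    """Crude stand-in for an analyzer: lowercase, split on non-alphanumerics."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def term_on_text(indexed_value, query_value):
    # term query: the query value is NOT analyzed, it is compared
    # verbatim against the tokens produced at index time.
    return query_value in toy_analyze(indexed_value)


def match_on_text(indexed_value, query_value):
    # match query: the query value is analyzed too, and by default
    # any shared token is enough for a hit.
    return bool(set(toy_analyze(query_value)) & set(toy_analyze(indexed_value)))


def term_on_keyword(indexed_value, query_value):
    # keyword field: the whole value is stored as a single token.
    return indexed_value == query_value


print(toy_analyze("catalogonuevo1.com"))
print(term_on_text("catalogonuevo1.com", "catalogonuevo1"))
print(term_on_text("catalogonuevo1.com", "catalogonuevo1.com"))
print(match_on_text("catalogonuevo2.com", "catalogonuevo1.com"))
print(term_on_keyword("catalogonuevo1.com", "catalogonuevo1.com"))
```

In this model the term lookup on the text field succeeds for "catalogonuevo1" and fails for the full domain, the match query hits the wrong document through the shared "com" token, and only the keyword comparison is exact.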

Elastic Update By Query Updated Entire Index Instead

I am trying to get a query to work that will update a specific field in a document, provided it matches a query (in this example, where one field matches an exact value).
Here I am trying to query all documents that have the field "Foo" set to "Bar", and set the field "TextField5" in each of them to 1337. Only a handful of documents in the index match this. However, when I run this query, every document in the index has its TextField5 updated.
POST /threat_vuln/_update_by_query
{
  "query": {
    "match": {
      "Foo": "Bar"
    }
  },
  "script" : {
    "source" : "ctx._source.TextField5='1337';",
    "lang" : "painless"
  }
}
I've gone over the Update API and Update By Query API and am still missing something. How can I change this to only update documents that match the query?
I'm on Kibana 7.4.0
EDIT: Also tried this, which still updates every document in the index instead of those matching the query:
POST /threat_vuln/_update_by_query
{
  "query": {
    "bool" : {
      "must": [
        {
          "match": {
            "Foo": "Bar"
          }
        }
      ]
    }
  },
  "script" : {
    "source" : "ctx._source.TextField5='1337';",
    "lang" : "painless"
  }
}
I got this to work as intended:
POST /threat_vuln/_update_by_query
{
  "query": {
    "bool" : {
      "must": [
        {
          "match": {
            "Foo.keyword": "Bar"
          }
        }
      ]
    }
  },
  "script" : {
    "source" : "ctx._source.TextField5='1337';",
    "lang" : "painless"
  }
}
I still don't understand how/why the examples in the question would just go ahead and update everything with what now appears to be a query that should return nothing, but I digress.
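One way to guard against updating more than intended is to send the same query to the _count endpoint first and compare the number with what you expect. Below is a minimal Python sketch that only builds the two request bodies; it does not contact a cluster, and count_body_for is a hypothetical helper name invented for this example.

```python
def count_body_for(update_body):
    """Return a body for the _count API that reuses the update's query.

    Hypothetical helper: it copies only the "query" part, because a
    count request has no use for the "script" section.
    """
    return {"query": update_body["query"]}


# The working update from above.
update_body = {
    "query": {"bool": {"must": [{"match": {"Foo.keyword": "Bar"}}]}},
    "script": {"source": "ctx._source.TextField5='1337';", "lang": "painless"},
}

# Send this to POST /threat_vuln/_count before running the update.
count_body = count_body_for(update_body)
print(count_body)
```

If the count comes back as the size of the whole index, the query is not filtering the way you think and the update should not be run yet.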

ElasticSearch must-terms does not return data

The "terms" clause inside my Elasticsearch "must" does not work: the data has the clientId value "08d71bc7-c4ab-6e1d-f858-cf3448242e8b", but the result is empty. I am using elasticsearch:6.7.1. Do you know what the problem is here?
{
  "from": 0,
  "size": 20,
  "query": {
    "bool": {
      "must": [
        { "terms": { "clientId": ["08d71bc7-c4ab-6e1d-f858-cf3448242e8b", "08d71bc7-c4ab-6e1d-f858-cf3448242e8c"] } },
        {
          "query_string": {
            "query": "*d*",
            "fields": ["name", "description", "title"]
          }
        },
        { "query_string": { "query": "1", "fields": ["type"] } }
      ]
    }
  }
}
I share sample data
I haven't worked much with "query_string", but if you remove those clauses and run your query, I'm fairly sure it will return at least some results. If so, the "query_string" clauses are the ones causing the trouble.
First, I recommend using "filter" instead of "must".
Consider using the Regexp query in place of your first "query_string". I found here how to query multiple fields with Regexp.
For the second, it would be enough to use "term" instead of "query_string".
Hope this is helpful! :D
The search results depend on how clientId is analyzed. If clientId is a 'keyword' field, your query should work as expected, but if its type is 'text', the value may be tokenized into smaller parts (split at the dashes).
You can check the clientId field's type in the index mappings, and also run the analyze API to check the tokenization: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-analyze.html
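A small Python sketch of that check, under a few assumptions: analyze_body builds the JSON you would send to the index's _analyze endpoint, and is_single_token inspects the "tokens" list of the response. Both helper names are made up, and the two sample responses are written by hand to show the shape; they were not captured from a real cluster.

```python
def analyze_body(field, text):
    """Body for GET /<index>/_analyze that analyzes `text` as `field` would."""
    return {"field": field, "text": text}


def is_single_token(analyze_response, value):
    """True if the analyzer kept `value` as one unchanged token."""
    tokens = [t["token"] for t in analyze_response.get("tokens", [])]
    return tokens == [value]


client_id = "08d71bc7-c4ab-6e1d-f858-cf3448242e8b"
body = analyze_body("clientId", client_id)

# Hand-written example responses (illustrative only).
keyword_like = {"tokens": [{"token": client_id}]}
text_like = {"tokens": [{"token": part} for part in client_id.split("-")]}

print(is_single_token(keyword_like, client_id))
print(is_single_token(text_like, client_id))
```

If the real response looks like the second sample, a "terms" clause with the full ID cannot match, and querying a keyword version of the field is the usual way out.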

Is it possible to return a specific field when running a query in sense for elasticsearch

I have loaded some data into Elasticsearch and written a query against it; however, the results contain all of the data for each matching record. Is it possible to filter the results to show only a particular field?
Example
Query to find all records for a specific country but return only a list of registration numbers.
All the data is available in Elasticsearch; however, I get a full JSON record back for each match.
I'm running this query in SENSE (within Kibana 4.5.0).
The query is...
GET _search
{
  filter_path=reg_no.*,
  "fields" : ["reg_no"],
  "query" : {
    "fields" : ["country_cd", "oprg_stat"],
    "query" : "956 AND 9074"
  }
}
If I remove the two lines
filter_path=reg_no.*,
"fields" : ["reg_no"],
the query runs but brings back all the data.
Try this query:
POST _search
{
  "_source": [
    "reg_no"
  ],
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "country_cd": "956"
          }
        },
        {
          "term": {
            "oprg_stat": "9074"
          }
        }
      ]
    }
  }
}
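If you then want a plain list of registration numbers on the client side, something like the following Python sketch could work. It assumes the usual search response layout, where each hit carries its filtered document under "_source". The helper names and the sample response (with made-up values REG-1 and REG-2) are illustrative only.

```python
def search_body(country_cd, oprg_stat, fields):
    """Build the search body from above, returning only `fields` per hit."""
    return {
        "_source": fields,
        "query": {
            "bool": {
                "filter": [
                    {"term": {"country_cd": country_cd}},
                    {"term": {"oprg_stat": oprg_stat}},
                ]
            }
        },
    }


def extract_field(response, field):
    """Collect one field from every hit that has it in its _source."""
    hits = response.get("hits", {}).get("hits", [])
    return [h["_source"][field] for h in hits if field in h.get("_source", {})]


body = search_body("956", "9074", ["reg_no"])

# Hand-written sample response with invented values, for illustration.
sample_response = {
    "hits": {
        "hits": [
            {"_source": {"reg_no": "REG-1"}},
            {"_source": {"reg_no": "REG-2"}},
        ]
    }
}

reg_numbers = extract_field(sample_response, "reg_no")
print(reg_numbers)
```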

Elasticsearch grouping facet by owner, mine vs others

I am using Elasticsearch to index documents that have an owner which is stored in a userId property of the source object. I can easily do a facet on the userId and get facets for each owner that there is, but I'd like to have the facets for owner show up like so:
Documents owned by me (X)
Documents owned by others (Y)
I could handle this on the client side and take all of the facets returned by elasticsearch and go through them and figure out those owned by the current user and not and display it appropriately, but I was hoping there was a way to tell elasticsearch to handle this in the query itself.
You can use filtered facets to do this:
curl -XGET "http://localhost:9200/_search" -d'
{
  "query": {
    "match_all": {}
  },
  "facets": {
    "my_docs": {
      "filter": {
        "term": { "user_id": "my_user_id" }
      }
    },
    "others_docs": {
      "filter": {
        "not": {
          "term": { "user_id": "my_user_id" }
        }
      }
    }
  }
}'
One of the nice things about this is that the two term filters are identical and so are only executed once. The not filter just inverts the results of the cached term filter.
You're right, Elasticsearch has a way to do that. Take a look at scripting term facets, especially the second example ("using the boolean feature"). You should be able to do something like:
{
  "query" : {
    "match_all" : { }
  },
  "facets" : {
    "userId" : {
      "terms" : {
        "field" : "userId",
        "size" : 10,
        "script" : "term == '<your user id>' ? true : false"
      }
    }
  }
}
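Note that facets were removed in Elasticsearch 2.0 in favor of aggregations, so on newer versions the same "mine vs. others" split would need an aggregation. Here is a rough Python sketch of a request body using the filters aggregation with two named buckets. It assumes userId is indexed so that a term query matches it exactly; the function name and the bucket names "mine" and "others" are made up for the example.

```python
def owner_buckets_body(current_user_id):
    """Build a search body that counts documents owned by me vs. by others."""
    mine = {"term": {"userId": current_user_id}}
    return {
        "size": 0,  # only the bucket counts are needed, not the hits
        "query": {"match_all": {}},
        "aggs": {
            "owner": {
                "filters": {
                    "filters": {
                        "mine": mine,
                        # everything that does NOT match the same term
                        "others": {"bool": {"must_not": mine}},
                    }
                }
            }
        },
    }


body = owner_buckets_body("my_user_id")
print(body["aggs"]["owner"]["filters"]["filters"].keys())
```

Each named bucket in the response should then carry a doc_count that can be shown as X and Y.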

Resources