Elasticsearch Updating field value issue - elasticsearch

I am trying to use update_by_query but cannot get it to work.
The following simple search query works as expected:
curl -X GET "172.17.0.3:9200/useripvsuserid/_search?pretty" -H 'Content-Type: application/json' -d'
{
"_source":"userid","query": {
"term": {
"userip": "10.0.30.181"
}
}
}
'
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 10.803431,
"hits" : [
{
"_index" : "useripvsuserid",
"_type" : "_doc",
"_id" : "PhfBW3AB8mhGfmGvIs-j",
"_score" : 10.803431,
"_source" : {
"userid" : "hasan1855"
}
}
]
}
}
The following update_by_query does not work. I am trying to change the userid value from hasan1855 to arif. Where is the problem?
curl -X POST "172.17.0.3:9200/useripvsuserid/_update_by_query?pretty" -H 'Content-Type: application/json' -d'
{
"script": {
"source": "ctx._source.userid='arif';",
"lang": "painless"
},
"query": {
"term": {
"userip": "10.0.30.181"
}
}
}
'
{
"error" : {
"root_cause" : [
{
"type" : "script_exception",
"reason" : "compile error",
"script_stack" : [
"ctx._source.userid=arif;",
" ^---- HERE"
],
"script" : "ctx._source.userid=arif;",
"lang" : "painless"
}
],
"type" : "script_exception",
"reason" : "compile error",
"script_stack" : [
"ctx._source.userid=arif;",
" ^---- HERE"
],
"script" : "ctx._source.userid=arif;",
"lang" : "painless",
"caused_by" : {
"type" : "illegal_argument_exception",
"reason" : "Variable [arif] is not defined."
}
},
"status" : 400
}

The single quotes around arif conflict with the single quotes that enclose the JSON body on the command line. The shell strips them, so Elasticsearch receives ctx._source.userid=arif; (as the script_stack in the error shows) and Painless treats arif as an undefined variable.
You can either save the query to a file and send it in binary mode (--data-binary @file), or escape the quotes, like this:
curl -X POST "172.17.0.3:9200/useripvsuserid/_update_by_query?pretty" -H 'Content-Type: application/json' -d'
{
"script": {
"source": "ctx._source.userid = \"arif\";",
"lang": "painless"
},
"query": {
"term": {
"userip": "10.0.30.181"
}
}
}
'
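To see what the shell actually hands to curl, a minimal sketch like the following can help. It only prints strings and does not contact Elasticsearch; the variable names are made up for illustration.

```shell
# Each assignment mimics the text that would follow -d on the curl command line.

# Inner single quotes end the outer quoting, so the shell drops them.
broken='{"source": "ctx._source.userid='arif';"}'

# Escaped double quotes survive, and Painless accepts "arif" as a string literal.
escaped='{"source": "ctx._source.userid=\"arif\";"}'

# The '\'' idiom closes the quoting, adds a literal single quote, and reopens it.
idiom='{"source": "ctx._source.userid='\''arif'\'';"}'

printf '%s\n' "$broken" "$escaped" "$idiom"
```

The first printed line matches the script shown in the error message, with arif left unquoted.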

Related

ElasticSearch aggregation shows unexpected result for SUM

I am trying to apply a sum aggregation in ES 7.14 and get an unexpected result.
1. Prepare the dataset
$ cat products.json
{"index":{"_id":"1"}}
{"productId": 10,"shopId": 45,"prices": {"retailPrice": 525000000.02,"sumRetailPrice": 5250000000.2},"count": 10}
{"index":{"_id":"2"}}
{"productId": 10,"shopId": 48,"prices": {"retailPrice": 26250000004,"sumRetailPrice": 5250000000.8},"count": 20}
2. Bulk insert
curl -XPOST localhost:9200/25products/_bulk -H "Content-Type: application/x-ndjson" --data-binary @./products.json
3. View the mapping
curl -XGET "http://localhost:9200/25products/_mapping?pretty"
{
"25products" : {
"mappings" : {
"properties" : {
"count" : {
"type" : "long"
},
"prices" : {
"properties" : {
"retailPrice" : {
"type" : "float"
},
"sumRetailPrice" : {
"type" : "float"
}
}
},
"productId" : {
"type" : "long"
},
"shopId" : {
"type" : "long"
}
}
}
}
}
4. Sum field "prices.sumRetailPrice" in Painless
curl --location --request POST 'http://localhost:9200/25products/_search?pretty' \
--header 'Content-Type: application/json' \
--data-raw '{
"aggs": {"sumSupplyPrice": {
"sum": {"script": {
"source": "(!doc.containsKey('\''prices.sumRetailPrice'\'') ? 0 : (doc['\''prices.sumRetailPrice'\''].size() == 0 ? 0: doc['\''prices.sumRetailPrice'\''].value))"
}}
}},
"query": {"bool": {
"filter": [
{"terms": {"shopId": [45]}},
{"terms": {"productId": [10]}}
]
}},
"from": 0, "size": 10
}'
The result is:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.0,
"hits" : [
{
"_index" : "25products",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.0,
"_source" : {
"productId" : 10,
"shopId" : 45,
"prices" : {
"retailPrice" : 5.2500000002E8,
"sumRetailPrice" : 5.2500000002E9
},
"count" : 10
}
}
]
},
"aggregations" : {
"sumSupplyPrice" : {
"value" : 5.249999872E9
}
}
}
5. Expectation
Since only a single record matches, I expect the sum to equal that record's sumRetailPrice:
"aggregations" : {
"sumSupplyPrice" : {
"value" : 5.2500000002E9
}
}
But the actual result is different:
"aggregations" : {
"sumSupplyPrice" : {
"value" : 5.249999872E9
}
}
Where am I wrong?
Thanks!
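One likely explanation, offered as a sketch rather than a confirmed diagnosis: the mapping in step 3 shows prices.sumRetailPrice as float, which is a 32-bit type with a 24-bit significand. Between 2^32 and 2^33, adjacent float values are 512 apart, so 5250000000.2 cannot be stored exactly, while _source still shows the original JSON. The arithmetic below reproduces the aggregated value with integer math only; it assumes a shell with 64-bit arithmetic (for example bash or dash on a 64-bit system).

```shell
# Integer part of sumRetailPrice; the .2 fraction is far smaller than the spacing.
value=5250000000

# Gap between neighbouring 32-bit floats in [2^32, 2^33): 2^(32-23) = 512.
spacing=512

# Round to the nearest multiple of the spacing.
nearest=$(( (value + spacing / 2) / spacing * spacing ))

echo "$nearest"
```

If this is indeed the cause, mapping the field as double (or scaled_float) in a new index and reindexing should bring the sum much closer to the expected value.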

How do I create a default mapping for a field on my documents, that will not be made redundant in the next major version of Elasticsearch?

I'm on Elasticsearch 7.14.0 where mapping types have been removed.
Following from this question I have learned that the generic URI to PUT documents is /[index]/_doc/[id].
I want to create a default mapping for my documents on the name field:
curl -X PUT "localhost:9200/products?pretty" -H 'Content-Type: application/json' -d'
{
"mappings":{
"properties":{
"name":{
"analyzer":"edge_ngram_analyzer",
"search_analyzer":"standard",
"type":"text"
}
}
},
"settings":{
"analysis":{
"filter":{
"edge_ngram":{
"type":"edge_ngram",
"min_gram":"2",
"max_gram":"25",
"token_chars":[
"letter",
"digit"
]
}
},
"analyzer":{
"edge_ngram_analyzer":{
"filter":[
"lowercase",
"edge_ngram"
],
"tokenizer":"standard"
}
}
}
}
}
'
However, creating a new document doesn't seem to apply the analyzer:
curl -X PUT "localhost:9200/products/_doc/1?pretty" -H 'Content-Type: application/json' -d'
{
"name": "Toast"
}
'
curl -X GET "localhost:9200/products/_search?pretty"
{
"took" : 1026,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "products",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"name" : "Toast"
}
}
]
}
}
I've tried creating the mapping under the _doc type, but am getting the following error:
curl -X PUT "localhost:9200/products?pretty" -H 'Content-Type: application/json' -d'
{
"mappings":{
"_doc":{
"properties":{
"name":{
"analyzer":"edge_ngram_analyzer",
"search_analyzer":"standard",
"type":"text"
}
}
}
},
"settings":{
"analysis":{
"filter":{
"edge_ngram":{
"type":"edge_ngram",
"min_gram":"2",
"max_gram":"25",
"token_chars":[
"letter",
"digit"
]
}
},
"analyzer":{
"edge_ngram_analyzer":{
"filter":[
"lowercase",
"edge_ngram"
],
"tokenizer":"standard"
}
}
}
}
}
'
{
"error" : {
"root_cause" : [
{
"type" : "illegal_argument_exception",
"reason" : "The mapping definition cannot be nested under a type [_doc] unless include_type_name is set to true."
}
],
"type" : "illegal_argument_exception",
"reason" : "The mapping definition cannot be nested under a type [_doc] unless include_type_name is set to true."
},
"status" : 400
}
However, I've read that:
Elasticsearch 8.x: Specifying types in requests is no longer supported. The include_type_name parameter is removed.
How do I create a default mapping for a field on my documents, that will not be made redundant in the next major version of Elasticsearch?
This question was due to a misunderstanding on my part (I am new to ES). I thought the result returned from a search would show the analyzed form of each field, but _source only contains the original document. When I perform a partial-match search, the document is correctly returned, so the above mapping works as intended:
curl -X GET "localhost:9200/products/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"match": {
"name": "To"
}
}
}
'
{
"took" : 7,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.41501677,
"hits" : [
{
"_index" : "products",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.41501677,
"_source" : {
"name" : "Toast"
}
}
]
}
}
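As a rough illustration of why the search for "To" matches, the sketch below imitates what the lowercase and edge_ngram filters (min_gram 2, max_gram 25) do to the single token Toast. It is a plain shell emulation, not Elasticsearch code, and edge_ngrams is a hypothetical helper name.

```shell
# Emit the lowercased prefixes of a token, from min to max characters long.
edge_ngrams() {
  token=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  min=$2
  max=$3
  len=${#token}
  # Never ask for a prefix longer than the token itself.
  [ "$len" -lt "$max" ] && max=$len
  i=$min
  while [ "$i" -le "$max" ]; do
    printf '%s\n' "$token" | cut -c "1-$i"
    i=$((i + 1))
  done
}

tokens=$(edge_ngrams "Toast" 2 25)
printf '%s\n' "$tokens"
```

The standard search analyzer turns the query To into to, which is one of the indexed prefixes. The _analyze API can be used to inspect the real tokens on a live index.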

cannot resolve symbol[string] when using updateByQuery with ElasticSearch

I have the following set-up:
mapping:
esClient.indices.putMapping({
index: 'tests',
body: {
properties: {
name: {
type: 'text',
},
lastName: {
type: 'text',
},
},
},
});
This is the result when I post an entry:
This is the result when I query the entries:
curl -X GET "localhost:9200/tests/_search?pretty" -H 'Content-Type: application/json' -d'
{
"size": 1000,
"query" : {
"match_all" : {}
}
}
'
{
"took" : 18,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "tests",
"_type" : "_doc",
"_id" : "KJbtj3kBRlqnip7VJLLI",
"_score" : 1.0,
"_source" : {
"lastName" : 1,
"name" : "lucas"
}
}
]
}
}
I tried to update the entry's last name with the following curl:
curl -X POST "localhost:9200/tests/_update_by_query?pretty" -H 'Content-Type: application/json' -d'
{
"script": {
"source": "ctx._source.lastName='johnson'",
"lang": "painless"
},
"query": {
"term": {
"name": "lucas"
}
}
}
'
And this is the error I'm getting:
{
"error" : {
"root_cause" : [
{
"type" : "script_exception",
"reason" : "compile error",
"script_stack" : [
"ctx._source.lastName=johnson",
" ^---- HERE"
],
"script" : "ctx._source.lastName=johnson",
"lang" : "painless",
"position" : {
"offset" : 21,
"start" : 0,
"end" : 28
}
}
],
"type" : "script_exception",
"reason" : "compile error",
"script_stack" : [
"ctx._source.lastName=johnson",
" ^---- HERE"
],
"script" : "ctx._source.lastName=johnson",
"lang" : "painless",
"position" : {
"offset" : 21,
"start" : 0,
"end" : 28
},
"caused_by" : {
"type" : "illegal_argument_exception",
"reason" : "cannot resolve symbol [johnson]"
}
},
"status" : 400
}
If I put an integer instead of a string, the update works; otherwise I keep getting that error.
Thanks a lot for your help.
You need to surround the new lastName field value with ' '.
Adding a working example
Index Data:
{
"name":"lucas",
"lastName":"erla"
}
Query:
POST _update_by_query
{
"script": {
"source": "ctx._source.lastName='johnson'",
"lang": "painless"
},
"query": {
"term": {
"name": "lucas"
}
}
}
After calling the update by query API, the document is updated to:
GET /67641538/_doc/1
{
"_index": "67641538",
"_type": "_doc",
"_id": "1",
"_version": 3,
"_seq_no": 2,
"_primary_term": 1,
"found": true,
"_source": {
"lastName": "johnson",
"name": "lucas"
}
}
You need to surround the new lastName field value with escaped double quotes (\" \").
I faced a similar problem and solved it by using \" \" instead of ' '. My ES version is 8.4.
Query:
curl -X POST "localhost:9200/tests/_update_by_query?pretty" -H 'Content-Type: application/json' -d'
{
"script": {
"source": "ctx._source.lastName=\"johnson\"",
"lang": "painless"
},
"query": {
"term": {
"name": "lucas"
}
}
}
'
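Another way to sidestep the quoting problem, sketched here under the assumption that the value does not need to be hard-coded in the script, is to pass it through script params so that the Painless source contains no string literal at all. The parameter name new_last_name and the ES_URL environment variable are made up for this example.

```shell
# The body contains no single quotes, so wrapping it in single quotes is safe.
body='{
  "script": {
    "source": "ctx._source.lastName = params.new_last_name",
    "lang": "painless",
    "params": { "new_last_name": "johnson" }
  },
  "query": { "term": { "name": "lucas" } }
}'

printf '%s\n' "$body"

# Send the request only when ES_URL (hypothetical, e.g. localhost:9200) is set.
if [ -n "${ES_URL:-}" ]; then
  curl -X POST "$ES_URL/tests/_update_by_query?pretty" \
    -H 'Content-Type: application/json' -d "$body"
fi
```

Keeping the value in params also means the script text stays the same when the value changes.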

illegal argument exception while performing a query on elastic search 6.6?

Hi, I have an instance of Elasticsearch running on my machine. It has an index named mep-reports. When I run a query using curl, it returns an error. The following is the curl command:
curl -X GET "10.10.9.1:9200/mep-reports*/_search?pretty&size=0" -H 'Content-Type: application/json' -d'{
"size": 0,
"query": {
"bool": {
"must": [
{
"range": {
"@timestamp": {
"from": "2019-01-31T23:59:59Z",
"to": "2020-02-17T23:59:59Z",
"include_lower": true,
"include_upper": false,
"format": "yyyy-MM-dd'T'HH:mm:ssZ",
"boost": 1.0
}
}
},
{
"term": {
"account_id": {
"value": "270d13e6-2f4f-4d51-99d5-92ffba5f0cb6",
"boost": 1.0
}
}
}
],
"adjust_pure_negative": true,
"boost": 1.0
}
},
"aggregations": {
"performance_over_time": {
"date_histogram": {
"field": "@timestamp",
"format": "yyyy-MM-dd'T'HH:mm:ssZ",
"interval": "1M",
"offset": 0,
"order": {
"_key": "asc"
},
"keyed": false,
"min_doc_count": 0
}
}
}
}'
Response
{
"error" : {
"root_cause" : [
{
"type" : "illegal_argument_exception",
"reason" : "Invalid format: [yyyy-MM-ddTHH:mm:ssZ]: Illegal pattern component: T"
}
],
"type" : "illegal_argument_exception",
"reason" : "Invalid format: [yyyy-MM-ddTHH:mm:ssZ]: Illegal pattern component: T",
"caused_by" : {
"type" : "illegal_argument_exception",
"reason" : "Illegal pattern component: T"
}
},
"status" : 400
}
The following is a sample from my Elasticsearch index:
{
"took" : 14,
"timed_out" : false,
"_shards" : {
"total" : 12,
"successful" : 12,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 1073013,
"max_score" : 1.0,
"hits" : [
{
"_index" : "mep-reports-2019.09.11",
"_type" : "doc",
"_id" : "68e8e03f-baf8-4bfc-a920-58e26edf835c-353899837500",
"_score" : 1.0,
"_source" : {
"account_id" : "270d13e6-2f4f-4d51-99d5-92ffba5f0cb6",
"inventory" : "SMS",
"flight_name" : "test flight 001",
"status" : "ENROUTE",
"msg_text" : "Test !!!!!!!!!!!!!!1 elastic searchY",
"flight_id" : "68e8e03f-baf8-4bfc-a920-58e26edf835c",
"submission_ts" : "1568197286",
"recipient" : "353899837500",
"o_error" : null,
"nof_segments" : "-1",
"campaign_id" : "0fae8662-bee9-46ac-9b3e-062f4ba55966",
"campaign_name" : "Index search petri11",
"@version" : "1",
"sender" : "800111",
"delivery_ts" : "0",
"@timestamp" : "2019-09-11T10:21:26.000Z"
}
}
]
}
}
It seems to be something related to the date format, as I am trying to search on the @timestamp field.
I would really appreciate it if you could help.
Thank you.
The problem is that the JSON query is enclosed in single quotes, i.e. the same characters that surround the T in your date format. The shell removes the inner quotes, which is why the error message shows yyyy-MM-ddTHH:mm:ssZ.
I suggest storing the query in a file named query.json and sending it in binary mode, like this:
curl -X GET "10.10.9.1:9200/mep-reports*/_search?pretty&size=0" -H 'Content-Type: application/json' --data-binary @query.json
That should solve your issue.
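One way to create query.json without the shell touching the inner quotes is a quoted here-document. The sketch below uses a shortened version of the query; the field name @timestamp and the ES_URL variable are assumptions for illustration.

```shell
query_file=query.json

# Quoting the delimiter ('EOF') makes the shell copy the body verbatim.
cat > "$query_file" <<'EOF'
{
  "size": 0,
  "query": {
    "range": {
      "@timestamp": {
        "gte": "2019-01-31T23:59:59Z",
        "lt": "2020-02-17T23:59:59Z",
        "format": "yyyy-MM-dd'T'HH:mm:ssZ"
      }
    }
  }
}
EOF

# The single quotes around T are still present in the file.
grep "format" "$query_file"

# Send the request only when ES_URL (hypothetical, e.g. 10.10.9.1:9200) is set.
if [ -n "${ES_URL:-}" ]; then
  curl -X GET "$ES_URL/mep-reports*/_search?pretty" \
    -H 'Content-Type: application/json' --data-binary "@$query_file"
fi
```

Because curl reads the body from the file, nothing inside the JSON is interpreted by the shell.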

Boolean query does not return expected data in Elasticsearch

I have the following document in Elasticsearch as reported by Kibana:
{"deviceId":"C1976429369BFE063ED8B3409DB7C7E7D87196D9","appId":"DisneyDigitalBooks.PlanesAdventureAlbum","ostype":"iOS"}
Why does the following query not return any results?
[root@myvm elasticsearch-1.0.0]# curl -XGET 'http://localhost:9200/unique_app_install/_search?pretty=1' -d '
{
"query" : {
"bool" : {
"must" : [ {
"term" : {
"deviceId" : "C1976429369BFE063ED8B3409DB7C7E7D87196D9"
}
}, {
"term" : {
"appId" : "DisneyDigitalBooks.PlanesAdventureAlbum"
}
}, {
"term" : {
"ostype" : "iOS"
}
} ]
}
}
}'
Here is the response from Elasticsearch:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 0,
"max_score" : null,
"hits" : [ ]
}
}
As a side question, is this the fastest way to query the data in my case?
Thx in advance.
UPDATE:
Could it be related to the fact that I used the following mapping for this index?
curl -XPOST localhost:9200/unique_app_install -d '{
"settings" : {
"number_of_shards" : 5
},
"mappings" : {
"sdk_sync" : {
"properties" : {
"deviceId" : { "type" : "string" , "index": "not_analyzed"},
"appId" : { "type" : "string" , "index": "not_analyzed"},
"ostype" : { "type" : "string" , "index": "not_analyzed"}
}
}
}
}'
Check that the type of your document was correct when inserting it: sdk_sync.
I used your items and it works for me. The following curl requests give the right response:
curl -XPOST localhost:9200/unique_app_install -d '{
"settings" : {
"number_of_shards" : 5
},
"mappings" : {
"sdk_sync" : {
"properties" : {
"deviceId" : { "type" : "string" , "index": "not_analyzed"},
"appId" : { "type" : "string" , "index": "not_analyzed"},
"ostype" : { "type" : "string" , "index": "not_analyzed"}
}
}
}
}'
curl -XPOST localhost:9200/unique_app_install/sdk_sync/1 -d '{
"deviceId":"C1976429369BFE063ED8B3409DB7C7E7D87196D9",
"appId":"DisneyDigitalBooks.PlanesAdventureAlbum",
"ostype":"iOS"
}'
curl -XGET 'http://localhost:9200/unique_app_install/_search?pretty=1' -d '
{
"query" : {
"bool" : {
"must" : [ {
"term" : {
"deviceId" : "C1976429369BFE063ED8B3409DB7C7E7D87196D9"
}
}, {
"term" : {
"appId" : "DisneyDigitalBooks.PlanesAdventureAlbum"
}
}, {
"term" : {
"ostype" : "iOS"
}
} ]
}
}
}'
Unless you specify that a field should NOT be analyzed, every string field is analyzed by default.
This means that the deviceId "C1976429369BFE063ED8B3409DB7C7E7D87196D9" will be indexed as "c1976429369bfe063ed8b3409db7c7e7d87196d9" (lower case).
With an analyzed field, you have to pass the string in LOWER CASE to the term query or term filter.
That is why you should specify {"index": "not_analyzed"} in the mapping.
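For a field that is already indexed as analyzed, the term value has to be lowercased before querying. A minimal shell sketch of that normalization follows; it only approximates the lowercasing step of the default analyzer for single-token values like these and is not a full analyzer. The helper name to_indexed_term is hypothetical.

```shell
# Lowercase a value the way the default analyzer would index a single token.
to_indexed_term() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

device_term=$(to_indexed_term "C1976429369BFE063ED8B3409DB7C7E7D87196D9")
os_term=$(to_indexed_term "iOS")

printf '%s\n' "$device_term" "$os_term"
```

With a not_analyzed mapping this step is unnecessary, because the original value is indexed unchanged.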
