Search by text field - elasticsearch

Here is my index:
λ curl -XGET -u elastic:elasticpassword -d'{"query":{"match_all":{}}}'
{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "test",
        "_type" : "mytype",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "name" : "Dio",
          "age" : 10
        }
      },
      {
        "_index" : "test",
        "_type" : "mytype",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "Paul",
          "pro" : {
            "f" : "Cris",
            "t" : "So"
          }
        }
      }
    ]
  }
}
Here is a default mapping:
λ curl -XGET -u elastic:elasticpassword
{
  "test" : {
    "mappings" : {
      "mytype" : {
        "properties" : {
          "age" : {
            "type" : "long"
          },
          "name" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          }
        }
      }
    }
  }
}
I can search by the age field, but not by the name field. Why?
λ curl -XGET -u elastic:elasticpassword -d'{"query":{"term":{"age":10}}}'
{
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "test",
        "_type" : "mytype",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "name" : "Dio",
          "age" : 10
        }
      }
    ]
  }
}
λ curl -XGET -u elastic:elasticpassword -d'{"query":{"term":{"name":"Paul"}}}'
{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}

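The problem is that your name field is analyzed by default with the standard analyzer, which lowercases the text before indexing it. A term query does not analyze its input, so Paul is compared as-is against the indexed token paul and nothing matches. You can either search for paul, or run the term query against the name.keyword field with Paul.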
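
To see why the term query misses, here is a minimal Python sketch. It is not Elasticsearch code: standard_analyze is only a toy stand-in for the standard analyzer (it splits on whitespace and lowercases), and term_matches and term_query are hypothetical helper names introduced for illustration.

```python
def standard_analyze(text):
    """Toy stand-in for the standard analyzer: split on whitespace, lowercase."""
    return [token.lower() for token in text.split()]


def term_matches(indexed_tokens, term):
    """A term query compares the raw term with the indexed tokens, unanalyzed."""
    return term in indexed_tokens


def term_query(field, value):
    """Hypothetical helper: build a term query body for the given field."""
    return {"query": {"term": {field: value}}}


# What the index holds for the document {"name": "Paul"}.
name_tokens = standard_analyze("Paul")  # text field: analyzed, so ["paul"]
keyword_tokens = ["Paul"]               # keyword sub-field: kept verbatim

print(term_matches(name_tokens, "Paul"))     # term on name with "Paul"
print(term_matches(name_tokens, "paul"))     # term on name with "paul"
print(term_matches(keyword_tokens, "Paul"))  # term on name.keyword with "Paul"
print(term_query("name.keyword", "Paul"))
```

The last line prints the body that can be sent with -d in place of the failing query.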


elasticsearch reducing result to one column - return only 1 value for each document

I am trying to reduce the JSON result from Elasticsearch to only the column or columns I ask for. Is there any way to do this?
When I use the following query, the result comes back nested inside "_source":
{
  "from": "0", "size":"2",
  "query": {
    "match_all": {}
  }
}
and I do not need that nesting for my use case.
I get this result:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10000,
      "relation" : "gte"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "indexer_1",
        "_type" : "type_indexer_1",
        "_id" : "38142",
        "_score" : 1.0,
        "_source" : {
          "id" : 38142
        }
      },
      {
        "_index" : "indexer_1",
        "_type" : "type_indexer_1",
        "_id" : "38147",
        "_score" : 1.0,
        "_source" : {
          "id" : 38147
        }
      }
    ]
  }
}
What I would like to have:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10000,
      "relation" : "gte"
    },
    "max_score" : 1.0,
    "hits" : [
      { "id" : 38142 },
      { "id" : 38147 }
    ]
  }
}
Ideally, I would get just this JSON result:
"id" : 38142
"id" : 38147
Is there any way out of the box in ES to reduce the result set?
You can filter the output JSON with the filter_path parameter. Look at the documentation on response filtering:
GET /index/_search?filter_path=hits.hits._id
{
  "from": "0",
  "query": {
    "match_all": {}
  }
}

Null field in elasticsearch need to be replaced

How can I replace "build_duration" : "null" with the value 21600000 in Elasticsearch?
DevTools > Console
GET myindex/_search
{
  "query": {
    "term": {
      "build_duration": "null"
    }
  }
}
{
  "took" : 10,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 9.658761,
    "hits" : [
      {
        "_index" : "myindex",
        "_type" : "_doc",
        "_id" : "40324749",
        "_score" : 9.658761,
        "_source" : {
          "build_duration" : "null",
          "build_end_time" : "2021-05-20 04:00:36",
          "build_requester" : "",
          "build_site" : "POL",
          "build_id" : "40324749",
          "@version" : "1"
        }
      }
    ]
  }
}
I was able to replace the field value with the query below:
POST /myindex/_update/mydocid
{
  "doc" : {
    "build_duration": "21600000"
  }
}

How do I apply reindex to new data values through filters?

This is the output for my basic (example) data:
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 163,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "0513_final_test_instgram",
        "_type" : "_doc",
        "_id" : "6uShY3kBEkIlakOYovrR",
        "_score" : 1.0,
        "_source" : {
          "host" : "DESKTOP-7MDCA36",
          "path" : "C:/python_file/20210513_114123_instargram.csv",
          "@version" : "1",
          "message" : "hello",
          "@timestamp" : "2021-05-13T02:50:05.962Z"
        }
      },
      {
        "_index" : "0513_final_test_instgram",
        "_type" : "_doc",
        "_id" : "EeShY3kBEkIlakOYovvm",
        "_score" : 1.0,
        "_source" : {
          "host" : "DESKTOP-7MDCA36",
          "path" : "C:/python_file/20210513_114123_instargram.csv",
          "@version" : "1",
          "message" : "python",
          "@timestamp" : "2021-05-13T02:50:05.947Z"
        }
      }
    ]
  }
}
First, out of the various fields, I extracted only the message values (code example below):
GET 0513_final_test_instgram/_search?_source=message&filter_path=hits.hits._source
{
  "hits" : {
    "hits" : [
      {
        "_source" : {
          "message" : "hello"
        }
      },
      {
        "_source" : {
          "message" : "python"
        }
      }
    ]
  }
}
I learned that reindex can copy data into a new index.
However, I cannot work out how to do this, even after reading the documentation.
My 0513 attempt:
POST _reindex
{
  "source": {
    "index": "0513_final_test_instgram"
  },
  "dest": {
    "index": "new_data_index"
  }
}
How do I use reindex to store only the extracted message values in a new index?
Update (attempt after the comment):
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 163,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "new_data_index",
        "_type" : "_doc",
        "_id" : "6uShY3kBEkIlakOYovrR",
        "_score" : 1.0,
        "_source" : {
          "message" : "hello"
        }
      },
      {
        "_index" : "new_data_index",
        "_type" : "_doc",
        "_id" : "EeShY3kBEkIlakOYovvm",
        "_score" : 1.0,
        "_source" : {
          "message" : "python"
        }
      }
    ]
  }
}
You simply need to specify which fields you want to reindex into the new index:
POST _reindex
{
  "source": {
    "index": "0513_final_test_instgram",
    "_source": ["message"]
  },
  "dest": {
    "index": "new_data_index"
  }
}

How to perform an arithmetic operation on data from elasticsearch

I need the average cpuload for a specific node type. For example, if I give tpt as the node type, the query should return the average cpuload across all nodes of type tpt. I tried different methods, but in vain.
My data in elasticsearch is below:
"took" : 5,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
"hits" : {
"total" : 4,
"max_score" : 1.0,
"hits" : [
"_index" : "kpi",
"_type" : "kpi",
"_id" : "\u0003",
"_score" : 1.0,
"_source" : {
"kpi" : {
"CpuAverageLoad" : 13,
"NodeId" : "kishan",
"NodeType" : "Tpt",
"State" : "online",
"Static_limit" : 0
"_index" : "kpi",
"_type" : "kpi",
"_id" : "\u0005",
"_score" : 1.0,
"_source" : {
"kpi" : {
"CpuAverageLoad" : 15,
"NodeId" : "kishan1",
"NodeType" : "tpt",
"State" : "online",
"Static_limit" : 0
"_index" : "kpi",
"_type" : "kpi",
"_id" : "\u0004",
"_score" : 1.0,
"_source" : {
"kpi" : {
"MaxLbCapacity" : "700000",
"NodeId" : "kishan2",
"NodeType" : "bang",
"OnlineCSCF" : [
"State" : "Online",
"TdbGroup" : 1,
"TdGroup" : 0
"_index" : "kpi",
"_type" : "kpi",
"_id" : "\u0002",
"_score" : 1.0,
"_source" : {
"kpi" : {
"MaxLbCapacity" : "700000",
"NodeId" : "kishan3",
"NodeType" : "bang",
"OnlineCSCF" : [
"State" : "Online",
"TdLGroup" : 1,
"TGroup" : 0
And my query is
curl -XGET 'localhost:9200/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool" : {
      "must" : {
        "script" : {
          "script" : {
            "source" : "kpi[CpuAverageLoad].value > params.param1",
            "lang" : "painless",
            "params" : {
              "param1" : 5
            }
          }
        }
      }
    }
  }
}'
but it is failing because the source field of the script cannot be parsed:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "[script] unknown field [source], parser not found"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "[script] unknown field [source], parser not found"
  },
  "status" : 400
}

ElasticSearch doubles my _id field when filtering on it

I don't understand why ES doubles my _id field in an array when I'm filtering on it.
curl -X GET "http://localhost:9200/pgep-development_broadcasts/broadcast/_search?pretty=true" -d '{"query":{"match_all":{}},"fields":["_id", "title"]}'
{
  "took" : 7,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "pgep-development_broadcasts",
      "_type" : "broadcast",
      "_id" : "50ed959dcc93282abc000062",
      "_score" : 1.0,
      "fields" : {
        "_id" : [ "50ed959dcc93282abc000062", "50ed959dcc93282abc000062" ],
        "title" : "24 heures d'info"
      }
    } ]
  }
}
You are probably hitting this bug. If this is the case, you can simply stop storing the id field.
