Here is my index:
λ curl -XGET -u elastic:elasticpassword http://192.168.1.71:9200/test/mytype/_search?pretty -d'{"query":{"match_all":{}}}'
{
"took" : 5,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : 1.0,
"hits" : [
{
"_index" : "test",
"_type" : "mytype",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"name" : "Dio",
"age" : 10
}
},
{
"_index" : "test",
"_type" : "mytype",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"name" : "Paul",
"pro" : {
"f" : "Cris",
"t" : "So"
}
}
}
]
}
}
Here is a default mapping:
λ curl -XGET -u elastic:elasticpassword http://192.168.1.71:9200/test/mytype/_mapping?pretty
{
"test" : {
"mappings" : {
"mytype" : {
"properties" : {
"age" : {
"type" : "long"
},
"name" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
I can find documents by the age field, but not by the name field. Why?
λ curl -XGET -u elastic:elasticpassword http://192.168.1.71:9200/test/mytype/_search?pretty -d'{"query":{"term":{"age":10}}}'
{
"took" : 6,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 1.0,
"hits" : [
{
"_index" : "test",
"_type" : "mytype",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"name" : "Dio",
"age" : 10
}
}
]
}
}
λ curl -XGET -u elastic:elasticpassword http://192.168.1.71:9200/test/mytype/_search?pretty -d'{"query":{"term":{"name":"Paul"}}}'
{
"took" : 5,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 0,
"max_score" : null,
"hits" : [ ]
}
}
The problem is that your name field is analyzed by default with the standard analyzer, which lowercases the value at index time, while a term query is not analyzed and looks for the exact token. You can either search for paul or search the name.keyword field with Paul.
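To make the mismatch concrete, here is a minimal Python sketch. It is not Elasticsearch code: `standard_like_analyze` is a hypothetical, rough stand-in for the standard analyzer (split on non-word characters, lowercase), and `term_matches` only imitates the exact-token comparison a term query performs.

```python
import re

def standard_like_analyze(text):
    """Rough approximation of the standard analyzer: tokenize, then lowercase."""
    return [tok.lower() for tok in re.findall(r"\w+", text)]

def term_matches(indexed_tokens, query_term):
    """A term query is not analyzed: it must equal an indexed token exactly."""
    return query_term in indexed_tokens

# The `name` text field holds analyzed tokens; `name.keyword` holds the raw value.
name_tokens = standard_like_analyze("Paul")
keyword_tokens = ["Paul"]

print(term_matches(name_tokens, "Paul"))     # False: the index holds "paul"
print(term_matches(name_tokens, "paul"))     # True
print(term_matches(keyword_tokens, "Paul"))  # True
```

The same reasoning explains why the age query works: a long value is not run through a text analyzer, so 10 matches 10.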
I am trying to reduce the JSON result from Elasticsearch to only the column or columns I asked for. Is there any way?
When I use the following command, the result comes back nested inside "_source":
{
"from": "0", "size":"2",
"_source":["id"],
"query": {
"match_all": {}
}
}
'
and I don't need that nesting for my use case.
I get this result:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 10000,
"relation" : "gte"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "indexer_1",
"_type" : "type_indexer_1",
"_id" : "38142",
"_score" : 1.0,
"_source" : {
"id" : 38142
}
},
{
"_index" : "indexer_1",
"_type" : "type_indexer_1",
"_id" : "38147",
"_score" : 1.0,
"_source" : {
"id" : 38147
}
}
]
}
}
What I would like to have:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 10000,
"relation" : "gte"
},
"max_score" : 1.0,
"hits" : [
{
"id" : 38142
},
{
"id" : 38147
}
]
}
}
Even better, this is the JSON result I would love:
{
{
"id" : 38142
},
{
"id" : 38147
}
}
Is there any way out of the box in ES to reduce the result set?
You can filter the output JSON with the filter_path parameter; see "response filtering" in the documentation:
GET /index/_search?filter_path=hits.hits._id
{
"from": "0",
"size":"2",
"_source":["id"],
"query": {
"match_all": {}
}
}
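Note that filter_path only removes keys from the response; it does not un-nest what remains. If the bare list of documents is what you are after, one option is to flatten the hits on the client. A minimal Python sketch, assuming the response has already been parsed into a dict (the helper name `extract_sources` is made up for this example):

```python
def extract_sources(response):
    """Return the list of `_source` dicts from a parsed search response."""
    hits = response.get("hits", {}).get("hits", [])
    return [hit["_source"] for hit in hits if "_source" in hit]

# Shape of a response requested with filter_path=hits.hits._source
response = {
    "hits": {
        "hits": [
            {"_source": {"id": 38142}},
            {"_source": {"id": 38147}},
        ]
    }
}

print(extract_sources(response))  # [{'id': 38142}, {'id': 38147}]
```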
How can I replace "build_duration" : "null" with the value 21600000 in Elasticsearch?
DevTools > Console
GET myindex/_search
{
"query": {
"term": {
"build_duration": "null"
}
}
}
Output:
{
"took" : 10,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 9.658761,
"hits" : [
{
"_index" : "myindex",
"_type" : "_doc",
"_id" : "40324749",
"_score" : 9.658761,
"_source" : {
"build_duration" : "null",
"build_end_time" : "2021-05-20 04:00:36",
"build_requester" : "daniel.su",
"build_site" : "POL",
"build_id" : "40324749",
"#version" : "1"
}
}
]
}
}
With the query below I was able to replace the field value.
POST /myindex/_update/mydocid
{
"doc" : {
"build_duration": "21600000"
}
}
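The _update call above changes one document, addressed by its id. If several documents hold "null", a possible alternative is the update-by-query API, which runs a script against every match. The sketch below only builds the request body in Python; it assumes you would send it to POST /myindex/_update_by_query, reuses the term query from the question, and keeps the new value as a string as in the answer. The helper name is hypothetical.

```python
import json

def build_duration_fix_body(old_value, new_value):
    """Build an _update_by_query body that rewrites build_duration on matching docs."""
    return {
        # Same term query the question used to find the affected documents
        "query": {"term": {"build_duration": old_value}},
        "script": {
            "lang": "painless",
            "source": "ctx._source.build_duration = params.value",
            "params": {"value": new_value},
        },
    }

body = build_duration_fix_body("null", "21600000")
print(json.dumps(body, indent=2))
```

Passing the value through params rather than concatenating it into the script text keeps the script source constant between calls.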
This is the basic data (example output):
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 163,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "0513_final_test_instgram",
"_type" : "_doc",
"_id" : "6uShY3kBEkIlakOYovrR",
"_score" : 1.0,
"_source" : {
"host" : "DESKTOP-7MDCA36",
"path" : "C:/python_file/20210513_114123_instargram.csv",
"#version" : "1",
"message" : "hello",
"#timestamp" : "2021-05-13T02:50:05.962Z"
},
{
"_index" : "0513_final_test_instgram",
"_type" : "_doc",
"_id" : "EeShY3kBEkIlakOYovvm",
"_score" : 1.0,
"_source" : {
"host" : "DESKTOP-7MDCA36",
"path" : "C:/python_file/20210513_114123_instargram.csv",
"#version" : "1",
"message" : "python",
"#timestamp" : "2021-05-13T02:50:05.947Z"
}
First, out of the various fields, I extracted only the message values (request and output below):
GET 0513_final_test_instgram/_search?_source=message&filter_path=hits.hits._source
{
"hits" : {
"hits" : [
{
"_source" : {
"message" : "hello"
}
},
{
"_source" : {
"message" : "python"
}
I learned about reindex, which copies documents into a new index.
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html
However, I still don't understand how to do this even after reading the documentation.
My attempt (0513):
POST _reindex
{
"source": {
"index": "0513_final_test_instgram"
},
"dest": {
"index": "new_data_index"
}
}
How do I use reindex to store only the extracted message values in a new index?
Update: attempt based on the comment
Output:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 163,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "new_data_index",
"_type" : "_doc",
"_id" : "6uShY3kBEkIlakOYovrR",
"_score" : 1.0,
"_source" : {
"message" : "hello"
}
},
{
"_index" : "new_data_index",
"_type" : "_doc",
"_id" : "EeShY3kBEkIlakOYovvm",
"_score" : 1.0,
"_source" : {
"message" : "python"
}
}
You simply need to specify which fields you want to reindex into the new index:
POST _reindex
{
"source": {
"index": "0513_final_test_instgram",
"_source": ["message"]
},
"dest": {
"index": "new_data_index"
}
}
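As a rough picture of what that "_source" list does to each document, here is a small Python illustration. It does not talk to Elasticsearch; `filter_source` is a hypothetical helper that mimics keeping only the listed top-level fields, and the sample document is abbreviated from the question.

```python
def filter_source(doc_source, fields):
    """Keep only the listed top-level fields of a document's _source."""
    return {key: value for key, value in doc_source.items() if key in fields}

original = {
    "host": "DESKTOP-7MDCA36",
    "path": "C:/python_file/20210513_114123_instargram.csv",
    "message": "hello",
}

print(filter_source(original, ["message"]))  # {'message': 'hello'}
```

The document ids are not affected by source filtering, which matches the output in the question where the ids in new_data_index are the same as in the original index.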
I need the average of cpuload for a specific nodetype. For example, if I give the nodetype tpt, it should return the average cpuload across all tpt nodes. I tried different methods, but in vain.
My data in Elasticsearch is below:
{
"took" : 5,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 4,
"max_score" : 1.0,
"hits" : [
{
"_index" : "kpi",
"_type" : "kpi",
"_id" : "\u0003",
"_score" : 1.0,
"_source" : {
"kpi" : {
"CpuAverageLoad" : 13,
"NodeId" : "kishan",
"NodeType" : "Tpt",
"State" : "online",
"Static_limit" : 0
}
}
},
{
"_index" : "kpi",
"_type" : "kpi",
"_id" : "\u0005",
"_score" : 1.0,
"_source" : {
"kpi" : {
"CpuAverageLoad" : 15,
"NodeId" : "kishan1",
"NodeType" : "tpt",
"State" : "online",
"Static_limit" : 0
}
}
},
{
"_index" : "kpi",
"_type" : "kpi",
"_id" : "\u0004",
"_score" : 1.0,
"_source" : {
"kpi" : {
"MaxLbCapacity" : "700000",
"NodeId" : "kishan2",
"NodeType" : "bang",
"OnlineCSCF" : [
"001",
"002"
],
"State" : "Online",
"TdbGroup" : 1,
"TdGroup" : 0
}
}
},
{
"_index" : "kpi",
"_type" : "kpi",
"_id" : "\u0002",
"_score" : 1.0,
"_source" : {
"kpi" : {
"MaxLbCapacity" : "700000",
"NodeId" : "kishan3",
"NodeType" : "bang",
"OnlineCSCF" : [
"001",
"002"
],
"State" : "Online",
"TdLGroup" : 1,
"TGroup" : 0
}
}
}
]
}
}
And my query is
curl -XGET 'localhost:9200/_search?pretty' -H 'Content-Type: application/json' -d'
{
"query": {
"bool" : {
"must" : {
"script" : {
"script" : {
"source" : "kpi[CpuAverageLoad].value > params.param1",
"lang" : "painless",
"params" : {
"param1" : 5
}
}
}
}
}
}
}'
but it is failing with the error below, which complains about the source field:
{
"error" : {
"root_cause" : [
{
"type" : "illegal_argument_exception",
"reason" : "[script] unknown field [source], parser not found"
}
],
"type" : "illegal_argument_exception",
"reason" : "[script] unknown field [source], parser not found"
},
"status" : 400
}
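For reference, one way to express the stated goal is a query on the node type combined with an avg aggregation, rather than a script query. The Python sketch below builds such a request body and also computes the same average locally from the sample documents as a sanity check. The field paths kpi.NodeType and kpi.CpuAverageLoad come from the data above; the helper names are made up, and it is an assumption that NodeType is an analyzed text field, so that a match on tpt also finds Tpt.

```python
def build_avg_query(node_type):
    """Request body: filter by node type, then average CpuAverageLoad."""
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        "query": {"match": {"kpi.NodeType": node_type}},
        "aggs": {"avg_cpu_load": {"avg": {"field": "kpi.CpuAverageLoad"}}},
    }

def local_average(sources, node_type):
    """Average CpuAverageLoad over documents whose NodeType matches, ignoring case."""
    loads = [
        src["kpi"]["CpuAverageLoad"]
        for src in sources
        if src["kpi"].get("NodeType", "").lower() == node_type.lower()
        and "CpuAverageLoad" in src["kpi"]
    ]
    return sum(loads) / len(loads) if loads else None

# Abbreviated copies of the sample documents
sources = [
    {"kpi": {"CpuAverageLoad": 13, "NodeId": "kishan", "NodeType": "Tpt"}},
    {"kpi": {"CpuAverageLoad": 15, "NodeId": "kishan1", "NodeType": "tpt"}},
    {"kpi": {"MaxLbCapacity": "700000", "NodeId": "kishan2", "NodeType": "bang"}},
]

print(local_average(sources, "tpt"))  # 14.0
```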
I don't understand why ES doubles my _id field in an array when I'm filtering on it.
curl -X GET "http://localhost:9200/pgep-development_broadcasts/broadcast/_search?pretty=true" -d '{"query":{"match_all":{}},"fields":["_id", "title"]}'
{
"took" : 7,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 1.0,
"hits" : [ {
"_index" : "pgep-development_broadcasts",
"_type" : "broadcast",
"_id" : "50ed959dcc93282abc000062",
"_score" : 1.0,
"fields" : {
"_id" : [ "50ed959dcc93282abc000062", "50ed959dcc93282abc000062" ],
"title" : "24 heures d'info"
}
} ]
}
}
You are probably hitting this bug: https://github.com/elasticsearch/elasticsearch/issues/2161. If this is the case, you can simply stop storing the id field.