only basic searches are working - elasticsearch

I'm playing with curl querying elasticsearch db through my mac console. But I have troubles to execute more complex searches. So far I can query for match_all like this:
curl -XGET 'localhost:9200/products/fashion/_search?pretty' -d'
{
"query" : { "match_all" : {} }
}'
And I receive the following data:
{
"took" : 7,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 915503,
"max_score" : 1.0,
"hits" : [
{
"_index" : "products",
"_type" : "fashion",
"_id" : "57d49ee494efcfdfe0f3abfe",
"_score" : 1.0,
"_source" : {
"doc" : {
"id" : "57d49ee494efcfdfe0f3abfe",
"name" : "Shorts",
"price" : {
"value" : 35
}
}
}
},
...........
}
I don't have problems to request a mapping like this:
curl -XGET 'localhost:9200/products/_mapping/fashion?pretty'
And the result for price is:
.......
"price" : {
"properties" : {
"currency" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"value" : {
"type" : "long"
}
}
},
....
But all my attempts to query with filter on "price.value" did not hit.
curl -XGET 'localhost:9200/products/fashion/_search?pretty' -d'
{
"query" : {
"constant_score" : {
"filter" : {
"term" : {
"price.value" : 35
}
}
}
}
}'
{
"took" : 26,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 0,
"max_score" : null,
"hits" : [ ]
}
}
This query I took from elasticsearch guide
I ran out of ideas and examples how to write this query to return what I obviously have in database. As you might noticed I have at least one document with price.value = 35

That's because your price field is within another field named doc, so you need to query doc.price.value like this:
curl -XPOST 'localhost:9200/products/fashion/_search?pretty' -d'
{
"query" : {
"constant_score" : {
"filter" : {
"term" : {
"doc.price.value" : 35
}
}
}
}
}'

Related

ElasticSearch aggregation shows unexpected result for SUM

Trying to apply sum aggregation in ES 7.14 and get unexpected result
1. prepare dataset
$cat products.json
{"index":{"_id":"1"}}
{"productId": 10,"shopId": 45,"prices": {"retailPrice": 525000000.02,"sumRetailPrice": 5250000000.2},"count": 10}
{"index":{"_id":"2"}}
{"productId": 10,"shopId": 48,"prices": {"retailPrice": 26250000004,"sumRetailPrice": 5250000000.8},"count": 20}
2. bulk insert
curl -XPOST localhost:9200/25products/_bulk -H "Content-Type: application/x-ndjson" --data-binary #./products.json
3. view mapping
curl -XGET "http://localhost:9200/25products/_mapping?pretty"
{
"25products" : {
"mappings" : {
"properties" : {
"count" : {
"type" : "long"
},
"prices" : {
"properties" : {
"retailPrice" : {
"type" : "float"
},
"sumRetailPrice" : {
"type" : "float"
}
}
},
"productId" : {
"type" : "long"
},
"shopId" : {
"type" : "long"
}
}
}
}
}
4. Sum field "prices.sumRetailPrice" in Painless
curl --location --request POST 'http://localhost:9200/25products/_search?pretty' \
--header 'Content-Type: application/json' \
--data-raw '{
"aggs": {"sumSupplyPrice": {
"sum": {"script": {
"source": "(!doc.containsKey('\''prices.sumRetailPrice'\'') ? 0 : (doc['\''prices.sumRetailPrice'\''].size() == 0 ? 0: doc['\''prices.sumRetailPrice'\''].value))"
}}
}},
"query": {"bool": {
"filter": [
{"terms": {"shopId": [45]}},
{"terms": {"productId": [10]}}
]
}},
"from": 0, "size": 10
}'
result is
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.0,
"hits" : [
{
"_index" : "25products",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.0,
"_source" : {
"productId" : 10,
"shopId" : 45,
"prices" : {
"retailPrice" : 5.2500000002E8,
"sumRetailPrice" : 5.2500000002E9
},
"count" : 10
}
}
]
},
"aggregations" : {
"sumSupplyPrice" : {
"value" : 5.249999872E9
}
}
}
4. Expectation
as well as I have a single record, expecting to have the same value as sumRetailPrice
"aggregations" : {
"sumSupplyPrice" : {
"value" : **5.2500000002E9**
}
}
But, actual result is not as expected.
"aggregations" : {
"sumSupplyPrice" : {
"value" : **5.249999872E9**
}
}
Where am I wrong?
Thanks!

elasticsearch does not return expected returns

I'm complete new on elasticsearch. I tried search API but it's not returning what I expected
What I did
POST /test/_doc/1
{
"name": "Hello World"
}
GET /test/_doc/1
Response:
{
"_index" : "test",
"_type" : "_doc",
"_id" : "1",
"_version" : 5,
"_seq_no" : 28,
"_primary_term" : 1,
"found" : true,
"_source" : {
"name" : "Hello World"
}
}
GET /test/_mapping
Response:
{
"test" : {
"mappings" : {
"properties" : {
"name" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"query" : {
"properties" : {
"term" : {
"properties" : {
"name" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
}
}
}
}
}
GET /test/_search
{
"query": {
"term": {
"name": "Hello"
}
}
}:
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}
GET /test/_search
{
"query": {
"term": {
"name": "Hello World"
}
}
}
Response:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}
My elasticsearch version is 7.3.2
The last two search should return me document 1, is that correct? Why does it hit nothing?
Problem is that you have term queries. Term queries are not analysed. Hence Hello didn't match the term hello in your index. Note the case difference.
Unlike full-text queries, term-level queries do not analyze search terms. Instead, term-level queries match the exact terms stored in a field.
Reference
Whereas match queries analyse the search term also.
{
"query": {
"match": {
"name": "Hello"
}
}
}
You can use _analyze to check how your terms are indexed.

how to Compare two fields in ElasticSearch

I need to compare two fields in Elasticsearch. I tried with below query using Kibana. But it's a gave runtime exception. Kindly help me how to compare these fields.
GET /eps/_search
{
"query": {
"bool" : {
"filter" : {
"script" : {
"script" : {
"source": "doc['output_record_count'].value < doc['input_record_count'].value",
"lang" : "painless"
}
}
}
}
}
}
Note:
For complete Match query I am getting below success response.
Query:
GET /eps/_search
{
"query": {
"match_all": {}
}
}
The Response is :
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 5,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "eps",
"_type" : "_doc",
"_id" : "9bNkeXEBLNJ-eURYKdv1",
"_score" : 1.0,
"_source" : {
"experience" : "EPS",
"#version" : "1",
"sdcids" : "013bb234-0840-11ea-8e7d-515f88cf3efa",
"output_record_count" : 13,
"input_record_count" : 10,
"#timestamp" : "2020-04-14T15:52:19.582Z",
"SDC_Ids" : "013bb234-0840-11ea-8e7d-515f88cf3efa"
}
}
]
}
}

Elasticsearch wildcard query with spaces

I'm trying to do a wildcard query with spaces. It easily matches the words on term basis but not on field basis.
I've read the documentation which says that I need to have the field as not_analyzed but with this type set, it returns nothing.
This is the mapping with which it works on term basis:
{
"denshop" : {
"mappings" : {
"products" : {
"properties" : {
"code" : {
"type" : "string"
},
"id" : {
"type" : "long"
},
"name" : {
"type" : "string"
},
"price" : {
"type" : "long"
},
"url" : {
"type" : "string"
}
}
}
}
}
}
This is the mapping with which the exact same query returns nothing:
{
"denshop" : {
"mappings" : {
"products" : {
"properties" : {
"code" : {
"type" : "string"
},
"id" : {
"type" : "long"
},
"name" : {
"type" : "string",
"index" : "not_analyzed"
},
"price" : {
"type" : "long"
},
"url" : {
"type" : "string"
}
}
}
}
}
}
The query is here:
curl -XPOST http://127.0.0.1:9200/denshop/products/_search?pretty -d '{"query":{"wildcard":{"name":"*test*"}}}'
Response with the not_analyzed property:
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 0,
"max_score" : null,
"hits" : [ ]
}
}
Response without not_analyzed:
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 5,
"max_score" : 1.0,
"hits" : [ {
...
EDIT: Adding requested info
Here is the list of documents:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 5,
"max_score" : 1.0,
"hits" : [ {
"_index" : "denshop",
"_type" : "products",
"_id" : "3L1",
"_score" : 1.0,
"_source" : {
"id" : 3,
"name" : "Testovací produkt 2",
"code" : "",
"price" : 500,
"url" : "http://www.denshop.lh/damske-obleceni/testovaci-produkt-2/"
}
}, {
"_index" : "denshop",
"_type" : "products",
"_id" : "4L1",
"_score" : 1.0,
"_source" : {
"id" : 4,
"name" : "Testovací produkt 3",
"code" : "",
"price" : 666,
"url" : "http://www.denshop.lh/damske-obleceni/testovaci-produkt-3/"
}
}, {
"_index" : "denshop",
"_type" : "products",
"_id" : "2L1",
"_score" : 1.0,
"_source" : {
"id" : 2,
"name" : "Testovací produkt",
"code" : "",
"price" : 500,
"url" : "http://www.denshop.lh/damske-obleceni/testovaci-produkt/"
}
}, {
"_index" : "denshop",
"_type" : "products",
"_id" : "5L1",
"_score" : 1.0,
"_source" : {
"id" : 5,
"name" : "Testovací produkt 4",
"code" : "",
"price" : 666,
"url" : "http://www.denshop.lh/damske-obleceni/testovaci-produkt-4/"
}
}, {
"_index" : "denshop",
"_type" : "products",
"_id" : "6L1",
"_score" : 1.0,
"_source" : {
"id" : 6,
"name" : "Testovací produkt 5",
"code" : "",
"price" : 666,
"url" : "http://www.denshop.lh/tricka-tilka-tuniky/testovaci-produkt-5/"
}
} ]
}
}
Without the not_analyzed it returns with this:
curl -XPOST http://127.0.0.1:9200/denshop/products/_search?pretty -d '{"query":{"wildcard":{"name":"*testovací*"}}}'
But not with this (notice the space before asterisk):
curl -XPOST http://127.0.0.1:9200/denshop/products/_search?pretty -d '{"query":{"wildcard":{"name":"*testovací *"}}}'
When I add the not_analyzed to mapping, it returns no hits no matter what I put in the wildcard query.
Add a custom analyzer that should lowercase the text. Then in your search query, before passing the text to it have it lowercased in your client application.
To, also, keep the original analysis chain, I've added a sub-field to your name field that will use the custom analyzer.
PUT /denshop
{
"settings": {
"analysis": {
"analyzer": {
"keyword_lowercase": {
"type": "custom",
"tokenizer": "keyword",
"filter": [
"lowercase"
]
}
}
}
},
"mappings": {
"products": {
"properties": {
"name": {
"type": "string",
"fields": {
"lowercase": {
"type": "string",
"analyzer": "keyword_lowercase"
}
}
}
}
}
}
}
And the query will work on the sub-field:
GET /denshop/products/_search
{
"query": {
"wildcard": {
"name.lowercase": "*testovací *"
}
}
}

How to get Elasticsearch boolean match working for multiple fields

I need some expert guidance on trying to get a bool match working. I'd like the query to only return a successful search result if both 'message' matches 'Failed password for', and 'path' matches '/var/log/secure'.
This is my query:
curl -s -XGET 'http://localhost:9200/logstash-2015.05.07/syslog/_search?pretty=true' -d '{
"filter" : { "range" : { "#timestamp" : { "gte" : "now-1h" } } },
"query" : {
"bool" : {
"must" : [
{ "match_phrase" : { "message" : "Failed password for" } },
{ "match_phrase" : { "path" : "/var/log/secure" } }
]
}
}
} '
Here is the start of the output from the search:
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 46,
"max_score" : 13.308596,
"hits" : [ {
"_index" : "logstash-2015.05.07",
"_type" : "syslog",
"_id" : "AU0wzLEqqCKq_IPSp_8k",
"_score" : 13.308596,
"_source":{"message":"May 7 16:53:50 s_local#logstash-02 sshd[17970]: Failed password for fred from 172.28.111.200 port 43487 ssh2","#version":"1","#timestamp":"2015-05-07T16:53:50.554-07:00","type":"syslog","host":"logstash-02","path":"/var/log/secure"}
}, ...
The problem is if I change '/var/log/secure' to just 'var' say, and run the query, I still get a result, just with a lower score. I understood the bool...must construct meant both match terms here would need to be successful. What I'm after is no result if 'path' doesn't exactly match '/var/log/secure'...
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 46,
"max_score" : 10.354593,
"hits" : [ {
"_index" : "logstash-2015.05.07",
"_type" : "syslog",
"_id" : "AU0wzLEqqCKq_IPSp_8k",
"_score" : 10.354593,
"_source":{"message":"May 7 16:53:50 s_local#logstash-02 sshd[17970]: Failed password for fred from 172.28.111.200 port 43487 ssh2","#version":"1","#timestamp":"2015-05-07T16:53:50.554-07:00","type":"syslog","host":"logstash-02","path":"/var/log/secure"}
},...
I checked the mappings for these fields to check that they are not analyzed :
curl -X GET 'http://localhost:9200/logstash-2015.05.07/_mapping?pretty=true'
I think these fields are non analyzed and so I believe the search will not be analyzed too (based on some training documentation I read recently from elasticsearch). Here is a snippet of the output _mapping for this index below.
....
"message" : {
"type" : "string",
"norms" : {
"enabled" : false
},
"fields" : {
"raw" : {
"type" : "string",
"index" : "not_analyzed",
"ignore_above" : 256
}
}
},
"path" : {
"type" : "string",
"norms" : {
"enabled" : false
},
"fields" : {
"raw" : {
"type" : "string",
"index" : "not_analyzed",
"ignore_above" : 256
}
}
},
....
Where am I going wrong, or what am I misunderstanding here?
As mentioned in the OP you would need to use the "not_analyzed" view of the fields but as per the OP mapping the non-analyzed version of the field is message.raw, path.raw
Example:
{
"filter" : { "range" : { "#timestamp" : { "gte" : "now-1h" } } },
"query" : {
"bool" : {
"must" : [
{ "match_phrase" : { "message.raw" : "Failed password for" } },
{ "match_phrase" : { "path.raw" : "/var/log/secure" } }
]
}
}
}
.The link alongside gives more insight to multi-fields
.To expand further
The mapping in the OP for path is as follows:
"path" : {
"type" : "string",
"norms" : {
"enabled" : false
},
"fields" : {
"raw" : {
"type" : "string",
"index" : "not_analyzed",
"ignore_above" : 256
}
}
}
This specifies that the path field uses the default analyzer and field.raw is not analyzed.
If you want to set the path field to be not analyzed instead of raw it would be something on these lines:
"path" : {
"type" : "string",
"index" : "not_analyzed",
"norms" : {
"enabled" : false
},
"fields" : {
"raw" : {
"type" : "string",
"index" : <whatever analyzer you want>,
"ignore_above" : 256
}
}
}

Resources