Elasticsearch mappings don't seem to apply to query

I use Elasticsearch with a Spring Boot application. The application has an index customer, and each customer document contains a field secretKey. This secret key is a string built from letters and numbers in the form FOOBAR-000.
My goal is to select exactly one customer by his secret key, so I changed the mapping to NOT ANALYZE that field, but it doesn't seem to work. What am I doing wrong?
Here's my mapping:
curl -X GET 'http://localhost:9200/customer/_mapping'
{
  "customer": {
    "mappings": {
      "customer": {
        "properties": {
          "secretKey": {
            "type": "string",
            "index": "not_analyzed"
          }
        }
      }
    }
  }
}
but when I run the following query:
curl -XGET "http:/localhost:9200/customer/_validate/query?explain" -d'
{
"query": {
"query_string": {
"query": "FOOBAR-3121"
}
}
}'
I get the following explanation:
"explanations": [
{
"index": "customer",
"valid": true,
"explanation": "_all:foobar _all:3121"
},
]

From my understanding, you have an index called "customer" and, within this index, documents containing a "customer" field. In your case the secretKey is nested inside the "customer" field. For some reason Elasticsearch behaves strangely if you encapsulate objects without specifying that they are of nested type; the nested type documentation explains this behaviour in detail. If you specify it with the following:
{
  "customer": {
    "mappings": {
      "_doc": {
        "properties": {
          "customer": {
            "type": "nested"
          }
        }
      }
    }
  }
}
Then it should work with your query.
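For illustration, a nested query against such a mapping might look like this (a sketch; the path customer and the full field path customer.secretKey are assumptions based on the mapping above):
curl -XGET "http://localhost:9200/customer/_search" -d'
{
  "query": {
    "nested": {
      "path": "customer",
      "query": {
        "term": {
          "customer.secretKey": "FOOBAR-3121"
        }
      }
    }
  }
}'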

You need to specify the field name in your query; without it, Elasticsearch executes the query against all fields, which is why you see _all. Try this one:
curl -XGET "http:/localhost:9200/customer/_validate/query?explain" -d'
{
"query": {
"term": {
"secretKey": {
"value": "FOOBAR-3121"
}
}
}
}'
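Alternatively, if you want to keep query_string, you can point it at the field explicitly via its default_field parameter (a sketch; since secretKey is not_analyzed, the value should no longer be split into foobar and 3121):
curl -XGET "http://localhost:9200/customer/_validate/query?explain" -d'
{
  "query": {
    "query_string": {
      "default_field": "secretKey",
      "query": "FOOBAR-3121"
    }
  }
}'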

My goal was to select exactly one customer by his secret key
Your requirement is strict, so use a match query to select ONLY the matched customer!
curl -XGET "http://localhost:9200/customer/_validate/query?explain" -d'
{
  "query": {
    "match": {
      "secretKey": "FOOBAR-3121"
    }
  }
}'
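And to actually fetch the customer rather than just validate the query, the same body works against the _search endpoint (a sketch):
curl -XGET "http://localhost:9200/customer/_search" -d'
{
  "query": {
    "match": {
      "secretKey": "FOOBAR-3121"
    }
  }
}'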

Related

Kibana - missing text highlighting for multi-field mapping

I am experimenting with ECS - Elastic Common Schema.
We need to highlight text search for the field error.stack_trace. This field is defined as a multi-field in the ECS mapping.
I just ran a simple test with Elasticsearch and Kibana 7.17.4: one field defined as a multi-field and one as a single field.
PUT simple-index-01
{
  "mappings": {
    "properties": {
      "stack_trace01": { "type": "text" },
      "stack_trace02": {
        "fields": {
          "text": {
            "type": "text"
          }
        },
        "type": "wildcard"
      }
    }
  }
}
POST simple-index-01/_doc
{
  "@timestamp" : "2022-06-07T08:21:05.000Z",
  "stack_trace01": "java.lang.NullPointerException: null",
  "stack_trace02": "java.lang.NullPointerException: null"
}
Is it expected Kibana behavior not to highlight multi-fields?
The wildcard type cannot be searched using full-text queries, as mentioned in the documentation (it is part of the keyword type family):
The wildcard field type is a specialized keyword field for
unstructured machine-generated content you plan to search using
grep-like wildcard and regexp queries.
So when you try the query below, it will not return a result, and this is the reason why your stack_trace02 field is not highlighted in Discover.
POST simple-index-01/_search
{
  "query": {
    "match": {
      "stack_trace02": "null"
    }
  }
}
But the query below will return a result:
{
  "query": {
    "wildcard": {
      "stack_trace02": {
        "value": "*null*"
      }
    }
  }
}
You can instead create an index mapping like the one below, where the parent field is of type text:
PUT simple-index-01
{
  "mappings": {
    "properties": {
      "stack_trace01": {
        "type": "text"
      },
      "stack_trace02": {
        "fields": {
          "wildcard": {
            "type": "wildcard"
          }
        },
        "type": "text"
      }
    }
  }
}
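With this mapping, the full-text match query from earlier returns the document again, and highlighting can be requested in the usual way; for example (a sketch):
POST simple-index-01/_search
{
  "query": {
    "match": {
      "stack_trace02": "null"
    }
  },
  "highlight": {
    "fields": {
      "stack_trace02": {}
    }
  }
}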
You can now use stack_trace02.wildcard when you want to run wildcard-type queries.
There is already an open issue about similar behaviour, but it is not for the wildcard type.

Elasticsearch script query involving root and nested values

Suppose I have a simplified Organization document with nested publication values like so (ES 2.3):
{
  "organization" : {
    "dateUpdated" : 1395211600000,
    "publications" : [
      {
        "dateCreated" : 1393801200000
      },
      {
        "dateCreated" : 1401055200000
      }
    ]
  }
}
I want to find all Organizations that have a publication dateCreated < the organization's dateUpdated:
{
  "query": {
    "nested": {
      "path": "publications",
      "query": {
        "bool": {
          "filter": [
            {
              "script": {
                "script": "doc['publications.dateCreated'].value < doc['dateUpdated'].value"
              }
            }
          ]
        }
      }
    }
  }
}
My problem is that when I perform a nested query, the query does not have access to the root document's values, so doc['dateUpdated'].value is invalid and I get 0 hits.
Is there a way to pass a value into the nested query? Or is my nested approach completely off here? I would like to avoid creating a separate document type just for publications if possible.
Thanks.
You cannot access the root values from a nested query context; nested objects are indexed as separate documents. From the documentation:
The nested clause “steps down” into the nested comments field. It no
longer has access to fields in the root document, nor fields in any
other nested document.
You can get the desired results with the help of the copy_to parameter. Another way would be to use include_in_parent or include_in_root, but those might be deprecated in the future, and they also increase the index size because every field of the nested type gets included in the root document. In this case the copy_to functionality is the better option.
This is a sample index:
PUT nested_index
{
  "mappings": {
    "blogpost": {
      "properties": {
        "rootdate": {
          "type": "date"
        },
        "copy_of_nested_date": {
          "type": "date"
        },
        "comments": {
          "type": "nested",
          "properties": {
            "nested_date": {
              "type": "date",
              "copy_to": "copy_of_nested_date"
            }
          }
        }
      }
    }
  }
}
Here every value of nested_date will be copied to copy_of_nested_date, so copy_of_nested_date will look something like [1401055200000,1393801200000,1221542100000], and then you can use a simple query like this to get the results:
{
  "query": {
    "bool": {
      "filter": [
        {
          "script": {
            "script": "doc['rootdate'].value < doc['copy_of_nested_date'].value"
          }
        }
      ]
    }
  }
}
You don't have to change your nested structure, but you will have to reindex the documents after adding copy_to to the publications' dateCreated field.
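A minimal reindex sketch, assuming the updated mapping lives in a new index (the index names organizations and organizations_v2 are hypothetical; the _reindex API is available from ES 2.3 onwards):
POST _reindex
{
  "source": { "index": "organizations" },
  "dest": { "index": "organizations_v2" }
}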

Elasticsearch match exact term

I have an Elasticsearch repo and an application that creates documents for what we call 'assets'. I need to prevent users from creating 'assets' with the same 'title'.
When the user tries to create an 'asset', I query the repo with the title, and if there is a match an error message is shown to the user.
My problem is that when I query the title I am getting multiple results (for similar matches).
This is my query so far:
GET assets-1/asset/_search
{
  "query": {
    "match": {
      "title": {
        "query": "test",
        "operator": "and"
      }
    }
  }
}
I have many records with title: 'test 1', 'test 2', 'test bla' and only one with the title 'test'.
But I am getting all of the above.
Is there any condition or property I have to add to the query so that it matches the term exactly?
Your title field is probably analyzed and thus the test token will match any title containing that token.
In order to implement an exact match you need to have a not_analyzed field and do a term query on it.
You need to change the mapping of your title field to this:
curl -XPUT localhost:9200/assets-1/_mapping/asset -d '{
  "asset": {
    "properties": {
      "title": {
        "type": "string",
        "fields": {
          "raw": {
            "type": "string",
            "index": "not_analyzed"
          }
        }
      }
    }
  }
}'
Then you need to reindex your data, and you'll be able to run an exact match query like this:
curl -XPOST localhost:9200/assets-1/asset/_search -d '{
  "query": {
    "term": {
      "title.raw": "test"
    }
  }
}'
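To double-check that the raw sub-field is in place before reindexing, you can inspect the mapping:
curl -XGET localhost:9200/assets-1/_mapping/asset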

How to specify or target a field from a specific document type in queries or filters in Elasticsearch?

Given:
Documents of two different types, let's say 'product' and 'category', are indexed to the same Elasticsearch index.
Both document types have a field 'tags'.
Problem:
I want to build a query that returns results of both types, but documents of type 'product' are allowed to have tags 'X' and 'Y', and documents of type 'category' are only allowed to have tag 'Z'. How can I achieve this? It appears I can't use product.tags and category.tags, since ES would then look for a product/category field on the documents, which is not what I intend.
Note:
While for the example above there might be some kind of workaround, I'm looking for a general way to target or specify fields of a specific document type when writing queries. I basically want to 'namespace' the field names used in my query so only documents of the type I want to work with are considered.
I think field aliasing would be the best answer for you, but it's not possible.
Instead you can use "copy_to" but I it probably affects index size:
DELETE /test
PUT /test
{
  "mappings": {
    "product" : {
      "properties": {
        "tags": { "type": "string", "copy_to": "ptags" },
        "ptags": { "type": "string" }
      }
    },
    "category" : {
      "properties": {
        "tags": { "type": "string", "copy_to": "ctags" },
        "ctags": { "type": "string" }
      }
    }
  }
}
PUT /test/product/1
{ "tags":"X" }
PUT /test/product/2
{ "tags":"Y" }
PUT /test/category/1
{ "tags":"Z" }
And you can query one of the fields, or several of them:
GET /test/product,category/_search
{
  "query": {
    "term": {
      "ptags": {
        "value": "x"
      }
    }
  }
}
GET /test/product,category/_search
{
  "query": {
    "multi_match": {
      "query": "x",
      "fields": [ "ctags", "ptags" ]
    }
  }
}
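And because ctags only ever receives values copied from category documents, a term query on it effectively namespaces the search to categories (a sketch; the value is lowercased because tags is an analyzed string):
GET /test/product,category/_search
{
  "query": {
    "term": {
      "ctags": {
        "value": "z"
      }
    }
  }
}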

Elasticsearch aggregations separated words

I simply ran an aggregation in a browser plugin (Marvel). As you can see in the picture below, there is only one doc matching the query, but the aggregation is separated by spaces, which doesn't make sense; I want to aggregate per distinct doc. In this scenario there should be only one group with count 1 and key "Drow Ranger".
What is the correct way to do this in Elasticsearch?
It's probably because your heroname field is analyzed and thus "Drow Ranger" gets tokenized and indexed as "drow" and "ranger".
One way to get around this is to transform your heroname field to a multi-field with an analyzed part (the one you search on with the wildcard query) and another not_analyzed part (the one you can aggregate on).
You should create your index like this and specify the proper mapping for your heroname field:
curl -XPUT localhost:9200/dota2 -d '{
  "mappings": {
    "agust": {
      "properties": {
        "heroname": {
          "type": "string",
          "fields": {
            "raw": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        },
        ... your other fields go here
      }
    }
  }
}'
Then you can run your aggregation on the heroname.raw field instead of the heroname field.
UPDATE
If you just want to try it on the heroname field, you can modify that field without recreating the whole index. If you run the following command, it will simply add the new heroname.raw sub-field to your existing heroname field. Note that you still have to reindex your data, though:
curl -XPUT localhost:9200/dota2/_mapping/agust -d '{
  "properties": {
    "heroname": {
      "type": "string",
      "fields": {
        "raw": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}'
Then you can keep using heroname in your wildcard query, but your aggregation will look like this:
{
  "aggs": {
    "asd": {
      "terms": {
        "field": "heroname.raw", <--- use the raw field here
        "size": 0
      }
    }
  }
}
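Putting it together, a full request might combine the wildcard query on the analyzed field with the terms aggregation on the raw one (a sketch; the index and type names dota2/agust are taken from the commands above):
curl -XGET localhost:9200/dota2/agust/_search -d '{
  "query": {
    "wildcard": {
      "heroname": "*drow*"
    }
  },
  "aggs": {
    "asd": {
      "terms": {
        "field": "heroname.raw",
        "size": 0
      }
    }
  }
}'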
