Elastic Search in a complex document - elasticsearch

I have the document shown below. I want to search it, but I can't work out how. How can I search on fields inside complex (nested) objects in Elasticsearch?
My Document
{
"_index": "vehicles",
"_type": "car",
"_id": "e16bd474-fa8e-4858-ab6c-3bbb3d0aa603",
"_version": 1,
"found": true,
"_source": {
"Type": {
"Name": "Mustang"
}
}
}
My Search Query
GET _search
{
"query":{
"filtered": {
"filter": {
"term": {
"Name": "Mustang"
}
}
}
},
"from":0,
"size":10
}

The standard analyzer is applied to your Name field, so the term Mustang is stored in the index as mustang. A term filter does not analyze its input, so change your query to use "Name": "mustang" and you should get a match.
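To make the point about analysis concrete, here is a minimal Python sketch of a toy inverted index. It is not Elasticsearch code: `standard_analyze`, `build_index`, `term_lookup` and the document id `car-1` are hypothetical names, and the analyzer is only a rough stand-in (split on non-word characters, lowercase) for what the standard analyzer does.

```python
import re

def standard_analyze(text):
    """Rough stand-in for the standard analyzer: split on non-word chars, lowercase."""
    return [tok.lower() for tok in re.findall(r"\w+", text)]

def build_index(docs, field):
    """Map each indexed token to the set of document ids that contain it."""
    index = {}
    for doc_id, doc in docs.items():
        for token in standard_analyze(doc[field]):
            index.setdefault(token, set()).add(doc_id)
    return index

def term_lookup(index, term):
    """A term lookup compares the raw term with the indexed tokens; nothing is analyzed."""
    return index.get(term, set())

# Hypothetical document id; the field value is taken from the question.
docs = {"car-1": {"Name": "Mustang"}}
index = build_index(docs, "Name")

print(term_lookup(index, "Mustang"))  # empty set: the index only holds "mustang"
print(term_lookup(index, "mustang"))  # {'car-1'}
```

The lookup with the original casing finds nothing because only the lowercased token was written to the index.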

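If you only want the document whose name is Mustang, you can wrap the term query in a bool query. Note that the term has to be the lowercase token that was actually indexed, and the field has to be the full path Type.Name:
{
"query" : {
"bool" : {
"must" : {
"term" : {
"Type.Name" : "mustang"
}
}
}
}
}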

There are two issues:
1. You are using a term filter, which searches for the exact token Mustang in the index. However, the standard analyzer is being used, so the value is actually indexed as mustang.
2. You are searching in the wrong field. You should use the full dotted path to the field, e.g. Type.Name.
This query should work as expected:
{"query":{ "filtered": { "filter": {
"term": { "Type.Name": "mustang" }
}}}}
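As an illustration of why the field has to be addressed as Type.Name, the following Python sketch flattens a nested _source into dotted paths. The `flatten` helper is hypothetical and only models the naming; it says nothing about how Elasticsearch stores the data internally.

```python
def flatten(source, prefix=""):
    """Turn nested objects into a flat {dotted.path: value} mapping."""
    fields = {}
    for key, value in source.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            fields.update(flatten(value, path))
        else:
            fields[path] = value
    return fields

# _source taken from the question.
source = {"Type": {"Name": "Mustang"}}
fields = flatten(source)

print(fields)            # {'Type.Name': 'Mustang'}
print("Name" in fields)  # False: there is no top-level Name field
```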


What is difference between match and bool must match query in Elasticsearch

What is the difference between a plain match query and a match query wrapped in bool/must in Elasticsearch?
First, using only the match query:
{
"query":{
"match":{
"address":"mill"
}
}
}
Second, using a compound query:
{
"query": {
"bool": {
"must": [
{ "match": { "address": "mill" } }
]
}
}
}
What is the difference between them?
When you use only one match inside a bool must clause, there is no difference. The bool query is useful when you want to combine multiple (boolean) criteria; see the official Elasticsearch documentation for more. It supports the following clause types:
must
must_not
filter
should
Let me show by taking a small example from your question.
Index mapping with just address and first_name
{
"mappings": {
"properties": {
"address": {
"type": "text"
},
"first_name" :{
"type" : "text"
}
}
}
}
Index 3 documents, all with the same address, mill, but different first_name values:
{
"address" : "mill",
"first_name" : "Johnson"
}
{
"address" : "mill",
"first_name" : "Parker"
}
{
"address" : "mill",
"first_name" : "opster"
}
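Search query that returns every document whose address matches mill, except those whose first_name is parker (must_not is a sibling of must inside bool, and needs its own query clause):
{
"query": {
"bool": {
"must": [
{
"match": {
"address": "mill"
}
}
],
"must_not": [
{
"match": {
"first_name": "parker"
}
}
]
}
}
}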
The result contains only 2 documents:
"hits": [
{
"_index": "so-60620921-bool",
"_type": "_doc",
"_id": "2",
"_score": 0.13353139,
"_source": {
"address": "mill",
"first_name": "opster"
}
},
{
"_index": "so-60620921-bool",
"_type": "_doc",
"_id": "3",
"_score": 0.13353139,
"_source": {
"address": "mill",
"first_name": "Johnson"
}
}
]
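The boolean logic behind this example can be sketched in plain Python. This is a toy model that ignores scoring; `analyze`, `match`, `bool_query` and `search` are hypothetical helpers, and the document ids are assumed from the result shown above (with Parker as id 1).

```python
import re

def analyze(text):
    """Rough stand-in for the standard analyzer."""
    return [tok.lower() for tok in re.findall(r"\w+", text)]

def match(field, query_text):
    """Clause that is true when any analyzed query token occurs in the field."""
    def clause(doc):
        field_tokens = set(analyze(doc.get(field, "")))
        return any(tok in field_tokens for tok in analyze(query_text))
    return clause

def bool_query(must=(), must_not=()):
    """Every must clause has to match and no must_not clause may match."""
    def clause(doc):
        return all(c(doc) for c in must) and not any(c(doc) for c in must_not)
    return clause

def search(docs, query):
    return [doc["_id"] for doc in docs if query(doc)]

docs = [
    {"_id": "1", "address": "mill", "first_name": "Parker"},
    {"_id": "2", "address": "mill", "first_name": "opster"},
    {"_id": "3", "address": "mill", "first_name": "Johnson"},
]

plain = search(docs, match("address", "mill"))
wrapped = search(docs, bool_query(must=[match("address", "mill")]))
filtered = search(docs, bool_query(must=[match("address", "mill")],
                                   must_not=[match("first_name", "parker")]))

print(plain)     # ['1', '2', '3']
print(wrapped)   # ['1', '2', '3'] : same as the plain match
print(filtered)  # ['2', '3']
```

A single match wrapped in must selects the same documents as the bare match; the bool wrapper only starts to matter once a second clause such as must_not is added.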
Based on the OP's comments, see the documentation on query and filter context to understand the performance aspects in detail.
As written in your question, both queries perform the same action.
The match query is a very straightforward full-text condition.
The bool query lets you combine multiple fields and conditions, such as an exists query (to check that a field is present in the document), should clauses (an OR equivalent) and must_not clauses (a NOT equivalent).
Taking your example again: since the bool query has only a single must clause containing a match condition, it returns exactly the documents whose address field contains the value mill.
Hope this is helpful! :)

Use Elasticsearch percolate with specific type of field name

I'm building a subscription system for notifications using the percolator field type of Elasticsearch 7.x. The problem is that I can't get a percolate query to match on certain kinds of field names.
This is an example of the indexed data. As you can see, I have a query indexed to be able to perform a percolate query. The difference I would like to mention is the name of the field in the query which can be state or created_by.full_name.raw
{
"_index": "widgets_2020",
"_type": "widget",
"_score": 1.0,
"_source": {
"created_at": "2020-01-09T21:58:14.123Z",
"query": {
"bool": {
"must": [],
"filter": [
{
"terms": {
"created_by.full_name.raw": [
"Ivan Ledner"
]
}
}
]
}
}
}
},
{
"_index": "widgets_2020",
"_type": "widget",
"_score": 1.0,
"_source": {
"created_at": "2020-01-09T22:02:24.133Z",
"query": {
"bool": {
"must": [],
"filter": [
{
"terms": {
"state": [
"done"
]
}
}
]
}
}
}
}
When I do something like this, Elasticsearch returns the documents I expect.
widgets_2020/_search
{
"query" : {
"percolate" : {
"field" : "query",
"document" : {
"state": ["created"]
}
}
}
}
But when I search with this, it returns nothing.
widgets_2020/_search
{
"query" : {
"percolate" : {
"field" : "query",
"document" : {
"created_by.full_name.raw": ["Ivan Ledner"]
}
}
}
}
Is there a different way of dealing with these types of names? Thanks in advance!
The problem was that I had enabled the map_unmapped_fields_as_text option, which, as the name says, mapped all my unmapped fields as text. I solved it by mapping all the attributes manually, after which the percolator started to work as expected.
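One plausible reading of why the text mapping broke only the created_by.full_name.raw subscription is sketched below in Python. This is an assumption, not something the thread confirms: it supposes that a text field is tokenized and lowercased while a keyword field keeps the whole value, and that a terms query compares its values with the indexed tokens verbatim. `index_tokens` and `terms_query_matches` are hypothetical helpers.

```python
import re

def index_tokens(value, mapping_type):
    """keyword: the whole value is one token; text: lowercased word tokens (rough model)."""
    if mapping_type == "keyword":
        return {value}
    return {tok.lower() for tok in re.findall(r"\w+", value)}

def terms_query_matches(values, tokens):
    """A terms query matches if any of its values equals an indexed token exactly."""
    return any(value in tokens for value in values)

as_text = index_tokens("Ivan Ledner", "text")        # {'ivan', 'ledner'}
as_keyword = index_tokens("Ivan Ledner", "keyword")  # {'Ivan Ledner'}

print(terms_query_matches(["Ivan Ledner"], as_text))                 # False
print(terms_query_matches(["Ivan Ledner"], as_keyword))              # True
print(terms_query_matches(["done"], index_tokens("done", "text")))   # True
```

Under this model a single lowercase word such as done survives the text analysis unchanged, which would explain why the state subscription kept working while the full-name one did not.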

Elasticsearch query_string search complex keyword by its terms

Now, I know that keyword is not supposed to comprise unstructured text, but let's say that for some reason it just so happened that such text was written into keyword field.
When I search for such documents using match or term queries, the document is not found, but when I search using query_string, the document is found by a partial match (a "term" inside the keyword). I don't understand how this is possible when the Elasticsearch documentation clearly states that a keyword is indexed as is, without being tokenized into terms.
Example:
My index mapping:
PUT my_index
{
"mappings": {
"my_type": {
"properties": {
"full_text": {
"type": "text"
},
"exact_value": {
"type": "keyword"
}
}
}
}
}
Then I put a document in:
PUT my_index/my_type/2
{
"full_text": "full text search",
"exact_value": "i want to find this trololo!"
}
And imagine my surprise when I get a document by keyword term, not a full match:
GET my_index/my_type/_search
{
"query": {
"match": {
"exact_value": "trololo"
}
}
}
- no result;
GET my_index/my_type/_search
{
"query": {
"term": {
"exact_value": "trololo"
}
}
}
- no result;
POST my_index/_search
{"query":{"query_string":{"query":"trololo"}}}
- my document is returned(!):
"hits": {
"total": 1,
"max_score": 0.27233246,
"hits": [
{
"_index": "my_index",
"_type": "my_type",
"_id": "2",
"_score": 0.27233246,
"_source": {
"full_text": "full text search",
"exact_value": "i want to find this trololo!"
}
}
]
}
When you run a query_string query in Elasticsearch like the one below:
POST index/_search
{
"query": {
"query_string": {
"query": "trololo"
}
}
}
This actually searches the _all field, which, unless you configure it otherwise, is analyzed with the standard analyzer.
If you specify the field in the query, as in the following, you won't get any records back for the keyword field.
POST my_index/_search
{
"query": {
"query_string": {
"default_field": "exact_value",
"query": "trololo"
}
}
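The difference between searching the keyword field and searching _all can be modelled with a few lines of Python. This is only a sketch under the assumption that _all is the concatenation of the field values run through a standard-like analyzer; `standard_analyze` is a hypothetical stand-in.

```python
import re

def standard_analyze(text):
    """Rough stand-in for the standard analyzer."""
    return [tok.lower() for tok in re.findall(r"\w+", text)]

# Document from the question.
doc = {
    "full_text": "full text search",
    "exact_value": "i want to find this trololo!",
}

# keyword field: the whole value is stored as a single token.
keyword_tokens = {doc["exact_value"]}

# _all (assumed model): every field value concatenated, then analyzed.
all_tokens = set(standard_analyze(" ".join(doc.values())))

print("trololo" in keyword_tokens)  # False: only the full string is a token
print("trololo" in all_tokens)      # True: _all was tokenized
```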

Does analyzer prevent fields from highlighting?

Could you help me with a small problem regarding language-specific analyzers and highlighting in Elasticsearch?
I need to search documents by a query string and highlight the matched strings.
Here is my mapping:
{
"usr": {
"properties": {
"text0": {
"type": "string",
"analyzer": "english"
},
"text1": {
"type": "string"
}
}
}
}
Note that the "english" analyzer is set for the "text0" field, while the "text1" field uses the standard analyzer by default.
For now there is one document in my index:
"hits": [{
"_index": "tt",
"_type": "usr",
"_id": "AUxvIPAv84ayQMZV-3Ll",
"_score": 1,
"_source": {
"text0": "highlighted. need to be highlighted.",
"text1": "highlighted. need to be highlighted."
}
}]
Consider following query:
{
"query": {
"query_string" : {
"query" : "highlighted"
}
},
"highlight" : {
"fields" : {
"*" : {}
}
}
}
I expected each field in the document to be highlighted, but highlighting appeared only in the "text1" field (where no analyzer is set):
"hits": [{
"_type": "usr",
"_source": {
"text0": "highlighted. need to be highlighted.",
"text1": "highlighted. need to be highlighted."
},
"_score": 0.19178301,
"_index": "tt",
"highlight": {
"text1": [
"<em>highlighted</em>. need to be <em>highlighted</em>."
]
},
"_id": "AUxvIPAv84ayQMZV-3Ll"
}]
Now consider the following query (I expected "highlight" to match "highlighted" because of the analyzer):
{
"query": {
"query_string" : {
"query" : "highlight"
}
},
"highlight" : {
"fields" : {
"*" : {}
}
}
}
But there were no hits in the response at all. (Did the english analyzer even work here?)
"hits": {
"hits": [],
"total": 0,
"max_score": null
}
At last, consider some curl commands (requests and responses):
curl "http://localhost:9200/tt/_analyze?field=text0" -d "highlighted"
{"tokens":[{
"token":"highlight",
"start_offset":0,
"end_offset":11,
"type":"<ALPHANUM>",
"position":1
}]}
curl "http://localhost:9200/tt/_analyze?field=text1" -d "highlighted"
{"tokens":[{
"token":"highlighted",
"start_offset":0,
"end_offset":11,
"type":"<ALPHANUM>",
"position":1
}]}
We can see that passing the text through the english and standard analyzers gives different results.
Finally, the question: does an analyzer prevent fields from being highlighted? How can I get my fields highlighted in a full-text search?
P.S. I use Elasticsearch v1.4.4 on my local machine with Windows 8.1.
It has to do with your query. You are using the query_string query and you are not specifying the field so it is searching on the _all field by default.
That is why you're seeing the strange results. Change your query to a multi_match query that searches on both fields:
{
"query": {
"multi_match": {
"fields": [
"text1",
"text0"
],
"query": "highlighted"
}
},
"highlight": {
"fields": {
"*": {}
}
}
}
Now highlight results for both fields will be returned in the response.
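A rough Python model of the situation, under stated assumptions: `standard_analyze` approximates the standard analyzer, `toy_english` is a deliberately crude stand-in for the english analyzer (it only strips a trailing "ed" from longer words and ignores stop words), and _all is assumed to be the field values analyzed with the standard analyzer. None of this is Elasticsearch code.

```python
import re

def standard_analyze(text):
    """Rough stand-in for the standard analyzer."""
    return [tok.lower() for tok in re.findall(r"\w+", text)]

def toy_english(text):
    """Crude stand-in for the english analyzer: strip a trailing 'ed' from longer words."""
    return [tok[:-2] if tok.endswith("ed") and len(tok) > 5 else tok
            for tok in standard_analyze(text)]

text = "highlighted. need to be highlighted."
index = {
    "text0": set(toy_english(text)),            # english analyzer
    "text1": set(standard_analyze(text)),       # standard analyzer
    "_all": set(standard_analyze(text + " " + text)),
}
analyzers = {"text0": toy_english, "text1": standard_analyze, "_all": standard_analyze}

def matches(field, query_text):
    """Analyze the query with the field's own analyzer, then look the tokens up."""
    return any(tok in index[field] for tok in analyzers[field](query_text))

print(matches("_all", "highlight"))     # False: _all only holds "highlighted"
print(matches("text0", "highlight"))    # True: text0 holds the stem "highlight"
print(matches("text0", "highlighted"))  # True: the query is stemmed as well
print(matches("text1", "highlighted"))  # True
```

In this model, a query that targets each field directly is analyzed the same way as that field's content, which is consistent with the multi_match query finding (and highlighting) both fields.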

Elasticsearch exact match of specific field(s)

I'm trying to filter my elasticsearch index by specific fields, the "country" field to be exact. However, I keep getting loads of other results (other countries) back that are not exact.
Please could someone point me in the right direction.
I've tried the following searches:
GET http://127.0.0.1:9200/decision/council/_search
{
"query": {
"filtered": {
"filter": {
"term": {
"country": "Algeria"
}
}
}
}
}
Here is an example document:
{
"_index": "decision",
"_id": "54290140ec882c6dac5ae9dd",
"_score": 1,
"_type": "council",
"_source": {
"document": "DEV DOCUMENT",
"id": "54290140ec882c6dac5ae9dd",
"date_updated": 1396448966,
"pdf_file": null,
"reported": true,
"date_submitted": 1375894031,
"doc_file": null,
"country": "Algeria"
}
}
You can use the match_phrase query instead, which analyzes the query text and requires its terms to appear in the field in the same order:
POST http://127.0.0.1:9200/decision/council/_search
{
"query" : {
"match_phrase" : { "country" : "Algeria"}
}
}
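To show what phrase matching adds over a plain match, here is a small Python sketch. The helpers `analyze`, `match` and `match_phrase` are hypothetical toy versions, and the country values other than Algeria are made up for illustration.

```python
import re

def analyze(text):
    """Rough stand-in for the standard analyzer."""
    return [tok.lower() for tok in re.findall(r"\w+", text)]

def match(field_text, query_text):
    """Toy match: true if any query token occurs in the field."""
    field_tokens = set(analyze(field_text))
    return any(tok in field_tokens for tok in analyze(query_text))

def match_phrase(field_text, query_text):
    """Toy match_phrase: all query tokens must occur consecutively and in order."""
    field_tokens = analyze(field_text)
    query_tokens = analyze(query_text)
    n = len(query_tokens)
    if n == 0:
        return False
    return any(field_tokens[i:i + n] == query_tokens
               for i in range(len(field_tokens) - n + 1))

# Hypothetical country values, apart from Algeria.
countries = ["Algeria", "South Africa", "South Korea"]

print([c for c in countries if match(c, "South Africa")])         # ['South Africa', 'South Korea']
print([c for c in countries if match_phrase(c, "South Africa")])  # ['South Africa']
print([c for c in countries if match_phrase(c, "Algeria")])       # ['Algeria']
```

Because the query text is analyzed, the capitalized Algeria still finds the lowercased token, and multi-word values no longer match on a single shared word.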
