Elasticsearch query_string search complex keyword by its terms - elasticsearch

I know that a keyword field is not meant to hold unstructured text, but suppose that such text ended up in a keyword field anyway.
When I search for such a document with a match or term query, it is not found, but a query_string search finds it by a partial match (a single "term" inside the keyword). I don't understand how this is possible, since the Elasticsearch documentation clearly states that a keyword is indexed as is, without being tokenized into terms.
Example:
My index mapping:
PUT my_index
{
"mappings": {
"my_type": {
"properties": {
"full_text": {
"type": "text"
},
"exact_value": {
"type": "keyword"
}
}
}
}
}
Then I put a document in:
PUT my_index/my_type/2
{
"full_text": "full text search",
"exact_value": "i want to find this trololo!"
}
And imagine my surprise when I get the document back by a single term from the keyword, not by a full match:
GET my_index/my_type/_search
{
"query": {
"match": {
"exact_value": "trololo"
}
}
}
- no result;
GET my_index/my_type/_search
{
"query": {
"term": {
"exact_value": "trololo"
}
}
}
- no result;
POST my_index/_search
{"query":{"query_string":{"query":"trololo"}}}
- my document is returned(!):
"hits": {
"total": 1,
"max_score": 0.27233246,
"hits": [
{
"_index": "my_index",
"_type": "my_type",
"_id": "2",
"_score": 0.27233246,
"_source": {
"full_text": "full text search",
"exact_value": "i want to find this trololo!"
}
}
]
}

When you run a query_string query in Elasticsearch without naming a field, like below
POST index/_search
{
"query": {
"query_string": {
"query": "trololo"
}
}
}
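This actually searches the _all field, which is analyzed by the standard analyzer unless you configure it otherwise. The _all field holds the concatenated values of the other fields, so "trololo" is indexed there as a separate term.
If you specify the keyword field in the query, as in the following, you won't get the record back, because that field contains only the whole value as a single term.
POST my_index/_search
{
"query": {
"query_string": {
"default_field": "exact_value",
"query": "trololo"
}
}
}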
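To make the difference concrete, here is a minimal Python sketch of what ends up in the index. It is not Elasticsearch code: standard_like_tokens is a hypothetical, much simplified stand-in for the standard analyzer (lowercase, split on non-word characters), and the _all field is modelled as the field values joined by spaces.

```python
import re


def standard_like_tokens(text):
    """Rough stand-in for the standard analyzer: lowercase, split on non-word characters."""
    return [token for token in re.split(r"\W+", text.lower()) if token]


def keyword_tokens(text):
    """A keyword field indexes the whole value as one unmodified term."""
    return [text]


doc = {
    "full_text": "full text search",
    "exact_value": "i want to find this trololo!",
}

# Terms indexed for the keyword field: just the complete string.
exact_value_terms = set(keyword_tokens(doc["exact_value"]))

# Terms indexed for _all: every field value, analyzed as text.
all_terms = set(standard_like_tokens(" ".join(doc.values())))

print("trololo" in exact_value_terms)  # False: only the full string is a term
print("trololo" in all_terms)          # True: _all was tokenized
```

The term and match queries on exact_value look only at the first set, while a query_string query without a field looks at the second one.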

Related

Some weird problem with fuzzy query in elasticsearch

I have one doc in ES:
{
"_type": "_doc",
"_id": "109487",
"_score": null,
"_source": {
"id": "109487",
"title": "Interstellar",
"year": 2014,
"genre": [
"Sci-Fi",
"IMAX"
]
},
"sort": [
"Interstellar"
]
}
I am searching with a fuzzy query like
{
"query": {
"fuzzy": {
"title": {"value": "intersteller", "fuzziness": 1}
}
}
}
The weird thing is that if I search with a lowercase i in intersteller, I get the desired record with the title Interstellar, but if I search with a capital I, i.e. if my query is
{
"query": {
"fuzzy": {
"title": {"value": "Intersteller", "fuzziness": 1}
}
}
}
then I don't get any docs back. I just want to understand what is happening behind the scenes.
The fuzzy query does not analyze the search text. In that respect it behaves like a term query.
In your case the "title" field is presumably using the standard analyzer, so "Interstellar" is indexed as "interstellar". The search value "intersteller" is one edit away from that term (e instead of a), so it matches with fuzziness 1. "Intersteller" is two edits away (capital I, and e instead of a), so it does not.
To learn more about the fuzzy query, refer to this Elasticsearch blog.
It is better to use a match query with the fuzziness parameter, because the match query analyzes the search text before matching:
{
"query": {
"match": {
"title": {
"query": "Intersteller",
"fuzziness": "auto"
}
}
}
}
If you want to use the fuzzy query, you need to raise the fuzziness to 2, the largest edit distance Elasticsearch accepts, for your document to match:
{
"query": {
"fuzzy": {
"title": {
"value": "Intersteller",
"fuzziness": 2
}
}
}
}
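If you want to check the edit distances yourself, here is a small Python sketch. It uses plain Levenshtein distance; as far as I know Elasticsearch also counts a transposition of adjacent characters as a single edit by default, but for these two strings that makes no difference. The indexed term "interstellar" assumes the standard analyzer lowercased the title.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


indexed_term = "interstellar"  # assumed result of analyzing "Interstellar"

print(levenshtein("intersteller", indexed_term))  # 1
print(levenshtein("Intersteller", indexed_term))  # 2
```

So the lowercase value is within fuzziness 1 of the indexed term, while the capitalized one needs an edit distance of 2.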

Query and exclude in ElasticSearch

I'm trying to use the match_phrase_prefix query together with an exclusion, so that it matches all terms except the ones to be excluded. I have it figured out as a basic URI query, but not as a regular JSON query. How do I convert this URI query into a JSON query?
"http://127.0.0.1:9200/topics/_search?q=name:"
+ QUERY + "* AND !name=" + CURRENT_TAGS
Where CURRENT_TAGS is a list of tags not to match with.
This is what I have so far:
{
"query": {
"bool": {
"must": {
"match_phrase_prefix": {
"name": "a"
}
},
"filter": {
"terms": {
"name": [
"apple"
]
}
}
}
}
}
However, when I do this apple is still included in the results. How do I exclude apple?
You are almost there. Use must_not, which is part of the bool query, to exclude the documents you don't want. Below is a working example based on your sample.
Index mapping
{
"mappings": {
"properties": {
"name": {
"type": "text"
}
}
}
}
Index two sample docs named apple and amazon (the world's biggest companies), both of which match your search criteria :)
Search query to exclude apple
{
"query": {
"bool": {
"must": {
"match_phrase_prefix": {
"name": "a"
}
},
"must_not": {
"match": {
"name": "apple"
}
}
}
}
}
Search results
"hits": [
{
"_index": "matchprase",
"_type": "_doc",
"_id": "2",
"_score": 0.6931471,
"_source": {
"name": "amazon"
}
}
]
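Since CURRENT_TAGS in the question is a list, the request body can be built programmatically. A minimal Python sketch, assuming the field is called name as above; build_query is a hypothetical helper, not part of any client library, and it only builds the JSON body without sending it anywhere.

```python
import json


def build_query(prefix, excluded_tags):
    """Build a bool query: the prefix must match, none of the excluded tags may match."""
    return {
        "query": {
            "bool": {
                "must": {"match_phrase_prefix": {"name": prefix}},
                # must_not accepts a list, one match clause per excluded tag
                "must_not": [{"match": {"name": tag}} for tag in excluded_tags],
            }
        }
    }


body = build_query("a", ["apple"])
print(json.dumps(body, indent=2))
```

The printed body is the same query as above, with must_not written as a one-element list.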

Returning documents that match multiple wildcard string queries

I'm new to Elasticsearch and would greatly appreciate help on this
In the query below I only want the first document to be returned, but instead both documents are returned. How can I write a query that searches for two wildcard strings on two separate fields, but only returns documents that match both?
I think what's being returned currently is score dependent, but I don't need the score.
POST /pr/_doc/1
{
"type": "Type ONE",
"currency":"USD"
}
POST /pr/_doc/2
{
"type": "Type TWO",
"currency":"USD"
}
GET /pr/_search
{
"query": {
"bool": {
"must": [
{
"simple_query_string": {
"query": "Type ON*",
"fields": ["type"],
"analyze_wildcard": true
}
},
{
"simple_query_string": {
"query": "US*",
"fields": ["currency"],
"analyze_wildcard":true
}
}
]
}
}
}
Use the query below, which is a query_string query with default_operator set to AND, so that every part of the query has to match.
Search query
{
"query": {
"query_string": {
"query": "(Type ON*) AND (US*)",
"fields" : ["type", "currency"],
"default_operator" : "AND"
}
}
}
Index your sample docs and it returns your expected doc only:
"hits": [
{
"_index": "multiplequery",
"_type": "_doc",
"_id": "1",
"_score": 2.1823215,
"_source": {
"type": "Type ONE",
"currency": "USD"
}
}
]
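As far as I understand it, the original query returns both documents because simple_query_string combines the words of "Type ON*" with OR by default, and "Type" alone matches both docs. The Python sketch below only imitates that behaviour on analyzed tokens to show the difference between the two operators; tokens, clause_matches and matches are hypothetical helpers, not Elasticsearch APIs.

```python
import re


def tokens(text):
    """Rough stand-in for the standard analyzer: lowercase, split on non-word characters."""
    return [token for token in re.split(r"\W+", text.lower()) if token]


def clause_matches(clause, field_tokens):
    """A trailing * is treated as a prefix match, anything else as an exact token match."""
    clause = clause.lower()
    if clause.endswith("*"):
        return any(token.startswith(clause[:-1]) for token in field_tokens)
    return clause in field_tokens


def matches(query, text, operator):
    results = [clause_matches(clause, tokens(text)) for clause in query.split()]
    return all(results) if operator == "AND" else any(results)


docs = {1: "Type ONE", 2: "Type TWO"}

or_hits = [doc_id for doc_id, text in docs.items() if matches("Type ON*", text, "OR")]
and_hits = [doc_id for doc_id, text in docs.items() if matches("Type ON*", text, "AND")]

print(or_hits)   # [1, 2]
print(and_hits)  # [1]
```

With OR, doc 2 gets in through the word "Type"; with AND, only doc 1 has a token starting with "on".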

Elasticsearch: How to highlight any field in document which contain searched string?

I am trying to build a full-text search query over thousands of documents with a dynamic structure.
But highlighting works only for specifically named fields.
If I search over _all or _source, no highlighted result is shown.
I have already tried many variations and tried googling, but with no success.
Basic query:
POST tracking*/_search
{
"query": {
"query_string": {
"query": "ci1483967534008.6100622#czcholsint372_te"
}
},
"highlight": {
"require_field_match": false
}
}
will return:
"hits": {
"total": 8,
"max_score": 13.482396,
"hits": [
{
"_index": "tracking-2017.01.09",
"_type": "cyclone",
"_id": "Cyclone1-UAT-ci1483967534008.6100622#czcholsint372_te-Messaging.Message.MessageUnpackaged.Request",
"_score": 13.482396,
"_source": {
... truncated ...
"received": "2017-01-09T13:12:14.008Z",
"tags": [],
"#timestamp": "2017-01-09T13:12:14.008Z",
"size": "3169",
"pairing": " ci1483967534008.6100622#czcholsint372_te <60a93b9-159835b287e-159835b79041a66cd1#ip.net> ErpExJets_RDC1_ProcessPurchaseOrder_9.4.1_20170109131207169 ErpExJets_RDC1_ProcessPurchaseOrder_9.4.1_20170109131207169",
}
},
but there is no highlight, even though the searched string is in the pairing field.
Is it possible at all?
Thanks
Reddy
The Elastic documentation has this note: "in order to perform highlighting, the actual content of the field is required. If the field in question is stored (has store set to true in the mapping) it will be used, otherwise, the actual _source will be loaded and the relevant field will be extracted from it."
So unless you have _all enabled, use the following query, which names the field to highlight.
{
"query": {
"query_string": {
"query": "ci1483967534008.6100622#czcholsint372_te"
}
},
"highlight": {
"require_field_match": false,
"fields": {
"pairing": {}
}
}
}
If you have _all enabled for the source document, use the following
{
"query": {
"query_string": {
"query": "ci1483967534008.6100622#czcholsint372_te"
}
},
"highlight": {
"require_field_match": false,
"fields": {
"_all": {}
}
}
}
Hope this helps.

Elastic Search in a complex document

I have the document shown below. I would like to search it, but I don't know how. Please help. How can I search complex documents like this in Elasticsearch?
My Document
{
"_index": "vehicles",
"_type": "car",
"_id": "e16bd474-fa8e-4858-ab6c-3bbb3d0aa603",
"_version": 1,
"found": true,
"_source": {
"Type": {
"Name": "Mustang"
}
}
}
My Search Query
GET _search
{
"query":{
"filtered": {
"filter": {
"term": {
"Name": "Mustang"
}
}
}
},
"from":0,
"size":10
}
The Standard Analyzer is being applied to your Name field, so the term Mustang is being stored in the index as mustang. Change your query to use "Name": "mustang" and you should get a match.
If you only want the doc with "Name" : "Mustang" you can use
"query" : {
"bool" : {
"must" : {
"term" : {
"Name" : "Mustang"
}
}
}
}
There are two issues:
1. You are using a term filter, which looks for the token Mustang in the index. However, the standard analyzer is being used, so the value is actually indexed as mustang.
2. You are searching in the wrong field. You should use nested notation, e.g. Type.Name
This query should work as expected:
{"query":{ "filtered": { "filter": {
"term": { "Type.Name": "mustang" }
}}}}
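Both issues can be illustrated with a small Python sketch. It is only a toy model of the index: flatten is a hypothetical helper that turns inner objects into dotted field paths, and lowercasing plus whitespace splitting stands in for the standard analyzer.

```python
def flatten(obj, prefix=""):
    """Turn nested objects into dotted field paths, e.g. {"Type": {"Name": ...}} -> "Type.Name"."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


source = {"Type": {"Name": "Mustang"}}

# Toy inverted index: field path -> set of lowercased tokens.
index = {path: set(str(value).lower().split()) for path, value in flatten(source).items()}


def term_matches(field, value):
    """Like a term filter: the value is compared to the indexed tokens without analysis."""
    return value in index.get(field, set())


print(term_matches("Name", "Mustang"))       # False: no top-level field called Name
print(term_matches("Type.Name", "Mustang"))  # False: the indexed token is lowercase
print(term_matches("Type.Name", "mustang"))  # True
```

Only the combination of the full field path and the lowercased term finds the document.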
