Elasticsearch exact matches when query text is a substring - elasticsearch

I have data in Elasticsearch with a visited_domain field:
PUT /logs/visited_domains/1
{
"visited_domain":"microsoft.com"
}
PUT /logs/visited_domains/2
{
"visited_domain":"not-microsoft.com"
}
The mapping is:
{
"properties": {
"visited_domain": {
"type": "string",
"index": "not_analyzed"
}
}
}
When I run this Elasticsearch query:
{
"query": {
"filtered": {
"filter": {
"term": {
"visited_domain": "microsoft.com"
}
}
}
}
}
I get both results, but I only want the exact match. Any ideas on how to alter the query or improve the mapping?
EDIT: I changed one of my examples from notmicrosoft.com to not-microsoft.com because the dash is causing a lot of the trouble. When searching for microsoft.com, notmicrosoft.com is not returned, but not-microsoft.com is.

Use query_string, which gives an exact match when the query text is wrapped in quotes:
"query": {
"query_string": {
"default_field": "visited_domain",
"query": "\"microsoft.com\""
}
}
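A possible explanation for the behaviour described in the question's EDIT, offered as an assumption rather than a confirmed diagnosis: if the not_analyzed mapping was not actually in effect, the field would have been analyzed, and a hyphen acts as a token boundary while a dot between letters does not. The Python sketch below only approximates that tokenization with a regular expression; approx_standard_tokens and term_matches are hypothetical helpers, not Elasticsearch APIs.

```python
import re

def approx_standard_tokens(text):
    # Rough stand-in for an analyzed string field: lowercase the text, split on
    # hyphens and other punctuation, but keep dots that sit between
    # alphanumeric runs. This is an approximation, not the real tokenizer.
    return re.findall(r"[a-z0-9]+(?:\.[a-z0-9]+)*", text.lower())

def term_matches(field_value, term, analyzed):
    # A term lookup compares the query term against the indexed tokens as-is.
    # With not_analyzed, the only indexed token is the whole field value.
    tokens = approx_standard_tokens(field_value) if analyzed else [field_value]
    return term in tokens

for domain in ["microsoft.com", "not-microsoft.com", "notmicrosoft.com"]:
    print(domain,
          "analyzed:", term_matches(domain, "microsoft.com", analyzed=True),
          "not_analyzed:", term_matches(domain, "microsoft.com", analyzed=False))
```

Under this approximation, not-microsoft.com only matches when the field is analyzed, which is consistent with what the question reports. Checking the mapping that the index actually uses would confirm or rule this out.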

Related

Elasticsearch nested object query_string

I have a question about the query_string query in Elasticsearch. I want to create a full-text search over all types and fields in an index. Is query_string performed against nested objects? For example, I have this mapping:
{
"my_index": {
"mappings": {
"my_type": {
"properties": {
"group": {
"type": "string"
},
"user": {
"type": "nested",
"properties": {
"first": {
"type": "string"
},
"last": {
"type": "string"
}
}
}
}
}
}
}
}
And the query
GET /my_index/_search
{
"query": {
"query_string" : {
"query" : "paul"
}
}
}
So when I run the query, will ES search across all fields including the nested ones, or only in the my_type object, so that I have to use a nested query to search the nested fields?
You cannot reference nested fields from a query_string at the root, i.e. this won't work:
{
"query": {
"query_string": {
"query": "myNestedObj.myTextField:food"
}
}
}
To search in specific nested fields, you must use the nested clause:
{
"query": {
"nested": {
"path": "myNestedObj",
"query": {
"query_string": {
"query": "myNestedObj.myTextField:food"
}
}
}
}
}
However, I've found that the pseudo-field "_all" does include nested fields, so this query would find documents containing 'food' in myNestedObj.myTextField (as well as anywhere else)
{
"query": {
"query_string": {
"query": "_all:food"
}
}
}
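Applied to the mapping from the question, the nested clause described above could be built as in the following sketch. The helper name nested_query_string is hypothetical, and the path "user" and field user.first are taken from the question's mapping; the resulting body is what would be sent to GET /my_index/_search.

```python
import json

def nested_query_string(path, query):
    # Wrap a query_string query in a nested clause for the given path.
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {"query_string": {"query": query}},
            }
        }
    }

# Search the nested "user" objects for a first name of "paul".
body = nested_query_string("user", "user.first:paul")
print(json.dumps(body, indent=2))
```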
Try:
GET my_index/_search?q=paul

Exact match in elastic search query

I want to exactly match the string ":Feed:" in a message field and pull all such records from the past day. The JSON I have also seems to match the plain word " feed ". I am not sure where I am going wrong. Do I need to add "constant_score" to this query JSON? The JSON I currently have is shown below:
{
"query": {
"bool": {
"must": {
"query_string": {
"fields": ["message"],
"query": "\\:Feed\\:"
}
},
"must": {
"range": {
"timestamp": {
"gte": "now-1d",
"lte": "now"
}
}
}
}
}
}
As stated in Finding Exact Values, the field was analyzed when it was indexed, so you have no way of exact-matching characters such as ":" that the analyzer dropped. Whenever the exact value should be searchable, the mapping should be "not_analyzed" and the data needs to be re-indexed.
If you want to easily match only ":feed:" inside the message field, you might want to customize an analyzer that doesn't split on ":", so that you can query the field with a simple "match" query instead of wildcards.
I was not able to do this with query_string, but managed it by creating a custom normalizer and then using a "match" or "term" query.
The following steps worked for me.
Create a custom normalizer (available since version 5.2):
"settings": {
"analysis": {
"normalizer": {
"my_normalizer": {
"type": "custom",
"filter": ["lowercase"]
}
}
}
}
Create a mapping with a sub-field of type "keyword" that uses the normalizer:
{
"mappings": {
"default": {
"properties": {
"title": {
"type": "text",
"fields": {
"normalize": {
"type": "keyword",
"normalizer": "my_normalizer"
},
"keyword" : {
"type": "keyword"
}
}
}
}
}
}
}
Use a match or term query:
{
"query": {
"bool": {
"must": [
{
"match": {
"title.normalize": "string to match"
}
}
]
}
}
}
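To illustrate why the steps above give a case-insensitive exact match, here is a minimal Python sketch of what a keyword field with a lowercase normalizer does. It is a simulation, not Elasticsearch code: lowercase_normalizer and keyword_matches are hypothetical names, and the sketch assumes the normalizer is applied to the query value as well as to the stored value, which is how match and term queries on such a field behave in recent versions.

```python
def lowercase_normalizer(value):
    # Stand-in for my_normalizer: a keyword field keeps the whole value as a
    # single token, and the "lowercase" filter lowercases that token.
    return value.lower()

def keyword_matches(stored_value, query_value):
    # The normalizer is applied to the stored value at index time and,
    # under the assumption stated above, to the query value at search time.
    return lowercase_normalizer(stored_value) == lowercase_normalizer(query_value)

print(keyword_matches("String To Match", "string to match"))
print(keyword_matches("String To Match Exactly", "string to match"))
```

The first comparison succeeds because only the case differs. The second fails because the whole value is compared as one token, so a longer stored string is not a match.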
Use match phrase
GET /_search
{
"query": {
"match_phrase": {
"message": "7000-8900"
}
}
}
In Java, use matchPhraseQuery from QueryBuilders:
QueryBuilders.matchPhraseQuery(fieldName, searchText);
A simple solution: use a term query.
GET /_search
{
"query": {
"term": {
"message.keyword": "7000-8900"
}
}
}
Use a term query instead of match_phrase. match_phrase matches the words of the sentence stored in the document, so it is not an exact match on the whole value; it only requires those words to appear as a phrase.
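The difference between the two answers can be sketched in Python. This is a simplified simulation under stated assumptions: approx_tokens only approximates text analysis (lowercase, split on anything that is not a letter or digit), and the function names are hypothetical.

```python
import re

def approx_tokens(text):
    # Approximation of analysis on a text field: lowercase and split on
    # anything that is not a letter or digit, so "7000-8900" becomes two tokens.
    return re.findall(r"[a-z0-9]+", text.lower())

def match_phrase(stored, phrase):
    # True if the phrase tokens appear consecutively, in order, in the stored text.
    doc, wanted = approx_tokens(stored), approx_tokens(phrase)
    return any(doc[i:i + len(wanted)] == wanted
               for i in range(len(doc) - len(wanted) + 1))

def term_on_keyword(stored, value):
    # A term query on the .keyword sub-field compares the whole, unanalyzed value.
    return stored == value

stored = "ports 7000-8900 open"  # hypothetical message value
print(match_phrase(stored, "7000-8900"), term_on_keyword(stored, "7000-8900"))
```

Here match_phrase finds the phrase inside the longer message, while the term comparison on the keyword value only succeeds when the stored value is exactly "7000-8900".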

Find documents with boolean value on elasticsearch

I am new to Elasticsearch. I have a mapping which has a boolean field:
{
...
"bool_field": {
"type": "boolean"
},
...
}
How can I find documents by a boolean value without specifying the name of the field?
I tried the following, but without result:
{
"query": {
"match_all": {}
},
"filter": {
"query": {
"query_string": {
"query": "true"
}
}
}
}
Thanks!
It's not that easy by default, because boolean fields are not included in the _all field (include_in_all), and _all is what query_string searches by default. This explains why your query doesn't work.
What you can do, though, is to use copy_to to create your own custom _all field and use that in the query_string.
Something like this:
"bool_field": {
"type": "boolean",
"copy_to": "_all_booleans"
}
And then
"query_string": {
"default_field": "_all_booleans",
"query": "true"
}
or
"query_string": {
"query": "_all_booleans:true"
}
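If a mapping has many boolean fields, adding copy_to by hand gets tedious. The sketch below is one way to add it programmatically before creating the index. The helper add_copy_to_booleans and the "title" field are hypothetical, and the function only handles top-level fields, not nested objects.

```python
import copy

def add_copy_to_booleans(properties, target="_all_booleans"):
    # Return a copy of a mapping's "properties" dict in which every boolean
    # field also copies its value into the target field.
    updated = copy.deepcopy(properties)
    for field in updated.values():
        if field.get("type") == "boolean":
            field["copy_to"] = target
    return updated

# "title" is a made-up non-boolean field, included to show it is left alone.
mapping = {"bool_field": {"type": "boolean"}, "title": {"type": "string"}}
print(add_copy_to_booleans(mapping))
```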

Elasticsearch: how to disable scoring on a field?

I am new to Elasticsearch and please forgive me if the answer is obvious.
Here is what I have for the mapping of the field in question:
"condition" : { "type" : "string", "store" : "no", "index": "not_analyzed", "omit_norms" : "true" }
I need search on this field, but I need 100% string match (no stemming, etc.) on a sub-string (blank separated). An example of this field in a document is as follows:
{
"condition": "abc xyz"
}
An example query is:
/_search?q=condition:xyz
Is the above mapping correct? I also used omit_norms (true). Is this a correct thing to do in my case?
How can I disable scoring on this field? Can I do it in mapping? What is the best way of doing it? (Actually I need to disable scoring on more than one. I do have fields that need scoring)
Thanks and regards!
With omit_norms set to true, the length of the field is not taken into account for scoring, and Elasticsearch won't index the norms information. So if you don't need scoring on this field, that is a good thing to do, as it will also save you some disk space.
If you're not interested in scoring in your queries use a filtered query:
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"bool": {
"must": {
"term": {
"condition": "abc xyz"
}
}
}
}
}
}
}
The new syntax for a filtered query is now:
{
"query": {
"bool": {
"must": {
"match_all": {}
},
"filter": {
"term": {
"condition": "abc"
}
}
}
}
}
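For code that still produces the old filtered form, the rewrite into the bool form shown above is mechanical. A minimal Python sketch, with filtered_to_bool as a hypothetical helper name, could look like this:

```python
def filtered_to_bool(body):
    # Rewrite {"query": {"filtered": {...}}} into the bool form:
    # the inner "query" becomes "must" and the "filter" stays a "filter".
    filtered = body["query"]["filtered"]
    bool_query = {}
    if "query" in filtered:
        bool_query["must"] = filtered["query"]
    if "filter" in filtered:
        bool_query["filter"] = filtered["filter"]
    return {"query": {"bool": bool_query}}

legacy = {
    "query": {
        "filtered": {
            "query": {"match_all": {}},
            "filter": {"term": {"condition": "abc"}},
        }
    }
}
print(filtered_to_bool(legacy))
```

Because the clause stays under "filter", it still does not contribute to the score, which is what the question asks for.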

Couchbase - Elasticsearch search issue

I've followed the Couchbase - ElasticSearch tutorial integration and I'm testing it with the beer-sample bucket.
I have an issue.
I can do a query like:
{
"query": {
"match": {
"doc.name": "IPA"
}
}
}
but if I search like that:
{
"query": {
"filtered": {
"query": {
"match_all": { }
},
"filter": {
"term": { "doc.name": "IPA" }
}
}
}
}
I don't get any results.
With other string fields I don't have this problem, for example with "type" : "beer":
{
"query": {
"match": {
"doc.type": "beer"
}
}
}
{
"query": {
"filtered": {
"query": {
"match_all": { }
},
"filter": {
"term": { "doc.type": "beer" }
}
}
}
}
I don't know why.
Thanks in advance
It is because of your analyzer. For strings, the default analyzer lowercases the input, so IPA is indexed as ipa.
A term filter does not analyze your input. You search for IPA, but the index contains ipa; since IPA != ipa, the document does not match.
The match query, on the other hand, analyzes your input using the analyzer that was set for the field, so your input is lowercased and you actually search for ipa.
I hope it makes sense.
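The explanation above can be reproduced with a small Python simulation. It is a sketch under simplifying assumptions: analyze only lowercases and splits on whitespace, the function names are hypothetical, and "Hoppy IPA" is a made-up beer name.

```python
def analyze(text):
    # Simplified stand-in for the default analyzer: lowercase, split on whitespace.
    return text.lower().split()

def term_filter(indexed_tokens, value):
    # The term filter looks up the value exactly as given, without analysis.
    return value in indexed_tokens

def match_query(indexed_tokens, value):
    # The match query analyzes the input first, then looks up the resulting tokens.
    return any(token in indexed_tokens for token in analyze(value))

indexed = analyze("Hoppy IPA")  # stored in the index as ["hoppy", "ipa"]
print(term_filter(indexed, "IPA"), match_query(indexed, "IPA"), term_filter(indexed, "ipa"))
```

This also shows why "beer" works in both queries: it is already lowercase, so the unanalyzed term equals the indexed token.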
