Elasticsearch - startswith filter?

I am trying to get simple startswith functionality in Elasticsearch. For example, I want the query "char" to match "charlotte", but I don't want it to match "dacharlotte". Using an edgeNgram filter gave me the latter result. I only want it to match results that START with the query terms, not results that merely contain them.

The simplest way to do what you want would be to use the prefix query:
{
"query": {
"prefix":{ "name" : "char" }
}
}
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-prefix-query.html
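To make the distinction concrete, here is a minimal Python sketch. It builds the same request body as above and mimics prefix matching locally with `str.startswith`. The helper names `build_prefix_query` and `term_has_prefix` are hypothetical, and the local check is only a stand-in for what Elasticsearch does against indexed terms.

```python
import json

def build_prefix_query(field, prefix):
    """Return a request body for a prefix query on `field` (hypothetical helper)."""
    return {"query": {"prefix": {field: prefix}}}

def term_has_prefix(term, prefix):
    """Local stand-in for prefix matching against a single indexed term."""
    return term.startswith(prefix)

body = build_prefix_query("name", "char")
print(json.dumps(body))
print(term_has_prefix("charlotte", "char"))    # True
print(term_has_prefix("dacharlotte", "char"))  # False
```

Because prefix is a term-level query, the value is compared against indexed terms, so on a field indexed with a lowercasing analyzer a lowercase prefix is probably the safer choice.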

Related

I'm having a problem with elasticsearch, how do I query for these conditions

I'm having a problem with elasticsearch, how do I query for these conditions
beginsWith
endsWith
contains
You can use the wildcard query for conditions like these. A wildcard operator is a placeholder that matches one or more characters; for example, the * wildcard operator matches zero or more characters. You can combine wildcard operators with other characters to create a wildcard pattern.
In your case, a wildcard query can check whether a string begins with, ends with, or contains 'od':
beginsWith : od*
endsWith: *od
contains: *od*
REST API call example that matches all terms containing "od":
GET /_search
{
"query": {
"wildcard": {
"text": {
"value": "*od*"
}
}
}
}
For more information, see the official Elasticsearch documentation:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-wildcard-query.html
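The three patterns can be sketched in Python as follows. The helpers `wildcard_pattern` and `build_wildcard_query` are hypothetical names, the sample terms are made up, and `fnmatchcase` is used only as a rough local approximation of how `*` behaves.

```python
from fnmatch import fnmatchcase

def wildcard_pattern(text, mode):
    """Build a wildcard pattern for one of: 'begins', 'ends', 'contains'."""
    patterns = {
        "begins": f"{text}*",
        "ends": f"*{text}",
        "contains": f"*{text}*",
    }
    return patterns[mode]

def build_wildcard_query(field, text, mode):
    """Return a request body for a wildcard query (hypothetical helper)."""
    return {"query": {"wildcard": {field: {"value": wildcard_pattern(text, mode)}}}}

# Local approximation only: fnmatchcase handles * and ? much like the wildcard
# query, but it also gives [ ] a special meaning, which the wildcard query does not.
terms = ["odd", "food", "model", "cat"]
results = {}
for mode in ("begins", "ends", "contains"):
    pattern = wildcard_pattern("od", mode)
    results[mode] = [t for t in terms if fnmatchcase(t, pattern)]
    print(mode, pattern, results[mode])
```

With these made-up terms, only "odd" begins with "od", only "food" ends with it, and "odd", "food" and "model" contain it.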

Elasticsearch: What is the difference between a match and a term in a filter?

I was following an ES tutorial, and at some point I wrote a query using term in the filter instead of the recommended solution using match. My understanding is that match is used in the query part to get scoring, while term is used in the filter part just to remove hits before entering the query part. To my surprise, match also works in the filter part.
What is the difference between:
GET blogs/_search
{
"query": {
"bool": {
"filter": {
"match": {
"category.keyword": "News"
}
}
}
}
}
and:
GET blogs/_search
{
"query": {
"bool": {
"filter": {
"term": {
"category.keyword": "News"
}
}
}
}
}
Both return the same hits, and the score is 0 for all hits.
What is the behaviour of match in a filter clause? I would expect it to yield some score, but it does not.
What I thought:
term : does not analyze either the parameter or the field, and it is a yes/no scenario.
match : analyzes parameter and field and calculates a score of how good they match.
But when using match against a keyword in the filter part of the query, how does it behave?
The match query is a high-level query that resorts to using a term query if it needs to.
Scoring has nothing to do with using match instead of term. Scoring kicks in when you use bool/must/should instead of bool/filter.
Here is how the match query works:
First, it checks the type of the field.
If it's a text field then the value will be analyzed, either with the analyzer specified in the query (if any), or with the search- or index-time analyzer specified in the mapping.
If it's a keyword field (like in your case), then the input is not analyzed and is taken "as is".
Since you're using the match query on a keyword field and your input is a single term, nothing is analyzed and the match query resorts to using a term query underneath. This is why you're seeing the same results.
In general, it's always best to use a match query as it is smart enough to know what to do given the field you're querying and the input data you're searching for.
You can read more about the difference between the two here.
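The decision described above can be sketched roughly in Python. This is a simplified model, not Elasticsearch's implementation: `analyze` is a crude stand-in for a real analyzer (lowercase, then split on whitespace), and `match_terms` and `bool_filter` are hypothetical helper names.

```python
def analyze(text):
    """Very rough stand-in for a text analyzer: lowercase and split on whitespace."""
    return text.lower().split()

def match_terms(field_type, text):
    """Return the terms a match query would look up, under the simplified model above."""
    if field_type == "keyword":
        return [text]      # keyword input is taken as is
    return analyze(text)   # text input is analyzed first

def bool_filter(clause):
    """Wrap a clause in bool/filter, i.e. filter context, which does not add to the score."""
    return {"query": {"bool": {"filter": clause}}}

print(match_terms("keyword", "News"))        # ['News']
print(match_terms("text", "Breaking News"))  # ['breaking', 'news']
print(bool_filter({"term": {"category.keyword": "News"}}))
```

Under this model a match on a keyword field with a single input value looks up exactly one unmodified term, which is the same lookup a term query performs.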

Elastic Search - Conditional field query if no match found for another field

Is it possible to do a conditional field query if no match was found for another field?
For example, if I have three fields in the index, local_rating, global_rating and default_rating, I need to check local_rating first; if there is no match, try global_rating, and finally default_rating.
Is this possible with one query, or is there another way to achieve this?
Thanks in advance.
I'm not sure whether Elasticsearch has an existing feature that fulfills this requirement, but you can try per-field boosting: individual fields can be boosted with the caret (^) notation. I also don't know whether boosting works with numeric values.
GET /_search
{
"query": {
"multi_match" : {
"query" : 10,
"fields" : [ "local_rating^6", "global_rating^3","default_rating"]
}
}
}
See: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-multi-match-query.html#field-boost
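If the field priority needs to be generated programmatically, the boosted field list can be built with a small helper. This is only a sketch: `boosted_fields` and `build_multi_match` are hypothetical names, and the boost values are the ones from the example above.

```python
def boosted_fields(fields_with_boosts):
    """Turn (field, boost) pairs into multi_match field strings such as 'name^6'."""
    result = []
    for field, boost in fields_with_boosts:
        # A boost of 1 is the default, so the caret suffix is omitted.
        result.append(field if boost == 1 else f"{field}^{boost}")
    return result

def build_multi_match(query, fields_with_boosts):
    """Return a multi_match request body (hypothetical helper)."""
    return {
        "query": {
            "multi_match": {
                "query": query,
                "fields": boosted_fields(fields_with_boosts),
            }
        }
    }

priority = [("local_rating", 6), ("global_rating", 3), ("default_rating", 1)]
body = build_multi_match(10, priority)
print(body["query"]["multi_match"]["fields"])
```

Note that boosting only influences ranking. All three fields are still searched, so this approximates "prefer local_rating" rather than implementing a strict fallback.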

Search text in elastic search ignoring uppercase and lowercase alphabet

First of all, I am new to Elasticsearch. I have a field skillName: "Android Sdk", which I map as keyword. The problem is that when I search with something like
POST _search
{
"query": {
"match" : { "skillName" : "Android sdk" }
}
}
where sdk is lowercase in the search query, I don't get any results. How can I search ignoring upper and lower case when the field is mapped as keyword?
Yes, the search is case-sensitive here: the field uses the keyword analyzer, which does nothing to the token and preserves it as is. In your case it will match only if you query the exact same token.
So I would propose changing this behaviour and at least applying a lowercase token filter, so that you can match terms regardless of case.
To search case-insensitively on a keyword field you need to use a normalizer, which was introduced in 5.2.0. See here for an example.
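A minimal sketch of the normalizer approach, written as a Python dict that could be sent as the index creation body. The normalizer name `lowercase_normalizer` and the helper names are assumptions, and the mapping layout assumes a typeless (7.x-style) index; older versions nest `properties` under a type name.

```python
def keyword_with_lowercase_normalizer(field, normalizer_name="lowercase_normalizer"):
    """Index body with a keyword field that is lowercased at index and search time."""
    return {
        "settings": {
            "analysis": {
                "normalizer": {
                    normalizer_name: {"type": "custom", "filter": ["lowercase"]}
                }
            }
        },
        "mappings": {
            "properties": {
                field: {"type": "keyword", "normalizer": normalizer_name}
            }
        },
    }

def normalize(value):
    """Local stand-in for what a lowercase normalizer does to a keyword value."""
    return value.lower()

index_body = keyword_with_lowercase_normalizer("skillName")
print(index_body["mappings"]["properties"]["skillName"])
print(normalize("Android Sdk") == normalize("Android sdk"))  # True
```

With such a mapping, the stored value and the query input should both be reduced to the same lowercase token, so a match on "Android sdk" can find "Android Sdk". Existing documents would need to be re-indexed for the new mapping to apply.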
You can apply different analyzers to the same "field" and have one for full-text search and another for sorting and aggregations.
Try the following:
{
"query": {
"query_string": {
"fields": [
"skillName"
],
"query": "Android sdk"
}
}
}

Is it possible to chain fquery filters in elastic search with exact matches?

I have been having trouble writing a method that will take in various search parameters in elasticsearch. I was working with queries that looked like this:
body:
{query:
{filtered:
{filter:
{and:
[
{term: {some_term: "foo"}},
{term: {is_visible: true}},
{term: {"term_two": "something"}}]
}
}
}
}
Using this syntax I thought I could chain these terms together and programmatically generate these queries. I was using simple strings, and if there was a term like "person_name" I could split the query into two and say "where person_name match 'JOHN'" and "where person_name match 'SMITH'", getting accurate results.
However, I just came across the "fquery" upon asking this question:
Escaping slash in elasticsearch
I was not able to use this "and"/"term" filter to search for a value with slashes in it, so I learned that I can use fquery to search for the full value, like this:
"fquery": {
"query": {
"match": {
"by_line": "John Smith"
}
}
}
But how can I search like this for multiple items? It seems that when I combine fquery with my filtered/filter/and/term queries, my "and" term queries are ignored. What is the best practice for making nested / chained queries using Elasticsearch?
As in the comment below, yes I can just add fquery to the "and" block like so
{:filtered=>
{:filter=>
{:and=>[
{:term=>{:is_visible=>true}},
{:term=>{:is_private=>false}},
{:fquery=>
{:query=>{:match=>{:sub_location=>"New JErsey"}}}}]}}}
Why would elasticsearch also return results with "sub_location" = "new York"? I would like to only return "new jersey" here.
A match query analyzes the input and by default it is a boolean OR query if there are multiple terms after the analysis. In your case, "New JErsey" gets analyzed into the terms "new" and "jersey". The match query that you are using will search for documents in which the indexed value of field "sub_location" is either "new" or "jersey". That is why your query also matches documents where the value of field "sub_location" is "new York" because of the common term "new".
To only match for "new jersey", you can use the following version of the match query:
{
"query": {
"match": {
"sub_location": {
"query": "New JErsey",
"operator": "and"
}
}
}
}
This will not match documents where the value of field "sub_location" is "New York". But it will match documents where the value of field "sub_location" is, say, "Jersey New", because the query finally translates into a boolean query like "new" AND "jersey", which ignores term order. If you are fine with this behaviour, well and good; otherwise read further.
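The difference between the default OR behaviour and "operator": "and" can be simulated locally. This is a simplified model under stated assumptions: `analyze` only lowercases and splits on whitespace, and `matches` is a hypothetical helper, not how Elasticsearch evaluates the query.

```python
def analyze(text):
    """Crude stand-in for the default analyzer: lowercase and split on whitespace."""
    return text.lower().split()

def matches(doc_value, query, operator="or"):
    """Check whether the analyzed query terms occur in the analyzed document value."""
    doc_terms = set(analyze(doc_value))
    hits = [term in doc_terms for term in analyze(query)]
    return all(hits) if operator == "and" else any(hits)

for value in ["New Jersey", "new York", "Jersey New"]:
    print(value, matches(value, "New JErsey", "or"), matches(value, "New JErsey", "and"))
```

In this model "new York" matches only with OR (it shares the term "new"), while "Jersey New" matches with both operators because term order is not taken into account.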
All these issues arise because you are using the default analyzer for the field "sub_location", which breaks the text into tokens at word boundaries and indexes them. If you really do not care about partial matches and always want to match the entire string, you can define a custom analyzer that uses the Keyword Tokenizer and the Lowercase Token Filter. Mind you, going ahead with this approach will require you to re-index all your documents.
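A sketch of such a custom analyzer, expressed as a Python dict for the index creation body. The analyzer name `exact_lowercase` and the helper names are assumptions, and the mapping assumes a modern typeless index with a `text` field; on the older version used in the question the layout and field type would differ.

```python
def exact_lowercase_index(field, analyzer_name="exact_lowercase"):
    """Index body whose field is indexed as one lowercased token per value."""
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    analyzer_name: {
                        "type": "custom",
                        "tokenizer": "keyword",   # whole input becomes a single token
                        "filter": ["lowercase"],  # then the token is lowercased
                    }
                }
            }
        },
        "mappings": {
            "properties": {field: {"type": "text", "analyzer": analyzer_name}}
        },
    }

def analyze_exact(text):
    """Local stand-in for the analyzer above: one lowercased token, no splitting."""
    return [text.lower()]

index_body = exact_lowercase_index("sub_location")
print(analyze_exact("New JErsey"))                                # ['new jersey']
print(analyze_exact("New JErsey") == analyze_exact("new York"))   # False
```

Since each value is kept as a single token, "New JErsey" and "new jersey" end up identical, while "new York" and "Jersey New" no longer share any token with them.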
