Proximity-Relevance in elasticsearch - elasticsearch

I have an json record in the elastic search with fields
"streetName": "5 Street",
"name": ["Shivam Apartments"]
I tried the below query but it does not return anything if I add streetName bool in the query
{
"query": {
"bool": {
"must": [
{
"bool": {
"must": {
"match": {
"name": {
"query": "shivam apartments",
"minimum_should_match": "80%"
}
}
}
}
},
{
"bool": {
"must": {
"match": {
"streetName": {
"query": "5 street",
"minimum_should_match": "80%"
}
}
}
}
}
]
}
}
}
Document Mapping
{
"rabc_documents": {
"mappings": {
"properties": {
"name": {
"type": "text",
"analyzer": "autocomplete_analyzer",
"position_increment_gap": 0
},
"streetName": {
"type": "keyword"
}
}
}
}
}

Based on the E.S Documentation (Keywords in Elastic Search)
"Keyword fields are only searchable by their exact value".
Along with that keywords are case sensitive as well.
Taking aforementioned into account:
Searching for "5 street" will not match "5 Street" ('s' vs 'S') on keyword field
minimum_should_match will not work on a keyword field.
Suggestion: For partial matches use "text" mapping instead of "keyword". Keywords are meant to be used for filtering, aggregation based on term, etc.

Related

Search-as-you-type inside arrays

I am trying to implement a search-as-you-type query inside an array.
This is the structure of the documents:
{
"guid": "6f954d53-df57-47e3-ae9e-cb445bd566d3",
"labels":
[
{
"name": "London",
"lang": "en"
},
{
"name": "Llundain",
"lang": "cy"
},
{
"name": "Lunnainn",
"lang": "gd"
}
]
}
and up to now this is what I came with:
{
"query": {
"multi_match": {
"fields": ["labels.name"],
"query": name,
"type": "phrase_prefix"
}
}
which works exactly as requested.
The problem is that I would like to search also by language.
What I tried is:
{
"query": {
"bool": {
"must": [
{
"multi_match": {
"fields": ["labels.name"],
"query": "london",
"type": "phrase_prefix"
}
},
{
"term": {
"labels.lang": "gd"
}
}
]
}
}
}
but these queries act on separate values of the array.
So, for example, I would like to search only Welsh language (cy). That means that my query that contains the city name should match only values that have "cy" on the "lang" tag.
How do I write this kind of query?
Internally, ElasticSearch flattens nested JSON objects, so it can't correlate the lang and name of a specific element in the labels array. If you want this kind of correlation, you'll need to index your documents differently.
The usual way to do this is to use the nested data type with a matching nested query.
The query would end up looking something like this:
{
"query": {
"nested": {
"path": "labels",
"query": {
"bool": {
"must": [
{
"multi_match": {
"fields": ["labels.name"],
"query": "london",
"type": "phrase_prefix"
}
},
{
"term": {
"labels.lang": "gd"
}
}
]
}
}
}
}
}
But note that you'll need to also specify nested mappings for your labels, e.g.:
"properties": {
"labels": {
"type": "nested",
"properties": {
"name": {
"type": "text"
/* you might want to add other mapping-related configuration here */
},
"lang": {
"type": "keyword"
}
}
}
}
Other ways to do this include:
Indexing each label as a separate document, repeating the guid field
Using parent/child documents
You should use Nested datatype in mapping instead of Object datatype. For detail explanation refer this:
https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html
So, you should define mapping of your field something like this:
{
"properties": {
"labels": {
"type": "nested",
"properties": {
"name": {
"type": "text"
},
"lang": {
"type": "keyword"
}
}
}
}
}
After this you could query using Nested Query as:
{
"query": {
"nested": {
"path": "labels",
"query": {
"bool": {
"must": [
{
"multi_match": {
"fields": ["labels.name"],
"query": "london",
"type": "phrase_prefix"
}
},
{
"term": {
"labels.lang": "gd"
}
}
]
}
}
}
}
}

ElasticSearch combine must-match with multi-match

I have been trying to combine MUST-MATCH with MULTI-MATCH but couldn't get it to work. Basically I want these MUST conditions:
"must": [{ "match": { "city": $city } },
{ "match": { "is_displayed": 1 } },
{ "match": { "status": "active" } }]
and I want these matches:
"multi_match": {
"query": $query,
"type": $selectedType,
"fields": fieldArray,
}
where $query is the textbox values $selectedType is one of the multi-match query types and fieldArray is the fields to search for. For example, when the text box value is "hello world" and fieldArray is ['title', 'cuisine'], either "hello" and/or "world" must match either or all of the specified fields. Any insight and advice is appreciated.
I guess adding another clause in must block will do the needful.
{
"query": {
"bool": {
"must": [
{
"match": {
"city": "$city"
}
},
{
"match": {
"is_displayed": 1
}
},
{
"match": {
"status": "active"
}
},
"query_string": {
"fields": fieldArray,
"query": "*$query*"
}
}
]
}
}
}

terms query does not support minimum_match in Elasticsearch 2.3.3

I want to search documents which match at least two key words using terms query like this:
{
"query": {
"terms": {
"title": ["java","编程","思想"],
"minimum_match": 2
}
},
"highlight": {
"fields": {
"title": {}
}
}
}
It returns "terms query does not support minimum_match".What's wrong with my query?
The correct name was minimum_should_match and that setting has been deprecated in ES 2.0.
What you can do instead is to use a bool/should query with three term queries and the minimum_should_match setting for bool/should queries:
{
"query": {
"bool": {
"minimum_should_match": 2,
"should": [
{
"term": {
"title": "java"
}
},
{
"term": {
"title": "编程"
}
},
{
"term": {
"title": "思想"
}
}
]
}
},
"highlight": {
"fields": {
"title": {}
}
}
}

exact match query in elasticsearch

I'm trying to run an exact match query in ES
in MYSQL my query would be:
SELECT * WHERE `content_state`='active' AND `author`='bob' AND `title` != 'Beer';
I looked at the ES docs here:
https://www.elastic.co/guide/en/elasticsearch/guide/current/_finding_exact_values.html
and came up with this:
{
"from" : '.$offset.', "size" : '.$limit.',
"filter": {
"and": [
{
"and": [
{
"term": {
"content_state": "active"
}
},
{
"term": {
"author": "bob"
}
},
{
"not": {
"filter": {
"term": {
"title": "Beer"
}
}
}
}
]
}
]
}
}
but my results are still coming back with the title = Beer, it doesn't seem to be excluding the titles that = Beer.
did I do something wrong?
I'm pretty new to ES
I figured it out, I used this instead...
{
"from" : '.$offset.', "size" : '.$limit.',
"query": {
"bool": {
"must": [
{
"query_string": {
"default_field": "content_state",
"query": "active"
}
},
{
"query_string": {
"default_field": "author",
"query": "bob"
}
}
],
"must_not": [
{
"query_string": {
"default_field": "title",
"query": "Beer"
}
}
]
}
}
}
Query String Query is a pretty good concept to handle various relationship between search criteria. Have a quick look into Query string query syntax to understand in detail about this concept
{
"query": {
"query_string": {
"query": "(content_state:active AND author:bob) AND NOT (title:Beer)"
}
}
}
Filters are supposed to work on exact values, if you had defined your mapping in a manner where title was a non-analyzed field, your previous attempt ( with filters) would have worked as well.
{
"mappings": {
"test": {
"_all": {
"enabled": false
},
"properties": {
"content_state": {
"type": "string"
},
"author": {
"type": "string"
},
"title": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}

Elastic Search : Match Query not working in Nested Bool Filters

I am able to get data for the following elastic search query :
{
"query": {
"filtered": {
"query": [],
"filter": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"term": {
"gender": "malE"
}
},
{
"term": {
"sentiment": "positive"
}
}
]
}
}
]
}
}
}
}
}
However, If I query using "match" - I get error message with 400 status response
{
"query": {
"filtered": {
"query": [],
"filter": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"match": {
"gender": "malE"
}
},
{
"term": {
"sentiment": "positive"
}
}
]
}
}
]
}
}
}
}
}
Is match query not supported in nested bool filters ?
Since the term query looks for the exact term in the field’s inverted index and I want to query gender data as case_insensitive field - Which approach shall I try ?
Settings of the index :
{
"settings": {
"index": {
"analysis": {
"analyzer": {
"analyzer_keyword": {
"tokenizer": "keyword",
"filter": "lowercase"
}
}
}
}
}
}
Mapping for field Gender:
{"type":"string","analyzer":"analyzer_keyword"}
The reason you're getting an error 400 is because there is no match filter, only match queries, even though there are both term queries and term filters.
Your query can be as simple as this, i.e. no need for a filtered query, simply put your term and match queries into a bool/should:
{
"query": {
"bool": {
"should": [
{
"match": {
"gender": "male"
}
},
{
"term": {
"sentiment": "positive"
}
}
]
}
}
}
This answer is for ElasticSearch 7.x. As I understand from the question, you would like to use a match query for the gender field and a term query for the sentiment field. The mappings for each of these field should look like below:
"sentiment": {
"type": "keyword"
},
"gender": {
"type": "text"
}
The corresponding search API would be:
"query": {
"bool": {
"must": [
{
"terms": {
"sentiment": [
"very positive", "positive"
]
}
},
{
"match": {
"gender": "malE"
}
}
]
}
}
This search API returns all the documents where gender is "Male"/"MALE"/"mALe" etc. So, you may have indexed the gender field holding "mALe", but, the match query for "gender": "malE" will still be able to retrieve it. In the latest version of ElasticSearch, if the query is a match type, the value (which is "gender": "malE") will be automatically lower cased internally before search begins. But, it should not be that tough for a client of the API to pass a lowercase to the match query at the onset itself. Coming to the sentiment field, since, its a keyword field, you can search for values that contain spaces too like very positive.

Resources