How to get Elasticsearch to return exact matches first and then other matches in the results - elasticsearch

I need help with Elasticsearch. I am trying to return exact matches first, followed by documents that match on only one field, but the query below does not do that. In other words, the highest-scoring hits should come first, and the less accurate hits that match on a single field should follow in the same result set.
The mapping is as follows:
{
"palsx1493": {
"mappings": {
"pals": {
"properties": {
"aboutme": {
"type": "string"
},
"dob": {
"type": "date",
"format": "date"
},
"fccode": {
"type": "string"
},
"fcname": {
"type": "string"
},
"learning": {
"type": "nested",
"properties": {
"skillslevel": {
"type": "string"
},
"skillsname": {
"type": "string"
}
}
},
"name": {
"type": "string"
},
"rating": {
"type": "string"
},
"teaching": {
"type": "nested",
"properties": {
"skillslevel": {
"type": "string"
},
"skillsname": {
"type": "string"
}
}
},
"trate": {
"type": "string"
},
"treg": {
"type": "string"
}
}
}
}
}
}
When searching, I need the results to list the exact matches first, followed by the lower-scoring documents matched on the teaching skillname, in that priority order. What happens now is that I get the exact matches first, then the documents matched on learning.skillname, and then those matched on teaching.skillname. I want the last two swapped, so that the teaching.skillname matches come right after the exact matches.
Exact match:
1. fcname: the "from country" name, which can be either a specific name or just "Any Country".
2. dob: date of birth; a range is given as input
3. teaching: skillname
4. learning: skillname
This is what I have tried with no luck:
{
"query": {
"bool": {
"should": [
{ "match": { "fcname": "spain"}},
{ "range": {
"dob": {
"from": "1950-10-10",
"to": "1967-12-12"
}
}
},
{
"nested": {
"path": "learning",
"score_mode": "max",
"query": {
"bool": {
"must": [
{ "match": { "learning.skillname": learningSkillName}}
]
}
}
}
},
{
"nested": {
"path": "teaching",
"query": {
"bool": {
"must": [
{ "match": { "teaching.skillname": teachingSkillName}}
]
}
}
}
}
]
}
}
}

Please look into how the field is indexed. By default a string field is analyzed for full-text search: the text is run through the analyzer and the resulting tokens are stored in the inverted index.
For an exact string match, set "index": "not_analyzed" on the field, e.g.:
"nick": {
"type": "string",
"index": "not_analyzed"
},
https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-core-types.html

I figured it out. The solution was to use function_score to add to the score of documents that match a particular field. Replacing the teaching nested clause above with the following gave me the correct result:
"nested": {
"path": "teaching",
"query": {
"function_score": {
"query": {
"bool": {
"must": [
{ "match": { "teaching.skillname": "xxx"}}
]
}
},
"functions": [
{
"script_score": {
"script": "_score + 2"
}
}],
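The snippet above is cut off in the original post. As a minimal sketch only, the Python helper below assembles a nested clause of the same shape as a plain dict. The helper name boosted_nested_match and its bonus parameter are hypothetical, nothing is sent to a cluster, and the script string simply mirrors the form shown in the answer (other Elasticsearch versions may expect a different script syntax).

```python
import json


def boosted_nested_match(path, field, value, bonus=2):
    """Build a nested clause whose matches get a score bonus via function_score.

    Hypothetical helper: it only assembles a dict shaped like the snippet above.
    """
    return {
        "nested": {
            "path": path,
            "query": {
                "function_score": {
                    "query": {
                        "bool": {
                            "must": [{"match": {f"{path}.{field}": value}}]
                        }
                    },
                    "functions": [
                        {"script_score": {"script": f"_score + {bonus}"}}
                    ],
                }
            },
        }
    }


clause = boosted_nested_match("teaching", "skillname", "xxx")
print(json.dumps(clause, indent=2))
```

The resulting clause would take the place of the teaching nested clause in the should list of the original query.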

Related

Search-as-you-type inside arrays

I am trying to implement a search-as-you-type query inside an array.
This is the structure of the documents:
{
"guid": "6f954d53-df57-47e3-ae9e-cb445bd566d3",
"labels":
[
{
"name": "London",
"lang": "en"
},
{
"name": "Llundain",
"lang": "cy"
},
{
"name": "Lunnainn",
"lang": "gd"
}
]
}
and this is what I have come up with so far:
{
"query": {
"multi_match": {
"fields": ["labels.name"],
"query": name,
"type": "phrase_prefix"
}
}
}
which works exactly as requested.
The problem is that I would also like to search by language.
What I tried is:
{
"query": {
"bool": {
"must": [
{
"multi_match": {
"fields": ["labels.name"],
"query": "london",
"type": "phrase_prefix"
}
},
{
"term": {
"labels.lang": "gd"
}
}
]
}
}
}
but the two clauses are matched against separate elements of the array.
For example, I would like to search only the Welsh labels (cy). That means the city name in my query should match only labels whose "lang" is "cy".
How do I write this kind of query?
Internally, ElasticSearch flattens nested JSON objects, so it can't correlate the lang and name of a specific element in the labels array. If you want this kind of correlation, you'll need to index your documents differently.
The usual way to do this is to use the nested data type with a matching nested query.
The query would end up looking something like this:
{
"query": {
"nested": {
"path": "labels",
"query": {
"bool": {
"must": [
{
"multi_match": {
"fields": ["labels.name"],
"query": "london",
"type": "phrase_prefix"
}
},
{
"term": {
"labels.lang": "gd"
}
}
]
}
}
}
}
}
But note that you'll need to also specify nested mappings for your labels, e.g.:
"properties": {
"labels": {
"type": "nested",
"properties": {
"name": {
"type": "text"
/* you might want to add other mapping-related configuration here */
},
"lang": {
"type": "keyword"
}
}
}
}
Other ways to do this include:
Indexing each label as a separate document, repeating the guid field
Using parent/child documents
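The flattening described in this answer can be illustrated without a cluster. The following is a rough Python simulation, not Elasticsearch code: flatten, object_match and nested_match are hypothetical names, and a lowercase prefix test stands in for the phrase_prefix match.

```python
def flatten(objects):
    """Mimic object-type indexing: one list of values per sub-field."""
    flat = {}
    for obj in objects:
        for key, value in obj.items():
            flat.setdefault(key, []).append(value)
    return flat


def object_match(objects, name_prefix, lang):
    """Object-style matching: each condition may hit a different element."""
    flat = flatten(objects)
    name_hit = any(n.lower().startswith(name_prefix) for n in flat.get("name", []))
    lang_hit = lang in flat.get("lang", [])
    return name_hit and lang_hit


def nested_match(objects, name_prefix, lang):
    """Nested-style matching: both conditions must hold on the same element."""
    return any(
        o["name"].lower().startswith(name_prefix) and o["lang"] == lang
        for o in objects
    )


labels = [
    {"name": "London", "lang": "en"},
    {"name": "Llundain", "lang": "cy"},
    {"name": "Lunnainn", "lang": "gd"},
]

print(object_match(labels, "london", "gd"))  # True: a false positive
print(nested_match(labels, "london", "gd"))  # False
print(nested_match(labels, "llun", "cy"))    # True
```

With the flattened form, "london" and "gd" are found in different labels and the document still matches; the per-element check only accepts a label that satisfies both conditions.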
You should use the nested datatype in the mapping instead of the object datatype. For a detailed explanation, refer to:
https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html
So, you should define the mapping of your field something like this:
{
"properties": {
"labels": {
"type": "nested",
"properties": {
"name": {
"type": "text"
},
"lang": {
"type": "keyword"
}
}
}
}
}
After this you can query using a nested query:
{
"query": {
"nested": {
"path": "labels",
"query": {
"bool": {
"must": [
{
"multi_match": {
"fields": ["labels.name"],
"query": "london",
"type": "phrase_prefix"
}
},
{
"term": {
"labels.lang": "gd"
}
}
]
}
}
}
}
}

Term query on nested fields returns no result in Elasticsearch

I have a nested type field in my mapping. When I use a term query on my nested field, Elasticsearch returns no results, whereas when I change the term query to a match query it works fine and returns the expected result.
Here is my mapping; assume it contains only this one nested field:
{
"homing.estatefiles": {
"mappings": {
"estatefile": {
"properties": {
"DynamicFields": {
"type": "nested",
"properties": {
"Name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"ValueBool": {
"type": "boolean"
},
"ValueDateTime": {
"type": "date"
},
"ValueInt": {
"type": "long"
}
}
}
}
}
}
}
}
And here is my term query (which returns no result)
{
"from": 50,
"size": 50,
"query": {
"bool": {
"filter": [
{
"nested": {
"query": {
"bool": {
"must": [
{
"term": {
"DynamicFields.Name":{"value":"HasParking"}
}
},
{
"term": {
"DynamicFields.ValueBool": {
"value": true
}
}
}
]
}
},
"path": "DynamicFields"
}
}
]
}
}
}
And here is my query which returns expected result (by changing Term query to Match query)
{
"from": 50,
"size": 50,
"query": {
"bool": {
"filter": [
{
"nested": {
"query": {
"bool": {
"must": [
{
"match": {
"DynamicFields.Name":"HasParking"
}
},
{
"term": {
"DynamicFields.ValueBool": {
"value": true
}
}
}
]
}
},
"path": "DynamicFields"
}
}
]
}
}
}
This happens because of the capital letters and the way Elasticsearch analyzes text fields.
A term query looks for the exact value you gave; the query value itself is not analyzed.
The indexed value, however, was run through the analyzer when the document was stored, because DynamicFields.Name is a text field.
In your case the analyzer turned HasParking into hasparking.
The term query then looks for HasParking among the indexed tokens, finds only hasparking, and fails. The documentation has a great explanation in the "Why doesn't the term query match my document" section. A match query, by contrast, runs your value through the same analyzer before searching, which is why it returns your result. Since your mapping also defines a DynamicFields.Name.keyword sub-field, a term query against that sub-field compares against the original, unanalyzed value.
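As a rough illustration of how term and match behave on an analyzed text field, here is a small Python simulation. It is not Elasticsearch code: the analyze function is a simplified stand-in for the standard analyzer (split into words, lowercase), and all names are hypothetical.

```python
import re


def analyze(text):
    """Rough stand-in for the standard analyzer: split into words, lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]


# Tokens stored for the analyzed text field at index time.
indexed_tokens = set(analyze("HasParking"))


def term_query(value):
    """term: the query value is looked up verbatim, with no analysis."""
    return value in indexed_tokens


def match_query(value):
    """match: the query value is analyzed first, then looked up."""
    return any(token in indexed_tokens for token in analyze(value))


print(indexed_tokens)              # {'hasparking'}
print(term_query("HasParking"))    # False
print(term_query("hasparking"))    # True
print(match_query("HasParking"))   # True
```

The index only ever contains the lowercased token, so a verbatim lookup of the mixed-case value cannot succeed.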

Elastic Search: filter query results by entry its field into another query results

I found this question about an equivalent of the IN operator:
ElasticSearch : IN equivalent operator in ElasticSearch
But I would like to find an equivalent for a more complicated request:
SELECT * FROM table WHERE id IN (SELECT id FROM anotherTable WHERE something > 0);
Mapping:
First index:
{
"mappings": {
"products": {
"properties": {
"id": { "type": "integer" },
"name": { "type": "text" },
}
}
}
}
Second index:
{
"mappings": {
"reserved": {
"properties": {
"id": { "type": "integer" },
"type": { "type": "text" }
}
}
}
}
I want to get the products whose ids appear in the reserved index with a specific reserve type.
First step - get all relevant ids from reserved index:
{
"size": 0,
"query": {
"bool": {
"must": [
{
"term": {
"type": "TYPE_HERE"
}
}
]
}
},
"aggregations": {
"ids": {
"terms": {
"field": "id"
}
}
}
}
--> see: Terms Aggregations, Bool Query and Term Query.
--> _source will retrieve only relevant field id.
Second step - get all relevant documents from products index:
{
"query": {
"bool": {
"must": [
{
"terms": {
"id": [
"ID_1",
"ID_2",
"AND_SO_ON..."
]
}
}
]
}
}
}
--> take all the ids from first step and put them as a list under terms:id[...]
--> see Terms Query.
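The glue between the two steps can be sketched in Python as below. This is a minimal sketch that only manipulates dicts: the function names are hypothetical, the sample response is made up and trimmed to the part a terms aggregation returns, and sending the requests is left to whatever client you use.

```python
def ids_from_aggregation(response, agg_name="ids"):
    """Collect bucket keys from a terms aggregation response."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return [bucket["key"] for bucket in buckets]


def products_query(ids):
    """Build the second-step request body with a terms query on id."""
    return {"query": {"bool": {"must": [{"terms": {"id": ids}}]}}}


# Hypothetical response from the first step, trimmed to the relevant part.
reserved_response = {
    "aggregations": {
        "ids": {
            "buckets": [
                {"key": 3, "doc_count": 2},
                {"key": 7, "doc_count": 1},
            ]
        }
    }
}

ids = ids_from_aggregation(reserved_response)
print(products_query(ids))
```

Note that a terms aggregation returns only the top 10 buckets by default, so its "size" would need to be raised if more ids are expected.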

ElasticSearch filtered query and filter term

I'm trying to use a filter in a filtered query. This is what I'm trying in Sense:
GET myindex/catalog/_search
{
"query": {
"filtered": {
"query": {
"query_string": {
"analyze_wildcard": true,
"query": "test",
"fields": ["title^3.5", "contributions.authors.name^5", "publisher^2", "formats.productCode^0.5", "description^0.1"],
"use_dis_max": true
}
},
"filter": {
"term": {
"sku": "test-687"
}
}
}
}
}
This query returns no hits, but if I remove the filter property I get exactly the item with sku = test-687.
I cannot understand why the query with the filter doesn't give me the same result.
Mapping:
{
"myindex": {
"mappings": {
"catalog": {
"properties": {
"sku": {
"type": "string"
},
"title": {
"type": "string"
},
"updated_at": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
}
}
}
}
}
}
the full query is:
GET myindex/catalog/_search {
"query": {
"filtered": {
"query": {
"query_string": {
"analyze_wildcard": true,
"query": "test",
"fields": ["title^3.5", "contributions.authors.name^5", "publisher^2", "formats.productCode^0.5", "description^0.1"],
"use_dis_max": true
}
},
"filter": {
"bool": {
"must": {
"query": {
"match": {
"sku": "test-687"
}
}
}
}
}
}
}
}
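With the default mapping, the standard analyzer is used:
An analyzer of type standard is built using the Standard Tokenizer with the Standard Token Filter, Lower Case Token Filter, and Stop Token Filter.
A term filter is not analyzed, so its value is compared verbatim (and case-sensitively) against the indexed tokens, whereas a match query is analyzed first. The standard analyzer splits test-687 into the tokens test and 687, so no indexed token equals test-687 and the term filter finds nothing.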
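To see why a verbatim lookup of the sku fails on an analyzed field, the tokenization can be approximated in a few lines of Python. This is only a sketch: standard_like_tokens is a hypothetical, simplified stand-in for the standard analyzer, not its real implementation.

```python
import re


def standard_like_tokens(text):
    """Approximate the standard analyzer: split on non-alphanumerics, lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


sku_tokens = standard_like_tokens("test-687")
print(sku_tokens)                # ['test', '687']
print("test-687" in sku_tokens)  # False: no indexed token equals the full sku
print("test" in sku_tokens)      # True
```

Mapping sku with "index": "not_analyzed" keeps the whole value as a single token, which is what a term filter needs.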

How to query for two fields in one and the same tuple in an array in ElasticSearch?

Let's say there are some documents in my index which look like this:
{
"category":"2020",
"properties":[
{
"name":"foo",
"value":"2"
},
{
"name":"boo",
"value":"2"
}
]
},
{
"category":"2020",
"properties":[
{
"name":"foo",
"value":"8"
},
{
"name":"boo",
"value":"2"
}
]
}
I'd like to query the index in a way that returns only those documents that match "foo":"2" but not "boo":"2".
I tried to write a query that matches both properties.name and properties.value, but then I'm getting false positives. I need a way to tell ElasticSearch that name and value have to be part of the same properties tuple.
How can I do that?
You need to map properties as a nested type. So your mapping would look similar to this:
{
"your_type": {
"properties": {
"category": {
"type": "string"
},
"properties": {
"type": "nested",
"properties": {
"name": {
"type": "string"
},
"value": {
"type": "string"
}
}
}
}
}
}
Then, your query to match documents having "foo=2" in the same tuple but not "boo=2" in the same tuple would need to use the nested query accordingly, like the one below.
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "properties",
"query": {
"bool": {
"must": [
{
"match": {
"properties.name": "foo"
}
},
{
"match": {
"properties.value": "2"
}
}
]
}
}
}
}
],
"must_not": [
{
"nested": {
"path": "properties",
"query": {
"bool": {
"must": [
{
"match": {
"properties.name": "boo"
}
},
{
"match": {
"properties.value": "2"
}
}
]
}
}
}
}
]
}
}
}
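When several such tuple conditions are needed, the query body above can be generated rather than written by hand. A minimal Python sketch, assuming the same field names as the mapping above; same_tuple and build_query are hypothetical helper names, and the code only builds a dict.

```python
def same_tuple(name, value, path="properties"):
    """Nested clause requiring name and value on the same array element."""
    return {
        "nested": {
            "path": path,
            "query": {
                "bool": {
                    "must": [
                        {"match": {f"{path}.name": name}},
                        {"match": {f"{path}.value": value}},
                    ]
                }
            },
        }
    }


def build_query(required, excluded):
    """Combine required and excluded (name, value) pairs into one bool query."""
    return {
        "query": {
            "bool": {
                "must": [same_tuple(n, v) for n, v in required],
                "must_not": [same_tuple(n, v) for n, v in excluded],
            }
        }
    }


query = build_query(required=[("foo", "2")], excluded=[("boo", "2")])
print(query)
```

Each (name, value) pair becomes its own nested clause, so the pairing is always checked within a single element of the properties array.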
@Val's answer is as good as it gets. I would add one thing, though, because some queries benefit from the "opposite" of what nested provides.
In Elasticsearch, the default type used to auto-create a field from "properties":[{"name":"foo","value":"2"},{"name":"boo","value":"2"}] is object. The drawback of object is that it doesn't associate one sub-field's value with another sub-field's value, meaning foo is not necessarily associated with 2. name is just an array of values and value is another array of values, with no association between the two.
If one needs the above association to work, then nested is a must.
But I have encountered situations where both behaviours were needed. In that case, you can set include_in_parent: true in the mapping so that you can take advantage of both. One of the situations that I have seen is here.
"properties": {
"type": "nested",
"include_in_parent": true,
"properties": {
"name": {
"type": "string"
...
