Elasticsearch match nested field against array of values - elasticsearch

I'm trying to apply a terms query on a nested field using mongoid-elasticsearch and Elasticsearch 2.0. This has been quite frustrating, since trial and error didn't pay off and the docs on the subject are rather sparse.
Here is my query:
{
"query": {
"nested": {
"path": "awards",
"query": {
"bool": {
"must": [
{ "match": { "awards.year": "2010"}}
]
}
}
},
"nested":{
"path": "procuring_entity",
"query": {
"bool": {
"must": [
{ "terms": { "procuring_entity.country": ["ES", "PL"]}}
]
}
}
}
}
}
While "match" and "term" work just fine, combining them with the "terms" query returns no results, even though it should. My mappings look like this:
elasticsearch!({
prefix_name: false,
index_name: 'documents',
index_options: {
mappings: {
document: {
properties: {
procuring_entity: {
type: "nested"
},
awards: {
type: "nested"
}
}
}
}
},
wrapper: :load
})
If "nested" doesn't count as an analyzer (which, as far as I know, it doesn't), then there's no problem with that. As for the second example, I don't think it's the case, since the array of values that it's matched against comes from outside.
Is a terms query possible on nested fields? Am I doing something wrong?
Is there any other way to match a nested field against multiple values?
Any thoughts would be much appreciated.

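I think you need to change the mapping of your nested types for this. A terms query does not analyze its input, so it only matches the exact tokens stored in the index. With the default analyzer, a value such as "ES" is indexed as the lowercased token "es", so the terms ["ES", "PL"] never match, while a match query (which analyzes its input) does. Mapping the field as not_analyzed keeps the original value. If you update your mapping to something like: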
elasticsearch!({
prefix_name: false,
index_name: 'documents',
index_options: {
mappings: {
document: {
properties: {
procuring_entity: {
type: 'nested',
properties: {
country: {
'type': 'string',
'index': 'not_analyzed'
}
}
},
awards: {
type: 'nested'
}
}
}
}
},
wrapper: :load
})
I think the query should work if you do that.
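Separately from the mapping, note that the query in the question repeats the "nested" key inside a single JSON object. Most JSON parsers keep only one of the duplicates, so one clause may be silently dropped (how Elasticsearch itself treats duplicate keys depends on the version, so treat that part as an assumption to verify). A minimal Python sketch, using only the standard library, shows the collapse and one way to require both nested clauses under a bool query; nested_clause and build_query are hypothetical helper names:

```python
import json

# Duplicate keys in one JSON object: Python's json module keeps the last one.
raw = '{"query": {"nested": {"path": "awards"}, "nested": {"path": "procuring_entity"}}}'
parsed = json.loads(raw)
print(parsed["query"]["nested"]["path"])  # procuring_entity


def nested_clause(path, inner):
    """Wrap an inner query in a nested query for the given path."""
    return {"nested": {"path": path, "query": inner}}


def build_query(year, countries):
    """Hypothetical helper: require both nested clauses via bool/must."""
    return {
        "query": {
            "bool": {
                "must": [
                    nested_clause("awards", {"match": {"awards.year": year}}),
                    nested_clause(
                        "procuring_entity",
                        {"terms": {"procuring_entity.country": countries}},
                    ),
                ]
            }
        }
    }


query = build_query("2010", ["ES", "PL"])
print(json.dumps(query, indent=2))
```

Each nested clause is now a separate element of the must array, so neither can overwrite the other.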

Related

elasticsearch search - search multiple fields

I am new to Elasticsearch and I am trying to search my index, which has the following properties:
// user index
{
profile: {
name: string,
description: string,
city: string,
state: string
},
services: [
{
serviceName: string
},
{
serviceName: string
},
{
serviceName: string
},
...
]
}
I am trying to combine my query_string and nested, but I cannot seem to get it to work.
body: {
query: {
query_string: {
query: `*${searchTerm}*`,
fields: [
// these profile fields work
'profile.name',
'profile.description',
'profile.city',
'profile.state',
// doesnt work, need to use nested
'services.serviceName'
]
},
// if i use nested here, elasticsearch throws error
// can I combine these two queries like this?
nested: {
}
}
}
I cannot find any good examples that use both a nested AND a query_string query.
Anyone have any suggestions?
Anything helps, even links to good docs / examples for the thing I am trying to do.
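You can combine the query_string and nested queries using a bool query. Each clause in must is itself a query, and the nested query needs a concrete inner query such as match:
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "search term",
            "fields": []
          }
        },
        {
          "nested": {
            "path": "services",
            "query": {
              "match": {
                "services.serviceName": "blue"
              }
            }
          }
        }
      ]
    }
  }
}
Use should instead of must if a match in either clause is enough.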
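As a sketch of how such a combined query could be assembled in application code, the Python below builds the request body from a user-supplied term. The function name build_search_body, the sample term, and the choice of should with minimum_should_match are assumptions for illustration; the field names come from the question.

```python
import json

PROFILE_FIELDS = [
    "profile.name",
    "profile.description",
    "profile.city",
    "profile.state",
]


def build_search_body(search_term):
    """Match the term in the profile fields OR in any nested service name."""
    return {
        "query": {
            "bool": {
                "should": [
                    {
                        "query_string": {
                            "query": f"*{search_term}*",
                            "fields": PROFILE_FIELDS,
                        }
                    },
                    {
                        "nested": {
                            "path": "services",
                            "query": {
                                "match": {"services.serviceName": search_term}
                            },
                        }
                    },
                ],
                # At least one of the two clauses has to match.
                "minimum_should_match": 1,
            }
        }
    }


body = build_search_body("plumbing")
print(json.dumps(body, indent=2))
```

This assumes services is mapped with type nested; a nested query against a path that is not mapped as nested is rejected.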

Elastic ngram prioritise whole words

I am trying to build an autocomplete with several million possible values. I have managed to do it with two different methods match and ngram. The problem is that match requires the user to type whole words and ngram returns poor results. Is there a way to only return ngram results if there are no match results?
Method 1: match
Returns very relevant results but requires user to type a full word
//mapping
analyzer: {
std_english: {
type: 'standard',
stopwords: '_english_',
},
}
//search
query: {
bool: {
must: [
{ term: { semanticTag: type } },
{ match: { search } }
]}
}
Method 2: ngram
Returns poor matches
//mapping
analysis: {
filter: {
autocomplete_filter: {
type: 'edge_ngram',
min_gram: 1,
max_gram: 20,
},
},
analyzer: {
autocomplete: {
type: 'custom',
tokenizer: 'standard',
filter: ['lowercase', 'autocomplete_filter'],
},
},
//search
query: {
bool: {
must: [
{ term: { semanticTag: type } },
{ match: {
term: {
query: search,
operator: 'and',
}
}
}
]}
}
Try changing the query to something like this:
{
"query": {
"bool": {
"must": [
{
"term": {
"semanticTag": "type"
}
},
{
"match_phrase_prefix": {
"fieldName": {
"query": "valueToSearch"
}
}
}
]
}
}
}
You can use match_phrase_prefix. With it the user does not need to type the whole word: any document whose indexed field contains a word starting with what the user typed is returned.
Note that this also matches words in the middle of the indexed text, not only the first word.
For example, if a field contains "lorem ipsum" and the user types "ips", that document is returned along with other documents containing a word that starts with "ips".
You can use either the standard analyzer or a custom one; check which better suits your use case. Based on the information in the question, the approach above works well with the standard analyzer.
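To see why an edge_ngram filter with min_gram 1 tends to match loosely, it helps to list the tokens it emits. The Python below is a rough simulation, not Elasticsearch's implementation. It also includes prioritised_query, a hypothetical helper showing one common way to rank whole-word matches above partial ones with bool/should. The field names term (standard analyzer) and term.autocomplete (edge n-gram analyzer) are assumptions, not taken from a confirmed mapping.

```python
def edge_ngrams(token, min_gram=1, max_gram=20):
    """Return the prefixes of token with lengths min_gram..max_gram."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def prioritised_query(search, semantic_tag):
    """Require a partial match, and boost documents that also match whole words."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"term": {"semanticTag": semantic_tag}},
                    # Assumed edge n-gram sub-field: matches partial input.
                    {
                        "match": {
                            "term.autocomplete": {
                                "query": search,
                                "operator": "and",
                            }
                        }
                    },
                ],
                "should": [
                    # Assumed standard-analyzed field: whole-word matches score higher.
                    {"match": {"term": {"query": search, "boost": 2.0}}}
                ],
            }
        }
    }


print(edge_ngrams("ipsum"))  # ['i', 'ip', 'ips', 'ipsu', 'ipsum']
print(prioritised_query("ips", "type"))
```

With min_gram set to 1, every word contributes a single-letter token, so a one-character input matches a very large share of the index. Raising min_gram is one knob worth trying.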

Elasticsearch custom function score

I am trying to do a search with custom functions to modify the document score.
I have a mapping with specialities stored inside a hospital, and every speciality has a priority:
Something like:
hospital:{
name: 'Fortis',
specialities: [
{
name: 'Cardiology',
priority: 10
},
{
name: 'Oncology',
priority: 15
}
]
}
Now I have a function score:
functions: [{
filter: {terms: {'specialities.name' => params[:search_text]}},
script_score: {script: "_score * doc['specialities.priority'].value"}
},
I have a filter query to match the search text against any speciality.
For example, if I search for Oncology, it matches, and I have specified a script_score to take the priority of that speciality and add it to the final score of the document.
But it takes the priority of the first speciality it encounters, which is 10, plus a score of 1 for the filter match, so the final score is 11 and not 21 (priority of Oncology + 1 for the filter match).
I solved it using a nested mapping in Elasticsearch.
Lucene has no concept of inner objects, so by default Elasticsearch flattens an array of objects into multi-valued fields and the link between each speciality's name and its priority is lost. To keep a priority attached to its speciality, the mapping needs to declare specialities as nested:
hospital: {
  properties: {
    specialities: {
      type: 'nested',
      properties: {
        name: {
          type: 'string'
        },
        priority: {
          type: 'long'
        }
      }
    }
  }
}
Reference: https://www.elastic.co/guide/en/elasticsearch/reference/2.0/nested.html
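The effect of the default (non-nested) object mapping can be approximated in a few lines of Python. This is only a simulation of the flattening idea, not what Elasticsearch does internally, and flatten is a hypothetical helper:

```python
from collections import defaultdict

hospital = {
    "name": "Fortis",
    "specialities": [
        {"name": "Cardiology", "priority": 10},
        {"name": "Oncology", "priority": 15},
    ],
}


def flatten(doc, prefix=""):
    """Collapse inner objects into one multi-valued field per leaf path."""
    flat = defaultdict(list)
    for key, value in doc.items():
        path = f"{prefix}{key}"
        items = value if isinstance(value, list) else [value]
        for item in items:
            if isinstance(item, dict):
                # Recurse into the inner object and merge its leaf values.
                for sub_path, sub_values in flatten(item, f"{path}.").items():
                    flat[sub_path].extend(sub_values)
            else:
                flat[path].append(item)
    return dict(flat)


flat = flatten(hospital)
print(flat)
# {'name': ['Fortis'], 'specialities.name': ['Cardiology', 'Oncology'], 'specialities.priority': [10, 15]}
```

Once flattened, nothing records that 15 belongs to Oncology, which is consistent with the script picking up 10 for a document that matched on Oncology.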
After that, I was able to define function score with nested query and my query looks like this:
"query": {
"filtered": {
"query": {
"bool": {
"must": [
{
"nested": {
"path": "specialities",
"query": {
"function_score": {
"score_mode": "sum",
"boost_mode": "sum",
"filter": {
"terms": {
"specialities.name.raw": ["Oncology"]
}
},
"functions": [
{
"field_value_factor": {
"field": "specialities.priority"
}
}
]
}
}
}
}
]
}
}
}
}

search documents on the basis of matching fields and length in array - Elasticsearch

I have a document structure like the one below in Elasticsearch:
{
_id: 1,
name: 'abc',
post: [{
type: 'text',
url: '__url___'
}, {
type: 'image',
url: '__url___'
}, {
type: 'text',
url: '__url___'
}, {
type: 'video',
url: '__url___'
}, {
type: 'text',
url: '__url___'
}]
}
And I want to search on documents that have posts with type as text appearing more than two times. Is it possible in Elasticsearch?
Option 1
You need to use a script for this type of search, for a field called post and a sub-field called type:
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"script": {
"script": "_source.post.type.count(param1)>2",
"params": {
"param1": "text"
}
}
}
}
}
}
And make sure you enable inline scripts in your configuration file:
script.engine.groovy.inline.search: on
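For readers less familiar with Groovy, the condition the script evaluates per document can be sketched in Python as follows. This only illustrates the logic, it does not run inside Elasticsearch, and count_posts_of_type is a hypothetical helper:

```python
doc = {
    "_id": 1,
    "name": "abc",
    "post": [
        {"type": "text", "url": "__url___"},
        {"type": "image", "url": "__url___"},
        {"type": "text", "url": "__url___"},
        {"type": "video", "url": "__url___"},
        {"type": "text", "url": "__url___"},
    ],
}


def count_posts_of_type(source, post_type):
    """Count the entries of source['post'] whose type equals post_type."""
    return sum(1 for post in source.get("post", []) if post.get("type") == post_type)


# Same condition as the script filter: more than two posts of type "text".
matches = count_posts_of_type(doc, "text") > 2
print(matches)  # True
```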
Option 2
This operation can also be done at indexing time, using a transform, to save time when searching. Something like this:
{
"mappings": {
"test": {
"transform": {
"script": "if(ctx._source.post.type.count(param1)>2) ctx._source['count_texts']=ctx._source.post.type.count(param1);",
"params": {
"param1": "text"
}
},
"properties": {
"name": {
"type": "string"
},
"count_texts": {
"type": "integer"
},
...
Making sure you enable the proper scripting settings in the configuration file:
script.engine.groovy.inline.mapping: on
And, at search time, a query like this should do it:
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"range": {
"count_texts": {
"gt": 2
}
}
}
}
}
}
The advantage of transform is that the heavy script operation is performed at indexing time rather than at search time, so searches are potentially faster than with a search-time script.
The disadvantage of transform is that you cannot specify a different value for param1 unless you define another transform in the mapping itself. What if you want to count videos as well? You need to add another transform and another field, count_videos for example.
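One way around the one-transform-per-type limitation, if the indexing code is under your control, is to compute all the counters in the application before sending the document. This is a different technique from the mapping transform described above, sketched here in Python; with_type_counts and the count_<type>s naming scheme are assumptions modelled on the count_texts and count_videos names.

```python
from collections import Counter


def with_type_counts(source):
    """Return a copy of source with one count_<type>s field per post type."""
    counts = Counter(post["type"] for post in source.get("post", []))
    enriched = dict(source)
    for post_type, n in counts.items():
        enriched[f"count_{post_type}s"] = n
    return enriched


doc = {
    "name": "abc",
    "post": [
        {"type": "text"},
        {"type": "image"},
        {"type": "text"},
        {"type": "video"},
        {"type": "text"},
    ],
}

enriched = with_type_counts(doc)
print(enriched["count_texts"], enriched["count_videos"])  # 3 1
```

The enriched document can then be indexed as usual and filtered with a range query on whichever counter is needed.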

Multiple types in Elasticsearch Type Filter

I have a filtered query like this
query: {
filtered: {
query: {
bool: {
should: [{multi_match: {
query: #query,
fields: ['title', 'content']
}
},{fuzzy: {
content: {
value: #query,
min_similarity: '1d',
}
}}]
}
},
filter: {
and: [
type: {
value: #type
}]
}}}
That works fine if #type is a string, but does not work if #type is an array. How can I search for multiple types?
This worked, but I'm not happy with it:
filter: {
or: [
{ type: { value: 'blog'} },
{ type: { value: 'category'} },
{ type: { value: 'miscellaneous'} }
]
}
I'd love to accept a better answer
You can easily specify multiple types in your search request's URL, e.g. http://localhost:9200/twitter/tweet,user/_search, or with type in the header if using _msearch, as documented here.
These are then added as filters for you by Elasticsearch.
Also, you usually want to use bool rather than and/or to combine filters, for reasons described in this article: all about elasticsearch filter bitsets
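If the list of types is only known at runtime, the URL form can be built with a simple join. A small Python sketch; search_path is a hypothetical helper and the index and type names are the ones from the example URL:

```python
def search_path(index, types):
    """Build a search path that targets several types of one index."""
    return f"/{index}/{','.join(types)}/_search"


print(search_path("twitter", ["tweet", "user"]))  # /twitter/tweet,user/_search
```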
This worked for me:
Within the filter parameter, wrap the type queries as should clauses of a bool query, e.g.:
{
"query": {
"bool": {
"must": {
"term": { "foo": "bar" }
},
"filter": {
"bool": {
"should": [
{ "type": { "value": "myType" } },
{ "type": { "value": "myOtherType" } }
]
}
}
}
}
}
Suggested by dadoonet in the Elasticsearch GitHub issue "Support multiple types in Type Query".
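When the types arrive as an array, the should clauses can be generated instead of written out by hand. A minimal Python sketch of that idea; type_filter and build_typed_query are hypothetical helper names, and the sample type names are taken from the question:

```python
def type_filter(types):
    """Build a bool filter that matches documents of any of the given types."""
    return {"bool": {"should": [{"type": {"value": t}} for t in types]}}


def build_typed_query(must_clause, types):
    """Combine a scoring clause with the generated type filter."""
    return {
        "query": {
            "bool": {
                "must": must_clause,
                "filter": type_filter(types),
            }
        }
    }


query = build_typed_query(
    {"term": {"foo": "bar"}},
    ["blog", "category", "miscellaneous"],
)
print(query)
```

Passing a single-element list produces the same structure with one should clause, so the caller does not need a special case for one type versus many.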
