How can I use query_string to match both nested and non-nested fields at the same time?

I have an index with a mapping something like this:
"email" : {
"type" : "nested",
"properties" : {
"from" : {
"type" : "text",
"analyzer" : "lowercase_keyword",
"fielddata" : true
},
"subject" : {
"type" : "text",
"analyzer" : "lowercase_keyword",
"fielddata" : true
},
"to" : {
"type" : "text",
"analyzer" : "lowercase_keyword",
"fielddata" : true
}
}
},
"textExact" : {
"type" : "text",
"analyzer" : "lowercase_standard",
"fielddata" : true
}
I want to use query_string to search for matches in both the nested and the non-nested field at the same time, e.g.
email.to:foo#example.com AND textExact:bar
But I can't figure out how to write a query that will search both fields at once. The following doesn't work, because query_string searches do not return nested documents:
"query": {
"query_string": {
"fields": [
"textExact",
"email.to"
],
"query": "email.to:foo#example.com AND textExact:bar"
}
}
I can write a separate nested query, but that will only search against nested fields. Is there any way I can use query_string to match both nested and non-nested fields at the same time?
I am using Elasticsearch 6.8. Cross-posted on the Elasticsearch forums.

Nested documents can only be queried with the nested query.
You can follow one of the two approaches below.
1. You can combine a nested query and a normal query in a must clause, which works like an AND across the different queries.
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "email",
"query": {
"term": {
"email.to": "foo#example.com"
}
}
}
},
{
"match": {
"textExact": "bar"
}
}
]
}
}
}
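If you want to keep the query_string syntax from the question, the nested clause can itself wrap a query_string restricted to the nested fields, combined with a second query_string for the non-nested field. A minimal sketch, reusing the fields from the mapping above:
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "email",
"query": {
"query_string": {
"fields": [ "email.to" ],
"query": "foo#example.com"
}
}
}
},
{
"query_string": {
"fields": [ "textExact" ],
"query": "bar"
}
}
]
}
}
}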
2. copy_to
The copy_to parameter allows you to copy the values of multiple fields into a group field, which can then be queried as a single field.
{
"mappings": {
"properties": {
"textExact":{
"type": "text"
},
"to_email":{
"type": "keyword"
},
"email":{
"type": "nested",
"properties": {
"to":{
"type":"keyword",
"copy_to": "to_email" --> copies to non-nested field
},
"from":{
"type":"keyword"
}
}
}
}
}
}
Query
{
"query": {
"query_string": {
"fields": [
"textExact",
"to_email"
],
"query": "to_email:foo#example.com AND textExact:bar"
}
}
}
Result
"_source" : {
"textExact" : "bar",
"email" : [
{
"to" : "sdfsd#example.com",
"from" : "a#example.com"
},
{
"to" : "foo#example.com",
"from" : "sdfds#example.com"
}
]
}
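Note that copy_to is a mapping change, so if the index already holds data you would typically create a new index with the updated mapping and reindex into it. A sketch using the Reindex API (index names are illustrative):
POST _reindex
{
"source": { "index": "emails_v1" },
"dest": { "index": "emails_v2" }
}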

Related

Place an Analyzer on a specific array item in a nested object

I have the following mapping
"mappings":{
"properties":{
"name": {
"type": "text"
},
"age": {
"type": "integer"
},
"customProps":{
"type" : "nested",
"properties": {
"key":{
"type": "keyword"
},
"value": {
"type" : "keyword"
}
}
}
}
}
example data
{
"name" : "person1",
"age" : 10,
"customProps":[
{"hairColor":"blue"},
{"height":"120"}
]
},
{
"name" : "person2",
"age" : 30,
"customProps":[
{"jobTitle" : "software engineer"},
{"salaryAccount" : "AvGhj90AAb"}
]
}
So I want to be able to search for documents by salary account, case insensitively; I am also searching using a wildcard.
Example query:
{
"query": {
"bool": {
"should": [
{
"nested": {
"path": "customProps",
"query": {
"bool": {
"must": [
{ "match": { "customProps.key": "salaryAccount" } },
{ "wildcard": { "customProps.value": "*AvG*"
}
}
]}}}}]}}}
I tried adding an analyzer with PUT using the following syntax:
{
"settings":{
"index":{
"analysis":{
"analyzer":{
"analyzer_case_insensitive" : {
"tokenizer":"keyword",
"filter":"lowercase"
}
}
}
}
},
"mappings":{
"people":{
"properties":{
"customProps":{
"properties":{
"value":{
"type": "keyword",
"analyzer": "analyzer_case_insensitive"
}
}
}
}
}
}
}
I'm getting the following error:
"type" : "mapper_parsing_exception",
"reason" : "Root mapping definition has unsupported parameters: [people: {properties={customProps={properties={value={analyzer=analyzer_case_insensitive, type=keyword}}}}}]"
Any idea how to apply the analyzer to the salaryAccount object in the array when it exists?
Your use case is clear: you want to search on the value of salaryAccount only when this key exists in the customProps array.
There are some issues with your mapping definition:
You cannot define a custom analyzer on a keyword type field; instead you can use a normalizer.
Based on the mapping definition at the beginning of the question, it seems you are using Elasticsearch 7.x. But the second mapping definition you provided also includes a mapping type (i.e. people), which is not supported in 7.x.
There is no need to add the key and value fields in the index mapping.
Adding a working example with index mapping, search query, and search result
Index Mapping:
PUT myidx
{
"mappings": {
"properties": {
"customProps": {
"type": "nested"
}
}
}
}
Search Query:
You need to use an exists query to check whether a field exists. The case_insensitive param in the wildcard query is available since Elasticsearch 7.10; if you are on an earlier version, you need to use a normalizer instead to achieve case-insensitive matching (see the sketch after the search result below).
POST myidx/_search
{
"query": {
"bool": {
"should": [
{
"nested": {
"path": "customProps",
"query": {
"bool": {
"must": [
{
"exists": {
"field": "customProps.salaryAccount"
}
},
{
"wildcard": {
"customProps.salaryAccount.keyword": {
"value": "*aVg*",
"case_insensitive": true
}
}
}
]
}
}
}
}
]
}
}
}
Search Result:
"hits" : [
{
"_index" : "myidx",
"_type" : "_doc",
"_id" : "2",
"_score" : 2.0,
"_source" : {
"name" : "person2",
"age" : 30,
"customProps" : [
{
"jobTitle" : "software engineer"
},
{
"salaryAccount" : "AvGhj90AAb"
}
]
}
}
]
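As mentioned above, if you are on a version below 7.10 where the case_insensitive param is not available, a normalizer on the keyword field can be used instead. A minimal sketch (the normalizer name and the explicit salaryAccount mapping are illustrative; the wildcard pattern is then written in lowercase, since the normalizer lowercases the indexed values):
PUT myidx
{
"settings": {
"analysis": {
"normalizer": {
"lowercase_normalizer": {
"type": "custom",
"filter": [ "lowercase" ]
}
}
}
},
"mappings": {
"properties": {
"customProps": {
"type": "nested",
"properties": {
"salaryAccount": {
"type": "keyword",
"normalizer": "lowercase_normalizer"
}
}
}
}
}
}
The wildcard query would then target customProps.salaryAccount directly, e.g. with the pattern *avg*.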

Can't get Elasticsearch filter to work

Using Elasticsearch 2 with Rails 4 and the elasticsearch-model gem.
Everything is fine and even geo-point distance is working. However, I can't work out for the life of me how to make a simple boolean filter work. I have a simple boolean 'exclude_from_search_results' that (when true) should cause the record to be filtered from the results.
Here's my query in rails controller (without the filter):
#response = Firm.search(
query: {
bool: {
should: [
{ multi_match: {
query: params[:search],
fields: ['name^10', 'address_1', 'address_2', 'address_3', 'address_4', 'address_5', 'address_6'],
operator: 'or'
}
}
]
}
},
aggs: {types: {terms: {field: 'firm_type'}}}
)
I've added the filter both within the bool section and outside it, but I either get no documents or all documents (9,000 should match).
Example:
#response = Firm.search(
query: {
bool: {
should: [
{ multi_match: {
query: params[:search],
fields: ['name^10', 'address_1', 'address_2', 'address_3', 'address_4', 'address_5', 'address_6'],
operator: 'or'
}
}
],
filter: {
term: {"exclude_from_search_results": "false"}
}
}
},
aggs: {types: {terms: {field: 'firm_type'}}}
)
I've also tried putting the filter clause in different places but either get an error or no results. What am I doing wrong? Probably missing something simple...
Here's my mapping:
"mappings" : {
"firm" : {
"dynamic" : "false",
"properties" : {
"address_1" : {
"type" : "string",
"index_options" : "offsets",
"analyzer" : "english"
},
"address_2" : {
"type" : "string",
"index_options" : "offsets",
"analyzer" : "english"
},
"address_3" : {
"type" : "string",
"index_options" : "offsets",
"analyzer" : "english"
},
"address_4" : {
"type" : "string",
"index_options" : "offsets",
"analyzer" : "english"
},
"address_5" : {
"type" : "string",
"index_options" : "offsets",
"analyzer" : "english"
},
"address_6" : {
"type" : "string",
"index_options" : "offsets",
"analyzer" : "english"
},
"exlude_from_search_results" : {
"type" : "boolean"
},
"firm_type" : {
"type" : "string",
"index" : "not_analyzed"
},
"location" : {
"type" : "geo_point"
},
"name" : {
"type" : "string",
"index_options" : "offsets",
"analyzer" : "english"
}
}
}
}
Any pointers greatly appreciated...
Your current query is doing an OR between your filter and the multi-match query. That's the reason you get all documents.
I suppose you want an AND between the filter and the multi-match query.
If that is the case, then the following query works for me.
{
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "address1",
"fields": [
"name^10",
"address1",
"address2",
"address3",
"address4",
"address5",
"address6"
],
"operator": "or"
}
},
{
"term": {
"exclude_from_search_results": {
"value": "false"
}
}
}
]
}
},
"aggs": {
"types": {
"terms": {
"field": "name"
}
}
}
}
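Equivalently, the term clause can be kept in the bool filter section, so it does not affect scoring, as long as the multi_match sits in must. A sketch with a shortened field list:
{
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "address1",
"fields": [ "name^10", "address_1", "address_2" ],
"operator": "or"
}
}
],
"filter": {
"term": { "exclude_from_search_results": "false" }
}
}
}
}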
Hope this helps. Thanks.

Aggregating a Key/Value list in ElasticSearch

The general problem is that I've created a Name/Value mapping in Elasticsearch to deal with a potentially huge number of user-input tags, as opposed to allowing an open schema where people can just create documents with new properties.
I've got an elastic search mapping that looks like this:
"Tags" : {
"properties" : {
"Value" : {
"analyzer" : "keyword",
"type" : "string"
},
"Name" : {
"analyzer" : "keyword",
"type" : "string"
}
}
},
With records that look like this
"Tags" : [
{
"Name" : "group",
"Value" : "foobar"
},
{
"Name" : "season",
"Value" : "winter"
}
],
What I'm trying to do with an Elasticsearch query is to write a script that will aggregate only the season entries.
...
"script" : "for (int i = 0; i < doc['Tags.Value'].values.length; i++) {
if (doc['Tags.Value'].values[i] == 'season') {
return doc['Tags.Names'].values[i]
} }"
...
I've gone through about 200 permutations of the above script and it's not quite returning the results that I would like to see.
Your Tags field should be nested so that you can write a nested query to select only the season tags and then aggregate on those values alone. That would also let you ditch the script, which is going to perform very badly if you have a huge number of tags.
So your mapping needs to look like this:
"Tags" : {
"type": "nested", <---- add this
"properties" : {
"Value" : {
"analyzer" : "keyword",
"type" : "string"
},
"Name" : {
"analyzer" : "keyword",
"type" : "string"
}
}
},
Then your query should include a nested clause on the season tag names, so that your terms aggregation can simply work on those values.
{
"query": {
"filtered": {
"filter": {
"nested": {
"path": "Tags",
"filter": {
"term": {
"Tags.Name": "season"
}
}
}
}
}
},
"aggs": {
"season_tags": {
"nested": {
"path": "Tags"
},
"aggs": {
"season_values": {
"terms": {
"field": "Tags.Value"
}
}
}
}
}
}
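One thing to keep in mind: the nested aggregation above runs over all tags of the matching documents, so values of other tags (e.g. group) can also show up in the buckets. To restrict the buckets to season values only, one option is a filter sub-aggregation inside the nested aggregation, roughly like this:
"aggs": {
"season_tags": {
"nested": {
"path": "Tags"
},
"aggs": {
"season_only": {
"filter": {
"term": { "Tags.Name": "season" }
},
"aggs": {
"season_values": {
"terms": { "field": "Tags.Value" }
}
}
}
}
}
}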

Elasticsearch terms aggregate duplicates

I have a field using an ngram analyzer and am trying to use a terms aggregation on the field to return unique documents by that field. The returned keys in the aggregation don't match the document fields being returned, and I'm getting duplicate fields.
"analysis" : {
"filter" : {
"autocomplete_filter" : {
"type" : "edge_ngram",
"min_gram" : "1",
"max_gram" : "20"
}
},
"analyzer" : {
"autocomplete" : {
"type" : "custom",
"filter" : [ "lowercase", "autocomplete_filter" ],
"tokenizer" : "standard"
}
}
}
}
"name" : {
"type" : "string",
"analyzer" : "autocomplete",
"fields" : {
"raw" : {
"type" : "string",
"index" : "not_analyzed"
}
}
}
{
"query": {
"query_string": {
"query":"bra",
"fields":["name"],
"use_dis_max":true
}
},
"aggs": {
"group_by_name": {
"terms": { "field":"name.raw" }
}
}
}
I'm getting back the following names and keys.
Braingeyser, Brainstorm, Braingeyser, Brainstorm, Brainstorm, Brainstorm, Bramblecrush, Brainwash, Brainwash, Braingeyser
{"key":"Bog Wraith","doc_count":18}
{"key":"Birds of Paradise","doc_count":15}
{"key":"Circle of Protection: Black","doc_count":15}
{"key":"Lightning Bolt","doc_count":15}
{"key":"Grizzly Bears","doc_count":14}
{"key":"Black Knight","doc_count":13}
{"key":"Bad Moon","doc_count":12}
{"key":"Boomerang","doc_count":12}
{"key":"Wall of Bone","doc_count":12}
{"key":"Balance","doc_count":11}
How can I get Elasticsearch to only return unique fields from the aggregation?
To remove duplicates being returned in your aggregation, you can try:
"aggs": {
"group_by_name": {
"terms": { "field":"name.raw" },
"aggs": {
"remove_dups": {
"top_hits": {
"size": 1,
"_source": false
}
}
}
}
}
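The terms keys themselves are already unique; the top_hits sub-aggregation just keeps one representative document per bucket. If you only need the unique names, you can also set size to 0 on the search itself so that only the aggregation buckets come back. A sketch of the full request:
{
"size": 0,
"query": {
"query_string": {
"query": "bra",
"fields": [ "name" ]
}
},
"aggs": {
"group_by_name": {
"terms": { "field": "name.raw" },
"aggs": {
"remove_dups": {
"top_hits": {
"size": 1,
"_source": false
}
}
}
}
}
}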

Elasticsearch postings highlighter failing for some search strings

I have a search that works well with most search strings but fails spectacularly on others. From experimenting, it appears to fail when at least one word in the query doesn't match (as in this made-up search phrase), with the error:
{
"error": "SearchPhaseExecutionException[Failed to execute phase [query_fetch], all shards failed; shardFailures {[w3zfoix_Qi-xwpVGbCbQWw][ia_test][0]: ElasticsearchIllegalArgumentException[the field [content] should be indexed with positions and offsets in the postings list to be used with postings highlighter]}]",
"status": 400
}
The simplest search which gives this error is the one below:
POST /myindex/_search
{
"from" : 0,
"size" : 25,
"query": {
"filtered" : {
"query" : {
"multi_match" : {
"type" : "most_fields",
"fields": ["title", "content", "content.english"],
"query": "Box Fexye"
}
}
}
},
"highlight" : {
"fields" : {
"content" : {
"type" : "postings"
}
}
}
}
My query is more complicated than this, and I need to use the "postings" highlighter to pull out the best matching sentence from a document.
Indexing of the relevant fields looks like:
"properties" : {
"title" : {
"type" : "string",
"fields": {
"shingles": {
"type": "string",
"analyzer": "my_shingle_analyzer"
}
}
},
"content" : {
"type" : "string",
"analyzer" : "standard",
"fields": {
"english": {
"type": "string",
"analyzer": "my_english"
},
"shingles": {
"type": "string",
"analyzer": "my_shingle_analyzer"
}
},
"index_options" : "offsets",
"term_vector" : "with_positions_offsets"
}
}
