I had used Elasticsearch few years ago(version 6.4.0) and they had no provision to provide highlight the "copy_to" field. I would like to know if they have this provision now?
Yes, highlight can enabled on copy_to field in latest version.
Please check below example which i have tried.
Index Mapping:
PUT my-index-000001
{
"mappings": {
"properties": {
"first_name": {
"type": "text",
"copy_to": "full_name"
},
"last_name": {
"type": "text",
"copy_to": "full_name"
},
"full_name": {
"type": "text"
}
}
}
}
Document Index:
PUT my-index-000001/_doc/1
{
"first_name": "John",
"last_name": "Smith"
}
Query:
GET my-index-000001/_search
{
"query": {
"match": {
"full_name": {
"query": "John Smith",
"operator": "and"
}
}
},
"highlight": {
"fields": {
"full_name": {}
}
}
}
Result:
"hits" : [
{
"_index" : "my-index-000001",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.5753642,
"_source" : {
"first_name" : "John",
"last_name" : "Smith"
},
"highlight" : {
"full_name" : [
"<em>Smith</em>",
"<em>John</em>"
]
}
}
]
Update 1: Search using copy_to field and highlight match to particular field
In below example, search will be happen on full_name field which is copy field and highlight will be happen on first_name field.
Query:
GET my-index-000001/_search
{
"query": {
"match": {
"full_name": {
"query": "John Smith",
"operator": "and"
}
}
},
"highlight": {
"require_field_match": "false",
"fields": {
"first_name": {}
}
}
}
Result:
{
"_index" : "my-index-000001",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.5753642,
"_source" : {
"first_name" : "John",
"last_name" : "Smith"
},
"highlight" : {
"first_name" : [
"<em>John</em>"
]
}
}
Related
I would like to apply any analyser that satisfy below search. Let's take an example. Suppose I have entered below text in a document
I have store similar kind of sentence as specialization in opensearch.
Cardiologist Doctor.
Cardiac surgeon.
neuro surgeon.
cardiac specialist.
nursing care
Anatomy.
Anaesthesiology.
So, if I search cardiac surgeon result should be ['cardiologist', 'cardiac surgeon', 'cardiac specialist'] and it should not return 'neuro surgeon', 'nursing care'.
Also, if I search anatomy result should be ['anatomoy'] and it should not return Anaesthesiology.
I have tried with ngram_filter, but when I search cardiologist it's returning cardiologist and nursing care both instead of cardiologist only.
"ngram_filter": {
"type": "edge_ngram",
"min_gram": 3,
"max_gram": 15
},
My suggestion using synonyms:
PUT synonyms
{
"settings": {
"index": {
"analysis": {
"analyzer": {
"synonyms_analyzer": {
"tokenizer": "standard",
"filter": [
"lowercase",
"synonyms_filter"
]
}
},
"filter": {
"synonyms_filter": {
"type": "synonym",
"synonyms": [
"cardiac surgeon, cardiologist, cardiac surgeon, cardiac specialist"
]
}
}
}
}
},
"mappings": {
"properties": {
"name": {
"type": "text",
"search_analyzer": "synonyms_analyzer"
}
}
}
}
POST _bulk
{ "index" : { "_index" : "synonyms", "_id" : "1"}}
{ "name" : "Cardiac surgeon" }
{ "index" : { "_index" : "synonyms", "_id" : "2"}}
{ "name" : "Cardiologist Doctor" }
{ "index" : { "_index" : "synonyms", "_id" : "3"}}
{ "name" : "neuro surgeon" }
{ "index" : { "_index" : "synonyms", "_id" : "4"}}
{ "name" : "cardiac specialist" }
{ "index" : { "_index" : "synonyms", "_id" : "5"}}
{ "name" : "nursing care" }
{ "index" : { "_index" : "synonyms", "_id" : "6"}}
{ "name" : "Anatomy" }
{ "index" : { "_index" : "synonyms", "_id" : "7"}}
{ "name" : "Anaesthesiology" }
GET synonyms/_search
{
"query": {
"match": {
"name": "cardiac surgeon"
}
}
}
Hits:
"hits": [
{
"_index": "synonyms",
"_id": "1",
"_score": 13.066887,
"_source": {
"name": "Cardiac surgeon"
}
},
{
"_index": "synonyms",
"_id": "4",
"_score": 7.9681025,
"_source": {
"name": "cardiac specialist"
}
},
{
"_index": "synonyms",
"_id": "2",
"_score": 1.567127,
"_source": {
"name": "Cardiologist Doctor"
}
}
]
Is it possible to configure the mapping of an index, or the discover view of this in index in a way that an array inside the documents is / will be sorted?
Background: I have a es index with documents containing an array:
This array is updated from time to time with new entries (objects containing a timestamp), and I would like this arrays to be sorted according to the timestamp inside the objects.
If your field is define as nested type then you can use inner_hits to sort the array of object. it will return the sorted object array inside inner_hits for each document.
You can define field as nested like below:
{
"mappings": {
"properties": {
"name": {
"type": "keyword"
},
"openTimes": {
"type": "nested",
"properties": {
"date": {
"type": "date"
},
"name": {
"type": "keyword"
}
}
}
}
}
}
Let consider below is your sample data:
{"index": { } }
{ "name": "second on 6th (3rd on the 5th)", "openTimes": [ { "date": "2018-12-05T12:00:00" ,"name":"abc"}, { "date": "2018-12-06T11:00:00","name":"xyz" }] }
{"index": { } }
{ "name": "third on 6th (1st on the 5th)", "openTimes": [ {"date": "2018-12-05T10:00:00","name":"abc"}, { "date": "2018-12-06T12:00:00","name":"xyz" }] }
{"index": { } }
{ "name": "first on the 6th (2nd on the 5th)", "openTimes": [ {"date": "2018-12-05T11:00:00","name":"abc" }, { "date": "2018-12-06T10:00:00","name":"xyz" }] }
Below is Query:
{
"query": {
"nested": {
"path": "openTimes",
"query": {
"match_all": {}
},
"inner_hits": {
"sort": {
"openTimes.date": "desc"
}
}
}
}
}
Sample Response:
{
"_index" : "nested-listings",
"_type" : "_doc",
"_id" : "u0fw338BMCbs63yKkqi0",
"_score" : 1.0,
"_source" : {
"name" : "second on 6th (3rd on the 5th)",
"openTimes" : [
{
"date" : "2018-12-05T12:00:00",
"name" : "abc"
},
{
"date" : "2018-12-06T11:00:00",
"name" : "xyz"
}
]
},
"inner_hits" : {
"openTimes" : {
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "nested-listings",
"_type" : "_doc",
"_id" : "u0fw338BMCbs63yKkqi0",
"_nested" : {
"field" : "openTimes",
"offset" : 1
},
"_score" : null,
"_source" : {
"date" : "2018-12-06T11:00:00",
"name" : "xyz"
},
"sort" : [
1544094000000
]
},
{
"_index" : "nested-listings",
"_type" : "_doc",
"_id" : "u0fw338BMCbs63yKkqi0",
"_nested" : {
"field" : "openTimes",
"offset" : 0
},
"_score" : null,
"_source" : {
"date" : "2018-12-05T12:00:00",
"name" : "abc"
},
"sort" : [
1544011200000
]
}
]
}
}
}
}
I have index like this:
PUT job_offers
{
"mappings": {
"properties": {
"location": {
"properties": {
"slug": {
"type": "keyword"
},
"name": {
"type": "text"
}
},
"type": "nested"
},
"experience": {
"properties": {
"slug": {
"type": "keyword"
},
"name": {
"type": "text"
}
},
"type": "nested"
}
}
}
}
I insert this object:
POST job_offers/_doc
{
"title": "Junior Ruby on Rails Developer",
"location": [
{
"slug": "new-york",
"name": "New York"
},
{
"slug": "atlanta",
"name": "Atlanta"
},
{
"slug": "remote",
"name": "Remote"
}
],
"experience": [
{
"slug": "junior",
"name": "Junior"
}
]
}
This query returns 0 documents.
GET job_offers/_search
{
"query": {
"terms": {
"location.slug": [
"remote",
"new-york"
]
}
}
}
Can you explain me why? I thought it should return documents where location.slug is remote or new-york.
Nested- Query have a different syntax
GET job_offers/_search
{
"query": {
"nested": {
"path": "location",
"query": {
"terms": {
"location.slug": ["remote","new-york"]
}
}
}
}
}
Result:
"hits" : [
{
"_index" : "job_offers",
"_type" : "_doc",
"_id" : "wWjoXnEBs0rCGpYsvUf4",
"_score" : 1.0,
"_source" : {
"title" : "Junior Ruby on Rails Developer",
"location" : [
{
"slug" : "new-york",
"name" : "New York"
},
{
"slug" : "atlanta",
"name" : "Atlanta"
},
{
"slug" : "remote",
"name" : "Remote"
}
],
"experience" : [
{
"slug" : "junior",
"name" : "Junior"
}
]
}
}
]
It will return entire document where location.slug matches "remote" or "new-york". If you want to get matched nested document , you need to use inner_hits
GET job_offers/_search
{
"query": {
"nested": {
"path": "location",
"query": {
"terms": {
"location.slug": ["remote","new-york"]
}
},
"inner_hits": {} --> note
}
}
}
Result:
"hits" : [
{
"_index" : "job_offers",
"_type" : "_doc",
"_id" : "wWjoXnEBs0rCGpYsvUf4",
"_score" : 1.0,
"_source" : {
"title" : "Junior Ruby on Rails Developer",
"location" : [
{
"slug" : "new-york",
"name" : "New York"
},
{
"slug" : "atlanta",
"name" : "Atlanta"
},
{
"slug" : "remote",
"name" : "Remote"
}
],
"experience" : [
{
"slug" : "junior",
"name" : "Junior"
}
]
},
"inner_hits" : { --> will give matched nested object
"location" : {
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "job_offers",
"_type" : "_doc",
"_id" : "wWjoXnEBs0rCGpYsvUf4",
"_nested" : {
"field" : "location",
"offset" : 0
},
"_score" : 1.0,
"_source" : {
"slug" : "new-york",
"name" : "New York"
}
},
{
"_index" : "job_offers",
"_type" : "_doc",
"_id" : "wWjoXnEBs0rCGpYsvUf4",
"_nested" : {
"field" : "location",
"offset" : 2
},
"_score" : 1.0,
"_source" : {
"slug" : "remote",
"name" : "Remote"
}
}
]
}
}
}
}
]
Also I see that you are using two fields for same data with different types. if data is same in both fields(name and slug) and only data type is different, you can use fields for that
It is often useful to index the same field in different ways for
different purposes. This is the purpose of multi-fields. For instance,
a string field could be mapped as a text field for full-text search,
and as a keyword field for sorting or aggregations:
In that case your mapping will become below
PUT job_offers
{
"mappings": {
"properties": {
"location": {
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
}
},
"type": "nested"
},
"experience": {
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
}
},
"type": "nested"
}
}
}
}
I want to pass a list of emails in Elastic Search Query, So I tried below query to achieve that, but didn't get any result.
{
"query": {
"terms": {
"email": [ "andrew#gmail.com", "michel#gmail.com" ]
}
}
}
When I used id instead of emails, that worked !
{
"query": {
"terms": {
"id": [ 43, 67 ]
}
}
}
Could you please explain what's wrong with my email query and how make it works
If you want to recognize email addresses as single tokens you should use uax_url_email tokenizer.
UAX URL Email Tokenizer
A working example:
Mappings
PUT my_index
{
"settings": {
"analysis": {
"analyzer": {
"my_email_analyzer": {
"type": "custom",
"tokenizer": "my_tokenizer",
"filter": ["lowercase", "stop"]
}
},
"tokenizer": {
"my_tokenizer":{
"type": "uax_url_email"
}
}
}
},
"mappings": {
"properties": {
"email": {
"type": "text",
"analyzer": "my_email_analyzer",
"search_analyzer": "my_email_analyzer",
"fields": {
"keyword":{
"type":"keyword"
}
}
}
}
}
}
POST few documents
POST my_index/_doc/1
{
"email":"andrew#gmail.com"
}
POST my_index/_doc/2
{
"email":"michel#gmail.com"
}
Search Query
GET my_index/_search
{
"query": {
"multi_match": {
"query": "andrew#gmail.com michel#gmail.com",
"fields": ["email"]
}
}
}
Results
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 0.6931472,
"hits" : [
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.6931472,
"_source" : {
"email" : "andrew#gmail.com"
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "2",
"_score" : 0.6931472,
"_source" : {
"email" : "michel#gmail.com"
}
}
]
}
Another option is to use keyword type.
Search Query
GET my_index/_search
{
"query": {
"terms": {
"email.keyword": [
"andrew#gmail.com",
"michel#gmail.com"
]
}
}
}
In my opinion using the uax_url_email tokenizer is a better solution.
Hope this helps
I got elasticsearch version 7.3 and two indexes, profiles and purchases,
here is their mappings:
\purchases
{
"purchases": {
"mappings": {
"properties": {
"product": {
"type": "keyword"
},
"profile": {
"type": "join",
"eager_global_ordinals": true,
"relations": {
"profiles": "purchases"
}
}
}
}
}
}
\profiles
{
"profiles": {
"mappings": {
"properties": {
"user": {
"type": "keyword"
}
}
}
}
}
I added one profile with user:abc, _id:1 and two purchases this way
{
"profile": {"name": "profiles", "parent": "1"},
"product" : "tomato",
}
{
"profile": {"name": "profiles", "parent": "1"},
"product" : "tomato 2",
}
Then I do search query for purchases
{
"query": {
"has_parent": {
"parent_type": "profiles",
"query": {
"query_string": {
"query": "user:abc"
}
}
}
}
}
And I get empty result, what is wrong?
As stated in the documentation of the Join datatype you can not create parent-child-relationships over multiple indices:
The join datatype is a special field that creates parent/child relation within documents of the same index.
If you would like to use the join datatype, you have to model it in one index.
UPDATE
This is how your mapping and the Indexing of the documents would look like:
PUT profiles-purchases-index
{
"mappings": {
"properties": {
"user":{
"type": "keyword"
},
"product":{
"type": "keyword"
},
"profile":{
"type": "join",
"relations":{
"profiles": "purchases"
}
}
}
}
}
Index parent document:
PUT profiles-purchases-index/_doc/1
{
"user": "abc",
"profile": "profiles"
}
Index child documents:
PUT profiles-purchases-index/_doc/2?routing=1
{
"product": "tomato",
"profile":{
"name": "purchases",
"parent": 1
}
}
PUT profiles-purchases-index/_doc/3?routing=1
{
"product": "tomato 2",
"profile":{
"name": "purchases",
"parent": 1
}
}
Run Query:
GET profiles-purchases-index/_search
{
"query": {
"has_parent": {
"parent_type": "profiles",
"query": {
"match": {
"user": "abc"
}
}
}
}
}
Response:
{
...
"hits" : [
{
"_index" : "profiles-purchases-index",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_routing" : "1",
"_source" : {
"product" : "tomato",
"profile" : {
"name" : "purchases",
"parent" : 1
}
}
},
{
"_index" : "profiles-purchases-index",
"_type" : "_doc",
"_id" : "3",
"_score" : 1.0,
"_routing" : "1",
"_source" : {
"product" : "tomato 2",
"profile" : {
"name" : "purchases",
"parent" : 1
}
}
}
]
}
}
Notice that you have to set the routing parameter to index the child documents. But please refer to the documentation for that.