Elasticsearch Field Preference for result sequence - elasticsearch

I have created the index in elasticsearch with the following mapping:
{
"test": {
"mappings": {
"documents": {
"properties": {
"fields": {
"type": "nested",
"properties": {
"uid": {
"type": "keyword"
},
"value": {
"type": "text",
"copy_to": [
"fulltext"
]
}
}
},
"fulltext": {
"type": "text"
},
"tags": {
"type": "text"
},
"title": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
},
"url": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
}
}
}
}
}
}
While searching I want to set the preference of fields for example if search text found in title or url then that document comes first then other documents.
Can we set a field preference for search result sequence(in my case preference like title,url,tags,fields)?
Please help me into this?

This is called "boosting" . Prior to elasticsearch 5.0.0 - boosting could be applied in indexing phase or query phase( added as part of field mapping ). This feature is deprecated now and all mappings after 5.0 are applied in query time .
Current recommendation is to to use query time boosting.
Please read this documents to get details on how to use boosting:
1 - https://www.elastic.co/guide/en/elasticsearch/guide/current/_boosting_query_clauses.html
2 - https://www.elastic.co/guide/en/elasticsearch/guide/current/_boosting_query_clauses.html

Related

Elasticsearch: Schema without mapping?

According to Elasticsearch's roadmap, mapping types are going to be completely removed at 7.x
How are we going to give a schema structure to Documents without mapping?
For example how would we replace this (A Doc/mapping_type with 3 fields of specific data type):
PUT twitter
{
"mappings": {
"user": {
"properties": {
"name": { "type": "text" },
"user_name": { "type": "keyword" },
"email": { "type": "keyword" }
}
}
}
They are going to remove types (user in you example) from mapping, because there is only 1 type per index now, the rest will be the same:
PUT twitter
{
"mappings": {
"_doc": {
"properties": {
"name": { "type": "text" },
"user_name": { "type": "keyword" },
"email": { "type": "keyword" }
}
}
}
}
As you can see, there is no user type anymore.

elasticsearch query child list containing specific value

I writing a query to return the products that has a specific promotionCode. In my index, product has following property indexed
"offers": [
{
"promotionCode": "MV"
},
{
"promotionCode": "LI"
},
.....
]
My initial thought the following would be the answer to
GET alias-live-dev/_search
{
"query": {
"match": {
"offers.promotionCode":"MV"
}
}
}
However, this always return 0 hit, I am guessing, it failed because offers is a list. Could anyone please advise what would the right query for this scenario. Thanks in advance.
In mapping,
"productId": {
"type": "keyword"
},
"offers": {
"type": "nested",
"properties": {
......
"promotionCode": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},

Lucene search using Kibana does return my results

Using Kibana, I have created the following index:
put newsindex
{
"settings" : {
"number_of_shards":3,
"number_of_replicas":2
},
"mappings" : {
"news": {
"properties": {
"NewsID": {
"type": "integer"
},
"NewsType": {
"type": "text"
},
"BodyText": {
"type": "text"
},
"Caption": {
"type": "text"
},
"HeadLine": {
"type": "text"
},
"Approved": {
"type": "text"
},
"Author": {
"type": "text"
},
"Contact": {
"type": "text"
},
"DateCreated": {
"type": "date",
"format": "date_time"
},
"DateSubmitted": {
"type": "date",
"format": "date_time"
},
"LastModifiedDate": {
"type": "date",
"format": "date_time"
}
}
}
}
}
I have populated the index with Logstash. If I just perform a match_all query, all my records are returned as you'd expect. However, when I try to perform a targeted query such as:
get newsindex/_search
{
"query":{"match": {"headline": "construct abnomolies"}
}
}
I can see headline as a property of _source, but my query is ignored i.e. I still receive everything, regardless of whats in the headline. How do I need to change my index to make headline searchable. I'm using Elasticsearch 5.6.3
I needed to change the name property on my index to be lowercase. I noticed in the output windows the the properties under _source where lowercase. In Kibana the predictive text was offering my notation and lowercase. I've dropped my index and re-populated and it now works.

Elasticsearch: Merge result of aggregation by bucket key

I've indexed entities in Elasticsearch, which occur in my documents. The mapping for the entities looks like the following:
"Entities": {
"properties": {
"EntFrequency": {
"type": "long"
},
"EntId": {
"type": "long"
},
"EntType": {
"type": "string",
"analyzer": "english",
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed"
}
}
},
"Entname": {
"type": "string",
"analyzer": "english",
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
},
[...]
Furthermore, I use this aggregation query to determine the most-occurring entities:
GET cable/document/_search
{
"size" :0,
"query": {
"match_all": {}
},
"aggs" : {
"entities_agg" : {
"terms" : {
"field" : "Entities.EntId"
}
}
}
}
}
Response
"buckets": [
{
"key": 323644,
"doc_count": 231038
},
[...]
However, some of those entity mentions refer to the same entity e.g. "USA" and "United States" and I do know their ids. How do I merge the buckets and the counts of these duplicates in ES?
I cannot use a client-side solution since there are too many entities and retrieving all of them and merging would be probably too slow for my application. The knowledge about duplicates is acquired through runtime. Thus, I cannot use this knowledge for the initial creation of my ES index.
Thanks for your help and comments!

Update and search in multi field properties in ElasticSearch

I'm trying to use multi field properties for multi language support. I created following mapping for this:
{
"mappings": {
"product": {
"properties": {
"prod-id": {
"type": "string"
},
"prod-name": {
"type": "string",
"fields": {
"en": {
"type": "string",
"analyzer": "english"
},
"fr": {
"type": "string",
"analyzer": "french"
}
}
}
}
}
}
}
I created test record:
{
"prod-id": "1234567",
"prod-name": [
"Test product",
"Produit d'essai"
]
}
and tried to query using some language:
{
"query": {
"bool": {
"must": [
{"match": {
"prod-name.en": "Produit"
}}
]
}
}
}
As a result I got my document. But I expected that I will have empty result when I use French but choose English. It seems ElasticSearch ignores which field I specified in query. There is no difference in search result when I use "prod-name.en" or "prod-name.fr" or just "prod-name". Is this behaviour expected? Should I do some special things to have searching just in one language?
Another problem with updating multi field property. I can't update just one field.
{
"doc" : {
"prod-name.en": "Test"
}
}
I got following error:
{
"error": {
"root_cause": [
{
"type": "mapper_parsing_exception",
"reason": "Field name [prod-name.en] cannot contain '.'"
}
],
"type": "mapper_parsing_exception",
"reason": "Field name [prod-name.en] cannot contain '.'"
},
"status": 400
}
Is there any way to update just one field in multi field property?
In your mapping, the prod-name.en field will simply be analyzed using the english analyzer and the same for the french field. However, ES will not choose for you which value to put in which field.
Instead, you need to modify your mapping like this
{
"mappings": {
"product": {
"properties": {
"prod-id": {
"type": "string"
},
"prod-name": {
"type": "object",
"properties": {
"en": {
"type": "string",
"analyzer": "english"
},
"fr": {
"type": "string",
"analyzer": "french"
}
}
}
}
}
}
}
and input document to be like this and you'll get the results you expect.
{
"prod-id": "1234567",
"prod-name": {
"en": "Test product",
"fr": "Produit d'essai"
}
}
As for the updating part, your partial document should be like this instead.
{
"doc" : {
"prod-name": {
"en": "Test"
}
}
}

Resources