Elasticsearch: sort based on rating - elasticsearch

I have sample data similar to the document below stored in Elasticsearch 5.5. I can create the index and search with match_all, gte, etc. using Postman.
{
"name":"Apple",
"address": {
"city":"Cupertino",
"state":"CA",
"country":"USA"
},
"rating":"4.9"
}
I need to sort all the entities by rating, so I am using the query below:
{
"query":{
"match_all":{}
},
"sort" : [
{
"rating" : {
"order" : "desc"
}
}
]
}
But I see the error below in Postman:
"type": "illegal_argument_exception",
"reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [rating] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
Any suggestion?

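It seems you need to index rating as a number rather than as a string ("4.9") in order to sort on this field. Note that the type of an already mapped field cannot be changed in place, so the documents have to be reindexed into an index where rating is mapped as a numeric type: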
{
"name":"Apple",
"address": {
"city":"Cupertino",
"state":"CA",
"country":"USA"
},
"rating":4.9
}
Otherwise, you can enable fielddata on the existing text field using the PUT mapping API as follows (the ratings are then sorted as strings, not as numbers):
PUT my_index/_mapping/my_type
{
"properties": {
"rating": {
"type": "text",
"fielddata": true
}
}
}
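To illustrate why the numeric mapping matters, here is a minimal sketch in plain Python (no Elasticsearch client involved). The sample ratings and the index body are illustrative assumptions, not taken from the question; "my_type" simply mirrors the placeholder type name used above.

```python
import json

# Hypothetical ratings, stored as strings the way the question stores them.
ratings_as_text = ["4.9", "10", "3.5"]

# String ordering compares character by character, so "10" ends up last
# in a descending sort even though it is the highest rating.
lexicographic_desc = sorted(ratings_as_text, reverse=True)

# Numeric ordering is what "sort by rating" is normally expected to produce.
numeric_desc = sorted((float(r) for r in ratings_as_text), reverse=True)

# Assumed explicit mapping for a new index: rating as float instead of text.
index_body = {
    "mappings": {
        "my_type": {
            "properties": {
                "rating": {"type": "float"}
            }
        }
    }
}

# Same sort request as in the question; it works once rating is numeric.
search_body = {
    "query": {"match_all": {}},
    "sort": [{"rating": {"order": "desc"}}],
}

print(lexicographic_desc)
print(numeric_desc)
print(json.dumps(index_body))
```

With string ordering, a rating of "10" would be listed below "3.5", which is probably not what a rating sort should do.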

Related

Elastic Search Sort by text error: illegal_argument_exception

I have the below index configuration for Products in Elastic Search.
{
"settings": {
"index": {
"number_of_shards": 3,
"number_of_replicas": 0
}
},
"mappings": {
"properties": {
"name": {
"fields": {
"original": {
"type": "keyword"
}
},
"type": "text",
"fielddata": true,
"analyzer": "portuguese"
},
"product_data": {
"type": "object"
}
}
}
}
HERE I UPSERT THE DATA
http://127.0.0.1:9200/product2/_update/IMOB01
BODY
{
"doc": {
"name":"Test",
"product_data": {
"symbol": "IMOB01",
"release_date": "2013-01-01T00:00:00"
}
},
"doc_as_upsert": true
}
RESPONSE
201 created
The problem is that if I try _search and sort by name, everything is OK.
But if, for instance, I try to use _search with a sort by product_data.symbol, I receive the error below:
localhost:9200/product/_search
BODY
{
"from": 0,
"size": 50,
"query": {
"bool": {
"must": {
"match": {
"product_data.symbol": "imob01"
}
}
}
},
"sort": [
{ "product_data.symbol": {"order": "asc", "unmapped_type" : "text"}}
]
}
RESPONSE
{
"error": {
"root_cause": [
{
"type": "illegal_argument_exception",
"reason": "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [product_data.symbol] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
}
],
"type": "search_phase_execution_exception",
"reason": "all shards failed",
"phase": "query",
"grouped": true,
"failed_shards": [
{
"shard": 0,
"index": "product",
"node": "PfwDOAwzTZm63IHq_rt1TA",
"reason": {
"type": "illegal_argument_exception",
"reason": "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [product_data.symbol] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
}
}
],
"caused_by": {
"type": "illegal_argument_exception",
"reason": "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [product_data.symbol] in order to load field data by uninverting the inverted index. Note that this can use significant memory.",
"caused_by": {
"type": "illegal_argument_exception",
"reason": "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [product_data.symbol] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
}
}
},
"status": 400
}
What am I doing wrong?
Note: I created product_data as an object because I have products with different kinds of data; they can have symbol or other fields, and symbol is just an example.
I tried setting fielddata: true on product_data, with no success.
Text fields are not optimised for operations that require per-document
field data like aggregations and sorting, so these operations are
disabled by default. Please use a keyword field instead.
Alternatively, set fielddata=true on [product_data.symbol] in order to
load field data by uninverting the inverted index. Note that this can
use significant memory.
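Assuming product_data.symbol was mapped dynamically (as text with a keyword sub-field), try sorting on the keyword sub-field:
"sort": [
{ "product_data.symbol.keyword": {"order": "asc", "unmapped_type" : "keyword"}}
]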
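As a rough sketch of how to pick the right field name, the hypothetical helper below walks the "properties" part of a mapping (as returned by GET <index>/_mapping) and returns a field that can be sorted on. The function name and the sample mapping are assumptions for illustration; the sample mirrors what dynamic mapping typically produces for a string value.

```python
def sortable_field(properties, path):
    """Return a field name that is safe to sort on, or None if none is found.

    `properties` is the "properties" dict of an index mapping and `path`
    is a dotted field path such as "product_data.symbol".
    """
    node = None
    for part in path.split("."):
        node = properties.get(part)
        if node is None:
            return None  # the field is not mapped at all
        properties = node.get("properties", {})

    field_type = node.get("type")
    if field_type in (None, "object", "nested"):
        return None  # containers cannot be sorted on directly
    if field_type != "text":
        return path  # keyword, numeric and date fields sort directly

    # For text fields, look for a keyword sub-field (multi-field).
    for name, sub in node.get("fields", {}).items():
        if sub.get("type") == "keyword":
            return f"{path}.{name}"
    return None


# Assumed mapping for illustration only.
mapping = {
    "name": {"type": "text", "fields": {"original": {"type": "keyword"}}},
    "product_data": {
        "properties": {
            "symbol": {
                "type": "text",
                "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
            }
        }
    },
}

print(sortable_field(mapping, "product_data.symbol"))
print(sortable_field(mapping, "name"))
```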

Merging fields in Elastic Search

I am pretty new to Elasticsearch. I have a dataset with multiple fields such as name, product_info, description, etc. When searching for a document, the search term can come from any of these fields (let us call them the "search core fields").
When I start storing the data in Elasticsearch, should I derive a field that concatenates all the "search core fields", and then index this field alone?
I came across the _all mapping concept and am a little confused. Does it do the same thing?
No, you don't need to create a new field with the concatenated terms.
You can just use _all with a match query to search for text in any field.
About _all: yes, it searches the text of every field.
The _all field was deprecated in ES 6.0 (it can no longer be enabled for indices created in 6.0 or later) and removed in ES 7, so it only works for indices created in earlier versions. The main reason for this is that it used too much storage space.
However, you can define your own "all" field using the copy_to feature. You specify in your mapping which fields should be copied to your custom "all" field, and then you can search on that field.
You can define your mapping like this:
PUT my-index
{
"mappings": {
"properties": {
"name": {
"type": "text",
"copy_to": "custom_all"
},
"product_info": {
"type": "text",
"copy_to": "custom_all"
},
"description": {
"type": "text",
"copy_to": "custom_all"
},
"custom_all": {
"type": "text"
}
}
}
}
PUT my-index/_doc/1
{
"name": "XYZ",
"product_info": "ABC product",
"description": "this product does blablabla"
}
And then you can search on your "all" field like this:
POST my-index/_search
{
"query": {
"match": {
"custom_all": {
"query": "ABC",
"operator": "and"
}
}
}
}
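To make the behaviour more concrete, here is a small in-memory emulation of copy_to combined with "operator": "and". It is only a sketch: the tokenizer is a crude stand-in for a real analyzer (lowercase plus whitespace split), and all function names are hypothetical.

```python
def tokenize(text):
    # Crude stand-in for an analyzer: lowercase and split on whitespace.
    return text.lower().split()


def build_custom_all(doc, copied_fields):
    # copy_to indexes the terms of each source field into the target field.
    terms = set()
    for field in copied_fields:
        terms.update(tokenize(doc.get(field, "")))
    return terms


def match_and(custom_all_terms, query):
    # "operator": "and" requires every query term to be present.
    return all(term in custom_all_terms for term in tokenize(query))


doc = {
    "name": "XYZ",
    "product_info": "ABC product",
    "description": "this product does blablabla",
}
custom_all = build_custom_all(doc, ["name", "product_info", "description"])

print(match_and(custom_all, "ABC"))          # term from product_info
print(match_and(custom_all, "xyz product"))  # terms from two different fields
print(match_and(custom_all, "ABC missing"))  # one term is absent
```

Because the terms of all copied fields end up in the same field, a query whose words are spread across name and product_info can still match with the "and" operator.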

Elastic Search Date Range Query

I am new to Elasticsearch and I am struggling with a date range query. I have to query the records that fall between particular dates. The JSON records pushed into the Elasticsearch database are as follows:
"messageid": "Some message id",
"subject": "subject",
"emaildate": "2020-01-01 21:09:24",
"starttime": "2020-01-02 12:30:00",
"endtime": "2020-01-02 13:00:00",
"meetinglocation": "some location",
"duration": "00:30:00",
"employeename": "Name",
"emailid": "abc@xyz.com",
"employeecode": "141479",
"username": "username",
"organizer": "Some name",
"organizer_email": "cde@xyz.com",
I have to query the records whose start time is between "2020-01-02 12:30:00" and "2020-01-10 12:30:00". I have written a query like this:
{
"query":
{
"bool":
{
"filter": [
{
"range" : {
"starttime": {
"gte": "2020-01-02 12:30:00",
"lte": "2020-01-10 12:30:00"
}
}
}
]
}
}
}
This query is not giving results as expected. I assume that the person who has pushed the data into elastic search database at my office has not set the mapping and Elastic Search is dynamically deciding the data type of "starttime" as "text". Hence I am getting inconsistent results.
I can set the mapping like this :
PUT /meetings
{
"mappings": {
"dynamic": false,
"properties": {
.
.
.
.
"starttime": {
"type": "date",
"format":"yyyy-MM-dd HH:mm:ss"
}
.
.
.
}
}
}
And the query will work, but I am not allowed to do so (office policies). What alternatives do I have to achieve my task?
Update:
I assumed the data type to be "text", but by default Elasticsearch maps strings as both "text" and "keyword" so that both full-text and keyword-based searches are possible. If the field is also mapped as "keyword", will this benefit me in any way? I do not have access to a lot of things at the office, which is why I am unable to debug the query. I only have the search API, for which I have to build the query.
GET /meetings/_mapping output :
'
'
'
"starttime" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
'
'
'
Date range queries will not work as expected on a text field, because the indexed terms are compared as strings rather than as dates; for that, you have to use a date field.
Since you are working with dates, the best practice is to map the field as date.
I would suggest reindexing your index into another index so that you can change the type of the field from text to date.
Step 1: Create index2 using the index1 mapping, making sure to change the type of your date field from text to date.
Step 2: Run the Elasticsearch reindex API to copy all your data from index1 to index2. Because the field is now mapped as date, Elasticsearch will index it as a date.
POST _reindex
{
"source":{ "index": "index1" },
"dest": { "index": "index2" }
}
Now you can run your normal date queries on index2.
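The two steps can be sketched as request bodies built in Python. This is an illustration only: the index1 properties shown here are an assumed subset of the real mapping, the helper name is hypothetical, and the sketch only constructs the bodies that would be sent with PUT index2 and POST _reindex.

```python
import copy


def with_date_field(properties, field, date_format):
    # Copy the source mapping and remap one field as a date.
    new_props = copy.deepcopy(properties)
    new_props[field] = {"type": "date", "format": date_format}
    return new_props


# Assumed subset of the index1 mapping (as returned by GET index1/_mapping).
index1_properties = {
    "subject": {"type": "text"},
    "starttime": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    },
}

# Step 1: body for creating index2 with starttime mapped as date.
index2_body = {
    "mappings": {
        "properties": with_date_field(
            index1_properties, "starttime", "yyyy-MM-dd HH:mm:ss"
        )
    }
}

# Step 2: body for POST _reindex.
reindex_body = {"source": {"index": "index1"}, "dest": {"index": "index2"}}

print(index2_body["mappings"]["properties"]["starttime"])
```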
As @jzzfs suggested, the idea is to add a date sub-field to the starttime field. You first need to modify the mapping like this:
PUT meetings/_mapping
{
"properties": {
"starttime" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
},
"date": {
"type" : "date",
"format" : "yyyy-MM-dd HH:mm:ss"
}
}
}
}
}
When done, you need to reindex your data using the update by query API so that the starttime.date sub-field gets populated and indexed:
POST meetings/_update_by_query
When the update is done, you'll be able to leverage the starttime.date sub-field in your query:
{
"query": {
"bool": {
"filter": [
{
"range": {
"starttime.date": {
"gte": "2020-01-02 12:30:00",
"lte": "2020-01-10 12:30:00"
}
}
}
]
}
}
}
There are ways of parsing text fields as dates at search time but the overhead is impractical... You could, however, keep the starttime as text by default but make it a multi-field and query it using starttime.as_date, for example.
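If the mapping really cannot be touched, one option worth testing (an assumption on my part, not something confirmed in this thread) is a range query on the existing starttime.keyword sub-field. Keyword ranges compare strings, which happens to agree with chronological order only because "yyyy-MM-dd HH:mm:ss" is fixed-width and zero-padded. The sketch below checks that property on made-up sample values and builds the query body; the helper name is hypothetical.

```python
from datetime import datetime

FORMAT = "%Y-%m-%d %H:%M:%S"

# Made-up sample values in the same format as the starttime field.
samples = [
    "2020-01-10 12:30:00",
    "2020-01-02 12:30:00",
    "2019-12-31 23:59:59",
    "2020-01-02 13:00:00",
]

# For this fixed-width, zero-padded format, string order equals date order.
by_string = sorted(samples)
by_date = sorted(samples, key=lambda s: datetime.strptime(s, FORMAT))


def keyword_range(field, gte, lte):
    # Range filter on the keyword sub-field: a string comparison, not a date one.
    return {
        "query": {
            "bool": {
                "filter": [
                    {"range": {f"{field}.keyword": {"gte": gte, "lte": lte}}}
                ]
            }
        }
    }


body = keyword_range("starttime", "2020-01-02 12:30:00", "2020-01-10 12:30:00")

# The same string comparison applied locally to the samples.
in_range = [s for s in samples if "2020-01-02 12:30:00" <= s <= "2020-01-10 12:30:00"]

print(by_string == by_date)
print(in_range)
```

This breaks as soon as a value deviates from the format (missing zero padding, a different separator, a time zone suffix), and keyword ranges are usually slower than ranges on a date field, so it is a workaround rather than a replacement for a proper date mapping.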

Full Text Search as well as Terms Search on same field of Elasticsearch

I come from a MySQL background, so I don't know much about Elasticsearch and how it works.
Here are my requirements:
There will be a table of result records with a sorting option on every column. There will be a filter option where the user can select multiple values for multiple columns (e.g., City should be one of City1, City2, City3 and Category should be one of Cat2, Cat22, Cat6). There will also be a search bar where the user enters some text, and full-text search will be applied to some fields (e.g., City, Area, etc.).
Where I'm facing a problem is full-text search. I have tried several mappings, but every time I have to compromise on either full-text search or terms search, so I think there is no way to apply both kinds of search to the same field. But as I said, I don't know much about Elasticsearch, so if anyone has a solution, it would be appreciated.
Here is what I have applied currently; it enables sorting and terms search, but full-text search does not work.
{
"mappings":{
"my_type":{
"properties":{
"city":{
"type":"string",
"index":"not_analyzed"
},
"category":{
"type":"string",
"index":"not_analyzed"
},
"area":{
"type":"string",
"index":"not_analyzed"
},
"zip":{
"type":"string",
"index":"not_analyzed"
},
"state":{
"type":"string",
"index":"not_analyzed"
}
}
}
}
}
You can update the mapping to use multi-fields with two mappings, one for full-text search and another for terms search. Here's a sample mapping for city.
{
"city": {
"type": "string",
"index": "not_analyzed",
"fields": {
"fulltext": {
"type": "string"
}
}
}
}
The default mapping is for terms search, so when a terms search is required, you can simply query the "city" field. When you need full-text search, the query must be performed on "city.fulltext". Hope this helps.
Full-text search won't work on not_analyzed fields and sorting won't work on analyzed fields.
You need to use multi-fields.
It is often useful to index the same field in different ways for different purposes. This is the purpose of multi-fields. For instance, a string field could be mapped as a text field for full-text search, and as a keyword field for sorting or aggregations:
For example:
{
"mappings": {
"my_type": {
"properties": {
"city": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
} ...
}
}
}
}
Use the dot notation to sort by city.raw :
{
"query": {
"match": {
"city": "york"
}
},
"sort": {
"city.raw": "asc"
}
}
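Putting the three requirements from the question together (full-text search bar, multi-value filters, sorting), a request body could be assembled roughly as follows. This is a sketch under the assumption that every field is mapped like city above, i.e. text with a keyword sub-field named raw; the function name and the sample values are hypothetical.

```python
def build_search(text, text_fields, filters, sort_field, order="asc"):
    # Full-text part: analyzed match across the text fields.
    must = []
    if text:
        must.append({"multi_match": {"query": text, "fields": text_fields}})

    # Exact-value part: terms filters run against the keyword sub-fields.
    filter_clauses = [
        {"terms": {f"{field}.raw": values}} for field, values in filters.items()
    ]

    return {
        "query": {"bool": {"must": must, "filter": filter_clauses}},
        # Sorting also uses the keyword sub-field, never the text field.
        "sort": [{f"{sort_field}.raw": {"order": order}}],
    }


body = build_search(
    text="york",
    text_fields=["city", "area"],
    filters={"city": ["City1", "City2", "City3"], "category": ["Cat2", "Cat22", "Cat6"]},
    sort_field="city",
)
print(body)
```

The design choice is that the analyzed field (city) is only used for the search bar, while filters and sorting always go through the raw sub-field.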

Mapping in elasticsearch

Good morning. In my code I can't search for data that contains several separate words; if I search for a single word, everything works. I think the problem is in the mapping. I use Postman. When I send a GET request to http://192.168.1.153:9200/sport_scouts/video/_mapping I get:
{
"sport_scouts": {
"mappings": {
"video": {
"properties": {
"hashtag": {
"type": "string"
},
"id": {
"type": "long"
},
"sharing_link": {
"type": "string"
},
"source": {
"type": "string"
},
"title": {
"type": "string"
},
"type": {
"type": "string"
},
"user_id": {
"type": "long"
},
"video_preview": {
"type": "string"
}
}
}
}
}
}
This looks fine: title has type string. But if I search for two or more words, I get an empty array. My code in the trait:
public function search($data) {
$this->client();
$params['body']['query']['filtered']['filter']['or'][]['term']['title'] = $data;
$search = $this->client->search($params)['hits']['hits'];
dump($search);
}
Then I call it in my Controller. Can you help me with this problem?
Your indexed data can't be found because of a mismatch between the analysis applied during indexing and the strict term filter used when querying the data.
With your mapping configuration, you are using the default analyzer, which (besides many other operations) tokenizes the text. So every multi-word value you insert is split at punctuation and whitespace. If you insert, for example, "some great sentence", Elasticsearch maps the following terms to your document: "some", "great", "sentence", but not the term "great sentence". So if you use a term filter on "great sentence" or any other part of the original value that contains whitespace, you will not get any results.
Please see the Elasticsearch docs on how to configure your mapping for indexing without analysis (https://www.elastic.co/guide/en/elasticsearch/guide/current/mapping-intro.html#_index_2), or consider using a match query instead of a term filter on the existing mapping (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-query.html).
Please be aware that if you switch to not_analyzed you will lose much of the fuzzy full-text query functionality. Of course, you can set up a mapping that does both, analyzed and not_analyzed, in different fields. Then it's up to you to decide which field to query.
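The mismatch can be reproduced with a tiny in-memory inverted index. This is only an emulation: the regular expression below is a rough stand-in for the default analyzer, and the function names are hypothetical.

```python
import re


def analyze(text):
    # Rough stand-in for the default analyzer: lowercase, split into words.
    return re.findall(r"\w+", text.lower())


index = {}  # term -> set of document ids


def index_doc(doc_id, title):
    # Indexing stores each analyzed token as a separate term.
    for token in analyze(title):
        index.setdefault(token, set()).add(doc_id)


def term_lookup(value):
    # A term filter looks up the value exactly as given, without analysis.
    return index.get(value, set())


def match_lookup(value):
    # A match query analyzes the input first, then looks up each token
    # (default "or" behaviour: any token may match).
    ids = set()
    for token in analyze(value):
        ids |= index.get(token, set())
    return ids


index_doc(1, "some great sentence")

print(term_lookup("great"))            # single token exists as a term
print(term_lookup("great sentence"))   # no such term was ever indexed
print(match_lookup("great sentence"))  # both tokens are looked up
```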
