I'm trying to sort by location. Similar data and tests work correctly in Elasticsearch but fail in Elastic App Search (latest 8.3 version). The results should be in the order "Item-2, Item-3, Item-1", but instead they come back in the order "Item-1, Item-2, Item-3".
Request body:
{
"query": "",
"sort": {
"location": {
"center": [
0,
14
],
"order": "asc"
}
},
"page": {
"size": 10,
"current": 1
}
}
Response body:
{
"meta": {
"alerts": [],
"warnings": [],
"precision": 2,
"engine": {
"name": "test-core-item",
"type": "default"
},
"page": {
"current": 1,
"total_pages": 1,
"total_results": 6,
"size": 10
},
"request_id": "c8f5aaaa71d9f152f203f5effd995031"
},
"results": [
{
"location": {
"raw": "0.0,0.0"
},
"_meta": {
"id": "Item-1",
"engine": "test-core-item",
"score": null
},
"id": {
"raw": "Item-1"
}
},
{
"location": {
"raw": "0.0,10.0"
},
"_meta": {
"id": "Item-2",
"engine": "test-core-item",
"score": null
},
"id": {
"raw": "Item-2"
}
},
{
"location": {
"raw": "0.0,20.0"
},
"_meta": {
"id": "Item-3",
"engine": "test-core-item",
"score": null
},
"id": {
"raw": "Item-3"
}
}
]
}
According to the docs, raw geolocation data can be represented in different ways, e.g. as a string of "latitude, longitude" or as an array with elements [longitude, latitude] (see docs). Notice that when passing a string, latitude comes first, but when passing an array, longitude comes first.
Mind the difference in the order of coordinates and pass either
"center": [14, 0] or
"center": "0, 14"
So in the example above, the center should have been passed as [14, 0] instead of [0, 14].
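To make the coordinate-order difference concrete, here is a minimal Python sketch that sorts the three items from the question by great-circle distance. It only illustrates the ordering; it is not how App Search computes distances internally, and the helper names (haversine_km, parse_string, parse_array, sort_by_distance) are made up for this example.

```python
from math import asin, cos, radians, sin, sqrt

# Raw values from the question, stored as "latitude,longitude" strings.
ITEMS = {"Item-1": "0.0,0.0", "Item-2": "0.0,10.0", "Item-3": "0.0,20.0"}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def parse_string(raw):
    """String form: latitude first, then longitude."""
    lat, lon = (float(part) for part in raw.split(","))
    return lat, lon


def parse_array(center):
    """Array form: longitude first, then latitude."""
    lon, lat = center
    return lat, lon


def sort_by_distance(center_lat, center_lon):
    """Return item ids ordered by ascending distance from the center."""
    return sorted(
        ITEMS,
        key=lambda item: haversine_km(center_lat, center_lon, *parse_string(ITEMS[item])),
    )


# [0, 14] as an array means longitude 0, latitude 14 (the order seen in the question).
print(sort_by_distance(*parse_array([0, 14])))
# [14, 0] as an array means longitude 14, latitude 0 (the intended center).
print(sort_by_distance(*parse_array([14, 0])))
```

With the array [0, 14] the sketch reproduces the unexpected order from the question, and with [14, 0] or the string "0, 14" it produces the expected one.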
I only want to return this course if Grade = 'G6' and Type = 'OPEN' match in the SAME audience; both must exist in the same audience entry for the course to be returned. Currently the course is returned if G6 and OPEN are found in DIFFERENT audiences, which is not what I want.
This is not correct and I am getting incorrect data back. I need the query to apply to each audience and only return data if both conditions are true in the same audience.
Here is my JSON:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 71,
"max_score": 3.3118114,
"hits": [
{
"_index": "courses",
"_type": "course",
"_id": "LBTBWdzyRw-jgiiYssjv8A",
"_score": 3.3118114,
"_source": {
"id": "LBTBWdzyRw-jgiiYssjv8A",
"title": "1503 regression testing",
"shortDescription": "asdf",
"description": "asdf",
"learningOutcomes": "",
"modules": [],
"learningProvider": {
"id": "ig2-zIY_QkSpMC4O0Lm0hw",
"name": null,
"termsAndConditions": [],
"cancellationPolicies": []
},
"audiences": [
{
"id": "VfDpsS_5SXi8iZubzTkUBQ",
"name": "comm",
"areasOfWork": [
"Communications"
],
"departments": [],
"grades": [
"G6"
],
"interests": [],
"requiredBy": null,
"frequency": null,
"type": "OPEN",
"eventId": null
},
{
"id": "eZPPPqTqRdiDAE3xCPlJMQ",
"name": "analysis",
"areasOfWork": [
"Analysis"
],
"departments": [],
"grades": [
"G6"
],
"interests": [],
"requiredBy": null,
"frequency": null,
"type": "REQUIRED",
"eventId": null
}
],
"preparation": "",
"owner": {
"scope": "LOCAL",
"organisationalUnit": "co",
"profession": 63,
"supplier": ""
},
"visibility": "PUBLIC",
"status": "Published",
"topicId": ""
}
}
]
}
}
My ES Code:
BoolQueryBuilder boolQuery = boolQuery();
boolQuery.should(QueryBuilders.matchQuery("audiences.departments.keyword", department));
boolQuery.should(QueryBuilders.matchQuery("audiences.areasOfWork.keyword", areaOfWork));
boolQuery.should(QueryBuilders.matchQuery("audiences.interests.keyword", interest));
BoolQueryBuilder filterQuery = boolQuery();
filterQuery.must(QueryBuilders.matchQuery("audiences.grades.keyword", "G6"));
filterQuery.must(QueryBuilders.matchQuery("audiences.type", "OPEN"));
Here is index mapping:
{
"media": {
"aliases": {}
},
"courses": {
"aliases": {}
},
"feedback": {
"aliases": {}
},
"learning-providers": {
"aliases": {}
},
"resources": {
"aliases": {}
},
"courses-0.4.0": {
"aliases": {}
},
".security-6": {
"aliases": {
".security": {}
}
},
"payments": {
"aliases": {}
}
}
Since you want your query to apply to each audience and only return data if it is true within the same audience, you need to specify the nested datatype for the audiences field. Otherwise Elasticsearch stores it as a plain object field, which has no concept of inner objects: Elasticsearch flattens object hierarchies into a simple list of field names and values. You can refer to this for more detail: https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html
Taking your example, suppose this was your document:
"audiences": [
{
"id": "1",
"field": "comm"
},
{
"id": "2",
"field": "arts"
}
]
Elasticsearch flattens it into this form:
{
"audiences.id":[1,2],
"audiences.field":[comm,arts]
}
Now if your search query says that an audience must have id:1 and field:arts, the above document will still match, even though no single audience has both values.
To avoid this, such objects should be defined as nested objects. Elasticsearch will then store each object separately instead of flattening it, so each object is searched separately.
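The difference between the two storage models can be sketched in a few lines of Python. This is only a simulation of the matching semantics described above, not Elasticsearch code; the function names (flatten, matches_flattened, matches_nested) are hypothetical.

```python
# The two-audience example document from above.
audiences = [
    {"id": "1", "field": "comm"},
    {"id": "2", "field": "arts"},
]


def flatten(objects, prefix="audiences"):
    """Mimic object-field flattening: one list of values per field name."""
    flat = {}
    for obj in objects:
        for key, value in obj.items():
            flat.setdefault(f"{prefix}.{key}", []).append(value)
    return flat


def matches_flattened(objects, **conditions):
    """Each condition is checked against the flattened lists independently."""
    flat = flatten(objects)
    return all(
        value in flat.get(f"audiences.{key}", [])
        for key, value in conditions.items()
    )


def matches_nested(objects, **conditions):
    """All conditions must hold inside one and the same object."""
    return any(
        all(obj.get(key) == value for key, value in conditions.items())
        for obj in objects
    )


print(flatten(audiences))
print(matches_flattened(audiences, id="1", field="arts"))  # cross-object match
print(matches_nested(audiences, id="1", field="arts"))     # no single object matches
```

The flattened check accepts id 1 together with field arts because the association between the two values is lost, while the nested check rejects it.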
The mapping for your document above should be:
Mapping
{
"mappings": {
"properties": {
"shortDescription": {
"type": "text"
},
"audiences": {
"type": "nested"
},
"description": {
"type": "text"
},
"modules": {
"type": "text"
},
"preparation": {
"type": "text"
},
"owner": {
"properties": {
"scope": {
"type": "text"
},
"organisationalUnit": {
"type": "text"
},
"profession": {
"type": "text"
},
"supplier": {
"type": "text"
}
}
},
"learningProvider": {
"properties": {
"id": {
"type": "text"
},
"name": {
"type": "text"
},
"termsAndConditions": {
"type": "text"
},
"cancellationPolicies": {
"type": "text"
}
}
},
"visibility": {
"type": "text"
},
"status": {
"type": "text"
},
"topicId": {
"type": "text"
}
}
}
}
Now, if we index this document :
Document
{
"shortDescription": "asdf",
"description": "asdf",
"learningOutcomes": "",
"modules": [],
"learningProvider": {
"id": "ig2-zIY_QkSpMC4O0Lm0hw",
"name": null,
"termsAndConditions": [],
"cancellationPolicies": []
},
"audiences": [
{
"id": "VfDpsS_5SXi8iZubzTkUBQ",
"name": "comm",
"areasOfWork": [
"Communications"
],
"departments": [],
"grades": [
"G6"
],
"interests": [],
"requiredBy": null,
"frequency": null,
"type": "OPEN",
"eventId": null
},
{
"id": "eZPPPqTqRdiDAE3xCPlJMQ",
"name": "analysis",
"areasOfWork": [
"Analysis"
],
"departments": [],
"grades": [
"G7"
],
"interests": [],
"requiredBy": null,
"frequency": null,
"type": "REQUIRED",
"eventId": null
}
],
"preparation": "",
"owner": {
"scope": "LOCAL",
"organisationalUnit": "co",
"profession": 63,
"supplier": ""
},
"visibility": "PUBLIC",
"status": "Published",
"topicId": ""
}
If your search query is this:
Search Query 1:
{
"query": {
"nested": {
"path": "audiences",
"query": {
"bool": {
"must": [
{
"match": {
"audiences.type.keyword": "OPEN"
}
},
{
"match": {
"audiences.grades.keyword": "G6"
}
}
]
}
}
}
}
}
Result
"hits": [
{
"_index": "product",
"_type": "_doc",
"_id": "1",
"_score": 0.9343092,
"_source": {
"shortDescription": "asdf",
"description": "asdf",
"learningOutcomes": "",
"modules": [],
"learningProvider": {
"id": "ig2-zIY_QkSpMC4O0Lm0hw",
"name": null,
"termsAndConditions": [],
"cancellationPolicies": []
},
"audiences": [
{
"id": "VfDpsS_5SXi8iZubzTkUBQ",
"name": "comm",
"areasOfWork": [
"Communications"
],
"departments": [],
"grades": [
"G6"
],
"interests": [],
"requiredBy": null,
"frequency": null,
"type": "OPEN",
"eventId": null
},
{
"id": "eZPPPqTqRdiDAE3xCPlJMQ",
"name": "analysis",
"areasOfWork": [
"Analysis"
],
"departments": [],
"grades": [
"G7"
],
"interests": [],
"requiredBy": null,
"frequency": null,
"type": "REQUIRED",
"eventId": null
}
],
"preparation": "",
"owner": {
"scope": "LOCAL",
"organisationalUnit": "co",
"profession": 63,
"supplier": ""
},
"visibility": "PUBLIC",
"status": "Published",
"topicId": ""
}
}
]
But now if your search query is :
Search Query 2 :
{
"query": {
"nested": {
"path": "audiences",
"query": {
"bool": {
"must": [
{
"match": {
"audiences.type.keyword": "OPEN"
}
},
{
"match": {
"audiences.grades.keyword": "G7"
}
}
]
}
}
}
}
}
Result :
"hits": {
"total": {
"value": 0,
"relation": "eq"
},
"max_score": null,
"hits": []
}
So, in short, you need to change the datatype of the audiences field in your mapping, and change your query as well so that it searches the nested datatype.
So, instead of this code fragment:
BoolQueryBuilder filterQuery = boolQuery();
filterQuery.must(QueryBuilders.matchQuery("audiences.grades.keyword", "G6"));
filterQuery.must(QueryBuilders.matchQuery("audiences.type", "OPEN"));
you should use this nested query :
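// ScoreMode here is org.apache.lucene.search.join.ScoreMode
BoolQueryBuilder filterQuery = new BoolQueryBuilder();
filterQuery.must(QueryBuilders.matchQuery("audiences.grades.keyword", "G6"));
filterQuery.must(QueryBuilders.matchQuery("audiences.type", "OPEN"));
// Wrapping the bool query makes both conditions apply within the same audiences object
NestedQueryBuilder nested = new NestedQueryBuilder("audiences", filterQuery, ScoreMode.None);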
I am using a pretty old Elasticsearch 2.5. I have the availability information of hotels in each doc. There is a field called "availabilities" whose mapping is as follows:
"availabilities":{
"type": "nested",
"dynamic": "strict",
"properties": {
"start": { "type": "date", "format": "yyyy-MM-dd" },
"end": { "type": "date", "format": "yyyy-MM-dd" }
}
}
One of the sample docs (stripped version) is as follows:
{
"name": "Seaside hotel",
"availabilities": [
{
"start": "2018-03-01",
"end": "2018-10-01"
},
{
"start": "2018-10-04",
"end": "2018-10-04"
},
{
"start": "2018-10-06",
"end": "2018-10-06"
},
{
"start": "2018-10-08",
"end": "2018-10-17"
},
{
"start": "2018-10-21",
"end": "2018-10-28"
},
{
"start": "2018-10-30",
"end": "2018-10-31"
},
{
"start": "2018-11-03",
"end": "2018-11-10"
},
{
"start": "2018-11-13",
"end": "2019-03-01"
},
{
"start": "2019-03-04",
"end": "2019-03-04"
},
{
"start": "2019-03-06",
"end": "2020-02-29"
}
]
}
I am trying to find all hotel docs that have availability from "2018-10-01" (YYYY-MM-DD) to "2018-10-10". My search query is as follows,
where the start and end dates are compared in milliseconds (1539154800000 ms = 2018-10-10 and 1538377200000 ms = 2018-10-01):
GET hotels/_search
{
"query": {
"filtered": {
"filter": {
"query": {
"bool": {
"must": [{
"nested": {
"query": {
"bool": {
"must": [{
"script": {
"script": "return (doc['availabilities.end'].date.getMillis() <= 1539154800000 && doc['availabilities.start'].date.getMillis() >= 1538377200000)"
}
}]
}
},
"path": "availabilities"
}
}],
"must_not": null
}
}
}
}
}
}
When I run this query, I end up getting the above "Seaside hotel" in the result set, while it should not be there because it doesn't have any availabilities covering 2018-10-01 to 2018-10-10.
I then changed my query to not use the script; here I am searching for hotels that have availability from 2018-10-09 to 2018-10-16:
GET hotels/_search
{
"query": {
"filtered": {
"filter": {
"query": {
"bool": {
"must": [{
"nested": {
"query": {
"bool": {
"must": [{
"range": {
"availabilities.end": {
"gte": "2018-10-16"
}
}
}, {
"range": {
"availabilities.start": {
"lte": "2018-10-09"
}
}
}]
}
},
"path": "availabilities"
}
}],
"must_not": null
}
}
}
}
}
}
This query should have returned the "Seaside hotel" doc, as it has availability for the searched dates, but the search did not give me this hotel.
My whole purpose is to have a query that searches hotels by the availability dates specified by the user. Any idea what I am doing wrong or how I can achieve my goal?
Hello friend, your query is returning the correct result. I tested it on my machine as you described and it returns the correct document.
First I created the index:
PUT hotels
Then I added the name type to the hotels index with the mentioned mapping:
PUT hotels/_mapping/name
{
"name": {
"properties": {
"availabilities": {
"type": "nested",
"dynamic": "strict",
"properties": {
"start": {
"type": "date",
"format": "yyyy-MM-dd"
},
"end": {
"type": "date",
"format": "yyyy-MM-dd"
}
}
},
"name":{
"type": "string"
}
}
}
}
Then I indexed the data:
PUT hotels/name/1
{
"name": "Seaside hotel",
"availabilities": [
{
"start": "2018-03-01",
"end": "2018-10-01"
},
{
"start": "2018-10-04",
"end": "2018-10-04"
},
{
"start": "2018-10-06",
"end": "2018-10-06"
},
{
"start": "2018-10-08",
"end": "2018-10-17"
},
{
"start": "2018-10-21",
"end": "2018-10-28"
},
{
"start": "2018-10-30",
"end": "2018-10-31"
},
{
"start": "2018-11-03",
"end": "2018-11-10"
},
{
"start": "2018-11-13",
"end": "2019-03-01"
},
{
"start": "2019-03-04",
"end": "2019-03-04"
},
{
"start": "2019-03-06",
"end": "2020-02-29"
}
]
}
Then I ran the query:
GET hotels/name/_search
{
"query": {
"bool": {
"must": [
{
"nested": {
"inner_hits":{},
"query": {
"bool": {
"must": [
{
"range": {
"availabilities.end": {
"gte": "2018-10-16"
}
}
},
{
"range": {
"availabilities.start": {
"lte": "2018-10-09"
}
}
}
]
}
},
"path": "availabilities"
}
}
]
}
}
}
And the output is:
{
"_index": "hotels",
"_type": "name",
"_id": "1",
"_score": 1.4142135,
"_source": {
"name": "Seaside hotel",
"availabilities": [
{
"start": "2018-03-01",
"end": "2018-10-01"
},
{
"start": "2018-10-04",
"end": "2018-10-04"
},
{
"start": "2018-10-06",
"end": "2018-10-06"
},
{
"start": "2018-10-08",
"end": "2018-10-17"
},
{
"start": "2018-10-21",
"end": "2018-10-28"
},
{
"start": "2018-10-30",
"end": "2018-10-31"
},
{
"start": "2018-11-03",
"end": "2018-11-10"
},
{
"start": "2018-11-13",
"end": "2019-03-01"
},
{
"start": "2019-03-04",
"end": "2019-03-04"
},
{
"start": "2019-03-06",
"end": "2020-02-29"
}
]
}
}
Please check it.
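As a side note on the script query in the question, the sketch below compares the two conditions in plain Python against the Seaside hotel data. It is an illustration only, ignores time zones, and its helper names (covers, lies_within) are made up for this example. The range query asks for a block that covers the requested dates, while the script's condition (end <= requested end and start >= requested start) asks for a block that lies inside them. That may be why the hotel showed up for 2018-10-01 to 2018-10-10.

```python
from datetime import date

# Availability blocks of "Seaside hotel" as (start, end) pairs.
AVAILABILITIES = [
    (date(2018, 3, 1), date(2018, 10, 1)),
    (date(2018, 10, 4), date(2018, 10, 4)),
    (date(2018, 10, 6), date(2018, 10, 6)),
    (date(2018, 10, 8), date(2018, 10, 17)),
    (date(2018, 10, 21), date(2018, 10, 28)),
    (date(2018, 10, 30), date(2018, 10, 31)),
    (date(2018, 11, 3), date(2018, 11, 10)),
    (date(2018, 11, 13), date(2019, 3, 1)),
    (date(2019, 3, 4), date(2019, 3, 4)),
    (date(2019, 3, 6), date(2020, 2, 29)),
]


def covers(block, start, end):
    """Range-query logic: block.start <= start and block.end >= end."""
    block_start, block_end = block
    return block_start <= start and block_end >= end


def lies_within(block, start, end):
    """Script logic from the question: block.start >= start and block.end <= end."""
    block_start, block_end = block
    return block_start >= start and block_end <= end


stay = (date(2018, 10, 9), date(2018, 10, 16))
window = (date(2018, 10, 1), date(2018, 10, 10))

covering_stay = [b for b in AVAILABILITIES if covers(b, *stay)]
covering_window = [b for b in AVAILABILITIES if covers(b, *window)]
inside_window = [b for b in AVAILABILITIES if lies_within(b, *window)]

print(covering_stay)    # the 2018-10-08 to 2018-10-17 block
print(covering_window)  # empty: no block covers the whole window
print(inside_window)    # the two single-day blocks on 10-04 and 10-06
```

Under these assumptions, only the "covers" condition expresses "the hotel is available for the whole requested period".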
I am trying to persist some objects that have a composite id. If I send an array with only one element it works fine, but if the array has more than one element it throws an exception when saving the first one.
public boolean addParamsToChart(List<ChartParams> chartParams, Long chartId) {
List<ChartParams> chartParamsList = new ArrayList<>();
for(ChartParams chartParam: chartParams) {
ChartParamsId id = new ChartParamsId();
// id.setChartId(chartId);
id.setChartId(chartParam.getChart().getId());
id.setParamId(chartParam.getParam().getId());
id.setContextSourceId(chartParam.getContextSource().getId());
chartParam.setChartParamsId(id);
if(chartParamsRepository.save(chartParams) !=null) {
chartParamsList.add(chartParam);
}
}
if(chartParamsList.size()!=chartParams.size()) {
// something went wrong, delete previous inserted
deleteChartParams(chartParamsList);
chartRepository.delete(chartId);
return false;
}
return true;
}
This payload, with a single object, works:
[{
"chart": {
"id": 49,
"cv": {
"id": 1,
"name": "Money",
"category": {
"id": 1,
"name": "Euros"
},
"definition": "\"European curreny.\" [EU:euro]",
"enabled": true,
"cvid": "CC:1010"
},
"accountType": {
"id": 1,
"name": "saving"
},
"name": "Euro saving charts"
},
"param": {
"id": 8,
"name": "Totals",
"isFor": "Currency"
},
"contextSource": {
"id": 3,
"name": "euro",
"internal": "eu",
"abbreviatedName": "eu"
} }]
But this payload, with two objects instead of one, does not work; it throws an exception at the first save.
[{
"chart": {
"id": 52,
"cv": {
"id": 55,
"name": "Stocks",
"category": {
"id": 1,
"name": "Stocks"
},
"definition": "\"General stocks.\" [GS:ST]",
"enabled": true,
"cvid": "ST:0111"
},
"accountType": {
"id": 1,
"name": "saving"
},
"name": "Stock saving chart"
},
"param": {
"id": 8,
"name": "Totals",
"isFor": "Currency"
},
"contextSource": {
"id": 6,
"name": "stock",
"internal": "st",
"abbreviatedName": "st"
} }, {
"chart": {
"id": 52,
"cv": {
"id": 55,
"name": "Stocks",
"category": {
"id": 1,
"name": "Stocks"
},
"definition": "\"General stocks.\" [GS:ST]",
"enabled": true,
"cvid": "ST:0111"
},
"accountType": {
"id": 1,
"name": "saving"
},
"name": "Stock saving chart"
},
"param": {
"id": 8,
"name": "Totals",
"isFor": "Currency"
},
"contextSource": {
"id": 7,
"name": "Sold stock",
"internal": "st_sold",
"abbreviatedSequence": "st_sold"
} }]
The exception I got is:
org.hibernate.id.IdentifierGenerationException: null id generated for:class eu.stocks.chart.chartParams.ChartParams
I am planning to store availability for millions of Airbnb-type apartments in Elasticsearch.
availability is an array that contains nested objects (the availability type is nested).
Each of those objects has a date range in which the apartment is available.
apartments = [
{
"_id": "kjty873yhekrg789e7r0n87e",
"first_available_date": "2016-06-21",
"availability": [
{
"start": "2016-06-21",
"end": "2016-08-01"
},
{
"start": "2016-08-20",
"end": "2016-08-28"
},
{
"start": "2016-10-03",
"end": "2016-11-02"
},
{ //This means it is available only for one day.
"start": "2016-11-13",
"end": "2016-11-13"
},
{
"start": "2016-11-28",
"end": "2017-01-14"
}
],
"apartment_metadata1": 56456,
"apartment_metadata2": 8989,
"status": "active"
},
{
"_id": "hgk87783iii86937jh",
"first_available_date": "2016-06-09",
"availability": [
{
"start": "2016-06-09",
"end": "2016-07-02"
},
{
"start": "2016-07-21",
"end": "2016-12-19"
},
{
"start": "2016-12-12",
"end": "2017-07-02"
}
],
"apartment_metadata1": 23534,
"apartment_metadata2": 24377,
"status": "active"
}
]
I want to search for apartments that are available for a specific date range (say 2016-08-20 to 2016-12-12). That
range should fall inside one of the availability date ranges of an apartment.
So I want to write a query, something like:
{
"query": {
"bool": {
"must": [
{
"range": { "first_available_date": {"lte": "2016-08-20"} },
"match": { "status": "active" }
}
]
},
"filter": [
{
"range":
{
"apartments.availability.start": {"gte": "2016-08-20"},
"apartments.availability.end": {"lte": "2016-12-12"}
}
}
]
}
}
}
The above query returns both apartments (with MULTIPLE availability objects matching the conditions),
which is incorrect. It should only return the document with _id: hgk87783iii86937jh, as EXACTLY one availability object matches the criteria, namely {"start": "2016-07-21", "end": "2016-12-19"}. So to get the correct result, there should be EXACTLY one availability object in the apartment doc
that matches the condition. How do I enforce that there is EXACTLY one match in the above query? Second question: is my query even correct?
Using a nested query should allow you to achieve the above.
Use inner_hits to get the availability block that matched.
Below is an example that implements this:
Create Index:
put testindex
{
"mappings": {
"data" : {
"properties": {
"availability" : {
"type": "nested"
}
}
}
}
}
Index Data:
put testindex/data/1
{
"first_available_date": "2016-06-21",
"availability": [
{
"start": "2016-06-21",
"end": "2016-08-01"
},
{
"start": "2016-08-20",
"end": "2016-08-28"
},
{
"start": "2016-10-03",
"end": "2016-11-02"
},
{
"start": "2016-11-13",
"end": "2016-11-13"
},
{
"start": "2016-11-28",
"end": "2017-01-14"
},
{
"start": "2016-07-21",
"end": "2016-12-19"
}
],
"apartment_metadata1": 4234,
"apartment_metadata2": 687878,
"status": "active"
}
Query:
post testindex/data/_search
{
"query": {
"bool": {
"must": [
{
"range": {
"first_available_date": {
"lte": "2016-08-20"
}
}
},
{
"match": {
"status": "active"
}
}
],
"filter": [
{
"nested": {
"path": "availability",
"query": {
"bool": {
"must": [
{
"range": {
"availability.start": {
"lte": "2016-08-20"
}
}
},
{
"range": {
"availability.end": {
"gte": "2016-12-12"
}
}
}
]
}
},
"inner_hits": {}
}
}
]
}
}
}
Results:
"hits": {
"total": 1,
"max_score": 1.4142135,
"hits": [
{
"_index": "testindex",
"_type": "data",
"_id": "1",
"_score": 1.4142135,
"_source": {
"first_available_date": "2016-06-21",
"availability": [
{
"start": "2016-06-21",
"end": "2016-08-01"
},
{
"start": "2016-08-20",
"end": "2016-08-28"
},
{
"start": "2016-10-03",
"end": "2016-11-02"
},
{
"start": "2016-11-13",
"end": "2016-11-13"
},
{
"start": "2016-11-28",
"end": "2017-01-14"
},
{
"start": "2016-07-21",
"end": "2016-12-19"
}
],
"apartment_metadata1": 4234,
"apartment_metadata2": 687878,
"status": "active"
},
"inner_hits": {
"availability": {
"hits": {
"total": 1,
"max_score": 1.4142135,
"hits": [
{
"_index": "testindex",
"_type": "data",
"_id": "1",
"_nested": {
"field": "availability",
"offset": 5
},
"_score": 1.4142135,
"_source": {
"start": "2016-07-21",
"end": "2016-12-19"
}
}
]
}
}
}
}
]
}
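If the requested dates come from user input, the nested filter above can be built programmatically. The following Python sketch only assembles the request body as a dictionary and prints it as JSON; it does not talk to Elasticsearch, and the function name availability_filter is hypothetical.

```python
import json


def availability_filter(start, end, path="availability"):
    """Build a nested filter matching one block that covers [start, end]."""
    return {
        "nested": {
            "path": path,
            "query": {
                "bool": {
                    "must": [
                        # The block must begin on or before the requested start...
                        {"range": {f"{path}.start": {"lte": start}}},
                        # ...and finish on or after the requested end.
                        {"range": {f"{path}.end": {"gte": end}}},
                    ]
                }
            },
            # Ask Elasticsearch to report which nested block matched.
            "inner_hits": {},
        }
    }


body = {
    "query": {
        "bool": {
            "must": [{"match": {"status": "active"}}],
            "filter": [availability_filter("2016-08-20", "2016-12-12")],
        }
    }
}

print(json.dumps(body, indent=2))
```

The printed JSON has the same shape as the filter part of the query above and can be sent as the search request body with whichever client you use.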