How to make minimum_should_match work with nested mapping? - elasticsearch

I have a problem with Elasticsearch's more_like_this query.
Given this mapping:
{
"directory.v1": {
"mappings": {
"profile.event": {
"properties": {
"event": {
"properties": {
"naics": {
"type": "nested",
"properties": {
"type": {
"type": "keyword"
},
"value": {
"type": "keyword"
}
}
}
}
},
"user_id": {
"type": "long"
}
}
}
}
}
}
I have document A, used as the source, and document B, which the more_like_this query for A should find.
Profile A (used as source):
{
"_index": "directory.v1",
"_type": "profile.event",
"_id": "83731111.559",
"_score": 1,
"_source": {
"user_id": 8373,
"event": {
"naics": [
{
"value": 331,
"type": "naics"
},
{
"value": 74,
"type": "naics"
},
{
"value": 938,
"type": "naics"
},
{
"value": 2048,
"type": "naics"
},
{
"value": 939,
"type": "naics"
},
{
"value": 2049,
"type": "naics"
},
{
"value": 940,
"type": "naics"
},
{
"value": 2050,
"type": "naics"
},
{
"value": 941,
"type": "naics"
},
{
"value": 2051,
"type": "naics"
},
{
"value": 942,
"type": "naics"
},
{
"value": 2052,
"type": "naics"
},
{
"value": 943,
"type": "naics"
},
{
"value": 2053,
"type": "naics"
},
{
"value": 944,
"type": "naics"
},
{
"value": 2054,
"type": "naics"
},
{
"value": 945,
"type": "naics"
},
{
"value": 2055,
"type": "naics"
},
{
"value": 473,
"type": "naics"
},
{
"value": 128,
"type": "naics"
},
{
"value": 10,
"type": "naics"
},
{
"value": 1242,
"type": "naics"
},
{
"value": 472,
"type": "naics"
},
{
"value": 1241,
"type": "naics"
}
]
}
}
}
Profile B:
{
"_index": "directory.v1",
"_type": "profile.event",
"_id": "46124111.559",
"_score": 1,
"_source": {
"user_id": 46124,
"event": {
"naics": [
{
"value": 331,
"type": "naics"
},
{
"value": 74,
"type": "naics"
},
{
"value": 938,
"type": "naics"
},
{
"value": 2048,
"type": "naics"
},
{
"value": 939,
"type": "naics"
},
{
"value": 2049,
"type": "naics"
},
{
"value": 940,
"type": "naics"
},
{
"value": 2050,
"type": "naics"
},
{
"value": 941,
"type": "naics"
},
{
"value": 2051,
"type": "naics"
},
{
"value": 942,
"type": "naics"
},
{
"value": 2052,
"type": "naics"
},
{
"value": 943,
"type": "naics"
},
{
"value": 2053,
"type": "naics"
},
{
"value": 944,
"type": "naics"
},
{
"value": 2054,
"type": "naics"
},
{
"value": 945,
"type": "naics"
},
{
"value": 2055,
"type": "naics"
}
]
}
}
}
Every naics element in document B is also present in document A.
So I really do not understand the behaviour of this query:
{
"query": {
"nested": {
"path": "event.naics",
"query": {
"more_like_this": {
"like": [
{
"_id": "83731111.559",
"_type": "profile.event"
}
],
"fields": [
"event.naics.value"
],
"min_term_freq": 1,
"min_doc_freq": 1,
"minimum_should_match": "8%"
}
}
}
}
}
This returns results, but when I raise minimum_should_match to 9% or higher it matches nothing at all.
I also tried the following variant, which returns results up to 11%:
{
"query": {
"nested": {
"path": "event.naics",
"query": {
"more_like_this": {
"like": [
{
"_id": "83731111.559",
"_type": "profile.event"
}
],
"fields": [
"event.naics.*"
],
"min_term_freq": 1,
"min_doc_freq": 1,
"minimum_should_match": "11%"
}
}
}
}
}
And the term vector response for the source document is:
{
"_index": "directory.v1",
"_type": "profile.event",
"_id": "83731111.559",
"_version": 5,
"found": true,
"took": 0,
"term_vectors": {}
}

If you request the term vector for document A on the field event.naics.value, you will see 24 terms in total, each with term frequency 1.
more_like_this turns these into 24 should clauses. With 8%, the required count is 24 × 0.08 = 1.92, which is truncated to 1 clause, so you get a match. With 9%, it is 24 × 0.09 = 2.16, which truncates to 2 required clauses, and that can never match because each of your nested documents holds only one value.
For the calculation details, see the bottom of this file:
https://github.com/elastic/elasticsearch/blob/99f88f15c5febbca2d13b5b5fda27b844153bf1a/server/src/main/java/org/elasticsearch/common/lucene/search/Queries.java
The more_like_this source is here:
https://github.com/elastic/elasticsearch/blob/46a79127edfb0cc93b7580624010ff81ca0cb2f4/server/src/main/java/org/elasticsearch/common/lucene/search/MoreLikeThisQuery.java
Term vector request:
POST /directory.v1/profile.event/83731111.559/_termvectors
{
"fields":["event.naics.value"],
"offsets" : false,
"payloads" : false,
"positions" : false,
"term_statistics" : true,
"field_statistics" : true
}
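The truncation described in this answer can be checked with a few lines of Python. This is a minimal sketch of the positive-percentage case only, not the Elasticsearch implementation. The function name required_clauses is made up for illustration, and the 25-clause case rests on an assumption the question does not confirm: that the event.naics.* wildcard adds the single type term "naics" to the 24 value terms.

```python
def required_clauses(optional_clauses: int, percent: int) -> int:
    """Number of should clauses that must match for a positive percentage.

    The product is cut off toward zero, never rounded up.
    Hypothetical helper, written only to illustrate the arithmetic.
    """
    return (optional_clauses * percent) // 100


# 24 value terms from document A, queried on event.naics.value
print(required_clauses(24, 8))   # 1
print(required_clauses(24, 9))   # 2

# Assumption: event.naics.* yields 25 clauses (24 values + the term "naics")
print(required_clauses(25, 11))  # 2
print(required_clauses(25, 12))  # 3
```

Under that assumption, a nested document with one value and one type can satisfy at most two clauses, which would be consistent with the wildcard query matching up to 11% and no further.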

Related

Searching through objects inside nested documents provides unexpected output

I am getting an unexpected result from Elasticsearch when searching on object properties that sit inside a nested property. I am using the elasticsearch-dsl Python library to create documents and run queries. Is this a bug, or am I missing something in the query or the mapping? Below are the Elasticsearch JSON mapping, the query, the unexpected result, and the expected result.
Mapping:
{
"deal_acls": {
"type": "nested",
"properties": {
"created_at": {
"type": "date"
},
"created_by": {
"properties": {
"id": {
"type": "keyword"
},
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
},
"status": {
"type": "keyword",
"normalizer": "lowercase"
}
}
},
"permission": {
"properties": {
"CRM": {
"properties": {
"description": {
"properties": {
"created_by": {
"type": "keyword",
"normalizer": "lowercase"
},
"object": {
"type": "keyword",
"normalizer": "lowercase"
},
"object_id": {
"type": "keyword",
"normalizer": "lowercase"
},
"timestamp": {
"type": "date"
}
}
},
"permission": {
"properties": {
"delete": {
"type": "integer"
},
"edit": {
"type": "integer"
},
"manage": {
"type": "long"
},
"read": {
"type": "integer"
},
"write": {
"type": "integer"
}
}
}
}
},
"deal": {
"properties": {
"description": {
"properties": {
"created_by": {
"type": "keyword",
"normalizer": "lowercase"
},
"object": {
"type": "keyword",
"normalizer": "lowercase"
},
"object_id": {
"type": "keyword",
"normalizer": "lowercase"
},
"timestamp": {
"type": "date"
}
}
},
"permission": {
"properties": {
"delete": {
"type": "integer"
},
"edit": {
"type": "integer"
},
"manage": {
"type": "long"
},
"read": {
"type": "integer"
},
"write": {
"type": "integer"
}
}
}
}
},
"document": {
"properties": {
"description": {
"properties": {
"created_by": {
"type": "keyword",
"normalizer": "lowercase"
},
"object": {
"type": "keyword",
"normalizer": "lowercase"
},
"object_id": {
"type": "keyword",
"normalizer": "lowercase"
},
"timestamp": {
"type": "date"
}
}
},
"permission": {
"properties": {
"delete": {
"type": "integer"
},
"edit": {
"type": "integer"
},
"manage": {
"type": "long"
},
"read": {
"type": "integer"
},
"write": {
"type": "integer"
}
}
}
}
},
"external_deal_team": {
"properties": {
"description": {
"properties": {
"created_by": {
"type": "keyword",
"normalizer": "lowercase"
},
"object": {
"type": "keyword",
"normalizer": "lowercase"
},
"object_id": {
"type": "keyword",
"normalizer": "lowercase"
},
"timestamp": {
"type": "date"
}
}
},
"permission": {
"properties": {
"delete": {
"type": "integer"
},
"edit": {
"type": "integer"
},
"manage": {
"type": "long"
},
"read": {
"type": "integer"
},
"write": {
"type": "integer"
}
}
}
}
},
"internal_deal_team": {
"properties": {
"description": {
"properties": {
"created_by": {
"type": "keyword",
"normalizer": "lowercase"
},
"object": {
"type": "keyword",
"normalizer": "lowercase"
},
"object_id": {
"type": "keyword",
"normalizer": "lowercase"
},
"timestamp": {
"type": "date"
}
}
},
"permission": {
"properties": {
"delete": {
"type": "integer"
},
"edit": {
"type": "integer"
},
"manage": {
"type": "long"
},
"read": {
"type": "integer"
},
"write": {
"type": "integer"
}
}
}
}
}
}
},
"status": {
"type": "keyword",
"normalizer": "lowercase"
},
"updated_at": {
"type": "date"
},
"updated_by": {
"properties": {
"id": {
"type": "keyword"
},
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
},
"status": {
"type": "keyword",
"normalizer": "lowercase"
}
}
},
"user": {
"properties": {
"id": {
"type": "keyword"
},
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
},
"status": {
"type": "keyword",
"normalizer": "lowercase"
}
}
}
}
}
}
Query:
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "deal_acls",
"query": {
"term": {
"deal_acls.permission.deal.permission.read": 1
}
}
}
},
{
"nested": {
"path": "deal_acls",
"query": {
"terms": {
"deal_acls.user.id": [
"5f7cea05-6562-4bdd-8448-19cfbe11783a"
]
}
}
}
}
]
}
}
}
Unexpected result: since the deal read permission of the user with id=5f7cea05-6562-4bdd-8448-19cfbe11783a is 0, the query should return no hits.
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 2,
"hits": [
{
"_index": "dev.crecentric.dealvault.deals",
"_type": "_doc",
"_id": "a928838d-3422-41db-b40e-28f5c793f806",
"_score": 2,
"_source": {
"id": "a928838d-3422-41db-b40e-28f5c793f806",
"deal_acls": [
{
"user": {
"id": "5f7cea05-6562-4bdd-8448-19cfbe11783a",
"name": "testerrrrs testesssss",
"status": "active"
},
"permission": {
"deal": {
"permission": {
"edit": 0,
"read": 0,
"write": 0,
"delete": 0,
"manage": 0
},
"description": {
"object": "workspace",
"object_id": "fbc840b1-8727-4945-a070-fa1c105f9550",
"timestamp": "2022-02-10T05:46:05.140867+00:00",
"created_by": "d78411e5-6645-4b95-a98c-db6db8748580"
}
},
"external_deal_team": {
"permission": {
"edit": 0,
"read": 1,
"write": 1,
"delete": 0,
"manage": 0
},
"description": {
"object": "workspace",
"object_id": "fbc840b1-8727-4945-a070-fa1c105f9550",
"timestamp": "2022-02-10T05:46:05.140902+00:00",
"created_by": "d78411e5-6645-4b95-a98c-db6db8748580"
}
},
"internal_deal_team": {
"permission": {
"edit": 0,
"read": 1,
"write": 1,
"delete": 0,
"manage": 0
},
"description": {
"object": "workspace",
"object_id": "fbc840b1-8727-4945-a070-fa1c105f9550",
"timestamp": "2022-02-10T05:46:05.140910+00:00",
"created_by": "d78411e5-6645-4b95-a98c-db6db8748580"
}
}
},
"status": "active",
"created_at": "2022-02-10T05:40:15.727598+05:45",
"updated_at": "2022-02-10T05:46:05.177076+05:45"
},
{
"user": {
"id": "d78411e5-6645-4b95-a98c-db6db8748580",
"name": "Ramesh Pradhan",
"status": "active"
},
"permission": {
"CRM": {
"permission": {
"edit": 1,
"read": 1,
"write": 1,
"delete": 1,
"manage": 1
},
"description": {
"object": "owner",
"object_id": "d78411e5-6645-4b95-a98c-db6db8748580",
"timestamp": "2022-02-10T05:35:41.453881+00:00",
"created_by": "d78411e5-6645-4b95-a98c-db6db8748580"
}
},
"deal": {
"permission": {
"edit": 1,
"read": 1,
"write": 1,
"delete": 1,
"manage": 1
},
"description": {
"object": "owner",
"object_id": "d78411e5-6645-4b95-a98c-db6db8748580",
"timestamp": "2022-02-10T05:35:41.453881+00:00",
"created_by": "d78411e5-6645-4b95-a98c-db6db8748580"
}
},
"document": {
"permission": {
"edit": 1,
"read": 1,
"write": 1,
"delete": 1,
"manage": 1
},
"description": {
"object": "owner",
"object_id": "d78411e5-6645-4b95-a98c-db6db8748580",
"timestamp": "2022-02-10T05:35:41.453881+00:00",
"created_by": "d78411e5-6645-4b95-a98c-db6db8748580"
}
},
"external_deal_team": {
"permission": {
"edit": 1,
"read": 1,
"write": 1,
"delete": 1,
"manage": 1
},
"description": {
"object": "owner",
"object_id": "d78411e5-6645-4b95-a98c-db6db8748580",
"timestamp": "2022-02-10T05:35:41.453881+00:00",
"created_by": "d78411e5-6645-4b95-a98c-db6db8748580"
}
},
"internal_deal_team": {
"permission": {
"edit": 1,
"read": 1,
"write": 1,
"delete": 1,
"manage": 1
},
"description": {
"object": "owner",
"object_id": "d78411e5-6645-4b95-a98c-db6db8748580",
"timestamp": "2022-02-10T05:35:41.453881+00:00",
"created_by": "d78411e5-6645-4b95-a98c-db6db8748580"
}
}
},
"status": "active",
"created_at": "2022-02-10T05:35:41.453913+05:45",
"updated_at": "2022-02-10T05:35:41.462956+05:45"
}
]
}
}
]
}
}
Expected result:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 2,
"hits": []
}
}
You are using two separate nested queries, so a document matches if it contains "deal_acls.permission.deal.permission.read": 1 and "deal_acls.user.id": ["5f7cea05-6562-4bdd-8448-19cfbe11783a"] anywhere in the nested field, not necessarily in the same sub-object. Try combining both conditions in a bool query inside a single nested query:
{
"query": {
"nested": {
"path": "deal_acls",
"query": {
"bool": {
"must": [
{ "match": { "deal_acls.user.id": "5f7cea05-6562-4bdd-8448-19cfbe11783a" }},
{ "match": { "deal_acls.permission.deal.permission.read": 1 }}
]
}
}
}
}
}
In case someone else is looking for the answer, I fixed it with this query:
{
"query": {
"nested": {
"path": "deal_acls",
"query": {
"bool": {
"must": [
{
"term": {
"deal_acls.permission.deal.permission.read": 1
}
},
{
"terms": {
"deal_acls.user.id": [
"5f7cea05-6562-4bdd-8448-19cfbe11783a"
]
}
}
]
}
}
}
}
}
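The difference between the two query shapes can be modelled in plain Python. This is only a sketch of the matching semantics, not of how Elasticsearch evaluates nested queries internally. The flattened keys user_id and deal_read and both function names are invented for illustration; the values are taken from the two deal_acls entries in the hit shown in the question.

```python
# Simplified view of the two deal_acls entries from the unexpected hit.
deal_acls = [
    {"user_id": "5f7cea05-6562-4bdd-8448-19cfbe11783a", "deal_read": 0},
    {"user_id": "d78411e5-6645-4b95-a98c-db6db8748580", "deal_read": 1},
]
USER = "5f7cea05-6562-4bdd-8448-19cfbe11783a"


def two_nested_clauses(acls, user_id):
    """Two nested queries under bool.must: each clause may be
    satisfied by a different sub-object."""
    has_read = any(a["deal_read"] == 1 for a in acls)
    has_user = any(a["user_id"] == user_id for a in acls)
    return has_read and has_user


def one_nested_clause(acls, user_id):
    """One nested query wrapping a bool.must: both conditions must
    hold inside the same sub-object."""
    return any(a["deal_read"] == 1 and a["user_id"] == user_id for a in acls)


print(two_nested_clauses(deal_acls, USER))  # True  (the unexpected hit)
print(one_nested_clause(deal_acls, USER))   # False (the expected result)
```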

Elasticsearch: already remapped, but it still won't show all fields

I'm trying to sync my MongoDB with Elasticsearch. I set up the sync with a river, and the river works. But Elasticsearch won't show all the fields; it only shows the "_ts" field in the "_source" object:
Request:
GET localhost:9200/test/orders/_search
Response:
{
"took": 11,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 137,
"max_score": 1,
"hits": [
{
"_index": "test",
"_type": "orders",
"_id": "58a3251f761f35a107724add",
"_score": 1,
"_source": {
"_ts": 6438761296509796000
}
},
{
"_index": "test",
"_type": "orders",
"_id": "58a340467f39c50f3a54c614",
"_score": 1,
"_source": {
"_ts": 6438761296509796000
}
},
{
"_index": "test",
"_type": "orders",
"_id": "58b8ec806f34179d7c7b2431",
"_score": 1,
"_source": {
"_ts": 6438761296509796000
}
},
{
"_index": "test",
"_type": "orders",
"_id": "58b8eff56f3417670f7b244a",
"_score": 1,
"_source": {
"_ts": 6438761296509796000
}
},
{
"_index": "test",
"_type": "orders",
"_id": "58b8f0af6f3417fb207b244c",
"_score": 1,
"_source": {
"_ts": 6438761296509796000
}
},
{
"_index": "test",
"_type": "orders",
"_id": "58b8f19a6f341761337b23da",
"_score": 1,
"_source": {
"_ts": 6438761296509796000
}
},
{
"_index": "test",
"_type": "orders",
"_id": "58b9320c6f3417bc1c7b23c7",
"_score": 1,
"_source": {
"_ts": 6438761296509796000
}
},
{
"_index": "test",
"_type": "orders",
"_id": "58b9339f6f341777237b23c6",
"_score": 1,
"_source": {
"_ts": 6438761296509796000
}
},
{
"_index": "test",
"_type": "orders",
"_id": "58b934ab6f341778237b23c7",
"_score": 1,
"_source": {
"_ts": 6438761296509796000
}
},
{
"_index": "test",
"_type": "orders",
"_id": "58b96ef76f34174a4b7b23c8",
"_score": 1,
"_source": {
"_ts": 6438761296509796000
}
}
]
}
}
I have already redone the mapping, and Elasticsearch shows the updated mapping:
Request:
GET localhost:9200/test/orders/_mapping
Response:
{
"test": {
"mappings": {
"orders": {
"properties": {
"_ts": {
"type": "long"
},
"activeDate": {
"type": "text"
},
"awbNumber": {
"type": "text"
},
"batchID": {
"type": "text"
},
"consignee": {
"properties": {
"id": {
"type": "text"
},
"name": {
"type": "text"
},
"phoneNumber": {
"type": "text"
}
}
},
"consigner": {
"properties": {
"id": {
"type": "text"
},
"name": {
"type": "text"
},
"phoneNumber": {
"type": "text"
}
}
},
"courier": {
"properties": {
"actualRate": {
"properties": {
"UoM": {
"type": "text"
},
"value": {
"type": "integer"
}
}
},
"id": {
"type": "integer"
},
"max_day": {
"type": "integer"
},
"min_day": {
"type": "integer"
},
"name": {
"type": "text"
},
"rate": {
"properties": {
"UoM": {
"type": "text"
},
"value": {
"type": "integer"
}
}
},
"rate_id": {
"type": "integer"
},
"rate_name": {
"type": "text"
},
"shipmentType": {
"type": "integer"
}
}
},
"creationDate": {
"type": "text"
},
"destination": {
"properties": {
"address": {
"type": "text"
},
"cityID": {
"type": "integer"
},
"cityName": {
"type": "text"
},
"id": {
"type": "integer"
},
"provinceID": {
"type": "integer"
},
"provinceName": {
"type": "text"
}
}
},
"driver": {
"properties": {
"feedback": {
"properties": {
"comment": {
"type": "text"
},
"score": {
"type": "long"
}
}
},
"id": {
"type": "long"
},
"isPaymentCollected": {
"type": "integer"
},
"name": {
"type": "text"
},
"phoneNumber": {
"type": "text"
},
"vehicleNumber": {
"type": "text"
},
"vehicleType": {
"type": "text"
}
}
},
"externalID": {
"type": "text"
},
"groupID": {
"type": "integer"
},
"id": {
"type": "text"
},
"isActive": {
"type": "long"
},
"isAutoTrack": {
"type": "integer"
},
"isCustomAWB": {
"type": "integer"
},
"isEscrow": {
"type": "integer"
},
"isLabelPrinted": {
"type": "integer"
},
"lastUpdatedDate": {
"type": "text"
},
"origin": {
"properties": {
"address": {
"type": "text"
},
"cityID": {
"type": "integer"
},
"cityName": {
"type": "text"
},
"id": {
"type": "integer"
},
"provinceID": {
"type": "integer"
},
"provinceName": {
"type": "text"
}
}
},
"package": {
"properties": {
"content": {
"type": "text"
},
"contents": {
"type": "integer"
},
"cubicalWeight": {
"properties": {
"UoM": {
"type": "text"
},
"value": {
"type": "integer"
}
}
},
"dimension": {
"properties": {
"height": {
"properties": {
"UoM": {
"type": "text"
},
"value": {
"type": "integer"
}
}
},
"length": {
"properties": {
"UoM": {
"type": "text"
},
"value": {
"type": "integer"
}
}
},
"width": {
"properties": {
"UoM": {
"type": "text"
},
"value": {
"type": "integer"
}
}
}
}
},
"fragile": {
"type": "integer"
},
"isConfirmed": {
"type": "integer"
},
"itemName": {
"type": "text"
},
"itemSubtype": {
"type": "integer"
},
"itemType": {
"type": "integer"
},
"pictureURL": {
"type": "text"
},
"price": {
"properties": {
"UoM": {
"type": "text"
},
"value": {
"type": "integer"
}
}
},
"type": {
"type": "integer"
},
"weight": {
"properties": {
"UoM": {
"type": "text"
},
"value": {
"type": "integer"
}
}
}
}
},
"paymentType": {
"type": "text"
},
"pickUpTime": {
"type": "text"
},
"rates": {
"properties": {
"actualInsurance": {
"properties": {
"UoM": {
"type": "text"
},
"value": {
"type": "long"
}
}
},
"actualShipment": {
"properties": {
"UoM": {
"type": "text"
},
"value": {
"type": "integer"
}
}
},
"escrowCost": {
"properties": {
"UoM": {
"type": "text"
},
"value": {
"type": "integer"
}
}
},
"fulfillmentCost": {
"properties": {
"UoM": {
"type": "text"
},
"value": {
"type": "integer"
}
}
},
"insurance": {
"properties": {
"UoM": {
"type": "text"
},
"value": {
"type": "long"
}
}
},
"itemPrice": {
"properties": {
"UoM": {
"type": "text"
},
"value": {
"type": "integer"
}
}
},
"liability": {
"properties": {
"UoM": {
"type": "text"
},
"value": {
"type": "integer"
}
}
},
"shipment": {
"properties": {
"UoM": {
"type": "text"
},
"value": {
"type": "integer"
}
}
}
}
},
"readyTime": {
"type": "text"
},
"shipmentStatus": {
"properties": {
"description": {
"type": "text"
},
"name": {
"type": "text"
},
"statusCode": {
"type": "integer"
},
"updateDate": {
"type": "text"
},
"updatedBy": {
"type": "text"
}
}
},
"source": {
"type": "text"
},
"specialID": {
"type": "text"
},
"stickerNumber": {
"type": "text"
},
"useInsurance": {
"type": "integer"
}
}
}
}
}
}
I expect to get all the fields (not just the single "_ts" field) in "_source", matching what I have mapped.
I have tried deleting the index and recreating it, but that still did not work. Any clue about this kind of issue? I really need help, thank you so much.
Elasticsearch creates fields on its own (dynamic mapping). So if you do
PUT document/index/1
{
"id" : "1",
"name" : "kashish",
"phoneNumber" : "9740683281"
}
This will automatically create the fields in your index. What I am illustrating is that if defining the fields explicitly is not working for you for some reason, you can empty the index (if it only contains dummy data) and then just index your JSON, which Elasticsearch will pick up automatically.
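To make the idea of automatic field creation concrete, here is a rough Python sketch of how types could be inferred from a JSON document. It is a simplified model, not Elasticsearch code: it assumes the default dynamic-mapping rules of Elasticsearch 5 and later, ignores date and numeric detection on strings, and does not handle arrays. The function names infer_field and infer_mapping are hypothetical.

```python
def infer_field(value):
    """Guess a mapping entry for one JSON value (simplified model)."""
    if isinstance(value, bool):   # check bool first: bool is a subclass of int
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "long"}
    if isinstance(value, float):
        return {"type": "float"}
    if isinstance(value, dict):
        return {"properties": infer_mapping(value)}
    if isinstance(value, str):
        return {
            "type": "text",
            "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
        }
    raise TypeError(f"value type not covered by this sketch: {type(value).__name__}")


def infer_mapping(doc):
    """Build a properties dict for a document; null values add no field."""
    return {name: infer_field(v) for name, v in doc.items() if v is not None}


doc = {"id": "1", "name": "kashish", "phoneNumber": "9740683281"}
mapping = infer_mapping(doc)
print(sorted(mapping))                 # ['id', 'name', 'phoneNumber']
print(mapping["phoneNumber"]["type"])  # text
```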

Elastic Search sum of child items

I'm trying to run a query like this in Elasticsearch:
Return all the devices of an app that had some logs between two dates, and for each device return the total number of logs.
For this I have a parent-child relationship: the parent type device holds the device information, and the child type device_logs holds the number of logs for each day.
I tried the following query with a custom score function. I do get the right devices, but the score is the sum of all the device_logs entries instead of only the entries in the date range.
Any idea whether this kind of query is possible?
{
"query": {
"bool": {
"filter" :
[
{
"term": {"app": 347}
}
],
"must" :
[
{
"has_child": {
"type": "device_logs",
"inner_hits" : {},
"query": {
"bool": {
"filter": {
"range": {
"date": {
"from": "2017-01-15T00:00:00Z",
"include_lower": true,
"include_upper": true,
"to": "2017-01-17T23:59:59Z"
}
}
}
}
}
}
},
{
"has_child": {
"type": "device_logs",
"score_mode": "sum",
"query" : {
"function_score" : {
"script_score": {
"script": "_score * doc['logs'].value"
}
}
}
}
}
]
}
}
}
EDIT: Adding mappings and some docs
Here are the mappings:
"mappings": {
"device": {
"properties": {
"app": {
"type": "long",
"include_in_all": false
},
"created_at": {
"type": "date",
"include_in_all": false
},
"id": {
"type": "long",
"include_in_all": false
},
"language": {
"type": "keyword",
"include_in_all": false,
"ignore_above": 256
},
"last_log_at": {
"type": "date",
"include_in_all": false
},
"last_ping_at": {
"type": "date",
"include_in_all": false
},
"last_seen_at": {
"type": "date"
},
"log_enabled": {
"type": "boolean"
},
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
},
"lowercase": {
"type": "text",
"analyzer": "case_insensitive_sort"
}
}
},
"os_version": {
"type": "keyword",
"include_in_all": false,
"ignore_above": 256
},
"timezone": {
"type": "keyword",
"include_in_all": false,
"ignore_above": 256
},
"type": {
"type": "keyword",
"ignore_above": 256
},
"udid": {
"type": "keyword",
"ignore_above": 256
},
"version": {
"properties": {
"build": {
"type": "keyword",
"include_in_all": false,
"ignore_above": 256
},
"id": {
"type": "long",
"include_in_all": false
},
"version": {
"type": "keyword",
"include_in_all": false,
"ignore_above": 256
}
}
}
}
},
"device_logs": {
"_parent": {
"type": "device"
},
"_routing": {
"required": true
},
"properties": {
"_": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"app": {
"type": "long",
"include_in_all": false
},
"date": {
"type": "date",
"include_in_all": false
},
"errors": {
"type": "long",
"include_in_all": false
},
"logs": {
"type": "long",
"include_in_all": false
},
"warnings": {
"type": "long",
"include_in_all": false
}
}
}
}
And some documents:
{
"_index": "devices",
"_type": "device_logs",
"_id": "22466_2017.01.17",
"_score": 1,
"_routing": "22466",
"_parent": "22466",
"_source": {
"_": "22466_2017.01.17",
"app": 200,
"date": "2017-01-17T00:00:00Z",
"logs": 660,
"warnings": 238,
"errors": 217
}
}
{
"_index": "devices",
"_type": "device",
"_id": "22466",
"_score": 1,
"_source": {
"id": 22466,
"udid": "770CA14ED7FE861EC452",
"name": "Edward's iPhone",
"type": "iPhone7,2",
"app": 200,
"log_enabled": false,
"created_at": "2016-12-21T10:55:02Z",
"last_seen_at": "2017-01-19T10:07:33Z",
"last_log_at": "2017-01-19T11:07:40.756275026+01:00",
"language": "en-US",
"os_version": "9.2",
"timezone": "GMT+1",
"version.id": 7305,
"version.version": "1",
"version.build": "100"
}
}
I have solved your query.
At first glance, the problem is that the second has_child clause does not filter the child documents by date before applying the function score to them.
I used the following set of documents for this query.
Parent doc:
{
"id": 22466,
"udid": "770CA14ED7FE861EC452",
"name": "Edward's iPhone",
"type": "iPhone7,2",
"app": 347,
"log_enabled": false,
"created_at": "2016-12-21T10:55:02Z",
"last_seen_at": "2017-01-19T10:07:33Z",
"last_log_at": "2017-01-19T11:07:40.756275026+01:00",
"language": "en-US",
"os_version": "9.2",
"timezone": "GMT+1",
"version.id": 7305,
"version.version": "1",
"version.build": "100"
}
child docs
{
"_type": "device_logs",
"_id": "22466_2017.01.17",
"_score": 0,
"_routing": "22466",
"_parent": "22466",
"_source": {
"_": "22466_2017.01.17",
"app": 200,
"date": "2017-01-17T00:00:00Z",
"logs": 660,
"warnings": 238,
"errors": 217
}
},
{
"_type": "device_logs",
"_id": "22466_2017.02.17",
"_score": 0,
"_routing": "22466",
"_parent": "22466",
"_source": {
"_": "22466_2017.02.17",
"app": 200,
"date": "2017-01-17T00:00:00Z",
"logs": 200,
"warnings": 238,
"errors": 217
}
},
{
"_type": "device_logs",
"_id": "22466_2017.02.20",
"_score": 0,
"_routing": "22466",
"_parent": "22466",
"_source": {
"_": "22466_2017.02.20",
"app": 200,
"date": "2017-01-20T00:00:00Z",
"logs": 200,
"warnings": 238,
"errors": 217
}
}
Note: the first must clause only filters the documents returned in inner_hits.
Please use the following query:
{
"query": {
"bool": {
"filter": [{
"term": {
"app": 347
}
}],
"must": [{
"has_child": {
"type": "device_logs",
"inner_hits": {},
"query": {
"bool": {
"filter": {
"range": {
"date": {
"from": "2017-01-15T00:00:00Z",
"include_lower": true,
"include_upper": true,
"to": "2017-01-17T23:59:59Z"
}
}
}
}
}
}
}, {
"has_child": {
"type": "device_logs",
"score_mode": "sum",
"query": {
"function_score": {
"query": {
"bool": {
"filter": {
"range": {
"date": {
"from": "2017-01-15T00:00:00Z",
"include_lower": true,
"include_upper": true,
"to": "2017-01-17T23:59:59Z"
}
}
}
}
},
"score_mode": "sum",
"boost_mode": "sum",
"script_score": {
"script": "_score + doc['logs'].value"
}
}
}
}
}]
}
}
}
For reference: https://github.com/elastic/elasticsearch/issues/10051
This is the response I get with explain set to true:
{
"took": 7,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 861,
"hits": [
{
"_shard": "[array_index1][0]",
"_node": "nnauJDrIS8-QCqicOMF23g",
"_index": "array_index1",
"_type": "device",
"_id": "22466",
"_score": 861,
"_source": {
"id": 22466,
"udid": "770CA14ED7FE861EC452",
"name": "Edward's iPhone",
"type": "iPhone7,2",
"app": 347,
"log_enabled": false,
"created_at": "2016-12-21T10:55:02Z",
"last_seen_at": "2017-01-19T10:07:33Z",
"last_log_at": "2017-01-19T11:07:40.756275026+01:00",
"language": "en-US",
"os_version": "9.2",
"timezone": "GMT+1",
"version.id": 7305,
"version.version": "1",
"version.build": "100"
},
"_explanation": {
"value": 861,
"description": "sum of:",
"details": [
{
"value": 1,
"description": "A match, join value 22466",
"details": []
},
{
"value": 860,
"description": "A match, join value 22466",
"details": []
},
{
"value": 0,
"description": "match on required clause, product of:",
"details": [
{
"value": 0,
"description": "# clause",
"details": []
},
{
"value": 1,
"description": "app:[347 TO 347], product of:",
"details": [
{
"value": 1,
"description": "boost",
"details": []
},
{
"value": 1,
"description": "queryNorm",
"details": []
}
]
}
]
}
]
},
"inner_hits": {
"device_logs": {
"hits": {
"total": 2,
"max_score": 0,
"hits": [
{
"_type": "device_logs",
"_id": "22466_2017.01.17",
"_score": 0,
"_routing": "22466",
"_parent": "22466",
"_source": {
"_": "22466_2017.01.17",
"app": 200,
"date": "2017-01-17T00:00:00Z",
"logs": 660,
"warnings": 238,
"errors": 217
}
},
{
"_type": "device_logs",
"_id": "22466_2017.02.17",
"_score": 0,
"_routing": "22466",
"_parent": "22466",
"_source": {
"_": "22466_2017.02.17",
"app": 200,
"date": "2017-01-17T00:00:00Z",
"logs": 200,
"warnings": 238,
"errors": 217
}
}
]
}
}
}
}
]
}
}
Please verify your results.
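The 860 in the explain output can be reproduced by hand from the three child documents listed in this answer. The sketch below is plain Python, not a query. It only sums the logs field of the children whose date falls in the requested range; the helper name logs_in_range is made up for illustration.

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%SZ"

# The three child documents used in the answer (id, date and logs only).
child_logs = [
    {"id": "22466_2017.01.17", "date": "2017-01-17T00:00:00Z", "logs": 660},
    {"id": "22466_2017.02.17", "date": "2017-01-17T00:00:00Z", "logs": 200},
    {"id": "22466_2017.02.20", "date": "2017-01-20T00:00:00Z", "logs": 200},
]


def logs_in_range(children, start, end):
    """Sum logs over children whose date lies in [start, end]."""
    lo = datetime.strptime(start, FMT)
    hi = datetime.strptime(end, FMT)
    return sum(
        c["logs"]
        for c in children
        if lo <= datetime.strptime(c["date"], FMT) <= hi
    )


filtered = logs_in_range(child_logs, "2017-01-15T00:00:00Z", "2017-01-17T23:59:59Z")
unfiltered = sum(c["logs"] for c in child_logs)
print(filtered)    # 860
print(unfiltered)  # 1060
```

The filtered sum matches the 860 contributed by the second has_child clause; the explain output adds 1 from the first has_child clause to reach the final score of 861.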

Query/Filter by nested objects getting unexpected results in some nested objects

I'm getting weird results on some nested objects that I can't explain.
In the mapping the nested objects are all the same: each has an id and a tree id, both long, and that's it. But profession and category do not return the expected values, while the others do (professionType, professionSubtype and attributeType).
The index is mapped as follows.
Mapping:
{
"attribute-tree": {
"mappings": {
"attribute": {
"dynamic": "strict",
"properties": {
"attributeType": {
"properties": {
"id": {"type": "long"},
"tree": {"type": "long"}
}
},
"category": {
"properties": {
"id": {"type": "long"},
"tree": {"type": "long"}
}
},
"family": {"type": "long"},
"id": {"type": "long"},
"name": {
"type": "string",
"index": "not_analyzed"
},
"parentTree": {"type": "long"},
"profession": {
"properties": {
"id": {"type": "long"},
"tree": {"type": "long"}
}
},
"professionSubtype": {
"properties": {
"id": {"type": "long"},
"tree": {"type": "long"}
}
},
"professionType": {
"properties": {
"id": {"type": "long"},
"tree": {"type": "long"}
}
},
"sorter": {
"properties": {
"id": {"type": "long"},
"name": {
"type": "string",
"index": "not_analyzed"
},
"tree": {"type": "long"}
}
},
"suggester": {
"type": "completion",
"index_analyzer": "edgeNGram_analyzer",
"search_analyzer": "whitespace_analyzer",
"payloads": true,
"preserve_separators": true,
"preserve_position_increments": true,
"max_input_length": 50
},
"tree": {"type": "long"},
"type": {"type": "string"}
}
},
"division": {
// same as "attribute"
},
"profession-subtype": {
// same as "attribute"
},
"profession-type": {
// same as "attribute"
},
"profession": {
// same as "attribute"
},
"category": {
// same as "attribute"
}
}
}
}
If any information is missing, please say so.
Thanks in advance.
Example
Filtering by category.id:
POST /attribute-tree/_search
{
"size": 2,
"query": {
"filtered": {
"filter": {
"term": {"category.id": 1}
}
}
}
}
I get:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 31,
"max_score": 1,
"hits": [
{
"_index": "attribute-tree",
"_type": "category",
"_id": "4064",
"_score": 1,
"_source": {
"id": 1,
"tree": 4064,
"name": "Profession",
"type": "C",
"parentTree": 4063,
"profession": {
"id": null,
"tree": null
},
"professionType": {
"id": 1,
"tree": 1
},
"professionSubtype": {
"id": 6,
"tree": 4063
},
"category": {
"id": null,
"tree": null
},
"attributeType": {
"id": null,
"tree": null
},
"family": [
4063,
1
],
"suggester": {
"input": [
"Profession"
],
"output": "Profession"
},
"sorter": {
"name": "Profession",
"id": 1,
"tree": 4064
}
}
},
{
"_index": "attribute-tree",
"_type": "category",
"_id": "4083",
"_score": 1,
"_source": {
"id": 1,
"tree": 4083,
"name": "Profession",
"type": "C",
"parentTree": 4082,
"profession": {
"id": null,
"tree": null
},
"professionType": {
"id": 2,
"tree": 4072
},
"professionSubtype": {
"id": 8,
"tree": 4082
},
"category": {
"id": null,
"tree": null
},
"attributeType": {
"id": null,
"tree": null
},
"family": [
4082,
4072
],
"suggester": {
"input": [
"Profession"
],
"output": "Profession"
},
"sorter": {
"name": "Profession",
"id": 1,
"tree": 4083
}
}
}
]
}
}
Notice that in this example category.id doesn't have the expected value (the same happens when it is 0 instead of null).
However, I do have nodes with category.id = 1:
POST /attribute-tree/_search
{
"size": 2,
"query": {
"filtered": {
"filter": {
"term": {"tree": 4}
}
}
}
}
Item:
{
"took": 5,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "attribute-tree",
"_type": "profession",
"_id": "4",
"_score": 1,
"_source": {
"id": 1,
"tree": 4,
"name": "A&R Administrator",
"type": "P",
"parentTree": 3,
"profession": {
"id": null,
"tree": null
},
"professionType": {
"id": 1,
"tree": 1
},
"professionSubtype": {
"id": 1,
"tree": 2
},
"category": {
"id": 1,
"tree": 3
},
"attributeType": {
"id": null,
"tree": null
},
"family": [
3,
2,
1
],
"suggester": {
"input": [
"A&R",
"Administrator"
],
"output": "A&R Administrator"
},
"sorter": {
"name": "A&R Administrator",
"id": 1,
"tree": 4
}
}
}
]
}
}
Filtering by professionSubtype.id:
POST /attribute-tree/_search
{
"size": 2,
"query": {
"filtered": {
"filter": {
"term": {"professionSubtype.id": 1}
}
}
}
}
I get:
{
"took": 7,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 3890,
"max_score": 1,
"hits": [
{
"_index": "attribute-tree",
"_type": "category",
"_id": "251",
"_score": 1,
"_source": {
"id": 4,
"tree": 251,
"name": "Medium",
"type": "C",
"parentTree": 2,
"profession": {
"id": null,
"tree": null
},
"professionType": {
"id": 1,
"tree": 1
},
"professionSubtype": {
"id": 1,
"tree": 2
},
"category": {
"id": null,
"tree": null
},
"attributeType": {
"id": null,
"tree": null
},
"family": [
2,
1
],
"suggester": {
"input": [
"Medium"
],
"output": "Medium"
},
"sorter": {
"name": "Medium",
"id": 4,
"tree": 251
}
}
},
{
"_index": "attribute-tree",
"_type": "profession",
"_id": "4",
"_score": 1,
"_source": {
"id": 1,
"tree": 4,
"name": "A&R Administrator",
"type": "P",
"parentTree": 3,
"profession": {
"id": null,
"tree": null
},
"professionType": {
"id": 1,
"tree": 1
},
"professionSubtype": {
"id": 1,
"tree": 2
},
"category": {
"id": 1,
"tree": 3
},
"attributeType": {
"id": null,
"tree": null
},
"family": [
3,
2,
1
],
"suggester": {
"input": [
"A&R",
"Administrator"
],
"output": "A&R Administrator"
},
"sorter": {
"name": "A&R Administrator",
"id": 1,
"tree": 4
}
}
}
]
}
}
Also tested as:
POST /attribute-tree/_search
{
"query":{
"filtered": {
"query": {"match_all":{}},
"filter": {
"nested": {
"path": "category",
"filter": {
"bool": {
"must": [
{"term": {"category.id": 1}}
]
}
}
}
}
}
}
}
But this gives an error:
org.elasticsearch.index.query.QueryParsingException: [attribute-tree] [nested] nested object under path [category] is not of nested type
From the mapping, it looks like the index has a type named category as well as a field named category in the attribute type.
To resolve the field properly and disambiguate between the field id in type category and the field category.id in type attribute, you need to specify the full path to the field including the type, i.e. <type>.<fieldname>.
Example:
POST /attribute-tree/_search
{
"size": 2,
"query": {
"filtered": {
"filter": {
"term": {"attribute.category.id": 1}
}
}
}
}
This issue thread has more discussion with regard to this.
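The ambiguity can be illustrated with a toy model in Python. This is not Elasticsearch's field-resolution code; it only contrasts two ways of reading the string category.id, using two documents reduced from the hits in the question. All function names are hypothetical.

```python
# Two documents reduced from the question's hits.
docs = [
    {"_type": "category", "_id": "4064", "id": 1, "category": {"id": None}},
    {"_type": "profession", "_id": "4", "id": 1, "category": {"id": 1}},
]


def get_path(doc, path):
    """Follow a dotted path through nested dicts; None if it is absent."""
    node = doc
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def match_as_object_path(docs, path, value):
    """Read 'category.id' as field 'id' inside the object 'category'."""
    return [d["_id"] for d in docs if get_path(d, path) == value]


def match_as_type_prefixed(docs, path, value):
    """Read 'category.id' as field 'id' of documents of type 'category'."""
    type_name, _, field = path.partition(".")
    return [
        d["_id"]
        for d in docs
        if d["_type"] == type_name and get_path(d, field) == value
    ]


print(match_as_object_path(docs, "category.id", 1))    # ['4']
print(match_as_type_prefixed(docs, "category.id", 1))  # ['4064']
```

The second reading returns the documents of type category whose top-level id is 1, which is the pattern seen in the unexpected results in the question.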

Elastic search not return all fields

I have an index named news; here is its mapping as stored in Elasticsearch:
{
"news": {
"mappings": {
"newsdetail": {
"properties": {
"active": {
"type": "long"
},
"authorId": {
"type": "long"
},
"creationDate": {
"type": "date",
"format": "dateOptionalTime"
},
"iDate": {
"type": "string"
},
"iTime": {
"type": "string"
},
"isExclusive": {
"type": "boolean"
},
"itemId": {
"type": "long"
},
"languageId": {
"type": "long"
},
"lead": {
"type": "string"
},
"main": {
"type": "long"
},
"mainTitr": {
"type": "long"
},
"mediaType": {
"type": "long"
},
"newsType": {
"type": "long"
},
"pictureId": {
"type": "long"
},
"ref": {
"type": "string"
},
"subMain": {
"type": "long"
},
"subMainTitr": {
"type": "long"
},
"subtitle": {
"type": "string"
},
"text": {
"type": "string"
},
"title": {
"type": "string"
},
"video": {
"type": "string"
},
"viewCount": {
"type": "long"
}
}
}
}
}
}
When I search this index's data, it returns this:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 69612,
"max_score": 1,
"hits": [
{
"_index": "news",
"_type": "newsdetail",
"_id": "AU6Gt1fK36ANCl26_meY",
"_score": 1,
"_source": {
"itemId": 452,
"pictureId": 452,
"authorId": 1,
"languageId": 1,
"active": 1,
"title": "خبر تستی",
"main": 1,
"mainTitr": 1,
"subMain": 1,
"subMainTitr": 1,
"video": "ویدئوی تستی",
"newsType": 3,
"creationDate": "2014-06-09T18:38:47.203",
"viewCount": 201,
"isExclusive": false,
"mediaType": 0
}
}
]
}
}
Some of the fields are missing. What should I do to see all the fields? Note that I have not selected specific fields; my request is GET news/_search?pretty=true&size=1

Resources