Elasticsearch - Mapping fields from other indices - elasticsearch

How can I define mapping in Elasticsearch 7 to index a document with a field value from another index? For example, if I have a users index which has a mapping for name, email and account_number but the account_number value is actually in another index called accounts in field number.
I've tried something like this without much success (I only see "name", "email" and "account_id" in the results):
PUT users/_mapping
{
"properties": {
"name": {
"type": "text"
},
"email": {
"type": "text"
},
"account_id": {
"type": "integer"
},
"accounts": {
"properties": {
"number": {
"type": "text"
}
}
}
}
}
The accounts index has the following mapping:
{
"properties": {
"name": {
"type": "text"
},
"number": {
"type": "text"
}
}
}

As I understand it, you want to implement field joining as is usually done in relational databases. In elasticsearch, this is possible only if the documents are in the same index. (Link to doc). But it seems to me that in your case you need to work differently, I think your Account object needs to be nested for User.
PUT /users/_mapping
{
"mappings": {
"properties": {
"account": {
"type": "nested"
}
}
}
}
You can further search as if it were a separate document.
GET /users/_search
{
"query": {
"nested": {
"path": "account",
"query": {
"bool": {
"must": [
{ "match": { "account.number": 1 } }
]
}
}
}
}
}

Related

Elasticsearch multiple index query

I have a following index which stores the course details (I have truncated some attributes for brevity):
{
"settings": {
"index": {
"number_of_replicas": "1",
"number_of_shards": "1"
}
},
"aliases": {
"course": {
}
},
"mappings": {
"properties": {
"name": {
"type": "text"
},
"id": {
"type": "integer"
},
"max_per_user": {
"type": "integer"
}
}
}
}
Here max_per_user is number of times a user can complete the course. A user is allowed through a course multiple times but not more than max_per_user for a course
I want to track user interactions with courses. I have created following index to track interaction events. event_type_id represents a type of interaction
{
"settings": {
"index": {
"number_of_replicas": "1",
"number_of_shards": "1"
}
},
"aliases": {
"course_events": {
}
},
"mappings": {
"properties": {
"user_progress": {
"dynamic": "true",
"properties": {
"current_count": {
"type": "integer"
},
"user_id": {
"type": "integer"
},
"events": {
"dynamic": "true",
"properties": {
"event_type_id": {
"type": "integer"
},
"event_timestamp": {
"type": "date",
"format": "strict_date_time"
}
}
}
}
},
"created_at": {
"type": "date",
"format": "strict_date_time"
},
"course_id": {
"type": "integer"
}
}
}
}
Where current_count is number of times the user has gone through the complete course
Now when I run a search on course index, I also want to be able to pass in the user_id and get only those courses where the current_count for the given user is less than max_per_user for the course
My search query for course index is something like this (truncated some filters for brevity). This query is executed when a user searches for a course, so basically at the time of executing this I will have user_id.
{
"sort": [
{
"id": "desc"
}
],
"query": {
"bool": {
"filter": [
{
"range": {
"end_date": {
"gte": "2020-09-28T12:27:55.884Z"
}
}
},
{
"range": {
"start_date": {
"lte": "2020-09-28T12:27:55.884Z"
}
}
}
],
"must": [
{
"term": {
"is_active": true
}
}
]
}
}
}
I am not sure how to construct my search query such that I am able to filter out courses where max_per_user has been achieved for a given user_id.
If I understood the question correctly you want to find the courses where max_per_user limit isn't exceeded. My answer is on the same basis:
Considering your current Schema way to find what you want is:
For the given user_id find all the course_ids and their corresponding completion count
Using the data fetched in #1 find out the courses where-in max_per_user limit is not exceeded.
Now comes the problem:
In a relational database such use case can be solved using table join and checks
Elastic Search doesn't support joins and can't be done here.
Poor solution with current schema:
For each course check whether it is applicable or not. For n courses number of queries to E.S will be proportional to N.
Solution with current schema:
With-in the user-course-completion index (second index you mentioned), track max_per_user as well and use a simple query like below, to get the required course ids :
{
"size": 10,
"query": {
"script": {
"script": "doc['current_usage'].value<doc['max_per_user'].value &&
doc['u_id'].value==1" // <======= 1 is the user_id here
}
}
}

nested terms aggregation on object containing a string field

I like to run a nested terms aggregation on string field which is inside an object.
Usually, I use this query
"terms": {
"field": "fieldname.keyword"
}
to enable fielddata
But I am unable to do that for a nested document like this
{
"nested": {
"path": "objectField"
},
"aggs": {
"allmyaggs": {
"terms": {
"field": "objectField.fieldName.keyword"
}
}
}
}
The above query is just returning an empty buckets array
Is there a way this can be done without enabling field-data by default during index mapping.
Since that will take a large heap memory and I have already loaded a huge data without it
document mapping
{
"mappings": {
"properties": {
"productname": {
"type": "nested",
"properties": {
"productlineseqno": {
"type": "text"
},
"invoiceitemname": {
"type": "text"
},
"productlinename": {
"type": "text"
},
"productlinedescription": {
"type": "text"
},
"isprescribable": {
"type": "boolean"
},
"iscontrolleddrug": {
"type": "boolean"
}
}
}
sample document
{
"productname": [
{
"productlineseqno": "1.58",
"iscontrolleddrug": "false",
"productlinename": "Consultations",
"productlinedescription": "Consultations",
"isprescribable": "false",
"invoiceitemname": "invoice name"
}
]
}
Fixed
By changing the mapping to enable field data
Nested query is used to access nested fields similarly nested aggregation is needed to aggregation on nested fields
{
"aggs": {
"fieldname": {
"nested": {
"path": "objectField"
},
"aggs": {
"fields": {
"terms": {
"field": "objectField.fieldname.keyword",
"size": 10
}
}
}
}
}
}
EDIT1:
If you are searching for productname.invoiceitemname.keyword then it will give empty bucket as no field exists with that name.
You need to define your mapping like below
{
"mappings": {
"properties": {
"productname": {
"type": "nested",
"properties": {
"productlineseqno": {
"type": "text"
},
"invoiceitemname": {
"type": "text",
"fields":{ --> note
"keyword":{
"type":"keyword"
}
}
},
"productlinename": {
"type": "text"
},
"productlinedescription": {
"type": "text"
},
"isprescribable": {
"type": "boolean"
},
"iscontrolleddrug": {
"type": "boolean"
}
}
}
}
}
}
Fields
It is often useful to index the same field in different ways for
different purposes. This is the purpose of multi-fields. For instance,
a string field could be mapped as a text field for full-text search,
and as a keyword field for sorting or aggregations:
When mapping is not explicitly provided, keyword fields are created by default. If you are creating your own mapping(which you need to do for nested type), you need to provide keyword fields in mapping, wherever you intend to use them

How can I get parent with all children in one query

I have following mapping:
PUT /test_products
{
"mappings": {
"_doc": {
"properties": {
"type": {
"type": "keyword"
},
"name": {
"type": "text"
},
"entity_id": {
"type": "integer"
},
"weighted": {
"type": "integer"
}
"product_relation": {
"type": "join",
"relations": {
"window": "simple"
}
}
}
}
}
}
I want to get "window" products with all "simple"s but only where one or more "simple"s have property "weighted" = 1
I wrote following query:
GET test_products/_search
{
"query": {
"has_child": {
"type": "simple",
"query": {
"term": {
"weighted": 1
}
},
"inner_hits": {}
}
}
}
But I've got "window"s with "simple"s which are match to the term. In other words I want to filter "window"s list by "simple"'s option and get all matched "window"s with all their "simple"s. Is it possible without "nested" in one query? Or I have to do some queries?
OK. Luckily, I need to get only one "window" product with all it's children by it's ID, so I found parent_id query which can helps me with this task.
Now I have following query:
GET test_products/_search
{
"query": {
"parent_id": {
"type": "simple",
"id": "window-1"
}
}
}
Unfortunately, I have to execute 2 queries (has_child and then parent_id) instead of one but it's OK for me.

Search for documents in elasticsearch and then query the nested fields

I have an index like this:
{
"rentals": {
"aliases": {},
"mappings": {
"rental": {
"properties": {
"address": {
"type": "text"
},
"availability": {
"type": "nested",
"properties": {
"chargeBasis": {
"type": "text"
},
"date": {
"type": "date"
},
"isAvailable": {
"type": "boolean"
},
"rate": {
"type": "double"
}
}
}
}
And this is my use case:
I need to search for all the "rentals" that have a given address.
This is easy and done
I need to get "availability" data for all those "rentals" searched; only for today's date.
This is the part where I'm stuck at, how do I query the nested documents of all the "rentals"?
You need to use the nested query:
Because nested objects are indexed as separate hidden documents, we can’t query them directly. Instead, we have to use the nested query to access them.
Try something like:
{
"query": {
"nested": {
"path": "availability",
"query": {
"term": {
"availability.date": "2015-01-01"
}
}
}
}
}

Unable to find a field mapper for field in nested query using field_value_factor

Here's the mapping:
PUT books-index
{
"mappings": {
"books": {
"properties": {
"tags": {
"type": "nested",
"fields": {
"name": {
"type": "string"
},
"weight": {
"type": "float"
}
}
}
}
}
}
}
Then doing a nested query using a field_value_factor fails with an error
GET books-index/books/_search
{
"query": {
"nested": {
"path": "tags",
"score_mode": "sum",
"query": {
"function_score": {
"query": {
"match": {
"tags.name": "world"
}
},
"field_value_factor": {
"field": "weight"
}
}
}
}
}
}
The error: "nested: ElasticsearchException[Unable to find a field mapper for field [weight]]"
Interestingly, if there's one book in the index with tags - there's no error and the query works well.
Why is this happening? how can I prevent the error when there are no books with tags in the index?
Any ideas?
Thank you!
P.S. There's also an issue on github for this: https://github.com/elastic/elasticsearch/issues/12871
it looks like your mapping is incorrect.
After PUTing the mapping you provided, try executing GET books-index/_mapping, It will show these results:
"books-index": {
"mappings": {
"books": {
"properties": {
"tags": {
"type": "nested"
}
}
}
}
}
It's missing name and weight! The problem with the mapping is that you used either you used fields instead of properties, or you forget to put in a second properties key.
I modified your mapping to reflect that you were looking for a nested name and type within tags, as it looks like that is what your query wants.
PUT books-index
{
"mappings": {
"books": {
"properties": {
"tags": {
"type": "nested",
"properties": { // <--- HERE!
"name": {
"type": "string"
},
"weight": {
"type": "float"
}
}
}
}
}
}
}

Resources