Elasticsearch - Conditional fields or naming alias?

In my Elasticsearch setup I want to search a large collection of payments (billions), where each payment has multiple account names. This is because each user can assign a personal name to each account, while an account can be shared by many users: 1 account, many names (1 per user).
Simplified payment structure:
Payment {
  "agreementNumber": 12345,
  "accountNumber": 123456789,
  "amount": 17,
  "currency": "EUR",
  "accountName": ...
}
The user needs to be able to search for and sort on account names; however, the result set must differ per user.
E.g. if Bob and Lisa both have access to the same account but have each given it their own name, sorting on account name would make the payments appear in a different sequence for each of them. The rest of the payment details, however, remain unchanged. This example could be repeated for several thousand users who have created their own names.
Considerations I have made:
Flattened or inner object
I am unable to flatten the structure of the payment to contain all possible account names, and only use the account name for the specific user given the context. That would mean that all payments made with said account number would need to contain all names, and all payments with said number would need to be updated every time an alias is created, updated or deleted.
Nested
Here I would create one collection of names that I refer to in my parent. This would reduce storage, as I only maintain the account names once. It comes with the limitation that an update to my nested element (the list of account names) would trigger a reindex of all parents (payments) as well. As one account number can be part of thousands to millions of payments, this would be very expensive.
Parent/Child relationship
Having account names as children of the parent means that I can update the child and parent independently, negating the drawback of nesting. However, Elasticsearch doesn't support joins as far as I understand, meaning I would get payments and account names back as individual documents.
How do I structure my account names without it being crazy expensive?
Note
The limitations mentioned stem from this post:
https://www.elastic.co/blog/managing-relations-inside-elasticsearch
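For the parent/child consideration above, where payments and account names come back as individual documents, the combining step would have to happen in the application. The following is only a minimal in-memory sketch of that idea, not an Elasticsearch query: the `account_names` lookup, the user names and the alias values are all made up for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-ins for two separately stored document sets.
payments = [
    {"agreementNumber": 12345, "accountNumber": 123456789, "amount": 17, "currency": "EUR"},
    {"agreementNumber": 12345, "accountNumber": 987654321, "amount": 5, "currency": "EUR"},
]

# One alias per (user, account); renaming an account touches one small record,
# never the payments themselves.
account_names: Dict[Tuple[str, int], str] = {
    ("bob", 123456789): "Savings",
    ("bob", 987654321): "Bills",
    ("lisa", 123456789): "Apartment",
    ("lisa", 987654321): "Holiday fund",
}


def payments_for_user(user: str, payments: List[dict],
                      names: Dict[Tuple[str, int], str]) -> List[dict]:
    """Attach the user's own account name to each payment and sort by it."""
    joined = [
        {**p, "accountName": names.get((user, p["accountNumber"]), "")}
        for p in payments
    ]
    return sorted(joined, key=lambda p: p["accountName"].lower())
```

Bob and Lisa get the same two payments in a different order, and the stored payments are never modified. Note that this only sorts whatever set of payments the application has already fetched; it does not by itself solve sorting across billions of documents.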

Related

How do I satisfy business requirements across microservices with immediate consistency?

Let’s assume I’m in the context of an admin panel for a webshop. I have a list of orders. Those orders are paid for and are ready to ship. The (admin) user would like to start making shipments based on the items ordered.
Imagine there are 2 microservices: one for orders and one for shipments. In order to create a shipment, I will send a request with a couple of items to be shipped and an order ID to the shipment service. The shipment service will then check whether the items are present in the order by querying the order service, because I don’t want to create a shipment with items that are not present in the order.
I’d like to have immediate consistency because the shipment data will be sent to a third-party application after creation. It also feels weird to allow shipments to be created if the data is not correct.
I’m also using GraphQL mutations, which means I have to return the updated state to the user. That also makes eventual consistency a lot harder.
What is the recommended approach for these situations? Could this be a sign that these 2 microservices need to be merged? I can imagine this situation can occur multiple times.
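The check described in the question, where the shipment service asks the order service for the order's items before creating anything, could look roughly like the sketch below. It is not a recommendation for either architecture; `fetch_order_items`, `ShipmentRejected` and the shape of the returned shipment are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict


class ShipmentRejected(Exception):
    """Raised when the requested items are not covered by the order."""


def validate_shipment(order_items: Dict[str, int], requested: Dict[str, int]) -> None:
    """Reject unless every requested item is in the order with enough quantity."""
    problems = []
    for item_id, qty in requested.items():
        available = order_items.get(item_id, 0)
        if qty <= 0 or qty > available:
            problems.append(f"{item_id}: requested {qty}, order has {available}")
    if problems:
        raise ShipmentRejected("; ".join(problems))


def create_shipment(order_id: str, requested: Dict[str, int],
                    fetch_order_items: Callable[[str], Dict[str, int]]) -> dict:
    """fetch_order_items stands in for a synchronous call to the order service."""
    validate_shipment(fetch_order_items(order_id), requested)
    # Only reached when the order service confirmed the items, so the
    # mutation can return the new state immediately.
    return {"orderId": order_id, "items": dict(requested), "status": "CREATED"}
```

Because the validation is a blocking call, the shipment service depends on the order service being available at creation time; that coupling is the trade-off the question is asking about.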

Need help in choosing right caching strategy

We are planning to store price data in Memcache. Prices depend on car variant and location (city). This is how the data is stored in the database:
variant, city, price
21, 48, 40000
Now the confusion is how to store this data in Memcache.
Possibility 1: We store each price in a separate cache object and do a multiget when the prices of all variants belonging to a model need to be displayed on a single page.
Possibility 2: We store prices at the model, city level. Prices of all variants of a model will be stored in a single object. This object will be slightly heavy, but a multiget wouldn't be required.
We need your help in making the right decision.
TLDR: It all depends on how you want to expose the feature to your end users, and what the query pattern looks like.
For example:
If your flow is that a user can see all the variant prices on a detail page for a city, then you could use <city_id>_<car_model_id> as the key, and store all data for variants against that key (Possibility 2).
If the flow is that a user can see prices of all variants across cities on a single page, then you would need <car_model_id> as the key and store all data as JSON against this key.
If the flow is that a user can see prices of one variant at a time only for every city, then you would use the key <city_id>_<car_variant_id> and store prices.
One thing to definitely keep in mind is the frequency with which you may have to refresh the cache or perform upserts, which in the case of cars should be low (who changes the price of a car every day or every second?). So, I would go with the first option above (Possibility 2 as described by you).
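To make the two key layouts concrete, here is a small sketch that builds the cache entries for each possibility from database rows. It uses plain dicts rather than a Memcache client, and the extra rows and the `model_of` variant-to-model lookup are assumptions added for illustration.

```python
import json
from collections import defaultdict

# Rows as stored in the database: (variant_id, city_id, price).
rows = [(21, 48, 40000), (22, 48, 45000), (21, 50, 41000)]
# Hypothetical lookup from variant to the model it belongs to.
model_of = {21: 7, 22: 7}


def per_variant_entries(rows):
    """Possibility 1: one cache entry per <city_id>_<car_variant_id>."""
    return {f"{city}_{variant}": str(price) for variant, city, price in rows}


def per_model_entries(rows, model_of):
    """Possibility 2: one JSON entry per <city_id>_<car_model_id>."""
    grouped = defaultdict(dict)
    for variant, city, price in rows:
        grouped[f"{city}_{model_of[variant]}"][str(variant)] = price
    return {key: json.dumps(prices, sort_keys=True) for key, prices in grouped.items()}
```

With the first layout a model page needs one key per variant (a multiget); with the second it needs a single key, at the cost of rewriting the whole JSON value when any one variant's price changes.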

Elasticsearch the best way to design multiple one to many and many to many

I have two scenarios that I want to support, but I don’t know the best way to design relations in Elasticsearch. I read the entire Elasticsearch documentation, but I couldn’t find the best way to design types for my scenarios.
Multiple one to many.
Let’s assume that I have the following tables in my relational database that I want to transfer to the elasticsearch:
Transaction table: Id, User1Id, User2Id, …
User table: Id, Name
Transaction contains two references to User. As far as I know, I cannot use the parent->child relation with two parents. I need to store transactions and users in separate types because they can be changed separately. I need to be able to search transactions by user details and return the users connected with the transactions. Any idea how to design such a structure in Elasticsearch?
Many to many
Let’s assume that we have the following tables:
Order: Id, …
OrderLine: OrderId, UserId, Amount, …
User: Id, Name
An order line is always saved with the order, so I thought that I could store the order with its order lines as a nested object relation, but the user must be in a separate table. Is there any way I can connect multiple users from order lines with the user type? I assume that I can use an application-side join, but I need to always retrieve the order and its order lines together and be able to search orders by user data.
I can use grandparent and grandchildren relations, but then I need to make joins in the application. Any idea how to design it in the best way?
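The application-side join mentioned in the question could be done in two steps: find the matching users first, then search orders whose nested order lines reference those user IDs. The sketch below only builds the request bodies as Python dicts and sends nothing. The field names `orderLines`, `userId`, `amount` and `name` are assumptions, and the mapping uses the typeless format of recent Elasticsearch versions rather than the types the question refers to.

```python
from typing import List

# Assumed index body: order lines stored as nested objects inside the order.
ORDER_INDEX_BODY = {
    "mappings": {
        "properties": {
            "orderLines": {
                "type": "nested",
                "properties": {
                    "userId": {"type": "keyword"},
                    "amount": {"type": "double"},
                },
            }
        }
    }
}


def users_by_name_query(name: str) -> dict:
    """Step 1: search the separate user index by name."""
    return {"query": {"match": {"name": name}}}


def orders_by_user_ids_query(user_ids: List[str]) -> dict:
    """Step 2: find orders with at least one order line for any of the users."""
    return {
        "query": {
            "nested": {
                "path": "orderLines",
                "query": {"terms": {"orderLines.userId": user_ids}},
            }
        }
    }
```

Users stay in their own index and can change independently, while each order is still returned together with its order lines. The cost is the extra round trip and a limit on how many user IDs can reasonably be passed to the second query.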

How can I distinguish between students and teachers?

Using the Google Classroom API method userProfile, I can get various information about a user, including their name and email address, but not whether they are a student or teacher. How can I determine whether a user is a student or teacher?
Classroom does have the concept of teachers and students, however the distinction between teachers and students is only meaningful relative to a particular course (it’s possible for a user to be a “teacher” of one course and a “student” of another) and so you might not be able to use these categories to apply access controls in the way you were expecting.
For example, if alice@school.edu is a member of a particular course’s courses.teachers collection, and bob@school.edu is a member of courses.students, then you can use this information to decide that bob@school.edu should not see certain content created by alice@school.edu. (For example, you might not want to show Bob the answers to a quiz that Alice has created on your website, just the questions.)
However, because by default all users can create courses, you probably do not want to show alice@school.edu sensitive information created by teachers of other courses or information intended for teachers that you provide (for example, if you are a textbook publisher), or to give her domain-wide admin features.
If you need to distinguish between “real-world” teachers and students, we recommend that you do this via a mechanism entirely separate from Classroom, such as checking that the user’s email address appears in:
a separately-maintained list of teachers (e.g. CSV uploaded by admin)
the classroom_teachers group – domain administrators can choose to verify teachers to allow them to create new classes (use the Directory API to list a user’s groups)
The Classroom API doesn't provide a global role for a teacher or a student; the role varies from course to course, so you can just call the student/teacher API.
After that you will get JSON output, in which you will find a special permission for teachers, "Create Course". It will help you recognize that the person is a teacher.
"permissions": [
{
"permission": "CREATE_COURSE"
}
]
In the case of a student, this array will be null.
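Both checks described in these answers can be sketched on already-fetched data. The functions below take plain Python values (a user profile parsed from JSON, and collections of email addresses for one course) and make no API calls; the function names and the way the rosters are passed in are assumptions for illustration.

```python
from typing import Iterable, Optional


def can_create_courses(user_profile: dict) -> bool:
    """True if the profile's permissions array contains CREATE_COURSE.

    A missing or null array (the student case described above) yields False.
    """
    permissions = user_profile.get("permissions") or []
    return any(p.get("permission") == "CREATE_COURSE" for p in permissions)


def role_in_course(email: str, teacher_emails: Iterable[str],
                   student_emails: Iterable[str]) -> Optional[str]:
    """Role of a user relative to one course, or None if not a member."""
    if email in set(teacher_emails):
        return "teacher"
    if email in set(student_emails):
        return "student"
    return None
```

As the first answer points out, neither result identifies a "real-world" teacher on its own: any user who is allowed to create courses carries the permission, and a course role says nothing about other courses.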

CLP and ACL for multi-tenant app in Parse.com

Imagine a website that aggregates online ordering for many restaurants and is built using parse.com.
In parse.com there is a class called Order where all of the orders are stored.
Each order belongs to one, and only one, restaurant.
When querying the Order class, each restaurant can only read (and write) its own orders. A restaurant should not see (or write) orders for other restaurants.
To solve this, I've tried using one role per restaurant and adding the restaurant role to the ACL of each of that restaurant's orders. So I've created one role for each of the restaurants using the following naming scheme: Restaurant-[restaurantObjectId].
I have taken care that users belong to their respective restaurant role.
I've also fiddled with Class Level Permissions (CLPs) without results: either total access or total lack of access, never access limited to the restaurant's data.
Any clues?
It seems that one has to make the Find operation available to the Public. Otherwise it gives the not authorized error.
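The role-per-restaurant scheme from the question can be sketched as the ACL value each order would carry. This assumes the `role:<roleName>` key format used in Parse's JSON representation of ACLs; the helper names are hypothetical and nothing is sent to Parse.

```python
def restaurant_role_name(restaurant_object_id: str) -> str:
    """Role name following the Restaurant-[restaurantObjectId] scheme."""
    return f"Restaurant-{restaurant_object_id}"


def order_acl(restaurant_object_id: str) -> dict:
    """ACL granting read and write to the restaurant's role only.

    There is deliberately no "*" (public) entry, so the object itself
    stays invisible to users outside the role.
    """
    role_key = f"role:{restaurant_role_name(restaurant_object_id)}"
    return {role_key: {"read": True, "write": True}}
```

With ACLs like this on every order, opening Find at the class level lets the query run while the per-object ACL still filters the results down to the restaurant's own orders.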
