Design of a car booking application using Elasticsearch

I need some help designing a car booking application.
There is a document with information about each car (title, model, brand, info, etc.).
The problems I'm stuck on are:
1) How to store the available booking days? (I suppose I could use an array of nested free date range objects.)
2) How to store the price per day? (Each day can have its own price.)
3) Booking days and prices could change often, so how can I update them partially, without having to read the document and then store it again? I'm looking at a scripted solution using the update API (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-update.html), but it looks ugly. Maybe there are other approaches?
Thanks,
Alex

With the introduction of the range datatypes, there is no need to use a real nested object, if that is what you meant.
That might also help you with storing the prices, although those could be any object, I suppose (it depends on whether you want to search on them as well).
The update API was made for exactly this use case, so that you do not need to fetch the whole document. That sounds like a plan.
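To make that a bit more concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the field name "available", the mapping, and the helper names are made up, and the code only builds request bodies as plain dicts rather than calling a client. It also assumes the application knows the current free ranges from its own primary store.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)

# Hypothetical mapping: "available" is a multi-valued date_range field,
# so each car document can hold several free ranges without nested objects.
CAR_MAPPING = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},
            "brand": {"type": "keyword"},
            "available": {"type": "date_range", "format": "yyyy-MM-dd"},
        }
    }
}


def book(free_ranges, start, end):
    """Cut the inclusive booking [start, end] out of inclusive free ranges."""
    remaining = []
    for lo, hi in free_ranges:
        if end < lo or start > hi:
            # No overlap with this free range: keep it unchanged.
            remaining.append((lo, hi))
            continue
        if lo < start:
            # Days before the booking stay free.
            remaining.append((lo, start - ONE_DAY))
        if end < hi:
            # Days after the booking stay free.
            remaining.append((end + ONE_DAY, hi))
    return remaining


def partial_update_body(free_ranges):
    """Body for a partial update; only the "available" field is sent."""
    return {
        "doc": {
            "available": [
                {"gte": lo.isoformat(), "lte": hi.isoformat()}
                for lo, hi in free_ranges
            ]
        }
    }


free = [(date(2024, 1, 1), date(2024, 1, 31))]
free = book(free, date(2024, 1, 10), date(2024, 1, 12))
body = partial_update_body(free)
```

The resulting body would be sent to the update endpoint for the car's document ID. Note that a partial "doc" update replaces an array field as a whole, so the full list of remaining ranges is sent each time.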

Related

Elasticsearch read model sync with write model

My application follows the CQRS strategy and separates the read model from the write model. I have a Product and multiple purchase orders related to that Product.
The PurchaseOrder read model is in Elasticsearch, with the product name attached. If I change the product name in the write model, I need to update the productName field of every related PurchaseOrder in the read model (using Elasticsearch's bulk update API).
My question is: as I have millions of PurchaseOrders, will this productName sync be a performance issue? Do you have any suggestions for modeling this kind of syncing?
Although I do not believe that changing a product name on existing orders is a good idea (the invoice might have been generated and the product name in the order should match the one in the invoice), the question still has merit.
You may want your PurchaseOrder to only keep the ID (and perhaps the version?) of the Product, so that you can avoid such a mass update. This approach, on the other hand, requires a call to the Product aggregate root every time you want to translate the ID of the product in its own name. The impact of such a read can obviously be mitigated by using a cache.
I guess it really depends on how often each of these two situations occurs (product renames versus order reads); I would optimize for the more frequent one.
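A rough Python sketch of the "keep only the ID and resolve the name through a cache" idea might look like this. All names here (ProductNameCache, load_name, render_order) are hypothetical, and a plain dict stands in for whatever cache you would really use.

```python
from typing import Callable, Dict


class ProductNameCache:
    """Read-through cache in front of the Product side.

    `load_name` is a hypothetical function that asks the Product
    aggregate (or its read model) for the current name.
    """

    def __init__(self, load_name: Callable[[str], str]) -> None:
        self._load_name = load_name
        self._names: Dict[str, str] = {}
        self.misses = 0  # counts how often the Product side was called

    def name_for(self, product_id: str) -> str:
        if product_id not in self._names:
            self.misses += 1
            self._names[product_id] = self._load_name(product_id)
        return self._names[product_id]

    def invalidate(self, product_id: str) -> None:
        # Called when a rename event for this product is observed.
        self._names.pop(product_id, None)


def render_order(order: dict, cache: ProductNameCache) -> dict:
    """Attach the product name at read time instead of storing it per order."""
    return {**order, "productName": cache.name_for(order["productId"])}
```

With this shape, a rename costs one cache invalidation rather than millions of document updates, at the price of one lookup per product on the read path.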

Need help in choosing right caching strategy

We are planning to store price data in Memcache. Prices depend on the car variant and the location (city). This is how the data is stored in the database:
variant, city, price
21, 48, 40000
Now we are unsure how to store this data in Memcache.
Possibility 1: We store each price in a separate cache object and do a multiget when the prices of all variants belonging to a model need to be displayed on a single page.
Possibility 2: We store prices at the model and city level. The prices of all variants of a model are stored in a single object. This object will be slightly heavier, but no multiget is required.
We need your help in making the right decision.
TLDR: It all depends on how you want to expose the feature to your end users, and what the query pattern looks like.
For example:
If your flow is that a user can see all the variant prices on a detail page for a city, then you could use <city_id>_<car_model_id> as the key, and store all data for variants against that key (Possibility 2).
If the flow is that a user can see prices of all variants across cities on a single page, then you would use <car_model_id> as the key and store all the data as JSON against this key.
If the flow is that a user can see prices of one variant at a time only for every city, then you would use the key <city_id>_<car_variant_id> and store prices.
One thing to definitely keep in mind is how often you may have to refresh the cache or perform upserts, which in the case of cars should be infrequent (who changes the price of a car every day, let alone every second?). So I would go with the first option above (Possibility 2 as you described it).
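As a small illustration of that first option, here is a Python sketch. A plain dict stands in for the Memcache client, the model ID 7 and the second variant row are made-up sample values, and the helper names are hypothetical.

```python
import json


def model_city_key(city_id: int, model_id: int) -> str:
    """Key in the <city_id>_<car_model_id> format described above."""
    return f"{city_id}_{model_id}"


def pack_prices(rows) -> str:
    """Serialize (variant_id, price) rows for one model and city as JSON."""
    return json.dumps({str(variant_id): price for variant_id, price in rows})


# Stand-in for a Memcache client: one entry per (city, model).
cache = {}
cache[model_city_key(48, 7)] = pack_prices([(21, 40000), (22, 45000)])

# A detail page for city 48 and model 7 needs a single get, no multiget.
prices = json.loads(cache[model_city_key(48, 7)])
```

When one variant's price changes, the whole entry for that model and city is rewritten, which is acceptable as long as price changes stay infrequent.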

How to create a quick search in CRM that spans multiple entities with grouped conditions

We are a housing association with a large CRM system (2016 & SP1). We have a new requirement: our users must be able to search for people who are current (i.e. not previous) occupants or residents, or who are not residents at all (e.g. contractors).
For this purpose, we need to search the Person entity which has a related Tenancy entity. Person has TenancyType field with possible (option set) values Occupant, Resident, Contractor. Tenancy has TenancyStatus field with possible (text) values Current and Previous.
We tried using the following filter criteria in the quick view on the Person entity:
thinking that it would return all people who are not previous residents. However we noticed that it would filter out contractors because contractors do not have related tenancy records.
We needed to change the criteria to return all contractors OR all residents and occupants with no previous tenancy. So we changed it to the following:
at which point we got stuck because we noticed that it was not possible to AND together the second and the third conditions as the third one is a related entity.
We are wondering what the best way to achieve the above is, bearing in mind that we do not want a separate view for each condition (e.g. one for residents, one for non-residents, etc.).
Any help or suggestion is greatly appreciated.
It is not possible to do this with a single query.
Instead, you can use two queries. If you do not want to do that, then using reports (as suggested by Alex) or a BI-solution would be other possibilities.
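To show what "two queries" could mean in practice, here is a Python sketch that runs both conditions separately and merges the results by person ID. It works on in-memory records rather than the CRM itself; the record layout and function names are assumptions, and "current" is interpreted as having at least one tenancy with status Current.

```python
def contractors(people):
    """First query: everyone whose TenancyType is Contractor."""
    return [p for p in people if p["tenancy_type"] == "Contractor"]


def current_tenants(people, tenancies):
    """Second query: occupants and residents with a Current tenancy."""
    current_ids = {t["person_id"] for t in tenancies if t["status"] == "Current"}
    return [
        p for p in people
        if p["tenancy_type"] in ("Occupant", "Resident") and p["id"] in current_ids
    ]


def search(people, tenancies):
    """Union of both result sets, de-duplicated by person ID."""
    merged = {p["id"]: p for p in contractors(people)}
    merged.update({p["id"]: p for p in current_tenants(people, tenancies)})
    return sorted(merged.values(), key=lambda p: p["id"])
```

The merge step is what the single quick view cannot express, because the second condition depends on the related Tenancy entity.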
Thanks to everyone here who spent time answering my question. The following describes the correct answer:
https://community.dynamics.com/crm/f/117/p/241352/666651#666651

Event Sourcing - Aggregate modeling

Two questions
1) How to model aggregate and reference between them
2) How to organise/store events so that they can be retrieved efficiently
Take this typical use case as an example: we have Order and LineItem (they are an aggregate, Order is the aggregate root), and a Product aggregate.
LineItem needs to know which Product it refers to, so there are two options: 1) LineItem has a direct reference to the Product aggregate (which does not seem to be best practice, as it violates the idea of an aggregate being a consistency boundary, because we could update the Product aggregate directly from the Order aggregate); 2) LineItem only has a ProductId.
It looks like 2nd option is the way to go...What do you think here?
However, another problem arises, which is about building an Order read/view model. This Order view model needs to know which Products are in the Order (i.e. ProductId, Type, etc.). The typical use case is reporting, and CommandHandler also can use this Product object to perform logic such as whether there are too many particular products, etc. To do this, given that the data lives in two separate aggregates, we need more than one database round trip. As we are using events to build the model, the pseudo code looks like this:
1) for a given order id (guid, order aggregate id), we load all the events for it; -- 1st database access
2) then build an Order aggregate, so we know which ProductIds are referenced in the Order;
3) for the list of ProductIds, we load all the events for them; -- 2nd database access
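The three steps above could be sketched in Python roughly as follows. The in-memory EventStore, its load_many batch method, and the LineItemAdded event shape are all assumptions made for illustration; the read counter is only there to make the number of round trips visible.

```python
from collections import defaultdict


class EventStore:
    """In-memory stand-in for an event store, counting round trips."""

    def __init__(self):
        self._streams = defaultdict(list)
        self.reads = 0

    def append(self, aggregate_id, event):
        self._streams[aggregate_id].append(event)

    def load(self, aggregate_id):
        self.reads += 1  # one database access
        return list(self._streams[aggregate_id])

    def load_many(self, aggregate_ids):
        self.reads += 1  # assumed: one batched access for all ids
        return {i: list(self._streams[i]) for i in aggregate_ids}


def product_ids_in_order(order_events):
    """Step 2: replay the order's events to find the referenced products."""
    return [e["product_id"] for e in order_events if e["type"] == "LineItemAdded"]


def load_order_with_products(store, order_id):
    order_events = store.load(order_id)              # step 1
    product_ids = product_ids_in_order(order_events)  # step 2
    product_events = store.load_many(product_ids)     # step 3
    return order_events, product_events
```

If the store can batch the second load like this, the cost stays at two accesses per aggregate type involved, rather than one per product.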
If we build a really big graph of objects (a lot of different aggregates), then this may end up requiring a few more database accesses (each of which is slow)... What's your idea here?
Thanks
"Take this typical use case as example, we have Order and LineItem (they are an aggregate, Order is the aggregate root), and Product aggregate."
The Order aggregate makes sense the way you have described it. "Product aggregate" is more suspicious; do you ask the model if the product is allowed to change, or are you telling the model that the product has changed?
If Product can change without first consulting with the order, then the LineItem must not include the product. A reference to the product (aka the ProductId) is ok.
"If we build a really big graph of objects (a lot of different aggregates), then this may end up with a few more database access (each of which is slow)...What's your idea in here?"
For reads, reports, and the like -- where you aren't going to be adding new events to the history -- one possible answer is to do the slow work in advance. An asynchronous process listens for writes in the event store, and then publishes those events to a bus. Subscribers build new versions of the reports when new events are observed, and cache the results. (search keyword: cqrs)
When a client asks for a report, you give them one out of the cache. All the work is done, so it's very quick.
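A minimal Python sketch of such a subscriber is shown below. The event names and fields (ProductCreated, LineItemAdded, quantity) are hypothetical, and the bus is reduced to calling handle() for each event in order; the product name is captured at the moment the line item is added.

```python
class OrderReportProjection:
    """Builds order reports ahead of time from observed events."""

    def __init__(self):
        self.product_names = {}  # product_id -> latest known name
        self.reports = {}        # order_id -> cached report

    def handle(self, event):
        kind = event["type"]
        if kind == "ProductCreated":
            self.product_names[event["product_id"]] = event["name"]
        elif kind == "LineItemAdded":
            report = self.reports.setdefault(event["order_id"], {"lines": []})
            report["lines"].append({
                "product_id": event["product_id"],
                # Name as known when the event was processed.
                "name": self.product_names.get(event["product_id"], "unknown"),
                "quantity": event["quantity"],
            })

    def report_for(self, order_id):
        # No event replay here: the work was done when events arrived.
        return self.reports.get(order_id)
```

Reading a report is then a single dictionary lookup, which mirrors the "give them one out of the cache" step.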
For command handlers, the answer is more complicated. Business rules are supposed to be in the domain model, so having the command handler try to validate the command (as opposed to the domain model) is a bit broken.
The command handler can load the products to see what the state might look like, and pass that information to the aggregate with the command data, but it's not clear that's a good idea -- if the client is going to send a command to be run, and you need to flesh out the Order command with Product data, why not have the client add the product data to the command directly, and skip the middle man?
"CommandHandler also can use this Product object to perform logic such as whether there are too many particular products, etc."
This example is a bit vague, but taking a guess: you are thinking about cases where you prevent an order from being placed if the available inventory is insufficient to fulfill the order.
For real world inventory - a physical book in a warehouse - that's probably the wrong approach to take. First, the model itself is wrong; if you want to know how much product is in the warehouse, you should be querying the warehouse, not the product. Second, a physical warehouse is not constrained by your model -- calling the addProduct method on a warehouse aggregate doesn't cause the product to magically appear there.
Third, it probably doesn't match very well with what your domain experts want anyway. If the model says that the warehouse doesn't have enough product, do you think the stakeholders want the system to:
tell the shopper to buy the product somewhere else, or...
accept the order, and contact the supplier for a new delivery.
Hint: when in doubt, carefully review how amazon.com does it.

Database schema for rewarding users for their activities

I would like to provide users with points when they do a certain thing. For example:
adding article
adding question
answering question
liking article
etc.
Some of them can have conditions, for example points are only awarded for the first 3 articles a day, but I think I will handle this directly in my code base.
The problem is: what would be a good database design to handle this? I am thinking of 3 tables.
user_activities - in this table I will store event types (I use Laravel, so it would probably be the event class name) and the points for each specific event.
activity_user - pivot table between user_activities and users.
and of course the users table
It is very simple, so I am worried that there are some conditions I haven't thought of that will come back to bite me in the future.
I think you'll need a fourth table, "activities", that is simply a list of the kinds of activities to track. This will have an ID column, and your user_activities table would then include an 'activity_id' to link to it. You'll no doubt have unique information for each kind; for example, an activities table may have columns like
ID : unique ID per Laravel
ACTIVITY_CODE : short code to use as part of application/business logic
ACTIVITY_NAME : longer name that is for display name like "answered a question"
EVENT : what does the user have to do to trigger the activity award
POINT_VALUE: how many points for this event
etc
If you think that points may change in the future (eg. to encourage certain user activities) then you'll want to track the actual point awarded at the time in the user activities table, or some way to track what the points were at any one time.
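Here is a small sketch of that layout using Python's built-in sqlite3 module. The table and column names follow the suggestion above but are otherwise assumptions (including points_awarded and the helper functions); the point is that the awarded value is copied into user_activities at award time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE activities (
    id            INTEGER PRIMARY KEY,
    activity_code TEXT NOT NULL UNIQUE,
    activity_name TEXT NOT NULL,
    point_value   INTEGER NOT NULL
);
CREATE TABLE user_activities (
    id             INTEGER PRIMARY KEY,
    user_id        INTEGER NOT NULL REFERENCES users(id),
    activity_id    INTEGER NOT NULL REFERENCES activities(id),
    points_awarded INTEGER NOT NULL,  -- snapshot of point_value at award time
    created_at     TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
""")


def award(conn, user_id, activity_code):
    """Record an activity, copying the current point value into the row."""
    activity_id, points = conn.execute(
        "SELECT id, point_value FROM activities WHERE activity_code = ?",
        (activity_code,),
    ).fetchone()
    conn.execute(
        "INSERT INTO user_activities (user_id, activity_id, points_awarded) "
        "VALUES (?, ?, ?)",
        (user_id, activity_id, points),
    )


def total_points(conn, user_id):
    """Sum what was actually awarded, not what activities are worth today."""
    return conn.execute(
        "SELECT COALESCE(SUM(points_awarded), 0) FROM user_activities "
        "WHERE user_id = ?",
        (user_id,),
    ).fetchone()[0]
```

Because totals are computed from points_awarded, changing point_value later only affects future awards.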
While I'm suggesting a fourth table, what you really need is a more carefully worded list of the features to be implemented before doing any design work. My example of allowing the points awarded to change over time is one such feature: you don't mention it, but you'll need to design for it if it is required.
Well, I have found https://laracasts.com/lessons/build-an-activity-feed-in-laravel to be a very good solution. Hope it helps someone :)
