Are txs in the same block ordered? - solana

Suppose I have 2 txs, A and B, where B depends on A succeeding.
If both A & B are included in the same block/slot, this implies there may be some partial ordering between the txs in this block that needs to be enforced for them to succeed.
Does this mean that the leader tries to order the txs in the block when proposing it? If yes, is this ordering of txs surfaced at the RPC level (e.g. maybe a "tx slot" inside the block)?

The leader validator that produces the block does in fact order the transactions however it wants, based on account read/write locks, affected programs, etc. Typically it will go first-come, first-served, but in time, as MEV is added to validators, this will no longer be the case, and they will enforce their own constraints.
At the RPC level, the order of transactions in the block is the order in which they were executed. The explorer surfaces this with a "#" column, e.g. https://explorer.solana.com/block/137445155?filter=all
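For illustration, here is a minimal sketch (assuming the @solana/web3.js client) of reading a block over RPC; the index of each entry in the returned transactions array is the execution order the leader chose, i.e. the same number the explorer shows in its "#" column:

import { Connection } from "@solana/web3.js";

const connection = new Connection("https://api.mainnet-beta.solana.com");
const block = await connection.getBlock(137445155, {
  maxSupportedTransactionVersion: 0,
});
block?.transactions.forEach((tx, index) => {
  // index = position within the block = execution order
  console.log(index, tx.transaction.signatures[0]);
});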

Related

Seeding microservices databases

Given service A (CMS) that controls a model (Product; let's assume its only fields are id, title, and price) and services B (Shipping) and C (Emails) that have to display that model, what should the approach be to synchronize the model's information across those services in an event-sourcing approach?
Let's assume that the product catalog rarely changes (but does change) and that there are admins who access the data of shipments and emails very often (example functionalities are B: display the titles of the products an order contained, and C: display the content of the shipping email that is going to be sent). Each of the services has its own DB.
Solution 1
Send all required information about the Product within the event - this means the following structure for order_placed:
{
  order_id: [guid],
  product: {
    id: [guid],
    title: 'Foo',
    price: 1000
  }
}
On services B and C, product information is stored in a product JSON attribute on the orders table.
As such, to display the necessary information, only data retrieved from the event is used.
Problems: depending on what other information needs to be presented in B and C, the amount of data in the event can grow. B and C might not require the same information about the Product, but the event will have to contain both (unless we separate the events into two). If given data is not present within a given event, the code cannot use it - if we add a color option to the Product, then for existing orders in B and C the product will be colorless unless we update the events and then rerun them.
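As a rough sketch of Solution 1 on the consuming side (ordersDb and its insert method are hypothetical placeholders for B's data access):

type OrderPlaced = {
  order_id: string;
  product: { id: string; title: string; price: number };
};

// Minimal stand-in for service B's own database
declare const ordersDb: { insert(table: string, row: object): Promise<void> };

async function onOrderPlaced(event: OrderPlaced): Promise<void> {
  // Everything B needs travels inside the event; no call back to A
  await ordersDb.insert("orders", {
    id: event.order_id,
    product: JSON.stringify(event.product), // the product JSON attribute
  });
}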
Solution 2
Send only the guid of the product within the event - this means the following structure for order_placed:
{
  order_id: [guid],
  product_id: [guid]
}
On services B and C, product information is stored as just a product_id attribute on the orders table.
Product information is retrieved by services B and C when required, by performing an API call to the A/product/[guid] endpoint.
Problems: this makes B and C dependent upon A (at all times). If the schema of Product changes on A, changes have to be made on all services that depend on it (suddenly).
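A minimal sketch of the Solution 2 lookup from B or C (the service-a.example host is an assumed placeholder); note that the failure path is exactly the temporal coupling described above:

type Product = { id: string; title: string; price: number };

async function getProduct(productId: string): Promise<Product> {
  const response = await fetch(`https://service-a.example/product/${productId}`);
  if (!response.ok) {
    // If A is down or unreachable, B and C cannot fulfill their function
    throw new Error(`Service A unavailable: ${response.status}`);
  }
  return (await response.json()) as Product;
}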
Solution 3
Send only the guid of the product within the event - this means the following structure for order_placed:
{
  order_id: [guid],
  product_id: [guid]
}
On services B and C, product information is stored in a products table; there's still a product_id on the orders table, but now there's replication of product data between A, B and C; B and C might contain different information about a Product than A does.
Product information is seeded when services B and C are created, and is updated whenever information about Products changes, either by calling the A/product endpoint (which returns the required information about all products) or by direct DB access to A, copying the product information the given service requires.
Problems: this makes B and C dependent upon A (when seeding). If the schema of Product changes on A, changes have to be made on all services that depend on it (when seeding).
From my understanding, the correct approach would be to go with solution 1 and either update the event history per certain logic (if the Product catalog hasn't changed and we want color to be displayed, we can safely update the history with the current state of the Products and fill in the missing data within the events) or cater for the nonexistence of given data (if the Product catalog has changed and we want color to be displayed, we can't be sure whether at that point in the past the given Product had a color or not - we could assume that all Products in the previous catalog were black, and cater for that by updating the events or the code).
Solution #3 is really close to the right idea.
A way to think about this: B and C are each caching "local" copies of the data that they need. Messages processed at B (and likewise at C) use the locally cached information. Likewise, reports are produced using the locally cached information.
The data is replicated from the source to the caches via a stable API. B and C don't even need to be using the same API - they use whatever fetch protocol is appropriate for their needs. In effect, we define a contract -- protocol and message schema -- which constrain the provider and the consumer. Then any consumer for that contract can be connected to any supplier. Backward incompatible changes require a new contract.
Services choose the appropriate cache invalidation strategy for their needs. This might mean pulling changes from the source on a regular schedule, or in response to a notification that things may have changed, or even "on demand" -- acting as a read through cache, falling back to the stored copy of the data when the source is not available.
This gives you "autonomy", in the sense that B and C can continue to deliver business value when A is temporarily unavailable.
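As a sketch of the read-through variant mentioned above (all names are illustrative): serve from the local copy, refresh from the source when it responds, and fall back to the stored, possibly stale copy when it doesn't:

type Product = { id: string; title: string; price: number };

// B's locally cached replica of product data
const localProducts = new Map<string, Product>();

async function readThrough(productId: string): Promise<Product | undefined> {
  try {
    const res = await fetch(`https://service-a.example/product/${productId}`);
    if (res.ok) {
      const fresh = (await res.json()) as Product;
      localProducts.set(productId, fresh); // refresh the cache
      return fresh;
    }
  } catch {
    // source unavailable: fall through to the cached copy
  }
  return localProducts.get(productId); // may be stale, but keeps B autonomous
}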
Recommended reading: Data on the Outside, Data on the Inside, Pat Helland 2005.
Generally speaking, I'd strongly recommend against option 2 because of the temporal coupling between those services (unless communication between them is super stable and not very frequent). Temporal coupling is what you describe as "this makes B and C dependent upon A (at all times)", and it means that if A is down or unreachable from B or C, B and C cannot fulfill their function.
I personally believe that both options 1 and 3 have situations where they are valid options.
If the volume of communication between A and B & C is high, or the amount of data that needs to go into the event is large enough to be a concern, then option 3 is the best option, because the burden on the network is much lower and the latency of operations will decrease as the message size decreases. Other concerns to consider here are:
Stability of contract: if the contract of the messages leaving A changed often, then putting a lot of properties in the message would result in lots of changes in consumers. However, in this case I don't believe this is a big problem, because:
You mentioned that system A is a CMS. This means that you're working on a stable domain, and as such I don't believe you'll be seeing frequent changes.
Since B and C are shipping and email, and you're receiving data from A, I believe you'll be experiencing additive changes rather than breaking ones, which are safe to add whenever you discover them, with no rework.
Coupling: there is very little to no coupling here. Since the communication is via messages, there is no coupling between the services other than a short temporal one during the seeding of the data, and the contract of that operation (which is not a coupling you can, or should, try to avoid).
Option 1 is not something I'd dismiss though. There is the same amount of coupling, but development-wise it should be easy to do (no need for special actions), and stability of the domain should mean that these won't change often (as I mentioned already).
Another option I'd suggest is a slight variation of 3: instead of running the process during start-up, observe "ProductAdded" and "ProductDetailsChanged" events on B and C whenever there is a change in the product catalogue in A. This would make your deployments faster (and so make it easier to fix a problem/bug if you find any).
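A minimal sketch of such a handler on B (the event shape and productsDb are assumptions for illustration, not from the question):

type ProductEvent = {
  type: "ProductAdded" | "ProductDetailsChanged";
  product: { id: string; title: string; price: number };
};

// Stand-in for B's local products table
declare const productsDb: { upsert(table: string, row: object): Promise<void> };

async function onProductEvent(event: ProductEvent): Promise<void> {
  // Both event types reduce to an upsert of the local replica,
  // so B stays current without any start-up seeding run
  await productsDb.upsert("products", event.product);
}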
Edit 2020-03-03
I have a specific order of priorities when determining the integration approach:
What is the cost of inconsistency? Can we accept some milliseconds of inconsistency between things changing in A and being reflected in B & C?
Do you need point-in-time queries (also called temporal queries)?
Is there any source of truth for the data? A service which owns them and is considered upstream?
If there is an owner / single source of truth is that stable? Or do we expect to see frequent breaking changes?
If the cost of inconsistency is high (basically, the product data in A needs to be consistent with the product cached in B and C as soon as possible), then you cannot avoid accepting unavailability, and you must make a synchronous request (like a web/REST request) from B & C to A to fetch the data. Be aware! This still does not mean transactional consistency; it just minimizes the window of inconsistency. If you absolutely, positively have to be immediately consistent, you need to rethink your service boundaries. However, I very strongly believe this should not be a problem. From experience, it's actually extremely rare that a company can't accept some seconds of inconsistency, so you shouldn't even need to make synchronous requests.
If you do need point-in-time queries (which I didn't notice in your question and hence didn't account for above, maybe wrongly), the cost of maintaining this in downstream services is so high (you'd need to duplicate the internal event-projection logic in all downstream services) that the decision is clear: you should leave ownership to A, query A ad hoc over a web request (or similar), and have A use event sourcing to retrieve all the events known at the requested time, project the state from them, and return it. I guess this may be option 2 (if I understood correctly?), but the costs are such that the temporal coupling is better than the maintenance cost of duplicated events and projection logic.
If you don't need point-in-time queries, and there isn't a clear, single owner of the data (in my initial answer I assumed there was, based on your question), then a very reasonable pattern would be to hold representations of the product in each service separately. When you update the data for products, you update A, B and C in parallel by making parallel web requests to each one, or you have a command API which sends multiple commands to each of A, B and C. B & C use their local version of the data to do their job, which may or may not be stale. This isn't any of the options above (although it could be made close to option 3), as the data in A, B and C may differ, and the "whole" of the product may be a composition of all three data sources.
Knowing whether the source of truth has a stable contract is useful because you can then use the domain/internal events (or the events you store as part of the event-sourcing storage pattern in A) for integration across A and services B and C. If the contract is stable, you can integrate through the domain events. However, you then have an additional concern in cases where changes are frequent, or where the message contract is large enough to make transport a concern.
If you have a clear owner, with a contract that is expected to be stable, the best option would be option 1; an order would contain all the necessary information, and B and C would then perform their function using the data in the event.
If the contract is liable to change or break often, following your option 3 (that is, falling back to web requests to fetch product data) is actually the better option, since it's much easier to maintain multiple versions. So B would make a request against, say, v3 of product.
There are two hard things in Computer Science, and one of them is cache invalidation.
Solution 2 is absolutely my default position, and you should generally only consider implementing caching if you run into one of the following scenarios:
The API call to Service A is causing performance problems.
The cost of Service A being down and being unable to retrieve the data is significant to the business.
Performance problems are really the main driver. There are many ways of solving #2 that don't involve caching, like ensuring Service A is highly available.
Caching adds significant complexity to a system, and can create edge cases that are hard to reason about, and bugs that are very hard to replicate. You also have to mitigate the risk of providing stale data when newer data exists, which can be much worse from a business perspective than (for example) displaying a message that "Service A is down--please try again later."
From this excellent article by Udi Dahan:
These dependencies creep up on you slowly, tying your shoelaces together, gradually slowing down the pace of development, undermining the stability of your codebase where changes to one part of the system break other parts. It’s a slow death by a thousand cuts, and as a result nobody is exactly sure what big decision we made that caused everything to go so bad.
Also, if you need point-in-time querying of product data, this should be handled in the way the data is stored in the Product database (e.g. start/end dates) and should be clearly exposed in the API (an effective date needs to be an input to the API call that queries the data).
It is very hard to simply say one solution is better than another. Choosing between Solutions #2 and #3 depends on other factors (cache duration, consistency tolerance, ...).
My 2 cents:
Cache invalidation might be hard, but the problem statement mentions that the product catalog changes rarely. This fact makes product data a good candidate for caching.
Solution #1 (NOK)
Data is duplicated across multiple systems
Solution #2 (OK)
Offers strong consistency
Works only when product service is highly available and offers good performance
If the email service prepares a summary (with a lot of products), then the overall response time could be longer
Solution #3 (Complex but preferred)
Prefer API approach instead of direct DB access to retrieve product information
Resilient - consuming services are not impacted when product service is down
Consuming applications (shipping and email services) retrieve product details immediately after an event is published. The possibility of product service going down within these few milliseconds is very remote.

Role of off-chain workers

I'm trying to build a mental model of the role of off-chain workers in Substrate. The bigger picture seems to be that they move logic that was otherwise done by oracles inside the Substrate node, triggered by predefined transactions. There are two use cases I was thinking of specifically:
1: Validating file formats: an incoming transaction proposes a file accessible via URL or IPFS hash, and its format needs to be validated. An off-chain worker fetches the file, asserts the format (size, encoding, content, whatever) and, if correct, submits another transaction saying it's valid.
2: Key generation: let's assume there is a separate service distributed with the Substrate node, which manages keys for each instance. Node A runs a key-sharing algorithm (like Shamir's secret sharing) via this external service between participants A, B and C, then makes a transaction creating a group (A,B,C) on-chain. This transaction triggers all nodes in the group to run off-chain workers that call into their local key stores to verify that they hold the key. They can all mark it on-chain afterwards.
As far as I understand it, off-chain workers are triggered in every node after block execution. In the former use case, this would result in lots of transactions validating just one file, and nothing guarantees the correctness of these. What is a good way of reaching consensus on the validity of the file? Is it also possible without economic incentives like staking? That would be problematic with tokens having no value in the network, e.g. in enterprise settings. Is this even the right use case for off-chain workers? The second example should not suffer from such an issue; we just need all parties to verify that they hold the key.
Where does the thought process above go wrong, and why?
As far as I understand it correctly, off-chain workers are triggered in every node after block execution.
Yes and no. There is a CLI flag for it. And at the time of this writing it says:
--offchain-worker <ENABLED>
Should execute offchain workers on every block.
By default it's only enabled for nodes that are authoring new blocks. [default: WhenValidating] [possible
values: Always, Never, WhenValidating]
In the former use case, this would result in lots of transactions validating just one file, and nothing guarantees the correctness of these.
I think it is the responsibility of the receiving function (aka the Call) to handle and incentivise this. For example, there could be a reward opportunity to validate an address. But if it has already been submitted by another transaction, you will get slashed (or, even if not, you do pay some transaction fee for nothing). In such cases, you can assume that not all participants will submit a transaction. They will only do it when there is a chance of improvement, which should be reflected by your potential reward/slash scheme.
Is this even the right use case for off-chain workers?
I am no expert here, but I think the validation use case, at least, is a good example. It is just a matter of finding a good incentive plus anti-spam slashing.
I am less familiar with the second example, so no comments on that.

EventStore Competing Consumer Ordering

I am in the process of scaling out an application horizontally, and have realised that read model updates (an external projection via an event handler) will need to be handled on a competing-consumer basis.
I initially assumed that I would need to ensure ordering, but this requirement is message dependent. In the case of shopping cart checkouts where I want to know totals, I can add the totals regardless of the order - get the message, update the SQL database, and ACK the message.
I am now racking my brain to think of a scenario/messages where ordering would matter, though I know such cases exist. Some extra clarity and examples would be immensely useful.
My questions that I need help with are:
1. For what types of messages would ordering be important, and how would this be resolved using the messages as-is?
2. How would we know which event to resubscribe from when processes join/leave? I can see possible timing issues that could cause a subscription to be requested on a message that had just been processed by another process.
I see there is a Pinned consumer strategy for best-effort affinity of stream to subscriber; however, this is not guaranteed. I could solve this by making a specific stream single-threaded, processing only those messages in order - is it possible for a process to have multiple subscriptions to different streams?
To use your example of a shopping cart, ordering would be potentially important for the following events:
Add item
Update item count
Remove item
You might have sequences like A: 'Add item, remove item' or B: 'Add item, Update item count (to 2), Update item count (to 3)'. For A, if you process the remove before the add, obviously you're in trouble. For B, if you process two update item counts out of order, you'll end up with the wrong final count.
This is normally scaled out by using some kind of sharding scheme, where a subset of all aggregates is allocated to each shard. For Event Store, I believe this can be done by creating a user-defined projection using partitionBy to partition the stream into multiple streams (aka 'shards'). Then you need to allocate partitions/shards to processing nodes in some way. Some technologies are built around this approach to horizontal scaling (Kafka and Kinesis spring to mind).
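As a rough sketch of such a projection (Event Store user-defined projections are JavaScript run by the server; exact API details vary by version, and the stream and field names here are assumptions):

fromStream("shopping-carts")
  .partitionBy((event) => event.data.cartId) // one partition per aggregate
  .when({
    $any: (state, event) => {
      // link each event into a per-cart stream; a consumer that owns
      // this partition sees Add/Update/Remove in their original order
      linkTo(`cart-${event.data.cartId}`, event);
    },
  });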

Track causality in message queue

I'm working on a system where I have to process multiple messages while keeping a partial order over them. I have a RabbitMQ queue with messages of two types: item create and item update.
Consider a queue with 6 messages:
CREATE1 CREATE2 UPDATE1 UPDATE2 UPDATE1 UPDATE1
If I process them one by one, it's completely fine, but it's very slow, because I have lots of messages.
If I read them into some buffer, I can process them in parallel, but I might process UPDATE1 for the first item before it has even been created. Worse, the last update may be processed before the previous one and thus erase the latest item state.
I can create some extra field in the message, or put it in the queue with some extra header, e.g. MESSAGE_ID:10, to make sure that all messages for one item have the same MESSAGE_ID. The problem is that I don't know what to do with it.
How can I read from the queue multiple items at once without breaking causality between messages?
The pseudocode that I imagine for this task could be:
const prefetchItemsCount = 20
let buffer = new List<Message>()
let seen = new Set()
foreach item in queue
    if seen.Contains(item.MessageId)
        break       // this id is already in the batch: stop here, leave the item for the next batch
    seen.Add(item.MessageId)
    buffer.Add(item)
    if buffer.Count == prefetchItemsCount
        break       // batch is full
return buffer
So in our example it will return the following sequences of items:
CREATE1 CREATE2
UPDATE1 UPDATE2
UPDATE1
UPDATE1
Which makes it almost twice as fast.
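To illustrate why the batching helps (readBatch, process and ack are assumed primitives, not RabbitMQ API): because every MESSAGE_ID appears at most once per batch, each batch can be processed in parallel, while the batches themselves stay sequential, preserving per-item order:

type Batched = { messageId: string; body: unknown; ack(): void };
declare function readBatch(): Promise<Batched[]>; // the buffering loop above
declare function process(body: unknown): Promise<void>;

async function consume(): Promise<void> {
  while (true) {
    const batch = await readBatch();
    if (batch.length === 0) break; // queue drained
    await Promise.all(
      batch.map(async (msg) => {
        await process(msg.body);
        msg.ack(); // ACK only after successful processing
      })
    );
    // the next batch starts only after this one completes, so two
    // updates to the same item are never in flight at the same time
  }
}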
How can I read from the queue multiple items at once without breaking causality between messages?
Nice case, indeed.
If indeed performed in the desired manner, the TimeDOMAIN singularity of "at once" goes principally against a hidden morphology of what was expressed as "causality".
Given together with a QUEUE-ingress side, which is by definition a pure-[SERIAL] ( nothing may happen at once, just a pure one-goes-after-another, even if a "just"-[CONCURRENT] scheduling may get exposed to external agents, the internal QUEUE-management conforms to a pure sequential ordering of messages' internal flow and delivery ( also ref. to time-stamping, persistence and other artefacts thereof ) ).
Causality also means some cause -> effect ordering of events, both in the abstracted causality sense of the relation and also in the flow of real time of how things indeed do happen, so practically an anti-pattern to the "at once".
Last, but not least, the Causality also has to handle an additional paradigm, the latency between the cause -> side and the -> effect side of the ( often Finite-State-Automata, typically having much richer state-space than a just { 0 -> CREATE -> UPDATE [ -> UPDATE [...] ] -> } ) series of events.
Result?
While one may "read" using some degree of [CONCURRENT]-scheduling of processes, the FSA / Causality conditions principally avoid moving anywhere out of the principal pure-[SERIAL] post-processing of event-messages delivered.
More requirements on this come if the messaging framework is broker-less and without guaranteed robustness against lost messages / message ordering / message authenticity / message content.
There the Devils start to dance against your attempts to build a consistent, robust, distributed transaction-processing FSA :o)

CQRS DDD: How to validate products existence before adding them to order?

CQRS states: a command should not query the read side.
OK. Let's take the following example:
The user needs to create orders with order lines; each order line contains product_id, price, and quantity.
The client sends requests to the server with the order information and the list of order lines.
The server (command handler) should not trust the client and needs to validate whether the provided products (product_ids) exist (otherwise, there will be a lot of garbage).
Since the command handler is not allowed to query the read side, it should somehow validate this information on the write side.
What we have on the write side: Repositories. In terms of DDD, repositories operate only on Aggregate Roots; a repository can only GET BY ID and SAVE.
In this case, the only option is to load all the product aggregates, one by one (the repository has only a GET BY ID method).
Note: event sourcing is used for persistence, so it would be problematic and inefficient to load multiple aggregates at once to avoid multiple requests to the repository.
What is the best solution for this case?
P.S.: One solution is to redesign the UI (more of a task-based UI), e.g. the user first creates an order (with general info), then adds products one by one (each addition a separate HTTP request). But I still need to support bulk operations (an API for third-party applications, for example).
The short answer: pass a domain service (see Evans, chapter 5) to the aggregate along with the other command arguments.
CQRS states: command should not query read side.
That's not an absolute -- there are trade-offs involved when you include a query in your command handler; that doesn't mean that you cannot do it.
In domain-driven-design, we have the concept of a domain service, which is a stateless mechanism by which the aggregate can learn information from data outside of its own consistency boundary.
So you can define a service that validates whether or not a product exists, and pass that service to the aggregate as an argument when you add the item. The work of computing whether the product exists would be abstracted behind the service interface.
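A minimal sketch of that shape (names are illustrative, not from a specific framework):

// Domain service: a stateless way for the aggregate to learn
// about state outside its own consistency boundary
interface ProductCatalog {
  exists(productId: string): Promise<boolean>;
}

class Order {
  private lines: { productId: string; price: number; quantity: number }[] = [];

  async addLine(
    line: { productId: string; price: number; quantity: number },
    catalog: ProductCatalog // passed in alongside the command arguments
  ): Promise<void> {
    if (!(await catalog.exists(line.productId))) {
      throw new Error(`Unknown product: ${line.productId}`);
    }
    this.lines.push(line);
  }
}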
But what you need to keep in mind is this: products, presumably, are defined outside of the order aggregate. That means that they can be changing concurrently with your check to verify the product_id. From the point of view of correctness, there's no real difference between checking the validity of the product_id in the aggregate, or in the application's command handler, or in the client code. In all three places, the product state that you are validating against can be stale.
Udi Dahan shared an interesting observation years ago:
A microsecond difference in timing shouldn’t make a difference to core business behaviors.
If the client validated the data one hundred milliseconds ago when composing the command, and the data was valid then, what should the behavior of the aggregate be?
Think about a command to add a product that is composed concurrently with an order of that same product - should the correctness of the system, from a business perspective, depend on the order in which those two commands happen to arrive?
Another thing to keep in mind is that, by introducing this check into your aggregate, you are coupling the ability to change the aggregate to the availability of the domain service. What is supposed to happen if the domain service can't reach the data it needs (because the read model is down, or whatever)? Does it block? Throw an exception? Make a guess? This choice ripples back into the design of the aggregate, and so on.
