Best way to track/trace a JSON object (time series data) as it flows through a system of microservices on an IoT platform - spring-boot

We are working on an IoT platform that ingests many device parameter
values (time series) every second from many devices. Each ingested
JSON is a batch of multiple parameter values captured at a particular
instant. What is the best way to track each JSON as it flows through
many downstream microservices in an event-driven way?
We predominantly use Spring Boot, and all the services are
containerised.
E.g.: Option 1 - Is it ideal to associate a UUID with each object and
then update its state idempotently in Redis as each microservice
processes it? The problem is that every microservice is then tied to
Redis, and we have seen Redis performance degrade as the number of API
calls to it increases, since it is single-threaded (we can scale this
out, though).
Option 2 - Zipkin?
Note: We use Kafka/RabbitMQ to process the messages in a distributed
way, as you mentioned here. My question is about a strategy to track
each of these messages and its status (to enable replay if needed to
attain only-once delivery). Let's say message1 is processed by
Service A, Service B and Service C. We are having trouble tracking
whether the message failed at Service B or Service C, because
we get a lot of messages.

A better approach would be to use Kafka instead of Redis.
Create a topic for every microservice and keep moving the packet from
one topic to the next after processing.
topic(raw-data) - |MS One| - topic(processed-data-1) - |MS Two| - topic(processed-data-2) ... etc
Keep appending the results to the same object and keep moving it down the line until every microservice has processed it.
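The topic chain above can be sketched roughly as follows. This is only an illustration in plain Python: the in-memory deques stand in for Kafka topics, and the envelope fields (`id`, `payload`, `trace`), the `dead-letter` topic and the helper names `ingest` and `run_stage` are assumptions made for the example, not part of any Kafka API.

```python
import json
import uuid
from collections import deque

# In-memory stand-ins for Kafka topics (assumption: a real system would use a Kafka client).
topics = {
    "raw-data": deque(),
    "processed-data-1": deque(),
    "processed-data-2": deque(),
    "dead-letter": deque(),  # hypothetical topic for messages that failed a stage
}


def ingest(payload):
    """Wrap a raw reading in an envelope with an id and an empty trace."""
    message = {"id": str(uuid.uuid4()), "payload": payload, "trace": []}
    topics["raw-data"].append(json.dumps(message))
    return message["id"]


def run_stage(name, source, sink, work):
    """Drain `source`, apply `work`, append a trace entry, then forward the message."""
    while topics[source]:
        message = json.loads(topics[source].popleft())
        try:
            message["payload"] = work(message["payload"])
            status, target = "ok", sink
        except Exception as exc:  # record the failure instead of losing the message
            status, target = f"failed: {exc}", "dead-letter"
        message["trace"].append({"service": name, "status": status})
        topics[target].append(json.dumps(message))


message_id = ingest({"temperature": 21.5})
run_stage("ms-one", "raw-data", "processed-data-1", lambda p: {**p, "unit": "C"})
run_stage("ms-two", "processed-data-1", "processed-data-2", lambda p: {**p, "valid": True})
final = json.loads(topics["processed-data-2"][0])
```

Because the trace travels with the message, the last trace entry of anything in the dead-letter topic shows which service failed, without a shared store such as Redis.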

Related

Message Based Microservices - Api Gateway Performance

I'm in the process of designing a micro-service architecture and I have a performance related question. This is what I am trying out with my design:
I have several micro-services which perform distinct actions and store those results in their own data-store.
The micro-services receive work via a message queue where they receive requests to run their process for the specific data given. The micro-services do NOT communicate with each other.
I have an API gateway which effectively has three journeys:
1) Receives a request to process data, which it then translates into several messages that it puts on the queue for the micro-services to process in their own time. The processing time can be minutes or longer (not instant).
2) Receives a request for the status of the process, where it returns the progress of the overall process.
3) Receives a request for combined data, which is some combination of all the results from the services.
My problem lies in #3 above and the performance of this process.
Whenever this request is received, the API gateway has to put a message on the queue requesting information from all the services; it then has to wait for all the services to reply with the latest state of their data, after which it combines this data and returns it to the caller.
This process is obviously rather slow as it has to wait for every service to respond. What is the way of speeding this up?
The only way I thought of solving this is having another aggregate service/data-store where duplicate data is stored and queried by my api gateway. I really don't like this approach as it duplicates data and is extra work/code.
What is the 'correct' and performant way of querying up-to-date data from my micro-services?
You can use one of these approaches for querying data across microservices. Reference
Selective data replication
With this approach, we replicate the data needed from other microservices into the database of our microservice. The only coupling between microservices is in the data replication configuration.
Composite service layer
With this approach, you introduce composite services that aggregate data from lower-level microservices.
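A composite service can be sketched along these lines. It is a minimal illustration, assuming two hypothetical lower-level services represented here by the local functions `fetch_service_a` and `fetch_service_b`; in a real system these would be remote calls. Querying them in parallel means the caller waits roughly for the slowest service rather than for the sum of all of them.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for calls to two lower-level microservices.
def fetch_service_a(job_id):
    return {"a_result": f"a-output-for-{job_id}"}


def fetch_service_b(job_id):
    return {"b_result": f"b-output-for-{job_id}"}


def composite_view(job_id, fetchers=(fetch_service_a, fetch_service_b)):
    """Query every lower-level service in parallel and merge the replies."""
    with ThreadPoolExecutor(max_workers=len(fetchers)) as pool:
        futures = [pool.submit(fetch, job_id) for fetch in fetchers]
        combined = {}
        for future in futures:
            # The timeout is an arbitrary illustrative value.
            combined.update(future.result(timeout=5))
    return combined
```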

Task distribution across microservices

We are building our first microservice architecture using Spring Boot and Kubernetes. I have a general question about scaling up one of our microservices which processes RSS feeds.
Currently we have about 100 feeds and run one instance of the microservice to process them. The feed sources are stored in a database and once the feeds are processed they are written to a central Kafka queue.
We want to increase the number of feeds and the number of instances of the microservice to process the feeds.
Are there any design patterns which I could follow to distribute the RSS feeds across the number of instances available? How would I dynamically allocate which microservice instance processes which set of feeds?
Any recommendations or best practice advice would be appreciated.
The first thing to try is a messaging system.
You could send a message saying that some "RSS feed must be processed", with the essential information about this task (feed id, link, whatever).
Then make all instances consume from the queue.
This way, the instances will compete for the jobs. The more messages you have in the queue, the more tasks there are to do (obviously). You can then scale out the number of microservice instances.
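The competing-consumers idea can be illustrated with a small in-process sketch. Here `queue.Queue` stands in for the message broker and threads stand in for microservice instances; the function name `process_feeds` and the instance names are assumptions made for the example.

```python
import queue
import threading


def process_feeds(feed_ids, worker_count=3):
    """Start competing workers that pull feed tasks from one shared queue."""
    tasks = queue.Queue()
    for feed_id in feed_ids:
        tasks.put(feed_id)

    processed = {}  # feed_id -> name of the instance that handled it
    lock = threading.Lock()

    def worker(name):
        while True:
            try:
                feed_id = tasks.get_nowait()
            except queue.Empty:
                return  # no work left for this instance
            with lock:
                processed[feed_id] = name  # placeholder for the real feed processing

    threads = [
        threading.Thread(target=worker, args=(f"instance-{i}",))
        for i in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return processed
```

Each feed is handled by exactly one worker, and adding workers needs no change to whoever publishes the tasks.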
You can use a hash function to distribute RSS feeds across your microservices. Let's say you have 5 instances of the microservice; you can use the algorithm below to assign RSS feeds to them:
hash_code = hashingAlgorithm(rss)
node_id = hash_code % num_of_nodes // 5 in this case
get_service(node_id).send(rss)
The process of assigning RSS feeds to your microservices can also be scaled easily: you can launch 3 independent processes that read from your DB and assign RSS feeds to microservices without any coordination.
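A runnable version of the hashing idea might look like the sketch below. It assumes each feed is identified by its URL and uses SHA-256 as the hashing algorithm, which is an arbitrary choice for the example. A stable hash matters here: Python's built-in `hash()` is randomised per process for strings, so independent assigner processes would disagree if they used it.

```python
import hashlib


def node_for(feed_url, num_of_nodes):
    """Map a feed to a node index using a hash that is stable across processes."""
    digest = hashlib.sha256(feed_url.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_of_nodes


def assign(feed_urls, num_of_nodes):
    """Group feeds by the node that should process them."""
    assignment = {node: [] for node in range(num_of_nodes)}
    for url in feed_urls:
        assignment[node_for(url, num_of_nodes)].append(url)
    return assignment
```

Note that with plain modulo hashing, changing `num_of_nodes` moves most feeds to a different node; consistent hashing is the usual way to reduce that reshuffling.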

How to solve two generals issue between event store and persistence layer?

The Two Generals' Problem - event store and persistence layer?
I would like to understand how the industry actually deals with this problem!
Suppose microservice 1 persists object X into Database A. At the same time, so that microservice 2 can feed on the data from microservice 1, microservice 1 writes the same object X to an event store B.
Now, the question I have is: where do I write object X first?
1) If I write to Database A first and then to event store B, is it fair to roll back the thread at the app level if Database A is down? Also, what should the ideal error handling be if Database A is online and has persisted object X, but event store B is down?
2) What should the error handling look like if we go vice versa of point 1?
I do understand that in today's world of distributed, highly available systems, a system going down is unlikely. But it can happen. I want to understand what needs to be done when either the database or the event store system/cluster is down.
In general you want to avoid relying on a two-phase commit of the kind you describe.
In general (presuming an event-sourced system; I'm not sure whether that's implicit in your question or an option for you - perhaps SqlStreamStore might be relevant in your context?), this is typically managed by having something project from a single authoritative set of events on a pull basis - each projection that needs to perform an associated action against some downstream system maintains a pointer to how far it has got in projecting events from the base stream, and restarts from there if interrupted.
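The pull-based projection with a pointer can be sketched as follows. This is a simplified illustration, not SqlStreamStore's API: the event log is a plain list, and the `state` dictionary (holding `checkpoint` and `read_model`) stands in for whatever durable storage the projection would really use.

```python
def project(event_log, state):
    """Apply events not yet seen, starting from the saved checkpoint."""
    while state["checkpoint"] < len(event_log):
        event = event_log[state["checkpoint"]]
        # Idempotent upsert: replaying the same event leaves the same result.
        state["read_model"][event["id"]] = event["data"]
        # In a real system the checkpoint would be persisted together with
        # the read-model update so a restart resumes from the right place.
        state["checkpoint"] += 1
    return state


event_log = [
    {"id": "x", "data": {"version": 1}},
    {"id": "y", "data": {"version": 1}},
]
state = {"checkpoint": 0, "read_model": {}}
project(event_log, state)

# Later, more events arrive; the projection resumes from its pointer.
event_log.append({"id": "x", "data": {"version": 2}})
project(event_log, state)
```

There is only one write to the authoritative stream, so no two-phase commit is needed; the downstream side simply catches up.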
First of all, an Event store is a type of persistence that stores the application's state as a series of events, as opposed to a flat persistence that stores the last projected state.
Suppose microservice 1 persists object X into Database A. At the same time, so that microservice 2 can feed on the data from microservice 1, microservice 1 writes the same object X to an event store B.
You are trying to have two sources of truth that must be kept in sync by some sort of distributed transaction which is not very scalable.
This is an unusual mode of using an Event store. In general an Event store is the canonical source of information, the single source of truth. You are trying to use it as a communication channel. The Event store is the persistence of an event-sourced Aggregate (see Domain Driven Design).
I see two options:
1) You could refactor your architecture and make the object X an event-sourced entity that has the Event store as its persistence. Then have a Read-model subscribe to the Event store and build a flat representation of the object X that is persisted in the Database A. In other words, write first to the Event store and then to the Database A (but in an eventually consistent manner!). This is a big jump and you should really think about whether you want to go event-sourced.
2) You could use CQRS without Event sourcing. This means that after every modification, the object X emits one or more Domain events, which are persisted in the Database A in the same local transaction as the object X itself. The microservice 2 could subscribe to the Database A to get the emitted events. The actual subscription mechanism depends on the type of database.
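The second option, persisting the domain events in the same local transaction as the object, can be sketched with SQLite standing in for Database A. The table names `objects` and `outbox`, the event shape and the polling function are assumptions made for this example; how microservice 2 actually subscribes depends on the database, as noted above.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for Database A
conn.execute("CREATE TABLE objects (id TEXT PRIMARY KEY, body TEXT)")
conn.execute("CREATE TABLE outbox (seq INTEGER PRIMARY KEY AUTOINCREMENT, event TEXT)")


def save_with_event(conn, object_id, body):
    """Persist object X and its domain event in one local transaction."""
    event = {"type": "ObjectChanged", "id": object_id}
    with conn:  # commits if both statements succeed, rolls back otherwise
        conn.execute(
            "INSERT OR REPLACE INTO objects (id, body) VALUES (?, ?)",
            (object_id, json.dumps(body)),
        )
        conn.execute("INSERT INTO outbox (event) VALUES (?)", (json.dumps(event),))


def read_events_after(conn, last_seq):
    """What microservice 2 might poll: events newer than its own position."""
    rows = conn.execute(
        "SELECT seq, event FROM outbox WHERE seq > ? ORDER BY seq", (last_seq,)
    )
    return [(seq, json.loads(event)) for seq, event in rows]
```

Since the object and its event commit or roll back together, there is no window in which one exists without the other.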
I have a feeling you are using event store as a channel of communication, instead of using it as a database. If you want micro-service 2 to feed on the data from micro-service 1, then you should communicate with REST services.
Of course, relying on REST services might make you less resilient to outages. In that case, using a piece of technology dedicated to communication would be the right way to go. (I'm thinking MQ/Topics, such as RabbitMQ, Kafka, etc.)
Then, once your services are talking to each other, you will still need to persist your data... but only at one single location.
Therefore, you will need to define where you want to store the data.
Ask yourself:
Who will have governance of the data persistence?
Is it Microservice1? If so, then every time Microservice2 needs to read the data, it will make a REST call to Microservice1.
Is it the other way around? Microservice2 has governance of the data, and Microservice1 consumes it?
It could be a third microservice that you haven't even created yet. It depends how you applied your separation of concerns.
Let's take an example:
Microservice1's responsibility is to process our data and export it to PDF and other formats.
Microservice2's responsibility is to expose a service for a legacy partner that requires our data to be returned in a very proprietary representation.
Who is going to store the data here?
Microservice1 should not be the one to persist the data: its job is only to convert the data to other formats. If it requires some data, it will fetch it from the service that has governance of the data.
Microservice2 should not be the one to persist the data either. After all, maybe we have a number of other microservices similar to this one, but for other partners, with different proprietary formats.
If there is a service where you can do CRUD operations, this is your guy. If you don't have such a service, maybe you can find an existing microservice that wouldn't have conflicting responsibilities.
For instance: suppose I have a Microservice3 that makes sure that every time my ObjectX is changed, it sends a PDF representation of it to some address and notifies all my partners that the data is out of date. In that scenario, this microservice looks like a good candidate to become the "governor of the data" for this part of the domain, and to be the one-stop shop for writing/reading in the database.

Spring Cloud Dataflow - Retaining Order of Messages

Let's say I have a stream with 3 applications - a source, processor, and sink.
I need to retain the order of the messages I receive from my source. When I receive messages A,B,C,D, I have to send them to the sink as A,B,C,D. (I can't send them as B,A,C,D.)
If I have just 1 instance of each application, everything will run sequentially and the order will be retained.
If I have 10 instances of each application, the messages A,B,C,D might get processed at the same time in different instances. I don't know what order these messages will wind up in.
So is there any way I can ensure that I retain the order of my messages when using multiple instances?
No; when you scale out (either by concurrency in the binder or by deploying multiple instances), you lose order. This is true for any multi-threaded application, not just spring-cloud-stream.
You can use partitioning so that each instance gets a partition of the data, but ordering is only retained within each partition.
If you have sequence information in your messages, you can add a custom module using a Spring Integration Resequencer to reassemble your messages back into the same sequence - but you'll need a single instance of the resequencer before a single sink instance.
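The resequencing idea can be illustrated independently of Spring Integration. The class below is a plain-Python sketch, not the Spring Integration Resequencer API; it assumes every message carries a gap-free integer sequence number starting at 1.

```python
class Resequencer:
    """Buffers out-of-order messages and releases them in sequence order."""

    def __init__(self, first_sequence=1):
        self.next_sequence = first_sequence
        self.pending = {}  # sequence number -> payload waiting for its turn

    def accept(self, sequence, payload):
        """Store the message and return every payload that can now be released."""
        self.pending[sequence] = payload
        released = []
        while self.next_sequence in self.pending:
            released.append(self.pending.pop(self.next_sequence))
            self.next_sequence += 1
        return released


resequencer = Resequencer()
delivered = []
# Messages arrive as B, A, D, C after parallel processing.
for sequence, payload in [(2, "B"), (1, "A"), (4, "D"), (3, "C")]:
    delivered.extend(resequencer.accept(sequence, payload))
```

The buffer has to live in one place, which is why a single resequencer instance is needed in front of a single sink instance.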

How to ensure data is eventually written to two Azure blobs?

I'm designing a multi-tenant Azure Service Fabric application in which we'll be storing event data in Azure Append-Only blobs.
There'll be two kinds of blobs; merge blobs (one per tenant); and instance blobs (one for each "object" owned by a tenant - there'll be 100K+ of these per tenant)
There'll be a single writer per instance blob. This writer keeps track of the last written blob position and can thereby ensure (using conditional writes) that no other writer has written to the blob since the last successful write. This is an important aspect that we'll use to provide strong consistency per instance.
However, all writes to an instance blob must also eventually (but as soon as possible) reach the single (per tenant) merge blob.
Under normal operation I'd like these merge writes to take place within ~100 ms.
My question is about how we best should implement this guaranteed double-write feature:
The implementation must guarantee that data written to an instance blob will eventually also be written to the corresponding merge blob exactly once.
The following inconsistencies must be avoided:
Data is successfully written to an instance blob but never written to the corresponding merge blob.
Data is written more than once to the merge blob.
The easiest way, in my view, is to use events: Service Bus, Event Hubs or any other provider that guarantees an event will be stored and reachable at least somewhere. It also makes it possible to write events to Blob Storage in batches. I also think it will significantly reduce pressure on Service Fabric and allow events to be processed at the desired timing.
So you could have a lot of Stateless Services or just Web Workers that pick up new messages from a queue and send them in batches to a Stateful Service.
Let's say that will be a Merge service. You would need to partition these services, and the best way to send a batch of events grouped by partition is to build such a Stateless Service or Web Worker.
Then you can have a separate Stateful Actor for each object. But in your place I would try to create 100k actors, or any other realistic workload, and see how expensive it would be. If it is too expensive and you cannot afford such machines, then everything could be handled in another partitioned Stateless Service.
Okay, now we have the following scheme: something puts logs into the ESB, and something peeks these events from the ESB in batches or very frequently, handling transactions and processing errors. After that component peeks a bunch of events from a queue, it sends them to a particular Merge service, which stores the data in its state and calls a particular actor to do the same.
Once the actor has written the data to its state and the service has done the same, the event in the ESB can be marked as processed and removed from the queue. Then you just need to write the stored data from the Merge service and actors to Blob storage once in a while.
If the actor is unable to store the event, the operation is not complete and the Merge service should not store the data either. If Blob storage is unreachable for actors or Merge services, it will become reachable in the future and the logs will be stored, since they are saved in state, or at least they could be retrieved from the actors/service manually.
If the Merge service is unreachable, I would store the event in a poison message queue for later processing, or try to write the logs directly to Blob storage, but that is a little bit dangerous, though the chances of writing to only one kind of storage at that moment are pretty low.
You could use a Stateful Actor for this. You won't need to worry about concurrency, because there is none. In the state of the Actor you can keep track of which operations were successfully completed. (write 1, write 2)
Still, writing 'exactly once' in a distributed system (without a DTC) is never 100% waterproof.
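The idea of keeping track of completed operations in the actor's state can be sketched as below. This is plain Python, not the Service Fabric Actor API: lists stand in for the two append blobs, the `done` dictionary stands in for the actor's persisted state, and the `fail_merge` flag is a hypothetical hook used only to simulate an outage.

```python
class DoubleWriter:
    """Single-threaded stand-in for a stateful actor that tracks completed writes."""

    def __init__(self, instance_blob, merge_blob):
        self.instance_blob = instance_blob
        self.merge_blob = merge_blob
        self.done = {}  # event_id -> set of steps already completed

    def write(self, event_id, data, fail_merge=False):
        """Perform whichever of the two writes has not completed yet."""
        steps = self.done.setdefault(event_id, set())
        if "instance" not in steps:
            self.instance_blob.append(data)
            steps.add("instance")
        if "merge" not in steps:
            if fail_merge:  # simulated outage of the merge blob
                raise OSError("merge blob unavailable")
            self.merge_blob.append(data)
            steps.add("merge")


instance_blob, merge_blob = [], []
writer = DoubleWriter(instance_blob, merge_blob)
try:
    writer.write("evt-1", "payload", fail_merge=True)
except OSError:
    pass  # the instance write is recorded; only the merge write is outstanding
writer.write("evt-1", "payload")  # the retry completes only the missing step
```

The gap mentioned above remains in this sketch too: a crash after an append but before the step is recorded would cause a duplicate on retry, which is why the conditional write on the blob position is still needed as a guard.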
