NiFi: how to avoid hitting Redis for each flow file (caching)

I have a NiFi flow like the one below.
In my case, every flowfile coming out of the consumer triggers a hit to Redis. Is there a way to avoid hitting Redis for each flowfile?
I'd like to read the data from Redis once, store it in a local cache, and only go back to Redis after a local TTL expires.
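One common way to do this (a minimal sketch, not NiFi-specific; the class and parameter names here are illustrative, not from any NiFi API) is a read-through cache with a TTL: serve from local memory, and only call the backing fetch function, which in practice would be a Redis GET, when the local copy is older than the TTL.

```python
import time

class TtlCache:
    """Read-through cache: only call the backing fetch function
    (e.g. a Redis GET) when the local copy is older than ttl_seconds."""

    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self.fetch = fetch      # callable(key) -> value; would hit Redis in practice
        self.ttl = ttl_seconds
        self.clock = clock      # injectable for testing
        self._store = {}        # key -> (value, fetched_at)

    def get(self, key):
        entry = self._store.get(key)
        now = self.clock()
        if entry is not None and now - entry[1] < self.ttl:
            return entry[0]     # still fresh: no Redis hit
        value = self.fetch(key) # stale or missing: exactly one Redis hit
        self._store[key] = (value, now)
        return value
```

In a NiFi flow this logic could live in an ExecuteScript or custom processor; the point is that only one Redis call is made per key per TTL window, no matter how many flowfiles pass through.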

Related

Idempotency with Redis in a multi-threaded environment

I'm creating a POST API that uses Redis for idempotency.
I take an idempotency-key from a header, which gets saved in Redis.
When the same request comes in again, I return the cached response.
In Redis I save it as idempotency-key: message-body along with the HTTP status.
During load testing I sent the same request, with the same idempotency-key, 30 times.
As expected, 5 out of 30 times the same request was stored in Redis, because a new request came in before the first one finished.
How can I avoid this in Redis without making the API slow?
I did not find much material on the net.
Apart from Redis, I only have Dynamo as a centralized DB.
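The usual fix for this race is to claim the idempotency key atomically before doing any work, using Redis's SET with the NX and EX options (in redis-py that would be `r.set(key, value, nx=True, ex=ttl)`). Exactly one concurrent request wins the claim; the rest see a placeholder or the cached response and never re-run the handler. Below is a sketch under that assumption, with a hypothetical in-memory client so the logic can be exercised without a server:

```python
PROCESSING = "__processing__"

def handle_request(client, idem_key, process, ttl=3600):
    """Atomically claim the idempotency key before doing any work.
    client.set_nx(key, value, ttl) stands in for Redis's
    SET key value NX EX ttl (redis-py: r.set(key, value, nx=True, ex=ttl))."""
    if client.set_nx(idem_key, PROCESSING, ttl):
        response = process()                 # only the claim winner runs the handler
        client.set(idem_key, response, ttl)  # replace placeholder with the real result
        return response, 201
    cached = client.get(idem_key)
    if cached == PROCESSING:
        return "request already in progress", 409  # or poll / return Retry-After
    return cached, 200                             # replay the cached response

class FakeRedis:
    """In-memory stand-in for demonstration only
    (TTLs are accepted but not enforced here)."""
    def __init__(self):
        self.data = {}
    def set_nx(self, key, value, ttl):
        if key in self.data:
            return False
        self.data[key] = value
        return True
    def set(self, key, value, ttl):
        self.data[key] = value
    def get(self, key):
        return self.data.get(key)
```

Because the claim is a single Redis command, there is no extra round trip on the happy path, so the API does not get noticeably slower.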

How can I do when redis is down?

I have many Spring Boot services that depend on a Redis instance to generate continuous ids such as 1, 2, 3...
What can I do when Redis is down?
Extra:
a single Redis instance, not a master-replica setup
Does Redis persistence keep data from being lost?
You can configure Redis to persist data on disk, i.e. in AOF or RDB format. However, since persistence is asynchronous, you still might lose data (with AOF you can fsync every write, but then you'll have performance problems).
In your case, it seems that you use the INCR command to generate ids. If Redis goes down before dumping all of its data, you'll get duplicate ids when it restarts.
This problem cannot be fully solved even with a master-replica setup, since the replication between master and replica is also asynchronous.
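The failure mode described above can be illustrated with a small simulation (purely illustrative; `FakeRedisCounter` is a toy stand-in for INCR plus RDB-style snapshotting, not a real client):

```python
class FakeRedisCounter:
    """Toy model of INCR with snapshot-based (RDB-style) persistence."""
    def __init__(self):
        self.value = 0
        self.snapshot = 0
    def incr(self):
        self.value += 1
        return self.value
    def save(self):
        self.snapshot = self.value          # periodic BGSAVE
    def crash_and_restart(self):
        self.value = self.snapshot          # writes after the snapshot are lost

r = FakeRedisCounter()
ids = [r.incr() for _ in range(3)]   # 1, 2, 3
r.save()                             # snapshot taken at 3
more = [r.incr() for _ in range(2)]  # 4, 5 -- issued but not yet persisted
r.crash_and_restart()                # counter rolls back to 3
dup = r.incr()                       # hands out 4 again: a duplicate id
```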

Microservice failure Scenario

I am working on a microservice architecture. One of my services is exposed to a source system, which uses it to post data. This microservice publishes the data via Redis pub/sub, where it is consumed by a couple of other microservices.
Now, if one of those consuming microservices is down and cannot process the data from Redis pub/sub, I have to retry with the published data once it comes back up. The source cannot push the data again, and manual intervention is not possible, so I thought of three approaches:
1. Use Redis itself for storing and retrieving the data.
2. Store the data in a database before publishing. I have many source and target microservices that use Redis pub/sub; with this approach I'd have to insert every request into the DB first and then record its response status. That means a shared database, which adds several more exception-handling cases and doesn't look very efficient to me.
3. Use Kafka in place of Redis pub/sub. Traffic is low, which is why I chose Redis pub/sub in the first place, so switching isn't feasible.
In both of the storage-based cases I'd have to use a scheduler, and there is a window within which I have to retry, or subsequent requests will fail.
Is there any other way to handle these cases?
For point 2:
- Store the data in the DB.
- Create a daemon process that processes the data from the table.
- This daemon process can be configured to suit your needs.
- The daemon polls the DB and publishes any pending data, deleting each record once it has been published.
I haven't used this in a microservice architecture, but I have seen this approach work efficiently when communicating with third-party services.
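The daemon described above is essentially the transactional-outbox pattern. A minimal sketch of one polling pass, with an in-memory list standing in for the DB table and an injected `publish` callable standing in for the real Redis PUBLISH (both hypothetical names):

```python
def drain_outbox(table, publish):
    """One polling pass of the daemon: publish every pending row,
    deleting a row only after a successful publish, so a failed
    publish leaves it in place for the next pass (at-least-once)."""
    remaining = []
    for row in table:
        try:
            publish(row)            # e.g. Redis PUBLISH in the real system
        except Exception:
            remaining.append(row)   # keep it; retried on the next poll
    table[:] = remaining            # delete only what was published
    return len(remaining)
```

Deleting only after a successful publish is what makes the retry safe: if the broker or consumer is down, the row simply survives until the next scheduled pass.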
At the very outset, as you mentioned, there do indeed seem to be only three possibilities.
This is one of those situations where you want a handshake from the consuming service both after pushing and after processing. To accomplish that, a middleware queuing system is the right tool.
Although a bit more complex to set up, Kafka can do this via streaming: configuring the producer and consumer groups properly will let you do the job smoothly.
Using a DB for storage would be overkill, given your situation where the data is "to be processed and to be persisted".
Alternatively, storing the data in Redis and reading it from a cron/scheduled job would make your life much simpler. Once the job runs successfully, you can remove the data from the cache and save Redis memory.
If you can comment with more detail on the architecture and the implementation, I'll update my answer accordingly. :)

How to exchange data between instances of the same service in Consul?

I'm trying a combination of Spring Cloud and Consul, and I wonder if there is a way to exchange data and state between instances of the same microservice.
For example, I have AuthenticationService1 (AS1) and AuthenticationService2 (AS2). When a user comes to AS1, he logs in and receives a token; the next time he comes, the token is only verified. But at that moment AS2 is not aware of AS1's state.
I've seen the idea of using a database table to store information about user sessions, but maybe there is an easier way for AS1 to share its state with AS2, or to send a message about the login?
Consul is a service-management tool (discovery, configuration, ...), not a cache or pub/sub system.
You may want to use a shared cache behind the scenes for your use case: your AS1 service authenticates the user and puts the token in the cache, and AS2 can then retrieve the token from the cache. For that, you can use something like
redis
hazelcast
infinispan
... or other options, such as storing the data in a DB.
You could also use a pub/sub system plus a local cache in each ASx, but then you can have issues when an AS restarts (its cache is lost). So, from my point of view, a shared cache is better.

Elasticsearch HTTP River

I'm starting to play around with Elasticsearch, and I'd like to periodically pull data via HTTP from a particular location, ideally as a river, every few seconds or so.
The particular address is a simple monitoring endpoint that outputs some JSON about the state of its systems. It doesn't need any authentication, just the HTTP request.
Is there a recommended way of doing this inside ES? Or should I just put together a polling server that pulls and pushes?
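A polling server for this is very small. A sketch of one polling cycle, with the HTTP fetch and the Elasticsearch index call injected as plain callables (hypothetical names; in real use `fetch` would be something like `urllib.request.urlopen(...).read()` and `index` a `PUT /<index>/_doc` call), run from cron or a loop with `time.sleep()`:

```python
import json

def poll_once(fetch, index):
    """One polling cycle: fetch the monitoring endpoint's JSON
    and hand the parsed document to the indexing callable.
    Both callables are injected so the loop stays testable."""
    body = fetch()          # raw JSON string from the endpoint
    doc = json.loads(body)
    index(doc)              # e.g. index into Elasticsearch in real use
    return doc
```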
