lost data when stopping server [Infinispan] - caching

In my project I use Infinispan to manage my data and improve performance.
My problem is that when we stop the server and restart it, all my data is deleted, which is normal because it's a cache.
Do you have a suggestion for keeping my application's data even when the server is stopped?
I searched the internet and found many solutions, such as using a database with Infinispan or storing the data in a file (FileCacheStore, JdbcCacheStore, CassandraCacheStore), and I don't know which one is the best.
Thank you very much in advance for your reply.

There's a multitude of options and none is best for all use cases; that's why the options exist. You haven't said much about your app's needs.
1) Use a persistent cache store (the single file store is probably the simplest option). This is an out-of-the-box solution.
2) Before shutdown, fetch and persist all data from your app (use the streams API to iterate through the entries), and load it back after boot. This adds no overhead at runtime but requires you to handle the process yourself.
3) Use a cluster of nodes and always keep some of the nodes holding the data up. However, backups (via either 1) or 2)) might be advisable anyway.
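A minimal sketch of option 2, in Python for brevity and with a plain dict standing in for the cache. Nothing here is Infinispan API: `dump_cache`, `load_cache`, and the snapshot file name are hypothetical. A real application would iterate the cache entries through whatever API the cache exposes and call these functions from its shutdown and startup hooks.

```python
import json
import os
import tempfile


def dump_cache(cache, path):
    """Write every entry to disk; meant to be called from a shutdown hook."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        json.dump(cache, fh)
    # Rename into place so a crash mid-write does not leave a truncated snapshot.
    os.replace(tmp_path, path)


def load_cache(path):
    """Return the saved entries, or an empty dict when no snapshot exists yet."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


# Simulated lifecycle: fill the cache, shut down, restart.
snapshot = os.path.join(tempfile.mkdtemp(), "cache-snapshot.json")
cache = {"user:1": {"name": "Ann"}, "user:2": {"name": "Bob"}}
dump_cache(cache, snapshot)   # before shutdown
cache = {}                    # in-memory contents are lost on restart
cache = load_cache(snapshot)  # after boot
```

The trade-off matches the description above: nothing is written during normal operation, but anything added after the last dump is lost if the process dies without running the shutdown step.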

Related

How to increase resource allocation to ravendb

I'm trying to process a document and store many documents in RavenDB, which I have running locally.
I'm getting this error:
Tried to send *ravendb.BatchCommand request via POST http://127.0.0.1:8080/databases/mydb/bulk_docs to all configured nodes in the topology, all of them seem to be down or not responding. I've tried to access the following nodes: http://127.0.0.1:8080
I was able to fetch mydb topology from http://127.0.0.1:8080.
Fetched topology: ( url: http://127.0.0.1:8080, clusterTag: A, serverRole: Member)
exit status 1
To me, it sounds like my local cluster may be running out of compute to process the large amount of data I'm trying to store.
RavenDB says I'm using 3 of 12 available cores, and I'd also like to make sure it's using a reasonable amount of the RAM available on the machine (I'd even be happy to give it swap space).
But reading around online, I'm not finding much helpful information on making sure RavenDB can use what it needs. I found settings.json, where I can add configuration that should theoretically be picked up by the server, but I'm not making much progress.
I also found some settings and changed "reassign cores" to 12, but it still reports 3/12 cores and 6/31.1 GB of memory in use.
If an alternative solution is recommended, I'm all ears. I just need to run things locally, and storing everything as JSON files doesn't give fast enough retrieval for my use case.
Update
I was able to install MongoDB and set up a local database. It hasn't given me any problems yet. RavenDB would be appealing if I understood it better, but I guess I'll stick with the tried and true for this project.
It is highly unlikely that you managed to run out of resources on the server with 3 cores / 6 GB unless you are pushing hundreds of millions of documents and doing very heavy work.
Do you get any error on the server? There should be more details in the error itself or in the server log.

HazelCast Member with/without Client is ok for standalone web application

I am new to caching mechanisms and have just started learning about Hazelcast. I went through a couple of tutorials and the Hazelcast site, but I am still not clear.
I am trying to build caching for my Spring Boot and Angular application. It is a single standalone application.
Since my application is a single instance and there is no plan to run multiple instances, can I just go with a Hazelcast member without a client? Is a client needed?
No, the client is not mandatory, and for your case it would seem unnecessary.
The idea is abstraction: you ask Hazelcast for item X and it is returned if it exists. Hazelcast works out where that item is held, and mostly this is hidden from you.
X could be found in your process:
Your process is a client, has near-caching active, and has a copy.
Your process is one of 1 or more servers, and happens to be the server responsible for storing item X.
X could be found in another process:
Your process is a client, has no near-caching, and so is not storing anything.
Your process is one of several servers, and it happens that one of the other servers is responsible for item X.
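As a rough illustration of the four cases above, here is a toy model in Python. It is not Hazelcast code: the hash-modulo `owner_of` function is an assumption standing in for Hazelcast's own partitioning, and every name in it is hypothetical.

```python
import zlib


def owner_of(key, members):
    """Pick the server responsible for a key (toy hash-modulo scheme)."""
    return members[zlib.crc32(key.encode("utf-8")) % len(members)]


def where_is(key, me, members, is_client=False, near_cache=()):
    """Return "local" if this process holds the item, otherwise "remote"."""
    if is_client:
        # A client only has the item locally when near-caching kept a copy.
        return "local" if key in near_cache else "remote"
    # A server has it locally only when it is the responsible member.
    return "local" if owner_of(key, members) == me else "remote"


# A single standalone member owns every key, so every lookup is local.
print(where_is("X", me="m1", members=["m1"]))
```

With one member and no client, which is the setup asked about, every lookup stays in process.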
"Mostly this is hidden from you" == There will be a retrieval time difference between data found in the same process and data retrieved from another process, as it has to pass across the network. If this is a significant difference at low volumes, it's time to upgrade the network.

Process Laravel/Redis job from multiple server

We are building a reporting app on Laravel that needs to fetch users' data from a third-party server that allows 1 request per second.
We need to fetch 100K to 1000K rows per user, and we can fetch at most 250 rows per request.
So the restrictions are:
1. We can send 1 request per second
2. 250 rows per request
So it takes 400-4000 requests/jobs to fetch one user's data. Loading data for multiple users is therefore very time-consuming, and the server gets slow.
We are now planning to load the data using multiple servers, say 4-10 servers, so that we can send 10 requests per second from 10 servers.
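The numbers above can be checked with a small calculation. This is a sketch in Python; `fetch_plan` is a hypothetical helper, and the `workers` parameter assumes, as the plan above does, that the third-party limit applies per server.

```python
import math

ROWS_PER_REQUEST = 250    # maximum rows returned by one request
REQUESTS_PER_SECOND = 1   # allowed request rate (assumed to be per server)


def fetch_plan(total_rows, workers=1):
    """Return (number of requests, seconds needed) to fetch total_rows."""
    requests = math.ceil(total_rows / ROWS_PER_REQUEST)
    seconds = requests / (REQUESTS_PER_SECOND * workers)
    return requests, seconds


for rows in (100_000, 1_000_000):
    requests, seconds = fetch_plan(rows)
    print(f"{rows} rows: {requests} requests, about {seconds / 60:.1f} minutes on one server")
```

Calling `fetch_plan(1_000_000, workers=10)` gives the same 4000 requests spread over 400 seconds, but only if the per-server assumption holds.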
How can we design the system and process jobs from multiple servers?
Is it possible to use a dedicated server for hosting Redis and connect to that Redis server from multiple servers and execute jobs? Can any conflict/race-condition happen?
Any hint or prior experience related to this would be really helpful.
The short answer is yes, this is absolutely possible and is something I've implemented in production apps many times before.
Redis is just like any other service: it can run anywhere, with clients connecting to it from anywhere. How exactly that happens is up to your server configuration (adding passwords, configuring spiped, limiting access via the firewall, etc.). I'd recommend reading the documentation in the Administration section here: https://redis.io/documentation
Also, when you do make the move to a dedicated Redis host, with multiple clients accessing it, you'll likely want to look into having more than just one Redis server running for reliability, high availability, etc. Redis has efficient and easy replication available with a few simple configuration commands, which you can read more about here: https://redis.io/topics/replication
One last thing on Redis: if you do end up implementing a master-slave setup, you may want to look into high availability and automatic failover in case your master instance goes down. Redis ships with a utility that can monitor your master and slaves, detect when the master is down, and automatically reconfigure your servers to promote one of the slaves to be the new master. The utility is called Redis Sentinel, and you can read about it here: https://redis.io/topics/sentinel
For your question about race conditions, it depends on how exactly you write your jobs that are pushed onto the queue. For your use case though, it doesn't sound like this would be too much of an issue, but it really depends on the constraints of the third-party system. Either way, if you are subject to a race condition, you can still implement a solution for it, but would likely need to use something like a Redis Lock (https://redis.io/topics/distlock). Taylor recently added a new feature to the upcoming Laravel version 5.6 that I believe implements a version of the Redis Lock in the scheduler (https://medium.com/#taylorotwell/laravel-5-6-preview-single-server-scheduling-54df8e0e139b). You can look into how that was implemented, and adapt for your use case if you end up needing it.
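A minimal sketch of the single-instance locking pattern described on the distlock page: acquire with SET ... NX PX using a random token, and release only if the token still matches. It is written in Python against an in-memory stand-in (`FakeRedis`, hypothetical) so that it runs without a server. Against a real Redis the compare-and-delete in the release step has to be atomic, for example via a Lua script as that page describes. Time is passed in explicitly (`now_ms`) only to make expiry easy to follow.

```python
import uuid


class FakeRedis:
    """In-memory stand-in for the two operations the lock needs."""

    def __init__(self):
        self._data = {}  # key -> (value, expiry time in ms)

    def set_nx_px(self, key, value, ttl_ms, now_ms):
        """Mimic SET key value NX PX ttl: set only if absent or expired."""
        held = self._data.get(key)
        if held is not None and held[1] > now_ms:
            return False
        self._data[key] = (value, now_ms + ttl_ms)
        return True

    def delete_if_equal(self, key, value):
        """Delete the key only when it still holds the caller's token."""
        held = self._data.get(key)
        if held is not None and held[0] == value:
            del self._data[key]
            return True
        return False


def acquire(store, resource, ttl_ms, now_ms):
    """Return a token when the lock was taken, or None when it is held."""
    token = uuid.uuid4().hex
    return token if store.set_nx_px(resource, token, ttl_ms, now_ms) else None


def release(store, resource, token):
    """Release the lock; a worker cannot release a lock it does not own."""
    return store.delete_if_equal(resource, token)
```

A job would call `acquire` before talking to the third-party API for a given user, skip or retry later when it gets `None`, and call `release` when done. The TTL keeps a crashed worker from holding the lock forever.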

Azure Cache - How long will data persist, how to detect restart

With the Azure Cache service, I am trying to find details of the following aspects of the service:
What is the maximum length of time something can be kept in the cache? I presume it would be until the cache service is restarted;
Is there a way to detect that the cache service has been restarted?
My intention is to use Azure Cache to store datasets that are frequently being accessed, and that would be updated / added to over time as data that is incoming into my system is processed.
How would I know / be notified that the cache has restarted (I guess apart from seeing if it is empty) so I could kick off a process to repopulate it?
I think you'll find the answers to your questions here: http://msdn.microsoft.com/en-us/library/dn386128.aspx
Alright, information on High Availability of the cache service is here: http://msdn.microsoft.com/en-us/library/dn386134.aspx

Alternative to session replication \ tomcat clustering

We have 3 tomcats with the same web app, using the same DB.
We want to use non-sticky sessions.
This means we will have to share (replicate) the session between the Tomcats (a cluster?).
We don't like the idea of the DeltaManager, since it is all-to-all replication with a performance cost.
However, we don't really like the BackupManager either (still multiple copies).
My question is:
Is it possible to define a single Tomcat that acts as a "session manager", so that the other Tomcats do not keep sessions themselves?
That way no broadcasting of sessions would be needed.
My reading of the Tomcat docs finds:
... when using the delta manager it will replicate to all nodes, even
nodes that don't have the application deployed.
exactly as you say, but it then says:
To get around this problem, you'll want to use the BackupManager. This
manager only replicates the session data to one backup node
You seem to object to "multiple copies", but this doesn't seem very different from your proposal; the BackupManager is, so far as I can see, acting as a session manager.
When you don't have sticky sessions, you are pretty much guaranteeing that 2 of every 3 requests will need to get a copy of the session data from somewhere else. With only 3 Tomcats, how much performance cost would all-to-all replication really impose?
I suspect that tuning your session sizes is more important. Large sessions tend to be a problem for any sort of replication.
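The arithmetic behind the "2 of every 3" figure and the replication cost can be sketched in a few lines of Python. The function names are hypothetical, and the model makes two simplifying assumptions: requests are routed uniformly across nodes, and a session is held on exactly one node until it is replicated.

```python
from fractions import Fraction


def remote_fraction(nodes):
    """Share of requests that land on a node not holding the session."""
    return Fraction(nodes - 1, nodes)


def copies_per_update(nodes, manager):
    """Session copies sent to other nodes for one session update."""
    if manager == "delta":
        return nodes - 1          # all-to-all: every other node gets a copy
    if manager == "backup":
        return min(1, nodes - 1)  # a single backup node gets a copy
    raise ValueError(f"unknown manager: {manager}")


print(remote_fraction(3))
print(copies_per_update(3, "delta"), copies_per_update(3, "backup"))
```

With 3 nodes the difference is two copies per update versus one, which is why session size is likely to matter more than the choice of manager at this scale.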
