Read-Through cache with SnappyData - caching

Can we get read-through cache behavior? That is, the application issues a SQL query to SnappyData, and SnappyData checks whether the data is already in its cache. If it is, SnappyData returns the data. If it is not, SnappyData loads it into the cache from the data store. The backend data store can be any SQL-compatible data store. This way, the application only needs to talk to SnappyData and never has to talk to the underlying data storage.
Thanks

No, SnappyData cannot be used as a "read-through cache" out of the box today. That said, we are contemplating a model where the in-memory tables can be configured to manage "hot" data and queries could be delegated to the backend in some cases.
What class of queries do you wish to run on SnappyData? Perhaps there is a way to solve this.

SnappyData does not have this feature as of today. But you can build your own application in which data flows from a SQL-compatible data store into SnappyData on demand.
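The "on demand" flow can be sketched roughly as follows. This is only a minimal illustration, not SnappyData code: an in-memory `sqlite3` database stands in for the SQL-compatible backing store, a plain dictionary stands in for the SnappyData table, and the class name `ReadThroughCache` and the `users` table are made up for the example.

```python
import sqlite3


class ReadThroughCache:
    """Serves reads from memory and loads missing rows from the backing store."""

    def __init__(self, conn):
        self.conn = conn   # backing SQL store (sqlite3 is a stand-in here)
        self.rows = {}     # in-memory copy of the rows requested so far
        self.loads = 0     # how many times the backing store was queried

    def get(self, user_id):
        if user_id not in self.rows:
            # Cache miss: fetch the row from the backing store and keep it.
            self.loads += 1
            row = self.conn.execute(
                "SELECT name FROM users WHERE id = ?", (user_id,)
            ).fetchone()
            self.rows[user_id] = row[0] if row else None
        return self.rows[user_id]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")

cache = ReadThroughCache(conn)
first = cache.get(1)    # miss: loaded from the store
second = cache.get(1)   # hit: served from memory
```

The application only ever calls `cache.get`, so it does not need to know whether a row came from memory or from the backing store.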

Related

When to use Redis (in-memory db) vs when to use a cache?

I've recently learned that Uber uses a cache to store its map data, whereas Twitter uses Redis to store and retrieve data related to a user's homepage. I'm trying to understand when to use a cache vs an in-memory database such as Redis. It seems like fast retrieval is required in both of the cases I described.
Thanks!
An in-memory database is also a cache. What we usually mean by cache is that the data is retrieved from memory, and not from disk. In the case of Uber, it looks like they are also using Redis as a cache: https://eng.uber.com/tech-stack-part-one-foundation/.

Clarification on database caching

Correct me if I'm wrong, but from my understanding, "database caches" are usually implemented with an in-memory database that is local to the web server (same machine as the web server). Also, these "database caches" store the actual results of queries. I have also read up on the multiple caching strategies like - Cache Aside, Read Through, Write Through, Write Behind, Write Around.
For some context, the Write Through strategy looks like this: [diagram not available]
and the Cache Aside strategy looks like this: [diagram not available]
I believe that the "Application" refers to a backend server with a REST API.
My first question is, in the Write Through strategy (application writes to cache, cache then writes to database), how does this work? From my understanding, the most commonly used database caches are Redis or Memcached - which are just key-value stores. Suppose you have a relational database as the main database, how are these key-value stores going to write back to the relational database? Do these strategies only apply if your main database is also a key-value store?
In a Write Through (or Read Through) strategy, the cache sits in between the application and the database. How does that even work? How do you get the cache to talk to the database server? From my understanding, the web server (the application) is always the one facilitating the communication between the cache and the main database - which is basically a Cache Aside strategy. Unless Redis has some kind of functionality that allows it to talk to another database, I don't quite understand how this works.
Isn't it possible to mix and match caching strategies? From how I see it, Cache Aside and Read Through are caching strategies for application reads (user wants to read data), while Write Through and Write Behind are caching strategies for application writes (user wants to write data). Couldn't you have a strategy that uses both Cache Aside and Write Through? Why do most articles always seem to portray them as independent strategies?
What happens if you have a cluster of web servers? Do they each have their own local in-memory database that acts as a cache?
Could you implement a cache using a normal (not in-memory) database? I suppose this would still be somewhat useful since you do not need to make an additional network hop to the database server (since the cache lives on the same machine as the web server)?
Introduction & clarification
I think you have misunderstood one point: the cache is NOT necessarily stored on the same server as the web server. Sometimes not even the database is separated onto its own server, apart from the web server. If you think of APIs, like HTTP REST APIs, you can use caching to avoid spending too many resources on database connections and queries. Generally, you want to use as few database connections and queries as possible. Now imagine the following setting:
You have a web server that serves your application and a REST API, which the web server uses to work with some resources. Those resources come from a database (let's say a relational database) that is stored on the same server. Now there is one endpoint which serves, for example, a list of posts (like blog posts). Every user can fetch all posts (to keep this example simple). This is a case where the API request can be cached, so that users do not all hit the database just to query the same resources (via the REST API) over and over again. This is where caching comes in. Redis is one of many tools which can be used for caching. Since Redis is a simple in-memory key-value store, you can just put all of your posts (remember the REST API) into the cache after the first DB query. All future requests for the posts list would first check whether the posts are already cached. If they are, the API returns the cached content for this specific request.
This is one simple example to show what caching can be used for.
Answers to your questions
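The posts-list example could look roughly like this. It is a sketch under several assumptions: a dictionary stands in for Redis, `sqlite3` stands in for the relational database, and the handler name `list_posts`, the cache key `posts:all`, and the `posts` table are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO posts (title) VALUES (?)", [("first",), ("second",)])

cache = {}                   # stand-in for a key-value store such as Redis
stats = {"db_queries": 0}    # counts how often the database is actually hit


def list_posts():
    """Handler for a hypothetical GET /posts endpoint."""
    cached = cache.get("posts:all")
    if cached is not None:
        return json.loads(cached)            # cache hit: no database work
    stats["db_queries"] += 1
    rows = conn.execute("SELECT id, title FROM posts ORDER BY id").fetchall()
    posts = [{"id": r[0], "title": r[1]} for r in rows]
    cache["posts:all"] = json.dumps(posts)   # store the serialized result
    return posts


a = list_posts()   # first request: queries the database and fills the cache
b = list_posts()   # second request: answered from the cache
```

Here the application itself checks the cache and falls back to the database, which is the Cache Aside pattern from the question. When posts change, the `posts:all` entry would have to be deleted or rewritten, otherwise readers keep seeing the old list.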
My first question is, why would you ever write to a cache?
To reduce the amount of database connections and queries.
how is writing to these key-value stores going to help with updating the relational database?
It does not help you with updating; instead, it helps you spend fewer resources. It also helps with "temporarily backing up" some data, but only as a very minor side effect. For that purpose there are more attractive solutions (Redis is also not persistent by default, although it supports persistence).
Do these cache writing strategies only apply if your main database is also a key-value store?
No, it does not matter which database you use, whether it's a NoSQL or a SQL DB. It strongly depends on what you want to cache and how the database and its tables are set up. Do your resources change frequently? Do resources get updated manually or only by user-initiated actions? Those are the questions that lead you to the right caching implementation.
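To illustrate how a key-value cache can sit in front of a relational table, here is a small Write Through sketch. It assumes the cache layer is given a writer callback that knows how to turn a key/value pair into SQL; the names `WriteThroughCache` and `save_user` and the `users` table are invented for the example, and `sqlite3` stands in for the main database. Plain Redis or Memcached do not do this by themselves, so in practice this logic lives in a wrapper or in a caching product that supports such providers.

```python
import sqlite3


class WriteThroughCache:
    def __init__(self, writer):
        self.data = {}
        self.writer = writer   # callback that persists one key/value pair

    def put(self, key, value):
        self.writer(key, value)   # write to the database first
        self.data[key] = value    # then update the cached copy

    def get(self, key):
        return self.data.get(key)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")


def save_user(user_id, email):
    # Translates a key/value write into SQL for the relational table.
    conn.execute(
        "INSERT OR REPLACE INTO users (id, email) VALUES (?, ?)",
        (user_id, email),
    )
    conn.commit()


cache = WriteThroughCache(save_user)
cache.put(7, "a@example.com")
cache.put(7, "b@example.com")
stored = conn.execute("SELECT email FROM users WHERE id = 7").fetchone()[0]
```

The application only talks to `cache`; the mapping between cache keys and table rows is confined to the writer callback.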
Isn't it possible to mix and match caching strategies?
I am not an expert at caching strategies, but let me try:
I guess it is possible, but it also depends heavily on what you are doing in your DB and what kind of application you have. Once you know what kind of application you are building, you will know which strategy to use. I also suspect it is not recommended to mix those strategies, because they are coupled to your application type. In other words: it may not work out very well.
What happens if you have a cluster of web servers? Do they each have their own local in-memory database that acts as a cache?
I guess both are possible. Usually you have one database, maybe clustered or synchronized with copies, to which your web servers (e.g. REST APIs) make their requests. Then either each of your API servers has its own cache, so that it does not have to query the database at all (in cloud-based applications your database may also be on another, separate server, which means another "hop" in terms of networking), OR (which I can also imagine) you have another middleware between your (clustered) APIs and your (maybe also clustered) DB. But I guess that no one would do that because of the network traffic. It would result in a higher response time, which you usually want to prevent.
Could you implement a cache using a normal (not in-memory) database?
Yes, you could, but it would be much slower. A machine can access in-memory data faster than it can open another (local) connection to a database and query your cached entries. In addition, the database has to write the entries to files on your machine to persist the data.
Conclusion
All in all, it is all about fast response times and avoiding unnecessary network traffic. I hope that I could help you out a little bit.

Caching all entities in Cache Layer and Synchronizing with Database

Is it possible, reliable and secure to cache all entities in a distributed cache and notify the DAO layer on update? My idea is:
Use JPA 2.1 and Hibernate implementation.
On creation, persist it to the db.
After persisting it, cache it in the distributed cache.
Route all read actions to the cache.
On update, notify the DAO layer to update the entity.
Yes, you can design a system that will
On addition: persists data to db and adds to cache
On read: reads data from cache, and considers a cache miss as not present in database as well.
On update: updates data in db and then updates in cache (or vice versa)
On delete: deletes data from cache and then deletes from database
This approach will work fine if you have a single application using that database and if data is not that critical. However if data integrity is of more importance, you may face following problems in this approach:
You may face a cache miss when data is present in the database (persisted but not yet cached)
You may get stale data from the cache (updated in db but not yet updated in cache)
Also, if data is removed from the database by some other application, it will still remain cached in the distributed cache (invalid data on reads)
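The four operations and two of the problems listed above can be reproduced in a few lines. This is a sketch only: a dictionary stands in for the distributed cache, `sqlite3` stands in for the database behind the DAO layer, and the function names and the `items` table are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
cache = {}   # stand-in for the distributed cache


def create(item_id, name):
    conn.execute("INSERT INTO items VALUES (?, ?)", (item_id, name))
    cache[item_id] = name


def read(item_id):
    # All reads go to the cache; a miss is treated as "does not exist".
    return cache.get(item_id)


def update(item_id, name):
    conn.execute("UPDATE items SET name = ? WHERE id = ?", (name, item_id))
    cache[item_id] = name


def delete(item_id):
    cache.pop(item_id, None)
    conn.execute("DELETE FROM items WHERE id = ?", (item_id,))


create(1, "widget")
update(1, "gadget")

# Another application changes the table directly, bypassing the cache.
conn.execute("UPDATE items SET name = 'gizmo' WHERE id = 1")
conn.execute("INSERT INTO items VALUES (2, 'sprocket')")

stale = read(1)      # the cache still holds the value from before the outside update
missing = read(2)    # the row exists in the database but was never cached
in_db = conn.execute("SELECT name FROM items WHERE id = 1").fetchone()[0]
```

After the outside writes, `read(1)` and the database disagree, and `read(2)` reports a row as absent even though it exists.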
A better method may be to use a feature-rich distributed caching solution like NCache / Tayzgrid, which provides Read Through / Write Behind features. This way your application only needs to use the cache for all reads, writes or updates, and the cache keeps the database updated using configured providers.
Another approach may be to use the distributed cache as Hibernate's second level cache, so that you do not need to add a caching layer yourself. See this article for details about Hibernate's second level cache.
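As a rough idea of what Write Behind means, the sketch below queues writes in the cache and applies them to the database later. It is not the API of NCache or Tayzgrid; the class `WriteBehindCache`, the explicit `flush` call and the `settings` table are assumptions made for illustration, and `sqlite3` stands in for the database.

```python
import sqlite3
from collections import deque


class WriteBehindCache:
    def __init__(self, conn):
        self.conn = conn
        self.data = {}
        self.pending = deque()   # writes not yet applied to the database

    def put(self, key, value):
        self.data[key] = value             # visible to readers immediately
        self.pending.append((key, value))  # persisted later

    def get(self, key):
        return self.data.get(key)

    def flush(self):
        """Apply queued writes in order (a real system would do this in the background)."""
        while self.pending:
            key, value = self.pending.popleft()
            self.conn.execute(
                "INSERT OR REPLACE INTO settings (name, val) VALUES (?, ?)",
                (key, value),
            )
        self.conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE settings (name TEXT PRIMARY KEY, val TEXT)")

cache = WriteBehindCache(conn)
cache.put("theme", "dark")
before = conn.execute("SELECT COUNT(*) FROM settings").fetchone()[0]
cache.flush()
after = conn.execute("SELECT val FROM settings WHERE name = 'theme'").fetchone()[0]
```

The trade-off is that writes still sitting in `pending` are lost if the cache process dies before they are flushed.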
Distributed caching solutions like Tayzgrid provide caching provider for hibernate that can be easily configured. You can find hibernate providers for other solutions as well.

How can couchbase be used as a caching layer on top of oracle?

I have Oracle as my main RDBMS for reads and writes, but I want to use Couchbase as a caching layer, since it has map-reduce and can be used as a memcache. Any idea how I can implement that, and how to transfer and update data in the caching layer when rows in Oracle are updated or inserted?
You have not told us anything about your current performance issues.
I have seen too many applications which do not really take advantage of RDBMS/SQL features, especially if an ORM sits in between.
The cure is to put another cache on top of a database, and to synchronize this in a cluster manually using IP multicasts (SwarmCache for example), message queues (JMS) or nightly import jobs. It could create more problems in the end. And it increases system complexity.
So my answer to your question is: I would not do it, as long as there is room for improvement regarding your data model and/or queries.
I believe your question is about database synchronization. This can be done through a combination of DB dependencies and "read-through" features, although I am not sure whether Couchbase offers them. With a DB dependency, cached items depend on DB items: if the DB items are updated or deleted, the corresponding dependent item in the cache is removed. At the same time you can write a "read-through" handler that is executed at the server level; the main purpose of this handler is to load fresh copies of the removed items into the cache. So, basically, you write the handler once and register it with the cache server, and the cache server executes it when needed to sync new items in the DB with the cache. This reading on DB synchronization can be useful. It is based on a product, NCache.
So your question is not directly related to Couchbase but, as others have stated, is more about how you can be alerted when data changes in your Oracle instance.
One thing that is not well known is the Oracle Database Change Notification feature that is quite cool for this:
http://docs.oracle.com/cd/E11882_01/java.112/e16548/dbchgnf.htm
So you can create an application that listens for changes and pushes the data into Couchbase.
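The listener idea could be shaped like the sketch below. It does not use the Oracle or Couchbase APIs: `sqlite3` stands in for Oracle, a dictionary stands in for the Couchbase bucket, and `on_row_changed` is a hypothetical callback that the notification mechanism would invoke with the id of each changed row. Here the notifications are simulated by calling it by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, price REAL)")
conn.execute("INSERT INTO products VALUES (1, 9.5)")

cache = {1: 9.5}   # stand-in for the caching layer


def on_row_changed(product_id):
    """Hypothetical callback, invoked once per changed row id."""
    row = conn.execute(
        "SELECT price FROM products WHERE id = ?", (product_id,)
    ).fetchone()
    if row is None:
        cache.pop(product_id, None)   # row was deleted: drop the cached copy
    else:
        cache[product_id] = row[0]    # row was inserted or updated: refresh it


conn.execute("UPDATE products SET price = 12.0 WHERE id = 1")
on_row_changed(1)                     # simulated notification for the update
refreshed = cache[1]

conn.execute("DELETE FROM products WHERE id = 1")
on_row_changed(1)                     # simulated notification for the delete
```

Re-reading the row inside the callback keeps the handler the same for inserts, updates and deletes, at the cost of one extra query per notification.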

Best way to cache persistent data (e.g. code books) in Spring?

I have a series of code books in my database, and I am using plain JDBC calls to fetch them and store them in a collection. I would like to put these in some kind of a cache at application startup time in order to save time later.
I don't need any fancy stuff like automatic object invalidation, TTL etc - the code books change rarely, so I'll trigger the update myself and just reload the whole cache when the need arises.
The project where I need this uses Spring, and this is my first project using it. Is there a standard/elegant way to do this in Spring?
Thanks.
Check out Spring-cache.
Supports EHCache, OSCache and a memory cache, but allows pluggable cache providers too.
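Independent of Spring, the "load everything at startup, reload the whole thing when asked" idea from the question is small enough to sketch. The example below is written in Python rather than as Spring configuration; `sqlite3` stands in for the JDBC data source, and the class `CodeBookCache` and the `code_books` table layout are assumptions.

```python
import sqlite3


class CodeBookCache:
    def __init__(self, conn):
        self.conn = conn
        self.books = {}
        self.reload()   # load everything once at startup

    def reload(self):
        """Rebuild the whole cache from the database; call this after an update."""
        fresh = {}
        for book, code, label in self.conn.execute(
            "SELECT book, code, label FROM code_books"
        ):
            fresh.setdefault(book, {})[code] = label
        self.books = fresh   # swap in the new snapshot in one assignment

    def label(self, book, code):
        return self.books.get(book, {}).get(code)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE code_books (book TEXT, code TEXT, label TEXT)")
conn.execute("INSERT INTO code_books VALUES ('country', 'DE', 'Germany')")

books = CodeBookCache(conn)
before = books.label("country", "FR")   # not in the table yet

conn.execute("INSERT INTO code_books VALUES ('country', 'FR', 'France')")
books.reload()                          # manual refresh after the change
after = books.label("country", "FR")
```

Because the code books change rarely, no TTL or invalidation logic is needed; the only moving part is the explicit `reload` call.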

Resources