Say I have some objects that need to be created only once in the application yet accessed from within multiple requests. The objects are immutable. What is the best way to do this?
Store them in the session.
If you don't want to lose them after a server's restart, then use a database (SQLite, which is a single file, for example).
You want to persist your objects. Normally you'd do that with an ORM such as Active Record or DataMapper, depending on what is available to you. If you want something dead simple without migrations and you have access to MongoDB, use MongoMapper.
If an object is only needed for a while and then discarded (and recreated if it is needed again), use a caching mechanism such as memcached or Redis.
If setting up such services is too heavy and you want to avoid it, and you are on Linux (say, Debian/Ubuntu), then save your objects into files (using Marshal) under /dev/shm, a tmpfs mount that lives in memory.
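A minimal sketch of the Marshal-to-file idea, assuming a hypothetical Settings value object and a made-up file name. It falls back to the regular temp directory when /dev/shm is not available, and it should only ever load files your own application wrote, since Marshal.load is unsafe on untrusted data.

```ruby
require "tmpdir"

# Hypothetical immutable value object, used only for illustration.
Settings = Struct.new(:name, :limits)

# Use the memory-backed /dev/shm mount when it exists and is writable;
# otherwise fall back to the ordinary temp directory.
CACHE_DIR = File.directory?("/dev/shm") && File.writable?("/dev/shm") ? "/dev/shm" : Dir.tmpdir
CACHE_FILE = File.join(CACHE_DIR, "myapp_settings.bin") # assumed file name

# Serialize the object and move it into place with a rename, so that a
# concurrent reader never sees a half-written file.
def store_object(obj, path = CACHE_FILE)
  tmp = "#{path}.#{Process.pid}.tmp"
  File.binwrite(tmp, Marshal.dump(obj))
  File.rename(tmp, path)
end

# Read the file back and rebuild the object.
def load_object(path = CACHE_FILE)
  Marshal.load(File.binread(path))
end

store_object(Settings.new("default", [10, 20]).freeze)
restored = load_object
puts restored.name
```

Any process on the same machine that knows the path can call load_object, which is what makes this usable across requests and across worker processes. The restored copy is a new, unfrozen object, so freeze it again if immutability matters.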
If the structure of the data is complex, then go with SQLite as suggested above.
Related
I am developing an SPA with a backend written in Java (Spring Boot). In the relational DB that the backend connects to, there is a table with some dictionary values. The values can be edited by users of the app, but that happens really, really rarely (almost never).
Those dictionary values are used in a lot of pages on UI and because of that I would like to "cache" them in a way. What I want to achieve is that I want to load dictionary values on startup to avoid asking DB for values during every request between UI and Backend.
Firstly, I thought about just loading it on the UI part of the app, when user enters the page for the first time. Then I ruled it out, since when one of the users changes the values, it should be reloaded.
What I think might work is loading them at backend startup into some collection that can be used safely in a concurrent environment (probably a ConcurrentMap) and then, during GET requests, asking that collection for the values instead of the DB. When the values are changed, the update request writes to the DB table and reloads the values into the collection.
Then I realised that the collection solution won't be enough once my backend is scaled to more than one instance. In that case, only the instance that handled the update will be refreshed, and the others will serve outdated data. We could work around this by forcing a refresh, e.g. every 15 minutes (instead of on demand when the values are updated).
But what I think is the best solution is to run a Redis service on the side, load the dictionary values into it, and after every DB update of the values update Redis with the new ones. Every backend instance would use the same Redis instance, which seems quicker than executing a query (select * from _ where _ = _) on the DB.
What do you think? Is my thought process correct? Do you have any ideas that could help solve my issue?
If you are using Spring, you could check out the Spring Cache Abstraction. With annotations such as @Cacheable and @CacheEvict, entries are cached on read and evicted when the underlying data changes, so your cache stays up to date.
A few implementations are supported out of the box. From the Spring documentation:
Spring provides a few implementations of that abstraction: JDK java.util.concurrent.ConcurrentMap based caches, Ehcache 2.x, Gemfire cache, Caffeine, and JSR-107 compliant caches (such as Ehcache 3.x). See Plugging-in Different Back-end Caches for more information on plugging in other cache stores and providers.
If you decide to use Memcached implementation you can check out this library (uses Xmemcached under the hood) here.
You could also check a small demo app of how to use Spring Cache Abstraction in your project (link).
I think you're on the right path with your approach to caching. I suggest you also check out Memcached for its simplicity. Redis is a good choice, but it depends on your requirements and on whether you need that many features. Just my 2 cents.
https://aws.amazon.com/elasticache/redis-vs-memcached/
https://devcenter.heroku.com/articles/spring-boot-memcache#add-caching-to-spring-boot
Thanks,
Correct me if I'm wrong, but from my understanding, "database caches" are usually implemented with an in-memory database that is local to the web server (same machine as the web server). Also, these "database caches" store the actual results of queries. I have also read up on the multiple caching strategies like - Cache Aside, Read Through, Write Through, Write Behind, Write Around.
For some context, the Write Through strategy looks like this:
and the Cache Aside strategy looks like this:
I believe that the "Application" refers to a backend server with a REST API.
My first question is, in the Write Through strategy (application writes to cache, cache then writes to database), how does this work? From my understanding, the most commonly used database caches are Redis or Memcached - which are just key-value stores. Suppose you have a relational database as the main database, how are these key-value stores going to write back to the relational database? Do these strategies only apply if your main database is also a key-value store?
In a Write Through (or Read Through) strategy, the cache sits in between the application and the database. How does that even work? How do you get the cache to talk to the database server? From my understanding, the web server (the application) is always the one facilitating the communication between the cache and the main database - which is basically a Cache Aside strategy. Unless Redis has some kind of functionality that allows it to talk to another database, I don't quite understand how this works.
Isn't it possible to mix and match caching strategies? From how I see it, Cache Aside and Read Through are caching strategies for application reads (user wants to read data), while Write Through and Write Behind are caching strategies for application writes (user wants to write data). Couldn't you have a strategy that uses both Cache Aside and Write Through? Why do most articles always seem to portray them as independent strategies?
What happens if you have a cluster of web servers? Do they each have their own local in-memory database that acts as a cache?
Could you implement a cache using a normal (not in-memory) database? I suppose this would still be somewhat useful since you do not need to make an additional network hop to the database server (since the cache lives on the same machine as the web server)?
Introduction & clarification
I think there is one misunderstanding: the cache is NOT necessarily stored on the same server as the web server. Sometimes not even the database is separated from the web server onto its own machine. If you think of APIs, such as HTTP REST APIs, you can use caching to avoid spending too many resources on database connections and queries. Generally, you want to use as few database connections and queries as possible. Now imagine the following setting:
You have a web server that serves your application and a REST API, which the web server uses to work with some resources. Those resources come from a database (let's say a relational database) that is hosted on the same server. One endpoint serves, for example, a list of posts (like blog posts). To keep the example simple, every user can fetch all posts. This is a case where the API response could be cached, so that users do not all hit the database just to query the same resources (via the REST API) over and over again. This is where caching comes in. Redis is one of many tools that can be used for caching. Since Redis is a simple in-memory key-value store, you can put all of your posts into the cache after the first DB query. All later requests for the list of posts first check whether the posts are already cached. If they are, the API returns the cached content for that request.
This is one simple example of what caching can be used for.
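The flow described above (check the cache first, fall back to the database, then fill the cache) can be sketched as follows. This is only an illustration under assumptions: PostsDatabase, KeyValueCache, and PostsApi are hypothetical names, and a mutex-guarded Hash stands in for Redis so the example runs without any external service.

```ruby
# Stand-in for the database; it counts how often it is actually queried.
class PostsDatabase
  attr_reader :query_count

  def initialize
    @query_count = 0
  end

  def all_posts
    @query_count += 1
    [{ id: 1, title: "Hello" }, { id: 2, title: "Caching" }]
  end
end

# Stand-in for Redis: an in-process key-value store guarded by a mutex.
class KeyValueCache
  def initialize
    @data = {}
    @lock = Mutex.new
  end

  def get(key)
    @lock.synchronize { @data[key] }
  end

  def set(key, value)
    @lock.synchronize { @data[key] = value }
  end

  def delete(key)
    @lock.synchronize { @data.delete(key) }
  end
end

# The API layer: look in the cache first, query the database only on a miss.
class PostsApi
  CACHE_KEY = "posts:all" # assumed key name

  def initialize(db, cache)
    @db = db
    @cache = cache
  end

  def list_posts
    cached = @cache.get(CACHE_KEY)
    return cached if cached

    posts = @db.all_posts
    @cache.set(CACHE_KEY, posts)
    posts
  end

  # Call this after a post is created or changed so the next read is fresh.
  def invalidate
    @cache.delete(CACHE_KEY)
  end
end

db = PostsDatabase.new
api = PostsApi.new(db, KeyValueCache.new)
api.list_posts
api.list_posts
puts "database queries: #{db.query_count}"
```

Only the first call reaches the database; later calls are answered from the cache until invalidate is called.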
Answers on your question
My first question is, why would you ever write to a cache?
To reduce the amount of database connections and queries.
how is writing to these key-value stores going to help with updating the relational database?
It does not help you with updating; it helps you spend fewer resources. As a very small side effect, it also gives you a "temporary backup" of some data, but there are more suitable solutions for that (Redis is an in-memory store first, although it does support persistence).
Do these cache writing strategies only apply if your main database is also a key-value store?
No, it does not matter which database you use, whether it's a NoSQL or an SQL DB. It strongly depends on what you want to cache and how the database and its tables are set up. Do your resources change frequently? Are resources updated manually or only by user-initiated actions? Questions like these lead you to the right caching implementation.
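To illustrate that the main database does not have to be a key-value store, here is a hedged sketch of a write-through layer that lives in the application code. UsersTable is a hypothetical stand-in for a relational table, and WriteThroughStore is an invented name; with a plain Redis or Memcached setup, it is typically the application or a caching library that performs both writes like this.

```ruby
# Stand-in for a relational table: rows keyed by primary key.
class UsersTable
  def initialize
    @rows = {}
  end

  def upsert(id, attrs)
    @rows[id] = attrs.dup
  end

  def find(id)
    @rows[id]
  end
end

# Write-through: every write goes to the database and to the cache in the
# same call, so a later read can be served from the cache.
class WriteThroughStore
  def initialize(table)
    @table = table
    @cache = {}
  end

  def write(id, attrs)
    @table.upsert(id, attrs)       # database first, so the cache never holds a row the DB lacks
    @cache[id] = attrs.dup.freeze  # then the cache, as part of the same operation
  end

  def read(id)
    return @cache[id] if @cache.key?(id)

    row = @table.find(id)
    @cache[id] = row if row
    row
  end
end

table = UsersTable.new
store = WriteThroughStore.new(table)
store.write(1, { name: "Ada" })
puts store.read(1)[:name]
```

The cache here only stores copies of rows under their primary key; nothing about the approach requires the database itself to be a key-value store.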
Isn't it possible to mix and match caching strategies?
I am not an expert at caching strategies, but let me try:
I guess it is possible, but it depends heavily on what you are doing in your DB and what kind of application you have. Once you know what kind of application you are building, you will know which strategy to use. I would also guess that mixing strategies is not recommended, because each strategy is coupled to the application type; in other words, it will probably not work out well.
What happens if you have a cluster of web servers? Do they each have their own local in-memory database that acts as a cache?
I guess both are possible. Usually you have one database, maybe clustered or synchronized with copies, to which your web servers (e.g. REST APIs) make their requests. Then either each of your API servers has its own cache, so that it does not have to query the database at all (in cloud-based applications the database may well be on a separate server, which means another "hop" in networking terms), OR (what I can also imagine) you have another middleware layer between your (clustered) APIs and your (maybe also clustered) DB. I guess that no one would do the latter because of the network traffic: it would result in higher response times, which you usually want to avoid.
Could you implement a cache using a normal (not in-memory) database?
Yes, you could, but it would be much slower. A machine can access in-memory data faster than it can open another (local) database connection and query the cached entries. A disk-backed database also has to write the entries to files on your machine in order to persist them.
Conclusion
All in all, it is about fast response times and avoiding unnecessary network traffic. I hope I could help you out a little bit.
I want to persist objects into Isolated Storage, so far I could think of these ways:
Serialize them into an XML file when saving and then deserialize them back when loading.
Use an object DB. I am unsure which one is good or recommended (examples are Perst, winphone7db and Sterling DB).
Can anyone suggest some best practices?
As a basic guideline:
If you need the functionality of a database (relations, transactions, search, etc.) then you should use a database.
If you just need an object store, then you should just save your objects into Isolated Storage directly (serialising where necessary).
I haven't used all of the different DB options available, but I would probably go with Perst as it's the most established (there's also a good guide here). Also, winphone7db is not available yet.
If I use Sequel in a Ruby app like this:
DB = Sequel.sqlite('testdb.db')
does it make the database shared? Can I access this same file from a different Ruby app AT THE SAME TIME and get the database to perform locking etc.?
I'm thinking probably not, and that I'd have to actually have a separate instance of the database running.
Yes, if you use a file-backed database, multiple processes can access it. They don't even have to be Ruby processes. Note that in SQLite's default rollback-journal mode a writer blocks all readers, so multi-process or multi-threaded write performance is not very good.
This is up to neither Ruby nor Sequel. It's up to SQLite. Take a look at the SQLite FAQ and see whether it answers your question.
I have a series of code books in my database, and I am using plain JDBC calls to fetch them and store them in a collection. I would like to put these in some kind of a cache at application startup time in order to save time later.
I don't need any fancy stuff like automatic object invalidation, TTL etc - the code books change rarely, so I'll trigger the update myself and just reload the whole cache when the need arises.
The project where I need this uses Spring, and this is my first project using it. Is there a standard/elegant way to do this in Spring?
Thanks.
Check out Spring-cache.
Supports EHCache, OSCache and a memory cache, but allows pluggable cache providers too.
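Independent of any particular cache provider, the "load everything at startup, reload on demand" approach from the question can be sketched in a few lines. Ruby is used here only for brevity; CodeBookCache and its loader block are hypothetical names, and in a Spring application the same shape would roughly map onto a singleton bean that swaps a reference to an immutable map.

```ruby
# Holds an immutable snapshot of all code books and replaces it wholesale
# whenever reload! is called.
class CodeBookCache
  # The block is the loader: it must return { book_name => { code => label } }.
  def initialize(&loader)
    @loader = loader
    @lock = Mutex.new
    reload!
  end

  # Build a fresh frozen snapshot first, then swap it in under the lock,
  # so readers always see either the old or the new data, never a mix.
  def reload!
    fresh = @loader.call.transform_values(&:freeze).freeze
    @lock.synchronize { @snapshot = fresh }
    self
  end

  def lookup(book, code)
    snapshot = @lock.synchronize { @snapshot }
    snapshot.dig(book, code)
  end
end

# Stand-in for the JDBC queries: an in-memory hash copied on every load.
SOURCE = { "countries" => { "DE" => "Germany" } }
cache = CodeBookCache.new { SOURCE.transform_values(&:dup) }
puts cache.lookup("countries", "DE")
```

Changes to the source only become visible after reload! is called, which matches the stated requirement of triggering the update manually.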