I'm sure the answer will vary greatly depending on business needs, but are there any insights on this? If my main purpose in using Redis Cache is fast in-memory data storage and retrieval, should I set the expiry time for data to 10 minutes? 15 minutes? 1 hour? 1 day?
Probably for as long as it doesn't get stale. Sometimes that's minutes, sometimes hours or even days.
However, if that means your cache just keeps filling up and hitting its memory limit, you may run into other issues from the resulting overhead: OOM errors, evictions, higher page faulting, and so on.
So how long data stays in the cache is likely to be a balance: keep the data up to date, but also leave some headroom and avoid constantly running at full memory.
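As a rough illustration, here is a minimal cache-aside sketch using the redis-py client; the 15-minute TTL and the get_from_db loader are placeholders showing where the expiry is attached, not a recommendation for any particular workload.

import json
import redis

r = redis.Redis(host="localhost", port=6379)

TTL_SECONDS = 15 * 60  # placeholder value; tune it against staleness vs. memory pressure

def get_from_db(key):
    """Placeholder for the slow lookup you are trying to avoid."""
    raise NotImplementedError

def get_cached(key):
    """Cache-aside read: serve from Redis if present, otherwise load and cache with a TTL."""
    raw = r.get(key)
    if raw is not None:
        return json.loads(raw)
    value = get_from_db(key)
    # SET with EX attaches the expiry, so memory is reclaimed automatically.
    r.set(key, json.dumps(value), ex=TTL_SECONDS)
    return value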
We have a batch process that executes every day. This week, a job that usually does not exceed 18 minutes of execution time (real time, as you can see) is now taking more than 45 minutes to finish.
The FULLSTIMER option is already active, but we don't know why only the real time has increased.
In older documentation there are FULLSTIMER stats that could help identify the problem, but they do not appear in the batch log. (The stats are the ones below: Page Faults, Context Switches, Block Operations, and so on, as you can see.)
It might be an I/O issue. Does anyone know how we can identify if it is really an I/O problem or if it could be some other issue (network, for example)?
To be more specific, this is one of the queries whose run time has increased dramatically. As you can see, it reads from a database (SQL Server, VAULT schema) and from WORK, and writes to the WORK directory.
The number of observations is almost the same:
We asked the customer about any change in network traffic, and they said it is still the same.
Thanks in advance.
For a process to complete, much more needs to be done than the actual calculations on the CPU.
Your data has to be read and your results have to be written.
You might have to wait for other processes to finish first, and if your process includes multiple steps, writing to and reading from disk each time, you will have to wait for the CPU each time too.
In our situation, when real time is much larger than CPU time, we usually see heavy traffic to our Network File System (NFS).
As a programmer, you might notice that storing intermediate results in WORK is more efficient than storing them in remote libraries.
You might save a lot of time by creating intermediate results as views instead of tables, IF you only use them once. That is not only possible in SQL, but also in DATA steps like this:
data MY_RESULT / view=MY_RESULT;  /* define a view, so no table is written to disk */
  set MY_DATA;
  where transaction_date between '1jan2022'd and '30jun2022'd;
run;
I have a few queries to a database that return absolutely constant responses, i.e. some entries in this database are never changed after being written.
I'm wondering, if I implement caching for them with Redis, should I set an expiration time?
Pros and cons of not doing that -
Pros: Users will always benefit from caching (except for the first query)
Cons: The number of these entries to be queried is growing. So Redis will end up using more and more memory.
Edit
To give more context, the queries run quite slowly. Each of them may take seconds. It would be beneficial to minimize the number of users who experience this.
Also, each of these results is on the order of several kB in size; the number (not the size) of entries may increase by about 1 per minute.
Sorry for answering with questions. Still waiting for enough reputation to comment and clarify.
Answering your direct question:
Are the number of queries you expect unbounded?
No: You could improve the first user's experience by triggering the queries on startup and leaving the results in the cache. For other responses that are expected to change, you could attach a TTL and use one of the following maxmemory-policy settings in the config to evict only keys with TTLs: volatile-ttl, volatile-lru, volatile-lfu, or volatile-random.
Yes: Prioritize these by attaching a TTL and refreshing it each time the key is requested, so the entry stays in the cache as long as possible, and use whichever memory-management policy best fits the rest of your use case.
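A minimal sketch of that "refresh the TTL on every read" idea with redis-py, assuming a volatile-* maxmemory-policy so only keys carrying TTLs get evicted; the one-hour TTL and the load callback are assumptions.

import redis

r = redis.Redis()

TTL_SECONDS = 60 * 60  # assumed value; long enough that hot keys stay resident

def get_with_sliding_ttl(key, load):
    """Return a cached value, pushing its expiry forward on every hit.

    `load` is whatever slow query produces the value on a miss.
    """
    value = r.get(key)
    if value is not None:
        r.expire(key, TTL_SECONDS)     # refresh the TTL so frequently read keys stay cached
        return value
    value = load()
    r.set(key, value, ex=TTL_SECONDS)  # the first write also carries the TTL
    return value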
Related concerns:
If these are really static values, why are you querying a database rather than reading from a flat file of constants generated once and read at startup?
Have you attempted to optimize your queries?
I am evaluating ClickHouse's performance for potential use in a project. The write performance has been encouraging up to this point but as I was running my tests and had to restart the server a few times, I noticed an issue which has the potential of being a hard showstopper: the server startup time is fluctuating and most of the times extremely high.
My evaluation server contains 26 databases holding about 54 billion records and taking up 697.32 GB on disk.
With this amount of data I have been getting startup times from as low as 7m35s to almost 3h.
Is this normal? Can it be solved with some fancier configuration? Am I doing something really wrong? Because, as it stands, such a long startup time is a showstopper.
The main cause of slow startup is the huge amount of metadata that has to be loaded, which correlates with the number of data files. To improve startup time, you need to either shrink the file count or add more memory so that all the dentry and inode caches can be kept warm.
My evaluation server contains 26 databases holding about 54 billion records and taking up 697.32 GB on disk.
I'd suggest the following:
Try adjusting the current data partitioning scheme to be coarser (fewer, larger partitions)
Use OPTIMIZE TABLE <table> FINAL to compact each table's data parts (a quick way to check part counts is sketched below)
Upgrade the data disk to SSD or an efficient RAID, or use a file system such as btrfs to store the metadata separately on fast storage.
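To see whether part count is actually what is hurting you, here is a small sketch using the third-party clickhouse-driver Python client (the connection details and the LIMIT are placeholders); it lists the tables with the most active parts and shows where an OPTIMIZE ... FINAL would be run.

from clickhouse_driver import Client

client = Client(host="localhost")  # placeholder connection details

# Count active data parts per table; very large counts mean a lot of metadata
# has to be loaded at server startup.
rows = client.execute(
    "SELECT database, table, count() AS parts "
    "FROM system.parts "
    "WHERE active "
    "GROUP BY database, table "
    "ORDER BY parts DESC "
    "LIMIT 20"
)
for database, table, parts in rows:
    print(f"{database}.{table}: {parts} active parts")

# For the worst offenders, force a merge into fewer parts (this can be expensive):
# client.execute("OPTIMIZE TABLE my_db.my_table FINAL")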
I have a 33 MB collection with around 33k items in it. It has been working perfectly for the past month: the queries were responsive and there were no slow queries. The collection has all the required indexes, and normally the response is almost instant (1-2 ms).
Today I spotted that there was a major query queue and the requests were just not getting processed. The oplog was filling up and just not clearing. After some searching I found the post below, which suggests compact and repairDatabase. I ran the repair and it fixed the problem. Ridiculously slow mongoDB query on small collection in simple but big database
My question is: what could have gone wrong with the collection, and how did repairDatabase fix the problem? Is there a way for me to ensure this does not happen again?
There are many things that could be an issue here, but ultimately if a repair/compact solved things for you it suggests storage related issues. Here are a few suggestions to follow up on:
Disk performance: Ensure that your disks are performing properly and that you do not have bad sectors. If part of your disk is damaged it could have spiked access times and you may run into this again. You may want to test your RAM modules as well.
Fragmentation: Without knowing your write profile it's hard to say, but your collections and indexes could have fragmented all over your storage system. Running repair will have rebuilt them and brought them back into a more contiguous form, allowing your disk access times to be much faster, especially if you're using mechanical disks and are going to disk for a lot of data.
If this was the issue then you may want to adjust your paddingFactor to reduce the frequency of this in the future, especially if your updates are growing the size of your documents over time. (Assuming you're using MMAPv1 storage).
Page faults: I'm assuming you may have brought the system down to do the repair, which may have reset your memory/working set. You might want to monitor for hard page faults that indicate that your queries are being bottlenecked by IO rather than being served by your in-memory working set. If this is consistently the case, your application behavior may change unexpectedly as data gets pushed in and out of memory, and you may need to add more RAM.
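For ongoing monitoring of the fragmentation and page-fault signals above, here is a small pymongo sketch; the connection string, database, and collection names are placeholders.

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # placeholder connection string
db = client["mydb"]                                # placeholder database name

# collStats: a storageSize much larger than size suggests fragmentation that
# a compact/repair would reclaim (most relevant on MMAPv1).
stats = db.command("collStats", "mycollection")
print("size:", stats["size"], "storageSize:", stats["storageSize"])

# serverStatus: a fast-growing extra_info.page_faults counter means queries are
# going to disk instead of being served from the in-memory working set.
status = db.command("serverStatus")
print("page faults:", status.get("extra_info", {}).get("page_faults"))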
When you have peaks of 600 requests/second, memcached dropping an item because its TTL expired has some pretty negative effects: at almost the same time, 200 threads/processes find the cache empty and fire off a DB request to fill it up again.
What is the best practice to deal with these situations?
P.S. What is the term for this situation? (It gives me a chance to get better Google results on the topic.)
If you have memcached objects which will be needed by a large number of requests (which you imply is the case), then I would look into having a separate process or cron job that regularly recalculates and refreshes these objects. That way they should never expire. It's a common trade-off: you add a little unnecessary load during low-traffic periods to reduce the load during peaks (the time you probably care most about).
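A minimal sketch of that idea using the pymemcache client; build_expensive_object, the key name, and the 5-minute interval are assumptions standing in for your real query and schedule.

import json
import time
from pymemcache.client.base import Client

cache = Client(("localhost", 11211))

def build_expensive_object():
    """Placeholder for the slow DB query whose result every request needs."""
    raise NotImplementedError

def refresher(interval_seconds=300):
    """Run from a cron job or background process so the hot key is never missing under load."""
    while True:
        value = build_expensive_object()
        # expire=0 means no TTL: the refresher, not memcached, decides freshness.
        cache.set("hot:report", json.dumps(value).encode("utf-8"), expire=0)
        time.sleep(interval_seconds)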
I found out this is referred to as "stampeding herd" by the memcached folks, and they discuss it here: http://code.google.com/p/memcached/wiki/NewProgrammingTricks#Avoiding_stampeding_herd
My next suggestion was actually going to be using soft cache limits as discussed in the link above.
If your object is expiring because you've set an expiry and it has passed, there is nothing you can do but increase the expiry time.
If you are worried about stale data, there are a few techniques you could consider:
Consider making the cache the authoritative source for whatever data you are looking at, and have a thread whose job is to keep it fresh. The other threads then no longer refill the cache themselves, so it may only make sense if you can afford a dedicated refresher for that data.
Rather than setting a TTL on the data, change whatever process updates the data to update the cache. One technique I use for frequently changing data is to do this probabilistically -- 10% of the time data is written, it is updated. You can tune this for whatever is sensible, depending on how expensive the DB query is and how severe the impact of stale data.
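A sketch of that probabilistic write-through idea with pymemcache; the 10% figure comes from the answer above, while the key format and the save_to_db callback are assumptions.

import json
import random
from pymemcache.client.base import Client

cache = Client(("localhost", 11211))

REFRESH_PROBABILITY = 0.10  # roughly "10% of the time data is written, the cache is updated"

def write_record(record_id, data, save_to_db):
    """Write to the database every time, but only refresh the cache on a fraction of writes."""
    save_to_db(record_id, data)
    if random.random() < REFRESH_PROBABILITY:
        # No TTL here either: the writers keep the cached copy fresh enough.
        cache.set("record:%s" % record_id, json.dumps(data).encode("utf-8"), expire=0)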