Redis high memory usage for almot no keys - heroku

I have a redis instance hosted by heroku ( https://elements.heroku.com/addons/heroku-redis ) and using the plan "Premium 1"
This redis is usued only to host a small queue system called Bull ( https://www.npmjs.com/package/bull )
The memory usage is now almost at 100 % ( of the 100 Mo allowed ) even though there is barely any job stored in redis.
I ran an INFO command on this instance and here are the important part ( can post more if needed ) :
# Server
redis_version:3.2.4
# Memory
used_memory:98123632
used_memory_human:93.58M
used_memory_rss:470360064
used_memory_rss_human:448.57M
used_memory_peak:105616528
used_memory_peak_human:100.72M
total_system_memory:16040415232
total_system_memory_human:14.94G
used_memory_lua:280863744
used_memory_lua_human:267.85M
maxmemory:104857600
maxmemory_human:100.00M
maxmemory_policy:noeviction
mem_fragmentation_ratio:4.79
mem_allocator:jemalloc-4.0.3
# Keyspace
db0:keys=45,expires=0,avg_ttl=0
# Replication
role:master
connected_slaves:1
master_repl_offset:25687582196
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:25686533621
repl_backlog_histlen:1048576
I have a really hard time figuring out how I can be using 95 Mo with barely 50 object stored. These objects are really small, usually a JSON with 2-3 fields containing small strings and ids
I've tried https://github.com/gamenet/redis-memory-analyzer but it crashes on me when I try to run it
I can't get a dump because Heroku does not allow it.
I'm a bit lost here, there might be something obvious I've missed but I'm reaching the limit of my understanding of Redis.
Thanks in advance for any tips / pointer.
EDIT
We had to upgrade our Redis instance to keep everything running but it seems the issue is still here. Currently sitting at 34 keys / 34 Mo
I've tried redis-cli --bigkeys :
Sampled 34 keys in the keyspace!
Total key length in bytes is 743 (avg len 21.85)
9 strings with 43 bytes (26.47% of keys, avg size 4.78)
0 lists with 0 items (00.00% of keys, avg size 0.00)
0 sets with 0 members (00.00% of keys, avg size 0.00)
24 hashs with 227 fields (70.59% of keys, avg size 9.46)
1 zsets with 23 members (02.94% of keys, avg size 23.00)
I'm pretty sure there is some overhead building up somewhere but I can't find what.
EDIT 2
I'm actually blind : used_memory_lua_human:267.85M in the INFO command I run when first creating this post and now used_memory_lua_human:89.25M on the new instance
This seems super high, and might explain the memory usage

You have just 45 keys in database, so what you can do is:
List all keys with KEYS * command
Run DEBUG OBJECT <key> command for each or several keys, it will return serialized length so you will get better understanding what keys consume lot of space.
Alternative option is to run redis-cli --bigkeys so it will show biggest keys. You can see content of the key by specific for the data type command - for strings it's GET command, for hashes it's HGETALL and so on.

After a lot of digging, the issue is not coming from Redis or Heroku in anyway.
The queue system we use has a somewhat recent bug where Redis ends up caching a Lua script repeatedly eating up memory as time goes on.
More info here : https://github.com/OptimalBits/bull/issues/426
Thanks for those who took the time to reply.

Related

Writing small amount of data to large number of files on GlusterFS 3.7

I'm experimenting with 2 Gluster 3.7 servers in 1x2 configuration. Servers are connected over 1 Gbit network. I'm using Debian Jessie.
My use case is as follows: open file -> append 64 bytes -> close file and do this in a loop for about 5000 different files. Execution time for such loop is roughly 10 seconds if I access files through mounted glusterfs drive. If I use libgfsapi directly, execution time is about 5 seconds (2 times faster).
However, the same loop executes in 50ms on plain ext4 disk.
There is huge performance difference between Gluster 3.7 end earlier versions which is, I believe, due to the cluster.eager-lock setting.
My target is to execute the loop in less than 1 second.
I've tried to experiment with lots of Gluster settings but without success. dd tests with various bsize values behave like that TCP no-delay option is not set, although from Gluster source code it seems that no-delay is default.
Any idea how to improve the performance?
Edit:
I've found a solution that works in my case so I'd like to share it in case anyone else faces the same issue.
The root cause of the problem is the number of roundtrips between client and Gluster server during execution of open/write/close sequence. I don't know exactly what is happening behind but timing measurements shows exactly that pattern. Now, the obvious idea would be to "pack" open/write/close sequence into a single write function. Roughly, the C prototype of such function would be:
int write(const char* fname, const void *buf, size_t nbyte, off_t offset)
But, there is already such API function glfs_h_anonymous_write in libgfapi (thanks goes to Suomya from Gluster mailing group). Kind of hidden thing there is the file identifier which is not plain file name, but something of type struct glfs_object. Clients obtain an instance of such object through API calls glfs_h_lookupat/glfs_h_creat. The point here is that glfs_object representing filename is "stateless" in a sense that corresponding inode is left intact (not ref counted). One should think of glfs_object as plain filename identifier and use it as you would use filename (actually, glfs_object stores plain pointer to corresponding inode without ref counting it).
Finally, we should use glfs_h_lookupat/glfs_h_creat once and write many times to the file using glfs_h_anonymous_write.
That way I was able to append 64 bytes to 5000 files in 0.5 seconds, which is 20 times faster than using mounted volume and open//write/close sequence.

Neo4J tuning or just more RAM?

I have a Neo4J-enterprise database running on a DigitalOcean VPS with 8Gb RAM and 80Gb SSD.
The performance of the Neo4J instance is awful at the moment:
match (n) where n.gram='0gram' AND n.word=~'a.' return n.word LIMIT 5 # 349ms
match (n) where n.gram='0gram' AND n.word=~'a.*' return n.word LIMIT 25 # 1588ms
I understand regex are expensive, but on likewise queries where I replace the 'a.' or 'a.*' part with any other letter, Neo4j simply crashes. I can see a huge build-up in memory before that (towards 90%), and the CPU sky-rocketing.
My Neo4j is populated as follows:
Number Of Relationship Type Ids In Use: 1,
Number Of Node Ids In Use: 172412046,
Number Of Relationship Ids In Use: 172219328,
Number Of Property Ids In Use: 344453742
The VPS only runs Neo4J (on debian 7/amd64). I use the NUMA+parallelGC flags as they're supposed to be faster. I've been tweaking my RAM settings, and although it doesn't crash at often now, I have a feeling there should be some gainings to be made
neostore.nodestore.db.mapped_memory=1024M
neostore.relationshipstore.db.mapped_memory=2048M
neostore.propertystore.db.mapped_memory=6144M
neostore.propertystore.db.strings.mapped_memory=512M
neostore.propertystore.db.arrays.mapped_memory=512M
# caching
cache_type=hpc
node_cache_array_fraction=7
relationship_cache_array_fraction=5
# node_cache_size=3G
# relationship_cache_size=1G --> these throw a not-enough-heap-mem error
The data is essentially a series of tree, where on node0 only a full text search is needed, the following nodes are searched by a property with floating point values.
node0 -REL-> node0.1 -REL-> node0.1.1 ... node0.1.1.1.1
\
-REL-> node0.2 -REL-> node0.2.1 ... node0.2.1.1
There are aprox. 5.000 top-nodes like node0.
Should I reconfigure my memory/cache usage, or should I just add more RAM?
--- Edit on Indexes ---
Because all tree's of nodes al always 4-levels deep, each level has a label for quick finding.in this case all node0 nodes have a label (called 0gram). the n.gram='0gram' should use the index coupled to the label.
--- Edit on new Config ---
I upgraded the VPS to 16Gb. The nodeStore has 2.3Gb (11%), PropertyStore 13.8Gb (64%) and the relastionshipStore amounts to 5.6Gb (26%) on the SSD.
On this basis I created a new config (detailed above).
I'm waiting for the full set of queries and will do some additional testing in the mean time
Yes you need to create an index, what's your label called? Imagine it being called :NGram
create index on :NGram(gram);
match (n:NGram) where n.gram='0gram' AND n.word=~'a.' return n.word LIMIT 5
match (n:NGram) where n.gram='0gram' AND n.word=~'a.*' return n.word LIMIT 25
What you're doing is not a graph search but just a lookup via full scan + property comparison with a regexp. Not a very efficient operation. What you need is FullTextSearch (which is not supported with the new schema indexes but still with the legacy indexes).
Could you run this query (after you created the index) and say how many nodes it returns?
match (n:NGram) where n.gram='0gram' return count(*)
which is the equivalent to
match (n:NGram {gram:'0gram'}) return count(*)
I wrote a blog post about it a few days ago, please read it and see if it applies to your case.
How big is your Neo4j database on disk?
What is the configured heap size? (in neo4j-wrapper.conf?)
As you can see you use more RAM than you machine has (not even counting OS or filesystem caches).
So you would have to reduce the mmio sizes, e.g. to 500M for nodes 2G for rels and 1G for properties.
Look at your store-file sizes and set mmio accordingly.
Depending on the number of nodes having n.gram='0gram' you might benefit a lot from setting a label on them and index for the gram property. If you have this in place a index lookup will directly return all 0gram nodes and apply regex matching only on those. Your current statement will load each and every node from the db and inspect its properties.

AT+CPMS "SM" Storage

AT+CPMS? query to a SIM returns
+CPMS: "SM",0,60, "SM,0,60,"SM",0,60
OK
The total storage is 60 messages maximum, and 0 are currently stored.
All messages are stored on the SIM.
An existing legacy application fails when trying to populate all 60 message boxes because message box 55 is the last successful write location.
When trying to populate the spaces 56-60, the command returns CMS Error 322 which is a memory full error.
The text in each SMS is simply "P H" which is 3 ASCII bytes, so each text doesn't use an excessive amount of memory.
Has anyone an explanation why the spaces 56-60 are unavailable, although the CPMS? query shows them as available? Thanks.

Ruby GC::Profiler "Use Size" is greater than "Total Size"

I'm trying to slim down a part of my application, so I'm using the GC::Profiler to tell whether this or that does actually slim down anything, but I'm getting a result I can't interpret:
Index Invoke Time(sec) Use Size(byte) Total Size(byte) Total Object GC Time(ms)
1 2.409 13943640 15840440 396011 36.48399999999973886133
.
.
102 7.074 30720080 30525000 763125 30.90900000000029734792
103 7.359 31267800 30525000 763125 34.99699999999972277465
.
.
1664 461.066 739610760 30525000 763125 32.36799999996264887159
According to the docs, this would seem to say that more than 2000% of the allocated heap is currently used...? And would indicate that my memory usage is growing, except that the total heap size apparently isn't...
My task is: I'm reading a large incoming HTTP stream directly to a file, and then reading the file one line at a time via lazy evaluation, performing a transform on each line, and writing it out to a file.
Any ideas as to what's going on?
Edit: Ruby 2.0 on a Mac
Edit 2: ps aux reports 2936436 VSS and 455836 RSS

Ruby Memory Management

I have been using Ruby for a while now and I find, for bigger projects, it can take up a fair amount of memory. What are some best practices for reducing memory usage in Ruby?
Please, let each answer have one "best practice" and let the community vote it up.
When working with huge arrays of ActiveRecord objects be very careful... When processing those objects in a loop if on each iteration you are loading their related objects using ActiveRecord's has_many, belongs_to, etc. - the memory usage grows a lot because each object that belongs to an array grows...
The following technique helped us a lot (simplified example):
students.each do |student|
cloned_student = student.clone
...
cloned_student.books.detect {...}
ca_teachers = cloned_student.teachers.detect {|teacher| teacher.address.state == 'CA'}
ca_teachers.blah_blah
...
# Not sure if the following is necessary, but we have it just in case...
cloned_student = nil
end
In the code above "cloned_student" is the object that grows, but since it is "nullified" at the end of each iteration this is not a problem for huge array of students. If we didn't do "clone", the loop variable "student" would have grown, but since it belongs to an array - the memory used by it is never released as long as array object exists.
Different approach works too:
students.each do |student|
loop_student = Student.find(student.id) # just re-find the record into local variable.
...
loop_student.books.detect {...}
ca_teachers = loop_student.teachers.detect {|teacher| teacher.address.state == 'CA'}
ca_teachers.blah_blah
...
end
In our production environment we had a background process that failed to finish once because 8Gb of RAM wasn't enough for it. After this small change it uses less than 1Gb to process the same amount of data...
Don't abuse symbols.
Each time you create a symbol, ruby puts an entry in it's symbol table. The symbol table is a global hash which never gets emptied.
This is not technically a memory leak, but it behaves like one. Symbols don't take up much memory so you don't need to be too paranoid, but it pays to be aware of this.
A general guideline: If you've actually typed the symbol in code, it's fine (you only have a finite amount of code after all), but don't call to_sym on dynamically generated or user-input strings, as this opens the door to a potentially ever-increasing number
Don't do this:
def method(x)
x.split( doesn't matter what the args are )
end
or this:
def method(x)
x.gsub( doesn't matter what the args are )
end
Both will permanently leak memory in ruby 1.8.5 and 1.8.6. (not sure about 1.8.7 as I haven't tried it, but I really hope it's fixed.) The workaround is stupid and involves creating a local variable. You don't have to use the local, just create one...
Things like this are why I have lots of love for the ruby language, but no respect for MRI
Beware of C extensions which allocate large chunks of memory themselves.
As an example, when you load an image using RMagick, the entire bitmap gets loaded into memory inside the ruby process. This may be 30 meg or so depending on the size of the image.
However, most of this memory has been allocated by RMagick itself. All ruby knows about is a wrapper object, which is tiny(1).
Ruby only thinks it's holding onto a tiny amount of memory, so it won't bother running the GC. In actual fact it's holding onto 30 meg.
If you loop over a say 10 images, you can run yourself out of memory really fast.
The preferred solution is to manually tell the C library to clean up the memory itself - RMagick has a destroy! method which does this. If your library doesn't however, you may need to forcibly run the GC yourself, even though this is generally discouraged.
(1): Ruby C extensions have callbacks which will get run when the ruby runtime decides to free them, so the memory will eventually be successfully freed at some point, just perhaps not soon enough.
Measure and detect which parts of your code are creating objects that cause memory usage to go up. Improve and modify your code then measure again. Sometimes, you're using gems or libraries that use up a lot of memory and creating a lot of objects as well.
There are many tools out there such as busy-administrator that allow you to check the memory size of objects (including those inside hashes and arrays).
$ gem install busy-administrator
Example # 1: MemorySize.of
require 'busy-administrator'
data = BusyAdministrator::ExampleGenerator.generate_string_with_specified_memory_size(10.mebibytes)
puts BusyAdministrator::MemorySize.of(data)
# => 10 MiB
Example # 2: MemoryUtils.profile
Code
require 'busy-administrator'
results = BusyAdministrator::MemoryUtils.profile(gc_enabled: false) do |analyzer|
BusyAdministrator::ExampleGenerator.generate_string_with_specified_memory_size(10.mebibytes)
end
BusyAdministrator::Display.debug(results)
Output:
{
memory_usage:
{
before: 12 MiB
after: 22 MiB
diff: 10 MiB
}
total_time: 0.406452
gc:
{
count: 0
enabled: false
}
specific:
{
}
object_count: 151
general:
{
String: 10 MiB
Hash: 8 KiB
BusyAdministrator::MemorySize: 0 Bytes
Process::Status: 0 Bytes
IO: 432 Bytes
Array: 326 KiB
Proc: 72 Bytes
RubyVM::Env: 96 Bytes
Time: 176 Bytes
Enumerator: 80 Bytes
}
}
You can also try ruby-prof and memory_profiler. It is better if you test and experiment different versions of your code so you can measure the memory usage and performance of each version. This will allow you to check if your optimization really worked or not. You usually use these tools in development / testing mode and turn them off in production.

Resources