cma_alloc fails to create memory chunk of 10M

I am working on a camera driver. Whenever I try to allocate around 10 MB with cma_alloc, the allocation fails, but allocations of 4-5 MB succeed. Is there a limit on how much memory cma_alloc can allocate? If so, how do I increase it?

At first glance, allocating 10 MB of contiguous memory already looks very suspicious. Why do you need that much?
You're on Android, which indicates an embedded platform, and that makes 10 MB of memory even more suspicious.
From the documentation I've found online, you also need to specify at boot time how much CMA memory the system needs. Did you specify more than 10 MB?

Related

My CUDA JIT cache stays persistently far below CUDA_CACHE_MAXSIZE

I have an OpenCL application which runs on CUDA v7.5.
The application has a large number of large kernels. I am setting CUDA_CACHE_MAXSIZE to the maximum possible value, 4294967296, i.e. 4 GB. However, the total size of the files stored in the cache directory never grows above ~307 MB. Cache entries do appear to be added and evicted (I see small changes in the total file size, and my application is definitely hitting the cache when querying for recent kernels). It behaves as if some cache size limit lower than CUDA_CACHE_MAXSIZE were being enforced, maybe by the OpenCL driver?
I would like to know what causes this, and whether it is possible for me to use the full cache size of 4 GB.
Sorry for taking so long to respond. I just found that this is a bug in libcuda. I will submit a fix shortly.
For now, a workaround is to set CUDA_CACHE_MAXSIZE to 2 GB - 1 (2147483647). Setting it to a value between 2 GB and 4 GB - 1 could end up with a really high cache size, and setting it to 4 GB should result in a cache size of 256 MB, which has been the default cache size since R334 (not 32 MB, as stated here).
I hope this workaround helps.
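The answer does not say what the bug is. The reported behaviour is, however, what one would expect if the value were held in a 32-bit integer somewhere; that is an assumption made here for illustration, not something the answer confirms. Under that assumption, a minimal sketch (the helper names are hypothetical) shows why 4294967296 and 2147483647 are the interesting values:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: the low 32 bits of a requested cache size.
// Unsigned narrowing is well defined in C++ (reduction modulo 2^32).
std::uint32_t low32(std::uint64_t requested) {
    return static_cast<std::uint32_t>(requested);
}

// Hypothetical helper: true if the value fits in a signed 32-bit integer.
bool fits_int32(std::uint64_t requested) {
    return requested <=
           static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max());
}
```

Here `low32(4294967296ULL)` is 0, which would be consistent with the driver falling back to its default size, and `fits_int32` is true for 2147483647 but false for 2147483648, which matches the range the answer warns about.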

What is the cost of mmaping on Mac OS X?

I have an algorithm where my life would be greatly simplified if I could reserve about 20 blocks of memory addresses of size 4GB. In practice, I never use more than 4GB, but I do not know which block will fill up in advance.
If I mmap 20 blocks of 4GB everything seems to work fine -- until I write to memory the OS does not seem to actually allocate anything.
Is there any reason I should not use mmap to allocate 80GB of memory and then only use a small amount of it? I assume there is some cost to setting up these buffers. Can I measure it?
The only drawback of mmap-ing 80GB at once is that a page table has to be created for the full 80GB. So if the pages are 4kB, this table could consume a lot of memory (unless huge pages are used).
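To put a rough number on that, here is a back-of-the-envelope sketch. It assumes 8-byte page-table entries (typical of 64-bit x86, but an assumption here), counts only last-level entries, and treats every page as actually mapped, so it is an upper bound rather than the cost of merely reserving the address range. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Bytes of last-level page-table entries needed to map `mapped_bytes`
// when every page is actually mapped. Upper-level tables are ignored.
// entry_bytes = 8 is an assumption (typical of 64-bit x86).
std::uint64_t leaf_page_table_bytes(std::uint64_t mapped_bytes,
                                    std::uint64_t page_bytes,
                                    std::uint64_t entry_bytes = 8) {
    // Round up so a partially used page still counts as one entry.
    const std::uint64_t pages = (mapped_bytes + page_bytes - 1) / page_bytes;
    return pages * entry_bytes;
}
```

For 80 GB with 4 KB pages, `leaf_page_table_bytes(80ULL << 30, 4096)` gives 167772160 bytes (160 MB); with 2 MB huge pages the same call gives 327680 bytes (320 KB).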
For sizes like that it is probably better to use one or more sliding mmap-ed views (i.e. create and remove them when needed).
On Windows, the memory used by mappings and page tables can be checked with RAMMap; I'm not sure what the equivalent is on Mac.
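The reservation described in the question can be sketched as follows. This is a minimal example, assuming a POSIX system where `MAP_ANON` is available (as on Mac OS X and Linux); `reserve_blocks` and `release_blocks` are hypothetical helper names, and `MAP_NORESERVE` is used only where the platform defines it.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <vector>

// Reserve `count` anonymous private mappings of `bytes` each.
// Returns the base addresses, or an empty vector (after unmapping any
// partial progress) if one of the mmap calls fails.
std::vector<void*> reserve_blocks(std::size_t count, std::size_t bytes) {
    int flags = MAP_PRIVATE | MAP_ANON;
#ifdef MAP_NORESERVE
    flags |= MAP_NORESERVE;  // ask the kernel not to reserve swap up front
#endif
    std::vector<void*> blocks;
    for (std::size_t i = 0; i < count; ++i) {
        void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE, flags, -1, 0);
        if (p == MAP_FAILED) {
            for (void* q : blocks) munmap(q, bytes);
            return {};
        }
        blocks.push_back(p);
    }
    return blocks;
}

// Release everything returned by reserve_blocks.
void release_blocks(const std::vector<void*>& blocks, std::size_t bytes) {
    for (void* p : blocks) munmap(p, bytes);
}
```

In a 64-bit build, `reserve_blocks(20, std::size_t{4} << 30)` corresponds to the 20 blocks of 4GB in the question. Timing that call, and separately timing the first write to each page, is one way to measure the setup cost the question asks about.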

Will Windows be still able to allocate memory when free space in physical memory is very low?

On a 32-bit Windows system, with the application being developed in Visual Studio:
Let's say lots of other applications are running on my machine and they have occupied almost all of the physical memory, so that only 1 MB is left free. If my application (which has not yet allocated any memory) tries to allocate, say, 2 MB, will the call be successful?
My guess: in theory, each Windows application has 2 GB of virtual memory available.
So I believe this call should succeed (regardless of how much physical memory is available). But I am not sure about this, which is why I'm asking here.
Windows gives a rock-hard guarantee that this will always work. A process can only allocate virtual memory when Windows can commit space in the paging file for the allocation. If necessary, it will grow the paging file to make the space available. If that fails, for example when the paging file would grow beyond the preset limit, then the allocation fails as well. Windows doesn't have the equivalent of the Linux "OOM killer"; it does not support over-committing, which may require an operating system to start randomly killing processes to find RAM.
Do note that the "always works" clause does have a sting. There is no guarantee on how long this will take. In very extreme circumstances the machine can start thrashing, where just about every memory access in the running processes causes a page fault. Code execution slows down to a crawl, and you can lose control, with the mouse pointer frozen, when Explorer or the mouse or video driver starts thrashing as well. You are well past the point of shopping for RAM when that happens. Windows applies quotas to processes to prevent them from hogging the machine, but if you have enough processes running then that doesn't necessarily avoid the problem.
Of course. It would be lousy design if memory had to be wasted now in order to be used later. Operating systems constantly re-purpose memory to its most advantageous use at any moment. They don't have to waste memory by keeping it free just so that it can be used later.
This is one of the benefits of virtual memory with a page file. Because the memory is virtual, the system can allocate more virtual memory than physical memory. Virtual memory that cannot fit in physical memory is pushed out to the page file.
So the fact that your system may be using all of the physical memory does not mean that your program will not be able to allocate memory. In the scenario that you describe, your 2MB memory allocation will succeed. If you then access that memory, the virtual memory will be paged in to physical memory and very likely some other pages (maybe in your process, maybe in another process) will be pushed out to the page file.
Well, it will succeed as long as there's some memory for it - apart from physical memory, there's also the page file.
However, once you reach the limit of both RAM and the page file, you're done for and that's when the out of memory situation really starts being fun.
Now, systems like Windows Vista will try to use all of your available RAM, pretty much for caching. That's a good thing, and when there's a request for memory from an application, the cache will be thrown away as needed.
As for virtual memory, you can request much more than you have available, regardless of your RAM or page file size. Only when you commit the memory does it actually need some backing - either RAM or the page file. On 64-bit, you can easily request terabytes of virtual memory - that doesn't mean you'll get it when you try to commit it, though :P
If your application is unable to allocate a physical memory (RAM) block to store information, the operating system takes over and 'pages' or stores sections that are in RAM on disk to free up physical memory so that your program is able to perform the allocation. This is done automatically and is completely invisible to your applications.
So, in your example, on a system that has 1MB RAM free, if your application tries to allocate memory, the operating system will page certain contents of physical memory to disk and free up RAM for your application. Your application will not crash in this case.
This is, obviously, much more complicated than that.
There are several ways to configure a page file on Windows (fixed size, variable size, and on which disk). If you run out of physical memory and out of hard drive space (because your page file has grown very large due to excessive 'paging'), or reach the limit of your paging file (if it is a static limit), then your applications will fail with an out-of-memory exception. With today's systems with large local storage, however, this is a rare event.
Be sure to read about paging for the full picture. Check out:
http://en.wikipedia.org/wiki/Paging
In certain cases, you will notice that you have sufficient free physical memory, say 100 MB, yet your program fails when it tries to allocate a 10 MB block to store a large object. This is caused by memory fragmentation (for a user-mode process, fragmentation of its virtual address space): although the total free memory is 100 MB, there is no single contiguous block of 10 MB that can be used to store your object. This will result in an exception that needs to be handled in your code. If you allocate large objects in your code, you may want to split the allocation into smaller blocks that are easier to satisfy, and then aggregate them in your code logic. For example, instead of having a single 10 MB vector, you can declare ten 1 MB vectors in an array and allocate memory for each one individually.
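The splitting idea in the last sentences can be sketched as a small wrapper. This is a minimal illustration, not a production container; `ChunkedBuffer` is a hypothetical name, and it assumes the chunk size is non-zero.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container: logically one array of `total` elements, stored as
// separately allocated chunks so no single allocation exceeds `chunk_elems`.
// `chunk_elems` must be non-zero.
template <typename T>
class ChunkedBuffer {
public:
    ChunkedBuffer(std::size_t total, std::size_t chunk_elems)
        : total_(total), chunk_elems_(chunk_elems) {
        for (std::size_t done = 0; done < total; done += chunk_elems) {
            const std::size_t n =
                (total - done < chunk_elems) ? total - done : chunk_elems;
            chunks_.emplace_back(n);  // one smaller allocation per chunk
        }
    }

    // Map a logical index onto (chunk, offset within chunk).
    T& operator[](std::size_t i) {
        return chunks_[i / chunk_elems_][i % chunk_elems_];
    }

    std::size_t size() const { return total_; }
    std::size_t chunk_count() const { return chunks_.size(); }

private:
    std::size_t total_;
    std::size_t chunk_elems_;
    std::vector<std::vector<T>> chunks_;
};
```

For the example in the answer, `ChunkedBuffer<char> buf(10u << 20, 1u << 20);` makes ten separate 1 MB allocations, and `buf[i]` is used like an index into one 10 MB array. The trade-off is an extra division per access and the loss of a single contiguous pointer to hand to other APIs.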

Xcode Memory Utilized

In Xcode, the Debug Navigator shows CPU usage and memory usage. When you click on Memory it says 'Memory Utilized'.
In my app I am using the latest RestKit (0.20.x), and every time I make a GET request using getObjectsAtPath (which doesn't even return a very large payload), the Memory Utilized increases by about 2 MB. So if I refresh my app 100 times, the Memory Utilized will have grown by over 200 MB.
However, when I run the Leaks tool, the Live Bytes remain fairly small and do not increase with each new request. Live bytes remains below 10mb the whole time.
So do I have a memory issue or not? Memory Utilized grows like crazy, but Live Bytes suggests everything is okay.
You can use Heapshot Analysis to evaluate the situation. If that shows no growth, then the memory consumption may be virtual memory which may (for example) reside in a cache/store which may support eviction and recreation -- so you should also identify growth in Virtual Memory regions.
If you keep making requests (e.g. try 200 refreshes), the memory will likely decrease at some point - or you will have memory warnings and ultimately allocation requests may fail. Determine how memory is reduced, if this is the case. Otherwise, you will need to determine where it is created and possibly referenced.
Also, test on a device in this case. The simulator is able to utilise much more memory than a device simply because it has more to work with. Memory constraints are not simulated.

Why does Redis memory usage not decrease after deleting half of the keys?

I use Redis to store data, but it consumes a lot of memory: its memory usage is up to 52.5%.
I deleted half of the keys in Redis, and the delete operation returned OK, but the memory usage didn't decrease.
What's the reason? Thanks in advance.
My operation code is as below:
// save data
m_pReply = (redisReply *)redisCommand(m_pCntxt, "set %b %b", mykey.data(), mykey.size(), &myval, sizeof(myval));
// del data
m_pReply = (redisReply *)redisCommand(m_pCntxt, "del %b", mykey.data(), mykey.size());
The redis info:
redis 127.0.0.1:6979> info
redis_version:2.4.8
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
gcc_version:4.4.6
process_id:28799
uptime_in_seconds:1289592
uptime_in_days:14
lru_clock:127925
used_cpu_sys:148455.30
used_cpu_user:38023.92
used_cpu_sys_children:23187.60
used_cpu_user_children:123989.72
connected_clients:22
connected_slaves:0
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:31903334872
used_memory_human:29.71G
used_memory_rss:34414981120
used_memory_peak:34015653264
used_memory_peak_human:31.68G
mem_fragmentation_ratio:1.08
mem_allocator:jemalloc-2.2.5
loading:0
aof_enabled:0
changes_since_last_save:177467
bgsave_in_progress:0
last_save_time:1343456339
bgrewriteaof_in_progress:0
total_connections_received:820
total_commands_processed:2412759064
expired_keys:0
evicted_keys:0
keyspace_hits:994257907
keyspace_misses:32760132
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:11672476
vm_enabled:0
role:slave
master_host:192.168.252.103
master_port:6479
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
db0:keys=66372158,expires=0
Please refer to the Memory allocation section at the following link:
http://redis.io/topics/memory-optimization
I quoted it here:
Redis will not always free up (return) memory to the OS when keys are
removed. This is not something special about Redis, but it is how most
malloc() implementations work. For example if you fill an instance
with 5GB worth of data, and then remove the equivalent of 2GB of data,
the Resident Set Size (also known as the RSS, which is the number of
memory pages consumed by the process) will probably still be around
5GB, even if Redis will claim that the user memory is around 3GB. This
happens because the underlying allocator can't easily release the
memory. For example often most of the removed keys were allocated in
the same pages as the other keys that still exist.
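The gap the quote describes can be read off the INFO output in the question by comparing used_memory_rss with used_memory. A minimal sketch of that comparison (the function name is hypothetical):

```cpp
#include <cassert>
#include <cstdint>

// Ratio of resident memory (used_memory_rss) to the memory Redis has
// accounted for (used_memory), both in bytes as printed by INFO.
double fragmentation_ratio(std::uint64_t used_memory_rss,
                           std::uint64_t used_memory) {
    return static_cast<double>(used_memory_rss) /
           static_cast<double>(used_memory);
}
```

With the values from the question, `fragmentation_ratio(34414981120ULL, 31903334872ULL)` is about 1.08, which agrees with the reported mem_fragmentation_ratio. After a large delete, used_memory drops while used_memory_rss may stay where it was, so this ratio rises.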
Since Redis 4.0.0 there's a command for this:
MEMORY PURGE
Should do the trick: https://redis.io/commands/memory-purge
Note however that command docs state:
This command is currently implemented only when using jemalloc as an allocator, and evaluates to a benign NOOP for all others.
And the README reminds us that:
Redis is compiled and linked against libc
malloc by default, with the exception of jemalloc being the default on Linux
systems. This default was picked because jemalloc has proven to have fewer
fragmentation problems than libc malloc.
A good starting point is to use the Redis CLI command: MEMORY DOCTOR.
It can give you very valuable information and point you to the potential issue.
some useful links:
MEMORY DOCTOR command docs
What is defragmentation and what are the Redis defragmentation configs
example:
Peak memory: In the past this instance used more than 150% the memory that is currently using. The allocator is normally not able to release memory after a peak, so you can expect to see a big fragmentation ratio, however this is actually harmless and is only due to the memory peak, and if the Redis instance Resident Set Size (RSS) is currently bigger than expected, the memory will be used as soon as you fill the Redis instance with more data. If the memory peak was only occasional and you want to try to reclaim memory, please try the MEMORY PURGE command, otherwise the only other option is to shutdown and restart the instance.
High total RSS: This instance has a memory fragmentation and RSS overhead greater than 1.4 (this means that the Resident Set Size of the Redis process is much larger than the sum of the logical allocations Redis performed). This problem is usually due either to a large peak memory (check if there is a peak memory entry above in the report) or may result from a workload that causes the allocator to fragment memory a lot. If the problem is a large peak memory, then there is no issue. Otherwise, make sure you are using the Jemalloc allocator and not the default libc malloc. Note: The currently used allocator is "jemalloc-5.1.0".
High allocator fragmentation: This instance has an allocator external fragmentation greater than 1.1. This problem is usually due either to a large peak memory (check if there is a peak memory entry above in the report) or may result from a workload that causes the allocator to fragment memory a lot. You can try enabling 'activedefrag' config option.
