I've searched a lot in the past days about this, but I haven't found anything that answered it.
I have created a big memory pool. Now, let's say it's the first time I access the pool and I want to allocate an array of 5 elements from it. I give the array a start address inside the pool so I can work with it.
Now I'll run the array with a loop like the following:
for (i = 0; i < 10; ++i)
    array[i] = i;
With a normal allocation I would expect an exception to occur when i=5, but in my case that doesn't happen, because there is a lot of allocated memory after the start address I was given.
How can I prevent writes to (or reads from) addresses that I'm not supposed to touch? Is there a way?
Thanks in advance.
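One possible direction, as a minimal sketch only: since the pool itself cannot tell where one array ends, the allocation can carry its own length and every access can go through a checked helper. The names int_slice, slice_from_pool and slice_set are hypothetical, invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handle: a view into the pool that remembers its own length. */
typedef struct {
    int   *data;   /* start address handed out by the pool */
    size_t len;    /* number of elements the caller asked for */
} int_slice;

/* Hypothetical helper: carve `len` ints out of `pool`, starting at `offset`. */
static int_slice slice_from_pool(int *pool, size_t offset, size_t len)
{
    assert(pool != NULL);
    int_slice s = { pool + offset, len };
    return s;
}

/* Store `value` at index `i`; returns 0 on success, -1 if `i` is out of range. */
static int slice_set(int_slice *s, size_t i, int value)
{
    if (i >= s->len)
        return -1;          /* refuse to touch pool memory past the slice */
    s->data[i] = value;
    return 0;
}
```

With a 5-element slice, the loop above would get -1 back for i = 5 through 9 instead of silently overwriting whatever lives next in the pool.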
I am having this weird error
failed to allocate memory (NoMemoryError)
zlib(finalizer): the stream was freed prematurely.
When I try to do this, the first iteration goes well:
array.each_slice(40000) do |result|
  # writing this result array into the excel
end
But on the second iteration, it throws the error mentioned above. What's the problem here? Can someone help me?
The same problem is occurring in the first iteration if the count is 50000.
Well, you are probably running out of memory as the error says.
Maybe try iterating in batches of 1000 instead of 10_000.
You are better off creating a file and writing to it rather than building a giant in-memory Array. Note that how well writing to a file performs depends on your hardware; your production server may not have the same hardware as your MacBook, so you might want to check that.
Also, it's not a bad idea to call GC.start every 5000 records or so if this is a background job.
Finally, check the memory allocations using something like a memory profiler.
I am currently trying to get a better understanding of cache optimization and have read various articles on the subject. I believe I am getting a decent understanding of it, but I need help with clarification of my understanding.
Let's say I have two large arrays that I am going to iterate over. Both are contiguous arrays and I am going to iterate over them in order. Spatially, the arrays are not close to each other in memory. The operation performed on the arrays is a for loop that simply adds the element at each index of the second array to the corresponding element of the first.
int[] someArray;
int[] someOtherArray; // assume both arrays are initialized with some values and 100 elements
for (int i = 0; i < someArray.Length; i++)
{
    someArray[i] += someOtherArray[i];
}
In this example, when we get someArray[i] we initially get a cache miss, then when we load someOtherArray[i] we get another cache miss. But am I correct in assuming that for the next 8 iterations or so we don't get an L1 cache miss, because 64 bytes of each array should now be loaded into the cache?
And in general, is this how the cache will work? Any time I access some random spot in memory, it gets loaded into the cache along with the rest of the address range covered by the processor's cache line size, and as long as I use those same lines frequently and contiguously I will not have to travel to main memory?
For example, say I have a 32KB L1 cache, and I do the operation above. 200 4-byte ints is 800 bytes, so all of them should now be in the L1 cache. If I do another operation with them, this time multiplying the value and assigning it to someOtherArray[i], I will never once have to load values from main memory, assuming I do the operation immediately after.
Answering each question separately:
Yes. You're correct, this is how the cache works. That's why spatial locality speeds things up (in the context of caches).
Yes.
(Your example) Depends. In this case, probably.
If you go through the whole array, it gives some time for the first cache lines to be evicted. If your program were the only one running on the computer, then the answer would be yes, but you have to consider that there are other programs running on the machine at the same time, and the OS scheduler can switch between them whenever it wants.
A possible scenario is that your process gets switched out for another one during execution, that process fills up the cache, then when your process gets control again, the cache could no longer have your data. This is unlikely with the size of the program and array that you're talking about, but just goes to show that you can't make guarantees about the cache as long as there are other programs running on the same computer.
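To put rough numbers on the question, here is a small sketch of the arithmetic. It assumes a 64-byte cache line (common, but not universal) and arrays that start on a line boundary; LINE_BYTES, elems_per_line and lines_touched are names made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cache-line size in bytes; 64 is common but not universal. */
#define LINE_BYTES 64u

/* How many elements of `elem_size` bytes fit in one cache line. */
static size_t elems_per_line(size_t elem_size)
{
    assert(elem_size > 0 && elem_size <= LINE_BYTES);
    return LINE_BYTES / elem_size;
}

/* Number of distinct cache lines touched by a sequential pass over `count`
 * elements, assuming the array starts on a line boundary. */
static size_t lines_touched(size_t count, size_t elem_size)
{
    size_t bytes = count * elem_size;
    return (bytes + LINE_BYTES - 1) / LINE_BYTES;   /* round up */
}
```

Under these assumptions a line holds 16 four-byte ints, so each array would see one miss followed by up to 15 hits, and the two 100-element arrays together (800 bytes) span about 14 lines, a small fraction of a 32KB L1 cache.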
Suppose we want to maintain a pool of memory in a device driver or module. How can that pool be created and made available to multiple processes (let's say 4) accessing this driver/module?
Assume 1 MB of memory in the pool.
When I was reading LDD I came across the mempool_create() API, but then there's also kmalloc().
If someone has done such a thing kindly share the knowledge.
My initial approach is to allocate using kmalloc() and then maintain start and end pointers in the private object for each process that opens the module.
EDIT: Thanks @kikigood for spending some time on this. So based on your comments, I do something like this.
Let's say I allocated 1MB of mempool during init.
And I want to restrict the count of processes to 4, so I keep a count.
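Increment this count at every open():
static atomic_t count = ATOMIC_INIT(0);

/* my_open is a placeholder name for the driver's open handler */
static int my_open(struct inode *inode, struct file *filp)
{
    /* allow at most 4 processes to hold the device open */
    if (atomic_inc_return(&count) > 4) {
        atomic_dec(&count);
        return -ENOMEM;
    }
    return 0;
}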
Also, I maintain a buffer within my private device structure per process.
How do I assign some memory from the pool to this buffer?
To create a memory pool, you can either use the kernel's slab allocator or maintain the pool yourself, as you did with kmalloc(). With the kernel's slab allocator, you can use one of these:
kmem_cache_create()
mempool_create()
I think the key problem with maintaining a pool yourself is the risk of memory fragmentation: you can quickly run out of memory, or be unable to allocate a large block even though plenty of memory is free.
Another benefit of using the kernel's slab allocator is that you can easily monitor memory usage by looking at the entries in /proc/slabinfo.
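To illustrate the "maintain it yourself" option without the fragmentation risk, here is a userspace sketch that splits the 1 MB pool into equal fixed-size chunks, one per process. It is not kernel code and uses no kernel API: the names chunk_get and chunk_put, the 256 KB chunk size, and the static array standing in for the kmalloc()'d region are all assumptions, and a real driver would also need locking around the bookkeeping.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_BYTES  (1024u * 1024u)   /* 1 MB, as in the question */
#define CHUNK_BYTES (256u * 1024u)    /* assumed: one equal chunk per process */
#define NUM_CHUNKS  (POOL_BYTES / CHUNK_BYTES)

static unsigned char pool[POOL_BYTES];   /* stands in for the driver's pool */
static int chunk_in_use[NUM_CHUNKS];     /* 1 while a chunk is handed out */

/* Hand out one free chunk, or NULL when all chunks are taken. */
static void *chunk_get(void)
{
    for (size_t i = 0; i < NUM_CHUNKS; ++i) {
        if (!chunk_in_use[i]) {
            chunk_in_use[i] = 1;
            return pool + i * CHUNK_BYTES;
        }
    }
    return NULL;
}

/* Return a chunk previously obtained from chunk_get(). */
static void chunk_put(void *p)
{
    size_t off = (size_t)((unsigned char *)p - pool);
    assert(off < POOL_BYTES && off % CHUNK_BYTES == 0);
    chunk_in_use[off / CHUNK_BYTES] = 0;
}
```

In a driver, the open handler could store the result of something like chunk_get() in the per-process private structure and the release handler could give it back; because every chunk has the same size, freeing and reallocating never fragments the pool.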
So I have autoreleased/released every object that I alloc/init/copy, and the Allocations instrument seems to show minimal leaks. However, my program's memory usage does not stop increasing. I have included a screenshot of my Allocations run (I have run Allocations for longer, but it remains relatively constant; it certainly does not compare to the amount the program gains when actually running). When running my program, its memory use will double over the course of about 10 hours. It increases drastically in the first 5 minutes (2-3MB), and just keeps on going. I don't understand why allocations would remain constant when running in Instruments but my program would just keep gaining memory when actually run.
Since I can't post images yet...here is the link to the screenshot:
allocations run
UPDATE: Here are some screenshots from my memory heapshot analysis...I am not allocating these objects explicitly and don't really know where they are coming from. Almost all of them have their source with something similar to the second screenshot details on the right (lots of HTTPs and URLs in the call tree). Anybody know where these are coming from? I know I've read about some NSURLConnection leaks but I have tried all of the cache clearing that those suggest to no avail. Thanks for all the help so far!
memory heap analysis 1
memory heap analysis 2
Try heapshots.
Are you running with different environment variables when you run in different environments?
For example, you could have NSZombie enabled when you launch your app (causing all your objects to not be freed) but not when you run in Instruments?
Just for a sanity check: how are you determining memory usage? You say that memory usage keeps going up, but not when you run in Instruments. Given that Instruments is a reliable way of measuring memory usage (the most reliable way?), this sounds a little odd, a bit like saying memory keeps going up except when I try to measure it.
If you are using autoreleased objects (like [NSString stringWithFormat:]) in a loop the pool won't be drained until that loop is exited and the program is allowed to complete the main event loop, at which point the autorelease pool is drained and a new one is instantiated.
If you have code like this, the solution is to instantiate a new autorelease pool before entering your loop, then drain it periodically during the loop (and reinstantiate the autorelease pool after you drain it).
You can use Instruments to find out where your allocations originate. When running Instruments in Allocations mode:
Move your mouse over the Category field in the Object Summary
Click on the Grey Circle with an arrow which appears next to the field name
This will bring up a list of locations where the objects in that category have been instantiated from, and the stats of how many allocations each have made.
If your memory usage is rising (but not leaking) you should be able to see where that memory was created, and then track down why it is hanging around.
This tool is also very useful in reducing your memory profile for mobile applications.
This is a homework question from a compiler design course. I just need an explanation of certain parts of the question.
It is claimed that returning blocks to the standard memory manager
would require much administration. Why
is it not enough to have a single
counter per block, which holds the
number of busy records for that block,
and to return the block when it
reaches 0?
The context it refers to is linked lists.
The answer from the answer sheet states:
How do you find this counter starting
from the pointer to the record and how
do you get the pointer by which to
return the block?
I come from a C background. Could someone explain to me what:
block is?
the counter does?
a busy record is?
A reference to documents that walk through what happens during this counting phase would also be welcome. Diagrams would be helpful.
Thanks.
I think it may help if I change some terms, to better explain what I am guessing is going on.
Suppose a page of memory is 8k in size. This is the minimum size that is allocated by the memory manager.
You have 10 requests of 100 bytes each, so 1000 bytes are in use in various places on the page.
The counter would be 10, but how do you know which parts have actually been freed and which are still allocated? The 10 requests may not be contiguous, since other requests in between may already have been freed.
So, we have 10 busy records.
Now, you would need to come up with your own answers to the question in the answer sheet, but, hopefully by looking at an example it may be simpler.
A "block" most likely is a basic block.
I'm not familiar with the term "busy record"; most likely, it refers to some data flow analysis result for variables (i.e. variables might be considered "busy"). Several definitions seem plausible:
a variable may be considered "busy" if it holds a value (i.e. has been "written") which also will be "read" (meaning that you can't eliminate the variable easily)
a variable may be considered "busy" if it is used "often" (more often than other variables), meaning that you should try to allocate it to a register.
However, you should really find out how the term was defined in your course.
Then, the counter would count, per basic block, the number of variables that are busy. Why that counter may become 0 after some processing is unclear - most likely, "busy" has yet another meaning in your course.
block is? The manager divides the memory space into blocks. One or more blocks make up a memory area that the user can access contiguously. If more memory is required, the manager adds extra block(s) to that memory area, always trying to give the user contiguous blocks.
the counter does? A specific block may be used by different users, that is, the memory area is shared by multiple users.
a busy record is? The counter value stored in the "counter" above.
for example:
struct block {
    struct block *next;
    long counter; /* the busy record */
};
EDIT: changing "area" to "user"
struct user {
    struct block *head;
    ...
};
EDIT: answer the question "why is a counter not enough for a block?"
More information is needed when moving a block from a "free block list" to an "allocated block list" or vice versa, e.g. an ordering used to locate a position in a list quickly. This is only a guess on my part, though.
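As a concrete illustration of the answer sheet's point, here is a sketch of one way the counter could be found from a record pointer: allocate every block aligned to its own size and keep the counter in a header at the start of the block, so masking the low bits of any record address yields the block. This is an assumption made for the example, not the scheme from the course; BLOCK_BYTES, the struct layout and the function names are hypothetical, and record alignment is ignored for brevity.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCK_BYTES 4096u   /* assumed block size; must be a power of two */

struct block {
    long   busy;       /* number of records currently handed out */
    size_t next_off;   /* offset of the next unused byte in this block */
};

/* Get a block from the standard memory manager, aligned to its own size. */
static struct block *block_new(void)
{
    struct block *b = aligned_alloc(BLOCK_BYTES, BLOCK_BYTES);
    if (b) {
        b->busy = 0;
        b->next_off = sizeof *b;   /* records start right after the header */
    }
    return b;
}

/* Carve one record of `size` bytes out of `b`; freed space is never reused. */
static void *record_alloc(struct block *b, size_t size)
{
    if (b->next_off + size > BLOCK_BYTES)
        return NULL;
    void *r = (unsigned char *)b + b->next_off;
    b->next_off += size;
    b->busy++;
    return r;
}

/* Recover the owning block from a record pointer by masking the low bits. */
static struct block *block_of(void *record)
{
    return (struct block *)((uintptr_t)record & ~(uintptr_t)(BLOCK_BYTES - 1));
}

/* Release a record; returns 1 if the whole block went back to free(). */
static int record_free(void *record)
{
    struct block *b = block_of(record);
    assert(b->busy > 0);
    if (--b->busy == 0) {
        free(b);
        return 1;
    }
    return 0;
}
```

The extra machinery (size-aligned blocks and a per-block header) is the kind of administration the question alludes to: a bare counter is only usable once there is a way to get from a record to its block and from the block to the pointer that must be handed back.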