I'm hosting IronPython in a c#-based WebService to be able to provide custom extension scripts. However, I'm finding that memory usage sharply increases when I do simple load testing by executing the webservice repeatedly in a loop.
IronPython-1.1 implemented IDisposable on its objects so that you can dispose of them when they are done. The new IronPython-2 engine based on the DLR has no such concept.
From what I understood, everytime you execute a script in the ScriptEngine a new assembly is injected in the appdomain and can't be unloaded.
Is there any way around this?
You could try creating a new AppDomain every time you run one of your IronPython scripts. Although assebmlies cannot be unloaded from memory you can unload an AppDomain and this will allow you to get the injected assembly out of memory.
You need to disable the optimized code generation:
var runtime = Python.CreateRuntime();
var engine = runtime.GetEngine("py");
PythonCompilerOptions pco = (PythonCompilerOptions)engine.GetCompilerOptions();
pco.Module &= ~ModuleOptions.Optimized;
// this shouldn't leak now
while(true) {
var code = engine.CreateScriptSourceFromString("1.0+2.0").Compile(pco);
code.Execute();
}
Turns out, after aspnet_wp goes to about 500mb, the garbage collector kicks in and cleans out the mess. The memory usage then drops to about 20mb and steadily starts increasing again during load testing.
So there's no memory 'leak' as such.
Related
Every time I upload a dicom file, the memory grow up. How can I clean the memory of the old ones
You can see various examples of how to free the memory on the examples code, example on the loader:
let loader = new LoadersVolume();
loader.free(); // Free memory
loader = null;
Another one:
let stackHelper = new HelpersStack();
stackHelper.dispose(); // Free memory
stackHelper = null;
I suggest to read the following document to know how garbage collection works on most browsers.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Memory_Management
The garbage collector deletes from the memory everything that is not referenced somewhere.
Having a high use of memory even if you don't you the object anymore means there still a reference to it somewhere. Look for a variable that can still access to your old data, including the 3D scene, the AMI stackHelper, the AMI loader...
We are using gsoap for C client and server webservices implemented for blackfin running Linux.
We don't use any malloc in the application. But we see memory usage climbs over time. We are using soap_end to do a cleanup at the end of the call. But when the calls are invoked repeatedly memory usage slowly increasing, may be because of memory fragmentation. This is also impacting performance of the system
What's the preferred usage of gsoap where soap_malloc is not used much. For eg: If we use static arrays etc will it help?
Thanks,
nkr
I would not recommend using static data, there is no need for that.
To debug memory use, compile all your sources files with -DDEBUG. When you run your application you will see three files:
SENT.log the messages sent
RECV.log the messages received
TEST.log the debug log
The TEST.log is useful to check on messaging issues.
The other valuable information produced at runtime are error messages related to memory leaks or heap memory that is damaged (e.g. overruns) in your code. It is unlikely these will happen in the gSOAP engine, but better check.
To ensure proper allocation and deallocation of managed data:
soap_destroy(soap);
soap_end(soap);
I am using the auto-generated functions to allocate managed data:
SomeClass *obj = soap_new_SomeClass(soap);
and sporadically use soap_malloc for raw managed allocation, or to allocate an array of pointers, or a C string:
const char *s = soap_malloc(soap, 100);
but better is to allocate strings with:
std::string *s = soap_new_std__string(soap);
and arrays can be allocated with the second parameter, e.g. an array of 10 strings:
std::string *s = soap_new_std__string(soap, 10);
All managed allocations are deleted with soap_destroy() followed by soap_end(). After that, you can start allocating again and delete again, etc.
If you want to preserve data that otherwise gets deleted with these calls, use:
soap_unlink(soap, obj);
Now obj can be removed later with delete obj. But be aware that all pointer members in obj that point to managed data have become invalid after soap_destroy() and soap_end(). So you may have to invoke soap_unlink() on these members or risk dangling pointers.
A new cool feature of gSOAP is to generate deep copy and delete function for any data structures automatically, which saves a HUGE amount of coding time:
SomeClass *otherobj = soap_dup_SomeClass(NULL, obj);
This duplicates obj to unmanaged heap space. This is a deep copy that checks for cycles in the object graph and removes such cycles to avoid deletion issues. You can also duplicate the whole (cyclic) managed object to another context by using soap instead of NULL for the first argument of soap_dup_SomeClass.
To deep delete:
soap_del_SomeClass(obj);
This deletes obj but also the data pointed to by its members, and so on.
To use the soap_dup_X and soap_del_X functions use soapcpp2 with options -Ec and -Ed, respectively.
In principle, static and stack-allocated data can be serialized just as well. But consider using the managed heap instead.
Hope this helps.
I know that Windows has an option to clear the page file when it shuts down.
Does Windows do anything special with the actual physical/virtual memory when it goes in or out of scope?
For instance, let's say I run application A, which writes a recognizable string to a variable in memory, and then I close the application. Then I run application B. It allocates a large chunk of memory, leaves the contents uninitialized, and searches it for the known string written by application A.
Is there ANY possibility that application B will pick up the string written by application A? Or does Windows scrub the memory before making it available?
Windows does "scrub" the freed memory returned by a process before allocating it to other processes. There is a kernel thread specifically for this task alone.
The zero page thread runs at the lowest priority and is responsible for zeroing out free pages before moving them to the zeroed page list[1].
Rather than worrying about retaining sensitive data in the paging file, you should be worried about continuing to retain it in memory (after use) in the first place. Clearing the page-file on shutdown is not the default behavior. Also a system crash dump will contain any sensitive info that you may have in "plain-text" in RAM.
Windows does NOT "scrub" the memory as long as it is allocated to a process (obviously). Rather it is left to the program(mer) to do so. For this very purpose one can use the SecureZeroMemory() function.
This function is defined as the RtlSecureZeroMemory() function ( see WinBase.h). The implementation of RtlSecureZeroMemory() is provided inline and can be used on any version of Windows ( see WinNT.h)
Use this function instead of ZeroMemory() when you want to ensure that your data will be overwritten promptly, as some C++ compilers can optimize a call to ZeroMemory() by removing it entirely.
WCHAR szPassword[MAX_PATH];
/* Obtain the password */
if (GetPasswordFromUser(szPassword, MAX_PATH))
{
UsePassword(szPassword);
}
/* Before continuing, clear the password from memory */
SecureZeroMemory(szPassword, sizeof(szPassword));
Don't forget to read this interesting article by Raymond Chen.
The bottleneck of my current project is heap allocation... profiling stated about 50% of the time one critical thread spends with/in the new operator.
The application cannot use stack memory here and needs to allocate a lot of one central job structure—a custom job/buffer implementation: small and short-lived but variable in size. The object are itself heap memory std::shared_ptr/std::weak_ptr objects and carry a classic C-Array (char*) payload.
Depending on the runtime configuration and workload in different parts 300k-500k object might get created and are in use at the same time (but this should usually not happen). Since its a x64 application memory fragmentation isn't that big a deal (but it might get when also targeted at x86).
To increase speed and packet throughput and as well be save to memory fragmentation in the future I was thinking about using some memory management pool which lead me to boost::pool.
Almost all examples use fixed size object... but I'm unsure how to deal with a variable lengthed payload? A simplified object like this could be created using a boost::pool but I'm unsure what to do with the payload? Is it usable with a boost:pool at all?
class job {
public:
static std::shared_ptr<job> newObj();
private:
delegate_t call;
args_t * args;
unsigned char * payload;
size_t payload_size;
}
Usually the objects are destroyed when all references to the shared_ptr run out of scope and I wouldn't want to change the shared-ptr back to a c-ptr. A deferred destruction of the objects should also work to increase performance and from what I read should work better with a boost:pool. I haven't found if the pool supports an interaction with the smart_ptr? The alternative but quirky way would be to save a reference to the shared_ptr on creation together with the pool and release them in blocks.
Does anyone have experiences with the two? boost:pool usage with variable sized objects and smart pointer interaction?
Thank you!
Consider a complex, memory hungry, multi threaded application running within a 32bit address space on windows XP.
Certain operations require n large buffers of fixed size, where only one buffer needs to be accessed at a time.
The application uses a pattern where some address space the size of one buffer is reserved early and is used to contain the currently needed buffer.
This follows the sequence:
(initial run) VirtualAlloc -> VirtualFree -> MapViewOfFileEx
(buffer changes) UnMapViewOfFile -> MapViewOfFileEx
Here the pointer to the buffer location is provided by the call to VirtualAlloc and then that same location is used on each call to MapViewOfFileEx.
The problem is that windows does not (as far as I know) provide any handshake type operation for passing the memory space between the different users.
Therefore there is a small opportunity (at each -> in my above sequence) where the memory is not locked and another thread can jump in and perform an allocation within the buffer.
The next call to MapViewOfFileEx is broken and the system can no longer guarantee that there will be a big enough space in the address space for a buffer.
Obviously refactoring to use smaller buffers reduces the rate of failures to reallocate space.
Some use of HeapLock has had some success but this still has issues - something still manages to steal some memory from within the address space.
(We tried Calling GetProcessHeaps then using HeapLock to lock all of the heaps)
What I'd like to know is there anyway to lock a specific block of address space that is compatible with MapViewOfFileEx?
Edit: I should add that ultimately this code lives in a library that gets called by an application outside of my control
You could brute force it; suspend every thread in the process that isn't the one performing the mapping, Unmap/Remap, unsuspend the suspended threads. It ain't elegant, but it's the only way I can think of off-hand to provide the kind of mutual exclusion you need.
Have you looked at creating your own private heap via HeapCreate? You could set the heap to your desired buffer size. The only remaining problem is then how to get MapViewOfFileto use your private heap instead of the default heap.
I'd assume that MapViewOfFile internally calls GetProcessHeap to get the default heap and then it requests a contiguous block of memory. You can surround the call to MapViewOfFile with a detour, i.e., you rewire the GetProcessHeap call by overwriting the method in memory effectively inserting a jump to your own code which can return your private heap.
Microsoft has published the Detour Library that I'm not directly familiar with however. I know that detouring is surprisingly common. Security software, virus scanners etc all use such frameworks. It's not pretty, but may work:
HANDLE g_hndPrivateHeap;
HANDLE WINAPI GetProcessHeapImpl() {
return g_hndPrivateHeap;
}
struct SDetourGetProcessHeap { // object for exception safety
SDetourGetProcessHeap() {
// put detour in place
}
~SDetourGetProcessHeap() {
// remove detour again
}
};
void MapFile() {
g_hndPrivateHeap = HeapCreate( ... );
{
SDetourGetProcessHeap d;
MapViewOfFile(...);
}
}
These may also help:
How to replace WinAPI functions calls in the MS VC++ project with my own implementation (name and parameters set are the same)?
How can I hook Windows functions in C/C++?
http://research.microsoft.com/pubs/68568/huntusenixnt99.pdf
Imagine if I came to you with a piece of code like this:
void *foo;
foo = malloc(n);
if (foo)
free(foo);
foo = malloc(n);
Then I came to you and said, help! foo does not have the same address on the second allocation!
I'd be crazy, right?
It seems to me like you've already demonstrated clear knowledge of why this doesn't work. There's a reason that the documention for any API that takes an explicit address to map into lets you know that the address is just a suggestion, and it can't be guaranteed. This also goes for mmap() on POSIX.
I would suggest you write the program in such a way that a change in address doesn't matter. That is, don't store too many pointers to quantities inside the buffer, or if you do, patch them up after reallocation. Similar to the way you'd treat a buffer that you were going to pass into realloc().
Even the documentation for MapViewOfFileEx() explicitly suggests this:
While it is possible to specify an address that is safe now (not used by the operating system), there is no guarantee that the address will remain safe over time. Therefore, it is better to let the operating system choose the address. In this case, you would not store pointers in the memory mapped file, you would store offsets from the base of the file mapping so that the mapping can be used at any address.
Update from your comments
In that case, I suppose you could:
Not map into contiguous blocks. Perhaps you could map in chunks and write some intermediate function to decide which to read from/write to?
Try porting to 64 bit.
As the earlier post suggests, you can suspend every thread in the process while you change the memory mappings. You can use SuspendThread()/ResumeThread() for that. This has the disadvantage that your code has to know about all the other threads and hold thread handles for them.
An alternative is to use the Windows debug API to suspend all threads. If a process has a debugger attached, then every time the process faults, Windows will suspend all of the process's threads until the debugger handles the fault and resumes the process.
Also see this question which is very similar, but phrased differently:
Replacing memory mappings atomically on Windows