C++/CLI: Preallocating memory for string handles - interop

I'm passing a map of strings from a native C++ class to C# using C++/CLI. The native code uses an STL map. In C++/CLI I convert each STL string to a managed String and insert it into a Dictionary^ using String^ str = gcnew String(umngd.c_str()).
Apart from the need to iterate over the map (I wonder whether there is a built-in way to do that), my problem is that this piece of code is very slow, probably because of the many discrete gcnew memory allocations. My question is: how do I preallocate all the memory needed and then insert the values into this preallocated memory?
Thank you.

gcnew creates an instance of a managed type on the garbage collected heap. The .NET CLR already preallocates space for the heap and manages its size, and it's pretty smart about it.
You cannot preallocate managed objects. If you want a million managed string objects, you'll need a million gcnew's. On my laptop, this takes a few hundred milliseconds. Is this too slow?
Test your code. If it's actually too slow, maybe you can use a different approach. There's a bit of discussion of alternatives here.

Related

Can a Swift object have arbitrary inline storage in the instance?

For example, an immutable CFString can store the length and the character data in the same block of memory. And, more generally, there is NSAllocateObject(), which lets you specify extra bytes to be allocated after the object’s ivars. The amount of storage is determined by the particular instance rather than being fixed by the class. This reduces memory use (one allocation instead of two) and improves locality of reference. Is there a way to do this with Swift?
A rather late reply. 😄 NSAllocateObject() is now deprecated for some reason. However, NSAllocateObject() is really a wrapper around class_createInstance, which is not deprecated. So, in principle, one could use this to allocate extra bytes for an object instance.
I can't see why this wouldn't work in Swift. But accessing the extra storage would be messy because you'd have to start fooling around with unsafe pointers and the like. Moreover, if you're not the author of the original class, then you risk conflicting with Apple's ivars, especially in cases where you might be dealing with a class cluster which could potentially have a number of different instance sizes, according to the specific concrete implementation.
I think a safer approach would be to make use of objc_setAssociatedObject and objc_getAssociatedObject, which are accessible in Swift. E.g. Is there a way to set associated objects in Swift?

What is the Go language garbage collection approach compared to others?

I do not know much about the Go programming language, but I have seen several claims that Go has latency-free garbage collection and that it is much better than other garbage collectors (like the JVM garbage collector). I have developed applications for the JVM, and I know that the JVM garbage collector is not latency-free (especially with large memory usage).
I was wondering: what is the difference between the garbage collection approach in Go and the others that makes it latency-free?
Thanks in advance.
Edit:
@All: I have rewritten this question entirely; please vote to reopen it if you find it constructive.
Go does not have latency-free garbage collection. If you can point out where those claims are, I'd like to try to correct them.
One advantage that we believe Go has over Java is that it gives you more control over memory layout. For example, a simple 2D graphics package might define:
type Rect struct {
    Min Point
    Max Point
}
type Point struct {
    X int
    Y int
}
In Go, a Rect is just four integers contiguous in memory. You can still pass &r.Max to a function expecting a *Point; that's just a pointer into the middle of the Rect variable r.
In Java, the equivalent expression would be to make Rect and Point classes, in which case the Min and Max fields in Rect would be pointers to separately allocated objects. This requires more allocated objects, taking up more memory, and giving the garbage collector more to track and more to do. On the other hand, it does avoid ever needing to create a pointer to the middle of an object.
Compared to Java, then, Go gives you the programmer more control over memory layout, and you can use that control to reduce the load on the garbage collector. That can be very important in programs with large amounts of data. Control over memory layout may also be important for extracting performance from the hardware due to cache effects and such, but that's tangential to the original question.
The collector in the current Go distributions is reasonable but by no means state of the art. We have plans to spend more effort improving it over the next year or two. To be clear, Go's garbage collector is certainly not as good as modern Java garbage collectors, but we believe it is easier in Go to write programs that don't need as much garbage collection to begin with, so the net effect can still be that garbage collection is less of an issue in a Go program than in an equivalent Java program.

Java code ported to Objective-C is very slow

I want to illustrate a concrete example to understand whether there are best (and worst) practices when Java code is rewritten in Objective-C.
I've ported the Java implementation of org.apache.wicket.util.diff.myers to Objective-C on OS X Snow Leopard (Xcode 4), but the result runs very slowly compared to the Java version.
The method with the worst performance is buildPath. It mainly does:
sparse array access (the diagonal variable; this array is allocated inside the method and isn't returned)
random array access (the orig and rev variables)
allocation of PathNode and its subclasses (an object with three properties, only one of which is an element used internally by the array)
string comparison
Cocoa has no collection class that works easily with sparse arrays, so I allocated an array with malloc. This was dramatically faster than the first version, which was based on NSDictionary and allocated a zillion NSNumber objects to use as keys.
The PathNode allocation is done with the normal syntax [[MyClass alloc] init]. The nodes aren't autoreleased because they are added to an NSMutableArray (but they are released immediately after being added to the array).
Random access to the arrays is done with [NSArray objectAtIndex:index]. I think (but I may be wrong) that moving this to a C-style array wouldn't speed things up much.
Do you have any ideas for improving performance, or for finding where the bottlenecks are?
According to Instruments, 74% of the time is spent on allocation. How can I improve allocation?
EDIT: I've pushed my actual implementation to GitHub. Obviously it is an alpha version, not ready for production, and it doesn't use any efficient Objective-C constructs.
You're off to an excellent start. You've profiled the code, isolated the actual bottleneck, and are now focused on how to address it.
The first question is which allocation is costly? Obviously you should focus on that one first.
There are several efficient ways to deal with sparse arrays. First, look at NSPointerArray, which is designed to hold NULL values. It does not promise to be efficient for sparse arrays, but @bbum (who knows such things) suggests it is.
Next, look at NSMapTable, which is certainly efficient for sparse collections (it's a dictionary), and supports non-object keys (i.e. you don't need to create an NSNumber).
Finally, if allocation really is your problem, there are various tricks to work around it. The most common is to reuse objects rather than destroying one and creating another. This is how UITableViewCell works (and NSCell in a different way).
Finally, if you switch to Core Foundation objects, you can create your own specialized memory allocator, but that really is a last resort.
Note that 10.6 supports ARC (without zeroing weak references). ARC dramatically improves performance around a lot of common memory management patterns. For example, the very common pattern of "retain+autorelease+return" is highly optimized under ARC. ("retain" doesn't exist in the language under ARC, but it does still exist in the compiler, and ARC is much faster than doing it by hand.) I highly recommend switching to ARC in any code you can.
You can use the NSPointerArray class as a replacement for your sparse array. NSPointerArray allows null elements.
If you post the code that's generating the bulk of your allocations, we might be able to help you more.

Overhead of memory allocator

I've been creating a temporary object stack, mainly for heap-based STL structures that only have temporary lifetimes, but also for any other temporary, dynamically sized allocation. A single stack serves all types, storing them in an unrolled linked list.
I've come a cropper with alignment. I can get the alignment with std::alignment_of<T>, but this isn't really great, because I need the alignment of the next type I want to allocate. Right now, I've just arbitrarily sized each object at a multiple of 16, which as far as I know, is the maximal alignment for any x86 or x64 type. But now, I'm having two pointers of memory overhead per object, as well as the cost of allocating them in my vector, plus the cost of making every size round up to a multiple of 16.
On the plus side, construction and destruction is fast and reliable.
How does this compare to regular operator new/delete? And, what kind of test suites can I run? I'm pretty pleased with my current progress and don't want to find out later that it's bugged in some nasty subtle fashion, so any advice on testing the operations would be nice.
This doesn't really answer your question, but Boost has just recently added a memory pool library in the most recent version.
It may not be exactly what you want, but there is a thorough treatment of alignment which might spark an idea? If the docs are not enough, there is always the source code.

What are your strategies to keep the memory usage low?

Ruby is truly memory-hungry - but also worth every single bit.
What do you do to keep the memory usage low? Do you avoid big strings and use smaller arrays/hashes instead, or is it not a concern for you, so that you just let the garbage collector do the job?
Edit: I found a nice article about this topic here - old but still interesting.
I've found Phusion's Ruby Enterprise Edition (a fork of mainline Ruby with much-improved garbage collection) to make a dramatic difference in memory usage... Plus, they've made it extraordinarily easy to install (and to remove, if you find the need).
You can find out more and download it on their website.
I really don't think it matters all that much.
Making your code less readable in order to improve memory consumption is something you should only ever do if you need it. And by need, I mean have a specific case for the performance profile and specific metrics that indicate that any change will address the issue.
If you have an application where memory is going to be the limiting factor, then Ruby may not be the best choice. That said, I have found that my Rails apps generally consume about 40-60 MB of RAM per Mongrel instance. In the scheme of things, this isn't very much.
You might be able to run your application on the JVM with JRuby; the Ruby VM is currently not as advanced as the JVM for memory management and garbage collection. The 1.9 release is adding many improvements, and there are alternative VMs under development as well.
Choose data structures that are efficient representations, scale well, and do what you need.
Use algorithms that work with efficient data structures rather than bloated but easier ones.
Look elsewhere. Ruby has a C bridge, and it's much easier to be memory-conscious in C than in Ruby.
Ruby developers are quite lucky since they don’t have to manage the memory themselves.
Be aware that ruby allocates objects, for instance something as simple as
100.times{ 'foo' }
allocates 100 string objects (strings are mutable and each version requires its own memory allocation).
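A minimal sketch of the point above. It uses String.new rather than a bare literal (an assumption made here so the result does not depend on frozen-string-literal settings) and keeps every object alive so that identities can be compared. The variable names are illustrative only.

```ruby
# Build 100 strings with identical contents and keep them all referenced.
strings = Array.new(100) { String.new('foo') }
# Do the same with a symbol, which is a single shared object.
symbols = Array.new(100) { :foo }

# Count distinct objects by identity, not by value.
distinct_strings = strings.map(&:object_id).uniq.size
distinct_symbols = symbols.map(&:object_id).uniq.size

puts "distinct String objects: #{distinct_strings}"
puts "distinct Symbol objects: #{distinct_symbols}"
```

Equal contents do not imply a shared object for strings, which is why a loop that builds strings is also a loop that feeds the garbage collector.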
Make sure that if you are using a library that allocates a lot of objects, no better alternatives are available and your choice is worth the garbage collector cost. (You might not have many requests per second, or might not care about a few dozen ms per request.)
Creating a hash object really allocates more than an object, for instance
{'joe' => 'male', 'jane' => 'female'}
doesn’t allocate 1 object but 7. (one hash, 4 strings + 2 key strings)
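The "key strings" in that count come from Hash copying unfrozen String keys. A small sketch, with illustrative variable names, that shows the copy directly instead of counting allocations (exact counts vary by Ruby version):

```ruby
# An unfrozen String used as a key is duplicated and frozen by Hash#[]=.
key = String.new('joe')
by_string = {}
by_string[key] = 'male'
stored_key = by_string.keys.first

string_key_copied = !stored_key.equal?(key) # the hash holds a different object
stored_key_frozen = stored_key.frozen?

# A Symbol key is stored as-is; no extra object is created for it.
sym = :joe
by_symbol = {}
by_symbol[sym] = 'male'
symbol_key_shared = by_symbol.keys.first.equal?(sym)

puts "string key copied: #{string_key_copied}, frozen: #{stored_key_frozen}"
puts "symbol key shared: #{symbol_key_shared}"
```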
If you can, use symbol keys, as they won’t be garbage collected. However, because they won’t be garbage collected, make sure not to use totally dynamic keys, such as a username converted to a symbol; otherwise you will ‘leak’ memory.
Example: Somewhere in your app, you call to_sym on a user’s name, like this:
hash[current_user.name.to_sym] = something
When you have hundreds of users, that could be OK, but what happens if you have one million users? Here are the numbers:
ruby-1.9.2-head >
# Current memory usage : 6608K
# Now, add one million randomly generated short symbols
ruby-1.9.2-head > 1000000.times { (Time.now.to_f.to_s).to_sym }
# Current memory usage : 153M, even after a Garbage collector run.
# Now, imagine if symbols are just 20x longer than that ?
ruby-1.9.2-head > 1000000.times { (Time.now.to_f.to_s * 20).to_sym }
# Current memory usage : 501M
Never convert uncontrolled arguments to symbols without checking them first; doing so can easily lead to a denial of service.
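One way to do that check is an allow-list, sketched below. ALLOWED_KEYS and safe_symbol are hypothetical names invented for this example, not part of any library. Note also that Ruby 2.2 and later can garbage collect dynamically created symbols, so the unbounded growth measured above applies mainly to older interpreters; validating input is still worthwhile.

```ruby
# Hypothetical allow-list of keys the application is prepared to accept.
ALLOWED_KEYS = %w[name email role].freeze

# Convert to a Symbol only when the input is a known key; otherwise return nil.
def safe_symbol(raw)
  key = raw.to_s
  ALLOWED_KEYS.include?(key) ? key.to_sym : nil
end

store = {}
%w[name email evil_payload_123].each do |param|
  sym = safe_symbol(param)
  store[sym] = true if sym # unknown input never becomes a symbol
end

puts store.keys.inspect
```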
Also avoid nesting loops more than three levels deep, because it makes maintenance difficult. Treat this as a readability rule of thumb: nesting depth by itself does not determine how much memory or time the code uses.
Here are some related links:
http://merbist.com
http://blog.monitis.com
When deploying a Rails/Rack webapp, use REE or some other copy-on-write friendly interpreter.
Tweak the garbage collector (see https://www.engineyard.com/blog/tuning-the-garbage-collector-with-ruby-1-9-2 for example)
Try to cut down the number of external libraries/gems you use since additional code uses memory.
If you have a part of your app that is really memory-intensive then it's maybe worth rewriting it in a C extension or completing it by invoking other/faster/better optimized programs (if you have to process vast amounts of text data, maybe you can replace that code with calls to grep, awk, sed etc.)
I am not a ruby developer but I think some techniques and methods are true of any language:
Use the minimum size variable suitable for the job
Destroy and close variables and connections when not in use
However if you have an object you will need to use many times consider keeping it in scope
In any loop that manipulates a big string, do the work on a smaller string and then append it to the bigger string
Use decent (try catch finally) error handling to make sure objects and connections are closed
When dealing with data sets only return the minimum necessary
Other than in extreme cases memory usage isn't something to worry about. The time you spend trying to reduce memory usage will buy a LOT of gigabytes.
Take a look at Small Memory Software - Patterns for Systems with Limited Memory. You don't specify what sort of memory constraint, but I assume RAM. While not Ruby-specific, I think you'll find some useful ideas in this book - the patterns cover RAM, ROM and secondary storage, and are divided into major techniques of small data structures, memory allocation, compression, secondary storage, and small architecture.
The only thing we've ever had which has actually been worth worrying about is RMagick.
The solution is to make sure you're using RMagick version 2, and call Image#destroy! when you're done using your image
Avoid code like this:
str = ''
veryLargeArray.each do |foo|
str += foo
# but str << foo is fine (read update below)
end
which will create each intermediate string value as a String object and then remove its only reference on the next iteration. This junks up the memory with tons of increasingly long strings that have to be garbage collected.
Instead, use Array#join:
str = veryLargeArray.join('')
This is implemented in C very efficiently and doesn't incur the String creation overhead.
UPDATE: Jonas is right in the comment below. My warning holds for += but not <<.
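The difference between the three approaches can be checked by object identity. A small sketch with illustrative variable names; every intermediate string is kept in an array so the objects can be counted:

```ruby
parts = %w[alpha beta gamma]

# += builds a brand-new String on every iteration.
plus = ''
plus_objects = parts.map { |part| plus += part }
distinct_plus = plus_objects.map(&:object_id).uniq.size

# << appends in place, so the receiver is the same object every time.
shovel = String.new
shovel_objects = parts.map { |part| shovel << part }
distinct_shovel = shovel_objects.map(&:object_id).uniq.size

# Array#join produces the final string in a single step.
joined = parts.join

puts "+= created #{distinct_plus} objects, << reused #{distinct_shovel}"
puts joined
```

With a very large array, the += version leaves one discarded string per element, each longer than the last, which is the garbage described above.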
I'm pretty new at Ruby, but so far I haven't found it necessary to do anything special in this regard (that is, beyond what I just tend to do as a programmer generally). Maybe this is because memory is cheaper than the time it would take to seriously optimize for it (my Ruby code runs on machines with 4-12 GB of RAM). It might also be because the jobs I'm using it for are not long-running (i.e. it's going to depend on your application).
I'm using Python, but I guess the strategies are similar.
I try to use small functions/methods, so that local variables get automatically garbage collected when you return to the caller.
In larger functions/methods I explicitly delete large temporary objects (like lists) when they are no longer needed. Closing resources as early as possible might help too.
Something to keep in mind is the life cycle of your objects. If your objects are not passed around that much, the garbage collector will eventually kick in and free them up. However, if you keep referencing them, it may take some cycles for the garbage collector to free them up. This is particularly true in Ruby 1.8, where the garbage collector uses a poor implementation of the mark and sweep technique.
You may run into this situation when you try to apply some "design patterns", like decorator, that keep objects in memory for a long time. It may not be obvious when trying an example in isolation, but in real-world applications, where thousands of objects are created at the same time, the cost of memory growth will be significant.
When possible, use arrays instead of other data structures. Try not to use floats when integers will do.
Be careful when using gem/library methods. They may not be memory optimized. For example, the Ruby PG::Result class has a method 'values' which is not optimized. It will use a lot of extra memory. I have yet to report this.
Replacing the malloc(3) implementation with jemalloc can immediately decrease your memory consumption by up to 30%. I've created the 'jemalloc' gem to achieve this instantly.
'jemalloc' GEM: Inject jemalloc(3) into your Ruby app in 3 min
I try to keep arrays, lists, and datasets as small as possible. Individual objects do not matter much, as creation and garbage collection are pretty fast in most modern languages.
In cases where you have to read some sort of huge dataset from the database, make sure to read it in a forward-only manner and process it in little bits instead of loading everything into memory first.
Don't use a lot of symbols; they stay in memory until the process gets killed, because symbols never get garbage collected.
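A rough sketch of forward-only batch processing. The Enumerator below is a hypothetical stand-in for a database cursor (no real driver API is used), and BATCH_SIZE is an arbitrary choice:

```ruby
# Hypothetical forward-only source: rows are produced one at a time on demand.
rows = Enumerator.new do |yielder|
  1_000.times { |i| yielder.yield({ id: i, amount: i * 2 }) }
end

BATCH_SIZE = 100
total = 0
largest_batch = 0

# Only one batch of rows is held in memory at any moment.
rows.each_slice(BATCH_SIZE) do |batch|
  largest_batch = [largest_batch, batch.size].max
  total += batch.sum { |row| row[:amount] }
end

puts "total: #{total}, largest batch held: #{largest_batch}"
```

The running total is all that survives between batches, so peak memory depends on the batch size rather than on the size of the dataset.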

Resources