Does C# 6 string interpolation use boxing like string.Format() does for its arguments?

I am asking this for performance's sake - using lots of boxing causes lots of heap allocations, which brings more GC collections, which sometimes causes apps to freeze for a moment, which annoys users.

All string interpolation does (at least in the common case) is call string.Format().
Right now, calling string.Format() allocates quite a lot and not just due to boxing (for example, string.Format("{0:s} - {1:B}: The value is: {2:C2}", DateTime.UtcNow, Guid.NewGuid(), 3.50m) makes 13 allocations, only 3 of those due to boxing), though there is talk about improving that in the future.
Though as usual when it comes to performance, you generally should not just blindly write unreadable code everywhere because the readable version has known performance issues. Instead, limit the unreadable efficient code to the parts of your code that actually need it.

Related

Will using Sorbet's ruby type checker have an impact on a ruby app's performance?

Maybe a newbie's question, but if you never ask you'll never know.
Can using Stripe's Sorbet (https://sorbet.org/) on a RoR app potentially improve the app's performance?
(performance meaning response times, not robustness / runtime error rate)
I did some reading on dynamically typed languages (particularly JavaScript in this case) and found out that if we keep sending some function (foo for example) the same type of objects, the engine does some optimizing work on that function, so that when it is invoked again with the same types, the interpreting work will be quicker.
I thought maybe the Ruby interpreter does similar work, which could potentially mean that type checking may increase interpreting speed
I thought maybe the Ruby interpreter does similar work, which could potentially mean that type checking may increase interpreting speed
It doesn't yet, but one could potentially build this one day.
The goal of Sorbet was to build a type system for people, as opposed to building a type system for computers (the compiler). It can introduce some performance overhead, but as Stripe runs it in production, we keep it in check. Internally, we get paged if the overhead is >7% of CPU time.
I did some reading on dynamically typed languages (particularly JavaScript in this case) and found out that if we keep sending some function (foo for example) the same type of objects, the engine does some optimizing work on that function, so that when it is invoked again with the same types, the interpreting work will be quicker.
Yes, this can be done. What you're describing is a common optimization in Just-In-Time (JIT) compilers. The technique you seem to be referring to uses run-time profiling, and is actually a common alternative technique that achieves this result in the absence of a type system. It's also worth noting that well-built JITs can do it more frequently than a type system, as a type system encodes what could happen, while profiling & JITs can optimize for what actually happens in practice.
That said, building a JIT is frequently much more work than building an online compiler; thus, depending on the amount of investment one wants to put into speeding up Ruby, either building a JIT or using types can prove better under different real-world constraints.
I thought maybe the Ruby interpreter does similar work, which could potentially mean that type checking may increase interpreting speed
Summarizing the previous paragraph: the Sorbet type system does not currently speed up Ruby, but it doesn't slow it down much either.
Type systems could indeed be used to speed up languages, but they aren't your only tool, with profiling & JIT compilation being the major competitor.
The optimizations you are talking about apply more to the JIT that is being worked on for the Ruby runtime.
In general, Sorbet aims at type safety by introducing type interfaces or method signatures. They enable static type checks that are applied before deploying the application, in order to get rid of "type errors".
Sorbet comes with a runtime component that can enforce type checks at runtime in your running application, but those are going to decrease the application's performance, as they wrap method calls in order to check for correct types: https://sorbet.org/docs/runtime#runtime-checked-sig-s
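For illustration, here is a minimal sketch of such a runtime-checked signature (the class and method names are made up; the sig syntax is sorbet-runtime's actual API):
require 'sorbet-runtime'

class Greeter
  extend T::Sig

  # the sig wraps #greet so arguments and the return value are
  # checked on every call - that wrapper is the runtime overhead
  sig { params(name: String).returns(String) }
  def greet(name)
    "Hello, #{name}!"
  end
end

Greeter.new.greet('Ada')  # => "Hello, Ada!"
Greeter.new.greet(42)     # raises TypeError at runtime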

Performance loss through obfuscation?

I’ve read that good obfuscation techniques do not merely replace method names with something obscure, but also, for instance, replace strings in the source code with byte arrays and add methods to convert those back to the original strings.
This might be one of those questions leading to opinion-based answers, but I’m going to ask it anyway: Is there any general notion how much performance loss an application would suffer from in case such an obfuscation method is applied? I’ve got in mind a software that is heavily leaning on a database, i.e., queries exist in the code, for instance, as C# strings or StringBuilder entities.
Yes, string obfuscation has a significant performance impact, at the micro-level. With obfuscation, instead of a direct memory lookup you have code that has to execute (every time), and it is usually somewhat complicated, so it is necessarily much worse at the micro-performance level.
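As a rough illustration of that technique in Ruby (the names and the query string are made up; the idea is language-agnostic):
# what the source originally contained:
#   QUERY = 'SELECT * FROM users'
# what the obfuscated build ships instead:
OBFUSCATED_QUERY = [83, 69, 76, 69, 67, 84, 32, 42, 32,
                    70, 82, 79, 77, 32, 117, 115, 101, 114, 115].freeze

def query
  OBFUSCATED_QUERY.pack('C*')  # extra work on every access, vs a direct lookup
end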
However, that cost usually doesn't matter; the time required for the database call (or showing the UI dialog, or sending the error to a log, or network traffic, or ...) is going to be orders of magnitude higher than the cost of converting the string. In most cases, the cost of the conversion is essentially invisible.
As with everything, careful testing is wise, but usually the costs are only "visible" if you are accessing obfuscated strings in a tight loop that is already CPU-performance-sensitive.

What's wrong with my logic here?

In Java they say don't concatenate Strings; instead you should make a StringBuffer and keep adding to that, and then when you're all done, use toString() to get a String object out of it.
Here's what I don't get. They say do this for performance reasons, because concatenating strings makes lots of temporary objects. But if the goal was performance, then you'd use a language like C/C++ or assembly.
The argument for using Java is that it is a lot cheaper to buy a faster processor than it is to pay a senior programmer to write fast, efficient code.
So on the one hand, you're supposed to let the hardware take care of the inefficiencies, but on the other hand, you're supposed to use StringBuffers to make Java more efficient.
While I see that you can do both - use Java and StringBuffers - my question is: where is the flaw in the logic that you either use a faster chip or you spend extra time writing more efficient software?
Developers should understand the performance implications of their coding choices.
It's not terribly difficult to write an algorithm that results in non-linear performance - polynomial, exponential or worse. If you don't understand to some extent how the language, compiler, and libraries support your algorithm, you can fall into a trap that no amount of processing power will dig you out of. Algorithms whose runtime or memory usage is exponential can quickly exceed the ability of any hardware to execute in a reasonable time.
Assuming that hardware can scale to a poorly designed algorithm/coding choice is a bad idea. Take for example a loop that concatenates 100,000 small strings together (say into an XML message). This is not an uncommon situation - but when implementing using individual string concatenations (rather than a StringBuffer) this will result in 99,999 intermediate strings of increasing size that the garbage collector has to dispose of. This can easily make the operation fail if there's not enough memory - or at best just take forever to run.
Now in the above example, some Java compilers can usually (but not always) rewrite the code to use a StringBuffer behind the scenes - but this is the exception, not the rule. In many situations the compiler simply cannot infer the intent of the developer - and it becomes the developer's responsibility to write efficient code.
One last comment - writing efficient code does not mean spending all your time looking for micro-optimizations. Premature optimization is the enemy of writing good code. However, you shouldn't confuse premature optimization with understanding the O() performance of an algorithm in terms of time/storage and making good choices about which algorithm or design to use in which situation.
As a developer you cannot ignore this level of knowledge and just assume that you can always throw more hardware at it.
The argument that you should use StringBuffer rather than concatenation is an old java cargo-cult myth. The Java compiler itself will convert a series of concatenations into a single StringBuffer call, making this "optimization" completely unnecessary in source code.
Having said that, there are legitimate reasons to optimize even if you're using a "slow" bytecode or interpreted language. You don't want to deal with the bugs, instability, and longer development cycle of C/C++, so you use a language with richer capabilities. (Built-in strings, whee!) But at the same time, you want your code to run as fast as possible with that language, so you avoid obviously inefficient constructs. In other words, just because you're giving up some speed by using Java doesn't mean you should forget about performance entirely.
The difference is that StringBuffer is not at all harder or more time-consuming to use than concatenating strings. The general principle is that if it's possible to gain efficiency without increasing development time/difficulty, it should be done: your principle only applies when that's not possible.
The language being slower isn't an excuse to use a much slower algorithm (and Java isn't that slow these days).
If we concatenate a 1-character string to an n-character string, we need to copy n+1 characters into the new string. If we do
String s = "";
for (int i = 0; i < N; ++i)
    s = s + "c";
then the running time will be O(N²).
By contrast, a string buffer maintains a mutable buffer, which reduces the running time to O(N).
You cannot double the CPU to reduce a quadratic algorithm into a linear one.
(Although the optimizer may have implicitly created a StringBuffer for you already.)
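The same trade-off can be sketched in Ruby, where a plain mutable String plays the role of StringBuffer (N is an arbitrary illustration size):
N = 100_000

# quadratic: each + allocates a brand-new, ever-longer string
s = ''
N.times { s = s + 'c' }

# linear: << appends to the same buffer in place, like StringBuffer#append
buf = ''
N.times { buf << 'c' }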
Java != inefficient code.
You do not buy a faster processor to avoid writing efficient code. A bad programmer will write bad code regardless of language. The argument that C/C++ is more efficient than Java is an old one that does not matter anymore.
In the real world, programming languages, operating systems and development tools are not selected by the people who will actually deal with them.
Some salesman from company A has lunch with your boss to sell his operating system ... and then some other salesman invites your boss to the strip club to sell his database engine ... and so on.
Then, and only then, they hire a bunch of programmers to put all that together. They want it nice, fast and cheap.
That's why you may end up programming high-end performance applications with Java on a mobile device, or nice 3D graphics on Windows with Python ...
So, you're right, but it doesn't matter. :)
You should always put optimizations where you can. You shouldn't be "lazy coding" just because you have a fast processor...
I don't really know how StringBuffer works, nor do I work with Java, but assuming that Java defines a string as char[], you're allocating a ton of dummy strings when doing str1+str2+str3+str4+str5, where you really only need to make a string of length str1.length+...+str5.length and copy everything ONCE.
However, a smart compiler would optimize and automatically use a StringBuffer.

Do many old ColdFusion Performance admonitions still apply in CFMX 8?

I have an old standards document that has gone through several iterations and has its roots back in the ColdFusion 5 days. It contains a number of admonitions, primarily for performance, that I'm not so sure are still valid.
Do any of these still apply in ColdFusion MX 8? Do they really make that much difference in performance?
Use compare() or compareNoCase() instead of "is not" when comparing strings
Don't use evaluate() unless there is no other way to write your code
Don't use iif()
Always use struct.key or struct[key] instead of structFind(struct,key)
Don't use incrementValue()
I agree with Tomalak's thoughts on premature optimization. Compare is not as readable as "eq."
That being said there is a great article on the Adobe Developer Center about ColdFusion Performance: http://www.adobe.com/devnet/coldfusion/articles/coldfusion_performance.html
Compare()/CompareNoCase(): comparing case-insensitively is more expensive in Java, too. I'd say this still holds true.
Don't use evaluate(): Absolutely - unless there's no way around it. Most of the time, there is.
Don't use Iif(): I can't say much about this one. I don't use it anyway because the whole DE() stuff that comes with it sucks so much.
struct.key over StructFind(struct,key): I'd suspect that internally both use the same Java method to get a struct item. StructFind() is just one more function call on the stack. I've never used it, since I have no idea what benefit it would bring. I guess it's around for backwards compatibility only.
IncrementValue(): I've never used that one. I mean, it's 16 characters and does not even increment the variable in place. Which would have been the only excuse for its existence.
Some of the concerns fall in the "premature optimization" corner, IMHO. Personal preference or coding style apart, I would only start to care about some of the subtleties in a heavy inner loop that bogs down the app.
For instance, if you do not need a case-insensitive string compare, it makes no sense using CompareNoCase(). But I'd say 99.9% of the time the actual performance difference is negligible. Sure you can write a loop that times 100000 iterations of different operations and you'd find they perform differently. But in real-world situations these academic differences rarely make any measurable impact.
ColdFusion MX 8 is several times faster than MX 7 by all accounts. When it came out, I read many opinions that simply upgrading for the performance boost, without changing a line of code, was well worth it... It was worth it. With the gains in processing power and memory availability, generally you can do a lot more with less-optimized code.
Does this mean we should stop caring and write whatever? No. Chances are where we take the most shortcuts, we'll have to grow the system the most there.
Finding that fine line between enough engineering and not over-engineering a solution is a fine balance. There's a quote by Knuth, I believe, that says "Premature optimization is the root of all evil".
For me, I try to base it on:
how much it will be used,
how expensive that will be across my expected user base,
how critical/central it is to everything,
how often I may be coming back to the code to extend it into other areas
The more these answers lean towards "probably" or "one way or another, I will", the more attention I pay to it. If it needs to be readable and a small performance hit results, that's the better way to go for sustainability of the code.
Otherwise, I let items fight for my attention while I solve and build things of real(er) value.
The single biggest favour we can do ourselves is to use a framework with any project, no matter how small, and do the small things right from the beginning.
That way there is no sense of dread in going back to work on a system that was originally meant to be a temporary hack but never got re-factored.

What are your strategies to keep the memory usage low?

Ruby is truly memory-hungry - but also worth every single bit.
What do you do to keep the memory usage low? Do you avoid big strings and use smaller arrays/hashes instead or is it no problem to concern about for you and let the garbage collector do the job?
Edit: I found a nice article about this topic here - old but still interesting.
I've found Phusion's Ruby Enterprise Edition (a fork of mainline Ruby with much-improved garbage collection) to make a dramatic difference in memory usage... Plus, they've made it extraordinarily easy to install (and to remove, if you find the need).
You can find out more and download it on their website.
I really don't think it matters all that much.
Making your code less readable in order to improve memory consumption is something you should only ever do if you need it. And by need, I mean have a specific case for the performance profile and specific metrics that indicate that any change will address the issue.
If you have an application where memory is going to be the limiting factor, then Ruby may not be the best choice. That said, I have found that my Rails apps generally consume about 40-60 MB of RAM per Mongrel instance. In the scheme of things, this isn't very much.
You might be able to run your application on the JVM with JRuby - the Ruby VM is currently not as advanced as the JVM for memory management and garbage collection. The 1.9 release is adding many improvements and there are alternative VM's under development as well.
Choose data structures that are efficient representations, scale well, and do what you need.
Use algorithms that work with efficient data structures rather than bloated but easier ones.
Look elsewhere. Ruby has a C bridge, and it's much easier to be memory-conscious in C than in Ruby.
Ruby developers are quite lucky since they don’t have to manage the memory themselves.
Be aware that Ruby allocates objects; for instance, something as simple as
100.times{ 'foo' }
allocates 100 string objects (strings are mutable, and each copy requires its own memory allocation).
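As a side note, assuming a newer Ruby (2.1+), freezing the literal lets the VM reuse a single shared string and avoids the per-iteration allocation:
100.times { 'foo'.freeze }  # one shared frozen string instead of 100 copies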
If you are using a library that allocates a lot of objects, make sure that no other alternatives are available and that your choice is worth paying the garbage-collector cost (you might not have a lot of requests/s, or might not care about a few dozen ms per request).
Creating a hash really allocates more than one object; for instance
{'joe' => 'male', 'jane' => 'female'}
doesn’t allocate 1 object but 7 (one hash, 4 strings + 2 key strings).
If you can, use symbol keys, as they won’t be garbage collected. However, because they won’t be garbage collected, you want to make sure not to use totally dynamic keys like converting the username to a symbol, otherwise you will ‘leak’ memory.
Example: somewhere in your app, you apply to_sym to a user’s name, like:
hash[current_user.name.to_sym] = something
When you have hundreds of users, that could be OK, but what happens if you have a million users? Here are the numbers:
ruby-1.9.2-head >
# Current memory usage: 6608K
# Now, add one million randomly generated short symbols
ruby-1.9.2-head > 1000000.times { (Time.now.to_f.to_s).to_sym }
# Current memory usage: 153M, even after a garbage collector run.
# Now, imagine if symbols are just 20x longer than that?
ruby-1.9.2-head > 1000000.times { (Time.now.to_f.to_s * 20).to_sym }
# Current memory usage: 501M
Be careful never to convert uncontrolled arguments to symbols, or check the arguments first; this can easily lead to a denial of service.
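A minimal sketch of that check-first approach (the constant and method names are made up):
# only convert input that belongs to a known, finite set of keys
ALLOWED_KEYS = %w[male female unknown].freeze

def safe_key(param)
  raise ArgumentError, "unexpected key: #{param}" unless ALLOWED_KEYS.include?(param)
  param.to_sym  # safe: at most ALLOWED_KEYS.size symbols are ever created
end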
Also remember that nesting loops more than three levels deep makes maintenance difficult; keeping nesting shallow is a readability rule of thumb rather than a memory optimization.
Here are some related links:
http://merbist.com
http://blog.monitis.com
When deploying a Rails/Rack webapp, use REE or some other copy-on-write friendly interpreter.
Tweak the garbage collector (see https://www.engineyard.com/blog/tuning-the-garbage-collector-with-ruby-1-9-2 for example)
Try to cut down the number of external libraries/gems you use since additional code uses memory.
If you have a part of your app that is really memory-intensive, then it's maybe worth rewriting it as a C extension, or complementing it by invoking other/faster/better-optimized programs (if you have to process vast amounts of text data, maybe you can replace that code with calls to grep, awk, sed etc.)
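For instance, a sketch of delegating a line count over a huge log to grep instead of reading the file into Ruby's heap (the file name is hypothetical):
# count matching lines without loading huge.log into Ruby
matches = IO.popen(['grep', '-c', 'ERROR', 'huge.log'], &:read).to_i
puts "#{matches} error lines"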
I am not a Ruby developer, but I think some techniques and methods are true of any language:
Use the minimum-size variable suitable for the job
Destroy and close variables and connections when not in use
However, if you have an object you will need to use many times, consider keeping it in scope
For any loops with manipulations of a big string, do the work on a smaller string and then append to the bigger string
Use decent (try/catch/finally) error handling to make sure objects and connections are closed
When dealing with data sets, only return the minimum necessary
Other than in extreme cases, memory usage isn't something to worry about. The time you spend trying to reduce memory usage would buy a LOT of gigabytes.
Take a look at Small Memory Software - Patterns for Systems with Limited Memory. You don't specify what sort of memory constraint, but I assume RAM. While not Ruby-specific, I think you'll find some useful ideas in this book - the patterns cover RAM, ROM and secondary storage, and are divided into major techniques of small data structures, memory allocation, compression, secondary storage, and small architecture.
The only thing we've ever had which has actually been worth worrying about is RMagick.
The solution is to make sure you're using RMagick version 2, and call Image#destroy! when you're done using your image
Avoid code like this:
str = ''
veryLargeArray.each do |foo|
  str += foo
  # but str << foo is fine (read update below)
end
which will create each intermediate string value as a String object and then remove its only reference on the next iteration. This junks up the memory with tons of increasingly long strings that have to be garbage collected.
Instead, use Array#join:
str = veryLargeArray.join('')
This is implemented in C very efficiently and doesn't incur the String creation overhead.
UPDATE: Jonas is right in the comment below. My warning holds for += but not <<.
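If you want to verify the difference yourself, a quick sketch using the stdlib Benchmark module (the array size is arbitrary):
require 'benchmark'

words = Array.new(10_000) { 'foo' }

Benchmark.bm(8) do |x|
  x.report('+=')   { s = ''; words.each { |w| s += w } }  # new String each step
  x.report('<<')   { s = ''; words.each { |w| s << w } }  # mutates in place
  x.report('join') { words.join }                         # single pass in C
end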
I'm pretty new at Ruby, but so far I haven't found it necessary to do anything special in this regard (that is, beyond what I just tend to do as a programmer generally). Maybe this is because memory is cheaper than the time it would take to seriously optimize for it (my Ruby code runs on machines with 4-12 GB of RAM). It might also be because the jobs I'm using it for are not long-running (i.e. it's going to depend on your application).
I'm using Python, but I guess the strategies are similar.
I try to use small functions/methods, so that local variables get automatically garbage collected when you return to the caller.
In larger functions/methods I explicitly delete large temporary objects (like lists) when they are no longer needed. Closing resources as early as possible might help too.
Something to keep in mind is the life cycle of your objects. If your objects are not passed around that much, the garbage collector will eventually kick in and free them up. However, if you keep referencing them, it may take some cycles for the garbage collector to free them up. This is particularly true in Ruby 1.8, where the garbage collector uses a poor implementation of the mark-and-sweep technique.
You may run into this situation when you try to apply some "design patterns" like decorator that keep objects in memory for a long time. It may not be obvious when trying example in isolation, but in real world applications where thousands of objects are created at the same time the cost of memory growth will be significant.
When possible, use arrays instead of other data structures. Try not to use floats when integers will do.
Be careful when using gem/library methods. They may not be memory optimized. For example, the Ruby PG::Result class has a method 'values' which is not optimized. It will use a lot of extra memory. I have yet to report this.
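As a hedged illustration (the connection details and handler are made up), iterating rows one at a time avoids materializing everything the way values does:
require 'pg'

conn = PG.connect(dbname: 'mydb')  # hypothetical database
result = conn.exec('SELECT name FROM users')

# rows = result.values  # builds one big array-of-arrays up front
result.each_row do |row|           # yields one array per row instead
  process(row)                     # hypothetical handler
end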
Replacing the malloc(3) implementation with jemalloc will immediately decrease your memory consumption by up to 30%. I've created the 'jemalloc' gem to achieve this instantly.
'jemalloc' GEM: Inject jemalloc(3) into your Ruby app in 3 min
I try to keep arrays, lists & datasets as small as possible. The individual objects do not matter much, as creation and garbage collection are pretty fast in most modern languages.
In cases where you have to read some sort of huge dataset from the database, make sure to read it in a forward-only manner and process it in little bits instead of loading everything into memory first.
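With ActiveRecord in a Rails app, for example, batched iteration does exactly this (the model and handler names are hypothetical):
# loads 1000 records at a time instead of the whole table
User.find_each(batch_size: 1000) do |user|
  process(user)  # hypothetical handler
end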
Don't use a lot of symbols; they stay in memory until the process gets killed, because symbols never get garbage collected.
