Does low-level optimization really work? Do templates bring speed? Two articles by Raymond Chen and Christopher Baus started a discussion that I’m about to join. I argue that in most cases you do not want to do low-level optimization, since the speed you gain isn’t worth the effort, and you should concentrate on the design instead. At least if you are an application developer.
Raymond Chen illustrates that optimization is often counter-intuitive. Well, with the right example you can prove anything, and Raymond’s example seems rather bizarre to me. Maybe that’s because I’m an application developer who couldn’t care less about low-level details.
Now Christopher Baus raises the question to a level I’m familiar with. He starts by briefly discussing C++ function inlining and optimization, and then quickly moves on to template meta programming and its impact on performance. Christopher says that
It is extremely difficult to optimize for modern processors and results can vary even with in a product line; therefore most application developers should not spend their time trying to optimize the speed of a single operation such as a function call.
I agree with him that it’s useless to do low-level optimizations and spend your time thinking about whether to inline or not. However, I’d like to give another argument. Even if you can be sure that some optimization really speeds things up, it is still a waste of time in nearly all cases. Why? Because the amount of time your function saves is way too small to be noticeable. Think of some inlined accessor and let’s say you save 10 CPU instructions (I’m too lazy to look up the actual value). A 1.5 GHz processor runs 1.5 billion cycles a second. If we assume that one instruction takes one CPU cycle (in fact, AMD processors are able to execute several instructions in one cycle), these 10 instructions save you about 6.7 nanoseconds. You therefore need to call this accessor 150 million times to make your application just one second faster. And you had better be doing this in a loop.
From my experience as an application developer, bottlenecks result far more often from wrong design decisions, for example using a linked list where a hash table would be more appropriate. If you use a database, defining your indexes correctly is something to worry about. In any case, optimization comes last, and you had better let a profiler tell you where to look.
Back to Christopher’s article: He continues arguing that
Recently a new version of the inline argument has re-surfaced. Many proponents of "generic programming" in C++ claim that by binding functions at compile time the resulting executable is faster (less code to execute to make a function call, sound familiar?). I am not convinced. What you do get is BIGGER object code. Bigger binaries are more difficult to cache, and performance on modern processors is all about effectively using the cache.
This may or may not be true; I actually know too little about processors to judge this statement. Nevertheless, I’d like to introduce another argument here: design. Again, I speak as an application developer, and things may look different if you are programming devices or tricky algorithms.
When you are doing applications, you should concentrate on the design rather than speed. There is a time to optimize and it is at the end, when you know where the bottlenecks are. Your design should be clear, flexible and understandable. It contains the logic of your application, and you will spend a lot of time changing and extending your code, because requirements will change and grow over time. You will also spend a lot of time compiling (I compile after each change, since compiling is the first test the code must pass), so you should consider compile-time dependencies. Whether or not you use template meta programming (which is far more than just using templates here and there) should be primarily a design decision.
Christopher ends with a rather ranty comment about template meta programming in general (and is seconded by Len Holgate):
Template programming has become a mental game for those who want to be C++ giants. It is a proving ground. C++ templates are not a meta-programming language. Those that use templates for advanced meta programming are abusing a side effect of a language construct. The fact that the syntax is so obtuse (because it was never intend to be used in this way) provides a way to prove mental prowess.
Well, maybe. I tend to believe that we will never see large and complex business applications built entirely on templates. However, template meta programming can be useful sometimes, and you can do neat little things with it. Think of Andrei Alexandrescu’s Modern C++ Design, for example.