You are hereHome / Development / Optimization, Templates And The Rest

Optimization, Templates And The Rest


By Gerd Riesselmann - Posted on 18 December 2004

Does low-level optimization really works? Do templates bring speed? Two articles by Raymond Chen and Christopher Baus started a discussion, that I'm about to join. I'm arguing that in most cases, you do not want to do low-level optimization, since the speed you gain isn't worth the effort, and you should rather concentrate on the design. At least if you are an application developer.

Raymond Chen illustrates that optimization is often counter-intuitive. Well, with the right example, you can prove anything, and Raymond's example seems rather bizarre to me. Maybe since I'm an application developer that couldn't care less about low-level details.

Now Christopher Baus raises the question up to a level I'm familiar with. He starts with shortly discussing C++ function inlining and optimization and after that quickly jumps to talk about template meta programming and its impact on performance. Christopher's says that

It is extremely difficult to optimize for modern processors and results can very even with in a product line; therefore most application developers should not spend their time trying to optimize the speed of a single operation such as a function call.

I agree with him, that it’s useless to do low level optimizations and spend your time thinking about inlining or not. However, I'd like to give another argument. Even if you can be sure that some optimization really speeds up things, it is still a waste of time in nearly all cases. Why? Because the amount of time your function is faster is way to small to be notable. Think of some inlined accessor and let's say you safe 10 CPU instructions (I'm too lazy to look up the actual value). On a 1.5 MHz processor, 1.5 Millions instructions are executed a second. Let's assume that one instruction takes one CPU cycle (indeed, AMD processors are able to execute even several instructions in one CPU cycle), these 10 instructions safe you 0.0666 milliseconds. You therefore need to call this accessor 15,000 times to make your application just one second faster. And you better do this in a loop.

From my experience as an application developer, bottlenecks result far more often from wrong design decisions. For example when using a linked list where a hash table would be more appropriate. If you use a database, defining your indexes correctly is something to worry. However, optimization comes last, and you better let a profiler tell you where to look.

Back to Christopher's article: He continues arguing that

Recently a new version of the inline argument has re-surfaced. Many proponents of "generic programming" in C++ claim that by binding functions at compile time the resulting executable is faster (less code to execute to make a function call, sound familiar?). I am not convinced. What you do get is BIGGER object code. Bigger binaries are more difficult to cache, and performance on modern processors is all about effectively using the cache.

This may be or may not be, I actually know too little about processors to judge this statement. Nevertheless, I'd like to introduce another argument here: Design. Again, I speak as an application developer, and things may look different if you are programming devices or tricky algorithms.

When you are doing applications, you should concentrate on the design rather then speed. There is a time to optimize and it is at the end, when you know where the bottlenecks are. Your design should be clear, flexible and understandable. It contains the logic of your application, and you will spend a lot of time changing and extending your code, because requirements will change and grow over time. You also will spend a lot of time compiling (I compile after each change, since compiling is the first test the code must pass), so you should consider compile time dependencies. Whether you use template meta programming (which is far more then just using templates here and there) or not, should be primary a design decision.

Christopher ends up with some ranther ranty comment about template meta programming in generel (and is second by Len Holgate):

Template programming has become a mental game for those who want to be C++ giants. It is a proving ground. C++ templates are not a meta-programming language. Those that use templates for advanced meta programming are abusing a side effect of a language construct. The fact that the syntax is so obtuse (because it was never intend to be used in this way) provides a way to prove mental prowess.

Well, maybe. I tend to believe that we never will see large and complex business applications build entirely on templates. However, template meta programming can be useful sometimes, and you can do neat little things. Think of Andrei Alexandrescu's Modern C++ Design, for example.

[...] ; inefficiencies they’ve introduced. I totally agree. As I already argued regarding C++ optimization, performance is rather a question of desig [...]

Gerd,

I know I quickly threw this off, but this is a topic that has been on my mind for a year. I wanted to stir up a bit of controversy. Isn't that what blogging is all about? I have a more complete article on this, but I have been having a difficult time getting it through the powers that be, so some of that frustration came out here. The revision process has sucked the voice out of my article.

I am not a fan of template based PDB. It does have applications, but I feel they are limited. You loose the interface specification for the policy. This is a problem with C++. It lacks concept checking. I personally think concept checking and specification is vital, and with out, I almost always avoid PDB with templates.

I've heard Alexandrescu speak a couple times on PDB, and he always brings up the performance issue. But there is no performance issue. And I argue that it can hurt performance.

Gerd,

I agree that template meta programming can be useful and the stuff in it's just that Alexandrescu’s book is interesting and thought provoking. BUT it's an idiom that many people are currently unfamiliar with and it's also an idiom that many people are keen to play around with, because of this I'd need some solid justification of why it's being used before I'd want to see it in a normal production system. Right now many uses of meta programming seem to fit into the "doing clever things just to be clever" category and I prefer to see production code that's as simple as it can be...

By the way I completely agree with you that the design is more important than optimisation. Design it, test it, profile it, optimise it - and you usually wont need the last one but if they design was good then it should be relatively easy to isolate the code that needs optimising and replace it with something faster - and by the time you do that you already have the tests that you wrote for the original version to make sure that your faster version still actually does the right thing.

Please excuse me for answering that late, but since it is the end of the year, a lot of things are to be finished.

Chris, I actually honor a good rant and yours even taught me two new words (obtuse and prowess), which alone made my day. I also think, that you are right about people using template programming just because it is en vogue.

I for example remember a colleague solving a quite simple task with templates and function pointers, leading to nearly unreadable code. Since the template functions had to be specialized for each type, there was no code reuse, and when I refactored it, I came up with five static functions each of which contained about 10 lines of code. Even OO would have been overkill here...

However, policy based design can be effective. An example: Our applications all use a core system that offers several kinds of containers (maps, vectors, lists), for several types (Instances, Links). They all do the same, but they all do it a little bit different. Using templates and PBD, I managed to refactor about 90% of the functionality into one class. About five different implementations of iterators were reduced to one template class (which consists of six really simple functions having an average of five lines of code) and two policy classes (that contain just two lines of code). Which actually is a good thing, since it makes the code more clear and easier to maintain.

Anyhow, extensive use of templates with lot's of parameters and template template parameters surely turns your code into a monster. And since it isn't even properly debuggable, it is a monster without a cage. I agree with Len here: Don't try this at home.