C++ Exception Handling and Its Effect on Real World Performance, Part I

May 14, 2015 | 3:14 pm PDT

Perspective: Fox

There seems to be a lot of debate among C++ developers regarding the performance implications of using exception handling. I've personally used exception handling on many of my projects. I've also worked on many projects where exception handling was expressly prohibited.

Before I go off on a rant, or bore 80% of the readers with facts, test results, and coding examples, let me cut right to the chase:

The exception handling framework used in most, if not all, contemporary C++ compilers absolutely will not noticeably, adversely affect the runtime performance of any "real world" system! At the same time, the inappropriate, naive use of exception handling absolutely will noticeably, adversely affect the runtime performance of any "real world" system.

"But wait", you say. "How is that possible?" Glad you asked.

In short, it is not the exception handling framework, nor the actual throwing and catching of exceptions, that may harm performance, but rather the effect that potentially throwing an exception has on the ability of the compiler's optimizer to effectively apply optimizations. There are certain optimizations which, in the presence of a throw expression, the compiler may not be able to apply.

Now, let's get into the proof of the matter.

The Proof

First, let's consider some completely unrealistic, amazingly oversimplified scenarios which, as good, professional software engineers (or developers, if you prefer that term) we are unlikely to ever encounter in the real world. Why? To clarify the point, before we move on to some more real world cases.

In the code example below, we have a very simple main() function, which contains a loop that executes 10,000,000,000 (that's ten billion, for those who are intimidated by too many zeros) times. Each time through the loop, we call a very simple function which takes 3 parameters and does nothing of significance. The wcout calls are just to print the results to the terminal window (I'm a big fan of developing internationalized software so I tend to use wide characters whenever possible).


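#include <cstdlib>		//	strtoul(), rand()
#include <ctime>		//	time(), clock()
#include <iostream>		//	wcout, endl

using namespace std;

unsigned long SimpleFunc ( int nArg1, int nArg2, long nArg3 );

int main ( int argc, char** argv )
{
	//	Taking the iteration count from the command line keeps it from being a
	//		compile-time constant, which prevents the optimizer from unrolling the loop.
	//		The default of ten billion assumes a 64-bit unsigned long.
	unsigned long nIters = 10000000000;
	if ( argc == 2 )
		nIters = strtoul ( argv[1], NULL, 10 );

	unsigned long g_nSum = 0;
	//	Generate pseudo-random values to use as input to the function, so the compiler
	//		doesn't optimize them away.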
	int nFoo1 = rand ( ) % 5;
	int nFoo2 = rand ( ) % 7;
	long nFoo3 = rand ( ) % 13;

	time_t tmStart = time ( NULL );
	clock_t nStart = clock ( );
	//	Execute the loop 10 billion times
	for ( unsigned long i = 0; i < nIters; ++i ) {
		unsigned long nFuncResult  = SimpleFunc ( nFoo1, nFoo2, nFoo3 );
		if ( nFuncResult == 0 ) {
			wcout << L"   *** SimpleFunc() failed!!!" << endl;
			continue;
		}
		g_nSum += nFuncResult;
	}
	clock_t nEnd = clock ( );
	time_t tmEnd = time ( NULL );

	wcout << L"    g_nSum = " << g_nSum << endl <<
			 L"    Total processor time:   " << nEnd - nStart << L" ticks." << endl <<
			 L"    Total wall clock time:  " << tmEnd - tmStart << L" seconds." << endl;

	return 0;
}

unsigned long SimpleFunc ( int nArg1, int nArg2, long nArg3 )
{
	int nReturn = 0;
	if ( nArg1 == nArg2 && nArg2 == nArg3 ) {
		return nReturn;
	}
	nReturn += nArg1 + nArg2 + nArg3;

	return nReturn;
}

Very simple. Very straightforward. If, by chance, the three parameters to SimpleFunc() all have the same value, it's treated as an error and SimpleFunc() returns 0. In other words, we're using the return value of 0 to indicate an error. You can download a copy of the complete source file here. We'll call this "test case 1".

The above code, with compiler optimizations disabled (using the -O0 flag with g++), takes a CPU time of 56,225,981 ticks, or 56 seconds of wall clock time to execute on my machine.
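The tick counts reported by clock() can be turned into seconds by dividing by CLOCKS_PER_SEC. Here is a minimal sketch of that conversion; the helper name TicksToSeconds is hypothetical and is not part of the downloadable test sources:

```cpp
#include <ctime>

//	Hypothetical helper (not in the original test sources): converts the
//	difference between two clock() readings to seconds of processor time.
double TicksToSeconds ( std::clock_t nStart, std::clock_t nEnd )
{
	return static_cast<double> ( nEnd - nStart ) / CLOCKS_PER_SEC;
}
```

On XSI-conformant POSIX systems CLOCKS_PER_SEC is 1,000,000, which is consistent with 56,225,981 ticks lining up with roughly 56 seconds of wall clock time here.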

Now, let's add some exception handling and see what happens. We're not actually going to throw any exceptions yet - just add a try/catch block around the call to SimpleFunc(). The relevant, changed code looks like this:


int main ( int argc, char** argv )
{
	...
	for ( unsigned long i = 0; i < nIters; ++i ) {
		unsigned long nFuncResult;
		try {
			nFuncResult = SimpleFunc ( nFoo1, nFoo2, nFoo3 );
		} catch ( exception& exc ) {
			wcout << L"   *** SimpleFunc() threw an exception!!!" << endl;
			//	Skip the rest of the iteration; nFuncResult was never assigned.
			continue;
		}
		if ( nFuncResult == 0 ) {
			wcout << L"   *** SimpleFunc() failed!!!" << endl;
			continue;
		}
		g_nSum += nFuncResult;
	}
	...
}

We'll call this "test case 2". You can download the entire source file here.

After adding the try/catch block, the code took 55,914,383 ticks of CPU time, or 55 seconds of wall time. Slightly less than the first test case, but close enough to consider it equivalent.

Now, let's change SimpleFunc() to throw an exception rather than returning 0 on error. We'll get rid of the conditional that checks the return value of SimpleFunc(), since we're now using exception handling. The relevant changes are thus:

int main ( int argc, char** argv )
{
	...
	for ( unsigned long i = 0; i < nIters; ++i ) {
		unsigned long nFuncResult;
		try {
			nFuncResult = SimpleFunc ( nFoo1, nFoo2, nFoo3 );
		} catch ( exception& exc ) {
			wcout << L"   *** SimpleFunc() threw an exception!!!" << endl;
			continue;
		}
		g_nSum += nFuncResult;
	}
	...
}

unsigned long SimpleFunc ( int nArg1, int nArg2, long nArg3 )
	throw ( exception )
{
	int nReturn = 0;
	if ( nArg1 == nArg2 && nArg2 == nArg3 ) {
		throw exception ( );
	}
	nReturn += nArg1 + nArg2 + nArg3;

	return nReturn;
}

We'll call this "test case 3". The complete source file can be downloaded here.

After replacing the early return statement in SimpleFunc() with a throw expression, the runtime on my machine is as follows: CPU time - 53,509,521 ticks; wall time - 53 seconds. Say WHAT!?!? The runtime actually decreased!?!? Indubitably!

But What About Compiler Optimizations?

This is all swell, but so far we've only been building and running without any compiler optimizations. Obviously, as excellent, diligent, software engineers we'd never put our code into production without appropriate compiler optimizations. So, let's run the same tests, but with optimization level "2" (the -O2 flag for g++).

The following table shows the results:

Test Case	Level 2 Optimizations (CPU ticks)	No Optimizations (CPU ticks)
No Exception Handling	9,171,492	56,225,981
Try/Catch Block Only	9,184,395	55,914,383
Throw Expression in Function	21,384,638	53,509,521

Aha! So now the truth comes out! Now we see that C++ exception handling adds 133% to the runtime (21,384,638 ticks versus 9,171,492, or about 2.33 times as long)! Well now, slow down there, professor. Let's not get ahead of ourselves. What we're seeing here is NOT the result of exception handling overhead, but rather the presence of a throw expression within a function preventing the compiler from performing certain optimizations. Exactly which optimizations is probably beyond the scope of this discussion (a future post, perhaps), but function inlining is a likely candidate, and return value optimization may be forfeited as well.
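The relative slowdown at -O2 can be worked out directly from the table. A minimal sketch of the arithmetic follows; the helper name PercentIncrease is hypothetical and exists only for illustration:

```cpp
//	Hypothetical helper: the percentage by which nMeasured exceeds nBaseline.
double PercentIncrease ( double nBaseline, double nMeasured )
{
	return ( nMeasured - nBaseline ) / nBaseline * 100.0;
}
```

With the -O2 figures, PercentIncrease ( 9171492.0, 21384638.0 ) comes to roughly 133, meaning the run with the throw expression takes about 2.33 times as long as the baseline.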

Saying that exception handling hurts performance is kind of like saying passing by value hurts performance. Sure, if you pass a large, complex data structure by value it will hurt performance, because it invokes the copy constructor, which may result in any number of unnecessary copy operations (depending on the complexity of the class); but passing an int or a char by value will, obviously, not affect performance. Likewise, if you have throw expressions within functions which the compiler would not be able to optimize anyway, then there will be no performance hit. What kinds of functions cannot be optimized? Well, there's a good chance that some optimizations will not be performed on virtual functions - because the compiler cannot know at compile time which implementation of the virtual function will actually be called (unless, perhaps, you explicitly cast the object to a specific type at the point where you call it - but even then, it's not guaranteed).
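The pass-by-value comparison can be made concrete by counting copies. This is only a sketch: BigRecord, SizeByValue() and SizeByRef() are hypothetical names invented for illustration, not code from the test cases.

```cpp
#include <cstddef>
#include <vector>

//	Hypothetical type for illustration: counts how often it is copied.
struct BigRecord
{
	static int s_nCopies;
	std::vector<int> vecData;

	explicit BigRecord ( std::size_t nSize ) : vecData ( nSize, 1 ) { }
	BigRecord ( const BigRecord& other ) : vecData ( other.vecData ) { ++s_nCopies; }
};
int BigRecord::s_nCopies = 0;

//	Pass by value: the copy constructor runs when an existing object is passed in.
std::size_t SizeByValue ( BigRecord rec ) { return rec.vecData.size ( ); }

//	Pass by const reference: no copy is made.
std::size_t SizeByRef ( const BigRecord& rec ) { return rec.vecData.size ( ); }
```

Passing an existing BigRecord to SizeByValue() bumps s_nCopies (and copies the whole vector), while SizeByRef() leaves it untouched. The same functions taking an int would show no meaningful difference.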

I have a bit more time before I have to run over to Starbucks for my daily cappuccino, so just for yuks and the sake of completeness, let's try running the same test cases but with level 3 optimizations (the -O3 flag for g++).

Here are the results:

Test Case	Level 3 Optimizations (CPU ticks)	Level 2 Optimizations (CPU ticks)	No Optimizations (CPU ticks)
No Exception Handling	3,114,487	9,171,492	56,225,981
Try/Catch Block Only	3,084,043	9,184,395	55,914,383
Throw Expression in Function	3,094,137	21,384,638	53,509,521

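Holy, freakin', smokes! At level 3 optimization, the runtime with exception handling is actually slightly lower than without exception handling (3,094,137 versus 3,114,487 ticks), which is close enough to call all three cases equivalent!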

What About Exception Specifications?

There is a lot of information on the Internet and in books suggesting - or even blatantly claiming - that including an exception specification in your function signature will harm performance. That is simply not true when the function does not throw an exception! But don't take my word for it - run the tests provided in this post; then remove the exception specification from SimpleFunc() and run them again. The results should be effectively identical.
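A note for anyone building these tests with a newer toolchain: the throw ( exception ) syntax on SimpleFunc() is a dynamic exception specification, which was deprecated in C++11 and removed in C++17, so the listings need an older language mode. The sketch below shows the C++11-and-later counterpart, noexcept. The function names are hypothetical, and it makes no claim about performance either way:

```cpp
#include <stdexcept>

//	Hypothetical variants of SimpleFunc(), for illustration only.
//	No specification: the function is treated as potentially throwing.
unsigned long MayThrowFunc ( int nArg1, int nArg2, long nArg3 )
{
	if ( nArg1 == nArg2 && nArg2 == nArg3 )
		throw std::runtime_error ( "arguments are equal" );
	return static_cast<unsigned long> ( nArg1 + nArg2 + nArg3 );
}

//	noexcept: a promise to the compiler that no exception will escape.
unsigned long NoThrowFunc ( int nArg1, int nArg2, long nArg3 ) noexcept
{
	return static_cast<unsigned long> ( nArg1 + nArg2 + nArg3 );
}

//	The noexcept operator reports, at compile time, what the compiler assumes.
static_assert ( !noexcept ( MayThrowFunc ( 1, 2, 3 ) ), "potentially throwing" );
static_assert ( noexcept ( NoThrowFunc ( 1, 2, 3 ) ), "declared non-throwing" );
```

Removing the specification from SimpleFunc() altogether, as suggested above, also lets the test cases compile in C++17 and later.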

Now, in the case where the function does actually throw an exception - that's a different story. But we're going to address that in the next post.

Conclusion

Regardless of what so many people in software development claim, exception handling in C++ just does not affect runtime performance. Not adversely, anyway.

Still, we need to be intelligent about how we implement it within our code. But that's a discussion for another post (because that's more of a design issue).
