In my previous post we talked about the actual overhead and performance implications of using exception handling in C++. To prove the point that exception handling doesn't actually hurt performance, I used an example that executes a loop 10 billion times, with and without exception handling. What we found was, for the most part, the exception handling mechanism didn't, itself, affect performance. However, it did interfere with the compiler's ability to perform certain optimizations.
One important consideration was left out of that post: the effect of exception handling when exceptions actually get thrown. That is the topic of this post.
Here, we'll have a look at what the performance implications are of using exception handling and actually throwing exceptions. To prove our points we'll extend the sample code from the previous post.
In this post, you're going to see that, with our very unrealistic sample code, the performance of throwing exceptions can be horrendous. I am going to emphasize, repeatedly, that these numbers MUST be taken with a grain of salt, though. They are useful only for academic knowledge and have very little relevance to real world development. Why? Because any real world system (or at least any worth working on) is hopefully going to consist of a lot more than a giant loop which does nothing but throw an exception every n-th iteration. In real development you would only throw an exception when an error actually occurs, AND the throwing of that exception is going to be surrounded by a lot of other code which doesn't throw any exceptions. I'll provide a concrete example of this point later in the post.
We're not going to get into the details of how the compiler actually implements the throwing and catching of exceptions. Our focus here is on the runtime performance of throwing and catching exceptions. If you are interested in how the compiler implements exception handling, I recommend the following article: The true cost of zero cost exceptions.
The Proof
Today's test cases consist of two very simple implementations which are identical except for one thing: one uses the return value from a function to report an error to the caller, and the other throws an exception which the caller catches.
The code was compiled and the tests were run on my local workstation, with the following environment:
- Intel Core i3-3220, 3.3GHz (2 cores, 4 threads with Hyper-Threading)
- 16GB system memory
- OpenSUSE Linux 13.2 (kernel version 3.16.7)
- GNU g++ 4.8.3
During the test runs no other significant activity was occurring on the workstation which would interfere with the results, and total system memory usage was only about 33%, so available memory was not a factor.
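The excerpts below omit the timing code. Here's a minimal sketch of how the tick counts could have been gathered, assuming the standard `std::clock()` facility (on Linux with glibc, `CLOCKS_PER_SEC` is 1,000,000, which matches the "1 million ticks is approximately 1 second" figure quoted with the results):

```cpp
#include <cstdio>
#include <ctime>

int main ( )
{
    std::clock_t nStart = std::clock ( );

    // ... the 10 billion iteration test loop would run here ...

    std::clock_t nEnd = std::clock ( );

    // With glibc, CLOCKS_PER_SEC is 1,000,000, so each "tick" is one
    // microsecond of CPU time consumed by the process.
    std::printf ( "CPU clock ticks: %lu\n", (unsigned long) ( nEnd - nStart ) );

    return 0;
}
```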
Here are the relevant parts of our implementation without exception handling. We'll call this "test case 4". For the sake of brevity, I've elided some lines which are not significant to the discussion. If you'd like to see the entire source file you can grab it here (except-04.cpp).
```cpp
int main ( int argc, char** argv )
{
    ...
    unsigned long nSum = 0;

    // Generate pseudo random values to use as input to the function, so the
    // compiler doesn't optimize them away.
    int  nArg1 = rand ( ) % 5;
    int  nArg2 = rand ( ) % 7;
    long nArg3 = rand ( ) % 13;
    ...
    // Execute the loop 10 billion times or however many were specified on the
    // command line.
    for ( unsigned long i = 0; i < nIters; ++i )
    {
        unsigned long nFuncResult = SimpleFunc ( nArg1, nArg2, nArg3, nFailIter );
        if ( nFuncResult == 0 )
        {
            continue;
        }

        nSum += nFuncResult;
    }
    ...
    return 0;
}

unsigned long SimpleFunc ( int nArg1, int nArg2, long nArg3, unsigned long nFailIter )
{
    static unsigned long nCurIter = 1;
    int nReturn = 0;

    // If an interval of 0 was specified then never fail.
    if ( nFailIter )
    {
        // Otherwise, if this is an iteration we should fail on...
        if ( nCurIter == nFailIter )
        {
            nCurIter = 1;
            return nReturn;
        }
        else
        {
            ++nCurIter;
        }
    }

    nReturn += nArg1 + nArg2 + nArg3;
    return nReturn;
}
```
As you can see, there's nothing fancy; nothing complicated. In `main()`, we're calling `SimpleFunc()`, then checking the return value for an error. As long as it's not 0, we add the return value to `nSum`. Within `SimpleFunc()`, if the current loop iteration is one in which we're supposed to fail, we execute the error reporting code and reset the counter to one; otherwise, we just increment the counter, add the three input values and return the result.
And here are the relevant parts of our implementation using exception handling. We'll call this "test case 5". Again, I've stripped out some code which is not relevant to this discussion, to make it easier to focus on the important points. If you'd like to review the entire source file you can get it here (except-05.cpp).
```cpp
int main ( int argc, char** argv )
{
    ...
    unsigned long nSum = 0;

    // Generate pseudo random values to use as input to the function, so the
    // compiler doesn't optimize them away.
    int  nArg1 = rand ( ) % 5;
    int  nArg2 = rand ( ) % 7;
    long nArg3 = rand ( ) % 13;
    ...
    // Execute the loop 10 billion times or however many were specified on the
    // command line.
    for ( unsigned long i = 0; i < nIters; ++i )
    {
        unsigned long nFuncResult;
        try
        {
            nFuncResult = SimpleFunc ( nArg1, nArg2, nArg3, nFailIter );
        }
        catch ( unsigned long& exc )
        {
            continue;
        }

        nSum += nFuncResult;
    }
    ...
    return 0;
}

unsigned long SimpleFunc ( int nArg1, int nArg2, long nArg3, unsigned long nFailIter )
{
    static unsigned long nCurIter = 1;
    int nReturn = 0;

    // If an interval of 0 was specified then never fail.
    if ( nFailIter )
    {
        // Otherwise, if this is an iteration we should fail on...
        if ( nCurIter == nFailIter )
        {
            nCurIter = 1;
            throw (unsigned long) 0;
        }
        else
        {
            ++nCurIter;
        }
    }

    nReturn += nArg1 + nArg2 + nArg3;
    return nReturn;
}
```
You'll notice the only differences are in the `for` loop in `main()`, where we put the call to `SimpleFunc()` into a try/catch block, and in `SimpleFunc()`, where we throw an exception rather than returning zero. Otherwise, the two implementations are the same.
Before we proceed, I want to take a moment to address a question I'm sure some of you have about the implementation of `SimpleFunc()`. You might be wondering why I didn't just use the modulus operator rather than a counter variable (`nCurIter`) which gets reset to one on every `nFailIter`-th invocation. It was purely a matter of performance. The modulus operator really is that inefficient on x86 processors (though it's very fast on SPARC processors). Using it would have saved a few lines of code in `SimpleFunc()`, but it increased the runtime by 388% in the non-exception implementation, and that's just not good server programming.
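For reference, here is roughly what that modulus-based variant would have looked like. This is my reconstruction, not the code that was actually measured; the integer division hiding behind `%` executes on every call, which is what caused the 388% increase:

```cpp
// Hypothetical modulus-based variant of SimpleFunc(). Shorter, but the
// '%' operation (an integer division on x86) runs on every invocation.
unsigned long SimpleFunc ( int nArg1, int nArg2, long nArg3, unsigned long nFailIter )
{
    static unsigned long nCurIter = 0;

    ++nCurIter;

    // If an interval of 0 was specified then never fail.
    if ( nFailIter && ( nCurIter % nFailIter ) == 0 )
    {
        return 0;
    }

    return nArg1 + nArg2 + nArg3;
}
```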
Back to the topic at hand!
The Hard Numbers
So, given our two test cases above, here are the actual runtime results, first for builds without compiler optimizations. The times are measured in CPU clock ticks; 1 million ticks is approximately 1 second of wall clock time, since no other operations are competing for CPU time.

Number of Failures | Without Exceptions | With Exceptions |
---|---|---|
10 | 60,609,251 | 60,479,078 |
100 | 61,004,688 | 59,638,788 |
1,000 | 61,387,944 | 59,448,682 |
10,000 | 60,668,671 | 59,419,694 |
100,000 | 60,664,220 | 60,740,331 |
1,000,000 | 60,803,370 | 62,853,636 |
10,000,000 | 60,629,392 | 80,632,689 |
100,000,000 | 61,365,771 | 270,400,963 |
1,000,000,000 | 59,029,813 | 1,989,067,404 |
First, let me say, I'm not going to spend much time talking about the non-optimized results because they probably shouldn't be of interest to anyone doing professional C++ development. If you actually release your code to production without compiler optimizations you should be slapped.
The results without compiler optimizations are mainly useful for helping to determine how much of the performance impact is caused by the exception handling preventing the compiler from being able to perform optimizations, as opposed to the actual cost of throwing and catching the exceptions.
One thing you might notice from the table above is that the performance of the exception handling implementation is slightly better than the implementation without exception handling until the number of exceptions thrown approaches 1 million (an exception every 10,000 iterations of the loop). After that point, we're throwing exceptions so frequently that performance just completely drops off. That last number, 1,989,067,404, is accurate; the run literally took about 33 minutes.
I was going to provide the results using a failure iteration of 1 (so that every call to `SimpleFunc()` throws an exception), but it would have taken over 5 hours to run.
Now, let's have a look at the performance with level 2 optimizations.

Number of Failures | Without Exceptions | With Exceptions |
---|---|---|
10 | 9,274,434 | 24,934,579 |
100 | 9,294,174 | 24,960,059 |
1,000 | 9,278,503 | 24,759,965 |
10,000 | 9,250,334 | 24,498,404 |
100,000 | 9,206,505 | 24,779,964 |
1,000,000 | 9,256,397 | 26,969,827 |
10,000,000 | 9,263,343 | 48,805,107 |
100,000,000 | 9,769,718 | 256,854,640 |
1,000,000,000 | 9,264,776 | 2,154,706,534 |
Clearly, the first thing you should notice is the significant discrepancy between the non-exception version and the exception version for ALL test runs. The non-exception version starts out at just over 9.2 million ticks and only goes as high as 9.7 million; whereas the exception implementation starts at 24.9 million ticks and stays around there until we hit that nasty 10 million exceptions mark, then goes all to hell. In the last test run, where we have 10 billion iterations and throw 1 billion exceptions (that's one every 10 iterations), the runtime was even higher than the non-optimized version.
But how does all of this help us make better design decisions in the real world? We'll get to that momentarily. First, let's have a look at the results with level 3 optimizations.

Number of Failures | Without Exceptions | With Exceptions |
---|---|---|
10 | 6,240,653 | 22,865,303 |
100 | 6,291,640 | 22,125,345 |
1,000 | 6,285,047 | 22,150,231 |
10,000 | 6,221,640 | 22,949,937 |
100,000 | 6,186,891 | 21,573,882 |
1,000,000 | 6,200,055 | 21,448,277 |
10,000,000 | 6,337,831 | 38,851,078 |
100,000,000 | 7,844,211 | 205,552,756 |
1,000,000,000 | 9,187,911 | 1,705,801,299 |
Alright then, let's see what we have. The non-exception implementation stays around 6.2 million ticks until we run it with 10 million failures. The last two rows of the non-exception version are, frankly, unsettling: the runtime goes up as the number of failures increases, even though a failure skips the addition operations, so there should be less processing, not more. But at level 3 optimization the compiler is doing so much funky stuff it's hard to say what the resulting code looks like. We just have to accept that the people writing the g++ compiler are wicked smart.
As for the exception handling version... well, it starts out about 3.7 times slower than the non-exception implementation, and once we reach the point of throwing an exception every 100 iterations, the performance has gone right out the window.
Now you have the facts: you have the source code that was used to generate the test results; you're able to compile and run it on your own machine; and you're able to verify the accuracy of the test results for yourself. With that established, let's talk about why these test results, and test cases like this, are irrelevant.
Why These Results Are Meaningless
It is unfortunate that so much of what is offered as "proof" on the Internet and in schools is based on test cases and scenarios like the ones presented here. It is even more unfortunate that so many intelligent people are willing to accept such test results as proof and proceed to make engineering decisions based on them. While these tests show that throwing an exception IS significantly more expensive than returning an integer, you have to consider that cost in the context of the entire code base.
Consider what happens if we take the code that we've been using in this post and simply insert the following single `swprintf()` call:

```cpp
swprintf ( g_strFoo, 127, L"Foobar: %d, %d, %ld, %lu", nArg1, nArg2, nArg3, nFailIter );
```

into the block in `SimpleFunc()` which reports the error/exception, then rerun the tests with level 2 compiler optimizations. We get the following results.

Number of Failures | Without Exceptions | With Exceptions |
---|---|---|
1,000 | 24,500,424 | 18,658,313 |
10,000 | 24,541,007 | 19,316,947 |
100,000 | 24,480,973 | 20,688,354 |
1,000,000 | 24,777,916 | 23,422,812 |
10,000,000 | 28,381,515 | 48,478,125 |
By simply adding one call to `swprintf()`, which only executes when an error occurs or an exception is thrown, the exception handling version actually performs better than the non-exception version, until we throw more than 1 million exceptions. The modified versions of the source files can be downloaded here (test case 4a) and here (test case 5a). You may also notice that the exception handling version is actually faster after adding the call to `swprintf()` than it is without it.
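One housekeeping note: `g_strFoo` isn't declared in any of the excerpts shown here. A plausible declaration, and this is my assumption rather than a line taken from the downloadable sources, would be:

```cpp
#include <cwchar>

// Hypothetical global buffer for swprintf(); 128 wide characters covers the
// limit of 127 passed as the second argument (which includes the terminator).
static wchar_t g_strFoo [ 128 ];
```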
And that's why I say these types of tests (or almost ANY benchmarks) are meaningless in the real world. They provide interesting, academic insight, but they're of little use to people concerned with real software development.
If the code you're developing does execute a tight loop with a huge number of iterations ("huge" meaning billions, or at least millions, not 30 to 50 thousand), then these types of tests may be applicable, and you should NOT be throwing exceptions from within that loop. And if what you're developing will run in a very resource-constrained environment (e.g. many embedded systems), then you should probably not be using exceptions at all.
However, if your code is like 98% of all of the server code in the world, and all the server code which will ever be written, then test cases and scenarios like these have very little applicability.
Another reason I would consider these test results meaningless in the context of real world development is that if your application is throwing more than 1 million exceptions, in rapid succession, from the same point in the code, then there's probably something wrong with how the surrounding or calling code is implemented. For example, say you have a function which takes a pointer as a parameter, and if that pointer is null then your function throws an exception (perfectly reasonable). But you find that the function is throwing an exception more often than not because it keeps getting passed a null pointer. In such a case, the problem is not with the function that keeps throwing the exception, but with the code that keeps passing the null pointer, as the sketch below illustrates.
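Here's a sketch of that scenario. The names are mine, invented purely for illustration; nothing like this appears in the test code:

```cpp
#include <cstring>
#include <stdexcept>

// Perfectly reasonable: reject a null pointer by throwing.
std::size_t CountBytes ( const char* pData )
{
    if ( pData == NULL )
    {
        throw std::invalid_argument ( "pData must not be null" );
    }

    return std::strlen ( pData );
}

// The actual bug lives here: a caller that repeatedly passes null and
// swallows the exception on every iteration, instead of fixing (or at
// least checking) its own input.
std::size_t SumBytes ( const char** ppItems, std::size_t nCount )
{
    std::size_t nTotal = 0;

    for ( std::size_t i = 0; i < nCount; ++i )
    {
        try
        {
            nTotal += CountBytes ( ppItems [ i ] );
        }
        catch ( const std::invalid_argument& )
        {
            // Throws constantly if ppItems is full of nulls - the fix
            // belongs here in the caller, not in CountBytes().
        }
    }

    return nTotal;
}
```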
Propagating Exceptions Up the Call Stack
You're generally not going to throw an exception and catch it within the same function. Although perfectly legal, it's not very practical because you can just handle the error immediately. Usually, when you throw an exception the intent is for some function up the call stack to catch, and hopefully handle, it.
So, we're going to take a quick look at how much of the performance difference we've seen so far is the result of propagating the exception up the call stack.
First, let's look at what happens if we catch the exception within `SimpleFunc()`, rather than propagating it up to `main()`. The changes to the code are as follows, and you can get the entire source file here (except-05b.cpp):
```cpp
int main ( int argc, char** argv )
{
    ...
    for ( unsigned long i = 0; i < nIters; ++i )
    {
        unsigned long nFuncResult = SimpleFunc ( nArg1, nArg2, nArg3, nFailIter );
        if ( nFuncResult == 0 )
            continue;

        nSum += nFuncResult;
    }
    ...
}

unsigned long SimpleFunc ( int nArg1, int nArg2, long nArg3, unsigned long nFailIter )
{
    ...
    if ( nFailIter )
    {
        try
        {
            // Otherwise, if this is an iteration we should fail on...
            if ( nCurIter == nFailIter )
            {
                nCurIter = 1;
                throw (unsigned long) 0;
            }
            else
            {
                ++nCurIter;
            }
        }
        catch ( unsigned long exc )
        {
            return 0;
        }
    }
    ...
}
```
The rest of the code is the same as "test case 5". As you can see, we're now throwing the exception and catching it within `SimpleFunc()`, so the exception is never propagated up the call stack.
Here are the runtime results (with level 2 optimization):
Number of Failures | Exception Propagated | Exception Not Propagated |
---|---|---|
10 | 24,934,579 | 21,908,537 |
100 | 24,960,059 | 21,533,972 |
1,000 | 24,759,965 | 21,639,925 |
10,000 | 24,498,404 | 21,443,953 |
100,000 | 24,779,964 | 21,610,841 |
1,000,000 | 26,969,827 | 23,445,152 |
10,000,000 | 48,805,107 | 37,830,669 |
100,000,000 | 256,854,640 | 181,090,749 |
1,000,000,000 | 2,154,706,534 | 1,434,655,757 |
As you can see, not propagating the exception up the call stack results in consistently better performance. But, as mentioned above, that really kind of defeats the purpose of having an error reporting mechanism - the point is to be able to report the error to the caller and let them decide how to handle it.
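Also keep in mind that these tests only measure a single frame of propagation, from `SimpleFunc()` up to `main()`. In a deeper call chain, the unwinder has to walk through every intermediate frame. Here's a minimal sketch of such a frame; this is my own addition, not one of the measured test cases:

```cpp
unsigned long SimpleFunc ( int nArg1, int nArg2, long nArg3, unsigned long nFailIter );

// An intermediate frame with no handler of its own: an exception thrown by
// SimpleFunc() unwinds straight through here on its way to the catch in
// main(), adding to the propagation cost.
unsigned long MiddleFunc ( int nArg1, int nArg2, long nArg3, unsigned long nFailIter )
{
    return SimpleFunc ( nArg1, nArg2, nArg3, nFailIter );
}
```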
Exception Specifications
You may notice that the sample code we're using today doesn't include an exception specification for the `SimpleFunc()` function. The reason for that is twofold:
- An exception specification may adversely affect performance when an exception is actually thrown; and
- In C++ an exception specification provides little, if any, actual value. It isn't checked at compile time, so it doesn't prevent any other type of exception from being thrown; if a non-listed exception escapes, std::unexpected() is called at runtime.
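For reference, the declaration used for the "With Exception Specification" runs below would have looked something like this. This is my reconstruction, using the pre-C++11 dynamic exception specification syntax that g++ 4.8 supports:

```cpp
// Dynamic exception specification (deprecated in C++11, removed in C++17):
// a promise that SimpleFunc() throws nothing other than unsigned long.
// It is not checked at compile time; if anything else escapes, the runtime
// calls std::unexpected().
unsigned long SimpleFunc ( int nArg1, int nArg2, long nArg3,
                           unsigned long nFailIter ) throw ( unsigned long );
```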
Here are the runtime differences of the test cases, with and without providing an exception specification.
Number of Failures | Without Exception Specification | With Exception Specification |
---|---|---|
10 | 24,934,579 | 24,509,037 |
100 | 24,960,059 | 24,480,011 |
1,000 | 24,759,965 | 24,499,674 |
10,000 | 24,498,404 | 24,501,432 |
100,000 | 24,779,964 | 24,881,300 |
1,000,000 | 26,969,827 | 27,115,811 |
10,000,000 | 48,805,107 | 49,843,490 |
100,000,000 | 256,854,640 | 275,066,896 |
1,000,000,000 | 2,154,706,534 | 2,358,673,305 |
Here, we see that there is not a significant performance difference whether you provide an exception specification for the function or not. Once the number of exceptions being thrown gets over 10 million, the gap grows a bit, but not by enough to be worth considering; after all, if your application is throwing more than 10 million exceptions in very rapid succession, then you probably have much larger issues to look into.
There are some claims on the Internet that using exception specifications can harm performance. That may be the case in some very specific scenarios, but I've never seen any real world evidence to support such claims.
Conclusions
Throwing an exception is very likely going to be substantially more expensive than returning an integer. But unless you're throwing millions of exceptions per second, it's not likely going to make a noticeable difference to your runtime performance. And, if you are throwing millions of exceptions per second then there is probably something very wrong with your code.
When implementing tight loops which are expected to have a huge number of iterations, throwing exceptions should be avoided, because in those cases there is the potential for noticeable performance degradation.
The farther up the call stack the exception is propagated, the more overhead it will incur. But, just as with throwing exceptions in general, this won't likely be noticeable or have any effect on performance unless it is happening a ridiculous number of times. By "ridiculous" I mean, literally, millions of times per second.
Exception specifications in your function signatures will not have any noticeable effect on runtime performance. They will likely cause you other headaches over time, but certainly nothing related to performance.
In my next post on this topic we're going to look at how exception handling affects the performance of object oriented code - particularly, virtual functions, which compilers are generally not able to optimize as extensively as non-virtual functions.