Contact
Patrick Fox
Torrance, CA     90503
fox@patrickfox.org

Realistic C++/Java Server Comparison (realcomp)

The realcomp project is a simple yet reasonably realistic implementation of a fairly standard server system, written in parallel in both C++ and Java.

The fundamental goal of the project is to attempt to determine, with reasonable accuracy and reliability, the actual, real world performance, scalability, and maintainability differences of a server system and code base implemented in Java as opposed to C++.

I emphasize the use of the term "real world" because we are not interested in the performance and resource utilization differences of the languages/platforms in completely contrived scenarios which a professional Software Engineer is unlikely to ever encounter in the course of his career. We are interested in being able to make informed and intelligent decisions, based on realistic, firsthand knowledge and experience, about how a system should be implemented.

Overview

The realcomp project is intended to simulate the general functionality of a typical, high performance, scalable, sockets based server. Much of the design (the classes and abstractions) is based on real, commercial, production server systems I have designed and/or worked on over the years (without, of course, violating any non-disclosure agreements or copyrights - all of the code has been written completely from scratch). The actual implementation is, of course, substantially simplified.

The project consists of four modules or targets: a client application; a C++ implementation of the server; a Java implementation of the server; and a shared library containing common code used in both the client and the C++ server.

Requirements

Functional Equivalence

A critical requirement is that the C++ version and the Java version MUST be functionally equivalent. There may be differences in implementation details, due to either constraints of the languages/libraries, or due to differences in how something is typically approached in the Java world as opposed to the C++ world. But otherwise, the two versions must be the same.

Minimum Concurrent Connections

The minimum requirement for concurrent, active connections is 30,000.

Since we are interested in a system which continuously accepts new connections while concurrently handling existing ones, the client is implemented such that you can specify a minimum and a maximum number of connections for it to maintain. While the client runs, it continuously establishes and terminates a pseudo random number of connections while staying between the specified minimum and maximum.
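As a rough illustration of that churn behavior, here is a minimal C++ sketch. It is not taken from the realcomp source: the function name nextConnectionDelta, the maxStep parameter, and the use of std::mt19937 are all assumptions made for the example. It only computes how many connections to open or close on each pass; the actual socket handling is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <random>

// Hypothetical helper (an assumption for illustration, not realcomp code).
// Given the number of currently open connections, propose a pseudo random
// change of at most maxStep in either direction, then clamp the result so
// the total stays within [minConns, maxConns].
// A positive return value means "open this many connections";
// a negative value means "close this many".
long nextConnectionDelta(long current, long minConns, long maxConns,
                         long maxStep, std::mt19937& rng)
{
    assert(minConns <= maxConns && maxStep >= 0);

    std::uniform_int_distribution<long> pickStep(-maxStep, maxStep);
    const long proposed = current + pickStep(rng);

    // Clamp to the configured bounds. If the client starts below the
    // minimum (for example at zero), this jumps straight up to minConns.
    const long clamped = std::max(minConns, std::min(proposed, maxConns));
    return clamped - current;
}
```

A client loop would call this once per pass, open or close the indicated number of connections, and repeat. Capping the step size is a design choice of the sketch, so that the load changes gradually rather than swinging between the two extremes.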

Real World Designs & Implementations

Since the purpose of this project is to determine the performance and scalability differences between a server written in C++ and one written in Java, and whether Java is really feasible as a server language/platform, we need to design our classes and implement our code as they would be in a real system.

To that end, we need to include the same level of error checking and handling, use of threads and synchronization mechanisms, logging, et cetera, that we would expect a real, production system to contain.

Sure, we could just focus on the specific points of interest, but that would produce misleading results. Why? Because in a real system, some of the differences measured in isolation will be mitigated by the overhead of such mundane requirements. At the same time, it is likely that such necessary, yet uninteresting, processing will itself adversely affect the runtime performance.

Design Simplicity

Simple designs and implementations are easier to test, to maintain, and to extend. Complexity is justifiable only if it provides a proven, identifiable benefit, and there are no simpler approaches which would provide a comparable benefit.

Unfortunately, there is a lot of code in the world which is unnecessarily complicated. Much of it is the result of developers with outdated knowledge or misinformation. Much of it is the result of developers not wanting to take the time to really understand the code they are responsible for maintaining.

It should be expected that a server code base is going to be in use for some duration of time. We must therefore expect that it will be maintained, extended, and improved upon, and so we must avoid unnecessary complexity.

Development Environment

Currently, all development is being done on openSUSE 13.2 Linux. I've not tried building or running it on other distributions, but I don't imagine there would be difficulties. The C++ projects (modules) are all makefile based and can be built from the command line. I am not one of those antiquated Unix developers who refuse to get with the times, though, and I do most of my work in a contemporary IDE. My IDE of choice is NetBeans.

I've included the NetBeans 8.0.2 project files along with the source code, in SVN and in the zip file. If you are a NetBeans user, that can save you some time and effort.

Background and Motivation

On almost every project I've worked on within the past 10 years, there have been at least a couple of pro-Java developers or managers around who would engage me in regular, healthy debates about the feasibility of Java over C++ for developing a large, high performance, scalable server system.

Coming from a predominantly C++ background, with a few occasional forays into Java, I unquestionably have a bias toward C++. But, at the same time, professionally, I always try to remain objective and do what I believe will be best for the project.

I've had to do test projects like this in the past. One time in particular, around 2008, at a company I was working with, someone in a much too high level of management made the decision that we should phase out all of our C++ code and start doing all new development in Java, because it was, apparently, "the future". I was able to get approval to spend a couple of weeks developing some comparative tests to determine whether such an endeavor would actually be feasible. At that time, it turned out, it was not, and the plan to transition to Java was abandoned.

On a recent project, I have periodically engaged in debates with team leads and project managers, who were very pro-Java. The current product I was working on was in C++, but they intended for the next product to be entirely in Java.

I promptly set out, on the Internet, to find recent, reliable research and test results, to determine what the current state of the issue really is. What I found was hordes of people claiming that Java used to be much slower than C++, but that it has come a long way and is currently just as performant and scalable as something written in C++. But it was all just claims, just talk. There were no actual, hard test results, and no source code that others could objectively review to make sure the two versions were, actually, functionally equivalent and well written.

The one and only test I was able to find that provided specific results, an explanation, and some source code was performed by Robert Hundt at Google in 2011. The report can be found here: Loop Recognition in C++/Java/Go/Scala. It found that C++ significantly outperformed Java. However, one concern I have with that test is that its focus was on exercising as many of the languages' high level data structures as possible, in order to get an overall comparison. I thought the work was very well done and found the report very informative, but it just didn't exactly meet my requirements. What I needed was something focused specifically on developing a high performance, network based, scalable server product.

All of the other comparative tests I found focused on overly simplistic and narrow comparisons, such as searching a binary tree a million times for random keys, or re-sorting a linked list. None of these, in my opinion, is realistic, because no real world system is going to spend much of its time perpetually re-sorting a list or searching a tree in a tight loop.

Ultimately, I was unable to find any tests that compared functionally equivalent server products required to support at least 30,000 concurrent socket connections, each of which is actively sending and receiving data.

I finished the contract with that client in March 2015 and, having some time off, decided I would create such a test.