BlackBerry Java - Avoid String Concat

To better my knowledge of the BlackBerry platform and its implementation of Java, as well as to share some useful information on Blurry Words, I’ve set up this sub-category of Mobile Development. Consider yourself introduced to the BlackBerry Java category. Though the name is a general blanket for anything I might be interested in covering for the platform, don’t consider this category a good place to learn the basics. This is in no way an introductory tutorial; it is more of a study into what works best with the BlackBerry implementation of Java. There are plenty of great sources out there to get you started, and I don’t see a need to add to them. If you are here looking for a starting point for BlackBerry development, though, check out the official BlackBerry Developer Site. It’s full of tutorials and example code, everything you need to get going.

Also, for full disclosure, I’m a software developer at Research In Motion.  I’ve been with the company for about 6 months at the time of writing this post (though I’ve been working in mobile development for nearly five years now).  All the development and experimentation that I do in order to provide hard data with my posts is done using only the external development tools provided by RIM.  Blurry Words related activities are performed from a personal laptop dedicated to the content of the blog.  This is done in order to avoid any cross contamination between RIM’s internal development tools/knowledge and what is provided to the public.

With that, let’s get into the topic of this inaugural post: avoiding String concatenation. String concatenation is a core part of almost every application. Every app needs to collect information and present it back to the user, which in most cases is impossible to do in a meaningful way without concatenating strings. The BlackBerry Java implementation offers a few ways of accomplishing this task: the String operators + and +=, the String.concat() function, and StringBuffer.append(), which builds up the text before converting it to a String object. (StringBuilder, which largely replaced StringBuffer as of J2SE 1.5, is not part of J2ME or the BlackBerry version of Java.)

The question to answer is which method is best from an optimization point of view. Even without benchmarking, most Java developers already know the answer. The StringBuffer class exists for a reason. As String’s mutable sidekick, StringBuffer exists to perform string manipulation with streamlined memory allocation, something immutable objects like String can’t accomplish. Don’t misunderstand: immutable objects are highly preferable, and mutable objects should be minimized because they are harder to implement and easier to misuse, as described in Joshua Bloch’s Effective Java:

Immutable classes are easier to design, implement, and use than mutable classes.  They are less prone to error and are more secure…Classes should be immutable unless there’s a very good reason to make them mutable.  Immutable classes provide many advantages, and their only disadvantage is the potential for performance problems under certain circumstances.

He goes on to say that small value objects should always be immutable and that larger value objects, or any object that may require multistep operations, like String, should be immutable with a mutable companion class to perform the operations. The reason for this is that an immutable object can never be altered. So, at each step of an operation a new object would need to be created, leaving the possibility for large amounts of garbage and, as a result, degraded performance. The mutable sidekick should be developed to provide the same results without the allocation overhead. Hence the String and StringBuffer partnership, a perfect pair to test and provide hard proof of the goodness of this development approach. Here’s the code that was used to test each method.

String += Operator:

    private String strcatTest() {
        String test = "";
        for(int i = 0; i < StrcatSettings.NUM_CONCATS; i++) {
            test += StrcatSettings.CONCAT_STRING;
            test += StrcatSettings.CONCAT_STRING;
            test += StrcatSettings.CONCAT_STRING;
            test += StrcatSettings.CONCAT_STRING;
            test += StrcatSettings.CONCAT_STRING;
        }
        return test;
    }

String.concat(String):

    private String strcatTest() {
        String test = "";
        for(int i = 0; i < StrcatSettings.NUM_CONCATS; i++) {
            test = test.concat(StrcatSettings.CONCAT_STRING);
            test = test.concat(StrcatSettings.CONCAT_STRING);
            test = test.concat(StrcatSettings.CONCAT_STRING);
            test = test.concat(StrcatSettings.CONCAT_STRING);
            test = test.concat(StrcatSettings.CONCAT_STRING);
        }
        return test;
    }

StringBuffer.append():

    private String strcatTest() {
        StringBuffer sb = new StringBuffer();
        for(int i = 0; i < StrcatSettings.NUM_CONCATS; i++) {
            sb.append(StrcatSettings.CONCAT_STRING);
            sb.append(StrcatSettings.CONCAT_STRING);
            sb.append(StrcatSettings.CONCAT_STRING);
            sb.append(StrcatSettings.CONCAT_STRING);
            sb.append(StrcatSettings.CONCAT_STRING);
        }
        return sb.toString();
    }

In order to benchmark these functions, each was implemented in its own class extending CodeBlock, an abstract class I created that provides one method, called run(), to be overridden.  The code that you want to test/benchmark should be implemented within/through run().  Then all the different implementations are registered within a separate class called ComparisonTest, which wraps calls of each version of run() inside of benchmark timing code.  I don’t get fancy with the timing, just the wall clock approach.
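The harness itself isn’t shown in this post, so the following is only a minimal sketch of how such a setup could look. The names CodeBlock and run() come from the description above; ComparisonTestSketch, its time() method, and the sample StringBufferBlock are hypothetical names made up for illustration, and the real ComparisonTest may well be organized differently.

```java
// Hypothetical reconstruction: only CodeBlock and run() are named in the post.
abstract class CodeBlock {
    /** The code to be tested/benchmarked goes in here. */
    public abstract void run();
}

// Illustrative block under test (assumed name, fixed 50 appends).
class StringBufferBlock extends CodeBlock {
    String result;

    public void run() {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < 50; i++) {
            sb.append("x");
        }
        result = sb.toString();
    }
}

// Assumed stand-in for the post's ComparisonTest: plain wall-clock timing.
class ComparisonTestSketch {
    /** Runs the block 'runs' times and returns the cumulative elapsed milliseconds. */
    static long time(CodeBlock block, int runs) {
        long total = 0;
        for (int i = 0; i < runs; i++) {
            // Request a collection and yield before each run.
            // System.gc() is only a hint; the VM may ignore it.
            System.gc();
            Thread.yield();

            long start = System.currentTimeMillis();
            block.run();
            total += System.currentTimeMillis() - start;
        }
        return total;
    }
}
```

Calling ComparisonTestSketch.time(new StringBufferBlock(), 10) would return the cumulative wall-clock time of ten runs, which is the kind of figure reported in the timing tables.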

The tests are then run on a few different BlackBerry devices to 1) see how the devices compare, and 2) make sure the results match in separate environments. I also use this code to obtain memory allocation and code size statistics. I use the BlackBerry JDE’s memory profiler for allocation statistics and an application called jclasslib bytecode viewer to see how much bytecode was generated by the different tests.

Timing Results

As expected, StringBuffer lived up to the expectations laid upon it earlier in the post. The table below shows the timing results collected on three different BlackBerrys: the Curve 8320, the Bold and the Storm. Using StringBuffer to append each String first, and only creating and returning a String object at the very end of all the processing, resulted in a significant performance improvement over either of the String methods.

| Test | Curve 8320 (ms) | Bold (ms) | Storm (ms) |
|---|---|---|---|
| String += | 37196 | 2658 | 5829 |
| String.concat() | 13116 | 1376 | 2801 |
| StringBuffer.append() | 312 | 89 | 119 |

These results show the cumulative runtime of 10 runs of concatenating the string “String Concatenation Comparison Test\n” 500 times. It’s hard to get an accurate picture from this test, though, because so much garbage is created that you can’t be sure what the garbage collector is doing during any given run. I do request that the garbage collector run and yield the thread before each test, but there is still no way to be sure a collection doesn’t get triggered during the test. Here are some results from 100 runs of 50 concatenations.

| Test | Curve 8320 (ms) | Bold (ms) | Storm (ms) |
|---|---|---|---|
| String += | 5312 | 844 | 1471 |
| String.concat() | 1324 | 185 | 339 |
| StringBuffer.append() | 388 | 98 | 131 |

What you can take from this is that as the String objects grow in size, the performance of the String concatenation operator and function gets progressively worse, an expected result considering larger and larger String objects will be created each time.  At the same time, the StringBuffer’s append method’s performance remains comparable in both test cases.
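To put a rough number on “progressively worse”, here is a simplified model, not a measurement. It assumes that every immutable concatenation copies the whole string built so far plus the new piece, and that an ideal, pre-sized buffer copies each piece exactly once. The class and method names are made up for illustration, and the model ignores intermediate buffers, allocation cost and garbage collection.

```java
// Simplified cost model (an assumption, not profiler data): counts characters copied.
class CopyCostModel {
    /** Characters copied when each concat creates a brand new, longer String. */
    static long immutableCopies(int pieceLength, int concats) {
        long copied = 0;
        long length = 0;
        for (int i = 0; i < concats; i++) {
            length += pieceLength; // the new String holds the old contents plus the piece
            copied += length;      // all of it is written into the new object
        }
        return copied;
    }

    /** Characters copied by an idealised buffer that never has to grow. */
    static long bufferCopies(int pieceLength, int concats) {
        return (long) pieceLength * concats;
    }
}
```

With a 10-character piece, the model gives 12,750 copied characters for 50 concatenations and 1,252,500 for 500, while the buffer figure only grows from 500 to 5,000. Ten times the concatenations costs roughly a hundred times the copying, which is the quadratic growth behind the gap between the two timing tables.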

Memory and Garbage

As mentioned above, what’s really slowing down the += operator and the String.concat() function is the fact that a new object must be created for each concatenation in order for the String object to remain immutable. The StringBuffer isn’t constrained by these rules. The only time additional objects need to be created during the StringBuffer appending process is when the byte array that stores all the characters needs to be expanded. This process results in a new, larger byte array being created, the content of the original array being copied into the new array, and the original array becoming garbage. I note this because you can avoid the reallocation by giving the StringBuffer constructor a good estimate of how many characters you will be appending. That way the amount of reallocating needed for the byte array is cut down, and possibly removed from the process completely.
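As a minimal sketch of that pre-sizing idea: the StringBuffer(int) constructor takes an initial capacity, so when the final length is known up front the backing array never has to grow. The class and method names below are made up for illustration.

```java
// Illustrative helper (assumed name): builds 'count' copies of 'piece'
// in a StringBuffer whose capacity is set once, up front.
class PresizedAppend {
    static String repeat(String piece, int count) {
        // Capacity equals the exact number of characters that will be appended,
        // so append() should never need to expand the internal array.
        StringBuffer sb = new StringBuffer(piece.length() * count);
        for (int i = 0; i < count; i++) {
            sb.append(piece);
        }
        return sb.toString();
    }
}
```

When the exact size isn’t known, a reasonable over-estimate still helps: it trades a little unused capacity for fewer expansions.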

Before discussing further, let’s look at the memory results. This is one run of each test, performing only 50 concatenations during the run. The table shows the number of objects created during the processing and how many bytes were needed in total for all the created objects. Note that there might have been other objects created during the test. However, I focused just on String, StringBuffer and byte[], as those are the main objects created in relation to concatenation.

| Test | Objects Created | Memory Used (bytes) |
|---|---|---|
| String += | 308 (String: 106, StringBuffer: 51, byte[]: 151) | 406,524 (String: 60,392, StringBuffer: 1,960, byte[]: 344,172) |
| String.concat() | 110 (String: 110, StringBuffer: 0, byte[]: 0) | 48,500 (String: 48,500, StringBuffer: 0, byte[]: 0) |
| StringBuffer.append() | 14 (String: 3, StringBuffer: 1, byte[]: 10) | 7,268 (String: 1,376, StringBuffer: 20, byte[]: 5,872) |

The first thing I found interesting about these results, which you may have expected from the timing results, is that the String concat() function and the String concatenation operator are implemented differently. The += and + operators actually use a StringBuffer to perform their concatenations, resulting in a ton of StringBuffer and byte[] garbage. The concat function avoids using the StringBuffer (though I’m not sure how it accomplishes this) and so avoids the overhead of allocating a new instance every time, which easily accounts for its better performance. Nevertheless, neither compares with creating only 14 objects and about 7,200 bytes of data during the processing, which is what we get from the StringBuffer.append() method.
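To illustrate why the operator is so hungry for StringBuffers, the sketch below puts a += loop next to a hand-written expansion of the same loop. The expansion is an assumption about what the compiler generates, based on how older javac versions commonly handled string concatenation; I have not confirmed the exact bytecode the BlackBerry toolchain emits. The class and method names are made up for illustration.

```java
// Illustration only: the 'expanded' form is an assumed desugaring of +=.
class PlusEqualsExpansion {
    /** Concatenation as written in the += test. */
    static String withOperator(String piece, int count) {
        String test = "";
        for (int i = 0; i < count; i++) {
            test += piece;
        }
        return test;
    }

    /** Roughly what the loop body above turns into: a fresh StringBuffer per concat. */
    static String expanded(String piece, int count) {
        String test = "";
        for (int i = 0; i < count; i++) {
            // A new StringBuffer, its backing array, and a new String on every pass.
            test = new StringBuffer().append(test).append(piece).toString();
        }
        return test;
    }
}
```

Both methods return the same text; the difference is that the second makes the per-iteration StringBuffer visible, which is consistent with the 51 StringBuffer objects recorded for 50 concatenations in the table above.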

One final note on the memory allocation for both of the String class concatenation methods.  Both created a blank String object for each concatenation.  I don’t know what it is used for, but it is just another piece of garbage cluttering up the heap that doesn’t exist when using the StringBuffer’s append function.

Application Size

Finally, just as frosting on the cake, using the StringBuffer to do the concatenation actually results in a smaller bytecode size. Well, at least compared to the += String operator, which results in 119 bytes worth of bytecode.  The StringBuffer.append implementation shown above results in only 62 bytes of bytecode.  The String.concat() implementation results in 54 bytes of bytecode, but we can match that by chaining StringBuffer.append() calls together, as seen below:

    private String strcatTest() {
        StringBuffer sb = new StringBuffer();
        for(int i = 0; i < StrcatSettings.NUM_CONCATS; i++) {
            sb.append(StrcatSettings.CONCAT_STRING).
               append(StrcatSettings.CONCAT_STRING).
               append(StrcatSettings.CONCAT_STRING).
               append(StrcatSettings.CONCAT_STRING).
               append(StrcatSettings.CONCAT_STRING);
        }
        return sb.toString();
    }

This results in 54 bytes of bytecode because the local variable sb no longer has to be loaded for each append. Each separate append statement costs two extra bytecodes, one to load the variable and another to pop the returned reference once the call is complete. From a timing standpoint the chained implementation is nearly identical to the default implementation, though it was consistently about 2-5 milliseconds slower.

Conclusion

Though immutable objects should be used whenever possible, sometimes the loss of performance is just not acceptable. However, that doesn’t mean you should shy away from immutable classes, only that you should partner them with a mutable sidekick to do all the dirty work when improved performance is required. Finally, in a mobile environment, creating objects should be avoided wherever you reasonably can. It’s an expensive task that is felt in everything from the battery consumed to the responsiveness perceived by the user. String concatenation is a nasty way of ramping up the number of objects being created by your app. Avoid it. Just remember, StringBuffer is there for a reason…use it.