String Concat() vs. Format() vs. StringBuilder

I was looking through my group’s C# coding standards the other day, and a couple of legacy items in there caught my eye.  They had been passed down from committee to committee so many times that no one had thought to second-guess or test them for a long time.  It’s yet another example of how micro-optimizations can get the best of us and lead us to write code that is less maintainable than it could be, all for the sake of squeezing an extra ounce of performance out of our software.
So the two standards in question were these, in paraphrase:
  • Prefer StringBuilder or string.Format() to string concatenation.
  • Prefer string.Equals() with case-insensitive option to string.ToUpper().Equals().
Now some of you may already know what my results are going to show, as these items have been compared before on many blogs, but I think it’s always worth repeating and trying these yourself.  So let’s dig in.
The first test was a pretty standard one.  When concatenating strings, what is the best choice: StringBuilder, string.Concat(), or string.Format()?
So before we begin, I read in a number of iterations and a length for each string from the console.  Then I generate that many random strings of the given length, plus an array to hold the results.  Why am I so keen to keep the results?  Because I want to be able to snapshot the memory and don’t want garbage collection to collect the strings, hence the array to keep hold of them.  I also didn’t want the random strings to count as part of the measured allocation, so I allocate them and the array up front, before the snapshot.  So in the code snippets below:
  • num – Number of iterations.
  • strings – Array of randomly generated strings.
  • results – Array to hold the results of the concatenation tests.
  • timer – A System.Diagnostics.Stopwatch() instance to time code execution.
  • start – Beginning memory size.
  • stop – Ending memory size.
  • after – Memory size after final GC.
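The setup described above might look something like the following minimal sketch.  The helper names RandomString and CreateInputs, the fixed seed, and the string length of 20 are assumptions for illustration only; the post does not show its actual setup code.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class BenchmarkSetup
{
    // Hypothetical helper: builds one random string of uppercase letters.
    public static string RandomString(Random random, int length)
    {
        var builder = new StringBuilder(length);
        for (int i = 0; i < length; i++)
        {
            // pick a random letter in the range A-Z
            builder.Append((char)('A' + random.Next(26)));
        }
        return builder.ToString();
    }

    // Hypothetical helper: creates all input strings before any measurement starts.
    public static string[] CreateInputs(int num, int length, int seed)
    {
        var random = new Random(seed);
        var strings = new string[num];
        for (int i = 0; i < num; i++)
        {
            strings[i] = RandomString(random, length);
        }
        return strings;
    }

    public static void Main()
    {
        int num = 500000;   // the post reads this from the console
        int length = 20;    // assumed; the post does not state the length used
        string[] strings = CreateInputs(num, length, 42);
        string[] results = new string[num];   // holds the results so the GC cannot collect them
        var timer = new Stopwatch();

        Console.WriteLine("Prepared {0} inputs, {1} result slots, timer running: {2}",
            strings.Length, results.Length, timer.IsRunning);
    }
}
```

Seeding the Random makes the inputs repeatable between runs, which keeps the three tests comparable.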
So first, let’s look at the concatenation loop:
    // build num strings using concatenation.
    for (int i = 0; i < num; i++)
    {
        results[i] = "This is test #" + i + " with a result of " + strings[i];
    }

Pretty standard, right?  Next for string.Format():
    // build strings using string.Format()
    for (int i = 0; i < num; i++)
    {
        results[i] = string.Format("This is test #{0} with a result of {1}", i, strings[i]);
    }

Finally, StringBuilder:
    // build strings using StringBuilder
    for (int i = 0; i < num; i++)
    {
        var builder = new StringBuilder();
        builder.Append("This is test #");
        builder.Append(i);
        builder.Append(" with a result of ");
        builder.Append(strings[i]);
        results[i] = builder.ToString();
    }

So I take each of these loops, and time them by using a block like this:
    // get the total amount of memory used, true tells it to run GC first.
    start = System.GC.GetTotalMemory(true);

    // restart the timer
    timer.Reset();
    timer.Start();

    // *** code to time and measure goes here. ***

    // get the current amount of memory, stop the timer, then get memory after GC.
    stop = System.GC.GetTotalMemory(false);
    timer.Stop();
    after = System.GC.GetTotalMemory(true);
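
One way to avoid pasting that block around every test is to wrap it in a helper that takes the code to measure as a delegate.  This is only a sketch: the Measure class, the Sample struct, and its field names are hypothetical, and reporting the two memory figures as differences from the starting size is an assumption about how the numbers are meant to be read.

```csharp
using System;
using System.Diagnostics;

public static class Measure
{
    // Hypothetical result type for one measured run.
    public struct Sample
    {
        public long ElapsedMs;      // wall-clock time of the measured code
        public long BytesBeforeGc;  // growth in memory before the final collection
        public long BytesAfterGc;   // growth in memory that survives a collection
    }

    public static Sample Run(Action body)
    {
        // true forces a collection first so we start from a clean baseline
        long start = GC.GetTotalMemory(true);

        var timer = Stopwatch.StartNew();
        body();
        timer.Stop();

        long stop = GC.GetTotalMemory(false);
        long after = GC.GetTotalMemory(true);

        return new Sample
        {
            ElapsedMs = timer.ElapsedMilliseconds,
            BytesBeforeGc = stop - start,
            BytesAfterGc = after - start
        };
    }

    public static void Main()
    {
        int num = 1000;
        var results = new string[num];

        Sample sample = Run(() =>
        {
            for (int i = 0; i < num; i++)
            {
                results[i] = "This is test #" + i;
            }
        });

        Console.WriteLine("Time: {0} ms, Memory: {1}/{2}",
            sample.ElapsedMs, sample.BytesBeforeGc, sample.BytesAfterGc);

        // keep the results reachable until after the measurement
        GC.KeepAlive(results);
    }
}
```

Unlike the block above, this sketch stops the timer before reading the memory size, so the memory query itself is not included in the elapsed time.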

So let’s look at what happens when I run each of these blocks through the timer and memory check at 500,000 iterations:
    Operator + - Time: 547, Memory: 56104540/55595960 - 500000
    string.Format() - Time: 749, Memory: 57295812/55595960 - 500000
    StringBuilder - Time: 608, Memory: 55312888/55595960 - 500000

Egad!  string.Format() brings up the rear and + triumphs, well, at least in terms of speed.  Concatenation burns more memory than StringBuilder but less than string.Format().
This shows two main things:
  • StringBuilder is not always the panacea many think it is.
  • The difference between any of the three (in the context of creating a string in a single statement) is minuscule!
The second point is extremely important!  You will often hear people who grasp at results and say, “look, operator + is 10% faster than StringBuilder, so always use operator +.”  Statements like this are a disservice and often misleading.  For example, if I had a good guess at what the size of the string would be, I could have pre-allocated my StringBuilder like so:
    for (int i = 0; i < num; i++)
    {
        // pre-size the StringBuilder with a 100 char buffer.
        var builder = new StringBuilder(100);
        builder.Append("This is test #");
        builder.Append(i);
        builder.Append(" with a result of ");
        builder.Append(strings[i]);
        results[i] = builder.ToString();
    }

Now let’s look at the times:
    Operator + - Time: 551, Memory: 56104412/55595960 - 500000
    string.Format() - Time: 753, Memory: 57296484/55595960 - 500000
    StringBuilder - Time: 525, Memory: 59779156/55595960 - 500000

Whoa!  All of a sudden StringBuilder is back on top for this example code!  But notice that it takes more memory now.  This makes sense once you look at what happens behind the scenes.  Whenever you use string.Concat() (or operator +, which the compiler turns into a string.Concat() call) in your code, it examines the lengths of the arguments and allocates a result string of exactly the right size for you, rather than working from a guessed capacity.
But even if we know the approximate size of our StringBuilder, look how much less readable it is!  That’s why I feel you should always take both readability and performance into account.  After all, these timings are totals over 500,000 iterations.  The gap between the fastest and slowest option is about 200 ms, or roughly 0.0004 ms per call, which is negligible.
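To make the “examines the lengths of the arguments” step concrete, here is a sketch of the idea: add up the lengths first, allocate once, then copy each piece into place.  ConcatExact is a hypothetical helper written for illustration, not the runtime’s actual implementation, and unlike the real string.Concat() it assumes none of the parts are null.

```csharp
using System;

public static class ConcatSketch
{
    // Hypothetical illustration of "measure first, allocate once".
    public static string ConcatExact(params string[] parts)
    {
        // first pass: work out exactly how many characters are needed
        int total = 0;
        foreach (string part in parts)
        {
            total += part.Length;
        }

        // single allocation of exactly the right size
        var buffer = new char[total];

        // second pass: copy each part into its slot
        int offset = 0;
        foreach (string part in parts)
        {
            part.CopyTo(0, buffer, offset, part.Length);
            offset += part.Length;
        }

        return new string(buffer);
    }

    public static void Main()
    {
        string viaSketch = ConcatExact("This is test #", "7", " with a result of ", "ABC");
        string viaOperator = "This is test #" + 7 + " with a result of " + "ABC";

        Console.WriteLine(viaSketch);
        Console.WriteLine(viaSketch == viaOperator);  // prints "True"
    }
}
```

Because the total length is known before anything is copied, there is no buffer to grow and no spare capacity left over, which is exactly what a StringBuilder has to guess at.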
The key is to pick the best tool for the job you are trying to do.  What do I mean?  Consider these words of wisdom:
  • Concatenate (+) is great at concatenating several strings in one single statement. 
  • StringBuilder is great when you need to build a string across multiple statements or a loop.
  • Format is great at performing formatting of strings in ways that concatenation cannot.
Just remember, there is no magic bullet.  If one of these always beat the others, we’d have only one choice and not three.  Each has its purpose, and each has times when it outshines the others.  So do not take this as “string concat is always faster,” because that is not true, and do not take it as “a sized StringBuilder is always faster,” because that is not always true either!  The salient point, which I can’t stress enough, is that each performs a certain job well, and the key is to know which tool does which job best.
So, in general, string.Concat() (or operator +) is clean and often optimal for joining together a known set of strings in a single statement.
StringBuilder, on the other hand, excels when you need to build a string across multiple statements or in a loop.  Use it in those times when you are looping till you hit a stop condition and building a result and it won’t steer you wrong.
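As a small illustration of that loop-until-a-stop-condition case, here is a sketch that collects values into one string with a single StringBuilder.  The SquaresBelow method and its task are hypothetical, chosen only to show the pattern.

```csharp
using System;
using System.Text;

public static class LoopBuild
{
    // Hypothetical example: lists the perfect squares below a limit.
    public static string SquaresBelow(int limit)
    {
        var builder = new StringBuilder();

        // the number of appends is not known up front; we stop when n*n reaches the limit
        for (int n = 1; n * n < limit; n++)
        {
            // separate entries, but not before the first one
            if (builder.Length > 0)
            {
                builder.Append(", ");
            }
            builder.Append(n * n);
        }

        return builder.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(SquaresBelow(30));  // prints "1, 4, 9, 16, 25"
    }
}
```

Doing the same thing with += on a string would create a new, longer string on every pass, which is the situation StringBuilder was designed to avoid.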
Finally, string.Format() seems to be the loser from the stats, but consider which of these is more readable:
    // build a date via concatenation
    var date1 = (month < 10 ? "0" : string.Empty) + month + '/'
        + (day < 10 ? "0" : string.Empty) + day + '/' + year;

    // build a date via string builder
    var builder = new StringBuilder(10);
    if (month < 10) builder.Append('0');
    builder.Append(month);
    builder.Append('/');
    if (day < 10) builder.Append('0');
    builder.Append(day);
    builder.Append('/');
    builder.Append(year);
    var date2 = builder.ToString();

    // build a date via string.Format
    var date3 = string.Format("{0:00}/{1:00}/{2:0000}", month, day, year);
So the strength in string.Format() is that it makes constructing a formatted string easy to read.  Yes, it’s slower, but look at how much more elegant it is to do zero-padding and anything else string.Format() does.
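Zero-padding is only one of the things a format string can express.  The sketch below shows a few others, such as column alignment and hexadecimal output.  The FormatDemo class and its method names are hypothetical, and passing CultureInfo.InvariantCulture is a choice made here so that the output does not vary with the machine’s regional settings.

```csharp
using System;
using System.Globalization;

public static class FormatDemo
{
    // Same date format as above: two-digit month and day, four-digit year.
    public static string Date(int month, int day, int year)
    {
        return string.Format(CultureInfo.InvariantCulture,
            "{0:00}/{1:00}/{2:0000}", month, day, year);
    }

    // Left-align a label in 8 columns, right-align a number in 6 with two decimals.
    public static string Row(string label, double value)
    {
        return string.Format(CultureInfo.InvariantCulture,
            "{0,-8}|{1,6:F2}", label, value);
    }

    // Four-digit uppercase hexadecimal.
    public static string Hex(int value)
    {
        return string.Format(CultureInfo.InvariantCulture, "{0:X4}", value);
    }

    public static void Main()
    {
        Console.WriteLine(Date(3, 7, 2010));       // prints "03/07/2010"
        Console.WriteLine(Row("total", 3.14159));  // prints "total   |  3.14"
        Console.WriteLine(Hex(255));               // prints "00FF"
    }
}
```

Writing any of these with concatenation or StringBuilder would mean hand-coding the padding and rounding logic that the format string states in a few characters.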
So my lesson is: don’t look for the silver bullet!  Choose the best tool.  Micro-optimization can bite you in the end if you sacrifice readable code for a performance gain that may or may not exist.  This is not to say you should feel free to write ill-performing code.  You should still understand the complexity of the code you are writing and, of course, prefer linear algorithms to quadratic ones and so on.  But before you optimize, make sure you understand what gains (if any) you are going to get.
I love the rules of optimization.  They’ve been stated before in many forms, but here’s how I always remember them:
  1. For Beginners: Do not optimize.
  2. For Experts: Do not optimize yet.
Much of the time on today’s hardware, a microsecond optimization made at the expense of readability will net you little, because it won’t be your biggest bottleneck.  Code for readability, and choose the best tool for the job, which will usually be the most readable and maintainable as well.  Then, if you still need that extra performance boost after profiling your code and finding the true bottleneck, you can optimize away.
