Jason Crease

Test Engineer - Red Gate Software

How big is a string in .NET?

Published Friday, January 16, 2009 4:32 PM

On a 32-bit CLR, the size of an object is typically 8 bytes for the object header plus the sum of the sizes of its fields.  Consider this simple object:


    class ThreeFields
    {
        double d;
        object o;
        int i;
    }


The size of a ThreeFields object is 8 bytes (for the header) + 8 bytes (for the double) + 4 bytes (for the object pointer) + 4 bytes (for the integer) = 24 bytes.  But what about a string?

A string is composed of:

  1. An 8-byte object header (4-byte SyncBlock and a 4-byte type descriptor)
  2. An int32 field for the length of the string (this is returned by String.Length).
  3. An int32 field for the number of chars in the character buffer.
  4. The first character of the string in a System.Char.
  5. The rest of the string in a character buffer, ending with a null terminator char.  If you don’t believe me about the null terminator, have a poke around with Reflector.  String.AppendInPlace() demonstrates this nicely.  One reason this is done is to aid unmanaged interop.  Otherwise, every time you marshalled a string you’d need to copy it and add the ‘\0’.
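The layout listed above can be tallied in code. This is only a minimal sketch, assuming the 32-bit layout described in the list; the class, constant and method names are hypothetical, and the occasional 2 bytes of alignment padding are ignored.

```csharp
using System;

static class StringLayoutSketch
{
    // Assumed 32-bit layout, taken from the list above (not read from the runtime).
    const int ObjectHeaderBytes = 8;   // 4-byte SyncBlock + 4-byte type descriptor
    const int LengthFieldBytes = 4;    // int32 returned by String.Length
    const int CapacityFieldBytes = 4;  // int32 count of chars in the buffer
    const int TerminatorBytes = 2;     // trailing '\0' stored as a System.Char
    const int BytesPerChar = 2;        // UTF-16 code unit

    // bufferChars counts every character slot except the terminator,
    // so the first character (item 4) is included in it.
    public static int EstimateBytes(int bufferChars)
    {
        return ObjectHeaderBytes + LengthFieldBytes + CapacityFieldBytes
             + TerminatorBytes + BytesPerChar * bufferChars;
    }

    static void Main()
    {
        Console.WriteLine(EstimateBytes(0)); // 18: the fixed overhead alone
        Console.WriteLine(EstimateBytes(5)); // 28
    }
}
```

Passing the buffer size rather than the string length is deliberate: the estimate then also covers strings whose buffer is larger than their contents.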

So a string’s size is 18 + (2 * number of characters) bytes: 8 for the header, 4 + 4 for the two int32 fields and 2 for the null terminator, plus 2 for each character.  (In reality, another 2 bytes are sometimes used as padding to ensure 32-bit alignment, but I’ll ignore that.)  Two bytes are needed for each character, since .NET strings are UTF-16.  Thus, a string like so:


         String mySimpleString = "Hello";


will be 18 + (2 * 5) = 28 bytes.  Splendid.  Now consider this snippet of code.  It creates a String and a StringBuilder with the same contents.


           String atCompile = "123456789";
           StringBuilder buildingMyString = new StringBuilder("123");
           buildingMyString.Append("456");
           buildingMyString.Append("789");
           String fromStringBuilder = buildingMyString.ToString();


This should create two strings: atCompile and fromStringBuilder, which both read “123456789”.  How big are atCompile and fromStringBuilder?  You might think:


            18 + (2 * 9) = 36 bytes


But if you profile it in a memory profiler, like ANTS Profiler, you’ll see this.


[Screenshot: AMPStrings.png, the ANTS Profiler view of the two strings and their sizes]



The String atCompile is 36 bytes, as expected. fromStringBuilder has the same contents, but is 50 bytes.  Eh?  What’s going on there?


This weird behaviour is down to how System.StringBuilder does a ToString().  Like most people, I believed it allocated a new string and then copied the contents of the StringBuilder in.  In reality, it just returns a reference to the string underpinning the StringBuilder.  A StringBuilder works by using a string backed by a char buffer whose capacity grows in powers of two.  A string does not need to be backed by a char buffer matching the string length; ‘expansion room’ is permitted.  In this case, fromStringBuilder is backed by a 16-character buffer (not a 9-character one), so it is 18 + (2 * 16) = 50 bytes, as observed.
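The gap between the two figures can be seen by comparing the builder's Length with its Capacity. This is a hedged sketch rather than a measurement: it applies the article's 18 + 2n estimate to both values, and the class name is hypothetical. The default capacity is implementation-specific, and later versions of .NET changed StringBuilder internally so that ToString() copies into a right-sized string, so the over-sized result described here may not reproduce on a newer runtime.

```csharp
using System;
using System.Text;

static class BuilderCapacitySketch
{
    static void Main()
    {
        StringBuilder buildingMyString = new StringBuilder("123");
        buildingMyString.Append("456");
        buildingMyString.Append("789");

        int usedChars = buildingMyString.Length;      // characters actually stored
        int bufferChars = buildingMyString.Capacity;  // characters the buffer can hold

        // 18 bytes of fixed overhead + 2 bytes per char, as estimated in the article.
        Console.WriteLine("Exact fit:    " + (18 + 2 * usedChars) + " bytes");
        Console.WriteLine("Whole buffer: " + (18 + 2 * bufferChars) + " bytes");
    }
}
```

A profiler remains the only way to see what the runtime really allocated; the sketch only shows where the two numbers in the estimate come from.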


But what happens if the StringBuilder is then edited?  Wouldn’t the String we just got from ToString() become invalid?  It would, if the StringBuilder kept writing into the same buffer, so it doesn’t.  When you do the next append, StringBuilder copies the existing contents to a new string, and uses this new string from then on.  The String we got via ToString() continues to point to the string that has been discarded by the StringBuilder.  This String is still backed by an over-sized char buffer.
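The visible effect of this copy-on-next-edit scheme is easy to check: the string returned by ToString() never changes, whatever is done to the builder afterwards. A small sketch, reusing the variable names from the snippet earlier in the post (the class name is hypothetical):

```csharp
using System;
using System.Text;

static class SnapshotSketch
{
    static void Main()
    {
        StringBuilder buildingMyString = new StringBuilder("123");
        buildingMyString.Append("456");
        buildingMyString.Append("789");

        string fromStringBuilder = buildingMyString.ToString();

        // Editing the builder after ToString() must not change the string.
        buildingMyString.Append("0");

        Console.WriteLine(fromStringBuilder);           // 123456789
        Console.WriteLine(buildingMyString.ToString()); // 1234567890
    }
}
```

This shows only the observable contract, which holds however the builder is implemented; whether the copy happens at ToString() or at the following Append() cannot be seen from the output.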


I presume Microsoft made this design decision because it is a common idiom to create a StringBuilder, append to it, ToString() it, and then never use it again.  Copying all those bytes from the StringBuilder to a String would be a waste.  Even if you do append to the StringBuilder after your ToString(), the resulting copy of the StringBuilder’s underlying string takes no more time than copying it during the ToString() would have.


Comments

 

iammuruga said:

Interesting Topic, good to see object's memory allocation too.

Thanks for sharing.
January 27, 2009 12:08 PM
 

Figo Chen said:

A string size is 18 + (2 * number of characters) bytes.
January 29, 2009 4:03 AM
 

TheCPUWizard said:

Surprisingly, this behaviour is NOT known by many developers. For "small" strings it typically does not matter, but for large strings it can be a killer.

Consider any string with a char count between (approx) 32K and 40K [the sore spot], that is, 64K to 80K bytes of character data. Because of the doubling of size, all of these will have a StringBuilder buffer of 128K bytes (they just overflowed 64K), but have a true size under 80K bytes.

Not only is the memory waste potentially significant, but crossing the 85,000-byte threshold puts them on the LOH and triggers all of the associated consequences.

January 29, 2009 10:00 AM
 

Jason Crease said:

Very interesting comment TheCPUWizard.  Thanks!
January 29, 2009 10:03 AM
 


About Jason Crease

Jason Crease joined Red-Gate in 2006, and works as a software tester in the .NET division. He is currently working on Reflector Pro.

















