Test Engineer - Red Gate Software

How big is a string in .NET?

Published 16 January 2009 10:32 am

Typically the size of an object is 8 bytes for the object header plus the sum of the fields.  Consider this simple object:


    class ThreeFields
    {
        double d;
        object o;
        int i;
    }


The size of a ThreeFields object is 8 bytes (for the header) + 8 bytes (for the double) + 4 bytes (for the object pointer, on a 32-bit CLR) + 4 bytes (for the integer) = 24 bytes.  But what about a string?

A string is composed of:

  1. An 8-byte object header (4-byte SyncBlock and a 4-byte type descriptor)
  2. An int32 field for the length of the string (this is returned by String.Length).
  3. An int32 field for the number of chars in the character buffer.
  4. The first character of the string in a System.Char.
  5. The rest of the string in a character buffer, ending with a null terminator char.  If you don’t believe me about the null terminator, have a poke around with Reflector.  String.AppendInPlace() demonstrates this nicely.  One reason this is done is to aid unmanaged interop.  Otherwise, every time you marshal a string you’d need to copy it and add the ‘\0’.

So a string’s size is 18 + (2 * number of characters) bytes.  The 18 is the 8-byte header, the two 4-byte int32 fields and the 2-byte null terminator.  (In reality, another 2 bytes are sometimes used for packing to ensure 32-bit alignment, but I’ll ignore that.)  Two bytes are needed for each character, since .NET strings are UTF-16.  Thus, a string like so:


    String mySimpleString = "Hello";


Will be 18 + (2 * 5) = 28 bytes.  Splendid.  Now consider this snippet of code.  It creates a String and a StringBuilder with the same contents.


    String atCompile = "123456789";
    StringBuilder buildingMyString = new StringBuilder("123");
    buildingMyString.Append("456");
    buildingMyString.Append("789");
    String fromStringBuilder = buildingMyString.ToString();


This should create two strings: atCompile and fromStringBuilder, which both read “123456789”.  How big are atCompile and fromStringBuilder?  You might think:


            18 + (2 * 9) = 36 bytes


But if you profile it in a memory profiler, like ANTS Profiler, you’ll see something different.



The String atCompile is 36 bytes, as expected. fromStringBuilder has the same contents, but is 50 bytes.  Eh?  What’s going on there?


This weird behaviour is down to how System.StringBuilder does a ToString().  Like most people, I believed it allocated a new string and then copied the contents of the StringBuilder in.  In reality, it just returns a reference to the string underpinning the StringBuilder.  StringBuilders work by using strings backed by char buffers that grow in powers of two.  A string does not need to be backed by a char buffer matching the string length; ‘expansion room’ is permitted.  In this case, fromStringBuilder is backed by a 16-character buffer, so it is 18 + (2 * 16) = 50 bytes, as observed.
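The arithmetic so far can be written down as a small sketch.  It only encodes the formula from this post (18 bytes of fixed overhead plus 2 bytes per character of backing buffer, on the 32-bit CLR discussed here); it does not measure anything.  The names StringSizeSketch, EstimateBytes and AlignUp are made up for illustration, and AlignUp is just my reading of the 32-bit alignment padding mentioned earlier.

```csharp
using System;

static class StringSizeSketch
{
    // Fixed overhead counted in this post: 8 (object header) + 4 (length)
    // + 4 (buffer length) + 2 (null terminator) = 18 bytes.
    const int FixedOverheadBytes = 18;

    // The post's formula, applied to the number of chars in the backing
    // buffer (which may be larger than the string's visible length).
    public static int EstimateBytes(int bufferChars)
    {
        return FixedOverheadBytes + 2 * bufferChars;
    }

    // Assumption: padding rounds the size up to the next multiple of
    // 'alignment' (4 for the 32-bit alignment mentioned in the post).
    public static int AlignUp(int size, int alignment)
    {
        return (size + alignment - 1) / alignment * alignment;
    }

    static void Main()
    {
        Console.WriteLine(EstimateBytes(5));              // "Hello": 28
        Console.WriteLine(EstimateBytes(9));              // atCompile: 36
        Console.WriteLine(EstimateBytes(16));             // fromStringBuilder: 50
        Console.WriteLine(AlignUp(EstimateBytes(16), 4)); // with padding: 52
    }
}
```

The only difference between the 36-byte and 50-byte results is which character count goes into the formula: the string's length (9) or the capacity of the buffer behind it (16).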


But what happens if the StringBuilder is then edited?  Doesn’t the String we just got from ToString() then become invalid?  It would if the StringBuilder kept writing into the same buffer, so it doesn’t.  When you do the next append, StringBuilder copies the existing contents to a new string, and uses this new string from then on.  The String we got via ToString() continues to point to the string that has been discarded by the StringBuilder.  This String is still backed by an over-sized char[].
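The visible half of this is easy to check.  The sketch below (ToStringSnapshotSketch and Run are illustrative names, not anything from the framework) takes a string from ToString(), appends to the StringBuilder afterwards, and returns both values.  It shows only that the earlier string is unaffected; it says nothing about when the copy happens or how big the backing buffer is, which you would need a memory profiler to see.

```csharp
using System;
using System.Text;

static class ToStringSnapshotSketch
{
    // Returns { string taken before the extra append, builder contents after it }.
    public static string[] Run()
    {
        StringBuilder builder = new StringBuilder("123");
        builder.Append("456");
        builder.Append("789");

        // Take the string, then keep editing the builder.
        string snapshot = builder.ToString();
        builder.Append("0");

        return new[] { snapshot, builder.ToString() };
    }

    static void Main()
    {
        string[] result = Run();
        Console.WriteLine(result[0]); // 123456789
        Console.WriteLine(result[1]); // 1234567890
    }
}
```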


I presume Microsoft made this design decision because it is a common idiom to create a StringBuilder, append to it, ToString() it, and then never use it again.  Copying all those bytes from the StringBuilder to a String would be a waste.  Even if you then append to the StringBuilder after doing your ToString(), the resulting copy of the StringBuilder’s underlying string takes no more time than copying it during the ToString() would have.

19 Responses to “How big is a string in .NET?”

  4. iammuruga says:

    Interesting Topic, good to see object’s memory allocation too.

    Thanks for sharing.

  6. TheCPUWizard says:

    Surprisingly, this behaviour is NOT known by many developers. For “small” strings it typically does not matter, but for large strings, it can be a killer.

    Consider any string with a char count between (approx) 32K and 40K [the sore spot]. Because of the doubling of size, all of these will have a StringBuilder buffer of 128K bytes (they just overflowed 64K bytes), but have a true size under 80K bytes.

    Not only is the memory waste potentially significant, but the breaking of the 80,000 byte boundary puts them on the LOH, and triggers all of the associated consequences.

  7. Jason Crease says:

    Very interesting comment TheCPUWizard. Thanks!

  8. mkisaacs says:

    I may be missing something, but your description of the StringBuilder and how it does the string copy seems to contradict what the Microsoft documentation and Reflector show the .Append() and ToString() methods are doing. Based on what I’m seeing using Reflector, the Append only creates a new instance of a backed string array when the array was too small to contain the text after the append (doubles the size). This also seems to match one reason for using the StringBuilder: to prevent allocating new strings for every string concatenation. Also, when the ToString method is called, it DOES appear to do a copy of the string if it is smaller than the string array size. In one edge case, the ToString method returns the actual string (not a copy), but only if the string length is the same as the string ArrayLength and the class’s Thread matches the current thread. See the ToString() and Append(String) methods. Am I missing a piece of the puzzle?

  9. Jason Crease says:

    Hi mkisaacs:
    1. “Based on what I’m seeing using Reflector, the Append only creates a new instance of a backed string array when the array was too small to contain the text after the append (doubles the size). ”
    Yes. The backing char[] doubles (roughly) in size when the Append goes beyond the current length.
    2. “Also, when the ToString method is called, it DOES appear to do a copy of the string if it is smaller than the string array size. ”
    By using Reflector and ANTS Memory Profiler, I found that it doesn’t do a copy until an edit to the StringBuilder occurs, i.e. once a ToString is done, the StringBuilder thinks of its backing char[] as ‘copy-on-write’. This is a good performance optimization, especially since StringBuilders are frequently used as accumulators until they are ToString’ed and discarded.
    I’m not sure why you were seeing this behaviour; it contradicts the evidence that I’ve seen, and would be wasteful.

    There is a heated discussion of this post on Reddit: http://www.reddit.com/r/programming/comments/7th9z/how_big_is_a_string_in_net_well_that_depends_is/

  10. mkamoski says:

    Jason — Regarding this… “In reality, another 2 bytes is sometimes used for packing to ensure 32-bit alignment, but I’ll ignore that” …why are we ignoring reality? Seriously, to ignore this point makes unit testing impossible, no? Just a layman making an observation. — Mark Kamoski

  11. Jason Crease says:

    Mark K – Yes, thanks for picking me up on this. .NET pads all objects to 4-byte or 8-byte boundaries (depending on CLR bitness). The size of the object remains the same, but the amount of memory used is always a multiple of 4 or 8 bytes. E.g. an 18-byte string will use 20 bytes on x86 and 24 bytes on x64.

    Two effects of this behaviour:
    1) It makes you run out of memory faster with lots of small, non-aligned objects.
    2) If you start looking at memory addresses (e.g. via the Marshalling stuff in .NET), you’ll see this alignment.

    Please note that string sizes have changed slightly in .NET 4.0 – I think they’re now always 2 bytes smaller.