Object Overhead: The Hidden .NET Memory Allocation Cost

When developing a .NET application, one of the least visible sources of memory consumption is the overhead required by an object simply to exist. In applications that create a lot of small objects, this overhead can be a major or even a dominant factor in the total memory requirements for the application.

The amount of memory required to keep an object in memory varies depending on whether the application is running as a 32-bit or a 64-bit process, and it can be surprisingly high. On a 32-bit system, every object has an 8-byte header – which means that in most cases an object needs at least 3 fields before less than half of its memory is overhead. This isn’t the whole story, though: in order to exist, the object has to be referenced from somewhere, which increases the amount of memory needed for an object simply to exist to 12 bytes.
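To see this overhead directly, a rough measurement can be made by allocating a large number of tiny objects and comparing the managed heap size before and after. The sketch below is illustrative only; the exact per-object figure depends on the runtime version and on whether the process is 32-bit or 64-bit.

    using System;

    class Tiny { public int Value; }        // a deliberately minimal class: one 4-byte field

    static class OverheadDemo
    {
        static void Main()
        {
            const int count = 1000000;
            var keep = new Tiny[count];     // allocated first so the array itself isn't counted below

            long before = GC.GetTotalMemory(true);
            for (int i = 0; i < count; i++) keep[i] = new Tiny();
            long after = GC.GetTotalMemory(true);

            // Expect roughly 12 bytes per object on 32-bit (8-byte header + 4-byte field;
            // the reference lives in the pre-allocated array) and roughly 24 bytes on 64-bit.
            Console.WriteLine("Bytes per object: {0}", (after - before) / (double)count);
            GC.KeepAlive(keep);
        }
    }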

On 64-bit systems, the situation is worse. The object header increases to 16 bytes, and 8 bytes are required for a reference, so every object needs 24 bytes simply to exist. If an application’s memory usage is dominated by many small objects, switching from 32-bit to 64-bit will make the situation much worse, not better! The out-of-memory conditions might be avoided, but the resource requirements might increase by up to a factor of 3.

This overhead imposes limits on the number of objects that can reasonably be created by a .NET application. The 12-byte cost to exist suggests that the maximum number of objects that can be created on a 32-bit system is around 170 million – but this many objects would be useless, as no data could be associated with them. Adding 4 bytes of data decreases the limit to around 130 million, but then 75% of the memory used by the application will be overhead, which is very inefficient.
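The arithmetic behind those figures is straightforward, assuming roughly 2 GB of usable address space for a 32-bit process:

    using System;

    static class ObjectLimits
    {
        static void Main()
        {
            // Back-of-the-envelope limits, assuming ~2 GB of usable address space for a 32-bit process.
            const long usable = 2L * 1024 * 1024 * 1024;

            Console.WriteLine(usable / 12);   // header + reference only: ~178 million objects
            Console.WriteLine(usable / 16);   // with 4 bytes of data each: ~134 million objects
        }
    }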

A more practical way of looking at the number of objects that can reasonably be created is to think about the desirable level of efficiency. For a given proportion of memory given over to .NET infrastructure, it’s possible to work out how many objects should exist and what their average size should be. To reduce the overhead to 10%, for example, each object on a 32-bit system must store an average of around 80 bytes of data – for a total size of 88 bytes, 10% of which is the 8-byte header. This suggests a more reasonable limit of around 24 million objects. On 64-bit systems, the objects should be around twice as large to achieve the same efficiency.

This sounds hard to achieve – 80 bytes usually means that objects must have 10 fields, and such a large number of fields is usually considered a bad ‘code smell’, indicating classes that are too complex and in need of refactoring. There’s a simple practical solution, though: design the application so that any bulk data storage it requires is done with arrays of value types – an array’s overhead is fixed no matter how large it is, and value types have no overhead at all if they are not boxed. Raw arrays or the collection classes found in System.Collections.Generic are both suitable for this purpose. The details of how these arrays are accessed can be hidden by creating classes that provide an abstract interface to the data they represent, and only keeping instances of these classes in memory for the parts of the data that are actually being used at any given point in time.
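As a rough illustration of this approach, the following sketch (the type names and the three-float record are invented for the example) keeps the bulk data in a single list of value types and creates a small accessor object only for the record currently being worked with:

    using System.Collections.Generic;

    // Value type: no object header, and no per-record reference is needed.
    struct PointData
    {
        public float X, Y, Z;
    }

    class PointStore
    {
        private readonly List<PointData> _points = new List<PointData>();

        public int Add(float x, float y, float z)
        {
            _points.Add(new PointData { X = x, Y = y, Z = z });
            return _points.Count - 1;               // callers keep an int index, not a reference
        }

        // An accessor is created only for the record currently in use.
        public PointAccessor GetPoint(int index) { return new PointAccessor(this, index); }

        internal PointData GetData(int index) { return _points[index]; }
        internal void SetData(int index, PointData value) { _points[index] = value; }
    }

    // A short-lived wrapper that presents one record as if it were an ordinary object.
    class PointAccessor
    {
        private readonly PointStore _store;
        private readonly int _index;

        internal PointAccessor(PointStore store, int index) { _store = store; _index = index; }

        public float X
        {
            get { return _store.GetData(_index).X; }
            set { var d = _store.GetData(_index); d.X = value; _store.SetData(_index, d); }
        }
        // Y and Z would follow the same pattern.
    }

With this layout, millions of records pay only the single list’s overhead, and object headers are paid only for the handful of accessor instances that exist at any one time.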

Unfortunately, this solution may introduce a new form of inefficiency: heap fragmentation. If arrays are created and destroyed frequently, or resized frequently (which in .NET amounts to the same thing), the pattern of allocations and garbage collections can leave large holes in memory that reduce the size of the largest array that can be allocated. This problem can result in an application gradually running out of memory even though it has no memory leaks and its memory requirements are not otherwise increasing over time. I covered this issue and some possible ways to work around it in an earlier article.

All objects created by the CLR are subject to this hidden memory cost, which can result in an application using many times more memory than expected. For bulk in-memory data storage, swarms of small objects can push the cost up to unacceptable levels, especially on 64-bit systems. Reducing the number of objects kept in memory at any one time, perhaps by increasing the number of fields in individual objects or by storing bulk data in large data structures, is an effective way to increase the capacity and efficiency of .NET applications.


  • Sam Allen

    Excellent
    This is an excellent point. Often you store a large array of data (byte[]) and then use indexes to that data. The int indexes are value types so don’t count as objects. I reduced my application’s set by about 1500 objects this way. This technique is also described in Code Complete.
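    A minimal sketch of that technique (the 16-byte record size and the int field read here are assumptions for illustration) might look like this:

        using System;

        static class ByteBlockStore
        {
            const int RecordSize = 16;                                    // assumed fixed record size
            static readonly byte[] Data = new byte[RecordSize * 1000000]; // one object holds every record

            // A "handle" to a record is just an int offset: a value type, nothing for the GC to track.
            public static int HandleFor(int recordIndex) { return recordIndex * RecordSize; }

            public static int ReadFirstField(int handle) { return BitConverter.ToInt32(Data, handle); }

            public static void WriteFirstField(int handle, int value)
            {
                Buffer.BlockCopy(BitConverter.GetBytes(value), 0, Data, handle, 4);
            }
        }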

  • Hedwig

    Good
    Different view on .NET Memory management.

  • Nick Ellis-Gowland

    .Net Sockets
    You also have to think of ways to combat heap fragmentation when working with async System.Net.Sockets.

    On performance-oriented C# socket servers, byte arrays being pinned for long periods can lead to severe heap fragmentation.

    As you state in your article:
    “Reducing the number of objects kept in memory at any one time, perhaps by increasing the number of fields in individual objects or by storing bulk data in large data structures, is an effective way to increase the capacity and efficiency of .NET applications.”
    This is because the CLR puts large objects (over 80K) into the Large Object Heap, freeing up the CLR’s memory management.
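    One common mitigation (a sketch, not something from the article) is to allocate a single large buffer up front and hand out slices of it to each SocketAsyncEventArgs, so the pinned memory is one long-lived block rather than many small arrays scattered through the generational heaps:

        using System.Net.Sockets;

        // Hands out fixed-size slices of one big buffer to SocketAsyncEventArgs instances.
        class SocketBufferPool
        {
            private readonly byte[] _buffer;
            private readonly int _sliceSize;
            private int _nextOffset;

            public SocketBufferPool(int connections, int sliceSize)
            {
                _sliceSize = sliceSize;
                _buffer = new byte[connections * sliceSize];   // one allocation instead of many
            }

            public bool TryAssign(SocketAsyncEventArgs args)
            {
                if (_nextOffset + _sliceSize > _buffer.Length) return false;
                args.SetBuffer(_buffer, _nextOffset, _sliceSize);
                _nextOffset += _sliceSize;
                return true;
            }
        }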

  • TheCPUWizard

    Additional Items…
    A good post. A couple of things I think should be added.

    1) Avoid large objects [those that must go on the Large Object Heap] and you will avoid memory fragmentation [generational heaps do not fragment – unless you “pin” items]. Segmented arrays can easily achieve this while still getting high memory efficiency (see the sketch below).

    2) Consider the cost of a reference type object even existing. Every time the GC runs, it must walk the graph of live objects. It is the surviving objects that cost time, NOT the unreachable ones. So having a large number of objects is very often a sign that one is not following the create, use, abandon strategy. This is commonly seen in people with a C++ background who want to “keep an object around so I can use it later”.
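    A minimal sketch of the segmented-array idea from point 1, assuming a chunk size small enough that chunks of typical element types stay below the ~85,000-byte Large Object Heap threshold:

        using System;

        class SegmentedArray<T>
        {
            private const int ChunkLength = 1024;   // elements per chunk; kept small to stay off the LOH
            private readonly T[][] _chunks;

            public SegmentedArray(int length)
            {
                Length = length;
                _chunks = new T[(length + ChunkLength - 1) / ChunkLength][];
                for (int i = 0; i < _chunks.Length; i++)
                {
                    int remaining = length - i * ChunkLength;
                    _chunks[i] = new T[Math.Min(ChunkLength, remaining)];
                }
            }

            public int Length { get; private set; }

            public T this[int index]
            {
                get { return _chunks[index / ChunkLength][index % ChunkLength]; }
                set { _chunks[index / ChunkLength][index % ChunkLength] = value; }
            }
        }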

  • Ian Ringrose

    In most cases this is not a problem
    I have never seen this as a problem in a real-life .NET application – that is not to say it is never a problem…

    If I was writing the data viewer for a profiler in C# I would expect to hit this problem due to the volume of data. In most other applications if you have a large volume of data, you can keep most of the data in a database and just read into memory what you need.

    This is the sort of blog posting that is of great value to an experienced programmer, but may lead a lot of other programmers to write over-complex code for no good reason.

    I would say you should first design the application so that the code is simple and clear, and has good unit tests. Then (and only then), if you hit this type of problem, look at solutions like “storing any bulk data storage it requires by arrays of value types etc”. It will take a lot of work to abstract the data storage away from the rest of the application while keeping it fast enough, so don’t do it until it is NEEDED.

    Another thought: if the data is mostly read-only, once you have gone to the effort of packing it into arrays and bytes etc., you might wish to store it in files, keeping it out of the way of the garbage collector. The files will be cached into RAM by Windows anyway, so this can be faster than expected. (Or even write a data storage process in C++ that communicates over named pipes or shared memory.)
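    As a rough sketch of this file-backed idea (the 16-byte record layout is invented for the example), a memory-mapped file lets the packed data live outside the managed heap while still being cached in RAM by Windows:

        using System.IO.MemoryMappedFiles;

        static class FileBackedStore
        {
            // Reads one int field of a record straight from a memory-mapped file.
            // In real use the mapping would be created once and kept open, not per call.
            public static int ReadFirstField(string path, int recordIndex)
            {
                const int recordSize = 16;
                using (var mmf = MemoryMappedFile.CreateFromFile(path))
                using (var view = mmf.CreateViewAccessor())
                {
                    return view.ReadInt32((long)recordIndex * recordSize);
                }
            }
        }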

  • Stefan

    Strings – part 1.
    It is a good post; however, an empirical study using a sample application, or better a real-life application, would be very welcome. Automatic (garbage-collected) memory management is a major bottleneck of all such systems, including .NET, Java and academic languages such as Scheme or Standard ML. For example, splitting 1 GB of text into words by whitespace and newline in Standard ML takes 60 secs. when each word is actually created as a string, 6 secs. when using substrings (which are simply pointers into chunks of very large strings), and 3 secs. when an int is returned instead of the string.

  • Stefan

    Strings – part 2.
    For .NET, splitting 1 GB of text into strings using String.Split is even slower. As a result, any application which heavily uses strings in .NET is very slow. The same problem exists in Java, and authors of software resort to in-place updates using char arrays, which makes the code unmodular, complicated and potentially buggy. There are elegant solutions to this problem and I would have liked to see them explored.
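    One such approach (a sketch of the substring/index idea, not a complete solution) keeps the original text as a single string and records each word as an offset/length pair, so no per-word string objects are created until a word is actually needed:

        using System.Collections.Generic;

        static class WordIndexer
        {
            // Records each word as a (start, length) pair; the text itself is never copied.
            public static List<int> IndexWords(string text)
            {
                var spans = new List<int>();       // pairs: start0, length0, start1, length1, ...
                int start = -1;
                for (int i = 0; i <= text.Length; i++)
                {
                    bool isBreak = i == text.Length || char.IsWhiteSpace(text[i]);
                    if (!isBreak && start < 0) start = i;
                    if (isBreak && start >= 0)
                    {
                        spans.Add(start);
                        spans.Add(i - start);
                        start = -1;
                    }
                }
                return spans;
            }

            // A string is materialised only when one particular word is wanted.
            public static string WordAt(string text, List<int> spans, int wordIndex)
            {
                return text.Substring(spans[2 * wordIndex], spans[2 * wordIndex + 1]);
            }
        }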

  • Anonymous

    Interesting
    Of course the 24-byte object limit on a “64 bit system” (2^63, assuming 64-bit Windows splits its system/user memory at the top bit 🙂 ) is 384,307,168,202,282,325 objects. Yes, I know that seems a bit silly, but so did 4 GB PCs several years ago. Adopting a managed language platform seems to require a basic trade-off of memory and performance against well-written “direct” (compiled) code. This trade-off hopefully results in better code (fewer defects) written more quickly. I’d say that if your application starts getting anywhere near the limits you’ve mentioned, then it’s time to consider rewriting parts in unmanaged code using some form of custom memory management, as opposed to trying to refactor managed code in a way that increases the field/object count or attempts to “array away” memory issues.

  • Andrew Hunter

    Some responses
    There are some good points here, so I’ll address them in no particular order…

    This article was based on feedback we have been getting from the memory profiler, where people have been hitting a bug caused when trying to take snapshots of applications with more than 25 million objects on 32-bit systems. In this case, the header overhead must be a large factor in the memory usage of the application as it’s not possible to have that many objects in memory unless they are quite small.

    With 64-bit systems the problem isn’t running out of address space as such but running out of physical resources. It’s a more subtle problem: a program won’t usually stop working, but at some point performance will fall off a cliff when the application can no longer work without paging. This is more serious with .NET than it is with unmanaged code as a full garbage collection will visit all of the dormant objects, forcing a lot of paging if the system is low on physical RAM.

    The intention of the article was to suggest a way to identify possible causes of these problems before they occur and to describe the symptoms so that a diagnosis can be made when an application seems to be using too much memory for a given workload. We’ve occasionally seen this issue with Red Gate applications and it’s not intuitive what’s going on even with a memory profiler. I think the need for an object to have a reference as well as a header is usually what proves confusing.

    It’s not worth changing your code to take account of this unless you can see a clear issue – notably a small class with hundreds of thousands or millions of instances – but it is occasionally possible to do the maths and discover that this might be a limiting factor at the design stage, which can save a certain amount of heartache later on in a project. A good API design can hide this as an implementation issue (this also makes it easier to fix ‘after the fact’).

  • Charles Kincaid

    Two questions…
    … if you would, but first let me say that this is a great article.
    (1) Is an array in .NET a single object which contains sub-parts, or is it a container of objects like a collection or hashtable?

    (2) How does this play through to the Compact Framework? I seem to have almost no memory issues on the desktop, but very similar code runs out of memory in a hurry on my devices.

  • Anonymous

    Pack Tiny INTs into 32-bit and 64-bit INTs
    I experienced Andrew Hunter’s memory-wasting problem when writing a Smalltalk version of YACC. Smalltalk can store integers as unboxed value types that still behave like objects. The app generated parse tables consisting of many tiny integers describing gotos, shifts, reductions, etc. The integers were small enough that I could pack several of them into 32-bit normal integers. The space savings came from (1) 32-bit integers only consume 32 bits (no header; no reference; no boxing; yet object behavior) and (2) the tiny integers only consumed a few bits each, yet behaved like objects through an abstract interface. This technique would work even better on a 64-bit machine, and the packing and unpacking of tiny ints into normal ints took surprisingly little CPU.
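    A minimal sketch of the same packing idea in C# (the 4-bit entry width and table size are assumptions for illustration):

        static class PackedTable
        {
            const int BitsPerEntry = 4;                      // each tiny value fits in 4 bits
            const int EntriesPerInt = 32 / BitsPerEntry;     // 8 tiny values per 32-bit int
            const int Mask = (1 << BitsPerEntry) - 1;

            static readonly int[] Table = new int[10000];    // 80,000 tiny entries stored in 10,000 ints

            public static int Get(int index)
            {
                int shift = (index % EntriesPerInt) * BitsPerEntry;
                return (Table[index / EntriesPerInt] >> shift) & Mask;
            }

            public static void Set(int index, int value)
            {
                int slot = index / EntriesPerInt;
                int shift = (index % EntriesPerInt) * BitsPerEntry;
                Table[slot] = (Table[slot] & ~(Mask << shift)) | ((value & Mask) << shift);
            }
        }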

  • Anonymous

    QUESTION
    How can we prove that a function is not part of an object and a property is part of an object?

  • Edwin

    Maximum number of objects in relation to GetHashCode()
    Looking at the contract for GetHashCode(), which returns an integer, doesn’t this mean the maximum number of objects is limited to 2^31 (I have never seen negative hashcodes for objects), even on a 64-bit system? Otherwise GetHashCode() cannot guarantee to give a unique number. (It doesn’t guarantee to give a unique number during the lifetime of the application, but DOES guarantee the number is unique at every point in time. At some time, it is said it was implemented as the number of a sync block…)

  • rharding64

    Solid state drive question
    hi,

    I am an EE who cross-trained into MS .NET in 2004. Since that time, I have been tasked to create various tools that provide remote control of various hardware devices to assess their quality.

    In the course of collecting data over the port(s) (serial, LAN, WiFi, Bluetooth, microcontrollers with command semaphores), with standard mechanical hard drives I double-buffered my acquired data. The hard drive is a slow device, so its buffer is larger.

    The buffer on the ‘front end’ of the PC UART is sometimes not enough, so I use ‘FRAM’ buffers; these are ferromagnetic RAM that functions as a RAM FIFO (sometimes used as a drop-in replacement for EEPROM). What is important is that these jewels actually run read/write cycles at bus speeds, so you can push data through with little latency.

    I was just looking for some performance metrics with respect to SSDs so that, when I start to target tablets and the latest/greatest laptops – we will all have SSDs soon enough – I can estimate what I can do. I’m sure some folks have taken their Samsung 7 tablet or iPad and done some tests.

    A sort of side question that is important to me; maybe someone might know the answer:

    In my C# apps, I use timing event handlers as a back end to capture any missing data bytes in the stream. I always wondered if there is a way to do faster timing event handlers in .NET on the PC. My guess is that I have to do it using unmanaged code, or, if I stick with .NET, whether I can do it in the MSIL as inline assembly calls.

    I have used the Windows performance class, data-marshalled the native timing tools, and ended up with START and STOP tags of the stopwatch, so inside of C# I can at least see microsecond timing resolution.

    What I do is provide remote control of connected microcontrollers with semaphore symbols and packet passing that DO provide very fast timing.

    Some of my device targets, such as OEM electronic test equipment – digital storage oscilloscopes, waveform generators, network analyzers and the like – have their own timing circuitry on board that runs at frequencies up to and beyond 6 GHz. Using what is called SCPI (one of the IEEE comms protocols used for test gear control), which is much like XML, allows you to send/receive data streams in data acquisition sessions.

    I am currently studying USB 3.0 with SuperSpeed as well, as this at least starts to approach LAN speeds, along with the new USB development kit h/w and .NET drivers for same.

    Cheers

    Ron