Great SQL Server Debates: Buffer Cache Hit Ratio

One of the more popular counters used by DBAs to monitor performance, the Buffer Cache Hit Ratio, is useless as a predictor of imminent performance problems. Worse, it can be misleading. Jonathan Kehayias demonstrates this convincingly with some simple tests.

Many years ago, when I first started working with SQL Server, there were a number of performance counters that were on the radar of all conscientious DBAs, and were used to track SQL Server performance and assess the general health of a server. One of those counters was SQLServer:Buffer Manager\Buffer Cache Hit Ratio, described as follows in the Books Online topic for the SQL Server:Buffer Manager Object:

“Percentage of pages found in the buffer cache without having to read from disk. The ratio is the total number of cache hits divided by the total number of cache lookups over the last few thousand page accesses. After a long period of time, the ratio moves very little. Because reading from the cache is much less expensive than reading from disk, you want this ratio to be high. Generally, you can increase the buffer cache hit ratio by increasing the amount of memory available to SQL Server.”

Commonly, this definition is interpreted like this: if the value of the Buffer Cache Hit Ratio (BCHR) counter is “high”, then SQL Server is efficiently caching the data pages in memory, reads from disk are relatively low, and so there is no memory bottleneck. Conversely, if the BCHR value is “low”, then this is a sure sign that SQL Server is under memory pressure and doing lots of physical disk reads to retrieve the required data. Prevailing wisdom suggests that “low” is less than 95% for OLTP systems, or less than 90% for OLAP or data warehouse systems.

This article will, I hope, convince you that this interpretation of the BCHR counter value is completely incorrect and very misleading. In fact, I will prove that it’s entirely possible for SQL Server to be under significant memory pressure while displaying a value for BCHR that, viewed in isolation, would lead a DBA to assume that SQL Server was in fine health. At the same time, I’ll demonstrate that there are far better counters for tracking memory usage patterns in SQL Server and for diagnosing potential memory issues.

As a consultant, the BCHR is one performance counter that I never use in my work, and I stopped using it as soon as I discovered exactly why its value can be so misleading, and how misguided were the prevailing ideas about its meaning.

Necessary memory-related counters

In my recent book on Troubleshooting SQL Server, I stress time and again that no single piece of information should be relied on to accurately diagnose a SQL Server problem. Unfortunately, many DBAs still do regard the BCHR counter value, viewed in isolation, as a useful indicator of the health of a SQL Server. Not only does this lead people to misdiagnose SQL Server performance, it also means that they are ignoring counters that could give vital supporting evidence, and in fact are often much more effective at tracking down memory issues. These memory counters, all belonging to the SQLServer:Buffer Manager object, include the following (definitions extracted from the previous Books Online link):

  • Page reads/sec: Number of physical database page reads that are issued per second. This statistic displays the total number of physical page reads across all databases. Because physical I/O is expensive, you may be able to minimize the cost, either by using a larger data cache, intelligent indexes, and more efficient queries, or by changing the database design.
  • Free Pages: Total number of pages on all free lists (free lists track all of the pages in the buffer pool that are not currently allocated to a data page, and are therefore available for immediate use).
  • Page Life Expectancy: Number of seconds a page will stay in the buffer pool without references.
  • Free List Stalls/sec: Number of requests per second that had to wait for a free page.

Consider, for example, the SQLServer:Buffer Manager\Page reads/sec counter. Without full knowledge of what the BCHR counter value really means, it seems reasonable to assume that if the BCHR value is high, then the Page reads/sec counter value should be low, since the high BCHR means that we don’t have to hit disk to retrieve the required data, right? Let’s find out!

Set up: databases, tables, memory counters and a load generator

To investigate the true nature of the BCHR, we’ll need to perform some tests on an isolated instance of SQL Server. In my tests, I used a dedicated VM, with SQL Server Agent in a stopped state, and no other activity at all on the instance, aside from the tests themselves.

To follow along with this demo, you’ll need to install on your test instance the AdventureWorks and AdventureWorks2008R2 databases, both downloadable from Codeplex. In addition, you’ll need to run the script in Listing 1 to create the LowBCHR test database, and in it a sample table, TestTable, of a known size.
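
The original script isn’t reproduced here, so what follows is a minimal reconstruction; the database and table names match the article, but the column layout and row count are my own assumptions, chosen simply to produce a table comfortably over 1GB:

-- A sketch of Listing 1. The schema here is an assumption; any wide
-- table of around 1.5 million ~1KB rows reproduces the setup described.
CREATE DATABASE LowBCHR;
GO
USE LowBCHR;
GO
CREATE TABLE dbo.TestTable
(
    RowID  INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Filler CHAR(1000) NOT NULL DEFAULT 'X'   -- pads each row to ~1KB
);
GO
-- ~1.5 million 1KB rows yields a table of roughly 1.5GB
INSERT INTO dbo.TestTable (Filler)
SELECT TOP (1500000) 'X'
FROM sys.all_columns AS ac1
CROSS JOIN sys.all_columns AS ac2;
GO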

Listing 1: Creating the sample LowBCHR database and TestTable table

This TestTable table is larger than 1GB in size, as confirmed in Listing 2, using the sp_spaceused built-in stored procedure.
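
The check itself is a one-liner (sp_spaceused reports rows, reserved, data, index_size and unused for the table):

USE LowBCHR;
GO
-- Confirm the table exceeds 1GB of reserved/data space
EXEC sp_spaceused N'dbo.TestTable';
GO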


Listing 2: TestTable is over 1GB in size

In Listing 3, we set the ‘max server memory’ option on the instance to 1024 MB, a value lower than the size of the table, meaning that SQL Server can’t cache the whole table in memory. We then clear all the caches on the instance so that we have a cold system against which to run our tests.
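
A sketch of the commands involved (standard sp_configure and DBCC syntax; run this only on a test instance, since dropping the caches on a production server will hurt performance):

-- Cap the buffer pool below the size of TestTable
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 1024;
RECONFIGURE;
GO
-- Start cold: flush dirty pages, then drop the data and plan caches
CHECKPOINT;
DBCC DROPCLEANBUFFERS;
DBCC FREEPROCCACHE;
GO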

Listing 3: Setting ‘max server memory’ to 1024 MB and clearing the caches

Next, we need to create our performance counter collection set in Windows, in order to capture the values of BCHR, Page reads/sec, Free Pages, Page Life Expectancy, and Free List Stalls/sec, all from the SQLServer:Buffer Manager object.
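
The collection set can be built interactively in Performance Monitor, as in Figure 1 below. If you prefer to spot-check the same counters from T-SQL, they are also exposed through sys.dm_os_performance_counters; the sketch below is mine, not part of the original article. Note that BCHR is stored as two raw values and must be computed as ratio divided by base, and that per-second counters in this DMV are cumulative since instance start, so you must sample twice and take the difference (counter names as of SQL Server 2008 R2):

-- Current buffer cache hit ratio, computed from ratio and base
SELECT (a.cntr_value * 100.0) / NULLIF(b.cntr_value, 0) AS [Buffer cache hit ratio %]
FROM sys.dm_os_performance_counters AS a
JOIN sys.dm_os_performance_counters AS b
    ON a.[object_name] = b.[object_name]
WHERE a.[object_name] LIKE '%Buffer Manager%'
  AND a.counter_name = 'Buffer cache hit ratio'
  AND b.counter_name = 'Buffer cache hit ratio base';

-- Raw values for the other counters used in these tests
SELECT counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE [object_name] LIKE '%Buffer Manager%'
  AND counter_name IN ('Page reads/sec', 'Free pages',
                       'Page life expectancy', 'Free list stalls/sec');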


Figure 1: Setting up the performance counter collection set in Windows

With the collection set created, we are ready to begin logging the counter values as we execute various SQL loads against the instance. In my test, I generated this load using a tool called SQL Load Generator, which is freely downloadable from Codeplex. If you wish to follow along, you’ll need to download this tool, or something similar (if you don’t want to install a tool, you may be able to achieve similar effects using the GO x trick and lots of SSMS tabs!).

Testing the behavior of BCHR

We’ll run a series of tests to investigate the behavior of BCHR, under various workloads and conditions. A variable query load, comprising three queries, will be executed against three different databases. In these tests, the load was generated using the previously-referenced SQL Load Generator tool. The query in Listing 4 will be executed against both the AdventureWorks and AdventureWorks2008R2 databases.
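
The exact query text from the original listing isn’t reproduced here; the stand-in below is my own assumption, using tables that exist in both AdventureWorks versions. All that matters is that it reads a modest, repeatable set of pages:

-- Hypothetical stand-in for Listing 4: a moderate read-only join.
-- Run one copy against AdventureWorks and one against AdventureWorks2008R2.
USE AdventureWorks;
GO
SELECT soh.SalesOrderID, soh.OrderDate, sod.ProductID, sod.LineTotal
FROM Sales.SalesOrderHeader AS soh
JOIN Sales.SalesOrderDetail AS sod
    ON sod.SalesOrderID = soh.SalesOrderID;
GO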

Listing 4: The query against AdventureWorks and AdventureWorks2008R2

This will be the “base” workload in our tests. When we need to create some memory pressure, we’ll introduce to the workload the query in Listing 5, which executes against TestTable in our LowBCHR database.
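
Again, the original query text isn’t shown here; the stand-in below is my assumption, and anything that forces a full scan of the table will behave the same way:

-- Hypothetical stand-in for Listing 5: forces a full scan of TestTable,
-- which is larger than the 1GB buffer pool, so pages must churn through cache.
USE LowBCHR;
GO
SELECT COUNT(*)
FROM dbo.TestTable;
GO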

Listing 5: The query against the LowBCHR database

Test 1: Behavior under base test load

With the system still idle, start the performance counter data collection; the counter values should be stable at this point.


Figure 2: Initial, stable values of the performance counters

BCHR will be at 100%, Free Pages will be high (it depends on your specific system, but on the test system it was in the 128,602 range after freeing the caches), and Page Life Expectancy will increase by one for each second that passes (the starting value on my test system was 1034).

To create our initial workload, the query in Listing 4 will be run against AdventureWorks and AdventureWorks2008R2. This will cause roughly 20MB per database to be read from disk into the buffer pool.


Figure 3: Starting the initial data load

When the queries begin execution, you should see an initial dip in the BCHR value on the instance, and a corresponding spike in Page reads/sec, as the required pages are read from disk into the buffer cache, as shown in the data collected by Performance Monitor (Figure 4).


Figure 4: Memory counter behavior under initial load

However, the Page Life Expectancy continues to increase by a value of one every second and, even though the two queries continue to execute against the databases, the system shows near-zero impact aside from that initial drop in BCHR.

Test 2: Behavior under maximum test load

The behavior observed in the initial test is what people generally expect with regards to BCHR value, and it shapes many people’s understanding of what the value is telling them. If this is the case with you, then prepare for a shock as, alongside our other two queries, we start up the third query against our LowBCHR database, and generate some serious memory pressure!

The TestTable query (Listing 5) is going to require a full scan of the table, and the table is larger than the available buffer pool size, because we constrained ‘max server memory’ to 1GB. Bear in mind that the queries against the AdventureWorks and AdventureWorks2008R2 tables will also be competing for this 1GB of buffer pool memory.

With the other queries still running, introduce the TestTable query to the load, simulating five concurrent sessions executing the query. The impact on our counter values is shown in Figure 5.


Figure 5: Memory counter behavior under memory pressure

Imagine for a second that we were only monitoring the BCHR value; as you can see, it shows barely a blip, staying comfortably above the “recommended” 95% value the entire time. According to this, our server is just fine.

The other counters, however, reveal the true story: our server is under significant memory pressure and experiencing constant data cache churn, as evidenced by the fact that Page Life Expectancy bottoms out at 0 to 1, the number of page reads/sec skyrockets, and the number of Free Pages fluctuates but is generally below 200 for the duration of the test. Figure 6 shows a snapshot report of each counter value, at some point during the test.


Figure 6: Snapshot counter values during Test 2

So why does the BCHR counter value fail to respond to the memory pressure? The reason is that the read-ahead mechanism in SQL Server is keeping our buffer cache populated with the required pages. As long as the disk array can keep up with the I/O demands of the query workload, and the read-ahead I/Os can read the data in large enough blocks to keep the data churning in cache, the required data will continue to be found in cache, rather than on disk, at the point the query processor needs it during query execution. As such, the BCHR will remain above 95%.
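
You can watch read-ahead doing this work for an individual query with SET STATISTICS IO, which reports read-ahead reads separately from ordinary physical reads. This snippet is mine, and the counts in the comment are illustrative, not taken from these tests:

USE LowBCHR;
GO
SET STATISTICS IO ON;
SELECT COUNT(*) FROM dbo.TestTable;
-- The Messages tab shows something like:
--   Table 'TestTable'. Scan count 1, logical reads 187502,
--   physical reads 3, read-ahead reads 187498, ...
-- i.e. almost every page was in cache by the time it was needed.
SET STATISTICS IO OFF;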

Test 3: Behavior under base test load with read-ahead disabled

The results of the previous test demonstrate that all the BCHR value really tells us is whether or not the read-ahead mechanism is operating efficiently; it tells us nothing of the memory pressure, and buffer cache churn, that the server may be experiencing.

To prove this, we can disable the read-ahead mechanism in SQL Server using Trace Flag 652, and then rerun our tests. Alternatively, we could place our LowBCHR database on an incredibly slow disk array, for example a USB thumb drive, instead of the internal SSDs in my laptop, to significantly reduce the I/O throughput available for read-ahead operations.
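
The trace flag route is the simpler of the two; it is undocumented, so use it only on a test instance:

-- Disable read-ahead instance-wide for the duration of the test...
DBCC TRACEON(652, -1);
GO
-- ...and re-enable it once testing is complete (see Test 5)
DBCC TRACEOFF(652, -1);
GO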

To prepare for this test, flush the data caches once again using the relevant code from Listing 3, and then capture another initial counter baseline for the instance, as shown in Figure 7, which was taken after waiting a few minutes to allow the Page Life Expectancy to recover.


Figure 7: Initial counter baseline for Test 3

As soon as we enable Trace Flag 652 for the entire instance, using DBCC TRACEON(652, -1), we take a hit on the BCHR that is higher than any of the previous hits taken during our tests, as shown in Figure 8. The cause of this dip isn’t entirely clear (since there is no activity on the server at this stage) but it is reproducible.


Figure 8: Performance counter value variation upon enabling Trace Flag 652

Wait a few minutes to allow the BCHR value to recover and then, in SQL Load Generator, start the two queries against AdventureWorks and AdventureWorks2008R2, as per Test 1 (Figure 3).

The dip in the BCHR value is bigger than we observed in Test 1, but it recovers quickly, as shown in Figure 9.


Figure 9: Memory counter values: behavior under base load, with read-ahead disabled

Test 4: Behavior under increased load (read-ahead disabled)

With the two queries still running, start the third query against TestTable, but using only a single session, as shown in Figure 10.


Figure 10: Introducing the TestTable query to the workload (1 session)

This time, with page read-ahead disabled, the BCHR value for the instance drops below the accepted value (to a value of around 90-91% in my tests), and remains there for the duration of the test. Figure 11 was taken after running the three queries for nearly 20 minutes.


Figure 11: Memory counter values after 20 mins increased load, with read-ahead disabled

Test 5: Behavior under maximum test load (read-ahead disabled)

Finally, let’s up the memory pressure one more notch, by changing the setup of the SQL Load Generator to use five concurrent sessions for the TestTable query, replicating the conditions of our “maximum load” in Test 2.


Figure 12: Maximum TestTable workload (5 concurrent sessions)

Remember that under this same load, with read-ahead enabled, we saw almost no impact on the values of the BCHR counter. Performing the same test with read-ahead disabled, the impact on the BCHR value is very significant. As soon as multiple copies of the TestTable query are running concurrently, without read-ahead enabled, the BCHR value bottoms out and never recovers completely. As shown in Figure 13, it is consistently low, with multiple near-zero values that persist for long durations. In conjunction with the behavior of the other counter values (very high Page reads/sec, zero Page Life Expectancy), this indicates that, for certain periods, the query processor had to wait entirely on pages to be read from disk into cache in order to execute any of the 15 concurrent requests in the test workload.


Figure 13: Memory counter values under maximum test load (read-ahead disabled)

My laptop test machine has a quad core i7 processor and dual solid state drives, with very low latency I/O response, but even so, as a result of the memory pressure, and without the benefits of read-ahead I/O operations, the pages generally weren’t in memory before they were required by the query processor for execution. In these tests, the I/O for the LowBCHR database was consistently 8K average per read (although this isn’t to suggest that there might not be conditions where it is still possible to get multi-page I/O or extent-based I/O).


Figure 14: With read-ahead disabled we see 8K average per read
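
The per-read average shown in Figure 14 can be confirmed from sys.dm_io_virtual_file_stats, which accumulates I/O statistics per database file since instance startup; this query is my sketch, with the averaging arithmetic as the only addition:

-- Average bytes per read against the LowBCHR files since startup
SELECT DB_NAME(vfs.database_id) AS database_name,
       vfs.file_id,
       vfs.num_of_reads,
       vfs.num_of_bytes_read / NULLIF(vfs.num_of_reads, 0) AS avg_bytes_per_read
FROM sys.dm_io_virtual_file_stats(DB_ID('LowBCHR'), NULL) AS vfs;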

If we re-enable read-ahead with DBCC TRACEOFF(652, -1), the BCHR value once again returns to the ‘acceptable’ range.


Figure 15: Recovery of counter values, upon re-enabling read-ahead

Conclusions

Many people continue to track the value of Buffer Cache Hit Ratio as an “early-warning” indicator of potential memory issues on a SQL Server instance. What these tests prove, I hope, is that in this regard the value of BCHR is more or less meaningless. BCHR only responds to significant memory pressure in conjunction with I/O subsystem pressure, or possibly fragmentation; that is, under conditions that impede page read-ahead to the point that SQL Server becomes much less effective at populating the data cache with the required pages before the query processor actually requires them.

To put it another way, the BCHR early-warning alert is only raised once the house is already burning down. By the time the BCHR has sustained values below 95% for an OLTP workload, the server has been experiencing significant performance problems for a long time, and the other counters on the system would have yielded far better diagnostic results.

Comments

  • Aditya

    Super Stuff!
    Jonathan,

    Thanks for demystifying it in a clear way. I am also not a fan of BCHR and I depend on PLE instead for memory pressure. Now I can throw your example at all who are BCHR fans… Thanks once again

    -Aditya

  • Nelson

    BCHR debunked
    I stopped relying on the BCHR (in Oracle databases) when I heard how easy it was to manipulate.

    I heard that some DBA bonuses were tied to the BCHR.

    An Oracle expert, Connor McDonald, wrote a little PL/SQL procedure that would perform totally unnecessary I/O to bring the BCHR up to the target number. You passed in your target BCHR as a parameter. This was not done to increase the bonuses, but more as proof that the hit ratio was not meaningful in tuning a database.

    It was very effective in convincing me that the BCHR was just a nice number, not a true database performance indicator.

    Nelson

  • Steve

    Great article!
    I plan on recreating your steps on my own VM to help it all sink in.

  • Eric

    Excellent
    Awesome article. I always wondered why BCHR never indicated issues on systems I know were under intense memory pressure. I too have been relying on PLE mostly for quite some time.

    I think MS should seriously consider altering the text description for the counter to include this information.

  • Sqldiva

    What a revelation!
    Thanks Jonathan for your amazing research in this paper. Clearly BCHR is SQL’s very own urban myth. You are right; there is no magic bullet and, thanks to the SQLskills team and the great information you share, performance analysis is becoming more of a science instead of an art!

  • Michel

    Question
    Really well explained article!

    I am missing something though…

    If read-aheads are fast enough to get the needed pages in the buffer (i.e. BCHR = 100), wouldn’t query performance be the same?
    In other words, BCHR might not say whether you have enough memory, but it does tell you memory (or slow disks) is causing queries to suffer.

    Cheers,
    Michel

  • Radu

    Great insight!
    Thank you, Jonathan!

    Great experiment with very clear explanation! I wish there were more articles like this.

    Could the BCHR temporary dip in Fig. 8 be explained as follows?
    — the read-ahead might use some other cache of its own before making the data available to the buffer cache
    — system processes (using the system cache) are relying on that stream of data
    — when the read-ahead is stopped, its “pre-cache” is lost and the stream of data for those system processes has some catching-up to do until it recovers the lost portion.

    Let me know if my rambling took me off too much 🙂

    Radu

  • nmcdermaid

    I would love to see…..
    …a diagram showing the data’s journey from disk to when it’s ready to be sent to the client, along with which counters indicate bottlenecks at which layer. A picture tells a thousand words.

    Putting it very simply (forgive me if this is inaccurate), we have the physical disk, some kind of physical disk cache, Windows memory management, and SQL Server memory management, consisting of some SQL Server read-ahead magic and the buffer that it reads ahead into.

    For a casual DBA like me this would be great to see.

    …an idea for those more devoted and knowledgeable than myself.

  • Alex

    Excellent article
    Started reading this over the past weekend during a monthly shutdown at work while I waited for others to complete their work. Experimented using a Parallels VM with Server 2008 R2 and SQL Server 2008 R2…the results matched your examples perfectly.

    Even better, I applied your examples to solving a problem in production. A purge process we were running against a production database hit a snag. After we gave up on the process, I tried to figure out why the total PAGEIOLATCH_SH wait type was so high. I decided to revisit the process in our test environment since I couldn’t really run it in production during production hours.

    By chance I decided to turn on the same counters used in this example…and BAM…like lightning it became clear what the culprit was. A combination of memory pressure and higher than normal disk activity…and the whole time BCHR appeared to be just perfect!

    Thanks again Jonathan! This article was great and kept a few more hairs on my head!

  • DivineFlame

    Great Article !!!
    It was great reading this article. This cleared all the doubts that I had regarding the memory counters and their usage. Thanks Jonathan !!!

  • v.vt

    Great Article
    Thanks Jonathan .. Bye bye to BCHR

  • Nirav Gajjar

    Great Article
    hi,
    Jonathan Kehayias
    Really very nice article

    Thank You, i really need it.

  • mudit

    good article
    Thanks for the article. It explains BCHR very well.
    I mostly rely on PLE for determining memory pressure. I would like to ask you a question: is it fair to say that if PLE is low, Free Pages will also be low, and vice versa?

    Can there be cases where Page Life Expectancy stays low but Free Pages goes from a low value to a high value during the same period?
    Thanks,

  • Guy Glantser

    Excellent Article!
    Jonathan, you did it again! I enjoyed reading this article very much!
    Two years ago I prepared for a presentation that I was going to do at Tech-Ed, and I wanted to simulate memory pressure. I wanted to show a decline in BCHR as a sign of memory pressure. So I set up an environment similar to the one you described in your article, performing a table scan on a table that is larger than the size of the cache. No matter what I did, I couldn’t get that counter below 99%. I verified that there is a lot of IO activity. I also verified that pages in the buffer pool are changing all the time by examining sys.dm_os_buffer_descriptors. Everything else showed that the pages are read from disk to cache all the time, and still that BCHR stayed above 99%.
    It drove me crazy. I was at Tech-Ed already, a day before my presentation, and that part of the presentation was still not ready because I couldn’t demonstrate what I wanted with the BCHR counter. Then I met Sunil Agarwal from Microsoft. I showed him my scenario and asked him why the BCHR counter wouldn’t drop. Sunil took a very quick look and immediately explained to me that this behavior is due to the read-ahead mechanism, just like you explained here.
    It was a shock to me. I investigated it some more, and I realized how it works. I found other ways to demonstrate memory pressure in my presentation, and I haven’t used the BCHR counter for diagnosing memory pressure since then.
    I planned to write a post about it, but I never got around to actually doing it. I’m so glad that you did, because I think it’s very important for every SQL Server DBA to understand, and you did a wonderful job explaining it.
    Thank you!
    Thank you!

  • PhilY

    Free List Empty counter readings
    Excellent article Jonathan. You mention "necessary" memory counters, one of which is Free List Stalls/sec. Do you happen to know how this relates (if at all) to the "Buffer Partition:Free List Empty" counter? The definition for this counter is ‘Number of times per second a free page was requested and none was available.’, which sounds like it ought to lead to a wait condition of some sort.

    Our system has a consistently zero value for free list stalls (which is good), but has Free List Empty counts that are regularly in the thousands per second. I can find next to no explanatory info on the implications of high values for this counter anywhere.

  • Lokeshgm7

    Lovely Read
    Fantastic article Jonathan.

  • leotohill

    Interesting, but perhaps overstated.
    Good article, thanks.

    But, as Michel suggested, if read-aheads are able to keep the BCHR low, then does it really matter that you are doing a lot of them? Another way of thinking about it is this: if BCHR is high, and PLE is low, and you add cache, will throughput increase? I have my doubts. Almost certainly I/O would decrease, PLE would increase, but that’s just an improvement in something that wasn’t affecting throughput in the first place.

  • leotohill

    correction to previous message
    I should have said "able to keep the BCHR high," Not low.