COUNT | SQLCwby – Chris Woods

“It is not enough to be busy… The question is: What are we busy about?” – Henry David Thoreau (Poet and Philosopher)

Have you ever had to write a query to check some threshold count of records and then do some action based on that?

For instance, let’s say your reporting website shows a red indicator when you have more than 10,000 orders of a specific category type in the last 30 days. Otherwise, that indicator is green. You don’t care if there is only one more than your limit or one million more.

The Obvious (and Slow) Solution: `SELECT COUNT(*)`

The obvious first answer is to do a full count of everything that matches that criteria:

SELECT COUNT(*) Cnt
FROM Sales.Orders
WHERE CategoryID = 5
  AND CreateDt >= GETDATE()-30;

Lets say this gives us 156,000 records, which is well above the threshold of 10,000. But look at the cost: it produced 12,080 reads and took 380ms (which you can see using SET STATISTICS IO ON and SET STATISTICS TIME ON). This is on a query that is already using an index and whose stats are up to date.

This query is so slow because it must count every single matching row, even after we have easily counted past our 10,000 record threshold. But how can we short circuit this counting and stop at our threshold?

The Fast Solution: Short-Circuiting with `TOP (N)`

I’m sure you have all used SELECT TOP (N) at some point to just get a sample of what a table looks like. This same logic can help us short circuit our counting. What if we just asked SQL for the top 10,000 records and then counted that result?

We can do this with an inline subquery:

SELECT COUNT(*) Cnt
FROM (
    SELECT TOP (10000) ID
    FROM Sales.Orders
    WHERE CategoryID = 5
      AND CreateDt >= GETDATE()-30
    ) AS r;

This query now only does 54 reads (0.4% of the original) and completes in 12ms (3% of the original).

It works because the inner query (SELECT TOP (10000)) stops as soon as it finds 10,000 matching records. The outer query then just counts the results of the inner query. If the count is 10,000, we know we have at least that many. If it’s less (for example 8,500), we know we’re under the threshold.

The Takeaway

So don’t forget: if you don’t really care about the total count, but just that it’s above some limit, don’t waste resources and time doing a full COUNT(*). Simply put a short circuit into your query with TOP (N) so it stops counting at your given threshold!

Do you know of any other methods in SQL Server to solve this “at least N rows” problem? Share them in the comments!

SQLCwby – Chris Woods

SQL Server Insights with a down-home feel

Tag Archives: COUNT

The TOP (N) Trick: Short Circuit Your SQL Server Counts for Massive Speed Gains

The Obvious (and Slow) Solution: `SELECT COUNT(*)`

The Fast Solution: Short-Circuiting with `TOP (N)`

The Takeaway

The Obvious (and Slow) Solution: SELECT COUNT(*)

The Fast Solution: Short-Circuiting with TOP (N)

The Takeaway

Share this:

The Obvious (and Slow) Solution: `SELECT COUNT(*)`

The Fast Solution: Short-Circuiting with `TOP (N)`