Index Fragmentation | SQLCwby

“An ounce of prevention is worth a pound of cure.” – Benjamin Franklin

Summary

Fill Factor is a database setting that determines the percentage of free space on each leaf-level page in an index. It directly impacts DiskIO, RAM utilization, and internal fragmentation. Which makes the decision of changing this setting not something to be taken lightly for optimal SQL Server performance.

If you have a Fill Factor of 100 (or 0), the page will completely fill up. After it is full, a new page is created. But if you have a fill factor of 60, it will stop adding new records when the page reaches 60% full. Then, it will place the next record on a new page.

Understanding how Fill Factor works is crucial to the health of your SQL server. Too high and you may have a lot of unnecessary page splits. Too low and SQL is wasting resources and taking longer to get you the data you want. It also needs to be considered alongside your index maintenance planning to properly handle internal fragmentation.

To Adjust Fill Factor or Not To Adjust?

Many newbie DBAs will automatically adjust the default fill factor to something less than 100 on all indexes to avoid page splits. This over-reaction can actually cause more problems than it solves. Lets talk about specifics on what the DBAs are trying to avoid when they do this…

If you remember the Defragmentation post from last week, you’ll recall I talked about adding a new record into the middle of the table. This will cause a page split if that page you are attempting to add the record to is full. Each page split adds extra overhead by creating a new page, and then moving half the old page to the new page. And on top of that, now you have two pages that are half empty. Many page splits like this lead to excessive internal fragmentation. This is the worst type of fragmentation. Read the article above if you want to learn why. For the rest of this article make sure you are thinking of INTERNAL and not external fragmentation.

If you have an index on a table that continually adds records at the end, you do not need to adjust the fill factor. It never adds new records in the middle of the index. This is like a log table or index. It will never cause a page split since it’s always adding to new pages at the end of the table/index.

What is wrong with non-default Fill Factor values?

How Fill factor impacts DiskIO: If you recall from I want to be a SQL Engine when I grow up article, you learned that SQL stores (reads and writes) data in 8kb pages. Regardless of how full that page is it still reads the full 8kb. So having empty space just causes you to have your data spread across more pages, causing more reads and thus taking longer to run if the data isn’t already in memory.

How Fill factor impacts RAM Utilization: But even if the data is already in memory, a non-default Fill Factor will still cause problems. SQL not only reads/writes in 8kb pages, but it also holds that entire 8kb page in RAM. This means if your fill factor is 60%, then you are wasting 40% of your RAM on empty space. In Standard Edition, which only allows 128gb of RAM, this is a huge waste of resources.

What should be our initial baseline?

The default setting in your DBs and for all indexes should be 100. If you find you are having issues, then you can adjust from there. But you really need to understand your data. You need to know the patterns of the inserts. You also need to be aware of how how much internal fragmentation you are seeing in your indexes. You should not be wildly adjust this fine tuning control in SQL without a good process in place.

When/If You Should Adjust

Many DBAs are unaware that the page split counter increases during normal page additions

Newbie DBAs often look at the page splits perfmon counter incorrectly. They see high counts and then quickly adjust Fill Factor without any other investigation. They hope this adjustment will reduce page splits. The problem with relying on just the page split counter is that it reflects more than one action. In perfmon, a page split is counted for actions beyond just inserting data in the middle of a full page. Many DBAs are unaware that the page split counter increases during normal page additions. They believe a page split counter increases only when a page is split and half the data is moved to the end of the index. In our example above about a typical Log table/index, a new page is added when the end of the index page is filled. This completely blank page is then recorded as a page split in perfmon! Really, this perfmon counter should just be called the “New Page” counter.

Ideally, you have create date and modified date on your tables. You can use this to find the cause of your page split counts. Find out if new records were added into the middle of your index or at the end of the index.

How to Adjust Fill Factor properly?

Review your indexes and find ones that are non-default. Change these to 100 and rebuild the pages. Do this only if you are sure there wasn’t a valid reason for the current non-default value. Be careful doing this. If you do it, make sure to quickly keep an eye on internal fragmentation. Be ready to do the other steps swiftly to add some of that empty space back in there.
Any indexes that are already on the default 100 fill factor value, do a rebuild to consolidate any internal fragmentation.
Now your indexes will all be at a good starting point. Record today’s date as the start of your experiment.
Every day (or as often as your workload warrants), pull a report of internal fragmentation.
If this internal fragmentation is significant, then a) reduce the fill factor by 5 points and b) rebuild the index so the “clock” resets. Record in your log that you lowered the fill factor on this day. This will help you see how long it takes to become significant again.
Repeat steps 4 and 5 until a “good amount” of time has gone by.

How long is a good amount of time? It all depends on your workload and environment. If your data is inserted in the middle of your index, you will never get rid of internal fragmentation completely. You need to decide at what point lowering the Fill Factor further will be more detrimental to your DiskIO and RAM utilization than doing a rebuild. My opinion would be you should aim for at least a week between rebuilds. This helps to fix internal fragmentation, but then you aren’t doing rebuilds all the time to fix it, either. However, your mileage may vary.

Finally, review this plan on a recurring basis! You may find over time you need to reduce the fill factor more. Perhaps you can rebuild the index less often. You may even find you can increase the fill factor, saving you on resources.

Conclusion

Remember, Fill Factor is not a “set it and forget it” setting. You should monitor internal fragmentation on all your indexes regularly. Remember to check before you rebuild indexes if you do that on a recurring basis. It is also a precise adjustment tool. You should not make sweeping changes to all indexes or even huge changes to single indexes. Your mileage may vary. You need to look at your specific workload because there is no good “one size fits all” suggestion.

What has been your experience with adjusting Fill Factor in your own SQL Server environments, and what results did you observe?

Most DBAs have heard the term fragmentation. But do you really know what it means? Do you know there are two types of fragmentation? And in today’s world is it still an issue?

External Fragmentation

SQL stores data in what are called pages. Each page is 8kb in size. Let’s say we have a table for Users that is 100 pages in size. This table has a clustered index on lastname. Every page is completely full of all of our users for the application. What happens when a new user named Joe Cool is added into the application? If there is no space for Joe Cool on page 7 of our table, then SQL will divide page 7 into two parts. It will move the second part of that page to a new page 101. It will then insert Joe Cool on page 7.

Imagine this happening for a few thousand more users. Each one potentially causing another page split. In the end, lets say we have 150 pages. Now we have all of our pages out of order — This is what we call External Fragmentation. In the golden days of databases when we had spinning disks this was an huge issue. Those rust buckets read the data much faster when it was sequential. But, now with modern SSD storage this is almost never a problem anymore. Since it random reads the data it can read the data anywhere very quickly.

Internal Fragmentation

However, as you can imagine in our example above, there may be a lot of empty space on those split pages. This depends on how they were filled with the new users. Also, imagine if we needed to drop a bunch of those users out of our system. We would have even more empty space in our table. This is what we call Internal Fragmentation — where there is empty space (gaps) in our pages.

Internal Fragmentation is going to be much more of a problem than what External Fragmentation is. Imagine printing off our list of Users just as it exists in the table on disk. These physical pages you print off will have a lot of wasted white space. This happens where there are gaps from the splits and/or from the deletes. Before SQL can do anything, it needs the data/pages in memory. To get that, it will need to read the data from disk. This process takes more time than it otherwise would because there are more pages. Why? The data is spread across more pages than necessary with lots of empty space. And then even worse: once it’s in memory the entire page is in memory, even the “white space”. As you can now see, this will be a huge waste of not only disk IO but also precious RAM.

The Bad Fix

Many DBAs react to performance complaints on a query by checking External Fragmentation. They often conclude that they need to rebuild their index when they see the high percentage of fragmentation. Once they do the performance is suddenly better and they pat their backs and say, “Job well done!” But what they don’t realize is most likely it wasn’t the actual defragmentation that solved the problem.

When a rebuild occurs three things happen:

The actual rebuild
A full stats update
Clearing any plans from cache that involved that index

Most likely what fixed your issue was #2 or #3.

The Good Fix

So it’s much better to look at and fix those two first:

Full Stats update: Look to see if the the index has out of date stats. If so, then update the stats and check the query again. Stats updates are much less resource intensive than an index rebuild.

Plan Cache clearing: This may have solved your problem because of a bad query plan. Instead, you should check the query plan in cache. See if it makes sense for what you are trying to do. If this was your issue, most likely you have parameter sniffing that you need to look into.

Internal Fragmentation: If neither of the above has solved your problem, then look at Internal Fragmentation. If you find a lot of Internal Fragmentation, then go ahead and rebuild the index. You should check your fill factor. Make sure it’s at the default to decrease the chances of this happening again. (I plan to have a future post about how to adjust fill factor if you believe it needs to be adjusted. So be sure to subscribe so you won’t miss it when it comes out!)

Obviously after this point if you are still having issues, it’s time to dig deeper into the query plan. Examine how you can improve it. Use just good old query tuning techniques.

Have you found other ‘bad fixes’ that DBAs commonly rely on, and what’s your ‘good fix’ alternative?

SQLCwby – Chris Woods

SQL Server Insights with a down-home feel

Tag Archives: Index Fragmentation

Fill Factor: The Hidden Culprit Behind Your Database’s Slowdown?

Summary

To Adjust Fill Factor or Not To Adjust?

What is wrong with non-default Fill Factor values?

What should be our initial baseline?

When/If You Should Adjust

How to Adjust Fill Factor properly?

Conclusion

Is SQL Defragmentation Still Relevant Today?

External Fragmentation

Internal Fragmentation

The Bad Fix

The Good Fix

Summary

To Adjust Fill Factor or Not To Adjust?

What is wrong with non-default Fill Factor values?

What should be our initial baseline?

When/If You Should Adjust

How to Adjust Fill Factor properly?

Conclusion

Share this:

External Fragmentation

Internal Fragmentation

The Bad Fix

The Good Fix

Share this: