
Skip the Full Snapshot: Add a Single Article to SQL Replication Fast!

“I will always choose a lazy person to do a difficult job because a lazy person will find an easy way to do it.”

Bill Gates, Co-founder of Microsoft

Let’s imagine you have a multi-terabyte database you are replicating. Now let’s say you either need to add a new table to that publication, or, heaven forbid, something went catastrophically wrong and you need to reinitialize one of the tables that already exists in the subscription. Sounds like it’s time to buckle up and wait a while for a snapshot to be taken just so that one article can be applied on the subscriber side, right? Wrong!

What if I told you there was a faster way? In this post we’ll walk through the steps to make this happen, including how to watch the action as it moves from publisher to subscriber.

ALWAYS TEST IN NON-PRODUCTION FIRST!!!!

The “Full Snapshot” Headache

Performing a full snapshot not only takes a long time, it also burns a lot of resources and can cause blocking on tables that have nothing to do with the one article you actually want replicated.

The Fast Way: Add an Article Without a Full Snapshot

  1. (If needed) Cleanup: If you are fixing a broken replication of just one table, remove that article from the publication first, then continue on (the script sketch after this list shows one way to do this). If you are just adding a new table to your replication, skip this step and move on to the next.
  2. Change Publication Settings: There are two publication properties that let you bypass the need for a full snapshot: allow_anonymous and immediate_sync. Below are the commands you’ll want to run to change both of these values to false.
USE [YourPublisherDatabase]

-- allow_anonymous must be switched off first: it can only be
-- true while immediate_sync is also true.
EXEC sp_changepublication 
    @publication = N'YourPublicationName', 
    @property = N'allow_anonymous', 
    @value = 'false'

-- With anonymous subscriptions disabled, immediate_sync can follow.
EXEC sp_changepublication 
    @publication = N'YourPublicationName', 
    @property = N'immediate_sync', 
    @value = 'false'
  3. Add Your Article: Once you run those commands, you can add the article to the publication. If you’d rather script this step (and the cleanup from step 1) than click through SSMS, see the sketch below.
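Here’s a minimal T-SQL sketch of the cleanup and the add. Every name in it (database, publication, table, subscriber) is a placeholder, it assumes a transactional publication with a push subscription, and your articles may need more sp_addarticle options than shown, so treat it as a starting point rather than a drop-in script.

USE [YourPublisherDatabase]

-- Step 1 (only if fixing a broken article): drop the subscription
-- to the article first, then the article itself.
EXEC sp_dropsubscription 
    @publication = N'YourPublicationName', 
    @article = N'YourTableName', 
    @subscriber = N'YourSubscriberServer'

EXEC sp_droparticle 
    @publication = N'YourPublicationName', 
    @article = N'YourTableName', 
    @force_invalidate_snapshot = 1

-- Step 3: add the article (back) to the publication...
EXEC sp_addarticle 
    @publication = N'YourPublicationName', 
    @article = N'YourTableName', 
    @source_owner = N'dbo', 
    @source_object = N'YourTableName'

-- ...and subscribe the existing subscriber to just this one article.
EXEC sp_addsubscription 
    @publication = N'YourPublicationName', 
    @article = N'YourTableName', 
    @subscriber = N'YourSubscriberServer', 
    @destination_db = N'YourSubscriberDatabase'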

Watch it Work in Replication Monitor

The easiest way to watch everything progress is Replication Monitor. Once you have the monitor open for the publisher server, navigate to the publication you are working with and open the Agents tab.

  1. Run the Snapshot Agent: In the Agents tab, start the Snapshot Agent. This kicks off a new snapshot, but if you right-click the agent and go to View Details, you will see it only snapshots the new article.
  2. Verify: Once that completes, back out to the publication level and go to the All Subscriptions tab. Then right-click the subscription and go to View Details. In the Distributor To Subscriber History tab, you can watch that new article get copied to the subscriber. (Prefer a query window? See the check below.)
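If you like to double-check from T-SQL as well, both publication properties are visible in the publisher database (this assumes a transactional publication; the names are placeholders as usual):

USE [YourPublisherDatabase]

-- Both flags should read 0 (false) while you add the article.
SELECT name, immediate_sync, allow_anonymous
FROM dbo.syspublications
WHERE name = N'YourPublicationName'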

Don’t Forget to Clean Up!

Revert Settings: Once done, all that’s left is to clean up the configuration. Go back to the publisher and run these commands to change both values back to true. Note that it’s the opposite order from last time: allow_anonymous can only be true while immediate_sync is true, so immediate_sync has to be re-enabled first.

USE [YourPublisherDatabase]

-- immediate_sync goes back on first...
EXEC sp_changepublication 
    @publication = N'YourPublicationName', 
    @property = N'immediate_sync', 
    @value = 'true'

-- ...and then allow_anonymous, which depends on it.
EXEC sp_changepublication 
    @publication = N'YourPublicationName', 
    @property = N'allow_anonymous', 
    @value = 'true'
    

Much quicker than doing a full snapshot, right?! All that was required was changing those two configurations in the publication options and then doing your normal steps after that. These few extra steps at the beginning will save you tons of resources and time whenever you want to add a single article to an existing subscription.

But as always, be sure to test this in non-production first!!!

Many thanks to Ragnar Skalvik (DBA Architect) for coming up with this working solution for our team at work! His assistance and tenacity at finding solutions are immeasurable!


Continued learning…

Transactional Replication
If you’re a bit new to the whole setup or need to explain it to your boss, this is the main “what is transactional replication?” page. It gives you the full 10,000-foot view of the technology.

sp_changepublication (Transact-SQL)
This is the official page for the main stored procedure we’re using. It’ll show you every single property you can change, but the ones we care about are allow_anonymous and immediate_sync.

What other “replication tricks” have you learned over the years to save time and resources? Put your tips in the comments below!

Fill Factor: The Hidden Culprit Behind Your Database’s Slowdown?

“An ounce of prevention is worth a pound of cure.”

Benjamin Franklin

Summary

Fill Factor is a database setting that determines how full each leaf-level page of an index is left when the index is built or rebuilt. It directly impacts DiskIO, RAM utilization, and internal fragmentation, which makes changing it a decision that shouldn’t be taken lightly if you care about optimal SQL Server performance.

If you have a Fill Factor of 100 (or 0, which means the same thing), each page is packed completely full when the index is built; once a page fills up, new rows go onto a new page. With a fill factor of 60, SQL Server instead fills each leaf page to about 60% during a rebuild, leaving 40% free for rows that arrive later. One subtlety worth knowing: the fill factor is only applied when the index is created or rebuilt; day-to-day inserts will happily fill that reserved space until the page is full.
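For reference, here’s how you would apply a non-default value and check what each index is currently set to; the table and index names are placeholders:

-- Rebuild an index leaving 40% free space on each leaf page.
ALTER INDEX IX_YourIndex ON dbo.YourTable
REBUILD WITH (FILLFACTOR = 60);

-- Check current settings (fill_factor = 0 means the default of 100).
SELECT OBJECT_NAME(object_id) AS table_name, name AS index_name, fill_factor
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.YourTable');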

Understanding how Fill Factor works is crucial to the health of your SQL Server. Too high and you may see a lot of unnecessary page splits. Too low and SQL is wasting resources and taking longer to get you the data you want. It also needs to be considered alongside your index maintenance planning to properly handle internal fragmentation.

To Adjust Fill Factor or Not To Adjust?

Many newbie DBAs will automatically drop the default fill factor to something less than 100 on all indexes to avoid page splits. This overreaction can actually cause more problems than it solves. Let’s talk about specifics on what those DBAs are trying to avoid when they do this…

If you remember the Defragmentation post from last week, you’ll recall I talked about adding a new record into the middle of the table. This causes a page split if the page you are adding the record to is full. Each page split adds extra overhead: a new page is created, and half of the old page is moved onto it. On top of that, you are now left with two pages that are each about half empty. Many page splits like this lead to excessive internal fragmentation, which is the worst type of fragmentation; read the article above if you want to learn why. For the rest of this article, make sure you are thinking of INTERNAL and not external fragmentation.

If you have an index on a table that only ever adds records at the end, you do not need to adjust the fill factor. It never inserts new records into the middle of the index, so it will never cause a page split; it just keeps adding fresh pages at the end of the table/index. A log table is the classic example.
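A quick sketch of what I mean by a log table (the names are made up): because the IDENTITY clustered key only ever increases, every insert lands on the last page, and the default fill factor is exactly right.

-- Append-only pattern: new rows always go at the end of the
-- clustered index, so mid-page splits never happen here.
CREATE TABLE dbo.AppLog
(
    LogId    BIGINT IDENTITY(1,1) NOT NULL,
    LogTime  DATETIME2 NOT NULL DEFAULT (SYSDATETIME()),
    Message  NVARCHAR(2000) NOT NULL,
    CONSTRAINT PK_AppLog PRIMARY KEY CLUSTERED (LogId)
);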

What is wrong with non-default Fill Factor values?

How Fill Factor impacts DiskIO: If you recall from the I want to be a SQL Engine when I grow up article, you learned that SQL stores (reads and writes) data in 8kb pages. Regardless of how full a page is, it still reads the full 8kb. So having empty space just spreads your data across more pages, causing more reads and thus taking longer to run if the data isn’t already in memory.

How Fill Factor impacts RAM Utilization: Even if the data is already in memory, a non-default Fill Factor still causes problems. SQL not only reads/writes in 8kb pages, it also holds the entire 8kb page in RAM. This means that with a fill factor of 60%, you are wasting 40% of the RAM those pages occupy on empty space: data that would fit in 60gb of full pages now needs roughly 100gb of buffer pool. In Standard Edition, which caps the buffer pool at 128gb of RAM, that is a huge waste of resources.

What should be our initial baseline?

The default setting in your DBs and for all indexes should be 100. If you find you are having issues, you can adjust from there. But you really need to understand your data. You need to know the patterns of the inserts. You also need to be aware of how much internal fragmentation you are seeing in your indexes. You should not be wildly adjusting this fine-tuning control in SQL without a good process in place.

When/If You Should Adjust


Newbie DBAs often misread the page splits perfmon counter. They see high counts and quickly adjust Fill Factor without any other investigation, hoping the adjustment will reduce page splits. The problem with relying on just the page split counter is that it reflects more than one action. In perfmon, a page split is counted for actions beyond inserting data into the middle of a full page. Many DBAs are unaware that the counter also increases during normal page additions: they believe it only climbs when a page is split and half the data is moved elsewhere in the index. But in our example above about a typical log table/index, a new page is added whenever the last page of the index fills up, and that brand-new, completely blank page is recorded as a page split in perfmon! Really, this perfmon counter should just be called the “New Page” counter.
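If you want to see the same counter perfmon uses without leaving a query window, it’s exposed through a DMV. Keep everything above in mind when reading it: the value includes ordinary end-of-index page additions, so treat a big number as a clue, not a verdict.

-- Cumulative 'Page Splits/sec' value since instance startup.
-- Remember: this also counts brand-new pages added at the end
-- of an index, not just mid-index splits.
SELECT object_name, counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = N'Page Splits/sec';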

Ideally, you have create date and modified date columns on your tables. You can use these to find the cause of your page split counts: figure out whether new records were added into the middle of your index or at the end of it. The sketch below shows one rough way to do that.
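This is only a heuristic, and it assumes hypothetical CreateDate and LastName columns on the Users table from last week’s example: count how many of yesterday’s new rows have keys below the maximum key that existed before they arrived.

-- New rows with keys below the pre-existing max went into the
-- middle of the index; the rest were appended at the end.
-- CreateDate and LastName are hypothetical column names.
SELECT
    COUNT(*) AS new_rows,
    SUM(CASE WHEN u.LastName < x.prior_max THEN 1 ELSE 0 END) AS mid_index_inserts
FROM dbo.Users AS u
CROSS JOIN (SELECT MAX(LastName) AS prior_max
            FROM dbo.Users
            WHERE CreateDate < DATEADD(DAY, -1, GETDATE())) AS x
WHERE u.CreateDate >= DATEADD(DAY, -1, GETDATE());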

How to Adjust Fill Factor properly?

1. Review your indexes and find ones that are non-default. Change these to 100 and rebuild them. Do this only if you are sure there wasn’t a valid reason for the current non-default value. Be careful here: if you do it, keep a close eye on internal fragmentation afterward, and be ready to run through the remaining steps swiftly to add some of that empty space back in.
2. For any indexes that are already on the default 100 fill factor, do a rebuild to consolidate any internal fragmentation.
3. Now your indexes will all be at a good starting point. Record today’s date as the start of your experiment.
4. Every day (or as often as your workload warrants), pull a report of internal fragmentation (see the query sketch after this list).
5. If the internal fragmentation is significant, then a) reduce the fill factor by 5 points and b) rebuild the index so the “clock” resets. Record in your log that you lowered the fill factor on this day. This will help you see how long it takes to become significant again.
6. Repeat steps 4 and 5 until a “good amount” of time has gone by.
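Here’s a minimal sketch of the internal fragmentation report from step 4, using sys.dm_db_index_physical_stats. SAMPLED (or DETAILED) mode is needed to populate avg_page_space_used_in_percent, and it can take a while on big tables, so run it in a quiet window; what counts as “significant” is your call.

-- How full are the leaf pages, compared to the fill factor you asked for?
SELECT
    OBJECT_NAME(ps.object_id) AS table_name,
    i.name AS index_name,
    i.fill_factor,
    ps.avg_page_space_used_in_percent,  -- internal fragmentation
    ps.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') AS ps
JOIN sys.indexes AS i
    ON i.object_id = ps.object_id AND i.index_id = ps.index_id
WHERE ps.index_level = 0  -- leaf level only
ORDER BY ps.avg_page_space_used_in_percent;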

How long is a good amount of time? It all depends on your workload and environment. If your data is inserted into the middle of your index, you will never get rid of internal fragmentation completely. You need to decide at what point lowering the Fill Factor further hurts your DiskIO and RAM utilization more than doing a rebuild would. My opinion: aim for at least a week between rebuilds. That is enough to keep internal fragmentation in check without doing rebuilds all the time to fix it. However, your mileage may vary.

Finally, review this plan on a recurring basis! You may find over time you need to reduce the fill factor further, or that you can rebuild the index less often. You may even find you can increase the fill factor, saving you resources.

Conclusion

Remember, Fill Factor is not a “set it and forget it” setting. Monitor internal fragmentation on all your indexes regularly, and check it before you rebuild indexes if you do that on a recurring schedule. It is also a precision tool: don’t make sweeping changes to all indexes, or huge changes to a single index. Your mileage may vary; you need to look at your specific workload, because there is no good “one size fits all” suggestion.

What has been your experience with adjusting Fill Factor in your own SQL Server environments, and what results did you observe?

Is SQL Defragmentation Still Relevant Today?

Most DBAs have heard the term fragmentation. But do you really know what it means? Do you know there are two types of fragmentation? And in today’s world, is it still an issue?

External Fragmentation

SQL stores data in what are called pages. Each page is 8kb in size. Let’s say we have a Users table that is 100 pages in size, with a clustered index on lastname, and every page is completely full of our application’s users. What happens when a new user named Joe Cool is added to the application? If there is no space for Joe Cool on page 7 of our table, SQL will divide page 7 into two parts, move the second half of that page to a new page 101, and then insert Joe Cool on page 7.

Imagine this happening for a few thousand more users, each one potentially causing another page split. In the end, let’s say we have 150 pages, and all of our pages are out of order: this is what we call External Fragmentation. In the golden days of databases, when we had spinning disks, this was a huge issue; those rust buckets read data much faster when it was sequential. But with modern SSD storage this is almost never a problem anymore, since an SSD can random-read data anywhere very quickly.

Internal Fragmentation

However, as you can imagine in our example above, there may be a lot of empty space left on those split pages, depending on how they were filled with the new users. Also, imagine if we needed to drop a bunch of those users out of our system: we would have even more empty space in our table. This is what we call Internal Fragmentation: empty space (gaps) in our pages.

Internal Fragmentation is going to be a much bigger problem than External Fragmentation. Imagine printing off our list of Users exactly as it exists in the table on disk. The physical pages you print will have a lot of wasted white space wherever there are gaps from the splits and/or the deletes. Before SQL can do anything with data, it needs the pages in memory, which means reading them from disk. That read takes longer than it should, because the data is spread across more pages than necessary, each with lots of empty space. And then it gets worse: once a page is in memory, the entire page is in memory, “white space” included. As you can now see, this is a huge waste of not only disk IO but also precious RAM.
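If you want to put numbers on both kinds of fragmentation for the Users example, the same DMV reports each one; roughly speaking, avg_fragmentation_in_percent is the external kind and avg_page_space_used_in_percent reflects the internal kind.

-- External vs. internal fragmentation for one table's indexes.
SELECT
    i.name AS index_name,
    ps.avg_fragmentation_in_percent,    -- external: pages out of order
    ps.avg_page_space_used_in_percent,  -- internal: empty space in pages
    ps.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'dbo.Users'), NULL, NULL, 'SAMPLED') AS ps
JOIN sys.indexes AS i
    ON i.object_id = ps.object_id AND i.index_id = ps.index_id;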

The Bad Fix

Many DBAs react to performance complaints about a query by checking External Fragmentation. When they see a high fragmentation percentage, they conclude they need to rebuild their index. Once they do, performance is suddenly better, so they pat themselves on the back and say, “Job well done!” But what they don’t realize is that most likely it wasn’t the actual defragmentation that solved the problem.

When a rebuild occurs, three things happen:

1. The actual rebuild
2. A full stats update
3. Clearing any plans from cache that involved that index

Most likely what fixed your issue was #2 or #3.

The Good Fix

So it’s much better to look at and fix those two first:

Full Stats update: Look to see if the index has out-of-date stats. If so, update the stats and check the query again. A stats update is much less resource intensive than an index rebuild.
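A sketch of that check, again using the Users table as a stand-in: sys.dm_db_stats_properties shows when each statistic was last updated and how many rows have changed since.

-- How stale are the stats on this table?
SELECT
    s.name AS stats_name,
    sp.last_updated,
    sp.rows,
    sp.modification_counter  -- rows modified since the last stats update
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID(N'dbo.Users');

-- If they are stale, this is far cheaper than an index rebuild:
UPDATE STATISTICS dbo.Users WITH FULLSCAN;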

Plan Cache clearing: The rebuild may have solved your problem simply because it evicted a bad query plan. Instead of rebuilding, check the query plan in cache and see if it makes sense for what you are trying to do. If this was your issue, most likely you have parameter sniffing that you need to look into.
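To actually look at the cached plan, something like this works (the LIKE filter is a placeholder for text from your suspect query). And if one plan really does have to go, DBCC FREEPROCCACHE accepts a single plan_handle, so you never need to flush the whole cache.

-- Find and inspect the cached plan for a suspect query.
SELECT TOP (10)
    st.text,
    qs.execution_count,
    qs.total_worker_time,
    qp.query_plan,
    qs.plan_handle
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE st.text LIKE N'%YourSuspectQuery%'
ORDER BY qs.total_worker_time DESC;

-- Surgical option: evict just that one plan using its plan_handle.
-- DBCC FREEPROCCACHE (<plan_handle from the query above>);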

Internal Fragmentation: If neither of the above solved your problem, then look at Internal Fragmentation. If you find a lot of it, go ahead and rebuild the index. Also check your fill factor and make sure it’s at the default, to decrease the chances of this happening again. (I plan to have a future post about how to adjust fill factor if you believe it needs to be adjusted, so be sure to subscribe so you won’t miss it when it comes out!)

Obviously, if you are still having issues after this point, it’s time to dig deeper into the query plan and examine how you can improve it with good old query tuning techniques.

Have you found other ‘bad fixes’ that DBAs commonly rely on, and what’s your ‘good fix’ alternative?