Tag Archives: DBA

Skip the Full Snapshot: Add a Single Article to SQL Replication Fast!

“I will always choose a lazy person to do a difficult job because a lazy person will find an easy way to do it.”

Bill Gates, Founder Microsoft

Lets imagine you have a multi-terabyte database you are replicating. Now let’s say you either need to add in a new table into that subscription or heaven forbid something went catastrophically wrong and you need to reinitialize one of the tables that already exists in the subscription. Sounds like it’s time to buckle up and wait for a while until that snapshot is taken just for that one article to be applied on the subscriber side, right? Wrong!

What if I told you there was a faster way? In this post we’ll walk through the steps to make this happen including how to watch the action as it’s happening from publisher to subscriber.

ALWAYS TEST IN NON-PRODUCTION FIRST!!!!

The “Full Snapshot” Headache

Performing a full snapshot will not only take a long time, but it’ll use a lot of resources and potentially cause blocking on unaffected tables that have nothing to do with the table you really want to get replicated.

The Fast Way: Add an Article Without a Full Snapshot

  1. (If needed) Cleanup: If the scenario is you need to fix a broken replication of just one table, remove that article from your subscription and then continue on. If you are looking to just add a new table into your replication, skip this step and move on to the next.
  2. Change Publication Settings: There are two config settings that will allow you to bypass the need for a full snapshot to be taken – allow_anonymous and immediate_sync. Below are the commands you’ll want to run to change both of these values to false.
USE [YourPublisherDatabase]

EXEC sp_changepublication 
    @publication = N'YourPublicationName', 
    @property = N'allow_anonymous', 
    @value = 'false'

EXEC sp_changepublication 
    @publication = N'YourPublicationName', 
    @property = N'immediate_sync', 
    @value = 'false'
  1. Add Your Article: Once you run those commands, you can add the article to the subscription.

Watch it Work in Replication Monitor

Then the easiest way to watch everything progress is to use the Replication Monitor. Once you have the monitor open for the publisher server, navigate to the publication you are working with and open the Agents tab.

  1. Run the Snapshot Agent: In the Agents tab, Start the snapshot agent. This will kick off a new snapshot, but if you right click and go to View Details on the agent you will see it only snapshot the new article.
  2. Verify: Once that completes, you’ll want to immediately back out to the publication level again and go to the All Subscriptions tab. Then right click on the subscription and go to the View Details. In the Distributor to Subscriber History tab, you can watch that new article get copied to the subscriber.

    Don’t Forget to Clean Up!

    Revert Settings: Once done, all you need to do is clean up the configurations again. Go back to the publisher and run these commands to change both values to true again (note it’s the opposite order as you ran it last time).

    USE [YourPublisherDatabase]
    
    EXEC sp_changepublication 
        @publication = N'YourPublicationName', 
        @property = N'immediate_sync', 
        @value = 'true'
    
    EXEC sp_changepublication 
        @publication = N'YourPublicationName', 
        @property = N'allow_anonymous', 
        @value = 'true'
    

    Much quicker than doing a full snapshot, right?! All that was required was to change those two configurations in the subscription options and then do your normal steps after that. By doing these few extra steps at the beginning you will save tons of resources and time when you want to add a single article into an existing subscription.

    But as always, be sure to always test this in nonproduction first!!!

    Many thanks to Ragnar Skalvik (DBA Architect) for coming up with this working solution for our team at work! His assistance and tenacity at finding solutions is immeasurable!


    Continued learning…

    Transactional Replication
    If you’re a bit new to the whole setup or need to explain it to your boss, this is the main “what is transactional replication?” page. It gives you the full 10,000-foot view of the technology.

    sp_changepublication (Transact-SQL)
    This is the official page for the main stored procedure we’re using. It’ll show you every single property you can change, but the ones we care about are allow_anonymous and immediate_sync.


    What other “replication tricks” have you learned over the years to save time and resources? Put your tips in the comments below!

    The TOP (N) Trick: Short Circuit Your SQL Server Counts for Massive Speed Gains

    “It is not enough to be busy… The question is: What are we busy about?”Henry David Thoreau (Poet and Philosopher)

    Have you ever had to write a query to check some threshold count of records and then do some action based on that?

    For instance, let’s say your reporting website shows a red indicator when you have more than 10,000 orders of a specific category type in the last 30 days. Otherwise, that indicator is green. You don’t care if there is only one more than your limit or one million more.

    The Obvious (and Slow) Solution: SELECT COUNT(*)

    The obvious first answer is to do a full count of everything that matches that criteria:

    SELECT COUNT(*) Cnt
    FROM Sales.Orders
    WHERE CategoryID = 5
      AND CreateDt >= GETDATE()-30;
    

    Lets say this gives us 156,000 records, which is well above the threshold of 10,000. But look at the cost: it produced 12,080 reads and took 380ms (which you can see using SET STATISTICS IO ON and SET STATISTICS TIME ON). This is on a query that is already using an index and whose stats are up to date.

    This query is so slow because it must count every single matching row, even after we have easily counted past our 10,000 record threshold. But how can we short circuit this counting and stop at our threshold?

    The Fast Solution: Short-Circuiting with TOP (N)

    I’m sure you have all used SELECT TOP (N) at some point to just get a sample of what a table looks like. This same logic can help us short circuit our counting. What if we just asked SQL for the top 10,000 records and then counted that result?

    We can do this with an inline subquery:

    SELECT COUNT(*) Cnt
    FROM (
        SELECT TOP (10000) ID
        FROM Sales.Orders
        WHERE CategoryID = 5
          AND CreateDt >= GETDATE()-30
        ) AS r;
    

    This query now only does 54 reads (0.4% of the original) and completes in 12ms (3% of the original).

    It works because the inner query (SELECT TOP (10000)) stops as soon as it finds 10,000 matching records. The outer query then just counts the results of the inner query. If the count is 10,000, we know we have at least that many. If it’s less (for example 8,500), we know we’re under the threshold.

    The Takeaway

    So don’t forget: if you don’t really care about the total count, but just that it’s above some limit, don’t waste resources and time doing a full COUNT(*). Simply put a short circuit into your query with TOP (N) so it stops counting at your given threshold!


    Do you know of any other methods in SQL Server to solve this “at least N rows” problem? Share them in the comments!