A scrub server is a server where we will restore protected Production data. Production data must be changed to not expose data that is protected by government regulations. Some of the government regulations will include privacy and personal identifiable information (PII data) laws. Other laws might be HIPPA and PCI laws. This is not an exhaustive list of all laws that covers data in your production environment. Check with your government and management for laws you need to adhere to.
With that said, when I set up a Scrub server, it is in a very secure area where the data access is very, very limited. For example, in my current company, the server is in a separate domain from Production and QA/Dev. Only DBAs are allowed to access this server. If you have multiple DBAs at your location, you may want to even limit which DBAs have access to this server. Our goal is to automate the entire scrubbing process so no one has to access the data including copying backup files from Production and to a shared scrub location for QA/Dev to retrieve.
First, we have to determine which version of SQL Server will be on the Scrub server or you may decide to have multiple versions of SQL Server. I chose to only have one version of SQL Server on my Scrub Server. Determining the SQL Server version will all depend upon the version of SQL Server in Production. Hopefully, your QA/Dev environments will be at least the same version of SQL Server as Production or newer. Note, I did not say the same service pack or cumulative update version but SQL Server 2008 R2, 2012, 2014 or 2016. You may be in the process of deploying service pack 1 to Production but if you are like me, you start out changing your Development servers first and if the deployment goes well moving the service pack to QA, then UAT and finally to Production. So we must be on the same major version number as Production, not the same service pack.
Second, you must have a lot of disk space. By far, this is my largest SQL Server in terms of hard drive space. People may say you are wasting disk space but remember what you are really doing is protecting the Company from a data breach. When protecting the data was explained to management by scrubbing data, disk space request was no longer an issue. On my scrub server, I have enough disk space to restore all of my Production databases from all of my Production servers. Yes, that is a lot of disk space. My scrub server has 1.6 TB of disk space. To put this in relation, my largest database is 160 gigs. In my previous company where I had PCI data, the largest production database was 700 gigs and my scrub server was 4 TB of disk space. Remember you will need space for the backup from Production, the restored database and the backup you will take after you scrub it. So 160 gigs database would need 480 gigs of space if you can’t compress the backup files.
Finally, I recommend the Scrub server to be a Virtual Server to keep some cost down. My Scrub VM server has 2 vCpu and 8 gigs of Ram. The scrub server does not need much horsepower. With automation, I restore, scrub and backup 17 databases in 25 minutes run by a SQL Server Agent job with Powershell and T-SQL.
As we wrap up this blog, the most important thing to remember when creating a scrub server is security. You will be restoring Production data, keep it secure. Let’s build, troubleshoot and test with the best possible dataset we can provide to our team by scrubbing Production data. Next blog post will discuss how we discover what data needs to be scrubbed and how I scatter the data across the database.