Developing Environmentally Aware Database Servers

All of us have Production database servers and, hopefully, you also have additional database servers for Development, QA, and UAT. Some IT shops will also have a Continuous Integration server and maybe other servers. (If you only have Production servers, that needs to be addressed, but it is outside the scope of this post.) In the locations where I have worked, we also have a Scrub server. The question is: when a script executes, do you know what environment the query is executing in? Most scripts will not care what environment they execute in, but other scripts could cause damage in a Production environment. For example, if a script removes email addresses so you don’t spam your clients with automated email messages, you would not want that script to execute in a Production environment.

So how do you make your database server environmentally aware? Here is how I do it:

  • Create a DBA database to use for DBA purposes, not master!! I call mine Admin. Only restore this database to the same server or to a new server that is replacing the old one, because it contains the server environment information.
  • Create a table to store specific information about the server.
  • Add fields to store an information key and an information value. Example: Environment = Production or Environment = QA.
  • Populate the fields with Environment and the value for the server you are on. You can do this by maintaining a population script.
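The steps above can be sketched in T-SQL. The object names here (ServerInfo, InfoKey, InfoValue) are just my illustration; use whatever naming fits your shop:

```sql
-- Create the DBA database (not master!) to hold server-specific info.
CREATE DATABASE Admin;
GO

USE Admin;
GO

-- Simple key/value table describing this server.
CREATE TABLE dbo.ServerInfo
(
    InfoKey   varchar(50)  NOT NULL PRIMARY KEY,
    InfoValue varchar(100) NOT NULL
);
GO

-- Population script: set the value appropriate for this server.
INSERT INTO dbo.ServerInfo (InfoKey, InfoValue)
VALUES ('Environment', 'Prod');   -- or 'UAT', 'QA1', 'Dev', ...
```

Keep the population script in source control so each server gets the correct value when it is built or rebuilt.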

The DBA database will be on every database server, including developers’ computers (Environment = Dev). No database server is excluded: if SQL Server is installed on the computer, the DBA database will be on the computer.

Once this has been accomplished, you are able to make your scripts environmentally aware. How? When a script needs to be environmentally aware, get the environment value from your DBA database. Here are two examples of how I use the environment information:

  • Data change scripts – If I don’t want the change script to execute in Production, I check the environment at the beginning of the script. When I am in the Production environment, I just return:

If @Environment = 'Prod'

      Return;
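Putting it together, a data change script might start like this (assuming the Admin database and a key/value ServerInfo table as described above):

```sql
DECLARE @Environment varchar(100);

-- Look up which environment this server belongs to.
SELECT @Environment = InfoValue
FROM Admin.dbo.ServerInfo
WHERE InfoKey = 'Environment';

-- Never let this change script run in Production.
IF @Environment = 'Prod'
    RETURN;

-- Safe to proceed: the rest of the change script goes here.
```

RETURN exits the current batch, so everything after the check in that batch is skipped on Production.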

  • Security scripts – I change the security account based upon the environment:

Case when @Environment = 'Prod' then 'ProdAccount'

       When @Environment = 'UAT' then 'UATAccount'

       When @Environment like 'QA%' then 'QAAccount'

       When @Environment = 'Dev' then 'DevAccount'

End

Notice that QA may have multiple servers (QA1, QA2, and so on); if they all use the same security account, a like comparison matches every QA environment.
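A security script can then pick the account with a complete CASE expression; the account names below are the placeholders from the example above:

```sql
DECLARE @Environment varchar(100),
        @Account     sysname;

-- Look up which environment this server belongs to.
SELECT @Environment = InfoValue
FROM Admin.dbo.ServerInfo
WHERE InfoKey = 'Environment';

-- Choose the account for this environment.
SET @Account =
    CASE WHEN @Environment = 'Prod'   THEN 'ProdAccount'
         WHEN @Environment = 'UAT'    THEN 'UATAccount'
         WHEN @Environment LIKE 'QA%' THEN 'QAAccount'   -- matches QA1, QA2, ...
         WHEN @Environment = 'Dev'    THEN 'DevAccount'
    END;

-- Use @Account in your CREATE USER / GRANT statements,
-- typically via dynamic SQL since the name is in a variable.
```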

Protect yourself and make your database servers environmentally aware!

 


Why Do I Need Production Data for Testing?

Every day as we test and develop, we need data to see how a screen will work. If the screen captures an employee’s address and the employee enters their home state, how does the screen behave? Is the UI allowing just US states, or is this a worldwide application? If this is a legacy database system, what values have been allowed in the past? What happens if I run a state report?

Have you ever heard, “but it works on my machine”? Is this because the data is perfect in Development and QA, or because Production has specific failure conditions? Can you think of all the data scenarios that accompany Production data? What about performance? Why did the application fail? What happens if I add this index?

Here are the reasons I believe you should get a scrubbed version of your Production database into your Development, QA, and UAT environments.

  1. Data Selectivity – If you have data in your address table, are 90 percent of the addresses in one state, say California, or are the states spread out? If your database is US-only, do you have only US states, or do you have other data?
  2. Application Errors – To figure out why the application is failing in Production, you need to reproduce the error condition. Is there more than one row of data causing this error? Did a data value overflow the data type?
  3. Data Usage Stats – When you look at the data statistics, what happens if you add an index? Does this explain why your query is sometimes fast and other times very slow?
  4. Data Anomalies – What data are you expecting in the field? Is the data what you are expecting? Do we need to fix the screen to only allow certain values to be entered? If a legacy database has unexpected data values that can’t be changed, how does the application handle the situation?
  5. Index Maintenance – What is the effect of the index you want to add or remove?
  6. Data Security – Never move real Production data outside of Production without scrubbing it first.

None of this should be done or debugged in Production. Let’s get a copy from Production, scrub it, and work with that data. Next, let’s learn how to build a data scrubbing environment.
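As a taste of what scrubbing looks like, here is a minimal sketch of the email example from the first post. The Clients table and its columns are assumptions for illustration; your schema will differ:

```sql
-- Replace real client email addresses with harmless, unique fakes
-- so automated jobs in Dev/QA/UAT cannot email real people.
UPDATE dbo.Clients
SET EmailAddress = 'client' + CAST(ClientID AS varchar(10)) + '@example.com'
WHERE EmailAddress IS NOT NULL;
```

Using the primary key in the fake address keeps the values unique, which matters if the column has a unique constraint.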

SSRS Reports Crawling – Too Many Images?

Over the past several weeks, I have been trying to find out why a Reporting Services report was taking so long to render.  I wrote my first Reporting Services report only a few weeks ago; I have administered Reporting Services for a few years but had never built a report.  In the new job I took early this year, my first project with the new company was to create a cube, with reports executing against it.  We hired a contractor to help me since this was the first cube I would help build.

We finished the project on time, but since I was new and the contractor was helping, we did not always have all the facts.  Right after he left, we deployed it to Production.  Yes, it was live.  The processing of the cube is very fast, averaging under 4 minutes while running over 80 stored procedures to gather the information.  The cube processing time was excellent.

The data was in the cube very quickly, but running the reports was like watching paint dry on the wall.  We only have 5 different reports.  No matter how simple, every report was very slow.  I could request a report, walk away from my desk, go get coffee or tea, talk to some other developers, return to my desk 8 minutes later, and the report would still not be finished.  The report has expandable columns, and if we clicked to open an expandable part of the report, we would have to wait another 5 minutes before seeing the expanded section.  As we all know, this is totally unacceptable.

I read all sorts of articles and blogs; nothing helped.  We had two fact tables, and I combined them into one fact table.  On the single fact table, I added a columnstore index.  This should fix the slowness of the report, right?  No, the report was still slow.

While testing, our product manager noticed images in the report showing either a check mark or an “x” mark.  In the report we were using the gauge indicator to display the check mark or “x” mark depending upon the value in the database.  Much to our dismay, the image coming from the indicator gauge was being rendered several hundred times on several of the reports, causing the slowness.  Not good!

The solution to our slow-running reports was to remove the gauge indicator and render the check mark and “x” mark as text instead, using the Wingdings font.  After changing all the reports to use text instead of an image, the reports rendered 300 times faster.  No more slow reports; issue solved.


Page Life Expectancy Issues

When I took this new job in January, I inherited a Windows 2003 Standard server running Sql Server 2005 with only 4 gigs of ram.  How in the world does this server even run Sql Server with only 4 gigs of ram?  As I always do, I set the min and max memory settings for Sql Server, with min being 1 gig of ram and max being 2 gigs of ram.  If you are not aware, the Windows 2003 Standard maximum memory is 4 gigs.

As I began to monitor this server, the first thing I did was ask for more ram, since I was seeing Page Life Expectancy issues every day.  If I could get a page to last 3 seconds, I was lucky.  I knew that to get better performance from the Sql Server, I had to solve this issue.  Rebuilding the server was not an option, since development was building a newer version of the product.  To make it even worse, the web and app layers are also on this server.  I know you are not supposed to do this, but remember, I inherited it.  Lucky me!
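Page Life Expectancy is easy to watch with a query; this reads the counter straight from the performance counter DMV (the value is in seconds):

```sql
-- Current Page Life Expectancy, in seconds, from the Buffer Manager counters.
SELECT [object_name],
       counter_name,
       cntr_value AS ple_seconds
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page life expectancy'
  AND [object_name] LIKE '%Buffer Manager%';
```

Running this on a schedule and logging the results gives you the before/after trend when you make changes like the index fixes below.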

Thanks to the great resources at BrentOzar.com, I ran sp_BlitzIndex on the server to see what type of indexing issues might be occurring.  I saw several indexes not being used, missing indexes, and tables with no clustered index.  Last night, with a change control ticket, I fixed all the missing index issues, and on one query which was sometimes timing out, I added the missing clustered index.  I also dropped the indexes not in use.  This morning, when I came in to review the monitoring of the server, I noticed my Page Life Expectancy had changed dramatically.  The server went from 3 seconds to 2 hours, 17 minutes, and 51 seconds.  Wow!  Don’t tell my system administrators, but fixing the index issues also solved the memory pressure issues.  You know a DBA always wants more memory.  Yes, I knew it was going to help, but not to the degree of change I am seeing now.

Check out Brent Ozar and look for the sp_BlitzIndex and sp_Blitz scripts.  Brent and team, you make me look very smart.  Thanks for all you do for the community.

Missing mssqlsystemresource database

Last night, one of our Development servers had a service pack pushed to it and automatically installed.  I was unaware our patching process was automatically patching all of our servers which are not in Production.  Not a good idea; I would rather control this process and deploy the patches in a uniform manner.

After the server was patched, the patch required a reboot.  When the server rebooted, Sql Server would not start up.  After reviewing the Windows Application log and the Sql Server log, I quickly discovered the mssqlsystemresource database was missing.  I searched for it just in case it had been placed somewhere else, but to no avail.

If the database had been misplaced, I could have used this article to move it back to the proper location: http://sqlblog.com/blogs/jonathan_kehayias/archive/2009/01/22/error-945-severity-14-state-2-database-mssqlsystemresource-cannot-be-opened.aspx.

In my situation, the database was not misplaced; it was gone.  Next, I read this thread: http://social.msdn.microsoft.com/Forums/en-US/sqldatabaseengine/thread/9ddc5d52-5835-405f-ad38-ac25022373f4/.  As you can read, I needed to find a server with the same version number and copy the files from the working server to the broken server.  I opened the Sql Server error log to get the version of Sql Server, then scrambled around the office looking for a server with the same version number as the broken server.  No luck.
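On servers that do start, SERVERPROPERTY saves digging through the error log when hunting for a matching build; it even reports the resource database version directly:

```sql
-- Product build and resource database version for this instance.
SELECT SERVERPROPERTY('ProductVersion')             AS product_version,
       SERVERPROPERTY('ProductLevel')               AS product_level,
       SERVERPROPERTY('ResourceVersion')            AS resource_db_version,
       SERVERPROPERTY('ResourceLastUpdateDateTime') AS resource_db_updated;
```

The ResourceVersion and ResourceLastUpdateDateTime properties are available on SQL Server 2008 and later.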

What is a DBA to do?  I had the version number, but what did I need to install?  Thanks to http://sqlserverbuilds.blogspot.com/, which shows us all the versions and links to the download pages, I started to think about how I could get this server back online: create a VM, install Sql Server 2008 R2 with Cumulative Update 1, and then copy the files I needed from the new server to the old server.  Would I need to save the Master and MSDB databases?  This could take a while, and I needed this server back online.

Instead, I went to Add/Remove Programs, found the Sql Server program, and did a repair installation.  After the repair completed, my server was back online.  Yea!

But I was not finished; what did I need to do to prevent this same issue in the future?  The files look like a database, so why can’t I just back them up in Sql Server like other databases?  Microsoft does not let you do that.  But per this article, http://msdn.microsoft.com/en-us/library/ms190190.aspx, you can back up the files for emergency recovery.  If needed, you can use the backup files for recovery, but be careful, as the article states.  Finally, I think I am protected here.  Time to move on to the next issue.