Every day as we test and develop, we need data to see how a screen will work. If the screen is filling in an address of an employee and the employee is adding their home address state, how does the screen behave? Is the UI allowing just US states or is this a worldwide application? If this is a legacy database system, what values have been allowed in the past? What happens if I run a state report?
Have you ever heard, “but it works on my machine”? Is this because of data perfection in Development and QA or having specific failure conditions? Can you think of all the data scenarios that accompany Production data? What about performance? Why did the application fail? What happens if I add this index?
Here are the reasons I believe you should get a scrubbed version of your production database into your Development, QA and UAT environments.
- Data Selectivity – if you have data in your address table, do you have addresses, which are 90 percent one state, California, or are the states spread out. If your database is only a US database, do you only have US states or do you have other data?
- Application Errors – You need to figure out why the application is failing in Production, you need to validate the error condition. Is there more than one row of data causing this error? Did a data value overflow the data type?
- Data Usage Stats – When you look at the data statistics, what happens if you add an index? Does this explain why sometimes your query is fast and other times it is very slow?
- Data Anomalies –What data are you expecting in the field? Is the data what you are expecting? Do we need to fix the screen to only allow certain values to be entered? What if you have a legacy database which has unexpected data values, how does the application handle the situation if the data can’t be changed?
- Index Maintenance – What is the effect of the index you want to add or remove?
- Data Security – Never move real Production data outside of Production without scrubbing the data first.
All of this should not be done or debugged in Production. Let’s get a copy from Production, scrub it and work with this data. Next, let’s learn how to build a data scrubbing environment.