Click here to monitor SSC

John Magnabosco

SQL Server Development and Data Security

Random Numeric Variance

Published Wednesday, July 29, 2009 5:57 AM

In the efforts of preventing unauthorized access to sensitive data, scrambling production data for testing, staging and development environments is a recommended practice. There are many ways to approach data scrambling. One option for numeric values is a process called "numeric variance".

Numeric variance is a process in which the actual numeric values contained within a specified column are increased or decreased by a given percentage. For example, an original value of a bank account balance for a customer is $500. With a numeric variance of 10% applied, the new value for the account balance will be changed to $550.

The addition of a randomizer will strengthen the results of the numeric variance. This strength is that each row of the affected column will be increased or decreased by a different percentage than the other rows.

The RAND function can be utilized to generate a random number. By leaving out the seed argument, SQL Server will dynamically determine the seed that will be used. If a hard-coded value is entered into the seed argument the value returned will be the same for each execution.

Below is some sample code that can be used to perform a randomized numeric variance:

-- Set original value
DECLARE @OrigVal int
SET @OrigVal = 500

-- Set variance to 10%
DECLARE @VarPct numeric (5,2)
SET @VarPct = 10

The preceding code simply provides us with variables to store our original value ($500) and our maximum variance percentage (10%). This process can be created either as a stored procedure or a user defined function; if created as such, these can be used arguments.

-- Set number range for random number
DECLARE @Range int
SET @Range = (((0-@VarPct)+1)-@VarPct)

-- Set random element
DECLARE @Rand int
SET @Rand = CONVERT(int,(@Range * RAND() + @VarPct))

The preceding code uses the RAND function to create a randomized value. When you attempt to create a user-defined function with RAND function contained within it you will receive the following error:

Msg 443, Level 16, State 1, Procedure <UDF NAME>, Line <LOCATION OF RAND>
Invalid use of a side-effecting operator 'rand' within a function.

A work around this issue is to create a simple view that can be referenced to obtain the RAND value. In our specific example, the resulting @Rand variable will contain the value of -19. This value represents the range from -9% to 10%, including 0%.

-- Determine numeric variance
DECLARE @Variance int
SET @Variance = CONVERT(INT,((@OrigVal*@Rand)/100))

-- Return new value
SELECT @OrigVal + @Variance
GO

The preceding code uses our randomized value and generates a new variance with each execution. For illustration purposes let's say that the @Variance value returned is 35. This value is then added to our original value to change 500 to 535. If @Variance returned -45 the adding of a negative number is the same as subtracting a positive number; therefore 500 + -45 = 455.

This process is repeated for each row. At the end, our result is a successfully scrambled set of numeric values that protects the sensitive data that is contained within these columns.

by Johnm

Comments

No Comments
You need to sign in to comment on this blog

About Johnm

John Magnabosco manages the Data Services Group at one of the fastest growing companies in the United States. He is also a Co-Founder of the Indianapolis Professional Association for SQL Server (IndyPASS), Co-Founder of IndyTechFest, the author of the book titled "Protecting SQL Server Data" and contributing author of "SQL Server MVP Deep Dives Volume 2".
<July 2009>
SuMoTuWeThFrSa
2829301234
567891011
12131415161718
19202122232425
2627282930311
2345678
How to Kill a Company in One Step or Save it in Three
 The majority of companies that suffer a major data loss subsequently go out of business. Wesley David... Read more...

Migrating from OCS 2007 R2 to Lync: Part 4
 Having migrated the rest of our users and legacy resources across and started getting ready to... Read more...

Automated Script-generation with Powershell and SMO
 In the first of a series of articles on automating the process of building, modifying and copying SQL... Read more...

Seth Godin: Big in the IT Business
 Seth Godin has transformed our understanding of marketing in IT. He invented the concept of 'permission... Read more...

Using SQL Test Database Unit Testing with TeamCity Continuous Integration
 With database applications, the process of test and integration can be frustratingly slow because so... Read more...