Rodney

SQL Server and The Holy Grail

Published Thursday, June 21, 2007 9:57 PM

That title should garner some comments, I suspect, or if not then the following questions should:

Does there exist a way to provide real-time - or a close proximity thereof - replication from a SQL production database to a reporting database with the following rquirements:

1.) No source shema changes (even assuming that the source DB does not have primary keys defined)

2.) Does not require SQL Server replication and the overhead and administration implied therein.

3.) Is not a failover/data mirror non-accessible representation of the source data that has to have snapshots for reporting.

4.) Costs less than 60,000 pounds or whatever astronimical amount that translates to in US dollars.

5.) Does not require a team of sales people to meet with you and your boss in person to discuss the "solution" over a lunch that they will pay for surely, preferably after hours, so that they can get you slogged enough to believe that 60,000 pounds is reasonable.

I dare say I have yet to find such a solution. Shall I handcraft one for myself and my company, languish in the accolades for a few weeks and then sell it? I could do it. Really, I could. BUT...that would take precious time away from me babying my other hard working SQL servers that complain little and work for next to nothing.

With the myriad questions in this post, I expect to see some comments. Okay, all 431 of you viewers...stop reading and write something. Does it exist?

by Rodney

Comments

 

Phil Factor said:

It is surprisingly easy to do. The problem that Microsoft, and other commercial suppliers, have is that they can't assume anything about the way a database is to be used. This is why their solutions must be so complex.
If you have proper control over your database application, you've probably got a defined interface based on stored procedures, and you stamp on the fingers of any crazed developers who dare to touch any of the base tables. You can then design your database to execute these procedures on your standby systems as the stored procedures are called, using some nice queuing system like SB. If you feed two identical databases the same input, they ought to stay identical.
Failover is as simple as switching the DSN and changing a row in a 'Globals' table in your 'Primary' and 'Standby'.

I've used a couple of the third-party failover systems. They seem to work, but tend to lull you into a false sense of security. They are no substitute for the DBA rituals we perform. Databases seem to fail in very unexpected ways, occasionally: not always through an obvious hardware fault or database corruption.
June 22, 2007 3:05 AM
 

GSquared said:

A slight variation on Phil's solution: Have two databases that consist of identical tables, constraints, triggers, etc., all procs perform all Insert/Update/Delete actions in both.  All selects are from one or the other (reporting from one, everything else from the other).

That should give you two identical databases, without using log shipping or "replication".

I'm not sure how well it would perform.  I guess a lot of that would depend on hardware configuration and whether the two databases were physically proximate.

Of course, the same idea could work for non-identical tables, just with more work on the part of the procs.

Would kind of defeat much of the purpose of OLTP databases, if you slow down every update/delete/insert so that it can keep an OLAP database up-to-date.  (That seems to be what you're asking about.)

Alternately, if the reporting database doesn't have the indexing and such of a normal OLAP database, and can deal with fast inserts, triggers can populate it pretty easily, without the performance hit of updating a lot of indexes.  That keeps your OLTP database from bogging down, but reduces the usefulness of the OLAP database.

The other thing to do is partition your reporting database.  Have older data heavily indexed, for fast selecting.  Have current data lightly indexed (if at all).  Use partitioned tables and views for this.  Insert real-time data from your OLTP database (through your procs or through triggers) into your current-data tables.  Once the data is more than a few seconds/minutes/hours/days old (depending on performance and needs) and can be retired for reporting purposes, move it to the archived tables and index the heck out of it.  That might work, depending on the archivability of your data.

For example, once an order is fulfilled, move it from live to archived.  It might need an occassional update (refunds, returns, etc.), but most won't.

Once archived, it becomes a major pain to update (lots of indexes), but becomes, for the same reason, fast to select.

Would something like work for what you need?  Slightly slower real-time data, fast archived data, slightly slower OLTP but possibly faster than full-on replication/log-shipping.

(Note: This is a theoretical solution.  I have not tried it.  Might be worse than spending your dinner with some salespeople and spending some money, might not.)
June 26, 2007 2:33 PM
 

Eric Russell said:

Using quantum physics, you could run your DSS queries against a virtually identical copy of your database server located in a parallel universe exactly .0001 miliseconds in the past (or even 1 year in the future!!!). It will have zero impact on your production server, and the only downside is that it will perhaps cause concurrency problems for your parallel twin, who you will probably never meet anyway.
Now that I think about it, this may explain a lot of the sporatic CPU utilization performance problems I've been experiencing lately. Somebody needs to invent quantum email before this gets out of hand...
July 19, 2007 10:02 AM
 

dave007 said:

Hey Rodney,
I ran into this situation, I needed to sync certain tables, and only certain columns of certain tables, from production to standby, and sql replication overhead was unacceptable.

I wrote a proc sp_TableName_Sync for my simplest table, and spent a couple days converting it to a template in CodeSmith (learning codesmith in the process). Now everytime I need to sync a new table, I point the CodeSmith table at it, list the fields I want to exclude as a script property, and generate the proc.

I fire all the sp_TableName_Sync procs from a sp_Tables_Sync proc in a scheduled SQL Job, so it happens every X minutes, or I can fire it manually as needed. It writes to a log table every time it's called, and when there's a change to the table's record(s) it logs the 'sync' for debugging/tracing.

It's quick, and I'm in complete control of each table's sync.
Post a reply and I can fill in the details for you.
July 19, 2007 7:03 PM
 

Rodney said:

dave007...Yes, I would absolutely like to review the details.  One of our partners has scripted a solution with RedGate SQL Data Compare Professional (which supports scripting) but I have not delved into that solution as yet. Once I review that I will post that up as well.
And Eric...I am working on a neucleotide that can store the electron orbital cloud patterns of Hydrogen and propogate the data to the adjacent atom and so one that would in time (relatively) alter the patterns of the existing universe.
July 23, 2007 9:48 AM
You need to sign in to comment on this blog

















<June 2007>
SuMoTuWeThFrSa
272829303112
3456789
10111213141516
17181920212223
24252627282930
1234567
Using Powershell to Generate Table-Creation Scripts
 For all of us who learn best by trying out examples, Bob Sheldon produces a PowerShell script file for... Read more...

Configuring Exchange Server 2007 to Support Information Rights Management
 In Exchange Server 2007, Information Rights management is easy to set up once you have set up the... Read more...

SQL Response: The dim sum interview
 Richard Morris met David and Nigel of the SQL Response team, in a dim sum Restaurant in Cambridge. They... Read more...

Why This SQL Server DBA is Learning Powershell
 Ron describes how he decided to study Powershell as a single scripting system to automate all the... Read more...

Using Covering Indexes to Improve Query Performance
 Designers of database systems will often assume that the use of a clustered index is always the best... Read more...