Arthur Fuller

Database Design: A Point in Time Architecture

22 February 2007

Point in Time Architecture (PTA) is a database design that guarantees support for two related but different concepts – History and Audit Trail.

  • History – all information, both current and historical, that, as of this moment, we believe to be true.
  • Audit Trail – all information believed to be true at some previous point in time.

The distinction is that the Audit Trail shows the history of corrections made to the database. Support for History and Audit Trail facilities is notably absent from typical OLTP databases. By "typical", we mean databases that support the traditional Select, Insert, Delete and Update operations. In many cases, typical OLTP databases are perfectly fine for their requirements, but some databases demand the ability to track History and Audit Trail as core requirements. Without these abilities, the database will fail to meet its requirements.

Typical OLTP databases destroy data. This is most obvious with the Delete command, but a moment's thought reveals that the Update command is equally destructive. When you update a row in a table, you lose the values that were there a moment ago. The core concept in PTA is this: no information is ever physically deleted from or updated in the database.

However, some updates are deemed important while others are not. In all likelihood, the data modeler, DBA, or SQL programmer will not know which updates are important and which unimportant without consultation with the principal stakeholders. A mere spelling error in a person's surname may be deemed unimportant. Unfortunately, there is no way to distinguish a spelling error from a change in surname. A correction to a telephone number may be deemed trivial, but again there is no way to distinguish it from a changed number. What changes are worth documenting, and what other changes are deemed trivial? There is no pat answer.

The Insert statement can be almost as destructive. Suppose you insert ten rows into some table today. Unless you've got a column called DateInserted, or similar, then you have no way to present the table as it existed yesterday.

What is Point In Time Architecture (PTA)?

PTA is a database design that works around these problems. As its name implies, PTA attempts to deliver a transactional database that can be rolled back to any previous point in time. I use the term "rolled back" metaphorically: traditional restores are unacceptable for this purpose, and traditional rollbacks apply only to points declared within a transaction.

A better way to describe the goal of a PTA system is to say that it must be able to present an image of the database as it existed at any previous point in time, without destroying the current image. Think of it this way: a dozen users are simultaneously interrogating the database, each interested in a different point in time. UserA wants the current database image; UserB wants the image as it existed on the last day of the previous month; UserC is interested in the image of the last day of the previous business quarter; and so on.

Requirements of PTA

Most obviously, physical Deletes are forbidden. Also, Inserts must be flagged in such a way that we know when the Insert occurred. Physical Updates are also forbidden; otherwise we lose the image of the rows of interest prior to the Update.

What do we need to know?

  • Who inserted a row, and when.
  • Who replaced a row, and when.
  • What did the replaced row look like prior to its replacement?

We can track which rows were changed when in our PTA system by adding some standard PTA columns to all tables of PTA interest. I suggest the following:

  • DateCreated – the actual date on which the given row was inserted.
  • DateEffective – the date on which the given row became effective.
  • DateEnd – the date on which the given row ceased to be effective.
  • DateReplaced – the date on which the given row was replaced by another row.
  • OperatorCode – the unique identifier of the person (or system) that created the row.

Notice that we have both a DateCreated column and a DateEffective column, which could be different. This could happen, for example, when a settlement is achieved between a company and a union, which guarantees specific wage increases effective on a series of dates. We might know a year or two in advance that certain wage increases will kick in on specific dates. Therefore we might add the row some time in advance of its DateEffective. By distinguishing DateCreated from DateEffective, we circumvent this problem.
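For instance, a wage increase negotiated in advance might be recorded as follows (the WageRates table and its columns are hypothetical, used only to illustrate the idea):

```sql
-- Recorded today (DateCreated defaults to GETDATE()),
-- but not effective until a future date agreed in the settlement.
INSERT INTO dbo.WageRates
   (EmployeeID, HourlyRate, DateEffective, OperatorCode)
VALUES (1234, 27.50, '20080101', 'jlarue')
```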

Dealing with inserts

The easiest command to deal with is Insert. Here, we simply make use of our DateCreated column, using either a Default value or an Insert trigger to populate it. Thus, to view the data as it stood at a given point in time, you would perform the Select using the following syntax:

SELECT * FROM AdventureWorks.Sales.SalesOrderHeader
WHERE DateCreated <= [some PTA date of interest]

This scenario is all fine and dandy assuming that you are creating the table in question. But you may be called upon to backfill some existing tables.

If you are retrofitting a database to support PTA, then you won't be able to use a Default value to populate the existing rows. Instead you will have to update the existing rows to supply some value for them, perhaps the date on which you execute the Update command. To that extent, all these values will be false. But at least it gives you a starting point. Once the DateCreated column has been populated for all existing rows, you can then alter the table and either supply a Default value for the column, or use an Insert trigger instead, so that all new rows acquire their DateCreated values automatically.
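As a sketch of the retrofit, assuming a pre-existing table (the table name here is hypothetical; run each statement as its own batch):

```sql
-- Add the column as NULLable, since existing rows have no value yet.
ALTER TABLE dbo.Orders ADD DateCreated datetime NULL
GO
-- Backfill existing rows with the retrofit date: admittedly false,
-- but a defensible starting point.
UPDATE dbo.Orders SET DateCreated = GETDATE()
GO
-- From now on, stamp new rows automatically.
ALTER TABLE dbo.Orders
   ADD CONSTRAINT DF_Orders_DateCreated DEFAULT (GETDATE()) FOR DateCreated
GO
```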

Dealing with deletes

In a PTA architecture, no rows are physically deleted. We introduce the concept of a "logical delete". We visit the existing row and flag it as "deleted on date z." We do this by updating its DateEnd column with the date on which the row was "deleted". We do not delete the actual row, but merely identify it as having been deleted on a particular date. All Select statements interrogating the table must then observe the value in this column.

SELECT * FROM AdventureWorks.Sales.SalesOrderHeader
WHERE DateEnd IS NULL OR DateEnd > [PTA_date]

Any row logically deleted after our PTA date of interest is therefore assumed to have logically existed up to our date of interest, and ought to be included in our result set.
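Against the Test_PTA_Table defined later in this article, a logical delete might be sketched as follows (the key value is illustrative):

```sql
-- Logically delete row 42: the row remains physically present, but is
-- excluded from any query whose PTA date falls on or after today.
UPDATE dbo.Test_PTA_Table
   SET DateEnd = GETDATE()
 WHERE TestTablePK = 42
   AND DateEnd IS NULL
```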

Dealing with updates

In PTA, updates are the trickiest operation. No rows are actually updated (in the traditional sense of replacing the current data with new data). Instead, we perform three actions:

  1. Flag the existing row as "irrelevant after date x".
  2. Copy the values of the existing row to a temporary buffer.
  3. Insert a new row, copying most of its values from the old row (those that were not changed), and using the new values for those columns that were changed. We also supply a new value for the column DateEffective (typically GetDate(), though not always, as described previously).

There are several ways to implement this functionality. I chose the Instead-Of Update trigger. Before investigating the code, let's describe the requirements:

  1. We must update the existing row so that its DateReplaced value reflects GetDate() or GetUTCDate(). Its DateEnd value might be equal to GetDate(), or not. Business logic will decide this question.
  2. The Deleted and Inserted tables give us the values of the old and new rows, enabling us to manipulate the values.

Here is the code to create a test table and the Instead-Of trigger we need. Create a test database first, and then run this SQL:

CREATE TABLE [dbo].[Test_PTA_Table](
   [TestTablePK] [int] IDENTITY(1,1) NOT NULL,
   [TestTableText] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
   [DateCreated] [datetime] NOT NULL
      CONSTRAINT [DF_Test_PTA_Table_DateCreated] DEFAULT (getdate()),
   [DateEffective] [datetime] NOT NULL,
   [DateEnd] [datetime] NULL,
   [OperatorCode] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
   [DateReplaced] [datetime] NULL
      CONSTRAINT [DF_Test_PTA_Table_DateReplaced] DEFAULT (getdate()),
   CONSTRAINT [PK_Test_PTA_Table] PRIMARY KEY CLUSTERED
   (
      [TestTablePK] ASC
   ) WITH (PAD_INDEX = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]

Here is the trigger:

CREATE TRIGGER [dbo].[Test_PTA_Table_Update_trg]
   ON [dbo].[Test_PTA_Table]
   INSTEAD OF UPDATE
AS

   SET NOCOUNT ON

   -- Note: for simplicity this version assumes a single-row update;
   -- a production trigger should join to Inserted and process the set.
   DECLARE @key int
   SET @key = (SELECT TestTablePK FROM Inserted)

   -- Close off the existing row...
   UPDATE Test_PTA_Table
      SET DateEnd = GetDate(), DateReplaced = GetDate()
      WHERE TestTablePK = @key

   -- ...and insert the replacement row carrying the new values.
   INSERT INTO dbo.Test_PTA_Table
      (TestTableText, DateCreated, DateEffective, OperatorCode, DateReplaced)
   SELECT TestTableText, GetDate(), GetDate(), OperatorCode, NULL
     FROM Inserted

A real-world example would involve more columns, but I kept it simple so the operations would be clear. With our underpinnings in place, open the table and insert a few rows. Then go back and update one or two of those rows.
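For example, something like the following exercises the trigger (the key value assumes the first row inserted received identity value 1):

```sql
INSERT INTO dbo.Test_PTA_Table
   (TestTableText, DateEffective, OperatorCode, DateReplaced)
VALUES ('first version', GETDATE(), 'jlarue', NULL)

UPDATE dbo.Test_PTA_Table
   SET TestTableText = 'second version'
 WHERE TestTablePK = 1

-- Two rows now exist: the original, closed off with DateEnd and
-- DateReplaced values, and its replacement carrying the new text.
SELECT * FROM dbo.Test_PTA_Table
```

Note that DateReplaced must be supplied explicitly as NULL on insert; otherwise the column's default of getdate() would incorrectly mark a brand-new row as replaced.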

Dealing with selects

Every Select statement must take into account the dates just described, so that a query which is interested in, say, the state of the database as it appeared on December 24, 2006, would:

  • Exclude all data inserted or updated since that day.
  • Include only data as it appeared on that day. Deletes that occurred prior to that date would be excluded.
  • In the case of updated rows, we would be interested only in the last update that occurred prior to the date of interest.

This may be trickier than it at first appears. Suppose that a given row in a given table has been updated three times prior to the point in time of interest. We need to examine every version of the row to determine which were updated or deleted during this time frame, exclude the logical deletes, and include only the version current as of the date of interest.

With our standard PTA columns in place this may not be as tricky as it at first sounds. Remember that at any particular point in time, the rows of interest share the following characteristics.

  • DateCreated is less than or equal to the PTA date of interest.
  • DateEffective is less than or equal to the PTA date.
  • DateEnd is either null or greater than the PTA date.
  • DateReplaced is either null or greater than the PTA date.
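Expressed as a single WHERE clause against our test table, with @PTADate holding the point in time of interest, this becomes:

```sql
DECLARE @PTADate datetime
SET @PTADate = '20061224'

SELECT *
  FROM dbo.Test_PTA_Table
 WHERE DateCreated   <= @PTADate
   AND DateEffective <= @PTADate
   AND (DateEnd      IS NULL OR DateEnd      > @PTADate)
   AND (DateReplaced IS NULL OR DateReplaced > @PTADate)
```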

So for our row that has been updated three times prior to the PTA date:

  • The first and second rows will have a DateEnd and a DateReplaced that are not null, and both will be less than the PTA date.
  • The third row will have a DateEffective less than the PTA date, and a DateReplaced that is either null or greater than the PTA date.

So we can always query out the rows of interest without having to examine columns of different names, but rather always using the same names and the same semantics.

PTA implementation details

The most important thing to realize is that it may not be necessary to trace the history of every column in a table. First of all, some columns, such as surrogate IDs, assigned dates (e.g. OrderDate), and other columns such as BirthDate will never be changed (other than for corrections). Another example is TelephoneNumber. In most applications, it is not significant that your telephone number changed twice in the past year: what we care about is your current number. Admittedly, some organizations may attach significance to these changes of telephone number. That is why we can only suggest a rule of thumb rather than an iron-clad rule. The stakeholders in the organization will help you decide the columns that are deemed "unimportant".

What then qualifies as an important column? The rule of thumb is that important columns are those that have changeable attributes, and whose changes have significance.

Columns with changeable attributes are often called Slowly Changing Dimensions (SCDs). However, just because an attribute value changes, that doesn't imply that the change is significant to the business. There are two types of SCD:

  • Type 1 – columns where changes are of little or no interest to the organization
  • Type 2 – columns where changes must be tracked and history recorded.

An obvious example of a Type 2 SCD is EmployeeDepartmentID. Typically, we would want to be able to trace the departments for which an employee has worked. But again, this may or may not be important to a given organization. What we can say is this: it is rarely the case that all columns within a table are considered Type 2.

Once you have defined the Type 1 and Type 2 columns, you can then devise the programmatic objects required to handle both types. The Type 1 code won't bother with logical updates; it will perform a simple update, replacing the old value with a new one and not documenting the change in detail. The Type 2 code will follow the rules for logical updates and deletes.
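As a sketch of the two code paths, using a hypothetical Employees table in which Telephone is Type 1 and Surname is Type 2:

```sql
DECLARE @key int
SET @key = 1234  -- the row being changed (illustrative)

-- Type 1 column (Telephone): a plain update; no history is kept.
UPDATE dbo.Employees
   SET Telephone = '555-0199'
 WHERE EmployeePK = @key

-- Type 2 column (Surname): close off the current row...
UPDATE dbo.Employees
   SET DateEnd = GETDATE(), DateReplaced = GETDATE()
 WHERE EmployeePK = @key

-- ...and insert a replacement row carrying the new value.
INSERT INTO dbo.Employees
   (Surname, Telephone, DateCreated, DateEffective, OperatorCode)
SELECT 'Roberts', Telephone, GETDATE(), GETDATE(), OperatorCode
  FROM dbo.Employees
 WHERE EmployeePK = @key
```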

Using domains

Depending on the development tools you use, you may or may not be able to take advantage of domains. (I am a big fan of ERwin and PowerDesigner, and almost never develop a data model without using them, except for the most trivial problems.)

In terms of data-modeling, a domain is like a column definition, except that it is not related to a table. You create a collection of domains, specifying their default values, description, check constraints, nullability and so on, without reference to any given table. Then, when you create individual tables, instead of supplying a built-in data type for a column, you specify its domain, thus "inheriting" all the attributes that you defined earlier. The less obvious gain is that should you need to change the domain definition (for example from int to bigint, smalldatetime to datetime, or varchar(10) to char(10)), you make the change in exactly one place, and then forward-engineer the database. All instances of the domain in all tables will be updated to correspond to the new domain definition. In a database comprising hundreds of tables, this approach can be a huge time-saver.

Although I love domains, I have found one problem with them. In my opinion, there ought to be two kinds of domains, or rather a double-edged domain. Consider a domain called CustomerID. Clearly, its use in the Customers table as a PK is different from its use in various related tables as an FK. In the Customers table it might be an int, Identity(1,1), whereas in the related tables it will still be an int, but not an identity key. To circumvent this problem, I typically create a pair of domains, one for the PK and another for all instances of the domain as an FK.

Sample transactions

There is no panacea for creating a PTA database. However, using the examples provided here, plus some clear thinking and a standard approach, you can solve the problems and deliver a fully compliant PTA system.

Assume a Hollywood actress who marries frequently, and who always changes her surname to match her husband's. In a PTA, her transaction record might look like this: 

Table 1: Persons table with PTA columns and comment.

PersonID | Given | Surname | DateCreated | DateEffective | DateEnd   | DateReplaced | OperatorCode | Description
-------- | ----- | ------- | ----------- | ------------- | --------- | ------------ | ------------ | -----------
1234     | Mary  | O'Hara  | 01-Jan-04   | 01-Jan-04     | NULL      | NULL         | jlarue       | Prior to Wedding
1234     | Mary  | O'Hara  | 01-Jan-04   | 01-Jan-04     | 01-Jun-04 | 01-Jun-04    | jlarue       | Ends Maiden Name
2345     | Mary  | Roberts | 01-Jun-04   | 02-Jun-04     | NULL      | NULL         | bhoskins     | Adopts New Surname
2345     | Mary  | Roberts | 01-Jun-04   | 02-Jun-04     | 12-Dec-05 | 12-Dec-05    | cwebb        | Ends Marriage
3456     | Mary  | Kent    | 12-Dec-05   | 12-Dec-05     | NULL      | NULL         | cwebb        | Remarries and adopts new Surname
3456     | Mary  | Kent    | 12-Dec-05   | 12-Dec-05     | 06-Jun-06 | 06-Jun-06    | bhoskins     | Ends Marriage
4567     | Mary  | Clark   | 06-Jun-06   | 07-Jun-06     | NULL      | NULL         | bhoskins     | Remarries and adopts new Surname

Here we have the history of Mary's three marriages. Mary O'Hara entered the database on 01-Jan-04. In June of the same year she adopted, through marriage, the surname Roberts. This is reflected in our PTA database with the appropriate value inserted into the DateEnd and DateReplaced columns of Mary's row. We then insert a new row into the Persons table, with a new PersonID value, the updated surname and the correct DateCreated and DateEffective values. This process is repeated for each of Mary's subsequent marriages, so we end up with four rows in the Persons table, all referring to the same "Mary".

These four primary keys all point to the same woman. Her surname has changed at various points in time. To this point, we have considered History as referring to the history of changes within the tables. However, this example illustrates another concept of history: the history of a given object (in this case, a person) within the database. Some applications may not need to know this history, while others may consider this critical. Medical and police databases come immediately to mind. If all a criminal had to do to evade his history were simply to change his surname, we would have problems in the administration of justice.

One might handle this problem by adding a column to the table called PreviousPK, inserting in each new row the PK of the row it replaces. This approach complicates queries unnecessarily, in my opinion. It would force us to walk the chain of PreviousPKs to obtain the history of the person of interest. A better approach, I think, would be to add a column called OriginalPK, which may be NULL. A brand-new row would contain a null in this column, while all subsequent rows relating to this person would contain the original PK. This makes it trivial to tie together all instances. We can then order them using our other PTA columns, creating a history of changes to the information on our person of interest.

Table 2: Persons table with PTA and OriginalPK tracking column.

PersonID | Given | Surname | DateCreated | DateEffective | DateEnd   | DateReplaced | OperatorCode | OriginalPK
-------- | ----- | ------- | ----------- | ------------- | --------- | ------------ | ------------ | ----------
1234     | Mary  | O'Hara  | 01-Jan-04   | 01-Jan-04     | NULL      | NULL         | jlarue       | NULL
1234     | Mary  | O'Hara  | 01-Jan-04   | 01-Jan-04     | 01-Jun-04 | 01-Jun-04    | jlarue       | NULL
2345     | Mary  | Roberts | 01-Jun-04   | 02-Jun-04     | NULL      | NULL         | bhoskins     | 1234
2345     | Mary  | Roberts | 01-Jun-04   | 02-Jun-04     | 12-Dec-05 | 12-Dec-05    | cwebb        | 1234
3456     | Mary  | Kent    | 12-Dec-05   | 12-Dec-05     | NULL      | NULL         | cwebb        | 1234
3456     | Mary  | Kent    | 12-Dec-05   | 12-Dec-05     | 06-Jun-06 | 06-Jun-06    | bhoskins     | 1234
4567     | Mary  | Clark   | 06-Jun-06   | 07-Jun-06     | NULL      | NULL         | bhoskins     | 1234

Given a point in time of 21-Dec-2005, the row of interest is the penultimate row: the row whose DateEffective value is 12-Dec-05 and whose DateEnd is 06-Jun-06. How do we identify this row?

SELECT * FROM Persons
WHERE OriginalPK = 1234
AND DateEffective <= '21-Dec-2005'
AND (DateEnd IS NULL OR DateEnd > '21-Dec-2005')

The DateEffective value must be less than or equal to 21-Dec-2005, and the DateEnd must be either NULL or greater than 21-Dec-2005.
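To retrieve Mary's entire history in order, remember that the original row carries a NULL OriginalPK, so we must match on either column:

```sql
SELECT *
  FROM Persons
 WHERE PersonID = 1234
    OR OriginalPK = 1234
 ORDER BY DateEffective, DateCreated
```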

Dealing with cascading updates

Let us now suppose that during the course of our history of Mary O'Hara, she changed addresses several times. Her simple changes of address are not in themselves problematic; we just follow the principles outlined above for the PersonAddresses table. If her changes of address correspond to her marriages, however, the waters muddy slightly, because this implies that she has changed both her name and her address. But let's take it one step at a time.

Mary moves from one flat to another, with no other dramatic life changes. We stamp her current row with a DateEnd and DateReplaced (which, again, might differ). We insert a new row in PersonAddresses, marking it with her current PK from the Persons table, and adding the new address data. We mark it with a DateEffective corresponding to the lease date, and leave the DateEnd and DateReplaced null. Should her surname change within the scope of this update then we mark her row in Persons with a DateEnd and a DateReplaced, then insert a new row reflecting her new surname. Then we add a new row to PersonAddresses, identifying it with Mary's new PK from Persons, and filling in the rest of the data.

Each time Mary's Persons row is logically updated, requiring a new row with a new PK, we must also logically update the dependent row(s) in PersonAddresses, and in every other related table, inserting new rows that reference the new PK in the Persons table. Fortunately, we can trace the history of Mary's addresses using the Persons table.

In more general terms, the point to realize here is that every time a Type 2 update occurs in our parent table (Persons, in this case), a corresponding Type 2 update must occur in every related table. How complex these operations will be clearly depends on the particular database and its requirements. Again, there is no hard-and-fast rule to decide this.
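A sketch of such a cascade for PersonAddresses (the column names are hypothetical, and @OldPK and @NewPK are assumed to hold the old and new Persons keys from the parent update):

```sql
DECLARE @OldPK int, @NewPK int
-- (@OldPK and @NewPK would be assigned by the parent logical update)

-- Re-insert the current address rows under the new Persons PK...
INSERT INTO dbo.PersonAddresses
   (PersonID, Street, City, DateCreated, DateEffective, OperatorCode)
SELECT @NewPK, Street, City, GETDATE(), GETDATE(), OperatorCode
  FROM dbo.PersonAddresses
 WHERE PersonID = @OldPK AND DateEnd IS NULL

-- ...then close off the rows that referenced the old PK.
UPDATE dbo.PersonAddresses
   SET DateEnd = GETDATE(), DateReplaced = GETDATE()
 WHERE PersonID = @OldPK AND DateEnd IS NULL
```

Inserting before closing off ensures the UPDATE's filter on @OldPK cannot touch the freshly inserted rows, which already carry @NewPK.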

Dealing with cascading deletes

A logical delete is represented in PTA as a row containing a not-null DateEnd and a null DateReplaced. Suppose we have a table called Employees. As we know, employees come and go. At the same time, their IDs are probably FKs into one or more tables. For example, we might track SalesOrders by EmployeeID, so that we can pay commissions. A given employee departs the organization. That certainly does not mean that we can delete the row. So we logically delete the row in the Employees table, giving it a DateEnd that will exclude this employee from any lists or reports whose PTA date is greater than said date – and thus preserving the accuracy of lists and reports whose PTA date is prior to the employee's departure.

On the other hand, suppose that our firm sells products from several vendors, one of whom goes out of business. We logically delete the vendor as described above, and perhaps we logically delete all the products we previously purchased from said vendor.

NOTE:
There is a tiny glitch here, beyond the scope of this article, but I mention it because you may have to consider what to do in this event. Suppose that you still have several units on hand that were purchased from this vendor. You may want to postpone those related deletes until the inventory has been sold. That may require code to logically delete those rows whose QuantityOnHand is zero, and later on to revisit the Products table occasionally until all this vendor's products have been sold. Then you can safely logically delete those Products rows.
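The deferred pass might be sketched like this (VendorID and QuantityOnHand are hypothetical Products columns):

```sql
DECLARE @DefunctVendorID int
SET @DefunctVendorID = 42  -- illustrative

-- Run periodically: logically delete this vendor's products
-- only once their stock is exhausted.
UPDATE dbo.Products
   SET DateEnd = GETDATE()
 WHERE VendorID = @DefunctVendorID
   AND QuantityOnHand = 0
   AND DateEnd IS NULL
```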

Summary

The first time you confront the challenge of implementing Point in Time Architecture, the experience can be quite daunting. But it is not rocket science. I hope that this article has illuminated the steps required to accomplish PTA. As pointed out above, some applications may require the extra step of tracking the history of individual objects (such as Persons), while others may not need this. PTA is a general concept. Domain-specific implementations will necessarily vary in the details. This article, I hope, will serve as a practical guideline. I emphasize that there are rarely hard-and-fast rules for implementing PTA. Different applications demand different rules, and some of those rules will only be discovered through careful interrogation of the stakeholders. You can do it!

Arthur Fuller

Author profile:

Arthur Fuller has been developing database applications for more than 20 years. He frequently works with Access ADPs, Microsoft SQL 2000 and 2005, MySQL, and .NET.




Subject: RE: Audit mechanism...
Posted by: Anonymous (not signed in)
Posted on: Thursday, February 22, 2007 at 7:01 PM
Message: Excellent article. This is straightforward technical advice, and there are a lot of professional programmers out there who don't follow it. Hence I had to clean out the pieces and re-design the database with the use of triggers.

Subject: Errors?
Posted by: Anonymous (not signed in)
Posted on: Friday, February 23, 2007 at 5:09 PM
Message: Please double-check your comparison operators in the sections "Dealing with deletes" and "Dealing with selects." Some of them seem backward to me.

Subject: Alternate designs
Posted by: Anonymous (not signed in)
Posted on: Friday, February 23, 2007 at 5:16 PM
Message: Reasonable approach. Good detail. Agree that database designers often miss this crucial need.

Personally I prefer using two tables, one to store the logical entity (e.g., Person) that only contains type 1 columns and a 1:Many table (e.g., PersonVersion) that stores each of its versions and has all the type 2 columns. The logical entity retains the same PK through its lifetime. Requires less date comparisons, too, especially for the typical case of wanting the current version.

I've seen other successful approaches, too. Depends on your particular needs, of course.

Thanks again for the nice article.

Subject: New ID's
Posted by: Anonymous (not signed in)
Posted on: Tuesday, February 27, 2007 at 7:13 PM
Message: This is an interesting approach. I am in the midst of a PTA type project. In my case, I am using "history tables" for the type 2 values. But queries are complicated because I have to join to each of 4 history tables (each record in the parent table has 4 values that must be tracked for history) and check the dates on each of them. Anyway, I don't understand why in your examples you assign new ID values when the Type 2 data changes? Why can't the PersonId stay the same. That really complicates things.
Thanks.

Subject: Addition to this approach
Posted by: Anonymous (not signed in)
Posted on: Wednesday, February 28, 2007 at 10:30 AM
Message: I've used a PTA approach very similar to this for several projects. I find it useful to also set up a few standard views for each entity in the db; 1. a view that shows all logical records with the valid start and end dates, 2. a view that shows the "current" values, 3. one or more views for default time intervals as needed for the application (1 hour ago, yesterday at midnight, last month end, last quarter end, etc.) These views can sometimes help speed up some select statements as well as providing a place to relate record level security.

Subject: amended article title
Posted by: Anonymous (not signed in)
Posted on: Wednesday, February 28, 2007 at 10:43 AM
Message: PiTA!!


Subject: re: Addition to this Approach
Posted by: Anonymous (not signed in)
Posted on: Wednesday, February 28, 2007 at 10:46 AM
Message: "These views can sometimes help speed up some select statements as well as providing a place to relate record level security."
I wanted to do something like this, but it only sped up development – views negatively affect performance. Writing queries is a struggle, but you'll get good at using aliases and MAX functions.

Subject: Tigger Update?
Posted by: Anonymous (not signed in)
Posted on: Wednesday, February 28, 2007 at 10:58 AM
Message: This is really good timing for me. I just had a discussion with one of our developers about this. I forwarded the link. Thanks.

One minor detail - unless it's not caught - is to use a join with the inserted table to do the update in the trigger. The other option is to raise an error in the trigger if there is more than one record being updated. The use of a local variable in a trigger normally is a red flag for me.

Subject: Normalization
Posted by: Anonymous (not signed in)
Posted on: Wednesday, February 28, 2007 at 11:26 AM
Message: In one of our applications, we have implemented something similar to this. The difference being (for right or wrong) we have an Archive table with the history and an Actual table with the current value. I like your approach better and am a true believer in not deleting data. I cannot tell you how many times I have been asked who made a change to a record and when it was changed. With the current processes, these questions can be answered about 50% of the time.
Upon developing our next application, however, we gave consideration to the issue of normalization, or in this case, denormalization. In theory, an attribute value should occur once and only once in its object's table, regardless of time. If a table has 30 columns and only one of the columns is updated (assuming it is worthy of archiving), 29 columns will be redundant. The issue of disk space, once a major consideration in data storage (i.e. 6 character dates) is now a non-issue. Disk space is cheap so just add another hard drive. Unfortunately, it overlooks the larger issue of normalization, which should be the major determining factor in data storage.
The other major factor to weigh in the archiving process is the data overhead (storage and retrieval processes). The overhead increases tremendously when a normal path is chosen. Instead of row (or record) archiving, history is now being done at the field level. In order to get a record for an object at any given point in time, each field's state is determined by locating the field value as of that date/time and joining it with the other object's fields. Tables of meta data can be used to determine which fields will be archived.
Admittedly, this creates much more database management overhead, but, in some cases, data retrieval may be increased due to normalization.
That is my two cents worth.

Subject: PK?
Posted by: Anonymous (not signed in)
Posted on: Wednesday, February 28, 2007 at 5:20 PM
Message: I can't spot a consistent PK in the example. That leads me to surmise that this approach is theoretical and hasn't been implemented in a real world db yet. Is that right?

Subject: Further to the trigger update
Posted by: Anonymous (not signed in)
Posted on: Wednesday, February 28, 2007 at 5:24 PM
Message: Good article. I can see the benefits of this design.

Regarding the trigger update however, if you use a variable as you have done, then when a multi row update occurs, you won't capture the full extent of the changes.

It is important to remember that triggers in some databases behave differently from others.

For instance, SQL Server triggers run once per update statement, even one that affects multiple rows, so you have to deal with the data as a set.

Subject: Nice Job
Posted by: Anonymous (not signed in)
Posted on: Wednesday, February 28, 2007 at 7:41 PM
Message: This was definitely worth the read... nice job, Arthur.

Subject: Normalized solution
Posted by: Anonymous (not signed in)
Posted on: Thursday, March 01, 2007 at 1:58 AM
Message: If you do the normalization correctly, then you will see that you will have one table for type1 columns and each type2 column will be in its own table (unless multiple type2 columns must change at the same time, in which case they should naturally be in the same table).

If you can create a table for each group of users showing the start and end dates they currently use, then a view can be created for each entity that contains all the many joins needed. Don't open this view :-) Most RDBMSs will then be able to optimize the view by compilation, caching and even the addition of indexes to the view.

Updates (or triggered inserts) will need to understand which tables to operate on.

This properly normalized logical system will be fast and efficient because its physical representation will be smaller (no repeated data), have smaller indexes, and - in general - optimal usage of pages.

One additional comment I would make is that the user should be able to decide whether an update is one that affects the validity date or not on type2 columns. What could be termed 'data refactoring' where spelling, grammar and clarity are improved without changing the meaning of the data might not change the validity date.

Actually, this solution looks so good, I think I will do away with my snapshot databases and frozen tables and implement it myself:-)

Subject: Your comments
Posted by: Anonymous (not signed in)
Posted on: Thursday, March 01, 2007 at 8:40 AM
Message: Thank you all for your intelligent and thorough comments. I haven't got much time at the moment (I'm a witness at a trial today and have to leave in a moment), but when I return I shall respond in detail to the questions you've posed.

Thanks again. It's very gratifying to know that so many people read the piece, and to know that the piece was useful.

Subject: Cascading Updates/Deletes
Posted by: Anonymous (not signed in)
Posted on: Thursday, March 01, 2007 at 10:06 AM
Message: A very good article. I hadn't considered the cascading updates and deletes. It seems to me that many applications may require the ability to relate a current vendor (with a new gold status) to an older transaction and the historical state of the vendor at the time of the transaction (only silver status at the time). It seems very costly in terms of space to duplicate all related records with every change to the parent table, especially if there are many related rows and frequent changes to the parent.

Subject: Tracking "used" records
Posted by: danjs (view profile)
Posted on: Thursday, March 01, 2007 at 10:12 AM
Message: We have the need for a PTA in our systems, but I would like to achieve these use cases and specifications.

For example with storing a person's home address:
- User can enter address information; simple Insert into table with current date as effective date
- User can change address information;
-- If address information is "unused" then just update data and don’t change the effective date. This allows a user to correct a simple data entry error without having to create a new record.
-- If existing address information is "used" then insert new record with the current date as effective date. This keeps the audit/history of the change when the information was used for some purpose.
-- A used flag in the address table indicates if the information was used for some purpose or not. For example if a product was shipped to an address then the address information was "used" and the history must be preserved. Any subsequent change to the information demands the system insert a new record instead of updating the existing record.
- User can enter address change to take effect on a given date (ex. On May 1st 2007, John Doe will be moving from New York to New Jersey)
-- If an effective date is set with the address change then insert new record using the provided effective date.

Any experience with such a system?
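For what it's worth, the rules above can be sketched as a single save procedure (schema and names are hypothetical):

```sql
CREATE PROCEDURE SavePersonAddress
    @PersonID  int,
    @Street    varchar(100),
    @City      varchar(50),
    @Effective datetime        -- caller passes GETDATE() or a future move date
AS
BEGIN
    IF EXISTS (SELECT 1 FROM PersonAddress
               WHERE PersonID = @PersonID AND DateEnd IS NULL AND Used = 1)
    BEGIN
        -- The current address has been "used": preserve it and insert a version
        UPDATE PersonAddress
        SET DateEnd = @Effective
        WHERE PersonID = @PersonID AND DateEnd IS NULL

        INSERT INTO PersonAddress (PersonID, Street, City, DateEffective, Used)
        VALUES (@PersonID, @Street, @City, @Effective, 0)
    END
    ELSE
        -- Unused: a plain correction, effective date left untouched
        UPDATE PersonAddress
        SET Street = @Street, City = @City
        WHERE PersonID = @PersonID AND DateEnd IS NULL
END
```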

Subject: Commercial Solutions?
Posted by: Anonymous (not signed in)
Posted on: Thursday, March 01, 2007 at 12:15 PM
Message: Are there any commercially available solutions that can achieve some level of auditing?

Subject: PTA
Posted by: Anonymous (not signed in)
Posted on: Thursday, March 01, 2007 at 2:22 PM
Message: Excellent article.
Describes a complex PTA architecture in simple words with good examples.

Subject: Time-oriented SQL databases
Posted by: Anonymous (not signed in)
Posted on: Friday, March 02, 2007 at 6:11 AM
Message: What we're talking about is a temporal database. There are two time dimensions to consider:

Valid time - the time period during which a fact is true with respect to the real world ("effective dates")
Transaction time - the time period during which a fact is stored in the database.

A quick intro here: http://en.wikipedia.org/wiki/Temporal_database
An entire book on Time-Oriented SQL: http://www.cs.arizona.edu/people/rts/tdbbook.pdf

A prototype that considers how to implement transaction time in SQL Server: http://www-users.cs.umn.edu/~mokbel/ImmortalDBDemo.pdf

But at the end of the day, current SQL databases don't provide built-in support for this, so we end up having to implement it all manually using the techniques described in this article. My life would be *so* much easier if DBMSs provided time support out of the box!

In the real world, one big issue with storing loads of versions of the "same" row of data is performance. Will it run like a dog due to vastly more data, when most of the time you still only want the "current" view ?
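To make the two dimensions concrete, here is a minimal hand-rolled bitemporal sketch (names hypothetical), with a separate column pair for each kind of time:

```sql
CREATE TABLE EmployeeSalary (
    EmployeeID int,
    Salary     money,
    ValidFrom  datetime,         -- valid time: when it was true in the world
    ValidTo    datetime NULL,
    TxFrom     datetime,         -- transaction time: when the database held it
    TxTo       datetime NULL     -- NULL = not yet superseded by a correction
);

-- "What do we currently believe the salary was on 15-Jun-2006?"
SELECT Salary
FROM EmployeeSalary
WHERE EmployeeID = 42
  AND ValidFrom <= '20060615'
  AND (ValidTo IS NULL OR ValidTo > '20060615')
  AND TxTo IS NULL;
```

Replacing `TxTo IS NULL` with a comparison against an earlier timestamp answers the audit-trail question instead: what did we believe back then?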

Subject: poor
Posted by: Anonymous (not signed in)
Posted on: Friday, March 02, 2007 at 6:16 AM
Message: not even a mention to temporal dbs...

Subject: Bitemporal databases
Posted by: Anonymous (not signed in)
Posted on: Friday, March 02, 2007 at 7:40 AM
Message: Look at the extensive work of Snodgrass, Jensen et al. on bitemporal databases:

http://www.cs.auc.dk/TimeCenter/pub.htm


Subject: Temporal databases
Posted by: Anonymous (not signed in)
Posted on: Friday, March 02, 2007 at 6:10 PM
Message: Those who mention temporal databases are not being entirely fair. These things exist only in the minds and papers of academics. No production-quality DBMS handles temporal databases.

Subject: Time oriented SQL database
Posted by: Anonymous (not signed in)
Posted on: Friday, March 02, 2007 at 6:16 PM
Message: "In the real world, one big issue with storing loads of versions of the "same" row of data is performance. Will it run like a dog due to vastly more data, when most of the time you still only want the "current" view ?"

In a normalized model, the current data is an entity. Each type2 (in Arthur's terminology) attribute, however, is not part of the same entity and should be in a table on its own as proposed in the "Normalized solution" comment above.

This means that most of the time the database can ignore the history, which only takes up the space needed to store changed data. The database will obviously have a lot more tables, but these will simply be ignored unless history is requested.

Subject: Consistent PK
Posted by: Anonymous (not signed in)
Posted on: Saturday, March 03, 2007 at 9:03 AM
Message: "I can't spot a consistent PK in the example. That leads me to surmise that this approach is theoretical and hasn't been implemented in a real world db yet. Is that right?"

The listing in Table 1 was intended to illustrate a halfway-there approach. It lacks the ability to trace the ancestry of an original PK. That was the point I attempted to make in subsequent paragraphs. Adding the column OriginalPK and "inheriting" it in subsequent updated rows eliminates the problem of walking the chain recursively. Apparently I didn't make this sufficiently clear. Table 1 was intended as a point in time, so to speak -- an initial design that we later discover cannot deal with the second meaning of history -- tracking Mary O'Hara through her several residences, marriages and name changes.

Table 2 illustrates the revised version of Table 1, and its ultimate column shows a consistent PK that travels through all the updates.
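To make that concrete, the whole chain can then be fetched in a single pass (using the Persons table and OriginalPK column as described, with Mary's current PK assumed to be 2345):

```sql
-- All versions of Mary's row, oldest first, without recursive walking:
SELECT *
FROM Persons
WHERE OriginalPK = (SELECT OriginalPK
                    FROM Persons
                    WHERE PersonID = 2345)
ORDER BY DateEffective;
```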

Arthur

Subject: History Tables
Posted by: Anonymous (not signed in)
Posted on: Saturday, March 03, 2007 at 9:21 AM
Message: Several readers have commented that my examples do not include History tables, and they are quite correct in pointing this out. In this piece, I attempted to describe the concept of PTA and illustrate some of its wrinkles (cascades). Going into every single detail of the problem would require a book, not an article, and I'm not prepared to write a whole book on the topic yet (though it's a thought).

At any rate, the concept of History tables is quite sound, and was in fact used on the last PTA project I did. In brief, every PTA table had a mirror table called X_History, where X = Client or Product or whatever. When a row was updated, the original was moved to History, leaving only the current row in the active table.

This approach increases performance in some, perhaps most, operations, but when you need to interrogate both the current and the history tables you are forced into a UNION or some other construct to assemble the whole picture. I'm not suggesting this is a bad thing, but merely pointing it out.

In the actual case that gave rise to this article, the History tables were an entire database and resided on another server, further complicating matters. But what is one to do, given that the initial footprint was 500MB and the expected growth was 1TB per year? Add to this the fact that large segments of the data were highly confidential, so there were multiple servers and firewalls between them and even the programmers were not allowed to see the actual data. (It was a medical application. You can see the implications -- I might want to check if the woman I met last night had been tested for AIDS, for example.)

But overall, I definitely subscribe to the notion of History tables, despite the performance hit when assembling the entire history (current + past). Further, these History tables are not to be confused with the OLAP version of the data. History is identical in structure to Current. The OLAP version uses dimensions to aggregate the data in meaningful ways (e.g. how many women who have taken drug xyz subsequently developed kidney ailments?).

So, the full picture of a large database that must adhere to PTA probably consists of 3 databases: Current, History and OLAP. The logic of moving the data from C to H to O is application-specific.
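As a sketch, the UNION mentioned above can be hidden behind a view, assuming Client_History mirrors Client exactly as described:

```sql
-- Reassembles the whole picture on demand; everyday queries keep
-- hitting the small Client table directly.
CREATE VIEW Client_Full AS
SELECT *, 'Current' AS Source FROM Client
UNION ALL
SELECT *, 'History' AS Source FROM Client_History;
```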

Arthur

Subject: Tracking "Used" Records
Posted by: Anonymous (not signed in)
Posted on: Saturday, March 03, 2007 at 9:30 AM
Message: Danjs,

It sounds like you have an interesting problem to solve. I am available to assist, for vast amounts of currency. :)

More seriously, your requirements outstrip what I attempted to present in the article. I tried to present a simple case, to illuminate the issues involved. If you want assistance on this, email me privately (artful@rogers.com). The wrinkles are too wrinkly to explore here.


Subject: Thanks
Posted by: Anonymous (not signed in)
Posted on: Saturday, March 03, 2007 at 9:33 AM
Message: It is most gratifying to receive so many comments on this article, and not only for quantity but also for quality. Clearly, you read it and thought about it and responded with intelligent consideration. Thank you, all.

Arthur

Subject: 4 possibilities
Posted by: Anonymous (not signed in)
Posted on: Sunday, March 04, 2007 at 8:31 AM
Message: We are a small software company and we create a lot of information systems for different users (at least 2 or 3 new ones every year). PTA is a really painful problem and we have tried a lot of options (all are from real applications, each with several GB of data and more than 80 tables):

1. Archive table. Every table has an archive table (mytable_A, mytable2_A) and every update is an insert into the archive table too. Cons: a lot of triggers and tables.

2. Archive data in the same table. Every PTA table has a compound PK (ID and IDN). The row with IDN=0 is the current record. When a record is updated, a new archive row is created (with IDN set to the next number in the sequence).

3. One archive table for all tables, where every column becomes a new row. Create a table with columns: TABLE_NAME, TABLE_ID, VERSION, COLUMN_NAME, USERNAME, DATE, intvalue, numbervalue, datevalue, blobvalue, stringvalue. After an update of any table, the new value of every column is inserted into this table. Cons: a lot of rows in this table.

4. One archive table for all tables, with an XML record. The table has columns: TABLE_NAME, TABLE_ID, VERSION, USERNAME, DATE, XML-DATA. The XML record is produced by XML serialization in Java.
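As a sketch of option 2 (names illustrative), where versioning needs no extra tables:

```sql
CREATE TABLE Customer (
    ID   int,
    IDN  int,                 -- 0 = current row; 1, 2, ... = archived versions
    Name varchar(100),
    PRIMARY KEY (ID, IDN)
);

-- Updating customer 7 inside one transaction: archive, then overwrite.
BEGIN TRANSACTION

INSERT INTO Customer (ID, IDN, Name)
SELECT ID,
       (SELECT MAX(IDN) + 1 FROM Customer WHERE ID = 7),
       Name
FROM Customer
WHERE ID = 7 AND IDN = 0

UPDATE Customer
SET Name = 'Acme Ltd'         -- the new value
WHERE ID = 7 AND IDN = 0

COMMIT TRANSACTION
```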

Subject: Re: possibilities
Posted by: Anonymous (not signed in)
Posted on: Monday, March 05, 2007 at 10:15 AM
Message: So which of the 4 approaches works best for your clients?

Subject: Consistent PK
Posted by: Anonymous (not signed in)
Posted on: Monday, March 05, 2007 at 12:38 PM
Message: "Table 2 illustrates the revised version of Table 1, and its ultimate column shows a consistent PK that travels through all the updates."

Ah, I see it. It is either in the PK column or in the OriginalPK column, depending on which row you look at. How would you create a relationship with another table having PK or OriginalPK as the FK?

Subject: Domains and FK's
Posted by: ASdanz (view profile)
Posted on: Tuesday, March 06, 2007 at 2:07 PM
Message: I use domains extensively with ERStudio and do not have any of the issues you describe with certain attributes, like IDENTITY, being suitable only for PKs and not FKs. ERStudio handles it just fine. I'm surprised that you can't get the same result using ERwin.

I would think that maintaining referential integrity in your PTA design is going to be problematic at best. The whole notion of choosing a good primary key is to choose a value that will never (or very rarely) change. (The main reason so many databases have surrogate keys instead of natural keys for most primary keys.)

Subject: More unsaid than said
Posted by: Anonymous (not signed in)
Posted on: Tuesday, March 06, 2007 at 6:03 PM
Message: On the one hand this article doesn't refer to the significant body of works that deal with temporal database solutions: Date, Darwen and Lorentzos or Snodgrass for example. On the other hand the "Type 1" and "Type 2" SCD terms are appropriated from Kimball (without any attribution) and given incomplete explanations. I would encourage anyone who is seriously interested to study the standard works on this topic.

This field is already overburdened with jargon (see Kimball again). The last thing we need is a new term - "PTA" - for an old concept!

Subject: Interesting - Other Approaches Exist Too
Posted by: Anonymous (not signed in)
Posted on: Wednesday, March 07, 2007 at 12:54 PM
Message: Have extensively used what was available through the former PeopleSoft organization and their solution is significantly simpler. They kept the idea of the key that never changes and tacked on the effective date and an "Active/Inactive" indicator. If you had an employee, with an employee ID, then when I am first recorded, I have that ID, effective dated and "Active". When I change my name, I KEEP the same ID, have a new effective date, and am "Active" - with a new name. When I eventually retire (hopefully) or am fired (please, no - wait, consultant and big bucks???), I KEEP the same ID, have a new effective date, and become "Inactive". This deals with the "what was the state" of each individual real world object being modeled within the database. And, separately, they tacked on what I have over many years lovingly referred to as an "Audit Stamp" - when it was done, who did it, and, if you need it that badly, what function was being used.

In this concept, related tables use my employee ID as the foreign key (which can be a PK type value assigned and should NOT be any type of social insurance number or other "natural looking" key).

Respecting what I learned about normalization rules way back, the idea of storing in this record the end date that matches the begin date of the next record in the chain makes me do a double take (in a logical design, that is; the physical implementation might differ for performance reasons). It may help with performance and be legitimate, but I saw the PeopleSoft applications perform well with large data loads and they did NOT use this approach. They stuck with the simpler one: the effective date of a record implies that the previous record along the timeline ended at the moment this one came into being.

Now - having said all that, I will say that this does appear to be reasonably well thought out from the perspective of writing an article, and I would encourage your hinted-at desire to write an entire book - assuming you have the necessary time available to make such a commitment. Also, thanks for the information - amazing thinking.

Subject: Referential Integrity
Posted by: Anonymous (not signed in)
Posted on: Tuesday, March 27, 2007 at 6:05 AM
Message: This article does seem to gloss over a problem which I believe is central to the physical design of a temporal database, and that is how referential integrity is enforced. In a non-temporal relational database (I use Oracle), referential integrity is enforced by the imposition of FK constraints. If the database becomes temporal then FK constraints can clearly not be used in the same way as an FK constraint can only point to one parent record at a time.

For instance, take the typical employee in a department scenario with an employee and his department both having a DateEffective of 01-Jan-2007 and a DateEnd of 31-Dec-2099, with the employee table having an FK constraint to the PK of the department table. Were the department record to be updated today, there would then be two department records, one with a DateEffective of 01-Jan-2007 and a DateEnd of 26-Mar-2007, the other with a DateEffective of 27-Mar-2007 and a DateEnd of 31-Dec-2099. Whether the PK of the department table is a surrogate key or the business key + DateEffective, the employee must now point to two parent records so clearly the FK constraint cannot be used.

One solution would be, as implied in Arthur's article by his mention of the PersonAddresses table, to include many-to-many resolver tables between each of the main business entities. Another would be to do away with FK constraints altogether and instead use triggers to check the existence of parents on insert and update. Either way, the design of the database is greatly complicated, either by a proliferation of extra tables and the code to maintain them, or by the large amount of trigger code required. (One could, of course, trust (ha!) the application to maintain referential integrity, but, in my experience, this always ends in tears!)

I would be very interested to hear how other people resolve this conundrum.
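As a sketch of the trigger alternative (schema hypothetical), checking that some version of the department covers the employee row's effective date:

```sql
-- Employee rows reference a Department business key; a trigger, not an
-- FK constraint, enforces temporal referential integrity.
CREATE TRIGGER trg_Employee_CheckDept
ON Employee
FOR INSERT, UPDATE
AS
BEGIN
    IF EXISTS (
        SELECT 1
        FROM inserted AS i
        WHERE NOT EXISTS (
            SELECT 1
            FROM Department AS d
            WHERE d.DeptCode = i.DeptCode
              AND d.DateEffective <= i.DateEffective
              AND (d.DateEnd IS NULL OR d.DateEnd >= i.DateEffective)))
    BEGIN
        RAISERROR ('No department version covers this employee row.', 16, 1)
        ROLLBACK TRANSACTION
    END
END
```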

Subject: Updating foreign key
Posted by: Anonymous (not signed in)
Posted on: Thursday, April 12, 2007 at 7:10 AM
Message: I was working on implementing support for historical records in an existing DB when I read this post.

My thoughts and experience with historical data are that a separate log/history table causes too much administrative work; e.g., table changes must be made in two places, and retrieving historical data requires more work.

So this time I decided to keep historical data in the "main" table. However, this gave me trouble with changing all the FKs, since each PK now has to be combined with a "start time" to identify the current value.

Each table that keeps historical data has two extra columns: ST (start time) and ET (end time).
ST has the current timestamp as its default value and ET defaults to NULL.
The active row always has NULL as its ET.

Add a trigger to the table (this is TSQL for Microsoft SQL Server):
ALTER TRIGGER myTrigger
ON myTable
FOR UPDATE
AS
BEGIN
    DECLARE @now DATETIME
    SET @now = GETDATE()

    -- Update the start time on the current row
    UPDATE myTable
    SET ST = @now
    FROM inserted
    INNER JOIN myTable ON myTable.id = inserted.id AND myTable.ET IS NULL

    -- Create the history row from the pre-update values
    INSERT INTO myTable
    SELECT * FROM deleted

    -- Set the end time on the historical row
    UPDATE myTable
    SET ET = @now
    FROM deleted
    INNER JOIN myTable ON myTable.id = deleted.id AND myTable.ST = deleted.ST
END

It should be easy to port this to other flavors of SQL.

Subject: Been thinking about this
Posted by: Anonymous (not signed in)
Posted on: Thursday, April 26, 2007 at 5:23 PM
Message: Hmm, at first I was not happy about the primary keys changing, due to having to create new records for all foreign key references to it.

However, the suggestion to use a many-to-many mapping table that stores the parent's primary key (the one that changes) and the child's foreign key is a good one, as it prevents unnecessary row duplication when only the foreign key reference changes. This means that if you change the parent's primary key, you only need a trigger to create a new reference in the many-to-many table between the new parent primary key and the old child foreign key.

If you are worried about space then this will save a huge amount.

Negatives:
(a) Queries are a bit more complex (although consistent).
(b) Each parent and child table requires a date lookup, whereas the space-filling alternative in the article has a direct relationship.
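A minimal sketch of such a mapping table (names hypothetical): the child order row is written once, and each new vendor version just adds a mapping row.

```sql
CREATE TABLE VendorOrderMap (
    VendorPK      int,        -- PK of a specific version of the vendor
    OrderID       int,        -- the child row, never duplicated
    DateEffective datetime,   -- when this vendor version applied to the order
    PRIMARY KEY (VendorPK, OrderID)
);
```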

Subject: Gotta keep 'em separated
Posted by: Anonymous (not signed in)
Posted on: Monday, May 07, 2007 at 9:28 PM
Message: The concept is a great one, and for the most part I agree (as this is the approach I use myself).

The problem I have is that you want to store 'live' data with deleted data... Good luck with that. That will not only make your data muddy, but it will make your stored procedures more complex (or increase exponentially in number).

You should really consider keeping your archived data in separate tables.

Subject: Wow good article
Posted by: Anonymous (not signed in)
Posted on: Tuesday, May 15, 2007 at 8:02 AM
Message: Nicely explained in clear and concise text, great job.

I am doing something similar in a project now and am using most of this.

Thanks

Subject: History Tables
Posted by: Kurt V (not signed in)
Posted on: Thursday, June 28, 2007 at 3:30 PM
Message: from a previous posting...

"When a row was updated, the original was moved to History, leaving only the current row in the active table.

This approach increases performance in some and perhaps most operations, but when you need to interrogate both the current and the history tables then you are forced into a UNION or some other..."

Rather than waiting until a record is updated before moving the original to the history table, why not add the new record to the history table and update the current table all in one transaction? Then your history table also includes the current data and there is never a need to UNION.
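That sequence might look like this (table and column names hypothetical):

```sql
-- Client holds current rows only; Client_History holds every version,
-- including the current one, maintained in the same transaction.
BEGIN TRANSACTION

UPDATE Client
SET Surname = 'Roberts'
WHERE ClientID = 1234

INSERT INTO Client_History (ClientID, Surname, DateCreated)
SELECT ClientID, Surname, GETDATE()
FROM Client
WHERE ClientID = 1234

COMMIT TRANSACTION
```

Because Client_History also received the original row when it was first inserted, it always holds the complete timeline on its own.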

Subject: Anonymous Comments Disabled
Posted by: Sarah Grady (view profile)
Posted on: Wednesday, December 19, 2007 at 9:57 AM
Message: Due to a high volume of spam, anonymous comments have been disabled.

Subject: Correct use of RTS in PITA
Posted by: D_George (view profile)
Posted on: Tuesday, March 15, 2011 at 12:27 PM
Message: This article continues a common but serious error in the use of the Replaced Timestamp (RTS) field. While the article is otherwise a good introduction to the subject, Table 1 ('Persons table with PTA columns and comment') is incorrect. Table 1 introduces us to Mary O'Hara, a frequently married actress, and her marriage history:

PersonID | Given | Surname | DateCreated | DateEffective | DateEnd   | DateReplaced | Operator Code | Description
---------|-------|---------|-------------|---------------|-----------|--------------|---------------|--------------------
1234     | Mary  | O'Hara  | 01-Jan-04   | 01-Jan-04     |           |              | jlarue        | Prior to Wedding
1234     | Mary  | O'Hara  | 01-Jan-04   | 01-Jan-04     | 01-Jun-04 | 01-Jun-04    | jlarue        | Ends Maiden Name
2345     | Mary  | Roberts | 01-Jun-04   | 02-Jun-04     |           |              | bhoskins      | Adopts New Surname
The RTS field is labelled DateReplaced. In PITA, when this field is null on a record, the record possesses three characteristics:

1. There was never any point in time in the past when belief in the correctness of its data ceased;
2. As a corollary, the record is believed at the present time;
3. For the context it represents, the record is the currently active, i.e. most up-to-date, statement of facts as we know them.
‘Context’ can refer to some object such as real estate property or piece of equipment, or to an attribute such as a person’s surname as in the present case.

When RTS is non-null on a record, its value is the point in time, expressed as a date and time value, when we stop believing that the values in the record are correct. Some of the values may be correct, but we believe the value of at least one of the fields is not. This extends to fields with null values. For example, we may have come to believe that the value of DateEnd in the first record is no longer null as of June 1st, 2004. That means the information contained in the record is no longer in effect.

As the second record shows, the date the record ceases to be in effect is the same date on which we stopped believing that there was no end, namely June 1st, 2004. However the two dates could be different, and the End Date can even be later than the RTS.

The problem with the way the author has structured Table 1 is that, in accordance with the three characteristics just cited associated with a record whose RTS is null, it appears that the first record is still in effect, and that we no longer believe the second record is true. Record 1 does not have a non-null RTS value, so our belief in it continues. Record 2, on the other hand, has an RTS value of June 1st, 2004, so that is the date on which we stopped believing that the information was effective only between January 1st and June 1st of that year. This is exactly the reverse of the author’s intention.
However, if we re-write the first two records as follows,

PersonID | Given | Surname | DateCreated | DateEffective | DateEnd   | DateReplaced | Operator Code | Description
---------|-------|---------|-------------|---------------|-----------|--------------|---------------|--------------------
1234     | Mary  | O'Hara  | 01-Jan-04   | 01-Jan-04     |           | 01-Jun-04    | jlarue        | Prior to Wedding
1234     | Mary  | O'Hara  | 01-Jun-04   | 01-Jan-04     | 01-Jun-04 |              | jlarue        | Ends Maiden Name
2345     | Mary  | Roberts | 01-Jun-04   | 02-Jun-04     |           |              | bhoskins      | Adopts New Surname
Table 1.1
the intention becomes obvious: Prior to June 1st, 2004 Mary’s last name was believed to remain O’Hara forever. A day later, however, she married and changed her last name to Roberts. However, prior to that and since January 1st, her last name was O’Hara, and we continue to believe that to be the case.
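Under these corrected semantics, the "current belief" queries fall out naturally (using the column names from the tables above):

```sql
-- Everything we believe right now, historical and current:
SELECT * FROM Persons WHERE DateReplaced IS NULL;

-- Only the rows currently in effect (believed now, not yet ended):
SELECT *
FROM Persons
WHERE DateReplaced IS NULL
  AND (DateEnd IS NULL OR DateEnd > GETDATE());
```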

I have a somewhat lengthier document giving further background and examples on the use of end dates and the RTS field. If anyone wishes a copy, please email me at my public address.

Subject: Bitemporal database
Posted by: D_George (view profile)
Posted on: Wednesday, April 13, 2011 at 8:31 AM
Message: There's a useful, if incomplete, discussion and history of the topic in Wikipedia: http://en.wikipedia.org/wiki/Temporal_database.

Be sure to read the discussion page too, as it points out some of the deficiencies of the main article. There is a considerable body of academic writing on the subject of temporal data, led as usual by Chris Date and Richard Snodgrass. Your favourite search engine will present a plethora of hits when queried with 'temporal database'.

 
