Click here to monitor SSC
  • Av rating:
  • Total votes: 210
  • Total comments: 66
Nigel Rivett

Identity Columns

12 May 2008

When Nigel Rivett takes us on a tour of the apparently innocuous subject of Identity Columns in TSQL, even the seasoned programmer is due for one or two surprises.

Definition

An identity column has a name, initial seed and step.  When a row is inserted into a table the column will take the value of the curent seed incremented by the step.

Note: An identity column is not guaranteed to be unique nor consecutive. You should always place a unique index on an identity column if your system requires uniqueness.

Creating And Using Identity Columns

We create an example table

CREATE TABLE #a(i INT IDENTITY(1,1), j INT)

This is the usual way you will see an identity used and is the default.  It gives the same result as ...

CREATE TABLE #a(i INT IDENTITY, j INT)

Inserting rows ...

INSERT #a SELECT 1
SELECT * FROM #a i j ----------- ----------- 1 1

... will work but is a bit confusing and may be version dependent. It is a good idea to always name the columns inserted and leave the others to default.

INSERT #a (j) SELECT 1
SELECT * FROM #a
i j ----------- ----------- 1 1 2 1

Note one use of an identity. We have inserted duplicate rows but can separate them by the identity column value. The identity also shows the order in which the rows were inserted.

We can find the current identity seed by using DBCC checkident ...

DBCC checkident (#a)
Checking identity information: current identity value '2', current column value '2'.

So the current seed is 2 - remember the next value will be the current seed plus the step.

Failed Inserts

BEGIN TRAN
INSERT
#a (j) SELECT 1
ROLLBACK TRAN
SELECT
* FROM #a
i j ----------- ----------- 1 1 2 1
DBCC checkident (#a) Checking identity information: current identity value '3', current column value '3'.

The table has not changed but we can see from the checkident that the current seed has been changed and we know that this is used to generate the next value.

Note: The next value is the step added to the current seed; not one more than the max value in the table, or even the step from the last or maximum value.

INSERT #a (j) SELECT 1
SELECT * FROM #a i j ----------- ----------- 1 1 2 1 4 1

The situation above commonly happens when there is an index violation on j.

Values For Original Seed And Step

Note that the initial seed and step can be any integer value

CREATE TABLE #b (i INT IDENTITY(-7,5), j INT)
INSERT #b (j) SELECT 1
INSERT #b (j) SELECT 1
INSERT #b (j) SELECT 1
SELECT * FROM #b i j ----------- ----------- -7 1 -2 1 3 1
CREATE TABLE #c (i INT IDENTITY(1,-3), j INT)
INSERT #c (j) SELECT 1
INSERT #c (j) SELECT 1
INSERT #c (j) SELECT 1
SELECT * FROM #c i j ----------- ----------- 1 1 -2 1 -5 1

Inserting Specific Identity Values

We can insert a specific identity value to override the generated value.  To do this execute a SET IDENTITY_INSERT #a ON and specify the column list explicitly.

SET IDENTITY_INSERT #a ON
INSERT
#a (i,j) SELECT 2,2
SET IDENTITY_INSERT #a OFF
SELECT
* FROM #a
i j ----------- ----------- 1 1 2 1 4 1 2 2

Remember that the identity doesn't guarantee uniqueness? We now have 2 rows with the identity value 2.

What has happened to the seed?

DBCC checkident (#a)
Checking identity information: current identity value '4', current column value '4'.

Note that it is not affected by the previous insert. Let's insert a higher value ...

SET IDENTITY_INSERT #a ON
INSERT
#a (i,j) SELECT 10,3
SET IDENTITY_INSERT #a OFF
SELECT
* FROM #a i j ----------- ----------- 1 1 2 1 4 1 2 2 10 3 DBCC checkident (#a) Checking identity information: current identity value '10', current column value '10'.

This time the seed is updated - that is because the value we inserted was higher than the current seed. It will increase but not decrease.

But what happens if the step is negative?

SELECT * FROM #c

i           j
----------- -----------
1           1
-2          1
-5          1

DBCC checkident (#c)
Checking identity information: current identity value '-5', current column value '-5'.
note: step is -3
SET IDENTITY_INSERT #c ON
INSERT #c (i,j) SELECT -8,2
SET IDENTITY_INSERT #c OFF
SELECT * FROM #c i j ----------- ----------- 1 1 -2 1 -5 1 -8 2 DBCC checkident (#c) Checking identity information: current identity value '-8', current column value '-8'.
SET IDENTITY_INSERT #c ON
INSERT
#c (i,j) SELECT 10,2
SET IDENTITY_INSERT #c OFF
SELECT
* FROM #c i j ----------- ----------- 1 1 -2 1 -5 1 -8 2 10 2 DBCC checkident (#c)

So the update of the seed takes into account the sign of the step.

Changing The Current Seed

We have seen that the current seed can be changed by an insert - but only in the direction of the step. A better way is to use DBCC checkident. This will take a reseed keyword and value to set the seed.

CREATE TABLE #d (i INT IDENTITY (5,2), j INT)
INSERT #d (j) SELECT 1
INSERT #d (j) SELECT 1
SELECT * FROM #d i j ----------- ----------- 5 1 7 1 DBCC checkident(#d) Checking identity information: current identity value '7', current column value '7'.
DBCC checkident(#d, reseed, 2)

Checking identity information: current identity value '7', current column value '2'.

Note this is the first time that the current seed has been different from the last allocated value.

INSERT #d (j) SELECT 2
SELECT * FROM #d i j ----------- ----------- 5 1 7 1 4 2

We can also reset the current seed to it's original value via a truncate table.

Note - a delete does not do this.

TRUNCATE TABLE #d
DBCC checkident(#d)
Checking identity information: current identity value 'NULL', current column value 'NULL'.

INSERT #d (j) SELECT 1
SELECT * FROM #d i j ----------- ----------- 5 1 DBCC checkident(#d) Checking identity information: current identity value '5', current column value '5'.

Finding The Identity Value

A common requirement is to find the identity value for an inserted row. There are several statements associated with this

  • scope_identity
  • ident_current
  • @@identity

scope_identity()
returns the last identity inserted in the current scope and session. This is usualy the only one of these functions that is useful. It is not affected by other connections or tables nor by triggers.
@@identity
will return the last identity value inserted in any scope. This means that if a trigger inserts into a table with an identity then that is the value returned. This means that adding replication or auditing triggers to a database can alter the value of @@identity. In earlier versions of sql server this was the only means of returning the identity value and care had to be taken.
ident_current('table')
returns the last value inserted into that table on any connection. Remember to put the table name in quotes

As stated earlier scope_identity() is probably the only one of these functions that you will need to use.

Using Scope_Identity()

As stated earlier scope_identity() returns the last identity value inserted.

CREATE TABLE #t1 (i INT IDENTITY(5,1), j INT)
INSERT #t1 (j) SELECT 1
SELECT SCOPE_IDENTITY(), MAX(i) FROM #t1 --------------------------------------- ----------- 5 5

scope_identity also returns the value after a rollback

BEGIN TRAN
INSERT
#t1 (j) SELECT 1
ROLLBACK TRAN
SELECT
SCOPE_IDENTITY(), MAX(i) FROM #t1
--------------------------------------- ----------- 6 5

but the value is not updated for a failure due to an index violation although the value is allocated

CREATE UNIQUE INDEX ix ON #t1 (j)
INSERT #t1 (j) SELECT 1
SELECT SCOPE_IDENTITY(), MAX(i) FROM #t1
--------------------------------------- ----------- 6 5
INSERT #t1 (j) SELECT 2
SELECT SCOPE_IDENTITY(), MAX(i) FROM #t1
--------------------------------------- ----------- 8 8

Adding An Identity Column To A Table.

An identity column can be added to a table via an alter table statement. Values will be allocated to the column according to the seed and step.

In this case SCOPE_IDENTITY() will not return an allocated value.

CREATE TABLE #t2 (j INT)
INSERT #t2 (j) SELECT 1
INSERT #t2 (j) SELECT 1
INSERT #t2 (j) SELECT 1
INSERT #t2 (j) SELECT 1
SELECT * FROM #t2
j ----------- 1 1 1 1
ALTER TABLE #t2 ADD i INT IDENTITY (5,2)
SELECT * FROM #t2
j i ----------- ----------- 1 5 1 7 1 9 1 11

This can be useful for dealing with tables with duplicate rows.

Note - an existing column cannot be made into an identity. In this case you must drop the existing column and add a new one.

In this instance the existing values cannot be retained. To retain existing values create a new table and insert using identity_insert.  Also as this will update all rows in a table it can take a very long time on large tables and increase the log size.

Select Into

An identity column can be included in a table created using a select into statement via the identity function

SELECT *, IDENTITY(INT,1,1) AS id INTO #tbl FROM sysobjects

This is useful for creating a table from existing structures

Identity Datatypes

The identity must only contain integer values but the column can be of any numeric datatype (bigint, int, tinyint, numeric, decimal). This can be useful when values greater than be contained in a bigint are required.

Detecting Identity Columns And Their Properties

The existance of an identity column on a table can be checked via

SELECT OBJECTPROPERTY(OBJECT_ID('<tablename>'),'TableHasIdentity')

Which will return 1 if an identity exists on the table.

Similarly ...

SELECT COLUMNPROPERTY(OBJECT_ID('<tablename>'),'<columnname>','IsIdentity')

... Will show if a column has the identity property.

A more useful way of obtaining this information is by using the catalog view sys.identity_columns which returns a row for each column in the database with an identity property.

SELECT TableName = OBJECT_NAME(OBJECT_ID) ,
      
ColumnName = name ,
      
OriginalSeed = seed_value ,
      
Step = increment_value ,
      
LastValue = last_value ,
      
IsNotForReplication = is_not_for_replication
FROM sys.identity_columns

Character Values In An Identity Column

A common request is to hold a composite value and to allocate sequential values depending on the character part
eg.

a1
a2
a3
b1
b2

This is not possible and probably not even desirable. Notice that this column actually contains two values - what would be it's purpose?

Perhaps it is trying to allocate a sequence to the character part. That definition highlights the mistake - the sequence value is separate to the character part and should be a separate column.

Now we can use an identity for the numeric value and easily calculate a consecutive value for the character value from this when accessing the table or in a view.

If it is required to keep the sequence value in the table - maybe for performance reasons then this could be maintained via a trigger.

Bulk Insert

If the table has an identity column then a bulk insert will often fail if the identity values are not held in the text file. The easiest way around this is to create a view on the tabke excluding the identity column and bulk insert into the view. Another option is to create a format file to use with the bulk insert. I would avoid this option if possible as it adds an external object and is more difficult to maintain.

It is tempting to assume that the identity values will be allocated in the order of rows in the text file but this is not the case.  This is often an issue with unstructured data like XML. In this case an XML block cannot be parsed using the identity values. It often will work but cannot be guaranteed - especially if multiple threads are spawned - better not to rely on it.

To deal with such data import into text column (or varchar(max) in v2005+) then parse the data. This might be quite slow. You will not be able to define a column or row terminator and there is an interesting "feature" to be aware of in some versions of sql server:  If the text file length is divisble by 4 the bulk insert would fail without giving an error. Test your version to see if it has this problem and if so you can check the file length and and a dummy character if it's divisible by 4.

To Use An Identity Or Not

It is a question that often raises passions of almost religious fervour and a search will find many threads on the subject.

I have heard people say that every table in a database should have an identity column and that only those should be used in joins. Other people say that they have no place in a relational database and should never be used. I would not subscribe to either of these opinions but would use an identity where it seems sensible. When importing data into staging tables an identity can be useful to identify the rows which may otherwise contain duplicates. In the same situation it can be useful for batching rows to fit in with the memory available for processing.  A lookup table needs an ID - why not make it an identity if it is not allocated from a script. It can be useful for allocating IDs - e.g. a customer ID but be careful about different systems allocating the same ID.

Disaster Recovery

There can be an issue with disaster recovery and standby systems. If the identity values are used in another database then the databases may get out of step. When the standby system is brought on-line there needs to be some means of checking that the values are consistent across the databases

Summary Of Points Covered

  • An identity column has a name, initial seed and step.
  • An identity column is not guaranteed to be unique nor consecutive
  • We can find the current identity seed and change it by using dbcc checkident
  • The next value allocated is the step added to the current seed.
  • An insert failure can change the current seed value.
  • An explicit value may be inserted via set identity_insert on and including the column list.
  • A value explicitly inserted that is more than the current seed in the direction of the step will update the current seed.
  • A truncate table (but not delete) will update the current seed to the original seed value.
  • Scope_identity() can be used to find the last identity value allocated.
  • An identity column can be added to a table but the identity property of an existing column cannot be changed.
  • The identity function may be used to create an identity column on a table created using select into.
  • Identity columns and their properties can be found via sys.identity_columns.
  • Bulk insert cannot be guaranteed to allocate the identity values in the order of rows in a text file.
  • It is sometimes easier to use a view to bulk insert into a table with an identity column.
Nigel Rivett

Author profile:

Nigel spent his formative years working in assembler on IBM Series/1 but retrained in VB when that went out of fashion. He soon realised how little relational database expertise existed in most companies and so started to spend most of his time working on that. He now sticks to architecture, release control / IT processes, SQL Server, DTS, SSIS, and access methods in VB/ASP/.NET/Crystal Reports/reporting services. He has been involved with SQL Server for about 10 years from v4.2 to v2005, and was awarded Microsoft MVP status in 2003. He tries to stay away from anything presentation oriented (see www.mindsdoor.net). Theoretically he is semi-retired but seems to keep being offered potentially interesting work.

Search for other articles by Nigel Rivett

Rate this article:   Avg rating: from a total of 210 votes.


Poor

OK

Good

Great

Must read
Have Your Say
Do you have an opinion on this article? Then add your comment below:
You must be logged in to post to this forum

Click here to log in.


Subject: Very interesting...
Posted by: Anonymous (not signed in)
Posted on: Wednesday, May 28, 2008 at 4:15 AM
Message: ..but please learn the difference between "its" and "it's".

Subject: Nice article
Posted by: Dan Kennedy (view profile)
Posted on: Wednesday, May 28, 2008 at 5:02 AM
Message: ...I await the pro vs anti identity war to begin again...

I too agree that there are definitely certain instances where it is both useful and practical to employ identity columns. It may not fit the purely theoretical laws and they may even be implementation specific but I think most experienced DBAs would agree that they are a valuable tool used in the right place.

Subject: Well written
Posted by: Mike Greene (not signed in)
Posted on: Wednesday, May 28, 2008 at 6:36 AM
Message: I enjoyed this article and was unaware of the checkident function until now. I have been designing databases for over 20 years and I often use the identity as a unique key on smaller tables.
Like everything else it is useful when used carefully.
I always get a kick out of the absolutist types damning things in the name of theory. Some of them should have to actually design a working database every once in a while.

Subject: Identity Columns
Posted by: Anonymous (not signed in)
Posted on: Wednesday, May 28, 2008 at 7:41 AM
Message: I learned quite a bit from this article. Good job and thanks for sharing.

Subject: Identity Columns
Posted by: Anonymous (not signed in)
Posted on: Wednesday, May 28, 2008 at 7:41 AM
Message: I learned quite a bit from this article. Good job and thanks for sharing.

Subject: Identity Columns
Posted by: Bernie Black (not signed in)
Posted on: Wednesday, May 28, 2008 at 7:51 AM
Message: Nice article. I didn't realize a unique field wasn't guaranteed to be unique or consecutive. I'm sure I've told dozens of programmers it was. Oooops......

Subject: The origins of IDENTITY columns
Posted by: SkyBeaver (view profile)
Posted on: Wednesday, May 28, 2008 at 10:18 AM
Message: As most of you know, SQL Server wasn't originally developed by Microsoft, but rather by Sybase. IDENTITY columns are a feature that was put in by the Sybase engineeering team, so Microsoft mostly inherited the design and the semantics from them.

Sybase added the feature just to maintain parity with Oracle's ROWID feature, which is a physical ROW ID that is immutable for the life of the row. It wasn't really intended to be used as the primary key, and certainly not as a foreign key.

IDENTITY columns are a nice shortcut and spare you from having to implement a NEXT NUMBER table. They are terrible if you need to use BCP to move data between databases. You should use them ONLY when you honestly and truly do not care what the value of the column is, including in foreign key relationships.

Subject: group incrementing values
Posted by: rudy from r937.com (not signed in)
Posted on: Wednesday, May 28, 2008 at 10:24 AM
Message: Regarding the section "Character Values In An Identity Column" which Nigel claims is "not possible and probably not even desirable," it's too bad he cannot see where this might be useful. Yes, you would implement it as separate columns, maybe concatenated them into a computed column for display. However, you wouldn't "easily calculate a consecutive value for the character value" because then ur doing it wrong; the character values represent groups, and the numbers should increment separately within each group. This is difficult to achieve in SQL Server, which might account for the "probably not even desirable" sour grapes. By comparison, it's a simple and optional feature of auto_increments in MySQL.

Subject: Bulk Insert error
Posted by: Anonymous (not signed in)
Posted on: Wednesday, May 28, 2008 at 10:56 AM
Message: You state "...there is an interesting 'feature' to be aware of in some versions of sql server: If the text file length is divisble by 4 the bulk insert would fail without giving an error. Test your version to see if it has this problem and if so you can check the file length and and a dummy character if it's divisible by 4."

I've searched and can not locate any reference to this 'feature' (I'd call it an error). Can you point me to specific links that address it?

Subject: Nice article
Posted by: Prashanth (not signed in)
Posted on: Wednesday, May 28, 2008 at 11:58 AM
Message: Nice article. Learnt a lot about Identity

Subject: Is that the Prashanth that fired me from Microsoft?
Posted by: Tony (view profile)
Posted on: Wednesday, May 28, 2008 at 2:46 PM
Message: Hey great article dude.

Under the section To Use An Identity Or Not I would offer a sugestion of an example that states when identity is bad i.e. an historic pitfall you have seen with them in your career. You talk about moving data between diffrent sources but not anything that showed the issue. Just to balance the equation if you will.

Personally I like them but don't use them that much as I feed off of a diffrent source system and people before me wanted to use the source systems keys.

Thanks
Tony

Subject: it's vs its
Posted by: nigelrivett (view profile)
Posted on: Wednesday, May 28, 2008 at 7:09 PM
Message: >> but please learn the difference between "its" and "it's".

Fair comment - I blame the editor and Microsoft grammar checker.

Subject: bcp and primary key
Posted by: nigelrivett (view profile)
Posted on: Wednesday, May 28, 2008 at 7:14 PM
Message: >> They are terrible if you need to use BCP to move data between databases

Don't see why - BCP allows for identities and if you need to do this your design should take it into account.
It's a lack of thought rather than a problem with identity that causes issues.

>> It wasn't really intended to be used as the primary key

I meant to put a bit in the article about people being confused between identity and primary key - maybe I'll do another one aout uses of identities (and bcp?)

As I said they can be useful but each situation should be considered on its own merits.

Subject: Bulk Insert error
Posted by: nigelrivett (view profile)
Posted on: Wednesday, May 28, 2008 at 7:18 PM
Message: >> Test your version to see if it has this problem and if so you can check the file length and and a dummy character if it's divisible by 4."

>> I've searched and can not locate any reference to this 'feature' (I'd call it an error). Can you point me to specific links that address it?

Afraid not. I found it in a system I was writing and notified microsoft and was told it was a know bug. I'm sure someone would have posted about it though.
It's easy to test for - just create files with increasing number of charaters and see if one fails.
For random length files it should fail 25% of the time so if you have the problem you would soon notice.

Subject: To use an identity
Posted by: nigelrivett (view profile)
Posted on: Wednesday, May 28, 2008 at 7:21 PM
Message: >> Under the section To Use An Identity Or Not I would offer a sugestion of an example that states when identity is bad

I was trying to stay away from discussions about whether an identity is a good thing to use or not. Maybe another article, but a lot of people have fixed opinions and rational discussion is difficult.

Subject: Well written.
Posted by: George (view profile)
Posted on: Wednesday, May 28, 2008 at 7:28 PM
Message: Thanks for all that information. It is well written.

Subject: group incrementing values
Posted by: nigelrivett (view profile)
Posted on: Wednesday, May 28, 2008 at 7:50 PM
Message: I've never eally found a need for this.
I've come across a lot of instances where people believe they need it but it's usually a confusion between presentation and data storage.

I can believe there is a case for it in some circumstances but maybe if the fature was added then people would use it when it's not appropriate - but then I think that about a lit of functionality.

Subject: group incrementing values
Posted by: nigelrivett (view profile)
Posted on: Wednesday, May 28, 2008 at 8:26 PM
Message: I've never eally found a need for this.
I've come across a lot of instances where people believe they need it but it's usually a confusion between presentation and data storage.

I can believe there is a case for it in some circumstances but maybe if the fature was added then people would use it when it's not appropriate - but then I think that about a lit of functionality.

Subject: Great article
Posted by: Artur (not signed in)
Posted on: Wednesday, May 28, 2008 at 10:10 PM
Message: This is a very good job. A lot of useful information. I use identity columns a long but discovered some useful tips.

Discussion if we need to use identity doesn't make any sense. Depend on...

Subject: Great article
Posted by: Artur (not signed in)
Posted on: Wednesday, May 28, 2008 at 10:16 PM
Message: This is a very good job. A lot of useful information. I use identity columns a long but discovered some useful tips.

Discussion if we need to use identity doesn't make any sense. Depend on...

Subject: Got a question....
Posted by: Linson Daniel (not signed in)
Posted on: Wednesday, May 28, 2008 at 11:55 PM
Message: Ur article was really helpful.
I have got a question

Lets consider a table @A which has two columns ID and Name. Here the ID column would be be made the IDENTITY column with seed being 1 and incremented by 1.
Now if my table had 5 rows and I delete of the last row,the next row I insert into the table would have an ID of 6.But i require this to be 5.Are there any functions available for this ?



Subject: re: Got a question....
Posted by: nigelrivett (view profile)
Posted on: Thursday, May 29, 2008 at 3:24 AM
Message: Did you read the article?
Probably DBCC Checkident or maybe set identity_insert on will do what you want.
Have a look at them in the article.

Subject: re: Got a question
Posted by: nigelrivett (view profile)
Posted on: Thursday, May 29, 2008 at 3:26 AM
Message: But if this is a requirement (i.e. that the values are consecutive) maybe you should consider whether an identity is a good idea - or maybe change the design so that the values do not have to be consecutive.

Subject: If the text file length is divisble by 4 the bulk insert would fail without
Posted by: JJ (not signed in)
Posted on: Thursday, May 29, 2008 at 4:14 PM
Message: >> Afraid not. I found it in a system I was writing and notified microsoft and was told it was a know bug. I'm sure someone would have posted about it though.

I worked on that system after Nigel. I couldn't figure out what that code was for so in a "less is more" moment I took it out. The system promptly failed.

The moral here is "if you do something that bizarre, explain it in a comment". Either that or "don't mess with Nigel's code".

Subject: Concurrent users
Posted by: ++Vitoco (not signed in)
Posted on: Thursday, May 29, 2008 at 5:23 PM
Message: I'd like to see a section about multiple sessions working with transactions on the same table at the same time, commiting and/or rolling back.

Subject: Poor usage of IDENTITY
Posted by: J (not signed in)
Posted on: Thursday, May 29, 2008 at 6:31 PM
Message: OK,

ONE example where IDENTITY is a poor choice is for numbering invoices. The accountants and auditors are none too pleased to see gaps in the numbering sequence.

Which is exactly what you will get when an attempt to insert a new row fails, or a transaction is rolled back. The IDENTITY value is incremented anyways.

But I genereally use IDENTITY to create a primary key. Since this is covered by a clustered index, unique values are guaranteed.

Subject: 1
Posted by: Anonymous (not signed in)
Posted on: Friday, May 30, 2008 at 4:10 AM
Message: 1

Subject: Poor usage of IDENTITY
Posted by: nigelrivett (view profile)
Posted on: Friday, May 30, 2008 at 4:58 PM
Message: Yep - an identity isn't guaranteed to be consecutive so anything that requires that isn't a good candidate.

I'll try to remember this as an example of something that needs a consecutive value.
Often the sequence is to do with the client rather than the provider so it wouldn't be a sequence for the table - we're into this character + sequence thing again.
I would consider an identity but allocate the sequence separately as an invoice number maybe isn't a good thing to access a row by as a company grows.

Subject: Concurrent users
Posted by: nigelrivett (view profile)
Posted on: Friday, May 30, 2008 at 5:04 PM
Message: Yes you're right. I touched on this but should have been included in the definition at the top and more prominent in the section about obtaining the identity.

It reminds me that I've had a few queries about what happens to an identity if multiple threads do an insert at the same time so I should have remembered to highlight it.

Might see if I can get the article amended or maybe write a second one - but this should really be in this one.

thanks.

Subject: Bulk Insert
Posted by: Hutty (not signed in)
Posted on: Saturday, May 31, 2008 at 10:32 PM
Message: I am trying to bulk insert into a table that has an ID column set as Identity. The data file does not have a reference to the ID column, as I am thinking upon Bulk Insert the ID column will calculate the next number. I get an error type mismatch for the ID column. Any ideas what the solution may be? Thanks

Subject: Bulk Insert
Posted by: Hutty (not signed in)
Posted on: Sunday, June 01, 2008 at 8:29 PM
Message: I am trying to bulk insert into a table that has an ID column set as Identity. The data file does not have a reference to the ID column, as I am thinking upon Bulk Insert the ID column will calculate the next number. I get an error type mismatch for the ID column. Any ideas what the solution may be? Thanks

Subject: re: Bulk Insert
Posted by: nigelrivett (view profile)
Posted on: Tuesday, June 03, 2008 at 2:41 AM
Message: That's mentioned in the article I think.
Easiest way is to create a view without the identity column and bulk insert into that.
Another way is to create a format file
see
http://www.nigelrivett.net/SQLTsql/BCP_quoted_CSV_Format_file.html

Subject: IDENTITY COLUMNS
Posted by: DURGESH (not signed in)
Posted on: Wednesday, June 04, 2008 at 2:29 AM
Message: I HAVE LEARNT ALOT FROM THIS ARTICLE ABOUT IDENTITY PROPERTY.

Subject: The hotspot problem
Posted by: Anonymous (not signed in)
Posted on: Wednesday, June 04, 2008 at 4:20 AM
Message: Thank you for the good article!

One issue that I have come across is where Identity is combined with primary key and a clustered index. Once you attempt to insert a large number of rows in a nearly continuous process (as would be the case in batch processing), the insert of the first rows went pretty fast, but then it slowed down a great deal. As I understand, this is related to the uneven distribution of rows over the pages used to save the clustered index. Once the clustered index was separated from the identity, the inserts were much faster. Could you please elaborate on this issue?

Best regards

Subject: thanks
Posted by: Gio (not signed in)
Posted on: Wednesday, June 04, 2008 at 4:50 AM
Message: very good and complete survey on the entire issue. maybe it could be interesting to go deeper in the reasons in favour or opposite to the use of identity, because in my opinion there is often a misuse of identity column.

Subject: What happens when IDENTITY goes out of scope
Posted by: cardasim (not signed in)
Posted on: Wednesday, June 04, 2008 at 5:28 AM
Message: A frequent question that I've heard: "What happens when an IDENTITY column goes out of scope"?. What happens when an INT IDENTITY column goes beyound the MAX INT value?
This topic could be an addition to the next revisions of this excelent article!

Subject: Nice post
Posted by: pradeep Tiwari (not signed in)
Posted on: Wednesday, June 04, 2008 at 5:32 AM
Message:
Very good post, i was not known about many things

--pradeep
www.ZoomPune.com

Subject: fxtdrBPBWsIJ
Posted by: jckxoxn (not signed in)
Posted on: Wednesday, June 04, 2008 at 6:00 AM
Message: zQDkPF <a href="http://lqggtspqtasl.com/">lqggtspqtasl</a>, [url=http://qvtxybexodcy.com/]qvtxybexodcy[/url], [link=http://qiifycbqwkqp.com/]qiifycbqwkqp[/link], http://dtwvzswhvmvd.com/

Subject: Identity column - to use or not to use
Posted by: GISChimp.blogspot.com (not signed in)
Posted on: Wednesday, June 04, 2008 at 6:15 AM
Message: Informative article. Ident columns do seem to be more of a short-cut than anything else. The "need" for consecutively numbered records can be accomplished in other fashions - commenters mentioned number tables; also, using sp or triggers.

I'm sure that often DBAs get roped into Ident fields to satisfy business rules implemented by "spreadsheet gurus" from non-technical sides of an organization.

Your article dealt with a lot of aspects of the topic and I was introduced to some issues not previously considered. Thanks for the posting.

Subject: Nice Post
Posted by: Aditya Nayak (not signed in)
Posted on: Wednesday, June 04, 2008 at 7:29 AM
Message: Learnt many new things from ur post

Subject: Idenity uses
Posted by: Anonymous (not signed in)
Posted on: Wednesday, June 04, 2008 at 8:59 AM
Message: What I typically see and like is when customers have an identity column, but then use another column to store an 'external' unique identifier.

I write custom queries to import School SIS(Student Information System) DBs for our services. When we have a customer who changes SIS, having an external unique identifier lets me match users very easily. While having identity column makes for easy joins when you have 5-10 tables. Also, having a nice small integer to put an index on keeps indexes smaller than SIS's who use string/etc.

my $.02

Subject: Idenity uses
Posted by: Anonymous (not signed in)
Posted on: Wednesday, June 04, 2008 at 9:14 AM
Message: What I typically see and like is when customers have an identity column, but then use another column to store an 'external' unique identifier.

I write custom queries to import School SIS(Student Information System) DBs for our services. When we have a customer who changes SIS, having an external unique identifier lets me match users very easily. While having identity column makes for easy joins when you have 5-10 tables. Also, having a nice small integer to put an index on keeps indexes smaller than SIS's who use string/etc.

my $.02

Subject: Another Way...
Posted by: Squeel (not signed in)
Posted on: Wednesday, June 04, 2008 at 9:16 AM
Message: ...and then there's the Row_Number() over (partition by... order by...) statement (for SQL2005)...

Subject: Another option for getting the identity value
Posted by: Chad (view profile)
Posted on: Wednesday, June 04, 2008 at 9:37 AM
Message: Another way to get the value of an inserted identity is using the OUTPUT clause in SQL 2K5. This will also return all the values if you are doing an INSERT INTO...SELECT. Thanks for the article!

Subject: Another Way...
Posted by: Squeel (not signed in)
Posted on: Wednesday, June 04, 2008 at 9:38 AM
Message: ...and then there's the Row_Number() over (partition by... order by...) statement (for SQL2005)...

Subject: Reseeding surprise
Posted by: Anonymous (not signed in)
Posted on: Wednesday, June 04, 2008 at 9:40 AM
Message: Good article with coverage of most things IDENTITY. One little-known fact (it is documented, but not truly obvious until you run into it):

Using DBCC CHECKIDENT with the RESEED option gives different results depending on whether the table has ever had any rows inserted - even if they have been deleted!!!

Create Table #Reseed (ID Int IDENTITY(1,1), Data varchar(10))
DBCC CHECKIDENT (#Reseed, RESEED, 1001)
INSERT INTO #Reseed (Data) VALUES ('First Time')
SELECT * FROM #Reseed
DELETE FROM #Reseed
DBCC CHECKIDENT (#Reseed, RESEED, 1001)
INSERT INTO #Reseed (Data) VALUES ('Next Time')
SELECT * FROM #Reseed

Using TRUNCATE TABLE instead of DELETE, however, makes the table look "virgin", and both reseed give the same result.

Subject: Reseeding surprise
Posted by: Anonymous (not signed in)
Posted on: Wednesday, June 04, 2008 at 10:47 AM
Message: Good article with coverage of most things IDENTITY. One little-known fact (it is documented, but not truly obvious until you run into it):

Using DBCC CHECKIDENT with the RESEED option gives different results depending on whether the table has ever had any rows inserted - even if they have been deleted!!!

Create Table #Reseed (ID Int IDENTITY(1,1), Data varchar(10))
DBCC CHECKIDENT (#Reseed, RESEED, 1001)
INSERT INTO #Reseed (Data) VALUES ('First Time')
SELECT * FROM #Reseed
DELETE FROM #Reseed
DBCC CHECKIDENT (#Reseed, RESEED, 1001)
INSERT INTO #Reseed (Data) VALUES ('Next Time')
SELECT * FROM #Reseed

Using TRUNCATE TABLE instead of DELETE, however, makes the table look "virgin", and both reseed give the same result.

Subject: but WHY are gaps in ID possible?
Posted by: Anonymous (not signed in)
Posted on: Wednesday, June 04, 2008 at 11:32 AM
Message: I'm probably just proudly displaying my ignorance here, but there is one thing I don't understand.

Is there any real reason that an identity column's seed value must be incremented when an insert operation fails?

Put another way, what is stopping SQL Server from rolling back the identity, along with everything else, when an insert fails?


Subject: but WHY are gaps in ID possible?
Posted by: Jeff Moden (view profile)
Posted on: Wednesday, June 04, 2008 at 2:55 PM
Message: Because the next user may have already used the next seed...

Subject: Nice article...
Posted by: Jeff Moden (view profile)
Posted on: Wednesday, June 04, 2008 at 2:57 PM
Message: Nicely done, Nigel... just a thought... this would have been a great place to introduce the OUTPUT clause to return the ID's of the rows affected by a setbased operation to the GUI.

Subject: Output clause
Posted by: nigelrivett (view profile)
Posted on: Wednesday, June 04, 2008 at 6:00 PM
Message: >> Chad
>> Jeff Moden

It's a fair cop (probably an English phrase).
Completely slipped my mind.

Subject: re: Reseeding surprise
Posted by: nigelrivett (view profile)
Posted on: Wednesday, June 04, 2008 at 6:05 PM
Message: I didn't know that.
Thanks.

Subject: re: The hotspot problem
Posted by: nigelrivett (view profile)
Posted on: Wednesday, June 04, 2008 at 6:11 PM
Message: >> One issue that I have come across is where Identity is combined with primary key and a clustered index. Once you attempt to insert a large number of rows in a nearly continuous process (as would be the case in batch processing), the insert of the first rows went pretty fast, but then it slowed down a great deal. As I understand, this is related to the uneven distribution of rows over the pages used to save the clustered index. Once the clustered index was separated from the identity, the inserts were much faster. Could you please elaborate on this issue?

Do you mean there's a clustered index on just the identity column? It should mean that the rows are inserted in the order of the clusttered index so there should be very little page splitting or re-ordering necessary. A bit surprised that you see slowdown here unless it's to do with the log.
When you say the clustered index is seperated do you mean it's placed on an unordered column? I would expect that to be slower.

Subject: re: but WHY are gaps in ID possible?
Posted by: nigelrivett (view profile)
Posted on: Wednesday, June 04, 2008 at 6:14 PM
Message: It can't rollback the value as that would mean blocking the table from inserts on other threads until the commit.

The choices are - allocate the value on insert and allow gaps or allocate the value on successful commit so no gaps.
The problem with the second option is that anything running in a transaction (including triggers) couldn't get the identity value until the transaction is committed.

Subject: Removing Identity Columns
Posted by: Jono Rogers (not signed in)
Posted on: Wednesday, June 04, 2008 at 6:33 PM
Message: Great article, but one thing you missed was how to remove the identity spec from a column. You might need to do this for data conversion, when you want to generate a load of primary keys but then want your PK column to be not an identity once again.

The only way I know is to remove the primary key, create a duplicate column without identity, copy the data, delete the original column and re-create the key. Anyone know a better way?

Subject: Removing Identity Columns
Posted by: nigelrivett (view profile)
Posted on: Wednesday, June 04, 2008 at 11:18 PM
Message: It was meant to be covered by

An identity column can be added to a table but the identity property of an existing column cannot be changed

Don't get confused between a PK and an identity - they aren't connected. An identity nay or may not be defined as a PK like any other column.

Subject: Use the output clause to get the new identity value
Posted by: Gideon Kahl (not signed in)
Posted on: Tuesday, June 10, 2008 at 5:50 AM
Message: With SQL 2005 there is a new way of getting the identity value for a newly inserted row:

CREATE TABLE #t1 (i INT IDENTITY(5,1), j INT)
INSERT #t1 (j) output inserted.i SELECT 1

This statement will insert a new record and return the newly inserted record as output.

Subject: Unique PK??
Posted by: Anonymous (not signed in)
Posted on: Tuesday, June 10, 2008 at 1:51 PM
Message: So, how would one generate a row identifier? This seems clumsy....

create table tbl_rowid(
sysobj_name sysname
, syscol sysname
, lastnum varchar(15)
)

create table myTable(
row_id varchar(15)
, col1 varchar(5)
, col2 money
, col3 int
)

insert myTable (row_id,col1,col2,col3)
values (
select lastnum+1 from tbl_rowid where sysobj_name='myTable' and syscol='row_id'
, 'Fred'
, 1.2000
, 3
)

update tbl_rowid
set lastnumer=lastnumber+1
where sysobj_name='myTable' and syscol='row_id'

I mean, if I'm inserting via bulk......

Subject: Ooops!
Posted by: Anonymous (not signed in)
Posted on: Tuesday, June 10, 2008 at 1:54 PM
Message: I forgot to convert the varchar(15) to a number type before incrementing...

...another clumsy step, as well.

Subject: Ooops!
Posted by: Anonymous (not signed in)
Posted on: Tuesday, June 10, 2008 at 2:12 PM
Message: I forgot to convert the varchar(15) to a number type before incrementing...

...another clumsy step, as well.

Subject: Nice Article regarding Identity
Posted by: Vijay Sutariya (not signed in)
Posted on: Wednesday, June 11, 2008 at 1:54 AM
Message: Realy Goods. Keep it Up.

Subject: Great Value
Posted by: Jorge cunha (view profile)
Posted on: Thursday, June 12, 2008 at 4:57 PM
Message: Great value to me and for other as well



Subject: guide
Posted by: mirror (not signed in)
Posted on: Monday, July 28, 2008 at 1:20 AM
Message: hi,
i face one problem.
after continuous increment when i delete one row.
the next identity value start from the next increment not from one which i have deleted.

thanks for this valuable topic which is really helpful for designer.

ams

Subject: guide
Posted by: mirror (not signed in)
Posted on: Monday, July 28, 2008 at 2:44 AM
Message: hi,
i face one problem.
after continuous increment when i delete one row.
the next identity value start from the next increment not from one which i have deleted.

thanks for this valuable topic which is really helpful for designer.

ams

Subject: guide
Posted by: dao duy vu (not signed in)
Posted on: Thursday, August 07, 2008 at 11:28 PM
Message: I have a column set as identity property in database
So,How can I deleted a row in table while index in column idetity still In ascending

Subject: Upadate Indentiy
Posted by: lalitjagtap_it (view profile)
Posted on: Thursday, December 18, 2008 at 9:37 PM
Message: Can we Update Identity value

 

Phil Factor
Searching for Strings in SQL Server Databases

Sometimes, you just want to do a search in a SQL Server database as if you were using a search engine like Google.... Read more...

 View the blog

Top Rated

SQL Server XML Questions You Were Too Shy To Ask
 Sometimes, XML seems a bewildering convention that offers solutions to problems that the average... Read more...

Continuous Delivery and the Database
 Continuous Delivery is fairly generally understood to be an effective way of tackling the problems of... Read more...

The SQL Server Sqlio Utility
 If, before deployment, you need to push the limits of your disk subsystem in order to determine whether... Read more...

The PoSh DBA - Reading and Filtering Errors
 DBAs regularly need to keep an eye on the error logs of all their SQL Servers, and the event logs of... Read more...

MySQL Compare: The Manual That Time Forgot, Part 1
 Although SQL Compare, for SQL Server, is one of Red Gate's best-known products, there are also 'sister'... Read more...

Most Viewed

Beginning SQL Server 2005 Reporting Services Part 1
 Steve Joubert begins an in-depth tour of SQL Server 2005 Reporting Services with a step-by-step guide... Read more...

Ten Common Database Design Mistakes
 If database design is done right, then the development, deployment and subsequent performance in... Read more...

SQL Server Index Basics
 Given the fundamental importance of indexes in databases, it always comes as a surprise how often the... Read more...

Reading and Writing Files in SQL Server using T-SQL
 SQL Server provides several "standard" techniques by which to read and write to files but, just... Read more...

Concatenating Row Values in Transact-SQL
 It is an interesting problem in Transact SQL, for which there are a number of solutions and... Read more...

Why Join

Over 400,000 Microsoft professionals subscribe to the Simple-Talk technical journal. Join today, it's fast, simple, free and secure.