1

The documentation for SCOPE_IDENTITY states that it

Returns the last identity value inserted into an identity column in the same scope.

I wonder what last really means. Is it the largest value inserted in current scope, or the value of row that was physically last inserted into the table?

I tried to

  • create table with IDENTITY as a clustered index
    • this worked as expected; return value of SCOPE_IDENTITY was the highest value of identity column inserted
  • create table with IDENTITY as a column with different column as a cluster
    • i tried to use different ORDER BY statements to make sure that the generated IDENTITY didn't match the clustered index in any way
    • SCOPE_IDENTITY always returned the highest value of IDENTITY column reached, even though the identities were generated irrespectively of the clustered index
  • I didn't try a heap table

Why I ask: I stumbled upon a piece of code that inserts multiple rows and that uses SCOPE_IDENTITY to get the highest ID inserted, with which it then performs something. Given how SCOPE_IDENTITY seems to work, it currently works OK, but I wonder if it can bring in unexpected bugs.

My question is:
Should I alway assume that the SCOPE_IDENTITY returns the highest value of IDENTITY inserted (for ascending IDENTITY), no matter how the rows were either physically or logically inserted into the table? Or should I just use OUTPUT and than MAX on the identity column to be sure?

Edit: To put it into code, will following code always result in "This happens everytime", no matter what is in the CREATE TABLE #WithID, SELECT, FROM, WHERE or ORDER BY? Assuming the ID INTEGER IDENTITY(1,1) remains constant.

CREATE TABLE #WithID (ID INTEGER IDENTITY(1,1), SomeData INTEGER);
CREATE TABLE #Outputs (ID INTEGER);

INSERT INTO #WithID
OUTPUT Inserted.ID INTO #Outputs
SELECT Number
FROM dbo.Numbers
WHERE Number < 1000000
ORDER BY Number;

IF SCOPE_IDENTITY() = (SELECT MAX(ID) FROM #Outputs)
  PRINT 'This happens everytime'
ELSE
  PRINT 'Oops...';

note: I don't know how well known it is that the values created by IDENTITY are created in the order of the ORDER BY of the SELECT clause and not in the order the rows are physically inserted into the table.

andowero
  • 439
  • 4
  • 13
  • 1
    `Is it the largest value inserted in current scope, or the value of row that was physically last inserted into the table` - these two categories are the same because identity is an ever increasing value, so the last inserted row will have the largest identity of the current inserted group. And if you are providing [pre-calculated identities](https://learn.microsoft.com/en-us/sql/t-sql/statements/set-identity-insert-transact-sql?view=sql-server-ver15), then you don't need `scope_identity()` in the first place. – GSerg Jun 09 '21 at 06:41
  • 1
    I *suppose* (technically speaking) you *could* create an `IDENTITY` where the value is ever decreasing and the above wouldn't be true though, @GSerg . :) [db<>fiddle](https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=904ac8137e1c4819cc83716531e135e5) – Thom A Jun 09 '21 at 08:01
  • I actually have a scenario where that occurs, ie an identity value that decreases (an inherited system) so for that table the identity is always the lowest whether single row or batch of rows. – Stu Jun 09 '21 at 08:13
  • 2
    @Larnu Okay, that was not general enough. The last inserted row will have identity farthest from the identity starting point, in the direction of the identity increment sign. – GSerg Jun 09 '21 at 09:31
  • My comment was more meant in jest, @GSerg , don't worry. Though Stu above has said that they do have that scenario, I wouldn't expect to see if more than once or twice in a career. :) – Thom A Jun 09 '21 at 09:34
  • 2
    scope_identity() is only intended to be used with single row inserts. If you intend to insert multiple rows, such as from a tvp in a stored procedure, you probably want to use [output into](https://learn.microsoft.com/en-us/sql/t-sql/queries/output-clause-transact-sql#a-using-output-into-with-a-simple-insert-statement) to explicitly capture the identity values for each row inserted. – AlwaysLearning Jun 09 '21 at 09:50
  • 1
    What are you actually trying to achieve, or is this just an academic question? – Charlieface Jun 09 '21 at 10:33
  • @GSerg - These two categories are not the same. The identity doesn't need to follow the clustered index and i presume that, when inserting multiple values, the SQL Server inserts rows in the order of the clustered index. – andowero Jun 09 '21 at 11:48
  • 1
    @Charlieface - I stumbled upon a multi-row insert using SCOPE_IDENTITY in our production database and I wonder if it is safe to use it in such a scenario. It currently works, but I would like to know how safe we are in the future :) – andowero Jun 09 '21 at 11:51
  • Edited in an code example. – andowero Jun 09 '21 at 12:02
  • No of course it isn't, because the `IDENTITY` sequence might be reset. Why would you do this anyway, what "Oops" are you trying to catch, and who cares if the last `IDENTITY` value is the highest in the table? – Charlieface Jun 09 '21 at 12:12
  • I don't care if the value is reset, I care only about the latest `INSERT` statement. I assume that `IDENTITY` reset is a schema change and that it couldn't happen in the middle of running `INSERT`. The "Oops" would be, if `SCOPE_IDENTITY` returned i.e. 10, but the largest inserted `IDENTITY` was 20. – andowero Jun 09 '21 at 12:31
  • It's probably safe, and also probably something you don't want to rely on because it's not *obviously* safe. If the `SCOPE_IDENTITY` isn't going to be used to identify a single row, it should probably not be used. For explicit operations that require reserving a contiguous range of values or that allow you to make assumptions about the values, use sequences; for operations that want to start from "the current highest value", use a `MAX(ID)` as close to the point of the query that requires it (which has the benefit that you can make it atomic). – Jeroen Mostert Jun 09 '21 at 12:48
  • @and There is no insertion order, the tables are fundamentally unsorted. While an `order by` will be honored in an insert [if](https://stackoverflow.com/a/56005568/11683) it concerns an identity column (for the purpose of assigning the identities), that does [not](https://stackoverflow.com/a/41917512/11683) correspond with the order of physical writes in the table. `scope_identity()` returns the last (largest[*](https://stackoverflow.com/questions/67898744/value-of-scope-identity-when-inserting-more-than-one-row?noredirect=1#comment120017590_67898744)) identity value generated for the set. – GSerg Jun 09 '21 at 13:49
  • @GSerg if we look at it from the viewpoint of the IDENTITY itself, than yes, I get what your are trying to say. – andowero Aug 25 '21 at 16:12

0 Answers0