72

I have a table with huge amount of data. I'd like to add extra column id and use it as a primary key. What is the better way to fill this column with values from one 1 to row count

Currently I'm using cursor and updating rows one by one. It takes hours. Is there a way to do that quicker?

Thank you

General Grievance
  • 4,555
  • 31
  • 31
  • 45
Sergejs
  • 2,540
  • 6
  • 32
  • 51

2 Answers2

158

Just do it like this:

ALTER TABLE dbo.YourTable
ADD ID INT IDENTITY(1,1)

and the column will be created and automatically populated with the integer values (as Aaron Bertrand points out in his comment - you don't have any control over which row gets what value - SQL Server handles that on its own and you cannot influence it. But all rows will get a valid int value - there won't be any NULL or duplicate values).

Next, set it as primary key:

ALTER TABLE dbo.YourTable
ADD CONSTRAINT PK_YourTable PRIMARY KEY(ID)
marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
  • 2
    +1, but I hope the OP names it something more descriptive then `ID` like `WidgetID`, etc. – KM. Feb 23 '12 at 14:27
  • @KM: slim hope - he actually mentioned in his question that he wants to add `id` :-) – marc_s Feb 23 '12 at 14:28
  • 20
    I prefer id. WidgetID sounds like a foreign key. So i hope he doesn't – t-clausen.dk Feb 23 '12 at 14:29
  • 17
    @KM. - That's up for debate. Should a table `Person` have a PK called `ID` or called `PersonID`. There is no answer, it is preference, environment standard, but certainly not universally agreed. – MatBailie Feb 23 '12 at 14:29
  • Well, "proper values" is a bit subjective, since you don't really have any control over which row gets which identity value - it will depend on underlying indexes. :-) – Aaron Bertrand Feb 23 '12 at 14:30
  • @AaronBertrand: well, proper in terms of - there are going to be `int` values in there - not all NULLs or something :-) But you're right - these `int` are dished out without your control. But I updated my response to include that feedback of yours - thanks! – marc_s Feb 23 '12 at 14:30
  • 9
    @Dems my personal opinion is that it should be `PersonID` no matter where in the schema it is referenced. But that's a discussion for a different question, and probably not on this site. :-) – Aaron Bertrand Feb 23 '12 at 14:31
  • Something to note, OP mentioned "huge amount of data", which is subjective. However it takes hours to update using a cursor? He may want to seed with `identity (-2147483648, 1)` – Joey Feb 23 '12 at 14:42
  • 1
    For all we know he has a `WAITFOR DELAY '00:30:00';` in each iteration of the cursor. Or doing other silly or inefficient things. Huge is certainly subjective and don't underestimate the ability for people to make normally fast operations really slow, or to exaggerate. :-) – Aaron Bertrand Feb 23 '12 at 15:00
  • in my *opinion* columns should have a descriptive name, and ID is not descriptive, would you call a column `int` or `FK`?. in my *opinion* columns that join (FKs) should be of the same name and data type. As a result, a good PK would be PersonID and the matching FK column would still be PersonID. I understand that other people have other opinions, I'm just glad the people I work with share this philosophy with me. – KM. Feb 23 '12 at 15:18
  • Technically, I think you can cause the identities to get applied in a deterministic order if you define a clustered index with the columns/sorts you want -- but watch out!! -- this will cause the 'huge' table to get re-organized on disk. – Andrew Theken May 21 '13 at 18:22
  • 5
    @KM I hope you wouldn't prefix every column with the name of the table.. If `PersonID`, then why not also `PersonFirstName`, `PersonLastName`.. ? – Blorgbeard Feb 25 '14 at 20:03
  • 11
    @Blorgbeard, the idea is that you end up using "ID" columns in multiple tables as a foreign key. As a result, it is nice to have the same column name (PersonID) in both tables. In large systems, with many tables and loads of tsql code, it can be very painful where the foreign keys have different names. – KM. Feb 25 '14 at 21:43
  • @KM. I can see that perspective. – Blorgbeard Feb 25 '14 at 21:44
  • 4
    @KM. Prefixing with table name or alias is much cleaner approach IMHO. – Arek Bal Jun 26 '14 at 14:11
  • @t-clausen.dk I don't like small letters I'd go with `WidgetId` – bombek Aug 08 '19 at 08:55
4

If you want to add row numbers in a specific order you can do ROW_NUMBER() into a new table then drop the original one. However, depending on table size and other business constraints, you might not want to do that. This also implies that there is a logic according to which you will want the table sorted.

SELECT ROW_NUMBER() OVER (ORDER BY COL1, COL2, COL3, ETC.) AS ID, *
INTO NEW_TABLE
FROM ORIGINAL_TABLE
Misa
  • 109
  • 4