93

It's an old question I know, but with SQL Server 2012 is it finally ok to store files in the database, or should they really be kept in the filesystem with only references to them in the database?

If storing them in the database is considered acceptable these days, what is the most effective way to do it?

I'm planning to apply encryption so I appreciate processing will not be lightning fast.

Uwe Keim
CompanyDroneFromSector7G

3 Answers

138

There's a really good paper by Microsoft Research called "To Blob or Not To Blob".

Their conclusion after a large number of performance tests and analysis is this:

  • if your pictures or documents are typically below 256 KB in size, storing them in a database VARBINARY column is more efficient

  • if your pictures or documents are typically over 1 MB in size, storing them in the filesystem is more efficient (and with SQL Server 2008's FILESTREAM attribute, they're still under transactional control and part of the database)

  • in between those two, it's a bit of a toss-up depending on your use
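If you already have blobs in a table, a rough way to see which side of those thresholds your data falls on is to bucket it by size. This is only a sketch: `dbo.Documents` and its `Content` column are hypothetical names, and the 256 KB / 1 MB cut-offs are simply the ones quoted above.

```sql
-- Hypothetical table dbo.Documents with a VARBINARY(MAX) column named Content.
-- Buckets the stored blobs by the size thresholds from the paper.
SELECT
    b.SizeBucket,
    COUNT(*)            AS FileCount,
    AVG(b.Bytes) / 1024 AS AvgSizeKB
FROM (
    SELECT
        DATALENGTH(d.Content) AS Bytes,
        CASE
            WHEN DATALENGTH(d.Content) < 256 * 1024  THEN 'below 256 KB'
            WHEN DATALENGTH(d.Content) > 1024 * 1024 THEN 'over 1 MB'
            ELSE 'in between'
        END AS SizeBucket
    FROM dbo.Documents AS d
    WHERE d.Content IS NOT NULL
) AS b
GROUP BY b.SizeBucket
ORDER BY MIN(b.Bytes);
```

If most rows land in one bucket the choice is fairly clear; a wide spread is the "toss-up" case where you would want to measure your own workload.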

If you decide to put your pictures into a SQL Server table, I would strongly recommend using a separate table for storing those pictures - do not store the employee photo in the employee table. That way, the Employee table can stay lean and mean and very efficient, assuming you don't always need to select the employee photo as part of your queries.
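A minimal sketch of that split might look like the following. All table, column and constraint names here are made up for illustration.

```sql
-- Hypothetical schema: names are assumptions, not from any real system.
CREATE TABLE dbo.Employee
(
    EmployeeId INT IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_Employee PRIMARY KEY,
    FullName   NVARCHAR(200) NOT NULL
);

-- One optional photo per employee, kept out of the main table.
CREATE TABLE dbo.EmployeePhoto
(
    EmployeeId INT NOT NULL
        CONSTRAINT PK_EmployeePhoto PRIMARY KEY
        CONSTRAINT FK_EmployeePhoto_Employee
            FOREIGN KEY REFERENCES dbo.Employee (EmployeeId),
    Photo      VARBINARY(MAX) NOT NULL
);

-- Routine queries only touch the narrow table.
SELECT EmployeeId, FullName
FROM dbo.Employee;

-- The photo is read only when it is actually needed.
SELECT p.Photo
FROM dbo.EmployeePhoto AS p
WHERE p.EmployeeId = 42;
```

Using the employee key as the photo table's primary key keeps it a strict one-to-one; if you need several pictures per employee, give the photo table its own key instead.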

For filegroups, check out Files and Filegroup Architecture for an intro. Basically, you would either create your database with a separate filegroup for large data structures right from the beginning, or add an additional filegroup later. Let's call it "LARGE_DATA".
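Adding such a filegroup to an existing database could look roughly like this. The database name, logical file name, path and sizes are placeholders you would need to adapt.

```sql
-- Placeholders: YourDatabase, the logical file name, the path and the sizes.
ALTER DATABASE YourDatabase
    ADD FILEGROUP LARGE_DATA;

-- A filegroup needs at least one file before tables can use it.
ALTER DATABASE YourDatabase
    ADD FILE
    (
        NAME       = N'YourDatabase_LargeData',
        FILENAME   = N'D:\SQLData\YourDatabase_LargeData.ndf',
        SIZE       = 512MB,
        FILEGROWTH = 256MB
    )
    TO FILEGROUP LARGE_DATA;
```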

Now, whenever you have a new table to create which needs to store VARCHAR(MAX) or VARBINARY(MAX) columns, you can specify this file group for the large data:

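 -- Assumes filegroups named Data and LARGE_DATA already exist in the database.
 CREATE TABLE dbo.YourTable
 (
     Id      INT IDENTITY(1,1) NOT NULL PRIMARY KEY,  -- placeholder columns only;
     Payload VARBINARY(MAX) NULL                      -- define your own fields here
 )
 ON [Data]                    -- the basic "Data" filegroup for the regular data
 TEXTIMAGE_ON [LARGE_DATA];   -- the filegroup for large chunks of data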

Check out the MSDN intro on filegroups, and play around with it!

Marcel Gosselin
marc_s
  • 1
    Well said. It all depends on the use case of course, but filestream is frequently a good option. – TimothyAWiseman Nov 16 '12 at 17:02
  • 6
    The research article you have quoted is from April 2006. Surely a lot of things have changed since then. – Oxon Sep 16 '14 at 16:19
  • 2
    @Oxon: no, not really - these conclusions are still valid, as far as I can tell – marc_s Sep 16 '14 at 16:31
  • Regarding storing `VARBINARY(MAX)` in a separate filegroup, if you have a separate "file" table, do you just store that whole table in the separate filegroup or do you store the file "meta" data in the standard filegroup(s) and just store the `VARBINARY(MAX)` in the new filegroup? – RemarkLima Nov 20 '14 at 13:42
  • @RemarkLima: are you talking about the `FILETABLE` feature in SQL Server 2012? There, only the metadata about the file is stored inside SQL Server (like with `FILESTREAM`) - the actual file (the bytes that make it up) is stored outside the database, on a disk drive – marc_s Nov 20 '14 at 13:59
  • @marc_s Sorry, talking about SQL 2008 R2, so adding another `.ndf` / `.mdf` file as separate storage for blobs. Is that what you meant in your answer? To use a separate filegroup for BLOBs rather than `FILESTREAM`? – RemarkLima Nov 20 '14 at 14:30
30

There's still no simple answer. It depends on your scenario. MSDN has documentation to help you decide.

There are other options covered here. Instead of storing in the file system directly or in a BLOB, you can use FILESTREAM or FileTable in SQL Server 2012. The advantages of FileTable seem like a no-brainer (but admittedly I have no personal first-hand experience with it).

The article is definitely worth a read.
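For reference, creating a FileTable is a short statement once the prerequisites are in place. Treat this as a sketch: the table and directory names are made up, and it assumes the database already has a FILESTREAM filegroup with non-transacted access and a database-level directory name configured.

```sql
-- Assumed prerequisite (names are placeholders):
-- ALTER DATABASE YourDatabase
--     SET FILESTREAM (NON_TRANSACTED_ACCESS = FULL,
--                     DIRECTORY_NAME = N'YourDatabaseFiles');

-- A FileTable has a fixed schema, so no column list is given.
CREATE TABLE dbo.DocumentFiles AS FILETABLE
WITH
(
    FILETABLE_DIRECTORY        = N'DocumentFiles',
    FILETABLE_COLLATE_FILENAME = database_default
);

-- Files dropped into the Windows share show up as rows.
SELECT name, file_type, cached_file_size
FROM dbo.DocumentFiles;
```

Note that the file name collation has to be case-insensitive, so `database_default` only works if your database collation is.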

David
12

You might read up on FILESTREAM. Here is some info from the docs that should help you decide:

If the following conditions are true, you should consider using FILESTREAM:

  • Objects that are being stored are, on average, larger than 1 MB.
  • Fast read access is important.
  • You are developing applications that use a middle tier for application logic.

For smaller objects, storing varbinary(max) BLOBs in the database often provides better streaming performance.
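If you do go the FILESTREAM route, the setup could be sketched as below. The database, filegroup, path and table names are placeholders, and it assumes FILESTREAM has already been enabled for the instance (in SQL Server Configuration Manager and via `sp_configure 'filestream access level'`).

```sql
-- Placeholders: YourDatabase, FileStreamGroup, the path, dbo.DocumentStore.
ALTER DATABASE YourDatabase
    ADD FILEGROUP FileStreamGroup CONTAINS FILESTREAM;

-- For FILESTREAM, FILENAME points at a directory, not a file.
ALTER DATABASE YourDatabase
    ADD FILE
    (
        NAME     = N'YourDatabase_FS',
        FILENAME = N'D:\SQLData\YourDatabase_FS'
    )
    TO FILEGROUP FileStreamGroup;

-- A FILESTREAM table needs a unique ROWGUIDCOL column.
CREATE TABLE dbo.DocumentStore
(
    DocumentId UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWID(),
    FileName   NVARCHAR(260)  NOT NULL,
    Content    VARBINARY(MAX) FILESTREAM NULL
)
FILESTREAM_ON FileStreamGroup;
```

From T-SQL the `Content` column behaves like any other `VARBINARY(MAX)`; the bytes just live in the FILESTREAM directory on disk instead of the data file.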

Tim Lehner