Microsoft has released various recommendations about how large content databases in SharePoint should be in order to still allow for optimal performance. Gökhan’s article explores the complexities of these recommendations and how best to manage large amounts of data in SharePoint. This blog post from Gökhan Yilmaz serves as an extension of the topic that Gökhan has already explored in his piece.
I wrote this while searching for ways to optimize my SharePoint
2010 farm. At the time I didn’t have many content databases, but
the few that I did have were quite large. I had heard about the
recommended 200 GB limitation from Microsoft and decided to do some further research to make sure that I was operating a fully supported SP2010 environment. I found an article on Microsoft TechNet that describes the recommended limits, thresholds, and boundaries. The article can be found here:
Here are some of the key points mentioned in the article:
– Boundaries are absolute limits that cannot be exceeded by design. It is important to understand these limits to ensure that you do not make incorrect assumptions when you design your farm.
– Thresholds are those that have a default value that cannot be
exceeded unless the value is modified. Thresholds can, in certain
circumstances, be exceeded to accommodate variances in your farm
design, but it is important to understand that doing this may
affect the performance of the farm in addition to the effective
value of other limits.
– Supported limits define the tested value for a given
parameter. The default values for these limits were defined by
testing, and represent the known limitations of the product.
Exceeding supported limits may cause unexpected results,
significant decrease in performance, or other harmful effects.
A standard boundary in SharePoint 2010 is the maximum size of a file that can be uploaded. This is obvious, as content is written into a field in the SQL Server backend that uses the varbinary(max) datatype. SQL Server Books Online states that the storage size of varbinary(max) is 2^31-1 bytes, which is roughly 2 GB.
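The arithmetic behind that boundary can be checked in a couple of lines (a quick sketch in Python; the figure comes straight from SQL Server Books Online):

```python
# varbinary(max) stores at most 2^31 - 1 bytes per value.
max_bytes = 2**31 - 1

print(max_bytes)            # 2147483647
print(max_bytes / 1024**3)  # just under 2 GiB
```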
With content databases of the sizes mentioned above, one should also consider how to back up such databases. The size has a negative effect on the time needed for a backup. Backup frequency and the time required for restores are the main factors to consider with databases of this magnitude.
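To make the effect concrete, here is a back-of-envelope sketch: backup time grows roughly linearly with database size, so at an assumed sequential backup throughput of 100 MB/s (an illustrative figure, not a measured one from my farm) the windows look like this:

```python
def backup_hours(db_size_gb, throughput_mb_per_s=100):
    # Naive linear model: total megabytes divided by sustained throughput.
    seconds = db_size_gb * 1024 / throughput_mb_per_s
    return seconds / 3600

# e.g. 200 GB -> ~0.6 h, 4 TB -> ~11.7 h at 100 MB/s
for size_gb in (200, 1024, 4096):
    print(f"{size_gb:>5} GB -> {backup_hours(size_gb):4.1f} h")
```

Real throughput depends on hardware, compression, and the backup tool, but the linear scaling is the point: a 4 TB database takes roughly twenty times as long as a 200 GB one.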
Microsoft has released its second TechNet article on SP2010 SP1, mentioning supported scenarios with content database sizes of 4 TB and larger. At first glance that seemed very encouraging, but after taking a closer look I found it to be more complicated than it appears. While it is true that under specific circumstances the effective maintenance of content databases of 4 TB or more is possible, a set of preconditions must be adhered to for this to be supported.
I then asked myself whether I would rather maintain and administer many small content databases, or a few very large databases that might be easier to administer and manage, but are subject to the set of preconditions defined by Microsoft.
Microsoft itself doesn’t really set a limit as long as no more than 5% of the content is accessed, less than 1% is edited, and the Document Center and Records Center templates are being used. By using alerts or workflows, one would lose support from Microsoft. Defining permissions at the item level would also cause a loss of support. These scenarios simply did not come into question for my environment because of the high level of activity in my SharePoint farm.
I use SharePoint for Project Sites, Collaboration, and some DMS functionality. Considering the current upload behavior of my users, I would exceed 4 TB within just a year. A quick report showed that my content database consists of about 90% BLOBs, meaning I would have about 400 GB of data in SQL Server and about 3,600 GB of BLOBs. I found myself with two options: either I could adhere to the criteria set by Microsoft for managing large content databases, or I could consider an alternative method and move the BLOBs out of my databases. The latter option would mean up to 90% of my content would be stored outside of SQL Server.
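The 90/10 split is simple arithmetic; a short sketch (using my own projected figures of roughly 4 TB total and a 90% BLOB share) makes the two storage buckets explicit:

```python
total_gb = 4000      # projected content size, roughly 4 TB
blob_share = 0.90    # share of content that is BLOB data, per my report

blob_gb = total_gb * blob_share     # candidate for externalization
structured_gb = total_gb - blob_gb  # what would remain in the content database

print(f"BLOBs:           {blob_gb:.0f} GB")       # 3600 GB
print(f"structured data: {structured_gb:.0f} GB") # 400 GB
```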
In the end, Microsoft’s new database size recommendations didn’t really change my situation. I am still tied to the 200 GB limit with SP1. By using BLOBing mechanisms I would not bypass this limitation, but I could improve my backup windows, shorten restore times for individual databases, and increase my overall performance. From my point of view, a sophisticated BLOBing mechanism allows for much better scaling. I can also meet aggressive SLAs and minimize RPO/RTO, because the BLOB data is stored outside of the content databases (and outside of SQL Server).