I've seen plenty of technical mistakes in SharePoint implementations,
particularly in larger environments where the risks of failure are
higher. Here's a countdown of my top ten "favorite"
SharePoint mistakes:
10. SQL Server Performance
SQL Server performance is the lifeblood of SharePoint, and yet
frequently folks don't size SQL Servers correctly. If you want to
know if your SQL Server has enough RAM, there's a simple counter to
watch: SQL Server Buffer Manager: Page Life Expectancy.
This is how long the buffer manager expects it can keep a cached
page in memory. You want this number to be 300 (seconds) or better.
If it's not, your performance is suffering because you're forcing
the SQL Server to go to the disks too often.
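To make the 300-second rule concrete, here is a minimal Python sketch that checks a handful of Page Life Expectancy readings against that floor. How you collect the readings (Performance Monitor, a script, a monitoring tool) is up to you; the function name and the sample values are hypothetical, and averaging is just one reasonable way to summarize the samples.

```python
from statistics import mean

PLE_THRESHOLD_SECONDS = 300  # the floor suggested in the text above


def ple_is_healthy(samples, threshold=PLE_THRESHOLD_SECONDS):
    """Return True when the average sampled Page Life Expectancy meets the threshold."""
    if not samples:
        raise ValueError("need at least one Page Life Expectancy sample")
    return mean(samples) >= threshold


# Hypothetical readings, in seconds, taken a minute apart.
busy_server = [120, 150, 90]
roomy_server = [280, 310, 450, 500]
print(ple_is_healthy(busy_server), ple_is_healthy(roomy_server))
```

An average can hide short dips, so if you care about the worst moments, compare the minimum reading against the threshold instead.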
9. SAN Configuration
All too often storage capacity is the name of the SAN game.
However, performance is more important. You want to make sure that
the SAN can respond to both read and write requests within 20ms -
ideally within 10ms. This is a combination of smaller, faster disks
and more of them.
It's also a matter of using RAID 10 instead of RAID 5 for striping.
If you believe the "snake oil" that the configuration of disks in a
SAN doesn't matter because your vendor is "special," you might need
to look for a new line of work. The physics of disks applies
whether your vendor wants it to or not.
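One way to see why the RAID level matters is the usual write-penalty rule of thumb: each host write costs roughly two back-end I/Os on RAID 10 and roughly four on RAID 5. The Python sketch below applies that rule. It is a simplified model that ignores controller caches and sequential-write optimizations, and the disk count, per-disk IOPS, and read/write mix are made-up numbers for illustration.

```python
# Rule-of-thumb back-end I/Os per host write; a simplification, not a vendor spec.
WRITE_PENALTY = {"RAID10": 2, "RAID5": 4}


def usable_iops(raw_iops, write_fraction, raid_level):
    """Estimate host-visible IOPS for a given raw back-end IOPS budget."""
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    penalty = WRITE_PENALTY[raid_level]
    read_fraction = 1.0 - write_fraction
    # Every host read costs one back-end I/O; every host write costs `penalty`.
    return raw_iops / (read_fraction + write_fraction * penalty)


# Hypothetical array: 8 disks at 180 IOPS each, half the workload is writes.
raw = 8 * 180
for level in ("RAID10", "RAID5"):
    print(level, round(usable_iops(raw, 0.5, level)))
```

With the same disks, the heavier the write mix, the wider the gap between the two layouts.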
8. Load Balancer Configuration
The load balancer is the traffic cop for your environment, and a
bad load balancer configuration can drag performance down. You want
to configure your load balancer for session affinity, or sticky
sessions, or whatever your vendor calls it, to keep sessions on the
same server they started on.
That's because SharePoint caches a ton of information locally on
the server. A session that stays on the same server will perform
better over time. Keep sessions on the same server for long periods
of time: for example, 20 minutes, not 20 seconds.
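Real load balancers do this in their own configuration, but the idea can be sketched in a few lines of Python. This is a toy model, not any vendor's algorithm: the class name, the server names, and the round-robin fallback are all assumptions, and only the 20-minute timeout comes from the text above.

```python
import time

AFFINITY_TIMEOUT_SECONDS = 20 * 60  # 20 minutes, per the guidance above


class StickyBalancer:
    """Toy session-affinity table: a client stays on its server until idle too long."""

    def __init__(self, servers, timeout=AFFINITY_TIMEOUT_SECONDS, clock=time.monotonic):
        self.servers = list(servers)
        self.timeout = timeout
        self.clock = clock
        self._next = 0
        self._sessions = {}  # client id -> (server, time last seen)

    def route(self, client_id):
        now = self.clock()
        entry = self._sessions.get(client_id)
        if entry is not None and now - entry[1] <= self.timeout:
            server = entry[0]  # affinity still valid: reuse the same server
        else:
            # New or expired session: fall back to simple round robin.
            server = self.servers[self._next % len(self.servers)]
            self._next += 1
        self._sessions[client_id] = (server, now)
        return server


balancer = StickyBalancer(["wfe1", "wfe2", "wfe3"])  # hypothetical front ends
print(balancer.route("alice"), balancer.route("bob"), balancer.route("alice"))
```

Each request refreshes the idle timer, so an active user keeps landing on the server that already holds their cached data.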
7. SharePoint Server Disk
Plans often miss the disk capacity needed for search indexes, or
skip over the performance of those disks. Search query performance
relies on query servers, which need disk space equal to about 30
percent of the size of the content you're crawling.
Thirty percent is a generally safe number. Make sure you plan for
how much storage you need on the SharePoint servers - including
performance.
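The 30 percent rule is simple arithmetic, sketched here in Python. The function name and the 2,000 GB corpus are hypothetical; treat the result as a planning starting point rather than a measured requirement.

```python
INDEX_PERCENT_OF_CONTENT = 30  # the rule of thumb from the text above


def index_disk_gb(crawled_content_gb, percent=INDEX_PERCENT_OF_CONTENT):
    """Estimate the disk space a query server needs for the search index."""
    if crawled_content_gb < 0:
        raise ValueError("content size cannot be negative")
    return crawled_content_gb * percent / 100


# Hypothetical farm crawling 2,000 GB of content.
print(index_disk_gb(2000))
```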
6. Core Network
There's some argument about segmenting user traffic from back-end
traffic on SharePoint servers; however, everyone agrees that
network performance between the SharePoint Servers and SQL Server
is critical. It should be low latency and high-capacity.
Generally this means only switches between the SharePoint servers
and the SQL Servers. Putting a firewall between SharePoint servers
and SQL Server is silly.
Make sure your latency between servers is less than 10ms. For the
record, my observation is that you should aggregate all network
interfaces rather than segmenting front- and back-end
traffic.
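If you want a rough number for the latency between a SharePoint server and SQL Server, timing a TCP connection is one cheap approximation. The Python sketch below does that with the standard library. The host name is a placeholder, port 1433 assumes SQL Server's default instance port, and a connect time includes handshake overhead, so read it as an upper bound on round-trip time rather than a precise measurement.

```python
import socket
import time

MAX_LATENCY_MS = 10.0  # the ceiling suggested in the text above


def tcp_connect_ms(host, port, timeout=2.0):
    """Time one TCP connect to host:port and return it in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection established; close it right away
    return (time.perf_counter() - start) * 1000.0


def latency_ok(samples_ms, limit_ms=MAX_LATENCY_MS):
    """Return True only if every sample is under the limit."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    return max(samples_ms) < limit_ms


# Example run from a SharePoint server (placeholder host name):
# samples = [tcp_connect_ms("sql01.example.local", 1433) for _ in range(5)]
# print(latency_ok(samples))
```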
5. Not Having a Quality Assurance Environment
Sure, most large implementations have QA environments, but all
too often their configuration is allowed to drift from the
production environment. QA should match production in terms of the
types of components, and should be scaled down only in terms of the
number of servers and resources, for cost reasons.
Make sure that your QA environment has a load balancer and all the
firewalls that your production environment has and that the rules
are the same. You've been warned.
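Drift is easy to catch if you write down what each environment is supposed to contain and compare the two lists regularly. Here is a minimal Python sketch of that comparison. The setting names and values are invented for illustration, and the server count is ignored on purpose because QA is allowed to be smaller.

```python
def config_drift(production, qa, ignore=("server_count",)):
    """Return settings that differ between environments as {name: (prod, qa)}."""
    drift = {}
    for key in sorted(set(production) | set(qa)):
        if key in ignore:
            continue  # sizing is allowed to differ for cost reasons
        prod_value = production.get(key, "<missing>")
        qa_value = qa.get(key, "<missing>")
        if prod_value != qa_value:
            drift[key] = (prod_value, qa_value)
    return drift


# Hypothetical inventories of the two environments.
production = {"load_balancer": True, "firewall_rules": "v12", "server_count": 8}
qa = {"load_balancer": False, "firewall_rules": "v12", "server_count": 2}
print(config_drift(production, qa))
```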
4. Crosstalk Between Environments
Environments shouldn't be able to talk to each other. QA shouldn't
be able to see into development, and production shouldn't be able
to peer into QA.
If you do allow this, you should expect you'll create an unexpected
cross-environment dependency. You'll take down the development
environment, and production will crash. Not good.
3. Abstract IP
One neat trick you'll sometimes see is the use of reverse proxies
in front of a SharePoint farm. It sounds good on the surface, until
you realize that your SharePoint server won't see the client IP
address.
What's the problem? Well, try debugging your production server just
once when you can't figure out which traffic is having the problem,
and you won't have to ask again.
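If you do end up behind a reverse proxy, many proxies can pass the original address along in an X-Forwarded-For request header. The Python sketch below shows how logging code might recover it. It assumes your proxy is configured to set that header and to overwrite whatever the client sent (otherwise the value can be spoofed); the function name and the addresses are illustrative.

```python
def client_ip(headers, remote_addr):
    """Return the original client IP if the proxy supplied one, else the socket peer."""
    # Header names are case-insensitive, so normalize before looking one up.
    lowered = {name.lower(): value for name, value in headers.items()}
    forwarded = lowered.get("x-forwarded-for", "")
    # The header may hold a chain "client, proxy1, proxy2"; the client comes first.
    first = forwarded.split(",")[0].strip()
    return first or remote_addr


# Hypothetical request that passed through a proxy at 10.0.0.5.
print(client_ip({"X-Forwarded-For": "203.0.113.7, 10.0.0.5"}, "10.0.0.5"))
```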
2. Monitoring
SharePoint will warn you it's having trouble. From ULS logs and
event logs to the health score that's returned with every HTTP
request, SharePoint isn't shy about telling you it needs
help.
Of course, you have to be listening. Load balancers watch servers
to see if they're in trouble, and so does System Center Operations
Manager, but you have to set these things up, and respond to
trouble tickets when they come.
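Listening can start small. As far as I know, the health score travels in the X-SharePointHealthScore response header as a number from 0 (healthy) to 10 (overloaded). The Python sketch below reads it from a set of response headers you have already captured. The warning level of 5 and the function names are assumptions for illustration, so pick a threshold that fits your farm.

```python
HEALTH_HEADER = "x-sharepointhealthscore"


def health_score(headers):
    """Return the health score as an int, or None if the header is absent."""
    for name, value in headers.items():
        if name.lower() == HEALTH_HEADER:
            return int(value)
    return None


def needs_attention(headers, warn_at=5):
    """Flag a response whose health score is at or above the warning level."""
    score = health_score(headers)
    return score is not None and score >= warn_at


# Hypothetical response headers captured from a front-end server.
print(needs_attention({"X-SharePointHealthScore": "7"}))
```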
1. Big Bang Rollout
Someone sends out an email that the new intranet site, My Sites,
and collaboration platform are available. Suddenly everyone in the
organization comes flooding in, and in the process, they put the
entire farm underwater.
The servers encounter more load in an hour than they'll typically
encounter in weeks of operation, and a great environment is
tarnished by one big email. Rather than doing one big-bang email to
everyone, stage your communication over the course of a day or two
to even out the load a bit.
It's much better to be twiddling your thumbs because the servers
aren't busy than trying to scramble to keep the environment
functional due to overwhelming demand.
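Staging the announcement can be as simple as splitting the recipient list into batches and spacing out the send times. Here is a minimal Python sketch of that schedule. The batch count, interval, start time, and addresses are all made up, and actually sending the mail is left to whatever tool you already use.

```python
from datetime import datetime, timedelta


def staged_batches(recipients, batch_count, start, interval):
    """Split recipients into near-equal batches and pair each with a send time."""
    if batch_count < 1:
        raise ValueError("batch_count must be at least 1")
    size, extra = divmod(len(recipients), batch_count)
    schedule = []
    index = 0
    for i in range(batch_count):
        count = size + (1 if i < extra else 0)  # spread the remainder over early batches
        group = recipients[index:index + count]
        index += count
        if group:
            schedule.append((start + i * interval, group))
    return schedule


# Hypothetical rollout: 10 users, 4 batches, one batch every 6 hours.
users = [f"user{n}@example.com" for n in range(10)]
for when, group in staged_batches(users, 4, datetime(2024, 1, 8, 8, 0), timedelta(hours=6)):
    print(when, len(group))
```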
That's my top 10 list. What's yours?
Robert Bogue is a Microsoft MVP for SharePoint, an internationally
renowned speaker, and author of 22 books including the SharePoint
Shepherd's Guide for End Users. You can find out more about
Robert's work to encourage business value out of SharePoint at
SharePoint Shepherd or more about his technical solutions at Thor
Projects.
This article was first published on
SharePoint Pro.