Windows Server 2003 (R2) System Level Fault Tolerance (Clustering/NLB) Best Practices

  • Always use quality server & networking hardware for fault-tolerant systems.
  • Use RAID to create disk subsystem redundancy.
  • Don’t run MSCS and NLB on the same computer; Microsoft does not support this configuration.
  • When possible, use cluster-aware applications so that the Cluster Service can monitor them. A cluster-unaware application can still run on a cluster, but the Cluster Service does not monitor it.
  • Use active/passive clustering mode when performance is not critical. It is easier to administer, and licensing costs are lower.
  • If you run TCP/IP-based services such as Terminal Services, Web sites, VPN services or streaming media services, use NLB.
  • For mission-critical applications (enterprise messaging, databases, file and print services), use Windows Server 2003 Cluster Service to provide server failover functionality.
  • Disable power management on each cluster node, both in the BIOS and in the operating system’s Control Panel, to avoid unwanted failovers.
  • Choose carefully between the nonshared and shared disk approaches to clustering.
  • When you plan to use an MNS (Majority Node Set) cluster, always purchase one additional node.
  • Make sure that both Microsoft and the software manufacturer certify that third-party software for the Cluster Service works on a Windows Server 2003 cluster; otherwise you may face limited support when troubleshooting is needed.
  • Use multiple network cards in each node. For example, one card can be dedicated to the private network (internal cluster communications) and another to the public network (client connectivity), or both can be used as mixed networks (public and private communication).
  • Configure the failback schedule to allow failback only during non-peak times or after hours, so that a group does not fail back to a node during regular business hours after a failure.
  • Test failover and failback mechanisms thoroughly.
  • If you are logged in with the Cluster Service account, don’t use AD Users & Computers or the Windows Security dialog box to change the password.
  • If you’re removing a node from an MNS cluster, make sure that a majority of the nodes remain running to keep the cluster in a working state.
  • Carefully consider how to back up and restore a cluster.
  • Perform ASR backups periodically and immediately after any hardware change to a cluster node, including changes to a shared storage device or the local disk configuration.
  • Before deciding which clustering technology to use, make sure you thoroughly understand the application that will run on it.
  • Create a rule that allows only specific ports to the clustered IP address and blocks all others.
  • Use tools like robocopy.exe to replicate data between NLB nodes.
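
The majority rule behind the two MNS tips above is simple arithmetic, and a short Python sketch may make it easier to see. This is only an illustration: it assumes the usual rule that a Majority Node Set cluster keeps running while more than half of its configured nodes are up, and the function names are made up for this example, not part of any Microsoft tool.

```python
def majority(total_nodes: int) -> int:
    """Smallest number of running nodes that is more than half of total_nodes."""
    if total_nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return total_nodes // 2 + 1


def tolerated_failures(total_nodes: int) -> int:
    """How many nodes can be down while a majority is still running."""
    return total_nodes - majority(total_nodes)


def keeps_quorum(total_nodes: int, running_nodes: int) -> bool:
    """True if the running nodes still form a majority of the configured nodes."""
    return running_nodes >= majority(total_nodes)


if __name__ == "__main__":
    # Print the majority and failure tolerance for small cluster sizes.
    for n in range(1, 9):
        print(f"{n} nodes: majority {majority(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```

Under that assumption, a two-node MNS cluster tolerates no failures at all while a three-node cluster tolerates one, which is one way to read the advice to buy an additional node. The same check applies before you take a node offline: confirm that the nodes left running still satisfy keeps_quorum.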

-Eric
