Considerations for Multi Site Clusters in Windows Server 2012 (Part 2)

Published on 11 June 2013 / Last Updated on 22 Aug. 2013

This article continues the discussion of multi-site clusters by examining the challenges of providing cluster storage in a multi-site environment.

Introduction

In the first part of this series, I discussed some quorum considerations that must be taken into account when building a multi-site cluster. Making sure that the cluster can always maintain quorum is critical, but it is not the only consideration.

Storage

Most on-premises clusters are built around a cluster shared volume. A cluster shared volume is a shared storage mechanism that is accessible to each cluster node, generally either through iSCSI or through Fibre Channel. Although Windows Server 2012 makes it possible to build a cluster without providing a cluster shared volume, Microsoft recommends the use of shared storage for performance reasons.

Although the use of a cluster shared volume is widely regarded as a best practice, multi-site clusters are often better off avoiding shared storage. There are a number of reasons for this. Imagine, for example, that you built a multi-site cluster and that the cluster shared volume physically resided in your primary datacenter. Now imagine that a failover occurred to a node in a remote datacenter. Although the cluster would continue to function in this situation, the cluster node in the remote datacenter would have to perform all read and write operations across a WAN link. Because every I/O request would incur the latency of the WAN link, this would most likely result in seriously degraded performance.
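To put a rough number on that degradation, here is a back-of-the-envelope sketch in Python. The latency figures and the operation count are assumptions chosen purely for illustration, not measurements, and the model deliberately ignores bandwidth, caching, and parallel I/O.

```python
# Back-of-the-envelope estimate of how round-trip latency limits storage I/O.
# All numbers below are illustrative assumptions, not measurements.

def serial_io_seconds(operations: int, round_trip_ms: float) -> float:
    """Time to complete `operations` I/O requests issued one after another,
    where each request waits one full round trip before the next starts."""
    return operations * round_trip_ms / 1000.0

LOCAL_SAN_RTT_MS = 0.5   # assumed round trip to storage in the same datacenter
WAN_RTT_MS = 40.0        # assumed round trip to storage across the WAN link
OPERATIONS = 10_000      # assumed number of small synchronous I/O operations

local = serial_io_seconds(OPERATIONS, LOCAL_SAN_RTT_MS)
remote = serial_io_seconds(OPERATIONS, WAN_RTT_MS)

print(f"local storage : {local:.1f} s")
print(f"across the WAN: {remote:.1f} s ({remote / local:.0f}x slower)")
```

Under these assumed figures the same workload takes 5 seconds against local storage and 400 seconds across the WAN. Real workloads overlap many requests, so the actual penalty depends heavily on the application, but the ratio shows why per-request latency matters more than raw bandwidth here.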

The use of a cluster shared volume can also be problematic in the event of a WAN link failure. Imagine a situation such as the one described above, in which a cluster failed over to a node in a remote datacenter. Now imagine that the WAN link also failed. In this situation, the remote datacenter would lose communications with the cluster shared volume, thereby resulting in a cluster-level failure.

The only way to avoid these types of problems is to make sure that each datacenter has its own storage resources. There are a number of different ways to accomplish this, but one of the most popular solutions involves storage replication. The replication process generally occurs at the hardware level (as a function of an underlying SAN), but there are software layer storage replication solutions available as well.
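The difference between the two storage layouts can be expressed as a toy reachability check. This is only a sketch of the reasoning above: the site names are hypothetical, and the model says nothing about how the replicas are kept in sync.

```python
# Toy model: can the node that currently owns the workload reach its storage?
# Site names and the two layouts are hypothetical, for illustration only.

def storage_reachable(active_site: str, storage_sites: set, wan_up: bool) -> bool:
    """Return True if the active node can reach at least one copy of the data.

    A copy in the node's own site is always reachable; a copy in another
    site is reachable only while the WAN link is up.
    """
    if active_site in storage_sites:
        return True
    return wan_up and len(storage_sites) > 0

shared_in_primary = {"primary"}        # one shared volume, primary site only
replicated = {"primary", "remote"}     # a replica in each datacenter

# The workload has failed over to the remote site and the WAN link is down.
print(storage_reachable("remote", shared_in_primary, wan_up=False))  # False
print(storage_reachable("remote", replicated, wan_up=False))         # True
```

With a single shared volume in the primary site, the remote node is cut off as soon as the link drops. With a replica in each datacenter, the remote node still has a local copy to work from.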

Validating a Cluster

Any time that you build a Windows Server-based cluster, whether single-site or multi-site, you will need to run the Validate a Configuration Wizard. This wizard can be launched through the Failover Cluster Manager console, and is designed to make sure that you have met all of the prerequisites for building the cluster.

I mention this because if you attempt to build a Windows Server 2012-based multi-site cluster that does not make use of a cluster shared volume, a number of the validation tests will fail. This happens because some of the tests that the wizard performs are specifically designed with shared storage in mind. You will still be able to build the cluster even though the shared storage-related validation tests have failed. If you would rather not be distracted by storage-related validation error messages, the Validate a Configuration Wizard gives you the option of choosing which tests to run as a part of the validation process. That way you can simply skip any validation tests that do not apply to your situation.
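Validation can also be started from PowerShell rather than the wizard. The following Python sketch only assembles a command line and executes nothing. It assumes the Test-Cluster cmdlet with its -Node and -Ignore parameters and a test category named Storage; the node names are hypothetical, and the category name should be confirmed on your own system before relying on it.

```python
# Sketch: assemble a Test-Cluster command line that skips a test category.
# The node names are hypothetical; verify the category name on your own system.

def quote(value: str) -> str:
    """Wrap a value in PowerShell single quotes, doubling any embedded quote."""
    return "'" + value.replace("'", "''") + "'"

def build_validation_command(nodes: list, ignore: list) -> str:
    """Return a PowerShell command string; nothing is executed here."""
    if not nodes:
        raise ValueError("at least one node name is required")
    parts = ["Test-Cluster", "-Node", ",".join(quote(n) for n in nodes)]
    if ignore:
        parts += ["-Ignore", ",".join(quote(c) for c in ignore)]
    return " ".join(parts)

command = build_validation_command(["NodeA", "NodeB"], ignore=["Storage"])
print(command)
```

Building the string separately from running it makes it easy to review exactly which tests will be skipped before anything touches the cluster nodes.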

It is also worth noting that the cluster validation results can be misleading in another way. As the cluster validation tests progress, you will notice that the individual tests are color-coded. Tests that have not yet been performed are listed in black, and the test that is currently being run is displayed in blue. When a test completes, it is listed either in green to indicate success or in red to indicate failure. However, just because a test appears in red does not necessarily mean that a problem has occurred.

For whatever reason, Microsoft does not use a dedicated color (such as yellow) to indicate a warning state. Instead, warnings are displayed in red, just as failures are. Therefore, you cannot assume that an error has occurred merely because a test is displayed in red. The only way to know for sure where you stand with regard to the validation tests is to take an in-depth look at the Failover Cluster Validation Report that is automatically created at the end of the testing process.

Additional Storage Considerations

Earlier in this article, I suggested that cluster shared volumes might not always be a good fit for multi-site clusters, and that a better solution might be to use storage replication to create identical copies of the cluster storage in each datacenter. While this approach probably sounds good in theory, you might be wondering how it works in practice.

The actual replication process is beyond the scope of this article series. There are a number of different mechanisms that can be used for storage replication, either at the hardware level or at the software level. You can use whichever mechanism you prefer, so long as some form of replication keeps the storage in each datacenter synchronized.

With that said, the trick to making the cluster work is to make sure that each cluster node is configured to use the correct disk. The process of doing so actually begins when you run the Create Cluster Wizard. This wizard’s confirmation screen contains a check box that is used to add all eligible storage to the cluster. The check box is selected by default, so any storage that is visible to the cluster nodes and that meets the clustering prerequisites will be added to the cluster.

After the cluster has been created, you can open the Failover Cluster Manager and navigate through the console tree to Failover Cluster Manager | <your cluster> | Storage | Disks. Upon doing so, you should see a disk for each cluster node. Only the local disk will appear to be online, because the local cluster node cannot access storage that is directly attached to the other cluster nodes. In fact, when you use the High Availability Wizard to add a clustered resource (such as a clustered file server), you will only be able to add the cluster disk that the server sees as being online.

To make the replicated storage work with the cluster you must make the other node’s cluster storage available to the cluster. Each node needs access to its own disk in the event of a failover. I will explain how to accomplish this in Part 3 of this series.

Conclusion

In this article, I have explained that there are some challenges related to building multi-site clusters without the use of shared storage. However, avoiding the use of a cluster shared volume will often make for a more reliable cluster, because otherwise the cluster shared volume could become a single point of failure.

In Part 3, I will continue the discussion by talking more about best practices for non-shared storage for your multi-site cluster.
