Strategies for Monitoring Failover Clusters (Part 5)

by [Published on 17 Jan. 2012 / Last Updated on 17 Jan. 2012]

This article continues the discussion of failover cluster monitoring by showing the types of health information that System Center Operations Manager exposes for a failover cluster.

If you would like to read the other parts in this article series please go to:

Introduction

In the previous article in this series, I showed you how to connect to your failover cluster using System Center Operations Manager, and how to use the simple diagram shown in Figure A to assess the cluster’s health. Now that we are actively monitoring a failover cluster, I want to spend a little bit of time talking about the types of health information that System Center Operations Manager can provide about your cluster.


Figure A: System Center Operations Manager uses diagrams to display cluster health.

As you look at the diagram above, you will notice that I have expanded a Windows failover cluster named ProdCluster. When I select this cluster, the pane at the bottom of the console displays some basic information such as the cluster’s name and OS version. What is more interesting, however, are the elements of the diagram that are displayed just to the right of the ProdCluster icon.

There is actually a lot more to the diagram than what you can see in the screen capture, because the diagram extends well beyond the boundaries of the screen. That being the case, I will start with the parts of the diagram that you can see in the figure, and then come back to the parts that you can’t see.

The icons that are displayed to the right of the ProdCluster icon represent the individual elements that make up the cluster. You will notice in Figure A that each of these items has a green checkmark icon next to it, indicating that the component is in a healthy state. If any of the cluster components were in an error or a warning state, then the cluster as a whole (the ProdCluster icon) would reflect that state as well.
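The rollup behavior described above can be illustrated with a small sketch. This is not how System Center Operations Manager is implemented; it is a minimal Python model, assuming a simple "worst state wins" rule, in which a parent shows the worst state found among itself and its children. The class names, the state names, and the warning placed on Cluster Network 2 are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List


class Health(IntEnum):
    # Ordered so that a larger value means a worse state.
    HEALTHY = 0
    WARNING = 1
    ERROR = 2


@dataclass
class Component:
    # Hypothetical model of one icon in the diagram.
    name: str
    own_state: Health = Health.HEALTHY
    children: List["Component"] = field(default_factory=list)

    def rolled_up_state(self) -> Health:
        """Return the worst state among this component and its descendants."""
        states = [self.own_state] + [c.rolled_up_state() for c in self.children]
        return max(states)


# Illustrative data only: the warning on Cluster Network 2 is invented.
cluster = Component("ProdCluster", children=[
    Component("Cluster Network 1"),
    Component("Cluster Network 2", own_state=Health.WARNING),
    Component("E2K7CMS (ProdCluster)"),
])

print(cluster.name, cluster.rolled_up_state().name)
```

With every child healthy, the ProdCluster entry would report HEALTHY; a single child in a warning state is enough to change what the parent shows.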

Cluster Networking

The top two icons shown in Figure A are Cluster Network 1 and Cluster Network 2. As the names imply, these icons represent the network resources used by the cluster. The servers in this cluster are connected to two separate networks. One is the regular private network that is used by all of my other servers and desktops. The other is a dedicated network used solely for cluster heartbeat and replication traffic.

System Center Operations Manager allows you to expand a cluster network to view some basic information about it. If you look at Figure B for example, you will notice that when I select Cluster Network 1, the pane at the bottom of the window displays the network’s IP address. The diagram also expands to show which physical network adapters are being used by the network.


Figure B: System Center Operations Manager displays cluster-related network health and configuration information.

Clustered Applications

If you look back at Figure A, the icon in the lower-right corner of the diagram is labeled E2K7CMS (ProdCluster). This icon represents a clustered application (named E2K7CMS) that is running on top of the cluster (ProdCluster). In this case, the clustered application is an Exchange 2007 mailbox server.

This portion of System Center Operations Manager isn’t going to give us a tremendous amount of information about the clustered mailbox server’s health. After all, Microsoft offers an entire management pack just for Exchange Server. Even so, there is a certain amount of information that the console will display for any clustered application.

If you look at Figure C, you can see that when I expand the listing for my clustered application, the console shows me which cluster nodes the application is configured to run on, as well as the health of those nodes. The Detail View, which is located in the pane beneath the diagram, tells you whether or not the application is in a persistent state and which cluster node is initially treated as the active node for the clustered application.


Figure C: System Center Operations Manager displays the cluster nodes that host the clustered application.

The Cluster Nodes

If you scroll to the lower part of the diagram, System Center Operations Manager displays an icon for each node within the selected cluster. Once again, a green checkmark icon next to each node indicates that the individual nodes are all healthy.

If you look at Figure D, you can see two cluster nodes (ExchNode1 and ExchNode2). ExchNode2, which appears at the bottom of the figure, is in its default view and merely identifies the node and confirms that it is healthy.


Figure D: You can get information about individual cluster nodes.

In the figure above, I have expanded the listing for ExchNode1. You will notice that by doing so I can see the cluster service and all of the individual components that make up the cluster node. Selecting the individual icons causes System Center Operations Manager to display a wealth of information about the cluster node. For instance, in Figure E you can see information related to the way that the System Center Operations Manager agent interacts with the cluster node.


Figure E: You can view information about how the agent interacts with the cluster node.

Available Storage

If you scroll to the top of the diagram, you will see a listing for Available Storage. If you look at Figure F, you will see that when I expand the Available Storage container, System Center Operations Manager displays both of my cluster nodes. Normally it would be possible to click on a node and get information about the underlying storage. However, I am working with a clustered Exchange mailbox server and Exchange mailbox servers do not use shared storage. That being the case, the Available Storage container doesn’t really provide anything useful in this diagram.


Figure F: System Center Operations Manager displays information related to available storage.

The Cluster Group

The final icon that is provided by the diagram of the cluster is the Cluster Group icon. When you expand this icon, the System Center Operations Manager console shows you the individual nodes that make up the cluster group, as shown in Figure G.


Figure G: Expanding the Cluster Group icon causes the console to reveal the individual nodes that make up the failover cluster.

What is more interesting, however, is the information that is displayed in the Detail View. As you can see in the figure above, System Center Operations Manager provides you with some basic information about the way that the cluster is configured. For example, you can see the cluster group name, the failover period, the failover threshold, and the always-important auto failback type. In addition, the Detail View tells you whether or not the cluster is presently in a persistent state and which cluster node is initially active. In this particular case, for example, the node named ExchNode1 is configured to initially act as the active node.
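To make the failover period and failover threshold values a little more concrete, here is a rough Python sketch of how the two settings are generally understood to work together: the threshold caps the number of failovers that are attempted for a group within the period (expressed in hours). This is only a model of that rule, not the cluster service's actual logic, and the function name, parameters, and sample values are all hypothetical.

```python
from datetime import datetime, timedelta
from typing import List


def may_fail_over(previous_failovers: List[datetime],
                  now: datetime,
                  failover_threshold: int,
                  failover_period_hours: int) -> bool:
    """Return True if another failover fits within the threshold.

    Assumption: only failovers inside the trailing failover period count
    against the threshold.
    """
    window_start = now - timedelta(hours=failover_period_hours)
    recent = [t for t in previous_failovers if t > window_start]
    return len(recent) < failover_threshold


# Illustrative values only; they are not taken from the figure.
now = datetime(2012, 1, 17, 12, 0)
history = [now - timedelta(hours=1),
           now - timedelta(hours=2),
           now - timedelta(hours=10)]

# Two failovers fall inside a 6-hour period, so a threshold of 2 is used up.
print(may_fail_over(history, now, failover_threshold=2, failover_period_hours=6))
```

Under this model, a low threshold combined with a long period keeps a repeatedly failing group from bouncing between nodes indefinitely.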

Conclusion

As you can see, you can gain a wealth of information about the current state of your failover cluster simply by expanding the diagram provided by System Center Operations Manager. As you may recall, however, the original premise of this article series was that if a failure occurs within a failover cluster, then you need to know about it. In Part 6 I will conclude this series by showing you how alerting works for cluster failures.
