Microsoft Cloud Networking Infrastructure Deployment Scenarios with Windows Server 8 (Part 1) - Traditional Datacenter

by Deb Shinder (Published on 5 April 2012 / Last Updated on 5 April 2012)

In this article we'll begin with a short introduction to the cloud and issues with cloud networking infrastructure.

If you would like to be notified when Deb Shinder releases the next part of this article series please sign up to the WindowsNetworking.com Real time article update newsletter.

Introduction

The next generation of Windows Server, currently known as Windows Server 8, is being touted by Microsoft as the secure cloud operating system, with hundreds of new features and technologies included in the new Windows platform to support clouds of all types. These are the technologies we’ve been asking for for over a decade, and now they’re finally here! The beta became available on March 1 and it showcases some pretty exciting new features. But exactly how will you go about deploying a secure cloud infrastructure based on the new operating system while providing optimal performance?

I’ve talked about cloud computing in several other articles on this site and on our sister site, WindowsSecurity.com, so I don’t want to rehash that. If you want a good explanation of cloud computing, then check out my husband’s lecture on Cloud Computing Principles, Concepts and Patterns that he delivered at TechDays 2012 in Belgium. And for a primer on securing your cloud environment, see A Solution for Private Cloud Security on the TechNet Wiki.

Cloud computing is a new approach to datacenter design and operations. Instead of the traditional datacenter approach, where software and services are tied tightly to a particular server or group of servers, cloud computing loosely couples services from the hardware that runs them. This makes your applications and services mobile: they are not tied to any specific piece of hardware in the datacenter, and they take advantage of the hardware abstraction you see in a cloud datacenter. This has many advantages but also presents new challenges, especially in the security arena.

So what does this look like in the datacenter? At a high level, what does the network, storage and compute infrastructure look like? Is there a single way to put together a cloud datacenter – or are there best practices or patterns that you can take advantage of while planning your move to a cloud deployment? That’s what we’re going to be addressing in this series of articles.

Although we hear a lot about cloud today, there isn’t a ton of information on how to actually build out a secure cloud – especially at the infrastructure layer. I think that’s because a lot of the chatter regarding cloud computing has been focused on public cloud Software as a Service (SaaS) offerings. Public cloud SaaS hides all the infrastructure, platform, and software components from the cloud consumer, giving the consumer only very limited visibility into the application and no visibility at all into the cloud infrastructure or application platform.

As you start architecting your new cloud-based datacenter, you’re going to need to know how to build out the infrastructure. To help us get started, Microsoft has provided an early look at some of the cloud infrastructure designs that we can consider when building out a new cloud infrastructure on-premises. This cloud infrastructure then might be used to create a private cloud Infrastructure as a Service, Platform as a Service, or Software as a Service offering. Over time, this on-premises private cloud might also be used to connect to hosted private cloud providers, or to public cloud providers.

Windows Server 8 Cloud Using a Traditional Datacenter Approach

The first scenario that Microsoft describes is what they call a cloud using a traditional datacenter approach, and you can see how it works in the figure below. The cloud takes advantage of Windows Failover Clustering, and each member of the failover cluster is configured in the same way, using the same hardware (this takes advantage of the Principle of Homogeneity).


Figure 1

In this figure, we see that each type of traffic is dedicated to its own network interface. The types of traffic include:

  • Storage traffic
  • Live Migration traffic
  • Cluster traffic
  • Management traffic
  • Guest or tenant traffic

The type of NIC that you use for each type of traffic depends on how much speed and what capabilities you want to enable for that traffic. The new technologies in Windows Server 8 that support high-speed, highly reliable networking are mostly available on the new 10 GbE NICs coming onto the market. The problem with these high-speed NICs is cost: the NICs themselves are expensive, and so is the per-port cost on a 10 GbE switch. Therefore, you need to think about the traffic profile for each type of traffic and tailor your NIC plans to the needs of that traffic.

North/South traffic

In this “non-converged” scenario (where the traffic to and from the members of the cluster is not converged onto a single NIC), you need to think about two general types of traffic. The first type, referred to as “North/South” traffic, is the traffic moving between the cluster members and the corpnet or Internet. The amount of North/South traffic you have depends mostly on the types of workloads your tenants are running. If the tenants are primarily compute intensive and not network intensive, you can probably get away with using teamed 1 GbE NICs to support this “tenant network”.
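To make that sizing decision concrete, here is a quick back-of-envelope sketch of whether a team of 1 GbE NICs can carry a host's aggregate North/South tenant traffic. All of the numbers (tenant counts, per-tenant averages, the 90% efficiency factor) are illustrative assumptions, not figures from the article.

```python
# Back-of-envelope check: can a team of 1 GbE NICs carry the expected
# North/South (tenant) traffic for one host? All numbers are illustrative.

def team_capacity_gbps(nic_count, nic_speed_gbps=1.0, efficiency=0.9):
    """Usable aggregate bandwidth of a NIC team, allowing for protocol
    overhead and imperfect load distribution across team members."""
    return nic_count * nic_speed_gbps * efficiency

def fits(tenant_count, avg_mbps_per_tenant, nic_count):
    """True if the estimated tenant demand fits within the team's capacity."""
    demand_gbps = tenant_count * avg_mbps_per_tenant / 1000.0
    return demand_gbps <= team_capacity_gbps(nic_count)

# 30 compute-heavy tenant VMs averaging 25 Mbps each need 0.75 Gbps,
# comfortably inside a two-member 1 GbE team (~1.8 Gbps usable).
print(fits(30, 25, 2))   # True
# The same tenants averaging 120 Mbps each (3.6 Gbps) overwhelm it.
print(fits(30, 120, 2))  # False
```

If the second case describes your tenants, that is the point at which the more expensive 10 GbE option starts to earn its cost.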

Remember that Windows Server 8 supports NIC teaming right out of the box! In the past, the NIC vendor had to provide driver support for NIC teaming. This was a big problem for most of us, since we considered NIC teaming a hard requirement in our datacenters. The problem was made even worse by the fact that, if there was a problem with NIC teaming, Microsoft support would tell us the first thing we needed to do when troubleshooting was to turn off NIC teaming. Then the finger pointing began. With out-of-the-box support for NIC teaming, which is independent of vendor driver support, we don’t have to worry about finger pointing any more. Also, the built-in NIC teaming works with NICs from different vendors. Nice!

East/West traffic

The other type of traffic you need to think about is what is referred to as “East/West” traffic. The East/West traffic is the traffic that moves between the various components of the cloud. This traffic tends to have higher data transfer rate requirements and will likely be the traffic for which you will want to consider the high speed, high cost 10 GbE network interface cards. East/West traffic includes:

  • Storage traffic
  • Live Migration traffic
  • Cluster Traffic
  • Management Traffic

In general, the Live Migration, cluster and management traffic isn’t going to be very network bandwidth intensive. When you perform a Live Migration, you aren’t moving the storage itself; you’re just moving the configuration files and the contents of memory to another member of the Hyper-V cluster. If the workload is small, Live Migration shouldn’t take much network bandwidth. Of course, if you’re running workloads with 16 GB, 32 GB or more of memory, then the bandwidth requirements become significant, and in that case, you might think about using a high-speed 10 GbE NIC for your Live Migration traffic. But even then, the service continues to run during the Live Migration, and if the network is interrupted mid-migration, the workload keeps running on its current host. That means even if you have workloads with high memory requirements, you might not be able to make a strong argument for requiring a 10 GbE NIC. You can team multiple 1 GbE NICs and get the bandwidth you need in most cases.
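A rough calculation shows why large-memory workloads change the Live Migration picture. This sketch assumes the link runs near line rate and ignores the re-copy passes Live Migration makes for memory pages dirtied during the transfer, so real migrations will take somewhat longer.

```python
# Rough estimate of the memory-copy time for a Live Migration at
# different link speeds. Assumes ~90% of line rate is usable and
# ignores re-copy passes for dirtied pages, so real times run longer.

def migration_seconds(memory_gb, link_gbps, efficiency=0.9):
    """Seconds to push memory_gb of guest RAM over a link_gbps link."""
    bits = memory_gb * 8  # gigabits of memory to transfer
    return bits / (link_gbps * efficiency)

for memory_gb, link_gbps, label in [
    (4, 1, "small VM, single 1 GbE"),
    (32, 1, "large VM, single 1 GbE"),
    (32, 4, "large VM, team of four 1 GbE"),
    (32, 10, "large VM, single 10 GbE"),
]:
    print(f"{label}: ~{migration_seconds(memory_gb, link_gbps):.0f} s")
```

A 32 GB VM takes close to five minutes over a single 1 GbE link, but a team of four 1 GbE NICs brings that down near a minute, which is why teaming can often substitute for a 10 GbE purchase here.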

For the cluster and management traffic, it’s hard to see where you would need more than a single NIC for each of those traffic classes. Cluster traffic is for managing the state of the cluster, which certainly won’t tax a 1 GbE NIC. And management traffic is also relatively puny in comparison to the bandwidth delivered on a 1 GbE NIC. In almost all cases, a single, non-teamed 1 GbE NIC would suffice for the cluster and management traffic, at least from a bandwidth perspective.

However, there is the reliability perspective that you might want to account for. Since cloud is all about service delivery and taking a service provider’s approach, you might want to use more than one NIC for the Live Migration, cluster and management traffic. Keep in mind that the “official” name for NIC teaming in Windows Server 8 is “Load Balancing and Failover (LBFO)”. We’ve been focusing on the bandwidth requirements up to this point, but the high-availability requirements need to be considered too.

If we want to take a service provider’s approach, we need to make the entire solution as available as possible by taking advantage of the principle of resiliency. While resiliency typically relies on software, rather than hardware, to solve high availability problems, a combination of software and hardware can also support it. Therefore, to make your cloud resilient to failure, and to help ensure that faults do not become disasters, you should take advantage of NIC teaming even for your low-bandwidth traffic classes.

Finally, there is the storage traffic. Storage is a bit different, since you may or may not be working with Ethernet when connecting to storage. Other options include Fibre Channel and InfiniBand. FC and IB solutions tend to be very cost intensive, not only from the hardware perspective, but also in terms of finding people with experience in these areas to deploy them. In contrast, iSCSI and JBOD-based storage solutions use Ethernet, and of course experience in this area is commonplace.

Storage traffic requires high speed and low latency. We need to be able to come as close to bus speeds as possible for this type of traffic. In the past, this was a big challenge, because the networking hardware and software to enable performance close to DAS connected storage just wasn’t there – that’s why we needed FC and IB. However, with Windows Server 8, that game has changed.

New technologies in Windows Server 8, such as Remote Direct Memory Access (RDMA) and Server Message Block 2.2 (SMB 2.2), enable you to get storage performance that is roughly 97% that of Direct Attached Storage (DAS). Wow! If you team a pair of 10 GbE NICs for your storage traffic, you have a total of 20 Gbps in each direction. It’s hard to fill a pipe that big. And you’ll need this for the live storage migrations that you’ll be able to do with Windows Server 8 Hyper-V.
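To put that 20 Gbps pipe in perspective, here is a quick estimate of how long a live storage migration of a virtual disk would take over a teamed pair of 10 GbE NICs. The disk size and the 90% efficiency factor are illustrative assumptions, and the sketch assumes the storage on both ends can actually sustain the transfer rate.

```python
# Illustrative estimate: moving a virtual disk over a teamed pair of
# 10 GbE NICs (~20 Gbps aggregate). Assumes ~90% of line rate is usable
# and that the disks on both ends can sustain the transfer.

def transfer_seconds(size_gb, link_gbps, efficiency=0.9):
    """Seconds to move size_gb of data over a link_gbps pipe."""
    return size_gb * 8 / (link_gbps * efficiency)

vhd_gb = 500  # hypothetical virtual disk size
secs = transfer_seconds(vhd_gb, 20)
print(f"500 GB virtual disk over teamed 10 GbE: ~{secs:.0f} s")
# Under four minutes for half a terabyte -- compare ~1 hour over 1 GbE.
print(f"Same disk over a single 1 GbE NIC: ~{transfer_seconds(vhd_gb, 1) / 60:.0f} min")
```

Numbers like these are why the storage network is usually the one traffic class where the 10 GbE price tag is easiest to justify.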

Summary

In this article we began with a short introduction to the cloud and issues with cloud networking infrastructure. Then we described the first of three cloud datacenter networking scenarios, in which each traffic class is assigned a separate physical NIC or a separate team of physical NICs. We discussed some of the new Windows Server 8 technologies that optimize the networking infrastructure for the cloud, and some of the considerations you’ll want to take into account when thinking about North/South and East/West traffic. In the next article, we’ll finish up by talking about the other two cloud networking deployment scenarios. See you then! –Deb.

