From ServerSwitch to SONiC Chassis: Ten Years of Data Center Switch Technology

via: 博客园     time: 2019/7/9 19:24:20     reads: 975

Editor's note: SONiC (Software for Open Networking in the Cloud) has been deployed at scale in Microsoft Azure data centers, but deploying SONiC on higher-tier chassis switches remains a challenge. To address this, Microsoft Research Asia and the Azure Network Products Department have built the first SONiC Chassis prototype system, the result of ten years of persistent exploration by the researchers.

With the advent of the cloud computing era, companies keep building data centers around the world to support their rapidly growing cloud businesses. In 2011, Facebook and other companies launched a non-profit organization, the Open Compute Project (OCP), which aims to rebuild the next generation of data center hardware through open source and to develop innovative server, storage, network and infrastructure hardware for next-generation data centers.

OCP is currently one of the broadest and most influential organizations in computing infrastructure hardware worldwide. Its members include Facebook, Microsoft, Google, Intel, AMD, Alibaba, Tencent, Baidu and Huawei.

In a data center, tens of thousands of servers are connected by a large number of switches, forming a high-bandwidth, low-latency network. Cloud services demand very high reliability, so network operators need switches that are highly controllable and manageable: they must be able to see what is happening in the network at any time and to locate and troubleshoot faults quickly when they occur.

In addition, new cloud services keep placing new functional requirements on switches, so network developers must implement new switch features and deploy them to production quickly. Equipment from traditional switch vendors has increasingly struggled to meet these challenges and requirements, so the major cloud computing companies have started developing their own switches.

SONiC Chassis

As part of its collaboration with OCP, Microsoft released the open source switch operating system SONiC (Software for Open Networking in the Cloud) at the 2016 OCP Summit. The project uses the Switch Abstraction Interface (SAI) to provide a unified management and operation interface across different switching chips, and decomposes the switch software into several container modules to speed up iterative software development, as shown in Figure 1. SONiC has won support from many cloud computing, switch and chip vendors. Current members of the open source community include cloud computing vendors such as Microsoft, Alibaba and Tencent, and switch and chip vendors such as Mellanox, Cisco and Arista.

Figure 1: SONiC system architecture
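
The idea of a unified interface over different switching chips can be illustrated with a small sketch. This is not the real SAI API, which is defined in C; the class names, the two made-up vendors and the example addresses below are all assumptions chosen only to show how switch software can be written once against an abstract interface.

```python
from abc import ABC, abstractmethod


class SwitchChip(ABC):
    """Hypothetical stand-in for a SAI-like interface (not the real SAI C API)."""

    @abstractmethod
    def add_route(self, prefix: str, next_hop: str) -> None:
        ...

    @abstractmethod
    def routes(self) -> dict:
        ...


class VendorAChip(SwitchChip):
    """Made-up vendor that keeps routes in a plain dictionary."""

    def __init__(self) -> None:
        self._table = {}

    def add_route(self, prefix: str, next_hop: str) -> None:
        self._table[prefix] = next_hop

    def routes(self) -> dict:
        return dict(self._table)


class VendorBChip(SwitchChip):
    """Made-up vendor that keeps routes as a list of (prefix, next_hop) pairs."""

    def __init__(self) -> None:
        self._entries = []

    def add_route(self, prefix: str, next_hop: str) -> None:
        # Replace any existing entry for the same prefix.
        self._entries = [(p, n) for p, n in self._entries if p != prefix]
        self._entries.append((prefix, next_hop))

    def routes(self) -> dict:
        return dict(self._entries)


def program_example_routes(chip: SwitchChip) -> None:
    """Switch software written once against the interface, for any chip."""
    chip.add_route("10.0.1.0/24", "192.0.2.1")
    chip.add_route("10.0.2.0/24", "192.0.2.2")


chip_a, chip_b = VendorAChip(), VendorBChip()
program_example_routes(chip_a)
program_example_routes(chip_b)
```

Both chips end up with the same routes even though they store them differently, which is the property the abstraction is meant to provide.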

At present, SONiC has been deployed at scale in Microsoft Azure data centers (as shown in Figure 2). However, the deployment is so far confined to the T0 and T1 switch tiers. How to deploy SONiC on the chassis switch tiers (T2/T3) remains a major challenge. Therefore, in September 2018, the Microsoft Azure Network Products Department and the Systems and Networking Group at Microsoft Research Asia launched a joint research project, SONiC Chassis, to design a chassis switch that supports SONiC.

Figure 2: SONiC deployment in Microsoft's data center

A traditional chassis switch is in fact composed of multiple switch chips (see Figure 3 for an example). The front-end chips and back-end chips are connected by a special lossless network based on cell switching.

Currently, there are no open topology or routing standards for the cell network inside a chassis switch. Each chip vendor implements the cell network in its chassis switches differently and does not publish the details, so it is effectively a black box. With this opacity, it is difficult for network operators to manage a chassis switch with SONiC or to detect and diagnose network problems inside the chassis.

Figure 3: Internal architecture of traditional Chassis Switch

To enable SONiC to run on a chassis switch, we first need to turn the chassis switch into a white box that network administrators are familiar with. Like a traditional chassis switch, SONiC Chassis is still composed of multiple switching chips. The difference is that we use a standard (two-tier) Clos Ethernet network to connect these chips (as shown in Figure 4).

Clos Ethernet is the standard architecture of today's data centers. With it, network operators can carry over a large body of mature data center network technology (such as control plane protocols, flow control mechanisms and fault diagnosis techniques), along with their operations and management experience, directly to the network inside the chassis.

Figure 4: Internal network topology of SONiC Chassis
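
As a rough illustration of the Clos wiring described above, the sketch below builds a small fabric in which every front-end chip links to every back-end chip. The chip counts, the `fe`/`be` names and both helper functions are assumptions for illustration only; the article does not give the actual numbers used in SONiC Chassis.

```python
from itertools import product


def build_clos(num_front: int, num_back: int) -> dict:
    """Return an adjacency map for a Clos fabric of front-end and back-end chips.

    Every front-end chip links to every back-end chip; chips in the same
    tier are never linked directly.
    """
    front = [f"fe{i}" for i in range(1, num_front + 1)]
    back = [f"be{i}" for i in range(1, num_back + 1)]
    links = {name: set() for name in front + back}
    for f, b in product(front, back):
        links[f].add(b)
        links[b].add(f)
    return links


def equal_cost_paths(links: dict, src: str, dst: str) -> list:
    """List the two-hop paths src -> back-end chip -> dst between front-end chips."""
    return [[src, mid, dst] for mid in sorted(links[src] & links[dst])]


# Hypothetical sizes: 4 front-end chips and 2 back-end chips.
fabric = build_clos(num_front=4, num_back=2)
paths = equal_cost_paths(fabric, "fe1", "fe4")
```

Because the fabric is plain Ethernet with a regular topology, each pair of front-end chips has one equal-cost path per back-end chip, which is the kind of structure existing data center tooling already knows how to manage.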

Once the topology is fixed, the next challenge is the control plane inside the chassis. Each chip in SONiC Chassis runs its own SONiC instance and uses BGP-EVPN as the control plane protocol. The SONiC instances on the front-end chips exchange external routing information directly with each other over BGP-EVPN, without involving the SONiC instances on the back-end chips. As a result, only the front end needs expensive chips with large routing tables, which leaves more freedom in choosing back-end chips. For example, the back end can use chips with high port density and small routing tables to raise the port density of the whole SONiC Chassis switch.

To work with BGP-EVPN, SONiC Chassis adopts a standard network virtualization technology.

Figure 5: Control plane of SONiC Chassis. In this example, SONiC on VTEP1 sends the routing information for 10.0.1.0/24 directly to SONiC on VTEP6.
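
The division of labor between front-end and back-end chips can be sketched as a toy model. This is not BGP-EVPN itself and not SONiC code: the `FrontEnd` and `BackEnd` classes, the back-end name `BE1` and the 192.0.2.x underlay addresses are all hypothetical, and the direction of the advertisement (VTEP1 announcing 10.0.1.0/24 to VTEP6) is one reading of the Figure 5 example.

```python
import ipaddress


class FrontEnd:
    """Front-end chip: holds the large table of external routes."""

    def __init__(self, name: str, underlay_ip: str) -> None:
        self.name = name
        self.underlay_ip = underlay_ip
        # External prefix -> underlay IP of the front-end chip that owns it.
        self.overlay_routes = {}

    def advertise(self, prefix: str, peers: list) -> None:
        """Send one external prefix straight to the other front-end chips."""
        network = ipaddress.ip_network(prefix)
        for peer in peers:
            if peer is not self:
                peer.overlay_routes[network] = self.underlay_ip

    def lookup(self, address: str):
        """Longest-prefix match; returns the remote chip's underlay IP or None."""
        ip = ipaddress.ip_address(address)
        matches = [net for net in self.overlay_routes if ip in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        return self.overlay_routes[best]


class BackEnd:
    """Back-end chip: only needs underlay routes to the front-end chips."""

    def __init__(self, name: str, front_ends: list) -> None:
        self.name = name
        self.underlay_routes = {fe.underlay_ip: fe.name for fe in front_ends}


vtep1 = FrontEnd("VTEP1", "192.0.2.1")
vtep6 = FrontEnd("VTEP6", "192.0.2.6")
front_ends = [vtep1, vtep6]
back_end = BackEnd("BE1", front_ends)

vtep1.advertise("10.0.1.0/24", front_ends)
next_hop = vtep6.lookup("10.0.1.25")
```

In this model the back-end table grows with the number of front-end chips rather than with the number of external prefixes, which is why a chip with a small routing table can suffice there.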

After nearly half a year of close cooperation, Microsoft Research Asia and the Azure Network Products Department jointly built the first SONiC Chassis prototype system in February 2019 and demonstrated it at the OCP Global Summit in March, where it attracted wide attention from the industry.

We are continuing to work on several key technical problems in SONiC Chassis, such as congestion control and mechanisms for fault monitoring and diagnosis inside the chassis, and aim to deploy SONiC Chassis in Microsoft's data centers as soon as possible.

Figure 6: The prototype system of SONiC Chassis, presented at the OCP Global Summit in March 2019

From ServerSwitch to SONiC Chassis: Ten Years of Perseverance

Figure 7: ServerSwitch architecture

ServerSwitch connects a commodity switch chip to the server CPU through a high-bandwidth PCI-E interface, making full use of the programmability of both the switch chip and the CPU to deliver a high-performance programmable platform. In 2011, the ServerSwitch paper was published at the USENIX Symposium on Networked Systems Design and Implementation (NSDI), a top conference on computer systems and networks, and won the Best Paper Award.

However, work that was so well received in academia went through twists and turns inside Microsoft.

The turnaround came in 2013, when Guohan Lu, the core developer of ServerSwitch (now chief development manager of Microsoft's SONiC project), joined the Azure Network Products Department. Microsoft Research Asia and the Azure Network Products Department then launched the Azure Cloud Switch (ACS) cooperation project to design a cross-platform, modular switch operating system. This was probably Microsoft's earliest attempt at a self-developed switch operating system.

Figure 8: Server Switch

Just when everyone was full of joy, fate played a joke. ACS was originally developed on Microsoft's Windows platform. As the ACS Windows prototype system was about to be completed, Microsoft began actively embracing Linux and open source. Windows was not the mainstream operating system for open source network systems, so Albert Greenberg, head of the Azure Network Products Department, made the call: stop ACS development on Windows and move to Linux. That effort became today's SONiC project.

Today, SONiC's partner ecosystem covers the mainstream hardware and software vendors in the industry. Its openness gives users a wide range of choices: partners can freely choose hardware and software according to their network requirements. Partner participation has in turn further strengthened the SONiC ecosystem.

Looking back on the past decade of exploration, the networking researchers at Microsoft Research Asia have experienced the excitement of winning a best paper award at a top conference, the frustration of struggling to get the technology adopted, the helplessness of having a project called off at the last moment, and the joy of today's milestone results. Their journey of exploration continues.
