Project Falco joins SONiC Community (Software for Open Networking in the Cloud)

One of the key components of our vision for LinkedIn’s global infrastructure is to ultimately build a programmable data center fabric on top of an open network operating system. As we scale out our data centers, we want to control the complexity of the fabric by moving toward an automated, self-defined, purpose-built, application-centric network that operates on its own.

To help reach this goal, we are happy to announce today that Software for Open Networking in the Cloud (SONiC) is the open source platform we've chosen, which lets Project Falco focus more on control and management software components.

Project Falco: A Brief History

In late 2014, we launched Project Falco to focus on decoupling network hardware and software. The Falco working group put thousands of hours into developing our first network switch, Pigeon, a 3.2 Tbps switching platform that enabled choice in hardware and software selection and ultimately gave us more control over our infrastructure and ownership of our architecture. Pigeon was deployed at larger scale in our next-generation data center in Oregon last year. Project Falco takes advantage of a standard Linux operating system to run control plane and management plane tools that are very similar to those we run on the servers in LinkedIn's production network. You can read the blog post from our announcement of Project Falco here.

Open Sourced by Microsoft

Running one of the most extensive clouds in the world, Microsoft has gained tremendous insight into building and managing a global network with tens of thousands of switches. With this expertise, they built SONiC and open sourced it to the community, making it available on the SONiC GitHub Repository. SONiC is a uniquely extensible platform, with a large and growing ecosystem.

The Production Engineering team at LinkedIn quickly recognized the value that SONiC and its open source community would bring as the network operating system for the data center fabric architecture defined by Project Altair.

As discussed in earlier blog posts, we require a switching platform that allows us to be more flexible in our system design choices while abstracting away complexity. We ensure this flexibility through a set of predefined criteria:

  • Run our silicon of choice on any hardware platform
  • Run some of the same infrastructure software and tools we use on our application servers on the switching platform; for example, telemetry, alerting, Kafka, logging, security, and software development toolkits
  • Respond quickly to requirements and change
  • Advance DevOps operations such that switches are run like servers and share a single automation and operational platform
  • Limitless programmability options
  • Feature velocity and simplicity
  • Faster and better innovation cycle
  • Greater control of hardware and software costs

SONiC is built on the Switch Abstraction Interface (SAI), which defines a standardized API and allows network hardware vendors to develop innovative hardware platforms that can achieve significant speeds while keeping the programming interface to the ASIC (Application-Specific Integrated Circuit) consistent. Microsoft open sourced SAI in 2015, and it was accepted by OCP in July of the same year. This approach enables operators to take advantage of the rapid innovation in silicon, CPU, power, port density, optics, and speed, while preserving their investment in one unified software solution across multiple platforms. Similarly, SAI enabled us to support different merchant silicon options in our data centers in a unified manner.
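
To illustrate the idea of a consistent programming interface over different silicon, here is a minimal sketch in Python. It is not the real SAI API: the class names (`SwitchAbstraction`, `VendorASilicon`, `VendorBSilicon`) and the function `program_fabric` are hypothetical names chosen for this example, and the "hardware" is just an in-memory table.

```python
from abc import ABC, abstractmethod


class SwitchAbstraction(ABC):
    """Hypothetical vendor-neutral interface (illustrative only, not the real SAI API)."""

    @abstractmethod
    def add_route(self, prefix: str, next_hop: str) -> None:
        """Install or overwrite a route for the given prefix."""

    @abstractmethod
    def routes(self) -> dict:
        """Return the installed routes as a prefix -> next hop mapping."""


class VendorASilicon(SwitchAbstraction):
    """Hypothetical backend that keeps its routes in a dictionary."""

    def __init__(self):
        self._table = {}

    def add_route(self, prefix, next_hop):
        self._table[prefix] = next_hop

    def routes(self):
        return dict(self._table)


class VendorBSilicon(SwitchAbstraction):
    """Hypothetical backend with a different internal layout (a list of tuples)."""

    def __init__(self):
        self._entries = []

    def add_route(self, prefix, next_hop):
        # Drop any existing entry for the prefix, then append the new one.
        self._entries = [(p, n) for p, n in self._entries if p != prefix]
        self._entries.append((prefix, next_hop))

    def routes(self):
        return dict(self._entries)


def program_fabric(switch: SwitchAbstraction) -> dict:
    """Control-plane code written once against the abstract interface."""
    switch.add_route("10.0.0.0/24", "192.0.2.1")
    switch.add_route("10.0.1.0/24", "192.0.2.2")
    return switch.routes()
```

In this sketch, `program_fabric(VendorASilicon())` and `program_fabric(VendorBSilicon())` return the same routes even though each backend stores them differently, which is the property the paragraph above attributes to a standardized interface.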

LinkedIn will utilize SONiC as the base platform to implement the OpenFabric initiative (a web-scale protocol designed for data center fabric). In addition, LinkedIn is planning to contribute back several items to the SONiC community:

  • Enabling the Free Range Routing (FRR) stack
  • Adjusting and improving the routing stack, BGP convergence, and FIB acceleration
  • Improving ECMP (Equal Cost Multi-Path) for the data center fabric
  • Introducing the OpenFabric web-scale protocol
  • Onboarding ODM (Whitebox) solutions to the project
  • Introducing and supporting Open19 hardware platforms
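
For readers unfamiliar with ECMP, the general idea can be sketched as hashing a flow's identifying fields and using the result to choose among equal-cost next hops, so that packets of one flow stay on one path. The sketch below is a generic illustration only, not the SONiC implementation or the improvement LinkedIn plans to contribute; the function name `pick_next_hop`, the flow tuple layout, and the spine names in the usage note are assumptions made for the example.

```python
import hashlib


def pick_next_hop(flow: tuple, next_hops: list) -> str:
    """Choose one of several equal-cost next hops for a flow.

    `flow` is assumed to be (src_ip, dst_ip, protocol, src_port, dst_port).
    The same flow always maps to the same next hop while the list of
    next hops is unchanged.
    """
    if not next_hops:
        raise ValueError("no next hops available")
    # Build a stable byte string from the flow fields and hash it.
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    # Reduce the first 8 bytes of the hash to an index into the next-hop list.
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return next_hops[index]
```

For example, calling `pick_next_hop(("10.0.0.1", "10.0.1.1", 6, 12345, 443), ["spine1", "spine2", "spine3", "spine4"])` repeatedly returns the same spine each time. A simple modulo scheme like this remaps many flows when a next hop is added or removed, which is one reason ECMP behavior in a fabric is an area for improvement.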

LinkedIn’s adoption of and collaboration with SONiC and SAI are aimed at improving the quality and convergence of the project. For example, we have been improving the routing stack, writing our own OpenFabric control plane on top of SONiC, and integrating these projects with Open19 hardware. We are very excited about the increasingly strong momentum behind the platform to build programmable networks that can connect and deliver the best experience to our members.

Acknowledgements

Falco and its integration with SONiC are based on the efforts of the Falco working group in the PIE organization at LinkedIn. Special thanks to technical lead Shawn Zandi; the Microsoft software development teams; LinkedIn technical program managers Fabio Parodi, Vish Shetty, and Melanie Wong; and the network development group: Sadaf Fardeen, Doug Hanks, Ravi Jonnadula, Rodny Molina, Nikos Triantafillis, Russ White, and Zhenggen Xu.