Rethinking Path Validation: Pt. 2
March 16, 2016
BGP, or the Border Gateway Protocol, is a widely used protocol that allows very large networks, such as the Internet, to scale. Although BGP was originally developed for Internet routing, it is now used in some large institutional networks as well.
This two-part post will discuss one proposed model for improving BGP security, based on a talk given by LinkedIn Network Architect Russ White at NANOG 66. The first part explained how BGP works, described the shortcomings of “transitive trust” and outlined the requirements for a system that would provide better security. The second part, below, describes an architecture that meets those requirements.
We left off last time after describing the eight operational requirements that any system must meet if it is to reduce our reliance on transitive trust in relation to the AS Path. As a reminder, an AS, or Autonomous System, is a set of routers that all follow a standard set of policies. A typical AS is run by a single organization, such as a company or an ISP. Now, let’s see how this kind of system might work in practice.
Let’s look at a new advertisement type that carries connectivity information. In the network diagram above, for instance:
- AS65004 would advertise a connection to AS65002 and AS65003
- AS65002 would advertise a connection to AS65004 and AS65000
- AS65000 would advertise a connection to AS65002 and AS65003
- AS65001 would advertise a connection to AS65003
- AS65003 would advertise a connection to AS65000 and AS65004
Each of these advertisements would be carried in a form that can be cryptographically verified, so that AS65004, on receiving AS65000’s connectivity advertisement, could use AS65000’s public key to verify that the contents of the advertisement have not been changed.
Once AS65004 has advertisements from each of the autonomous systems in the internetwork, some device within the AS can compute a set of path pairs which represent valid connections.
In computing these path pairs, it’s important to check both of the connected autonomous systems for a claim of connectivity. For instance, if AS65001 claims a connection to AS65000, but AS65000 doesn’t claim a connection to AS65001, the connection doesn’t exist, and hence any AS Path that contains the pair [65000,65001] is considered invalid. Another way to phrase this is to say that some device can build a Directed Acyclic Graph (DAG) and apply a two-way connectivity check (TWCC) to each edge in the graph. Another way of constructing this graph is to reverse the nodes and edges, so the autonomous systems are edges, and the links between the autonomous systems are the nodes.
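The two-way check described above can be sketched in a few lines of Python. This is only an illustration: the `claims` table copies the advertisements listed earlier in this post, and the function name `valid_pairs` is a hypothetical choice, not part of any proposed implementation.

```python
# Connectivity claims as listed earlier in the post:
# advertising AS -> set of neighbours it claims to connect to.
claims = {
    65004: {65002, 65003},
    65002: {65004, 65000},
    65000: {65002, 65003},
    65001: {65003},
    65003: {65000, 65004},
}


def valid_pairs(claims):
    """Return the AS pairs for which BOTH sides claim the connection."""
    pairs = set()
    for asn, neighbours in claims.items():
        for peer in neighbours:
            # A one-sided claim is ignored: the peer must claim us back.
            if asn in claims.get(peer, set()):
                pairs.add(frozenset((asn, peer)))
    return pairs


pairs = valid_pairs(claims)
```

With the claims as listed, AS65001's claim of a connection to AS65003 is not reciprocated, so that pair is left out of the result.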
It’s important to note that one of the requirements is that we don’t want to tell any particular AS how to use the information; we’re looking for the final result, rather than the processing that takes place within the AS. Because of this, we don’t need to spend a lot of time considering whether a set of AS Path pairs, a DAG, or an inverted DAG is actually used in any particular implementation.
Once the processing is done on this information, a filter or process can be configured on each eBGP speaker to check received AS Paths against the set of valid paths (remembering that these valid paths can be represented in a number of different ways within an AS, depending on the implementation). Router H, for example, receives two different paths for 2001:db8:0:1::/64, one with an AS Path of [65000,65002], and the second with an AS Path of [65000,65001,65003]. Router H can examine the path against the received path pairs (as processed locally within AS65004), and determine the path through AS65001 is not valid—there is no connection between AS65000 and AS65001.
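As a rough sketch of the filter Router H might apply, the function below walks an AS Path and checks each adjacent pair against the set of valid pairs. The `VALID_PAIRS` set and the name `path_is_valid` are assumptions made for illustration; skipping repeated AS numbers (AS Path prepending) is also an assumption, since the post does not discuss it.

```python
# Illustrative set of two-way-verified pairs, as they might be
# computed locally within AS65004 (an assumption for this sketch).
VALID_PAIRS = {
    frozenset((65000, 65002)),
    frozenset((65000, 65003)),
    frozenset((65002, 65004)),
    frozenset((65003, 65004)),
}


def path_is_valid(as_path, valid_pairs):
    """Return True if every adjacent AS pair in the path is a valid pair."""
    for left, right in zip(as_path, as_path[1:]):
        if left == right:
            continue  # prepending repeats one AS; it is not a link
        if frozenset((left, right)) not in valid_pairs:
            return False
    return True


good = path_is_valid([65000, 65002], VALID_PAIRS)
bad = path_is_valid([65000, 65001, 65003], VALID_PAIRS)
```

Here `good` is True and `bad` is False, because the pair [65000,65001] never appears among the verified connections.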
BGP Address Families
How could this sort of information be carried and secured between the autonomous systems? One of the requirements on the list is to use BGP, as it’s a familiar and readily available tool. A potential option, then, is to create a new BGP Address Family (AF) designed to carry these connection advertisements. One interesting point about this AF is that it needs to be flooded throughout the internetwork, rather than advertised in the traditional BGP way, where each AS selects a best path. This means the best path algorithm needs to be changed slightly to propagate the right certificate, and we need to add a sequence number to the mix to prevent replaying these advertisements (as the newest in time is no longer always the best/correct advertisement). The figure below illustrates a potential format for an AF carrying this connectivity information.
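To make the sequence number idea concrete, here is a minimal sketch of how a receiver might reject replayed advertisements and decide where to flood accepted ones. It says nothing about the actual wire format; the class `ConnectivityStore`, its methods, and `flood_targets` are hypothetical names, and signature verification is assumed to have happened before `receive` is called.

```python
class ConnectivityStore:
    """Keeps the freshest connectivity advertisement seen per origin AS."""

    def __init__(self):
        self._latest = {}  # origin AS -> (sequence number, neighbours)

    def receive(self, origin, sequence, neighbours):
        """Store the advertisement; return True only if it is newer."""
        current = self._latest.get(origin)
        if current is not None and sequence <= current[0]:
            return False  # replayed or stale: neither store nor flood
        self._latest[origin] = (sequence, frozenset(neighbours))
        return True

    def neighbours(self, origin):
        """Return the neighbours from the freshest advertisement, if any."""
        entry = self._latest.get(origin)
        return entry[1] if entry else frozenset()


def flood_targets(peers, received_from):
    """Flood to every peer except the one the advertisement came from."""
    return [peer for peer in peers if peer != received_from]


store = ConnectivityStore()
first = store.receive(65000, 1, {65002, 65003})
replay = store.receive(65000, 1, {65001})
```

In this sketch the second call carries a sequence number that has already been seen, so it is dropped and the stored neighbour set for AS65000 is unchanged.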
What’s interesting about this solution is that it allows each network operator to determine the best way to consume these new connectivity advertisements within their AS. The following figure illustrates this.
There are two different sets of BGP advertisements traced through this network; the newer connectivity advertisements follow the eBGP peering sessions shown in the light grey dashed lines, while the darker solid lines show not only the physical links, but also standard eBGP sessions. AS65004 has decided to accept these connectivity advertisements on a normal peering router (Router H is shown), and processes them on the router itself. Each edge router might accept these advertisements, or a single edge router might, and the resulting pair-wise sets or DAG can be distributed to the other edge eBGP speakers. AS65002, on the other hand, accepts the new connectivity advertisements on its edge speakers, but it reflects them into a centralized route server/reflector for processing.
Finally, AS65000 only accepts the new connectivity advertisements on a centralized eBGP speaker; in this case, when the peering arrangement is set up, AS65000 requests that AS65002 peer not only with Router A but also, over a multihop eBGP session, with Router B, which is actually an eBGP speaker. Router D would negotiate only the standard reachability AFs with Router A, and it would negotiate only the connectivity advertisement AF with Router B. The crucial point is that each AS (each operator) can choose to accept this new AF in a different way, and process the information it contains in a different way. The point is to make the information available rather than to dictate processing within an AS. If an operator doesn’t want to replace or touch their edge routers to participate in BGP path validation, they can build a set of virtual routers (VRs) throughout their network, consume the connectivity advertisements on those VRs, and then distribute the resulting path validation information throughout their network using internal mechanisms.
To return to the requirements list: there are two interlocking requirements that need to be addressed. Some operators have said they would like to hide some connections, either until they are used or permanently. Given that BGP is used as the transport mechanism, two things are needed to make this happen. First, a single AS must be able to send multiple connectivity advertisements. This is possible if the packet format is designed to allow a fragmented connectivity set, and if whatever internal mechanism builds the path validation information within each AS can combine multiple connectivity advertisements into a complete picture of an AS’s connectivity during processing. This is simple enough to accomplish.
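Putting fragments back together might look something like the sketch below, which takes the union of every neighbour set received from the same origin AS. The `(origin, neighbours)` tuple shape and the name `merge_fragments` are assumptions for illustration only.

```python
from collections import defaultdict


def merge_fragments(advertisements):
    """Union all neighbour sets advertised by the same origin AS.

    `advertisements` is an iterable of (origin AS, neighbour set) tuples,
    where one AS may appear in several fragments.
    """
    merged = defaultdict(set)
    for origin, neighbours in advertisements:
        merged[origin].update(neighbours)
    return dict(merged)


fragments = [
    (65000, {65002}),  # first fragment from AS65000
    (65000, {65003}),  # second fragment, perhaps sent only to some peers
    (65001, {65003}),
]
picture = merge_fragments(fragments)
```

A receiver that sees only the first fragment would simply build a smaller picture of AS65000’s connectivity, which is what allows an operator to hide a connection from some peers.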
The second part of the problem is to filter advertisements to specific peering operators, or even to signal that an advertised connectivity set should not be re-advertised to a peer’s peers. Communities are already widely used, understood, and accepted for these sorts of tasks. Again, if BGP is assumed for the transport, adding community support to the AF is a simple solution that would be automatically widely supported and understood.
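As one possible illustration, the sketch below reuses the well-known NO_EXPORT and NO_ADVERTISE community values from RFC 1997 to decide whether a connectivity advertisement may be passed on to a given peer. Whether the new AF would reuse these particular communities is an assumption; the function name `may_readvertise` is hypothetical.

```python
# Well-known community values defined in RFC 1997.
NO_EXPORT = 0xFFFFFF01     # do not advertise outside the local AS
NO_ADVERTISE = 0xFFFFFF02  # do not advertise to any other BGP peer


def may_readvertise(communities, peer_is_external):
    """Return True if an advertisement may be passed on to this peer."""
    if NO_ADVERTISE in communities:
        return False
    if NO_EXPORT in communities and peer_is_external:
        return False
    return True


to_ebgp_peer = may_readvertise({NO_EXPORT}, peer_is_external=True)
to_ibgp_peer = may_readvertise({NO_EXPORT}, peer_is_external=False)
```

An advertisement tagged this way stays inside the AS that received it, so it is never re-advertised to that peer’s peers.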
Finally, we need to examine some potential objections to this scheme, and formulate some answers.
The type of solution described here does not solve the problem of non-transit autonomous systems transiting traffic. This is simple enough to solve, however, by adding the ability to advertise policy towards a peer in the connectivity set. Communities, individual bit fields, or other solutions could be used to provide policy information. Each individual operator could choose what to advertise in addition to raw connectivity, so each operator could determine the correct balance between increasing security and protecting the status of business relationships. For instance, one provider might advertise just who their peers are, while another might advertise not only who their peers are, but also which peers are customers, and which peers are able to transit traffic to and from the provider.
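A very simplified sketch of such a policy check follows. It assumes each operator can mark its AS as non-transit in its connectivity advertisements, and that AS Paths are written origin-first, as in the examples earlier in this post. The name `violates_non_transit` and the `non_transit` set are hypothetical.

```python
def violates_non_transit(as_path, non_transit):
    """Return True if a non-transit AS appears anywhere but the origin.

    `as_path` is assumed to be written origin-first; `non_transit` is the
    set of AS numbers that have advertised that they do not carry transit.
    """
    if not as_path:
        return False
    origin = as_path[0]
    # A non-transit AS may originate routes, but must not relay them.
    return any(asn in non_transit and asn != origin for asn in as_path[1:])


leak = violates_non_transit([65000, 65001, 65003], {65001})
own_route = violates_non_transit([65001, 65003], {65001})
```

A richer encoding, such as per-peer customer or transit flags, would replace the single `non_transit` set, but the shape of the check would be similar.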
The type of solution described here provides information on an AS level, rather than a per-prefix (or reachable destination) level. Again, however, it’s easy enough to encode policy on a per-destination basis in the connectivity advertisements. This would be quite a bit messier than, say, signing individual BGP route advertisements, but each scheme has its strengths and weaknesses. Signing updates has a set of weaknesses that far outweigh the messiness of advertising policies per destination, particularly as such advertisements can be limited in their scope.
What is the current state of this work? We (LinkedIn along with several vendors, providers, and independent consultants) are working on experimental implementations, working out the tradeoffs between various options, and considering how best to formulate the cryptographic pieces that need to be built for this type of system. A number of providers are already involved, and the list will grow over time.
This appears to be a solution that will solve, perhaps, 80% of the BGP path validation problems in a way that has a minimal impact on network operators. Even with this sort of system in place, research into solving the remaining 20% would be needed. One way to think of this is as a firewall/intrusion detection system pairing; different solutions can be focused on different parts of the problem, and low-hanging fruit can be solved with minimal footprint solutions.