ISO/IEC TR 29181-3:2013 Switching and routing

Courtesy: ISO/IEC TR 29181-3:2013 Switching and routing

In some networks, routing is complicated by the fact that no single entity is responsible for selecting paths; instead, multiple entities are involved in selecting paths or even parts of a single path. Complications or inefficiency can result if these entities choose paths to optimize their own objectives, which may conflict with the objectives of other participants.

A classic example involves traffic in a road system, in which each driver picks a path that minimizes their travel time. With such routing, the equilibrium routes can be longer than optimal for all drivers. In particular, Braess’s paradox shows that adding a new road can lengthen travel times for all drivers.

In a single-agent model used, for example, for routing automated guided vehicles (AGVs) on a terminal, reservations are made for each vehicle to prevent simultaneous use of the same part of an infrastructure. This approach is also referred to as context-aware routing.

The Internet is partitioned into autonomous systems (ASs) such as internet service providers (ISPs), each of which controls routes involving its network. Routing occurs at multiple levels. First, AS-level paths are selected via the BGP protocol that produces a sequence of ASs through which packets flow. Each AS may have multiple paths, offered by neighboring ASs, from which to choose. These routing decisions often correlate with business relationships with these neighboring ASs, which may be unrelated to path quality or latency. Second, once an AS-level path has been selected, there are often multiple corresponding router-level paths to choose from. This is due, in part, because two ISPs may be connected through multiple connections. In choosing the single router-level path, it is common practice for each ISP to employ hot-potato routing: sending traffic along the path that minimizes the distance through the ISP’s own network—even if that path lengthens the total distance to the destination.

For example, consider two ISPs, A and B. Each has a presence in New York, connected by a fast link with latency 5 ms—and each has a presence in London connected by a 5 ms link. Suppose both ISPs have trans-Atlantic links that connect their two networks, but A‘s link has latency 100 ms and B‘s has latency 120 ms. When routing a message from a source in A‘s London network to a destination in B‘s New York network, A may choose to immediately send the message to B in London. This saves A the work of sending it along an expensive trans-Atlantic link, but causes the message to experience latency 125 ms when the other route would have been 20 ms faster.

A 2003 measurement study of Internet routes found that, between pairs of neighboring ISPs, more than 30% of paths have inflated latency due to hot-potato routing, with 5% of paths being delayed by at least 12 ms. Inflation due to AS-level path selection, while substantial, was attributed primarily to BGP’s lack of a mechanism to directly optimize for latency, rather than to selfish routing policies. It was also suggested that, were an appropriate mechanism in place, ISPs would be willing to cooperate to reduce latency rather than use hot-potato routing. Such a mechanism was later published by the same authors, first for the case of two ISPs and then for the global case

As the Internet and IP networks have become mission critical business tools, there has been increased interest in techniques and methods to monitor the routing posture of networks. Incorrect routing or routing issues cause undesirable performance degradation, flapping or downtime. Monitoring routing in a network is achieved using route analytics tools and techniques.

Centralized routing

In networks where a logically centralized control is available over the forwarding state, for example, using software-defined networking, routing techniques can be used that aim to optimize global and network-wide performance metrics. This has been used by large internet companies that operate many data centers in different geographical locations attached using private optical links, examples of which include Microsoft’s Global WAN, Facebook’s Express Backbone, and Google’s B4.

Global performance metrics to optimize include maximizing network utilization, minimizing traffic flow completion times, maximizing the traffic delivered prior to specific deadlines and reducing the completion times of flows. Work on the later over private WAN discusses modeling routing as a graph optimization problem by pushing all the queuing to the end-points. The authors also propose a heuristic to solve the problem efficiently while sacrificing negligible performance.