Paper Review: Achieving High Utilization with Software-Driven WAN

ztex, Tony, Liu

5 min readMar 15, 2024

解決思路: Demand Shifting

Background: Multi-protocol Label Switching (MPLS)

Label-switching

fast forwarding on the label in the packet header
header between layer 2 and layer 3 headers
also has a class-of-service field for priority

MPLS Forwarding

Label-switched routers (LSRs)
Ingress LSR inserts labels in the MPLS header.
All LSRs forward on labels may rewrite labels.
Egress LSR removes labels.

Benefit: We can use label forwarding to do traffic engineering

不用 TE -> 不能走 longer path -> 需要 tunneling (Google: IP-in-IP tunneling, Microsoft: MPLS tunneling)

MPLS for Traffic Engineering

透過 OSPF 廣播找到 A->C 的最短路徑，該路徑必須 ≥ 40Mbps
A 告訴 B 保留 40Mbps, B 告訴 C 保留 40Mbps
Router A, B, C 對應設置 label table (signaling: RSVP-TE) 以建立從 A->C 的 tunnel
開始傳輸

Problems of MPLS TE

Inter-DC WANs suffer from two key problems today.

Poor efficiency:

The amount of traffic the WAN carries tends to be low compared to capacity. For a production inter-DC WAN, which we call IDN (§6.1), we find that the average utilization of half the links is under 30% and of three in four links is under 50%.

Why?

services send whenever and however much traffic they want, without regard to the current state of the network or other services
the local, greedy resource allocation model of MPLS TE is inefficient.

Drawback: Inefficiency

因為原本用 Greedy Approach, 給定同樣的 topology 可能會因為 order 不同得到 sub-optimal allocation

Drawback: Fairness

MPLS 沒辦法保證 fairness, 可能造成某些節點被 allocated 更多 capacity

為了達到 fair allocation: 需要 carefully select path

SWAN Design Goals:

Goals:

優先級
同一個優先級：Fair sharing within each class

Approach: SDN-based

Services 告訴 controller demands (= Google’s Bandenforcer 一樣)
Controller allocates paths 來滿足 (= Google’s TE server)
… Controller programs switches (= Google’s Gateway)

Challenge: Computing Allocations

Optimal global allocations must:

How SWAN address: fast, approximately-fair algorithm

Challenge: Congestion-Free Label Updates

當 routing changes -> update routing label-> 如果沒有 careful ordering, 造成 congestion

Approach: deriving a congestion-free ordering

Challenge: Switch Table Capacity

Switches have limited space for label tables, 可超過 table size

SWAN Components

Hosting:

tracks demands, and tell broker
rate-limiting

Broker (= Google’s Bandwidth Enforcer):

Aggregate service demand
Tell controller
Instructs hosts to allocated/limit rate

Network agents (= Google’s Gateway)

Convey topology sate to controller (= Google’s TE server)
Track topology and traffic with the aid of switches.

Controller:

Computer allocations to services
Changes forwarding state
Compute the service allocations and forwarding plane configuration for the network
Signal new allocations to services whose allocation has decreased. Wait for Th seconds for the service to lower its sending rate.
Change the forwarding state and then signal the new allocations to services whose allocation has increased.

Computing Allocations: Inputs and Outputs

Inputs:

demands between each DC pair
tunnels between each DC pair
link capacity of each link

Output: fraction of flow i over tunnel j

Goal

maximize network utilization while max-min fairness among same priority services

Allocation LP:

MCF (multi-commodity flow) function that maximizes the overall throughput while preferring shorter paths
sPri is the fraction of scratch link capacity that enables congestion-managed network updates
SWAN also ensures that higher priority traffic is likelier to use shorter paths
To speed up: by invoking MCF in T steps, with the constraint that at step
k, flows are allocated rates in the range [αk−1U,αkU], but
no more than their demand
runs the LP at the granularity of DCs instead of switches
SWAN aggregates the demands from all services in the same priority class between a pair of DCs. This reduces the number of flows
Parameters α and U trade-off unfairness for runtime

What’s max-min fairness?

No user is allocated more demand
Among all users whose demand is not satisfied, the resulting allocation has the highest min allocation

Say we have 10 cookes, Demands: A:2, B:5, C:6

Allocation 1: 1, 4, 5

Allocation 2: 2, 3, 5

Allocation 3: 2, 4, 4

Allocation 3 is more fair than Allocation 2 because the min-values are 3 and 4. and 3 < 4

Building Block: Multi-Commodity Flow

Maximize total allocation, subject to

demand constraints
capacity constraints

Two wrinkles

add constraint to limit allocation within bounds
leave scratch capacity for congestion-free updates

Maximizing Throughput

For each priority class

maximize throughput using MCF

By doing this in priority order

high priority traffc is allocated shorter paths

Ensuring Table Size Constraint

Problem：Routers limited routing table sizes

Solution: Run Optimization to find tunnels set first. Rule out tunnels carrying less traffic and being to long. Run optimization again

B4 v.s. SWAN

As in SWAN, B4 uses SDNs in the context of inter-DC WANs.

Although this parallel work shares a similar high-level architecture, it addresses different challenges.

While B4 develops custom switches and mechanisms to integrate existing routing protocols (BGP+ISIS)in an SDN environment, SWAN develops mechanisms for congestion-free data plane updates and for effectively using the limited forwarding table capacity of commodity switches.