Algorithmic Foundations

Transfer Patterns, RAPTOR, CSA, and the trade-off space that defines modern route planning

Why Transit Routing Is Hard

The shortest-path problem on a static graph is solved. Dijkstra's algorithm, published in 1959, runs in O(E+VlogV)O(E + V \log V) with a Fibonacci heap. For road networks specifically, techniques like Contraction Hierarchies and Customizable Route Planning exploit the near-planarity of road graphs to achieve query times in the low microseconds after a preprocessing step. These are mature technologies; the interesting engineering is in the preprocessing, not the search.

Transit routing is a different problem. A road network is a weighted graph whose structure is fixed and whose edge weights change slowly. A transit network is a graph whose effective structure depends on the time of day. The 08:47 from King's Cross exists only at 08:47. If you arrive at 08:48, the set of reachable destinations changes, possibly radically. The coupling of space and time means that a journey planner cannot simply find a path; it must find one that respects a timetable.

The naive approach constructs a "time-expanded" graph in which each node represents a (station, time) pair. This is correct but explosive: a single day's schedule for London generates millions of nodes. The algorithms that dominate modern transit routing emerged from efforts to avoid this expansion while preserving exactness. They fall into two broad families, distinguished by whether they require preprocessing.

Transfer Patterns

Transfer Patterns was published at ESA 2010 by Hannah Bast, Erik Carlsson, Arno Eigenwillig, Robert Geisberger, Chris Harrelson, Veselin Raychev, and Fabien Viger. Bast was a visiting scientist at Google; Carlsson, Eigenwillig, Harrelson, Raychev, and Viger were Google engineers in Zurich; and Geisberger was at the Karlsruhe Institute of Technology. The paper itself notes that the work was deployed in Google Maps. A 2016 blog post by Google engineer Arno Eigenwillig confirmed that Transfer Patterns remained in production at that time. Google holds a patent family covering the technique (US 8,417,409 B2 and continuations), filed in April 2010.

The algorithm's premise is that optimal journeys between any two stations, across all possible departure times, share a limited number of "transfer patterns", sequences of stations where a passenger changes vehicles. If you precompute these patterns for all station pairs, answering a query reduces to looking up the relevant patterns and instantiating them against the current timetable. This is fast: millisecond-range query times on networks with hundreds of millions of arcs.

The cost to this is in preprocessing. The original paper reports computation times of hours and storage of tens of gigabytes for large networks. Bast and collaborators published two follow-up papers reducing these costs. Scalable Transfer Patterns (ALENEX 2016) achieved roughly a thousandfold reduction in storage through graph compression and clustering, bringing worldwide transit data to approximately 30 GB. Frequency-Based Search (SIGSPATIAL 2014) accelerated the preprocessing itself by a factor of sixty.

Transfer Patterns occupies a particular position in the trade-off space: heavy preprocessing, negligible query time, poor amenability to real-time updates. If a train is cancelled, the affected patterns must be recomputed. This is manageable for a company with Google's infrastructure but awkward for transit agencies that need to reflect disruptions instantly.

RAPTOR

RAPTOR stands for Round-Based Public Transit Routing. It was published at ALENEX 2012 by Daniel Delling, Thomas Pajor, and Renato Werneck. Delling and Werneck were at Microsoft Research Silicon Valley; Pajor was at the Karlsruhe Institute of Technology. The journal version appeared in Transportation Science in 2015. Where Transfer Patterns precomputes structure, RAPTOR requires no preprocessing at all.

The algorithm processes the timetable in rounds. Round zero establishes walking connections from the origin. Round one finds all stations reachable by taking exactly one vehicle. Round two handles one-transfer journeys. The process continues until no further improvements are found. Within each round, RAPTOR iterates over "routes" (sequences of stops served by a common line) and examines each route at most once. This exploits the regularity of transit networks: a metro line serves the same stops in the same order throughout the day, so scanning it once per round suffices.

The absence of preprocessing is RAPTOR's defining advantage. When a train is cancelled or delayed, the algorithm simply uses the updated timetable for the next query. This made it attractive to transit agencies and to the open-source community. OpenTripPlanner adopted a RAPTOR-based transit routing engine in version 2.0 (November 2020), replacing an earlier A*-based search. Navitia, the French-origin routing engine maintained by Hove (formerly Kisio Digital), uses a RAPTOR variant through its "Kraken" engine. Conveyal's R5, designed for accessibility analysis, extends RAPTOR to handle range queries efficiently. MOTIS, the engine behind the community-run Transitous service, implements RAPTOR through its "nigiri" library. At the national level, Norway's Entur and Finland's Digitransit both run OTP2 with RAPTOR in production.

RAPTOR comes in three principal variants. Base RAPTOR optimises two criteria: arrival time and number of transfers. McRAPTOR extends this to arbitrary Pareto-optimal criteria, including fare zones, walking distance, or comfort measures. Range-RAPTOR (rRAPTOR) computes optimal journeys for all departure times within a specified window, supporting both "depart after" and "arrive by" queries. All three preserve the round-based structure.

The Connection Scan Algorithm

CSA was introduced at SEA 2013 by Julian Dibbelt, Thomas Pajor, Ben Strasser, and Dorothea Wagner, all at KIT. The journal version appeared in the ACM Journal of Experimental Algorithmics in 2018. Like RAPTOR, it requires no preprocessing, but its internal organisation is different.

Whereas RAPTOR groups computation by route, CSA groups it by connection, where a connection is a single departure of a vehicle from a stop. The algorithm sorts all connections in the timetable by departure time and scans this array once. For each connection, if the departure stop is reachable by the departure time, the arrival stop's earliest known arrival time is updated. The simplicity of this data access pattern yields excellent cache performance on modern processors. The journal paper claims roughly a factor-of-two speed advantage over RAPTOR for basic earliest-arrival queries.

The advantage reverses for multi-criteria search. RAPTOR's round structure makes Pareto-optimal computation natural; CSA requires additional bookkeeping that erodes its speed advantage. In practice, this has meant that RAPTOR dominates production deployments where multi-criteria optimisation is needed, while CSA has seen less commercial adoption. It has found a home in academic projects and in Linked Open Data approaches to decentralised routing.

Two of CSA's four inventors, Dibbelt and Pajor, now work at Apple, where they have been since around 2016. This, combined with the fact that Thomas Pajor co-authored both RAPTOR and CSA, makes it plausible that Apple Maps uses CSA-derived techniques, though Apple has disclosed little publicly about its transit routing.

Trade-Offs

These algorithms are not competitors in the sense that one is uniformly better. They occupy different positions in a space defined by preprocessing cost, query speed, memory consumption, and adaptability to real-time changes.

Transfer Patterns sits at one extreme: expensive preprocessing, trivial queries, poor adaptability. RAPTOR and CSA sit at the other: no preprocessing, fast-but-not-trivial queries, immediate adaptability. Trip-Based Routing (Witt, ESA 2015) occupies a middle ground, precomputing transfers between individual trips rather than between stops. The ULTRA framework (Baum et al., ESA 2019, journal version in Transportation Science 2023) represents the current frontier, preprocessing "transfer shortcuts" that allow any transit algorithm to handle unlimited multimodal transfers without sacrificing query speed.

For road routing specifically, the dominant preprocessing framework is Customizable Route Planning (CRP), published by Delling, Goldberg, Pajor, and Werneck (Transportation Science, 2017). CRP hierarchically partitions the road network into regions, identifies boundary nodes where edges cross between regions, and precomputes shortest-path distances between boundary nodes within each region. Queries proceed by searching at successively higher levels of the hierarchy, using precomputed shortcuts to skip over region interiors. CRP will be important later because the alternative routes paper builds its learned representations directly on top of the CRP partition structure.

The Researcher Diaspora

Routing expertise is notably concentrated at a small number of companies. Delling (RAPTOR, CRP co-author) moved from Microsoft Research to Apple in 2014, then to Google Maps in November 2022. Dibbelt and Pajor (CSA co-authors; Pajor also co-authored RAPTOR) are both at Apple. Werneck (RAPTOR co-author) is at Amazon.

Bast (Transfer Patterns creator) returned to academia at the University of Freiburg in 2009, where she holds the Chair of Algorithms and Data Structures and has served as Dean of Engineering.

Apple almost certainly uses advanced routing algorithms informed by its world-class researchers. Google's last public confirmation that Transfer Patterns is in production dates to 2016.

The most active academic work continues at KIT under Dorothea Wagner's group (Wagner is now emerita), producing ULTRA, FLASH-TB, and delay-robust multimodal algorithms through 2024 and 2025.

End of the Line for the Classical Paradigm

All of these algorithms share a structural property: they optimise for objectives that decompose over the components of a journey. Arrival time is the sum of edge traversal times plus waiting times. Number of transfers is a count. Fare is a function of zones traversed. These objectives can be represented as edge weights or tracked as auxiliary criteria in a Pareto search.

Some things users want do not decompose this way. "Route me via a coffee shop" is a constraint on the route as a whole, not on individual edges. "Avoid the Northern Line during rush hour" is a conditional constraint that changes the effective graph structure. "Plan a day trip visiting several museums within a time budget" is a variant of the knapsack problem embedded in a routing query. Current production systems handle such requests through "cost modifiers" that rescale edge weights, through explicit waypoint chaining, or through post-hoc filtering of candidate routes. These approaches work for narrow cases but do not scale to the open-ended preferences that users increasingly expect from mapping applications.

The two papers discussed in the remainder of this series address this gap. They differ in ambition, architecture, and readiness for production, so they deserve separate treatment.

References