Design a Flight Search Engine — System Design Interview Guide
Design a flight search engine asks you to build the system that, given an origin + destination + date, searches across hundreds of airlines, computes multi-leg itineraries with connections, prices each one, and returns a ranked list — typically in a few seconds. The hard part is the combinatorial explosion of multi-leg paths combined with airline pricing APIs that are slow and rate-limited.
By Alex Chen, Founder, InterviewChamp.AI · Last verified
Reported in interviews at
- Booking.com
- Expedia
- Airbnb
- Amazon
Sourced from Glassdoor, Levels.fyi, and Blind interview reports.
Functional requirements
- Accept a search query: origin airport, destination airport, departure date (+ optional return date), passenger count, cabin class
- Find all viable itineraries including direct flights and multi-leg connections (typically 1-3 connections)
- Price each itinerary against airline fare rules, including taxes, fees, and ancillaries
- Rank results by user preference (cheapest, fastest, best, fewest stops)
- Return results in a few seconds with progressive refinement (show partial results, refine as more arrive)
- Optional: support multi-city itineraries with arbitrary leg counts
Non-functional requirements
- Response latency: first results <2 sec, full result set <10 sec at p99
- Scale: ~10K+ searches/sec at peak across a global metasearch platform
- Freshness: pricing must reflect current airline inventory within minutes
- Cost: each upstream airline call costs money (rate limits + per-call fees from GDS providers); minimize per-search call volume
- Availability: 99.95%+; users tolerate slow searches but not failed ones
Capacity estimation
Anchor on metasearch scale: ~10K searches/sec at peak across a large flight metasearch (Google Flights, Booking, Expedia combined). Each search expands into anywhere from 10 to 1000+ candidate itineraries before filtering. The combinatorial explosion is the core engineering problem.
Latency budget breakdown for the 2-second first-results target: ~50ms request parsing + cache lookup, ~100ms airport-routing graph search to enumerate candidate connection paths, ~500ms first wave of airline pricing calls (parallel fan-out to 20-50 carriers), ~500ms result merging + ranking, ~850ms response serialization + render. Full result set takes longer because the slower carriers' responses arrive over the next several seconds; results are streamed progressively to the client.
Combinatorial estimate: for a non-direct route, the search must consider every reasonable hub city as a possible connection. Globally there are ~50-100 major hubs (ATL, ORD, JFK, LHR, AMS, DXB, HND, etc.). For a 1-stop search, that's ~50-100 candidate routings per origin-destination pair. For 2-stop, it's the hub × hub Cartesian product = ~2500-10K candidate routings — most of which will be filtered out by basic sanity checks (no backtracking, max total time, etc.).
Airline pricing call volume: a naive design calls every airline for every candidate routing = 100 carriers × 10K candidate routings = 1M API calls per search. At ~$0.01/call from GDS providers, that's $10,000 per search — economically dead. Real systems aggressively prune candidates before pricing and aggressively cache priced results. The pricing-fan-out reduction is the central cost-engineering problem.
Storage estimates: schedule + route data is small. ~100K daily flight legs globally × ~1 KB metadata = ~100 MB. Refreshed daily from carrier feeds. Inventory + pricing cache is larger: ~10M searched origin-destination-date triples × ~10 KB priced itineraries = ~100 GB. Replaced daily because prices drift fast — yesterday's $400 fare is today's $450.
The shape that matters: this is a search-fan-out + cost-engineering problem. Raw QPS is moderate (~10K). The hard part is keeping per-search upstream call cost below the platform's per-search revenue (~$0.10-$1.00 per search after monetization).
High-level design
Five core components: schedule graph, candidate-path enumerator, pricing fan-out + cache, ranking engine, and the result streaming interface.
The schedule graph is the static foundation. Nodes are airports; edges are scheduled flight legs with metadata (carrier, departure time, arrival time, equipment, capacity). The graph is built nightly from carrier schedule feeds (OAG, Cirium, or direct from carriers) and held in memory. It changes slowly — new routes launch weekly, not minutely.
The candidate-path enumerator takes (origin, destination, departure date) and returns a list of candidate routings: direct flights, single-connection routings via each viable hub, and two-connection routings for long-haul cases. Enumeration uses graph search with pruning: max connections (typically 2), max total travel time (typically 1.5x the direct great-circle distance), minimum connection time at each hub (varies by hub and airline — ATL is 45min, LHR can be 90min), no backtracking. The enumerator returns ~50-200 candidate routings per search, down from the unbounded ~10K naive set.
The pricing fan-out is the expensive step. For each candidate routing, the engine needs the actual current price. Pricing requires calling either the airline directly or a GDS provider (Amadeus, Sabre, Travelport) that aggregates many airlines. Fan-out is parallel across carriers. Crucially, before calling, the engine checks a pricing cache for recent prices on the same routing — if a recent priced result exists (within a few minutes), use it directly. Cache hit rates of 60-80% are typical and absolutely necessary to make the economics work.
The ranking engine takes the priced itineraries and applies the user's sort preference. 'Best' is a composite score that weighs price, total duration, number of stops, layover quality, and carrier preference. The engine returns the top N results immediately and continues to refine as more pricing responses arrive.
Result streaming: the client establishes a long-poll or websocket connection to the search endpoint. First results stream within ~1-2 seconds (direct flights + cached priced routings). Slower results stream over the next ~5-10 seconds. The client renders incrementally, with a 'searching...' indicator that resolves once all carriers have responded or timed out. Some carriers regularly time out at ~10 seconds; the search engine continues returning the available results rather than waiting indefinitely.
The cache is the engine room. Pricing cache keys include (carrier, origin, destination, departure_date, cabin_class) and possibly day-of-week + hour bucket. Cache TTL is short — typically 5-15 minutes — because airline inventory drifts and a stale price shown to the user that fails on book is a major user-experience failure. The cache is invalidated explicitly when a carrier signals an inventory change via push (some carriers support this) and otherwise expires by TTL.
Deep dive — the hard problem
Three deep dives: candidate-path pruning, the priced-stale-result problem, and the multi-city itinerary explosion.
Candidate-path pruning — the naive enumeration is exponential in connection count. For a 2-connection search through ~100 hubs, the candidate set is ~10K routings before pruning. Each pricing call costs money and time, so pruning before pricing is the only economic option.
The most powerful pruning rules: (a) total-time cap — reject routings where total travel time exceeds 1.5x the direct great-circle time. This drops ~70% of candidates. (b) backtracking cap — reject routings where any leg flies away from the destination then back. (c) minimum-connection-time — reject routings where the connection time at a hub is below the carrier's stated minimum. (d) carrier-compatibility — reject routings that mix carriers without an interline agreement; users can't typically book a JetBlue→Air France itinerary as a single ticket. (e) historical-popularity scoring — for each (origin, destination, hub) triple, the engine tracks historical book-through-this-hub rates and prioritizes hubs that historically convert; this is a learned pruning signal that improves over time.
After all rules, the engine typically prices 50-200 candidates per search instead of 10K. That's a 50-100x reduction in upstream API cost, which is the difference between a viable business and a money-burning one.
The priced-stale-result problem — the cache speeds responses and reduces cost, but it introduces a correctness hazard: the cached price might be stale, and when the user tries to book, the actual price has changed. This is the 'price changed at book time' bug that erodes user trust. Two production mitigations.
(a) Cache TTLs are short (5-15 minutes) and cache hits are re-priced opportunistically as the user browses. When the user clicks a specific itinerary to view details, the engine re-prices that itinerary even if it's a cache hit on the search page. This narrows the staleness window.
(b) The booking handoff is the canonical truth-check. When the user clicks 'book', the engine makes a fresh pricing call to the airline (a 'shopping basket' call) that reserves inventory at the quoted price for a few minutes. If the airline returns a different price, the user is shown the new price and asked to confirm. Roughly 5-10% of searches end with a price change at booking — this is unavoidable but should be the exception, not the rule. If price-changes exceed 15%, the cache TTLs are too long or the upstream feed is too stale; treat the rate as a monitoring signal.
Multi-city itinerary explosion — multi-city searches (NYC → London → Paris → NYC) compound the combinatorics. Each leg is its own origin-destination search, and the legs must be consistent in date and time. A 3-leg multi-city with 50 candidates per leg is 50^3 = 125K candidate itineraries. Pricing all of them is impossible.
The standard approach is greedy + composition: search each leg independently with a per-leg candidate cap (~10-20 candidates per leg), then compose all combinations of cached leg-prices into full itineraries (cheap, no upstream calls), rank, and return the top N. The user sees results within the normal latency budget. The tradeoff is that some multi-leg itineraries that would have been cheaper as a multi-city ticket (carriers offer multi-city discounts) are missed because each leg is priced independently. Some metasearch engines do a second pass for the top-ranked itineraries, re-pricing them as a true multi-city ticket to capture the discount.
Fourth deep dive: pricing rule complexity. Airline pricing isn't a simple lookup — it's evaluated by fare-rule engines that consider booking class availability, advance-purchase windows, minimum-stay rules, Saturday-night-stay rules, round-trip-versus-one-way pricing differences, and ancillary bundling (bag fees, seat selection, change fees). The metasearch engine doesn't typically run fare rules locally — it delegates to the airline's pricing API which returns the rule-evaluated price. Mention this delegation explicitly — interviewers reward the candidate who recognizes that fare rules are too complex to replicate.
Common mistakes
- Calling every airline for every candidate routing — the upstream cost is economically dead without aggressive pruning + caching
- Skipping the candidate-path pruner entirely — the combinatorial set is too large to price directly
- Treating the pricing cache as eternal — short TTLs are mandatory because airline inventory drifts
- Forgetting the re-price-at-book step — without it, users see prices that disappear at the booking handoff
- Designing for multi-city as a separate code path — production systems compose per-leg searches with a final ranking pass
Likely follow-up questions
- How would you support flexible-date search where the user wants the cheapest fare in a date range?
- What changes if you need to surface ancillary bundles (bag, seat, lounge) alongside the base fare?
- How would you handle carrier-direct booking (no GDS intermediary) where some carriers offer lower prices on their own channels?
- How would you detect and surface 'hidden city' itineraries where the user gets off at a connection?
- How would you implement price-prediction telling the user whether to book now or wait?
Practice Design a Flight Search Engine live with an AI interviewer
Free, no sign-up required. Get real-time feedback on your design.
Practice these liveFrequently asked questions
- How is flight search different from a generic search engine?
- Flight search is search + dynamic pricing + multi-leg routing, all under strict cost constraints from upstream APIs. A generic search engine ranks indexed documents; a flight search engine must construct itineraries on the fly, call paid pricing APIs, and stream results progressively. The combinatorial explosion + the cost-per-upstream-call are the defining constraints.
- Do I need to know specific GDS protocols?
- Naming the major GDS providers (Amadeus, Sabre, Travelport) as the integration boundary is enough. Specific message formats (EDIFACT, NDC) are bonus signal but not required unless this is a travel-platform-team interview.
- What's the senior-bar topic?
- Pruning + caching to control upstream API cost. Specifically: explain how candidate-path pruning reduces 10K routings to ~100, how the pricing cache lifts hit rate to 60-80%, and how the re-price-at-book step preserves correctness. If you can quantify the API call reduction (in dollars per search or calls per search), that's senior-plus signal.
- Should I discuss the booking flow in detail?
- Mention that the search engine hands off to a booking flow that reserves inventory and processes payment, and that the booking step re-prices to confirm the quoted price is still valid. Drilling into the full booking + payment system is overkill — that's a separate scenario (Design Online Payments or Design a Payment Gateway).