Analysis

10: Trip Frequency vs OTP

Route and Service Drivers

Coverage: 2019-01 to 2025-11 (from otp_monthly).

Built 2026-03-03 02:23 UTC · Commit defd5c8

Data Provenance

flowchart LR
  10_frequency_vs_otp(["10: Trip Frequency vs OTP"])
  t_otp_monthly[("otp_monthly")] --> 10_frequency_vs_otp
  01_data_ingestion[["Data Ingestion"]] --> t_otp_monthly
  t_route_stops[("route_stops")] --> 10_frequency_vs_otp
  01_data_ingestion[["Data Ingestion"]] --> t_route_stops
  t_routes[("routes")] --> 10_frequency_vs_otp
  01_data_ingestion[["Data Ingestion"]] --> t_routes
  d1_10_frequency_vs_otp(("polars (lib)")) --> 10_frequency_vs_otp
  d2_10_frequency_vs_otp(("scipy (lib)")) --> 10_frequency_vs_otp
  classDef page fill:#dbeafe,stroke:#1d4ed8,color:#1e3a8a,stroke-width:2px;
  classDef table fill:#ecfeff,stroke:#0e7490,color:#164e63;
  classDef dep fill:#fff7ed,stroke:#c2410c,color:#7c2d12,stroke-dasharray: 4 2;
  classDef file fill:#eef2ff,stroke:#6366f1,color:#3730a3;
  classDef api fill:#f0fdf4,stroke:#16a34a,color:#14532d;
  classDef pipeline fill:#f5f3ff,stroke:#7c3aed,color:#4c1d95;
  class 10_frequency_vs_otp page;
  class t_otp_monthly,t_route_stops,t_routes table;
  class d1_10_frequency_vs_otp,d2_10_frequency_vs_otp dep;
  class 01_data_ingestion pipeline;

Findings

Findings: Trip Frequency vs OTP

Summary

There is no meaningful correlation between peak weekday trip frequency and OTP. The previous finding (r = -0.39) was an artifact of using SUM(trips_wd) across stops, which conflated frequency with route length. After correcting to MAX(trips_wd) (peak frequency at any single stop), the correlation vanishes.

Key Numbers

  • All routes: Pearson r = 0.03 (p = 0.81, n = 92) -- essentially zero
  • Bus only: Pearson r = -0.06 (p = 0.55, n = 89)
  • Bus only: Spearman ρ = -0.11 (p = 0.29)
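
To put the coefficients above in context, the sketch below converts each reported Pearson r into the share of variance explained (r²) and the t statistic used to test r against zero. The inputs are the rounded values from the list, so the derived figures are approximate; the function name pearson_t is illustrative and not part of the analysis code.

```python
import math

def pearson_t(r: float, n: int) -> float:
    """t statistic for the null hypothesis of zero correlation (df = n - 2)."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# Rounded values as reported under Key Numbers: (r, n)
reported = {"all routes": (0.03, 92), "bus only": (-0.06, 89)}

summary = {}
for label, (r, n) in reported.items():
    r_squared = r * r            # fraction of OTP variance linearly explained
    t_stat = pearson_t(r, n)
    summary[label] = (r_squared, t_stat)
    print(f"{label}: r^2 = {r_squared:.4f}, t = {t_stat:.2f} (df = {n - 2})")
```

In both cases r² stays below 0.004 and |t| stays below 1, which is what "essentially zero" means here: peak frequency accounts for well under 1% of the variation in route-level OTP.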

Methodology Note

The original analysis summed trips_wd across all stops on a route. Because trips_wd is recorded per stop, a route with 50 trips per day and 100 stops produces a sum of ~5,000, while a route with 50 trips per day and 20 stops produces ~1,000. The summed metric was therefore a proxy for frequency × stop_count rather than for frequency alone. MAX(trips_wd) instead takes the trip count at the busiest stop, which is a better measure of how often the route actually runs.
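
The difference between the two aggregations can be reproduced on a toy table. This is a minimal sketch using an in-memory SQLite database; the column names route_id and stop_id and the route labels LONG and SHORT are assumptions made for illustration, and only trips_wd and route_stops come from the analysis itself.

```python
import sqlite3

# Hypothetical miniature of route_stops: one row per (route, stop).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE route_stops (route_id TEXT, stop_id INTEGER, trips_wd INTEGER)"
)

rows = [("LONG", stop, 50) for stop in range(100)]   # 50 trips/day, 100 stops
rows += [("SHORT", stop, 50) for stop in range(20)]  # 50 trips/day, 20 stops
conn.executemany("INSERT INTO route_stops VALUES (?, ?, ?)", rows)

query = """
    SELECT route_id,
           SUM(trips_wd) AS summed_trips,  -- scales with stop count
           MAX(trips_wd) AS peak_trips     -- independent of stop count
    FROM route_stops
    WHERE trips_wd IS NOT NULL
    GROUP BY route_id
    ORDER BY route_id
"""
result = conn.execute(query).fetchall()
for route_id, summed_trips, peak_trips in result:
    print(f"{route_id}: SUM={summed_trips}, MAX={peak_trips}")
```

Both toy routes run the same 50 trips per day, yet SUM separates them by a factor of five while MAX reports 50 for each.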

Observations

  • Running more trips is not, in itself, associated with lower OTP. The previous apparent correlation was driven by the confounding of frequency with route length (stop count) in the summed metric.
  • This result is consistent with Analysis 07's finding that stop count is the real structural predictor -- once route complexity is removed from the frequency metric, the effect disappears.
  • Some of the highest-frequency routes (P1 East Busway, RAIL lines) actually have excellent OTP, because they combine high frequency with few stops and dedicated right-of-way.

Implication

PRT should not expect OTP penalties from increasing service frequency on existing routes. The capacity to run more trips does not inherently strain schedule adherence. The real lever for improving OTP is route design (stop count, right-of-way), not service volume.

Caveats

  • trips_wd in route_stops is a current snapshot of weekday frequency, not a historical series, while OTP is averaged over 2019-01 to 2025-11. Frequency may have changed during that period.
  • MAX(trips_wd) captures the peak stop, which for short-turn routes may overstate the frequency experienced by riders at outer stops.
  • Routes with fewer than 12 months of OTP data are excluded (1 route dropped vs prior version).
  • Three correlation tests were run (Pearson all-routes, Pearson bus-only, Spearman bus-only) without multiple-comparison correction. Since all three are non-significant (smallest p = 0.29), correction would not change any conclusion.
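
The last caveat can be checked with simple arithmetic. The sketch below applies a Bonferroni adjustment to the three reported p-values; Bonferroni is chosen here only because it is the simplest correction, and the analysis does not name one.

```python
# p-values as reported under Key Numbers
p_values = {"pearson_all": 0.81, "pearson_bus": 0.55, "spearman_bus": 0.29}
alpha = 0.05
num_tests = len(p_values)

# Bonferroni: multiply each p-value by the number of tests, capped at 1.0
adjusted = {name: min(1.0, p * num_tests) for name, p in p_values.items()}
any_significant = any(p < alpha for p in adjusted.values())

for name, p in adjusted.items():
    print(f"{name}: adjusted p = {p:.2f}")
print("Any test significant after correction:", any_significant)
```

Because a correction can only raise p-values, tests that are non-significant before adjustment remain non-significant after it.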

Review History

  • 2026-02-11: RED-TEAM-REPORTS/2026-02-11-analyses-01-05-07-11.md -- 6 issues (1 significant). Updated METHODS.md to reflect MAX(trips_wd) instead of SUM(trips_wd); documented all three correlation tests; added minimum-month filter (HAVING COUNT(*) >= 12); added NULL filter for trips_wd; replaced manual regression with scipy.stats.linregress; noted multiple-test caveat.

Methods

Methods: Trip Frequency vs OTP

Question

Is there a correlation between how often a route runs (trip frequency) and its on-time performance? High-frequency routes may suffer from schedule adherence issues like bunching.

Approach

  • Compute maximum weekday trips per route from route_stops (MAX(trips_wd) across all stops, used as a peak frequency proxy). Stops with trips_wd IS NULL are excluded.
  • Compute average OTP per route from otp_monthly, requiring at least 12 months of data (HAVING COUNT(*) >= 12).
  • Scatter plot of trip frequency vs average OTP, colored by mode.
  • Compute Pearson correlation (all routes), Pearson correlation (bus-only), and Spearman rank correlation (bus-only).
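
The steps above can be sketched end to end as follows. This is not the project's code: the analysis itself uses polars and scipy, whereas this version uses only the standard library so that it runs standalone. The column names route_id, stop_id, month, otp, and mode, the mode labels BUS and RAIL, and all data values are assumptions made for illustration.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE route_stops (route_id TEXT, stop_id INTEGER, trips_wd INTEGER);
    CREATE TABLE otp_monthly (route_id TEXT, month TEXT, otp REAL);
    CREATE TABLE routes (route_id TEXT, mode TEXT);
""")

# Synthetic rows: (route, mode, trips per stop, stop count, OTP, months of data)
fake = [("A", "BUS", 40, 30, 0.70, 24), ("B", "BUS", 80, 60, 0.68, 24),
        ("C", "BUS", 20, 15, 0.66, 24), ("D", "RAIL", 120, 10, 0.85, 24),
        ("E", "BUS", 60, 25, 0.71, 6)]  # E has fewer than 12 months
for route, mode, trips, stops, otp, months in fake:
    conn.execute("INSERT INTO routes VALUES (?, ?)", (route, mode))
    conn.executemany("INSERT INTO route_stops VALUES (?, ?, ?)",
                     [(route, s, trips) for s in range(stops)])
    # One stop with a missing trip count, to exercise the NULL filter
    conn.execute("INSERT INTO route_stops VALUES (?, ?, NULL)", (route, stops))
    conn.executemany("INSERT INTO otp_monthly VALUES (?, ?, ?)",
                     [(route, f"m{i:02d}", otp) for i in range(months)])

per_route = conn.execute("""
    WITH freq AS (
        SELECT route_id, MAX(trips_wd) AS peak_trips
        FROM route_stops WHERE trips_wd IS NOT NULL GROUP BY route_id
    ), perf AS (
        SELECT route_id, AVG(otp) AS avg_otp
        FROM otp_monthly GROUP BY route_id HAVING COUNT(*) >= 12
    )
    SELECT r.route_id, r.mode, f.peak_trips, p.avg_otp
    FROM routes r
    JOIN freq f ON f.route_id = r.route_id
    JOIN perf p ON p.route_id = r.route_id
    ORDER BY r.route_id
""").fetchall()

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

bus = [row for row in per_route if row[1] == "BUS"]
print("Pearson, all routes:", round(pearson([r[2] for r in per_route],
                                            [r[3] for r in per_route]), 3))
print("Pearson, bus only:  ", round(pearson([r[2] for r in bus],
                                            [r[3] for r in bus]), 3))
print("Spearman, bus only: ", round(spearman([r[2] for r in bus],
                                             [r[3] for r in bus]), 3))
```

Route E is dropped by the HAVING clause, and the NULL stop rows never reach MAX(trips_wd). The printed coefficients describe only the synthetic rows and say nothing about the real data.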

Data

Name | Description | Source
otp_monthly | Monthly OTP per route | prt.db table
route_stops | Trip counts (trips_wd, trips_7d) | prt.db table
routes | Mode classification | prt.db table

Output

  • output/frequency_otp.csv -- per-route frequency and OTP summary
  • output/frequency_vs_otp.png -- scatter plot with correlation

Sources

Name | Type | Why It Matters | Owner | Freshness | Caveat
otp_monthly | table | Primary analytical table used in this page's computations. | Produced by Data Ingestion. | Updated when the producing pipeline step is rerun. | Coverage depends on upstream source availability and ETL assumptions.
route_stops | table | Primary analytical table used in this page's computations. | Produced by Data Ingestion. | Updated when the producing pipeline step is rerun. | Coverage depends on upstream source availability and ETL assumptions.
routes | table | Primary analytical table used in this page's computations. | Produced by Data Ingestion. | Updated when the producing pipeline step is rerun. | Coverage depends on upstream source availability and ETL assumptions.
polars | dependency | Runtime dependency required for this page's pipeline or analysis code. | Open-source Python ecosystem maintainers. | Version pinned by project environment until dependency updates are applied. | Library updates may change behavior or defaults.
scipy | dependency | Runtime dependency required for this page's pipeline or analysis code. | Open-source Python ecosystem maintainers. | Version pinned by project environment until dependency updates are applied. | Library updates may change behavior or defaults.