Analysis

11: Directional Asymmetry

Route and Service Drivers

Coverage: 2019-01 to 2025-11 (from otp_monthly).

Built 2026-04-03 20:09 UTC · Commit 7c56b9a

Data Provenance

flowchart LR
  11_directional_asymmetry(["11: Directional Asymmetry"])
  t_otp_monthly[("otp_monthly")] --> 11_directional_asymmetry
  01_data_ingestion[["Data Ingestion"]] --> t_otp_monthly
  u1_01_data_ingestion[/"data/routes_by_month.csv"/] --> 01_data_ingestion
  u2_01_data_ingestion[/"data/PRT_Current_Routes_Full_System_de0e48fcbed24ebc8b0d933e47b56682.csv"/] --> 01_data_ingestion
  u3_01_data_ingestion[/"data/Transit_stops_(current)_by_route_e040ee029227468ebf9d217402a82fa9.csv"/] --> 01_data_ingestion
  u4_01_data_ingestion[/"data/PRT_Stop_Reference_Lookup_Table.csv"/] --> 01_data_ingestion
  u5_01_data_ingestion[/"data/average-ridership/12bb84ed-397e-435c-8d1b-8ce543108698.csv"/] --> 01_data_ingestion
  t_route_stops[("route_stops")] --> 11_directional_asymmetry
  01_data_ingestion[["Data Ingestion"]] --> t_route_stops
  t_routes[("routes")] --> 11_directional_asymmetry
  01_data_ingestion[["Data Ingestion"]] --> t_routes
  d1_11_directional_asymmetry(("polars (lib)")) --> 11_directional_asymmetry
  d2_11_directional_asymmetry(("scipy (lib)")) --> 11_directional_asymmetry
  classDef page fill:#dbeafe,stroke:#1d4ed8,color:#1e3a8a,stroke-width:2px;
  classDef table fill:#ecfeff,stroke:#0e7490,color:#164e63;
  classDef dep fill:#fff7ed,stroke:#c2410c,color:#7c2d12,stroke-dasharray: 4 2;
  classDef file fill:#eef2ff,stroke:#6366f1,color:#3730a3;
  classDef api fill:#f0fdf4,stroke:#16a34a,color:#14532d;
  classDef pipeline fill:#f5f3ff,stroke:#7c3aed,color:#4c1d95;
  class 11_directional_asymmetry page;
  class t_otp_monthly,t_route_stops,t_routes table;
  class d1_11_directional_asymmetry,d2_11_directional_asymmetry dep;
  class u1_01_data_ingestion,u2_01_data_ingestion,u3_01_data_ingestion,u4_01_data_ingestion,u5_01_data_ingestion file;
  class 01_data_ingestion pipeline;

Findings

Findings: Directional Asymmetry

Summary

The correlation between directional trip imbalance and OTP is weak and not statistically significant (r = -0.12, p = 0.26). After correcting methodology issues in the original analysis, PRT routes show very little directional asymmetry, and what asymmetry exists does not predict OTP.

Key Numbers

  • All routes: Pearson r = -0.12 (p = 0.26, n = 90)
  • Bus only: Pearson r = -0.12 (p = 0.28, n = 87)
  • Bus only: Spearman r = -0.17 (p = 0.11)
  • Maximum asymmetry index: 0.077 (Route 19L)

Methodology Note

The original analysis used SUM(trips_wd) per direction and excluded stops with direction = 'IB,OB'. This inflated asymmetry because IB,OB stops (common on shared corridors) were dropped rather than counted in both directions. Routes 11 and 60 appeared 100% asymmetric because their IB stops were all coded as IB,OB. The corrected analysis uses MAX(trips_wd) per direction (peak frequency, not total stop-visits), includes IB,OB stops in both directions, and excludes routes present in only one direction.
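
To make the difference concrete, here is a minimal sketch on invented stop records. The rows, trip counts, and helper names (`original_method`, `corrected_method`) are assumptions for illustration and are not taken from prt.db. The toy route is shaped like Routes 11 and 60, where every inbound stop is coded IB,OB, and a direction with no stops is treated as zero trips.

```python
# Toy stop records: (direction, trips_wd). All values are invented.
STOPS = [
    ("IB,OB", 40),
    ("IB,OB", 38),
    ("OB", 40),
    ("OB", 36),
]


def asymmetry(ib: float, ob: float) -> float:
    """Asymmetry index |IB - OB| / (IB + OB)."""
    return abs(ib - ob) / (ib + ob)


def original_method(stops):
    """SUM(trips_wd) per direction, dropping IB,OB stops (empty sum = 0)."""
    ib = sum(t for d, t in stops if d == "IB")
    ob = sum(t for d, t in stops if d == "OB")
    return ib, ob


def corrected_method(stops):
    """MAX(trips_wd) per direction, counting IB,OB stops in both directions."""
    ib = max(t for d, t in stops if d in ("IB", "IB,OB"))
    ob = max(t for d, t in stops if d in ("OB", "IB,OB"))
    return ib, ob


old_ib, old_ob = original_method(STOPS)
new_ib, new_ob = corrected_method(STOPS)
print(old_ib, old_ob, asymmetry(old_ib, old_ob))  # 0 76 1.0
print(new_ib, new_ob, asymmetry(new_ib, new_ob))  # 40 40 0.0
```

On this toy route the original method reports an index of 1.0 purely because the shared stops were discarded, while the corrected method reports 0.0.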

Most Asymmetric Routes

Route | IB Trips | OB Trips | Asymmetry | OTP
19L - Emsworth Limited | 7 | 6 | 0.077 | 66.0%
67 - Monroeville | 27 | 25 | 0.038 | 60.4%
29 - Robinson | 22 | 21 | 0.023 | 65.1%

Observations

  • With corrected methodology, PRT routes are remarkably balanced directionally. The most asymmetric route (19L) has only a 7.7% imbalance (7 vs 6 trips), and most routes are at or near 0%.
  • The previous finding of routes with 100% asymmetry (Routes 11 and 60) was a data artifact caused by excluding bidirectional (IB,OB) stops. These routes are excluded in the corrected analysis because they lack separate IB and OB data after the fix.
  • The slight negative correlation (r = -0.12) points toward marginally worse OTP with more asymmetry, but the effect is small and not statistically significant (p = 0.26), so it cannot be distinguished from no relationship.
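
As a rough check on the last observation, the reported r and n can be turned into a t statistic, a share of variance explained, and an approximate confidence interval. This sketch uses only the rounded figures from Key Numbers (r = -0.12, n = 90), so its results are approximate. The 1.987 critical value (two-sided 5% level, 88 degrees of freedom) and the Fisher z interval are standard textbook choices assumed here, not something the analysis itself reports.

```python
import math

r, n = -0.12, 90  # rounded values reported in Key Numbers

# t statistic for testing H0: rho = 0, with n - 2 degrees of freedom.
t_stat = r * math.sqrt((n - 2) / (1 - r**2))

# Share of OTP variance linearly associated with asymmetry.
r_squared = r**2

# Approximate 95% confidence interval via the Fisher z-transform.
z = math.atanh(r)
half_width = 1.96 / math.sqrt(n - 3)
ci_low, ci_high = math.tanh(z - half_width), math.tanh(z + half_width)

T_CRITICAL = 1.987  # approx. two-sided 5% critical value for 88 d.f. (assumed)

print(f"t = {t_stat:.2f}, |t| < {T_CRITICAL}: {abs(t_stat) < T_CRITICAL}")
print(f"r^2 = {r_squared:.4f}")
print(f"95% CI for r: ({ci_low:.2f}, {ci_high:.2f})")
```

With these inputs t is about -1.13, asymmetry accounts for roughly 1.4% of the variance in OTP, and the interval runs from about -0.32 to 0.09, which includes zero.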

Conclusion

Directional imbalance does not predict OTP. PRT routes are sufficiently balanced that this is not a measurable factor in schedule adherence. The hypothesis that scheduling asymmetry creates operational strain is not supported by this data.

Caveats

  • The analysis uses peak trip frequency (MAX(trips_wd)) at a single stop per direction, which may not capture all scheduling nuances.
  • The route_stops data reflects current service, not historical. Historical asymmetry may have differed.
  • Including IB,OB stops in both directions can compress the asymmetry index toward zero when an IB,OB stop has the highest trips_wd for a route. In that case, both IB and OB MAX values equal the same number, mechanically forcing asymmetry to zero. This is a known limitation -- the alternative (excluding IB,OB stops) creates the opposite bias by dropping data.
  • Routes with fewer than 12 months of OTP data are excluded.
  • Three correlation tests were run (Pearson all-routes, Pearson bus-only, Spearman bus-only) without multiple-comparison correction. Since all three are non-significant (smallest p = 0.11), correction would not change any conclusion.
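
The multiple-comparison caveat can be checked directly. The sketch below applies a Bonferroni correction to the three reported p-values. Bonferroni is chosen here only as the simplest and most conservative adjustment; the analysis does not name a correction method, and the 0.05 threshold is likewise an assumption.

```python
# Reported p-values for the three correlation tests (rounded, from Key Numbers).
P_VALUES = {
    "pearson_all": 0.26,
    "pearson_bus": 0.28,
    "spearman_bus": 0.11,
}
ALPHA = 0.05  # conventional significance level, assumed here


def bonferroni(p_values: dict[str, float]) -> dict[str, float]:
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return {name: min(1.0, p * m) for name, p in p_values.items()}


adjusted = bonferroni(P_VALUES)
for name, p_adj in adjusted.items():
    print(f"{name}: adjusted p = {p_adj:.2f}, significant = {p_adj < ALPHA}")
```

Because this adjustment can only raise p-values, tests that are non-significant before correction stay non-significant afterwards.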

Review History

  • 2026-02-11: RED-TEAM-REPORTS/2026-02-11-analyses-01-05-07-11.md -- 7 issues (1 significant). Updated METHODS.md to correctly describe IB,OB handling (include in both, not exclude); corrected "total trips" to "peak frequency (MAX)"; added minimum-month filter (HAVING COUNT >= 12); added NULL filter for trips_wd; replaced manual regression with scipy.stats.linregress; documented IB,OB compression effect; noted multiple-test caveat.

Methods

Methods: Inbound vs Outbound Asymmetry

Question

Does a structural directional imbalance in trip frequency correlate with worse OTP? Routes with substantially different IB and OB trip counts may face operational challenges.

Approach

  • For each route, compute peak IB frequency and peak OB frequency (MAX(trips_wd)) from route_stops. Stops with trips_wd IS NULL are excluded.
  • Include stops with IB,OB direction in both IB and OB counts to avoid exclusion bias.
  • Compute an asymmetry index: abs(IB_trips - OB_trips) / (IB_trips + OB_trips).
  • Compute average OTP per route from otp_monthly, requiring at least 12 months of data (HAVING COUNT(*) >= 12).
  • Correlate asymmetry with average OTP using Pearson (all routes), Pearson (bus-only), and Spearman (bus-only).
  • Investigate routes with highest asymmetry.
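
The asymmetry index in the steps above is a short calculation. As a sketch, the function below reproduces the values in the Most Asymmetric Routes table from their IB and OB trip counts. The name `asymmetry_index` and the `ValueError` are illustrative; the pipeline computes the same ratio as a polars expression and filters out zero totals rather than raising.

```python
def asymmetry_index(ib_trips: float, ob_trips: float) -> float:
    """Return |IB - OB| / (IB + OB); 0 means balanced, 1 means one-directional."""
    total = ib_trips + ob_trips
    if total <= 0:
        # Illustrative guard; the pipeline instead drops rows with total_trips <= 0.
        raise ValueError("total trips must be positive")
    return abs(ib_trips - ob_trips) / total


# (IB trips, OB trips) from the Most Asymmetric Routes table.
ROUTES = {"19L": (7, 6), "67": (27, 25), "29": (22, 21)}

for route, (ib, ob) in ROUTES.items():
    print(f"{route}: {asymmetry_index(ib, ob):.3f}")
# 19L: 0.077
# 67: 0.038
# 29: 0.023
```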

Data

Name | Description | Source
route_stops | Directional trip counts | prt.db table
otp_monthly | OTP per route | prt.db table
routes | Route metadata | prt.db table

Output

  • output/directional_asymmetry.csv -- per-route directional trip breakdown and OTP
  • output/directional_asymmetry.png -- scatter plot of asymmetry vs OTP

Source Code

"""Analysis of inbound vs outbound trip asymmetry and its correlation with OTP."""

import math

import polars as pl

from prt_otp_analysis.common import (
    analysis_dir,
    correlate_by_mode,
    mode_scatter,
    phase,
    query_to_polars,
    run_analysis,
    save_chart,
    save_csv,
    setup_plotting,
)

OUT = analysis_dir(__file__)


def load_data() -> tuple[pl.DataFrame, pl.DataFrame]:
    """Load directional peak trip frequency and average OTP per route."""
    # Use MAX(trips_wd) per route-direction to get peak frequency, not stop-visits.
    # Include IB,OB stops in both directions to avoid exclusion bias.
    directional = query_to_polars("""
        SELECT route_id, 'IB' AS direction,
               MAX(trips_wd) AS trips_wd, MAX(trips_7d) AS trips_7d
        FROM route_stops
        WHERE direction IN ('IB', 'IB,OB')
          AND trips_wd IS NOT NULL
        GROUP BY route_id
        UNION ALL
        SELECT route_id, 'OB' AS direction,
               MAX(trips_wd) AS trips_wd, MAX(trips_7d) AS trips_7d
        FROM route_stops
        WHERE direction IN ('OB', 'IB,OB')
          AND trips_wd IS NOT NULL
        GROUP BY route_id
    """)
    avg_otp = query_to_polars("""
        SELECT o.route_id, r.route_name, r.mode,
               AVG(o.otp) AS avg_otp, COUNT(*) AS months
        FROM otp_monthly o
        JOIN routes r ON o.route_id = r.route_id
        GROUP BY o.route_id
        HAVING COUNT(*) >= 12
    """)
    return directional, avg_otp


def analyze(directional: pl.DataFrame, avg_otp: pl.DataFrame) -> tuple[pl.DataFrame, dict]:
    """Compute asymmetry index per route and correlate with OTP."""
    # Pivot to get IB and OB columns
    pivoted = directional.pivot(on="direction", index="route_id", values="trips_wd")

    if "IB" not in pivoted.columns or "OB" not in pivoted.columns:
        print("  Warning: Missing IB or OB direction data")
        return pl.DataFrame(), {}

    pivoted = pivoted.rename({"IB": "ib_trips_wd", "OB": "ob_trips_wd"})

    # Drop routes missing a direction entirely (likely loop routes, not genuinely asymmetric)
    pivoted = pivoted.filter(
        pl.col("ib_trips_wd").is_not_null() & pl.col("ob_trips_wd").is_not_null()
    )

    # Compute asymmetry index
    pivoted = pivoted.with_columns(
        total_trips=pl.col("ib_trips_wd") + pl.col("ob_trips_wd"),
    )
    pivoted = pivoted.filter(pl.col("total_trips") > 0)
    pivoted = pivoted.with_columns(
        asymmetry_index=(
            (pl.col("ib_trips_wd") - pl.col("ob_trips_wd")).abs()
            / pl.col("total_trips")
        ),
    )

    # Join with OTP
    result = pivoted.join(avg_otp, on="route_id", how="inner")

    # Compute correlations
    results = correlate_by_mode(result, "asymmetry_index", "avg_otp")

    return result.sort("asymmetry_index", descending=True), results


def make_chart(df: pl.DataFrame) -> None:
    """Generate scatter plot of directional asymmetry vs OTP."""
    plt = setup_plotting()
    fig, ax = plt.subplots(figsize=(10, 7))
    mode_scatter(ax, df, "asymmetry_index", "avg_otp")
    ax.set_xlabel("Directional Asymmetry Index |IB - OB| / (IB + OB)")
    ax.set_ylabel("Average OTP")
    ax.set_title("Directional Trip Asymmetry vs On-Time Performance")
    ax.legend(fontsize=9)
    ax.set_ylim(0, 1)
    ax.set_xlim(-0.05, 1.05)
    save_chart(fig, OUT / "directional_asymmetry.png")


@run_analysis(11, "Directional Asymmetry")
def main() -> None:
    """Entry point: load data, analyze asymmetry, chart, and save."""
    with phase("Loading data"):
        directional, avg_otp = load_data()
        print(f"  {len(directional)} directional records, {len(avg_otp)} routes with OTP")

    with phase("Analyzing"):
        result, results = analyze(directional, avg_otp)
        if len(result) == 0:
            print("  No data to analyze.")
            return

        print(f"  {results['all_n']} routes analyzed (routes with both IB and OB data)")
        print(f"  All routes:  Pearson r = {results['all_pearson_r']:.4f} (p = {results['all_pearson_p']:.4f})")
        if not math.isnan(results["bus_pearson_r"]):
            print(f"  Bus only:    Pearson r = {results['bus_pearson_r']:.4f} (p = {results['bus_pearson_p']:.4f})")
            print(f"               Spearman r = {results['bus_spearman_r']:.4f} (p = {results['bus_spearman_p']:.4f})")
            print(f"               n = {results['bus_n']} bus routes")

        top5 = result.head(5)
        print("\n  Most asymmetric routes:")
        for row in top5.iter_rows(named=True):
            print(f"    {row['route_id']:>5} - {row['route_name']}: "
                  f"IB={row['ib_trips_wd']}, OB={row['ob_trips_wd']}, "
                  f"asymmetry={row['asymmetry_index']:.3f}, OTP={row['avg_otp']:.1%}")

        save_csv(result, OUT / "directional_asymmetry.csv")

    with phase("Generating chart"):
        make_chart(result)


if __name__ == "__main__":
    main()
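
To see what the directional query in `load_data` returns, it can be run against a small in-memory SQLite table. This is only a sketch: the three toy routes and their trip counts are invented, the table holds just the columns the query touches (the `trips_7d` column is omitted for brevity), and `sqlite3` stands in for the project's `query_to_polars` helper, whose implementation is not shown on this page.

```python
import sqlite3

QUERY = """
    SELECT route_id, 'IB' AS direction, MAX(trips_wd) AS trips_wd
    FROM route_stops
    WHERE direction IN ('IB', 'IB,OB') AND trips_wd IS NOT NULL
    GROUP BY route_id
    UNION ALL
    SELECT route_id, 'OB' AS direction, MAX(trips_wd) AS trips_wd
    FROM route_stops
    WHERE direction IN ('OB', 'IB,OB') AND trips_wd IS NOT NULL
    GROUP BY route_id
"""

# Invented rows: (route_id, direction, trips_wd).
ROWS = [
    ("A", "IB", 30), ("A", "OB", 28), ("A", "IB,OB", 25),  # ordinary route
    ("B", "IB", 20), ("B", "OB", 18), ("B", "IB,OB", 22),  # shared stop is the peak
    ("C", "OB", 12), ("C", "OB", None),                    # one direction only
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE route_stops (route_id TEXT, direction TEXT, trips_wd INTEGER)"
)
conn.executemany("INSERT INTO route_stops VALUES (?, ?, ?)", ROWS)

# Collect the result as {route_id: {direction: peak trips}}.
peaks: dict[str, dict[str, int]] = {}
for route_id, direction, trips in conn.execute(QUERY):
    peaks.setdefault(route_id, {})[direction] = trips
conn.close()

for route_id in sorted(peaks):
    print(route_id, peaks[route_id])
```

Route A yields peaks of 30 (IB) and 28 (OB). Route B illustrates the compression caveat: its IB,OB stop has the highest count, so both peaks are 22 and the index is forced to zero. Route C has no IB row at all, which is the case `analyze` drops after pivoting.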

Sources

Name | Type | Why It Matters | Owner | Freshness | Caveat
otp_monthly | table | Primary analytical table used in this page's computations. | Produced by Data Ingestion. | Updated when the producing pipeline step is rerun. | Coverage depends on upstream source availability and ETL assumptions.
route_stops | table | Primary analytical table used in this page's computations. | Produced by Data Ingestion. | Updated when the producing pipeline step is rerun. | Coverage depends on upstream source availability and ETL assumptions.
routes | table | Primary analytical table used in this page's computations. | Produced by Data Ingestion. | Updated when the producing pipeline step is rerun. | Coverage depends on upstream source availability and ETL assumptions.
polars | dependency | Runtime dependency required for this page's pipeline or analysis code. | Open-source Python ecosystem maintainers. | Version pinned by project environment until dependency updates are applied. | Library updates may change behavior or defaults.
scipy | dependency | Runtime dependency required for this page's pipeline or analysis code. | Open-source Python ecosystem maintainers. | Version pinned by project environment until dependency updates are applied. | Library updates may change behavior or defaults.

Upstream sources (5), shared by otp_monthly, route_stops, and routes
  • file data/routes_by_month.csv -- Monthly route OTP source table in wide format.
  • file data/PRT_Current_Routes_Full_System_de0e48fcbed24ebc8b0d933e47b56682.csv -- Current route metadata and mode classifications.
  • file data/Transit_stops_(current)_by_route_e040ee029227468ebf9d217402a82fa9.csv -- Current stop-to-route coverage and trip counts.
  • file data/PRT_Stop_Reference_Lookup_Table.csv -- Historical stop reference file with geography attributes.
  • file data/average-ridership/12bb84ed-397e-435c-8d1b-8ce543108698.csv -- Average ridership by route and month.