Analysis

15: Municipal/County Equity

Route and Service Drivers

Coverage: 2019-01 to 2025-11 (from otp_monthly).

Built 2026-06-15 11:52 UTC · Commit e5cf673

Data Provenance

flowchart LR
  15_municipal_equity(["15: Municipal/County Equity"])
  t_otp_monthly[("otp_monthly")] --> 15_municipal_equity
  01_data_ingestion[["Data Ingestion"]] --> t_otp_monthly
  u1_01_data_ingestion[/"data/routes_by_month.csv"/] --> 01_data_ingestion
  u2_01_data_ingestion[/"data/PRT_Current_Routes_Full_System_de0e48fcbed24ebc8b0d933e47b56682.csv"/] --> 01_data_ingestion
  u3_01_data_ingestion[/"data/Transit_stops_(current)_by_route_e040ee029227468ebf9d217402a82fa9.csv"/] --> 01_data_ingestion
  u4_01_data_ingestion[/"data/PRT_Stop_Reference_Lookup_Table.csv"/] --> 01_data_ingestion
  u5_01_data_ingestion[/"data/average-ridership/12bb84ed-397e-435c-8d1b-8ce543108698.csv"/] --> 01_data_ingestion
  t_route_stops[("route_stops")] --> 15_municipal_equity
  01_data_ingestion[["Data Ingestion"]] --> t_route_stops
  t_routes[("routes")] --> 15_municipal_equity
  01_data_ingestion[["Data Ingestion"]] --> t_routes
  t_stops[("stops")] --> 15_municipal_equity
  01_data_ingestion[["Data Ingestion"]] --> t_stops
  d1_15_municipal_equity(("polars (lib)")) --> 15_municipal_equity
  d2_15_municipal_equity(("scipy (lib)")) --> 15_municipal_equity
  classDef page fill:#dbeafe,stroke:#1d4ed8,color:#1e3a8a,stroke-width:2px;
  classDef table fill:#ecfeff,stroke:#0e7490,color:#164e63;
  classDef dep fill:#fff7ed,stroke:#c2410c,color:#7c2d12,stroke-dasharray: 4 2;
  classDef file fill:#eef2ff,stroke:#6366f1,color:#3730a3;
  classDef api fill:#f0fdf4,stroke:#16a34a,color:#14532d;
  classDef pipeline fill:#f5f3ff,stroke:#7c3aed,color:#4c1d95;
  class 15_municipal_equity page;
  class t_otp_monthly,t_route_stops,t_routes,t_stops table;
  class d1_15_municipal_equity,d2_15_municipal_equity dep;
  class u1_01_data_ingestion,u2_01_data_ingestion,u3_01_data_ingestion,u4_01_data_ingestion,u5_01_data_ingestion file;
  class 01_data_ingestion pipeline;

Findings

Findings: Municipal/County Equity

Summary

81 municipalities had enough stops (10 or more) to analyze. On-time performance (OTP) differs by 24.9 pp between the best and worst municipalities, similar to the neighborhood-level equity gap. Cross-jurisdictional routes (serving 2+ municipalities) show no statistically significant difference in OTP from single-municipality routes.

Key Numbers

81 municipalities analyzed (with much better data coverage than the 89 neighborhoods in Analysis 04)
Best: Castle Shannon borough (84.0%)
Worst: Plum borough (59.1%)
Spread: 24.9 pp
Suburban median OTP: 68.1%
Cross-jurisdictional routes (n=74): avg OTP = 69.5%
Single-municipality routes (n=19): avg OTP = 69.2%
Cross vs single t-test: p = 0.85 -- no significant difference
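
The cross-versus-single comparison uses Welch's t-test (the source code calls `scipy.stats.ttest_ind` with `equal_var=False`). As a minimal sketch of what that statistic computes, the pure-Python function below derives the t value and the Welch-Satterthwaite degrees of freedom. The function name `welch_t` and the sample values are hypothetical and not taken from the analysis; the p-value itself still comes from scipy.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a: list[float], b: list[float]) -> tuple[float, float]:
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard error of each group mean (sample variance, n - 1 denominator).
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)
    # Difference in means scaled by the combined standard error.
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    # Effective degrees of freedom when the two variances are not pooled.
    df = (se2_a + se2_b) ** 2 / (
        se2_a**2 / (len(a) - 1) + se2_b**2 / (len(b) - 1)
    )
    return t, df


# Hypothetical route-level OTP values (fractions), not read from prt.db.
cross = [0.70, 0.66, 0.72, 0.68, 0.71, 0.69]
single = [0.64, 0.75, 0.69]
t_stat, dof = welch_t(cross, single)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

Because the variances are not pooled, the smaller group (19 single-municipality routes against 74 cross-jurisdictional ones in this analysis) contributes its own, larger standard error rather than borrowing precision from the bigger group.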

Observations

The best-performing municipalities (Castle Shannon, Dormont, Beechview) are all served by the light rail T line, consistent with the mode advantage found in Analysis 02.
The worst-performing municipalities (Plum, Penn Hills, Wilkinsburg) are served primarily by long local bus routes through the eastern corridor.
The suburban median OTP (68.1%) is very close to the overall system average, suggesting no systematic suburban vs urban disadvantage.
Cross-jurisdictional routes -- which might be expected to suffer from longer distances and more complexity -- perform about the same as single-municipality routes (69.5% vs 69.2%, p = 0.85). Route length and stop count likely matter more than jurisdictional boundaries, although this analysis does not test them directly.
The municipal analysis has much better data coverage than the neighborhood analysis (Analysis 04), which lost 58% of stops due to missing hood data.

Implication

The equity gap is driven by mode and route structure, not by geography per se. Municipalities on rail or busway corridors get 80%+ OTP; those served only by long local bus routes get about 60%. Municipal boundaries and the suburban/urban distinction are not meaningful predictors.

Caveats

Route OTP is projected onto stops and then onto municipalities (ecological fallacy). A route's performance may vary along its length, and municipalities at the ends of long routes may experience different OTP than those near the middle.
Trip weights (trips_wd in the source code) are a static snapshot applied across the full study period. Service levels changed over time, especially during COVID.
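
The projection caveat can be made concrete with a toy example. The sketch below assumes a single route crossing three made-up municipalities and invents segment-level OTP values purely for illustration; the source data contains no such breakdown. It shows how one route-level figure, once projected, hides differences along the route.

```python
# Hypothetical segment-level OTP along one route; the real data has no such breakdown.
segment_otp = {"Alpha": 0.85, "Beta": 0.72, "Gamma": 0.55}

# The analysis only observes one route-level figure (here, a simple mean of segments).
route_otp = sum(segment_otp.values()) / len(segment_otp)

# Projection step: every municipality on the route inherits the same route-level value.
projected = {muni: route_otp for muni in segment_otp}

# Gap between the attributed value and what riders in each place would experience.
error = {muni: projected[muni] - segment_otp[muni] for muni in segment_otp}

for muni in segment_otp:
    print(
        f"{muni}: actual={segment_otp[muni]:.2f} "
        f"projected={projected[muni]:.2f} error={error[muni]:+.2f}"
    )
```

In this toy case the municipality at the poorly performing end of the route is credited with better OTP than it experiences, and the well-performing end with worse.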

Review History

2026-02-10: RED-TEAM-REPORTS/2026-02-10-analyses-12-18.md — 3 issues (2 significant, inherent to data). Ecological fallacy documented; Welch's t-test applied (no material change).

Output

image pittsburgh_vs_suburban.png -- comparison chart.

image top_bottom_municipalities.png -- bar chart of best/worst municipalities.

No interactive outputs declared.

data municipal_otp.csv -- per-municipality average OTP and stop count.


Methods

Methods: Municipal/County Equity

Question

Analysis 04 examined neighborhood equity but lost 58% of stops due to missing hood data. The muni (municipality) and county fields have broader coverage. Do suburban municipalities get better or worse service reliability than the City of Pittsburgh? Do routes that cross municipal boundaries perform differently?

Approach

For each stop, assign OTP from the routes serving it (trip-weighted average from route_stops and otp_monthly).
Aggregate stop-level OTP by municipality and county, weighted by trips.
Rank municipalities by average OTP.
Identify cross-jurisdictional routes (routes with stops in 2+ municipalities) and compare their OTP to single-municipality routes.
Compare Pittsburgh city vs suburban municipalities.
Plot a bar chart of the top and bottom municipalities, and a Pittsburgh vs suburban comparison chart.
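
The first two steps above amount to a trip-weighted mean followed by a minimum-stops filter. The sketch below illustrates that arithmetic in plain Python under stated assumptions: the rows, the municipality names, and the `municipal_otp` function are hypothetical, and the real analysis performs the equivalent aggregation with polars and the project's `weighted_mean` helper, whose implementation is not shown on this page.

```python
from collections import defaultdict

MIN_STOPS = 2  # toy threshold for this example; the analysis itself uses 10

# Hypothetical (stop_id, muni, route_avg_otp, trips) rows, one per stop-route pair.
rows = [
    ("S1", "Alpha", 0.80, 100),
    ("S2", "Alpha", 0.60, 300),
    ("S3", "Beta", 0.90, 50),
]


def municipal_otp(rows, min_stops):
    """Trip-weighted mean OTP per municipality, keeping those with enough stops."""
    weighted_sum = defaultdict(float)
    total_trips = defaultdict(float)
    stops = defaultdict(set)
    for stop_id, muni, otp, trips in rows:
        weighted_sum[muni] += otp * trips
        total_trips[muni] += trips
        stops[muni].add(stop_id)
    return {
        muni: weighted_sum[muni] / total_trips[muni]
        for muni in weighted_sum
        if len(stops[muni]) >= min_stops and total_trips[muni] > 0
    }


result = municipal_otp(rows, MIN_STOPS)
print(result)
```

Here the high-frequency stop (300 trips at 0.60) pulls the average for "Alpha" toward its value, and "Beta" is dropped because it has only one stop.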

Data

Name	Description	Source
`stops`	Municipality (`muni`) and county for each stop	`prt.db` table
`route_stops`	Links routes to stops with trip counts	`prt.db` table
`otp_monthly`	Monthly OTP per route	`prt.db` table
`routes`	Mode classification	`prt.db` table
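
Using the `route_stops` and `stops` tables listed above, the cross-jurisdictional classification from the Approach can be sketched as counting distinct municipalities per route. The rows and the `classify_routes` function below are hypothetical stand-ins for the joined tables. The exclusion of missing and `'0'` municipality values mirrors the SQL filter in the source code.

```python
from collections import defaultdict

# Hypothetical route_stops rows joined to stops: (route_id, stop_id, muni).
route_stop_munis = [
    ("R1", "S1", "Alpha"),
    ("R1", "S2", "Beta"),
    ("R2", "S3", "Alpha"),
    ("R2", "S4", "Alpha"),
    ("R3", "S5", None),  # missing municipality, excluded
    ("R3", "S6", "0"),   # placeholder value '0', also excluded
]


def classify_routes(rows):
    """Label each route by how many distinct municipalities its stops fall in."""
    munis_by_route = defaultdict(set)
    for route_id, _stop_id, muni in rows:
        if muni is not None and muni != "0":
            munis_by_route[route_id].add(muni)
    return {
        route_id: "cross-jurisdictional" if len(munis) >= 2 else "single-municipality"
        for route_id, munis in munis_by_route.items()
    }


labels = classify_routes(route_stop_munis)
print(labels)
```

A route whose stops all lack a usable municipality (R3 here) receives no label at all, so it drops out of the comparison rather than counting as single-municipality.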

Output

output/municipal_otp.csv -- per-municipality average OTP and stop count
output/top_bottom_municipalities.png -- bar chart of best/worst municipalities
output/pittsburgh_vs_suburban.png -- comparison chart

Source Code

"""Municipal and county equity analysis of on-time performance."""

import polars as pl
from scipy import stats

from prt_otp_analysis.common import (
    OTP_GOOD_THRESHOLD,
    OTP_WARNING_THRESHOLD,
    analysis_dir,
    phase,
    query_to_polars,
    run_analysis,
    save_chart,
    save_csv,
    setup_plotting,
    weighted_mean,
)

OUT = analysis_dir(__file__)

MIN_STOPS = 10


def load_data() -> tuple[pl.DataFrame, pl.DataFrame]:
    """Load per-municipality OTP and cross-jurisdictional route data."""
    # Trip-weighted OTP per stop via the routes that serve it
    stop_otp = query_to_polars("""
        SELECT rs.stop_id, rs.route_id, rs.trips_wd,
               s.muni, s.county,
               AVG(o.otp) AS route_avg_otp
        FROM route_stops rs
        JOIN stops s ON rs.stop_id = s.stop_id
        JOIN otp_monthly o ON rs.route_id = o.route_id
        WHERE s.muni IS NOT NULL AND s.muni != '0'
        GROUP BY rs.stop_id, rs.route_id, rs.trips_wd, s.muni, s.county
    """)

    # Per-municipality: trip-weighted average OTP
    muni_otp = (
        stop_otp
        .group_by("muni", "county")
        .agg(
            avg_otp=weighted_mean("route_avg_otp", "trips_wd"),
            n_stops=pl.col("stop_id").n_unique(),
            n_routes=pl.col("route_id").n_unique(),
            total_trips=pl.col("trips_wd").sum(),
        )
        .filter(pl.col("n_stops") >= MIN_STOPS)
        .sort("avg_otp", descending=True)
    )

    # Cross-jurisdictional routes: routes with stops in 2+ municipalities
    route_munis = query_to_polars("""
        SELECT rs.route_id, COUNT(DISTINCT s.muni) AS n_munis
        FROM route_stops rs
        JOIN stops s ON rs.stop_id = s.stop_id
        WHERE s.muni IS NOT NULL AND s.muni != '0'
        GROUP BY rs.route_id
    """)
    avg_otp_by_route = query_to_polars("""
        SELECT o.route_id, r.route_name, r.mode,
               AVG(o.otp) AS avg_otp
        FROM otp_monthly o
        JOIN routes r ON o.route_id = r.route_id
        GROUP BY o.route_id, r.route_name, r.mode
    """)
    cross_jur = route_munis.join(avg_otp_by_route, on="route_id", how="inner")
    cross_jur = cross_jur.with_columns(
        pl.when(pl.col("n_munis") >= 2)
        .then(pl.lit("cross-jurisdictional"))
        .otherwise(pl.lit("single-municipality"))
        .alias("jurisdiction_type")
    )

    return muni_otp, cross_jur


def analyze(muni_otp: pl.DataFrame, cross_jur: pl.DataFrame) -> dict:
    """Compute summary stats and comparisons."""
    results = {}

    # Pittsburgh vs suburban
    pgh = muni_otp.filter(pl.col("muni") == "Pittsburgh")
    suburban = muni_otp.filter(pl.col("muni") != "Pittsburgh")

    if len(pgh) > 0:
        results["pgh_otp"] = pgh["avg_otp"][0]
        results["pgh_stops"] = pgh["n_stops"][0]
    results["suburban_median_otp"] = suburban["avg_otp"].median()
    results["suburban_mean_otp"] = suburban["avg_otp"].mean()
    results["n_munis"] = len(muni_otp)

    # Spread
    results["best_muni"] = muni_otp.sort("avg_otp", descending=True).head(1)["muni"][0]
    results["best_otp"] = muni_otp.sort("avg_otp", descending=True).head(1)["avg_otp"][0]
    results["worst_muni"] = muni_otp.sort("avg_otp").head(1)["muni"][0]
    results["worst_otp"] = muni_otp.sort("avg_otp").head(1)["avg_otp"][0]
    results["spread"] = results["best_otp"] - results["worst_otp"]

    # Cross-jurisdictional comparison
    cross = cross_jur.filter(pl.col("jurisdiction_type") == "cross-jurisdictional")
    single = cross_jur.filter(pl.col("jurisdiction_type") == "single-municipality")
    results["cross_mean_otp"] = cross["avg_otp"].mean()
    results["single_mean_otp"] = single["avg_otp"].mean()
    results["n_cross"] = len(cross)
    results["n_single"] = len(single)

    if len(cross) > 2 and len(single) > 2:
        # Welch's t-test: does not assume equal variances across the two groups
        t, p = stats.ttest_ind(cross["avg_otp"].to_list(), single["avg_otp"].to_list(),
                              equal_var=False)
        results["cross_t"] = t
        results["cross_p"] = p

    return results


def make_charts(muni_otp: pl.DataFrame, cross_jur: pl.DataFrame, results: dict) -> None:
    """Generate bar charts for municipality ranking and Pittsburgh comparison."""
    plt = setup_plotting()

    # Top/bottom municipalities
    n_show = 10
    top = muni_otp.sort("avg_otp", descending=True).head(n_show)
    bottom = muni_otp.sort("avg_otp").head(n_show)
    combined = pl.concat([top, bottom.sort("avg_otp", descending=True)])
    # Remove duplicates if muni appears in both
    combined = combined.unique(subset=["muni"]).sort("avg_otp", descending=True)

    fig, ax = plt.subplots(figsize=(12, 8))
    munis = combined["muni"].to_list()
    otps = combined["avg_otp"].to_list()
    colors = ["#22c55e" if v >= OTP_GOOD_THRESHOLD else "#f59e0b" if v >= OTP_WARNING_THRESHOLD else "#ef4444" for v in otps]
    ax.barh(range(len(munis)), otps, color=colors, edgecolor="white")
    ax.set_yticks(range(len(munis)))
    ax.set_yticklabels(munis, fontsize=8)
    ax.set_xlabel("Average OTP")
    ax.set_title(f"Top & Bottom Municipalities by OTP (min {MIN_STOPS} stops)")
    ax.set_xlim(0.4, 1.0)
    ax.invert_yaxis()
    save_chart(fig, OUT / "top_bottom_municipalities.png")

    # Pittsburgh vs suburban
    fig, ax = plt.subplots(figsize=(8, 5))
    categories = ["Pittsburgh", "Suburban\n(median)", "Suburban\n(mean)"]
    values = [
        results.get("pgh_otp", 0),
        results["suburban_median_otp"],
        results["suburban_mean_otp"],
    ]
    colors = ["#3b82f6", "#22c55e", "#22c55e"]
    ax.bar(categories, values, color=colors, edgecolor="white", width=0.5)
    ax.set_ylabel("Average OTP")
    ax.set_title("Pittsburgh vs Suburban Municipalities")
    ax.set_ylim(0.5, 0.85)
    for i, v in enumerate(values):
        ax.text(i, v + 0.005, f"{v:.1%}", ha="center", fontsize=10)
    save_chart(fig, OUT / "pittsburgh_vs_suburban.png")


@run_analysis(15, "Municipal/County Equity")
def main() -> None:
    """Entry point: load data, analyze, chart, and save."""
    with phase("Loading data"):
        muni_otp, cross_jur = load_data()
        print(f"  {len(muni_otp)} municipalities with {MIN_STOPS}+ stops")

    with phase("Analyzing"):
        results = analyze(muni_otp, cross_jur)
        print(f"  Best:  {results['best_muni']} ({results['best_otp']:.1%})")
        print(f"  Worst: {results['worst_muni']} ({results['worst_otp']:.1%})")
        # Spread is a difference of two rates, so report it in percentage points
        print(f"  Spread: {results['spread'] * 100:.1f} pp")
        if "pgh_otp" in results:
            print(f"  Pittsburgh: {results['pgh_otp']:.1%} ({results['pgh_stops']} stops)")
        print(f"  Suburban median: {results['suburban_median_otp']:.1%}")
        print(f"  Cross-jurisdictional routes: {results['n_cross']} "
              f"(avg OTP={results['cross_mean_otp']:.1%})")
        print(f"  Single-municipality routes: {results['n_single']} "
              f"(avg OTP={results['single_mean_otp']:.1%})")
        if "cross_p" in results:
            print(f"  Difference t-test: t={results['cross_t']:.3f}, p={results['cross_p']:.4f}")

        save_csv(muni_otp, OUT / "municipal_otp.csv")

    with phase("Generating charts"):
        make_charts(muni_otp, cross_jur, results)


if __name__ == "__main__":
    main()

    

    

Sources

Name	Type	Why It Matters	Owner	Freshness	Caveat
otp_monthly	table	Primary analytical table used in this page's computations.	Produced by Data Ingestion.	Updated when the producing pipeline step is rerun.	Coverage depends on upstream source availability and ETL assumptions.
Upstream sources (5), common to all four tables on this page:
data/routes_by_month.csv -- Monthly route OTP source table in wide format.
data/PRT_Current_Routes_Full_System_de0e48fcbed24ebc8b0d933e47b56682.csv -- Current route metadata and mode classifications.
data/Transit_stops_(current)_by_route_e040ee029227468ebf9d217402a82fa9.csv -- Current stop-to-route coverage and trip counts.
data/PRT_Stop_Reference_Lookup_Table.csv -- Historical stop reference file with geography attributes.
data/average-ridership/12bb84ed-397e-435c-8d1b-8ce543108698.csv -- Average ridership by route and month.
route_stops	table	Primary analytical table used in this page's computations.	Produced by Data Ingestion.	Updated when the producing pipeline step is rerun.	Coverage depends on upstream source availability and ETL assumptions.
routes	table	Primary analytical table used in this page's computations.	Produced by Data Ingestion.	Updated when the producing pipeline step is rerun.	Coverage depends on upstream source availability and ETL assumptions.
stops	table	Primary analytical table used in this page's computations.	Produced by Data Ingestion.	Updated when the producing pipeline step is rerun.	Coverage depends on upstream source availability and ETL assumptions.
polars	dependency	Runtime dependency required for this page's pipeline or analysis code.	Open-source Python ecosystem maintainers.	Version pinned by project environment until dependency updates are applied.	Library updates may change behavior or defaults.
scipy	dependency	Runtime dependency required for this page's pipeline or analysis code.	Open-source Python ecosystem maintainers.	Version pinned by project environment until dependency updates are applied.	Library updates may change behavior or defaults.