Analysis

17: Weekend vs Weekday Service Profile

Route and Service Drivers

Coverage: 2019-01 to 2025-11 (from otp_monthly).

Built 2026-04-03 20:09 UTC · Commit 7c56b9a

Data Provenance

flowchart LR
  17_weekend_weekday_profile(["17: Weekend vs Weekday Service Profile"])
  t_otp_monthly[("otp_monthly")] --> 17_weekend_weekday_profile
  01_data_ingestion[["Data Ingestion"]] --> t_otp_monthly
  u1_01_data_ingestion[/"data/routes_by_month.csv"/] --> 01_data_ingestion
  u2_01_data_ingestion[/"data/PRT_Current_Routes_Full_System_de0e48fcbed24ebc8b0d933e47b56682.csv"/] --> 01_data_ingestion
  u3_01_data_ingestion[/"data/Transit_stops_(current)_by_route_e040ee029227468ebf9d217402a82fa9.csv"/] --> 01_data_ingestion
  u4_01_data_ingestion[/"data/PRT_Stop_Reference_Lookup_Table.csv"/] --> 01_data_ingestion
  u5_01_data_ingestion[/"data/average-ridership/12bb84ed-397e-435c-8d1b-8ce543108698.csv"/] --> 01_data_ingestion
  t_route_stops[("route_stops")] --> 17_weekend_weekday_profile
  01_data_ingestion[["Data Ingestion"]] --> t_route_stops
  t_routes[("routes")] --> 17_weekend_weekday_profile
  01_data_ingestion[["Data Ingestion"]] --> t_routes
  d1_17_weekend_weekday_profile(("polars (lib)")) --> 17_weekend_weekday_profile
  d2_17_weekend_weekday_profile(("scipy (lib)")) --> 17_weekend_weekday_profile
  classDef page fill:#dbeafe,stroke:#1d4ed8,color:#1e3a8a,stroke-width:2px;
  classDef table fill:#ecfeff,stroke:#0e7490,color:#164e63;
  classDef dep fill:#fff7ed,stroke:#c2410c,color:#7c2d12,stroke-dasharray: 4 2;
  classDef file fill:#eef2ff,stroke:#6366f1,color:#3730a3;
  classDef api fill:#f0fdf4,stroke:#16a34a,color:#14532d;
  classDef pipeline fill:#f5f3ff,stroke:#7c3aed,color:#4c1d95;
  class 17_weekend_weekday_profile page;
  class t_otp_monthly,t_route_stops,t_routes table;
  class d1_17_weekend_weekday_profile,d2_17_weekend_weekday_profile dep;
  class u1_01_data_ingestion,u2_01_data_ingestion,u3_01_data_ingestion,u4_01_data_ingestion,u5_01_data_ingestion file;
  class 01_data_ingestion pipeline;

Findings

Findings: Weekend vs Weekday Service Profile

Summary

There is no meaningful correlation between a route's weekend-to-weekday service ratio and its on-time performance (OTP). Routes that run heavy weekend service perform about the same as commuter-oriented, weekday-heavy routes (mean OTP 70.3% vs 69.8%).

Key Numbers

Pearson r = -0.03 (p = 0.79, n = 93)
Spearman rho = -0.02 (p = 0.84)

Service Tier	Routes	Mean OTP
Weekday-heavy (<0.3)	27	69.8%
Balanced (0.3-0.7)	45	68.8%
Weekend-heavy (>0.7)	21	70.3%

Observations

The three service tiers are virtually indistinguishable in OTP (69.8%, 68.8%, 70.3%).
Neither the Pearson nor the Spearman correlation approaches significance.
This null result makes sense: the weekend service ratio reflects demand patterns and scheduling choices, not route structure. A route with high weekend service isn't inherently harder to run on time.
Since OTP is reported monthly (not by day-of-week), it aggregates weekday and weekend performance, which may mask day-specific patterns.
The bus-only Pearson correlation (r = -0.06, p = 0.56) confirms that the null result holds within the dominant mode.
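
For readers who want to see what the two coefficients measure, the following is a minimal pure-Python sketch of Pearson r and Spearman rho (Pearson applied to average ranks). The function names `pearson_r`, `average_ranks`, and `spearman_rho` and the sample values are illustrative assumptions; the analysis itself calls a project helper, `correlate`, whose implementation is not shown on this page. The sketch omits p-values; `scipy.stats.pearsonr` and `scipy.stats.spearmanr` return them alongside the coefficient.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Hypothetical weekend ratios and average OTP values, not taken from the analysis.
ratios = [0.10, 0.25, 0.40, 0.55, 0.80, 0.95]
otps = [0.70, 0.66, 0.71, 0.68, 0.72, 0.67]
print(f"Pearson r = {pearson_r(ratios, otps):.3f}")
print(f"Spearman rho = {spearman_rho(ratios, otps):.3f}")
```

Because Spearman works on ranks, it is insensitive to a few routes with extreme ratios; agreement between the two coefficients, as reported here, suggests the null result is not driven by outliers.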

Implication

Weekend vs weekday service intensity is not a useful predictor of OTP. The structural factors identified in other analyses (stop count, mode, route length) dominate.

Review History

2026-02-10: RED-TEAM-REPORTS/2026-02-10-analyses-12-18.md — 2 issues (both moderate). Bus-only correlation added; caveats strengthened. Null finding unchanged.

Output

service_tier_comparison.png -- box plot by service profile tier
weekend_ratio_vs_otp.png -- scatter plot
service_profile.csv -- per-route trip counts, weekend ratio, and OTP

Methods

Methods: Weekend vs Weekday Service Profile

Question

Do commuter-oriented routes (high weekday, low weekend service) perform differently than all-day routes (similar weekday and weekend service)? The ratio of weekend to weekday trips signals route purpose, and since OTP is reported monthly, it likely reflects weekday-dominant measurement.
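
The "weekday-dominant" point can be made concrete with a little arithmetic. The sketch below assumes that a monthly OTP figure weights every scheduled trip equally and that a month has roughly 22 weekdays, 4 Saturdays, and 4 Sundays; neither assumption comes from the source data, and `weekday_trip_share` is a hypothetical helper, not part of the analysis code.

```python
def weekday_trip_share(trips_wd, trips_sa, trips_su,
                       weekdays=22, saturdays=4, sundays=4):
    """Fraction of a month's scheduled trips that run on weekdays.

    Day counts are rough assumptions for a typical month.
    """
    weekday_trips = trips_wd * weekdays
    weekend_trips = trips_sa * saturdays + trips_su * sundays
    total = weekday_trips + weekend_trips
    return weekday_trips / total if total else 0.0


# Hypothetical daily trip counts (weekday, Saturday, Sunday).
examples = {
    "commuter-style route": (40, 10, 6),
    "all-day route": (20, 20, 20),
}
for label, (wd, sa, su) in examples.items():
    print(f"{label}: {weekday_trip_share(wd, sa, su):.1%} of monthly trips are weekday trips")
```

Under these assumptions, even a route with identical service every day has about 73% of its monthly trips on weekdays (22 of 30 days), so a monthly OTP figure would lean toward weekday performance regardless of the weekend ratio.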

Approach

For each route, compute peak weekday trips (MAX trips_wd), peak Saturday trips (MAX trips_sa), and peak Sunday trips (MAX trips_su) from route_stops.
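Compute weekend ratio = (max_sa + max_su) / (2 * max_wd), the average peak weekend-day service as a proportion of peak weekday service (1.0 = identical, 0 = weekday-only). Routes with no weekday service are assigned a ratio of 0 and are excluded from the correlations.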
Correlate weekend ratio with average OTP.
Classify routes as weekday-heavy (ratio < 0.3), balanced (0.3-0.7), or weekend-heavy (> 0.7) and compare OTP distributions.
Scatter plot and box plot.
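
The ratio and tier steps above can be illustrated with a small pure-Python sketch. The per-stop trip counts are made-up values, and `peak_trips`, `weekend_ratio`, and `service_tier` are hypothetical helper names rather than functions from the analysis code; the 0.3 and 0.7 boundaries are the ones stated in the Approach.

```python
from collections import defaultdict

# Hypothetical per-stop rows: (route_id, trips_wd, trips_sa, trips_su).
STOP_ROWS = [
    ("A", 40, 8, 6),
    ("A", 38, 10, 6),
    ("B", 50, 30, 20),
    ("B", 48, 28, 20),
    ("C", 20, 18, 16),
]

LOW, HIGH = 0.3, 0.7  # tier boundaries from the Approach


def peak_trips(rows):
    """Per-route maximum of each trip-count column across the route's stops."""
    peaks = defaultdict(lambda: [0, 0, 0])
    for route_id, wd, sa, su in rows:
        cur = peaks[route_id]
        peaks[route_id] = [max(cur[0], wd), max(cur[1], sa), max(cur[2], su)]
    return dict(peaks)


def weekend_ratio(max_wd, max_sa, max_su):
    """(max_sa + max_su) / (2 * max_wd); 0.0 when there is no weekday service."""
    if max_wd <= 0:
        return 0.0
    return (max_sa + max_su) / (2.0 * max_wd)


def service_tier(ratio):
    """Below LOW is weekday-heavy, above HIGH is weekend-heavy, else balanced."""
    if ratio < LOW:
        return "weekday-heavy"
    if ratio <= HIGH:
        return "balanced"
    return "weekend-heavy"


profiles = {}
for route_id, (wd, sa, su) in peak_trips(STOP_ROWS).items():
    ratio = weekend_ratio(wd, sa, su)
    profiles[route_id] = (ratio, service_tier(ratio))
    print(route_id, round(ratio, 3), service_tier(ratio))
```

Each maximum is taken independently, so a route's peak Saturday and peak Sunday counts may come from different stops. Route A here takes its Saturday peak (10) from its second stop and its weekday peak (40) from its first, giving (10 + 6) / (2 * 40) = 0.2.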

Data

Name	Description	Source
`route_stops`	Weekday, Saturday, Sunday trip counts per stop	`prt.db` table
`otp_monthly`	Monthly OTP per route	`prt.db` table
`routes`	Mode and name	`prt.db` table

Output

output/service_profile.csv -- per-route trip counts, weekend ratio, and OTP
output/weekend_ratio_vs_otp.png -- scatter plot
output/service_tier_comparison.png -- box plot by service profile tier

Source Code

"""Weekend vs weekday service profile analysis of on-time performance."""

import polars as pl

from prt_otp_analysis.common import (
    analysis_dir,
    correlate,
    mode_scatter,
    phase,
    query_to_polars,
    run_analysis,
    save_chart,
    save_csv,
    setup_plotting,
)

OUT = analysis_dir(__file__)

# Weekend-to-weekday service ratio boundaries for tier classification.
WEEKEND_RATIO_LOW = 0.3   # below = weekday-heavy
WEEKEND_RATIO_HIGH = 0.7  # above = weekend-heavy


def load_data() -> pl.DataFrame:
    """Load per-route service profile and average OTP."""
    trips = query_to_polars("""
        SELECT route_id,
               MAX(trips_wd) AS max_wd,
               MAX(trips_sa) AS max_sa,
               MAX(trips_su) AS max_su
        FROM route_stops
        GROUP BY route_id
    """)
    avg_otp = query_to_polars("""
        SELECT o.route_id, r.route_name, r.mode,
               AVG(o.otp) AS avg_otp, COUNT(*) AS months
        FROM otp_monthly o
        JOIN routes r ON o.route_id = r.route_id
        GROUP BY o.route_id
    """)
    df = avg_otp.join(trips, on="route_id", how="inner")

    # Weekend ratio: (sa + su) / (2 * wd); set to 0 when a route has no weekday service
    df = df.with_columns(
        pl.when(pl.col("max_wd") > 0)
        .then((pl.col("max_sa") + pl.col("max_su")) / (2.0 * pl.col("max_wd")))
        .otherwise(0.0)
        .alias("weekend_ratio")
    )

    # Classify service profile
    wkday_label = f"weekday-heavy (<{WEEKEND_RATIO_LOW})"
    balanced_label = f"balanced ({WEEKEND_RATIO_LOW}-{WEEKEND_RATIO_HIGH})"
    wkend_label = f"weekend-heavy (>{WEEKEND_RATIO_HIGH})"
    df = df.with_columns(
        pl.when(pl.col("weekend_ratio") < WEEKEND_RATIO_LOW).then(pl.lit(wkday_label))
        .when(pl.col("weekend_ratio") <= WEEKEND_RATIO_HIGH).then(pl.lit(balanced_label))
        .otherwise(pl.lit(wkend_label))
        .alias("service_tier")
    )

    return df


def analyze(df: pl.DataFrame) -> dict:
    """Compute correlations and tier statistics."""
    results = {}
    results["n_routes"] = len(df)

    # Filter to routes with weekday service for meaningful ratio
    with_wd = df.filter(pl.col("max_wd") > 0)

    all_corr = correlate(with_wd, "weekend_ratio", "avg_otp")
    results["ratio_r"] = all_corr["pearson_r"]
    results["ratio_p"] = all_corr["pearson_p"]
    results["ratio_rho"] = all_corr["spearman_r"]
    results["ratio_rho_p"] = all_corr["spearman_p"]

    # Bus-only (avoids Simpson's paradox)
    bus_wd = with_wd.filter(pl.col("mode") == "BUS")
    bus_corr = correlate(bus_wd, "weekend_ratio", "avg_otp")
    results["bus_ratio_r"] = bus_corr["pearson_r"]
    results["bus_ratio_p"] = bus_corr["pearson_p"]
    results["n_bus"] = bus_corr["n"]

    # Tier stats
    for tier_label in [f"weekday-heavy (<{WEEKEND_RATIO_LOW})", f"balanced ({WEEKEND_RATIO_LOW}-{WEEKEND_RATIO_HIGH})", f"weekend-heavy (>{WEEKEND_RATIO_HIGH})"]:
        subset = df.filter(pl.col("service_tier") == tier_label)
        key = tier_label.split(" ")[0]
        results[f"{key}_n"] = len(subset)
        if len(subset) > 0:
            results[f"{key}_mean_otp"] = subset["avg_otp"].mean()
            results[f"{key}_median_otp"] = subset["avg_otp"].median()

    return results


def make_charts(df: pl.DataFrame, results: dict) -> None:
    """Generate scatter and box plots."""
    plt = setup_plotting()
    # Scatter: weekend ratio vs OTP
    fig, ax = plt.subplots(figsize=(10, 7))
    mode_scatter(ax, df, "weekend_ratio", "avg_otp", trend=False)
    ax.set_xlabel("Weekend/Weekday Service Ratio")
    ax.set_ylabel("Average OTP")
    ax.set_title(f"Weekend Service Ratio vs OTP (r={results['ratio_r']:.3f}, p={results['ratio_p']:.3f})")
    ax.legend(fontsize=9)
    ax.set_ylim(0, 1)
    ax.set_xlim(-0.05, 1.5)
    save_chart(fig, OUT / "weekend_ratio_vs_otp.png")

    # Box plot by service tier
    fig, ax = plt.subplots(figsize=(8, 6))
    tiers = [f"weekday-heavy (<{WEEKEND_RATIO_LOW})", f"balanced ({WEEKEND_RATIO_LOW}-{WEEKEND_RATIO_HIGH})", f"weekend-heavy (>{WEEKEND_RATIO_HIGH})"]
    box_data = []
    box_labels = []
    for tier in tiers:
        vals = df.filter(pl.col("service_tier") == tier)["avg_otp"].to_list()
        if vals:
            box_data.append(vals)
            box_labels.append(f"{tier}\n(n={len(vals)})")
    bp = ax.boxplot(box_data, tick_labels=box_labels, patch_artist=True)
    colors = ["#ef4444", "#f59e0b", "#22c55e"]
    for patch, color in zip(bp["boxes"], colors):
        patch.set_facecolor(color)
        patch.set_alpha(0.6)
    ax.set_ylabel("Average OTP")
    ax.set_title("OTP by Service Profile Tier")
    save_chart(fig, OUT / "service_tier_comparison.png")


@run_analysis(17, "Weekend vs Weekday Service Profile")
def main() -> None:
    """Entry point: load data, analyze, chart, and save."""
    with phase("Loading data"):
        df = load_data()
        print(f"  {len(df)} routes with service profile and OTP data")

    with phase("Analyzing"):
        results = analyze(df)
        print(f"  All-mode:  Pearson r = {results['ratio_r']:.4f} (p = {results['ratio_p']:.4f})")
        print(f"             Spearman rho = {results['ratio_rho']:.4f} (p = {results['ratio_rho_p']:.4f})")
        print(f"  Bus-only:  Pearson r = {results['bus_ratio_r']:.4f} (p = {results['bus_ratio_p']:.4f}), "
              f"n = {results['n_bus']}")
        for tier, key in [("Weekday-heavy", "weekday-heavy"), ("Balanced", "balanced"), ("Weekend-heavy", "weekend-heavy")]:
            n = results.get(f"{key}_n", 0)
            if n > 0:
                print(f"  {tier}: n={n}, mean OTP={results[f'{key}_mean_otp']:.1%}")

        save_csv(df, OUT / "service_profile.csv")

    with phase("Generating charts"):
        make_charts(df, results)


if __name__ == "__main__":
    main()

    

    

Sources

Name	Type	Why It Matters	Owner	Freshness	Caveat
otp_monthly	table	Primary analytical table used in this page's computations.	Produced by Data Ingestion.	Updated when the producing pipeline step is rerun.	Coverage depends on upstream source availability and ETL assumptions.
Upstream sources (5):
  data/routes_by_month.csv (file) -- Monthly route OTP source table in wide format.
  data/PRT_Current_Routes_Full_System_de0e48fcbed24ebc8b0d933e47b56682.csv (file) -- Current route metadata and mode classifications.
  data/Transit_stops_(current)_by_route_e040ee029227468ebf9d217402a82fa9.csv (file) -- Current stop-to-route coverage and trip counts.
  data/PRT_Stop_Reference_Lookup_Table.csv (file) -- Historical stop reference file with geography attributes.
  data/average-ridership/12bb84ed-397e-435c-8d1b-8ce543108698.csv (file) -- Average ridership by route and month.
route_stops	table	Primary analytical table used in this page's computations.	Produced by Data Ingestion.	Updated when the producing pipeline step is rerun.	Coverage depends on upstream source availability and ETL assumptions.
Upstream sources (5): the same five files listed for otp_monthly.
routes	table	Primary analytical table used in this page's computations.	Produced by Data Ingestion.	Updated when the producing pipeline step is rerun.	Coverage depends on upstream source availability and ETL assumptions.
Upstream sources (5): the same five files listed for otp_monthly.
polars	dependency	Runtime dependency required for this page's pipeline or analysis code.	Open-source Python ecosystem maintainers.	Version pinned by project environment until dependency updates are applied.	Library updates may change behavior or defaults.
scipy	dependency	Runtime dependency required for this page's pipeline or analysis code.	Open-source Python ecosystem maintainers.	Version pinned by project environment until dependency updates are applied.	Library updates may change behavior or defaults.