Analysis

09: Incline Investigation

Core OTP Patterns

Coverage: 2019-01 to 2025-11 (from otp_monthly).

Built 2026-04-03 20:09 UTC · Commit 7c56b9a

Page Navigation

Analysis Navigation

Data Provenance

flowchart LR
  09_incline_investigation(["09: Incline Investigation"])
  t_otp_monthly[("otp_monthly")] --> 09_incline_investigation
  01_data_ingestion[["Data Ingestion"]] --> t_otp_monthly
  u1_01_data_ingestion[/"data/routes_by_month.csv"/] --> 01_data_ingestion
  u2_01_data_ingestion[/"data/PRT_Current_Routes_Full_System_de0e48fcbed24ebc8b0d933e47b56682.csv"/] --> 01_data_ingestion
  u3_01_data_ingestion[/"data/Transit_stops_(current)_by_route_e040ee029227468ebf9d217402a82fa9.csv"/] --> 01_data_ingestion
  u4_01_data_ingestion[/"data/PRT_Stop_Reference_Lookup_Table.csv"/] --> 01_data_ingestion
  u5_01_data_ingestion[/"data/average-ridership/12bb84ed-397e-435c-8d1b-8ce543108698.csv"/] --> 01_data_ingestion
  t_route_stops[("route_stops")] --> 09_incline_investigation
  01_data_ingestion[["Data Ingestion"]] --> t_route_stops
  t_routes[("routes")] --> 09_incline_investigation
  01_data_ingestion[["Data Ingestion"]] --> t_routes
  t_stop_reference[("stop_reference")] --> 09_incline_investigation
  01_data_ingestion[["Data Ingestion"]] --> t_stop_reference
  t_stops[("stops")] --> 09_incline_investigation
  01_data_ingestion[["Data Ingestion"]] --> t_stops
  d1_09_incline_investigation(("polars (lib)")) --> 09_incline_investigation
  classDef page fill:#dbeafe,stroke:#1d4ed8,color:#1e3a8a,stroke-width:2px;
  classDef table fill:#ecfeff,stroke:#0e7490,color:#164e63;
  classDef dep fill:#fff7ed,stroke:#c2410c,color:#7c2d12,stroke-dasharray: 4 2;
  classDef file fill:#eef2ff,stroke:#6366f1,color:#3730a3;
  classDef api fill:#f0fdf4,stroke:#16a34a,color:#14532d;
  classDef pipeline fill:#f5f3ff,stroke:#7c3aed,color:#4c1d95;
  class 09_incline_investigation page;
  class t_otp_monthly,t_route_stops,t_routes,t_stop_reference,t_stops table;
  class d1_09_incline_investigation dep;
  class u1_01_data_ingestion,u2_01_data_ingestion,u3_01_data_ingestion,u4_01_data_ingestion,u5_01_data_ingestion file;
  class 01_data_ingestion pipeline;

Findings

Findings: Incline Investigation

Summary

The Monongahela Incline is a data pipeline artifact. It appears in the route catalog but has zero OTP measurements. OTP was never recorded for either of Pittsburgh's two inclines.

What the Data Shows

Table MI Records Details
routes 1 row route_id=MI, mode=INCLINE
otp_monthly 0 rows No OTP data whatsoever
route_stops 2 rows Upper Station, Lower Station (78 weekday trips, 549 weekly)
stops 2 Incline stops W15307 (Upper, Mount Washington), W15308 (Lower, South Shore)
stop_reference 4 Incline stops Both inclines present historically (first_served=1503, last_served=2510)

Observations

  • Both the Monongahela Incline (MI) and the Duquesne Incline (DQI) exist in the routes table with mode=INCLINE, but neither has any OTP data.
  • The inclines are physically operational and appear in PRT's GTFS feed with regular service (78 weekday trips).
  • The OTP measurement system (whatever generates routes_by_month.csv) simply does not cover incline routes. This is likely because inclines run on a fixed schedule with no traffic interference, making OTP measurement less meaningful.
  • The ANALYSIS-PROPOSAL.md described this as "zero values" -- in fact, there are no values at all, not even zeros.

Conclusion

This is not a data quality issue to fix. The inclines were included in the route catalog because they're part of PRT's GTFS feed, but were excluded from OTP measurement, likely by design. No further action needed.

Review History

Output

Methods

Methods: Monongahela Incline Investigation

Question

The Monongahela Incline may have zero or missing OTP data. Is this a data pipeline artifact, or was OTP never measured for the Incline?

Approach

  • Query the otp_monthly table for both incline route_ids ('MI' and 'DQI') and examine all values.
  • Check if 'MI' appears in route_stops and what stops are associated.
  • Check the routes table for mode and name (including all INCLINE-mode routes).
  • Cross-reference with the stop_reference table for historical Incline stops.
  • Summarize findings as a data quality report.

Data

Name Description Source
otp_monthly Any MI records prt.db table
routes MI metadata prt.db table
route_stops MI stop associations prt.db table
stops Physical Incline stops prt.db table
stop_reference Historical Incline stops prt.db table

Output

  • output/incline_report.csv -- all data found for MI across tables
  • output/incline_report.txt -- plain-text summary of findings

Source Code

"""Data audit of the Monongahela Incline's presence in OTP data."""

import polars as pl

from prt_otp_analysis.common import analysis_dir, phase, query_to_polars, run_analysis, save_csv

OUT = analysis_dir(__file__)


def investigate() -> str:
    """Query all tables for Incline-related data and build a report."""
    lines = []
    lines.append("=" * 60)
    lines.append("Monongahela Incline Investigation Report")
    lines.append("=" * 60)

    # 1. Routes table
    lines.append("\n## 1. Routes table (route_id = 'MI' or mode = 'INCLINE')")
    routes = query_to_polars("""
        SELECT * FROM routes WHERE route_id = 'MI' OR mode = 'INCLINE'
    """)
    if len(routes) > 0:
        for row in routes.iter_rows(named=True):
            lines.append(f"  route_id={row['route_id']}, name={row['route_name']}, mode={row['mode']}")
    else:
        lines.append("  No matching rows found.")

    # 2. OTP monthly
    lines.append("\n## 2. OTP monthly (route_id IN ('MI', 'DQI'))")
    otp = query_to_polars("""
        SELECT * FROM otp_monthly WHERE route_id IN ('MI', 'DQI')
    """)
    if len(otp) > 0:
        lines.append(f"  {len(otp)} rows found:")
        for row in otp.iter_rows(named=True):
            lines.append(f"    month={row['month']}, otp={row['otp']}")
    else:
        lines.append("  No OTP data found for MI.")

    # 3. Route stops
    lines.append("\n## 3. Route stops (route_id = 'MI')")
    route_stops = query_to_polars("""
        SELECT rs.*, s.stop_name, s.lat, s.lon
        FROM route_stops rs
        LEFT JOIN stops s ON rs.stop_id = s.stop_id
        WHERE rs.route_id = 'MI'
    """)
    if len(route_stops) > 0:
        lines.append(f"  {len(route_stops)} stop-route pairs found:")
        for row in route_stops.iter_rows(named=True):
            lines.append(
                f"    stop_id={row['stop_id']}, name={row['stop_name']}, "
                f"dir={row['direction']}, trips_wd={row['trips_wd']}, trips_7d={row['trips_7d']}"
            )
    else:
        lines.append("  No stops associated with MI in route_stops.")

    # 4. Stops table (search for incline-related stops)
    lines.append("\n## 4. Stops table (name containing 'incline' or 'monongahela')")
    stops = query_to_polars("""
        SELECT * FROM stops
        WHERE LOWER(stop_name) LIKE '%incline%'
           OR LOWER(stop_name) LIKE '%monongahela%'
    """)
    if len(stops) > 0:
        lines.append(f"  {len(stops)} stops found:")
        for row in stops.iter_rows(named=True):
            lines.append(
                f"    stop_id={row['stop_id']}, name={row['stop_name']}, "
                f"hood={row['hood']}, muni={row['muni']}"
            )
    else:
        lines.append("  No incline-related stops found in current stops.")

    # 5. Stop reference (historical)
    lines.append("\n## 5. Stop reference (mode = 'INCLINE' or name containing 'incline')")
    stop_ref = query_to_polars("""
        SELECT * FROM stop_reference
        WHERE mode = 'INCLINE'
           OR LOWER(stop_name) LIKE '%incline%'
           OR LOWER(stop_name) LIKE '%monongahela%'
    """)
    if len(stop_ref) > 0:
        lines.append(f"  {len(stop_ref)} historical stops found:")
        for row in stop_ref.iter_rows(named=True):
            lines.append(
                f"    stop_id={row['stop_id']}, name={row['stop_name']}, "
                f"mode={row['mode']}, first={row['first_served']}, last={row['last_served']}"
            )
    else:
        lines.append("  No incline-related stops in historical reference.")

    # 6. Conclusion
    lines.append("\n## 6. Conclusion")
    if len(otp) == 0:
        lines.append("  Neither the Monongahela Incline (MI) nor the Duquesne Incline (DQI)")
        lines.append("  has ANY entries in otp_monthly. OTP was never recorded for either")
        lines.append("  incline route in the dataset. This is a data pipeline artifact --")
        lines.append("  the inclines were included in the route catalog but excluded from")
        lines.append("  OTP measurement.")
    else:
        otp_values = otp["otp"].to_list()
        non_null = [v for v in otp_values if v is not None and v > 0]
        if not non_null:
            lines.append("  The Incline has OTP rows but all values are zero or null.")
            lines.append("  This suggests OTP measurement was set up but never populated.")
        else:
            lines.append(f"  The Incline has {len(non_null)} months with non-zero OTP data.")
            lines.append("  Further analysis needed to evaluate its performance.")

    return "\n".join(lines)


@run_analysis(9, "Incline Investigation")
def main() -> None:
    """Entry point: investigate the Incline data and produce a report."""
    with phase("Investigating"):
        report = investigate()
        print(report)

        # Save report
        report_path = OUT / "incline_report.txt"
        report_path.write_text(report, encoding="utf-8")
        print(f"\n  Report saved to {report_path}")

        # Save any OTP data as CSV
        otp = query_to_polars("SELECT * FROM otp_monthly WHERE route_id IN ('MI', 'DQI')")
        route_stops = query_to_polars("""
            SELECT rs.*, s.stop_name
            FROM route_stops rs
            LEFT JOIN stops s ON rs.stop_id = s.stop_id
            WHERE rs.route_id = 'MI'
        """)
        stop_ref = query_to_polars("""
            SELECT * FROM stop_reference
            WHERE mode = 'INCLINE' OR LOWER(stop_name) LIKE '%incline%'
        """)

        # Combine all incline data into one CSV
        all_data = []
        if len(otp) > 0:
            all_data.append(otp.with_columns(source=pl.lit("otp_monthly")))
        if len(route_stops) > 0:
            # Select compatible columns
            rs_summary = route_stops.select("route_id", "stop_id", "stop_name", "direction", "trips_wd", "trips_7d")
            save_csv(rs_summary, OUT / "incline_route_stops.csv")
        if len(stop_ref) > 0:
            save_csv(stop_ref, OUT / "incline_stop_reference.csv")

        if len(otp) > 0:
            save_csv(otp, OUT / "incline_report.csv")
        else:
            # Write empty CSV with headers
            save_csv(pl.DataFrame({"route_id": [], "month": [], "otp": []}), OUT / "incline_report.csv")


if __name__ == "__main__":
    main()

Sources

NameTypeWhy It MattersOwnerFreshnessCaveat
otp_monthly table Primary analytical table used in this page's computations. Produced by Data Ingestion. Updated when the producing pipeline step is rerun. Coverage depends on upstream source availability and ETL assumptions.
Upstream sources (5)
  • file data/routes_by_month.csv — Monthly route OTP source table in wide format.
  • file data/PRT_Current_Routes_Full_System_de0e48fcbed24ebc8b0d933e47b56682.csv — Current route metadata and mode classifications.
  • file data/Transit_stops_(current)_by_route_e040ee029227468ebf9d217402a82fa9.csv — Current stop-to-route coverage and trip counts.
  • file data/PRT_Stop_Reference_Lookup_Table.csv — Historical stop reference file with geography attributes.
  • file data/average-ridership/12bb84ed-397e-435c-8d1b-8ce543108698.csv — Average ridership by route and month.
route_stops table Primary analytical table used in this page's computations. Produced by Data Ingestion. Updated when the producing pipeline step is rerun. Coverage depends on upstream source availability and ETL assumptions.
Upstream sources (5)
  • file data/routes_by_month.csv — Monthly route OTP source table in wide format.
  • file data/PRT_Current_Routes_Full_System_de0e48fcbed24ebc8b0d933e47b56682.csv — Current route metadata and mode classifications.
  • file data/Transit_stops_(current)_by_route_e040ee029227468ebf9d217402a82fa9.csv — Current stop-to-route coverage and trip counts.
  • file data/PRT_Stop_Reference_Lookup_Table.csv — Historical stop reference file with geography attributes.
  • file data/average-ridership/12bb84ed-397e-435c-8d1b-8ce543108698.csv — Average ridership by route and month.
routes table Primary analytical table used in this page's computations. Produced by Data Ingestion. Updated when the producing pipeline step is rerun. Coverage depends on upstream source availability and ETL assumptions.
Upstream sources (5)
  • file data/routes_by_month.csv — Monthly route OTP source table in wide format.
  • file data/PRT_Current_Routes_Full_System_de0e48fcbed24ebc8b0d933e47b56682.csv — Current route metadata and mode classifications.
  • file data/Transit_stops_(current)_by_route_e040ee029227468ebf9d217402a82fa9.csv — Current stop-to-route coverage and trip counts.
  • file data/PRT_Stop_Reference_Lookup_Table.csv — Historical stop reference file with geography attributes.
  • file data/average-ridership/12bb84ed-397e-435c-8d1b-8ce543108698.csv — Average ridership by route and month.
stop_reference table Primary analytical table used in this page's computations. Produced by Data Ingestion. Updated when the producing pipeline step is rerun. Coverage depends on upstream source availability and ETL assumptions.
Upstream sources (5)
  • file data/routes_by_month.csv — Monthly route OTP source table in wide format.
  • file data/PRT_Current_Routes_Full_System_de0e48fcbed24ebc8b0d933e47b56682.csv — Current route metadata and mode classifications.
  • file data/Transit_stops_(current)_by_route_e040ee029227468ebf9d217402a82fa9.csv — Current stop-to-route coverage and trip counts.
  • file data/PRT_Stop_Reference_Lookup_Table.csv — Historical stop reference file with geography attributes.
  • file data/average-ridership/12bb84ed-397e-435c-8d1b-8ce543108698.csv — Average ridership by route and month.
stops table Primary analytical table used in this page's computations. Produced by Data Ingestion. Updated when the producing pipeline step is rerun. Coverage depends on upstream source availability and ETL assumptions.
Upstream sources (5)
  • file data/routes_by_month.csv — Monthly route OTP source table in wide format.
  • file data/PRT_Current_Routes_Full_System_de0e48fcbed24ebc8b0d933e47b56682.csv — Current route metadata and mode classifications.
  • file data/Transit_stops_(current)_by_route_e040ee029227468ebf9d217402a82fa9.csv — Current stop-to-route coverage and trip counts.
  • file data/PRT_Stop_Reference_Lookup_Table.csv — Historical stop reference file with geography attributes.
  • file data/average-ridership/12bb84ed-397e-435c-8d1b-8ce543108698.csv — Average ridership by route and month.
polars dependency Runtime dependency required for this page's pipeline or analysis code. Open-source Python ecosystem maintainers. Version pinned by project environment until dependency updates are applied. Library updates may change behavior or defaults.