Pipeline
Scheduled Trips ETL
Coverage: 2012-03 to 2026-02 (from schedule_periods, scheduled_trips_monthly).
Built 2026-03-03 02:23 UTC · Commit defd5c8
Data Provenance
flowchart LR
02_scheduled_trips(["Scheduled Trips ETL"])
f1_02_scheduled_trips[/"data/wprdc-schedule/schedule_monthly_agg.csv"/] --> 02_scheduled_trips
f2_02_scheduled_trips[/"data/wprdc-schedule/paac_pick_lookup.csv"/] --> 02_scheduled_trips
a1_02_scheduled_trips{"WPRDC Schedule Monthly Aggregate"} --> 02_scheduled_trips
a2_02_scheduled_trips{"WPRDC Pick Lookup"} --> 02_scheduled_trips
t_routes[("routes")] --> 02_scheduled_trips
01_data_ingestion[["Data Ingestion"]] --> t_routes
02_scheduled_trips --> tp_scheduled_trips_monthly[("scheduled_trips_monthly")]
02_scheduled_trips --> tp_schedule_periods[("schedule_periods")]
classDef page fill:#dbeafe,stroke:#1d4ed8,color:#1e3a8a,stroke-width:2px;
classDef table fill:#ecfeff,stroke:#0e7490,color:#164e63;
classDef dep fill:#fff7ed,stroke:#c2410c,color:#7c2d12,stroke-dasharray: 4 2;
classDef file fill:#eef2ff,stroke:#6366f1,color:#3730a3;
classDef api fill:#f0fdf4,stroke:#16a34a,color:#14532d;
classDef pipeline fill:#f5f3ff,stroke:#7c3aed,color:#4c1d95;
class 02_scheduled_trips page;
class t_routes,tp_schedule_periods,tp_scheduled_trips_monthly table;
class f1_02_scheduled_trips,f2_02_scheduled_trips file;
class a1_02_scheduled_trips,a2_02_scheduled_trips api;
class 01_data_ingestion pipeline;
Findings
Findings: Scheduled Trips ETL
Summary
Scheduled trip and pick-period tables are loaded into prt.db for the months that overlap with OTP coverage.
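The overlap restriction can be illustrated with a minimal sketch. It assumes month keys are strings of the form YYYY-MM; the actual key format and the source of the OTP month list are not stated on this page, and both helper names are hypothetical.

```python
def overlap_months(schedule_months, otp_months):
    """Return the sorted month keys present in both inputs."""
    return sorted(set(schedule_months) & set(otp_months))


def month_diagnostics(schedule_months, otp_months):
    """Count months shared by both sides and months unique to each side."""
    sched, otp = set(schedule_months), set(otp_months)
    return {
        "overlap": len(sched & otp),
        "schedule_only": len(sched - otp),
        "otp_only": len(otp - sched),
    }
```

A summary like the one returned by `month_diagnostics` is one plausible shape for the overlap diagnostics emitted at run time, not necessarily the one the pipeline uses.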
Notes
- Route matching and overlap diagnostics are emitted during execution.
- Cached files under data/wprdc-schedule/ are used when available.
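The cache-first behavior described in the notes might look roughly like the sketch below. The function name and the `fetch` callable are hypothetical; how the pipeline actually downloads from WPRDC, and when it refreshes the cache, is not documented here.

```python
from pathlib import Path


def load_cached_or_fetch(cache_path, fetch):
    """Return CSV text from cache_path, calling fetch() only on a cache miss.

    `fetch` is a hypothetical zero-argument callable that returns CSV text.
    """
    path = Path(cache_path)
    if path.exists():
        # Cache hit: no network access needed.
        return path.read_text(encoding="utf-8")
    # Cache miss: download, then store a copy for later runs.
    text = fetch()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return text
```

Passing the downloader in as a callable keeps the sketch independent of any particular HTTP client or WPRDC endpoint.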
Methods
Methods: Scheduled Trips ETL
Question
How do we add monthly service-level schedule data needed for longitudinal service and causality analyses?
Approach
- Fetch or read cached WPRDC schedule exports.
- Normalize route IDs, month keys, day type, and schedule period fields.
- Deduplicate overlapping schedule periods per route/month/day type.
- Rebuild scheduled_trips_monthly and schedule_periods in prt.db.
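The normalize, deduplicate, and rebuild steps above can be sketched with the standard library alone. Everything specific here is an assumption: the column names (`route_id`, `month`, `day_type`, `pick_id`, `trips`), the normalization rules, and the tie-break that keeps the row with the highest pick ID. The page does not document the real schema or which overlapping period wins.

```python
import sqlite3


def normalize_row(row):
    """Normalize one raw record (column names are hypothetical)."""
    return {
        "route_id": row["route_id"].strip().upper(),
        "month": row["month"].strip()[:7],  # 'YYYY-MM-DD' -> 'YYYY-MM'
        "day_type": row["day_type"].strip().lower(),
        "pick_id": int(row["pick_id"]),
        "trips": int(row["trips"]),
    }


def dedupe(rows):
    """Keep one row per (route, month, day type).

    Assumption: when schedule periods overlap, the highest pick_id wins.
    """
    best = {}
    for r in rows:
        key = (r["route_id"], r["month"], r["day_type"])
        if key not in best or r["pick_id"] > best[key]["pick_id"]:
            best[key] = r
    return list(best.values())


def rebuild_table(conn, rows):
    """Drop and recreate scheduled_trips_monthly, then insert all rows."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DROP TABLE IF EXISTS scheduled_trips_monthly")
        conn.execute(
            "CREATE TABLE scheduled_trips_monthly ("
            "route_id TEXT, month TEXT, day_type TEXT, "
            "pick_id INTEGER, trips INTEGER, "
            "PRIMARY KEY (route_id, month, day_type))"
        )
        conn.executemany(
            "INSERT INTO scheduled_trips_monthly VALUES "
            "(:route_id, :month, :day_type, :pick_id, :trips)",
            rows,
        )
```

Dropping and recreating the table makes the step safe to rerun, and the primary key would surface any duplicate that slipped past deduplication as an insert error.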
Data
- WPRDC monthly schedule aggregates (schedule_monthly_agg.csv)
- WPRDC pick lookup (paac_pick_lookup.csv)
- Route IDs from routes table in prt.db
Output
- scheduled_trips_monthly table in data/prt.db
- schedule_periods table in data/prt.db
Tables Produced
| Table | Description |
|---|---|
| scheduled_trips_monthly | Monthly route/day-type scheduled trip counts and distance metrics. |
| schedule_periods | Pick period start/end dates keyed by pick ID. |
Sources
| Name | Type | Why It Matters | Owner | Freshness | Caveat |
|---|---|---|---|---|---|
| data/wprdc-schedule/schedule_monthly_agg.csv | file | Monthly route/day-type schedule aggregates (cached copy when available). | Local project data owner not specified. | Snapshot file; refresh by rerunning its pipeline step. | May lag upstream source updates. |
| data/wprdc-schedule/paac_pick_lookup.csv | file | Pick period lookup metadata (cached copy when available). | Local project data owner not specified. | Snapshot file; refresh by rerunning its pipeline step. | May lag upstream source updates. |
| WPRDC Schedule Monthly Aggregate | api | Public dataset of route-level monthly schedule aggregates. | Hosted by data.wprdc.org. | Queried during pipeline execution; freshness depends on upstream updates. | Availability and schema can change without notice. |
| WPRDC Pick Lookup | api | Public lookup for pick period date ranges. | Hosted by data.wprdc.org. | Queried during pipeline execution; freshness depends on upstream updates. | Availability and schema can change without notice. |
| routes | table | Primary analytical table used in this page's computations. | Produced by Data Ingestion. | Updated when the producing pipeline step is rerun. | Coverage depends on upstream source availability and ETL assumptions. |