Reference
Sources Inventory
Unified catalog of files, APIs, tables, and library dependencies used in this project.
Built 2026-03-03 02:23 UTC · Commit defd5c8
Inventory
Includes source ownership, freshness expectations, and caveats inferred from manifests.
| Name | Type | Description | Owner | Freshness | Caveat |
|---|---|---|---|---|---|
| NOAA NCEI Daily Summaries API | api | Daily weather observations for station USW00094823 (Pittsburgh). | Hosted by www.ncei.noaa.gov. | Queried during pipeline execution; freshness depends on upstream updates. | Availability and schema can change without notice. |
| PennDOT ArcGIS Roadway Traffic Layer | api | Public roadway segment AADT and truck percentage attributes. | Hosted by gis.penndot.gov. | Queried during pipeline execution; freshness depends on upstream updates. | Availability and schema can change without notice. |
| WPRDC Pick Lookup | api | Public lookup for pick period date ranges. | Hosted by data.wprdc.org. | Queried during pipeline execution; freshness depends on upstream updates. | Availability and schema can change without notice. |
| WPRDC Schedule Monthly Aggregate | api | Public dataset of route-level monthly schedule aggregates. | Hosted by data.wprdc.org. | Queried during pipeline execution; freshness depends on upstream updates. | Availability and schema can change without notice. |
| branca | dependency | Utility library used by Folium for map templating, colormaps, and HTML components. | Open-source Python ecosystem maintainers. | Version pinned by project environment until dependency updates are applied. | Library updates may change behavior or defaults. |
| folium | dependency | Mapping library used to render interactive geospatial visualizations. | Open-source Python ecosystem maintainers. | Version pinned by project environment until dependency updates are applied. | Library updates may change behavior or defaults. |
| matplotlib | dependency | Plotting library used to generate static charts. | Open-source Python ecosystem maintainers. | Version pinned by project environment until dependency updates are applied. | Library updates may change behavior or defaults. |
| numpy | dependency | Numerical computing library for vectorized arrays and matrix operations. | Open-source Python ecosystem maintainers. | Version pinned by project environment until dependency updates are applied. | Library updates may change behavior or defaults. |
| polars | dependency | Dataframe library used for fast tabular data transformations and aggregation. | Open-source Python ecosystem maintainers. | Version pinned by project environment until dependency updates are applied. | Library updates may change behavior or defaults. |
| scipy | dependency | Scientific computing library used for statistical tests and numerical routines. | Open-source Python ecosystem maintainers. | Version pinned by project environment until dependency updates are applied. | Library updates may change behavior or defaults. |
| statsmodels | dependency | Statistical modeling library used for regression and time-series methods. | Open-source Python ecosystem maintainers. | Version pinned by project environment until dependency updates are applied. | Library updates may change behavior or defaults. |
| data/GTFS/shapes.txt | file | GTFS route shape geometry points. | Local project data owner not specified. | Snapshot file; refresh by rerunning its pipeline step. | May lag upstream source updates. |
| data/GTFS/trips.txt | file | GTFS shape-to-route mapping. | Local project data owner not specified. | Snapshot file; refresh by rerunning its pipeline step. | May lag upstream source updates. |
| data/PRT_Current_Routes_Full_System_de0e48fcbed24ebc8b0d933e47b56682.csv | file | Current route metadata and mode classifications. | Local project data owner not specified. | Snapshot file; refresh by rerunning its pipeline step. | May lag upstream source updates. |
| data/PRT_Stop_Reference_Lookup_Table.csv | file | Historical stop reference file with geography attributes. | Local project data owner not specified. | Snapshot file; refresh by rerunning its pipeline step. | May lag upstream source updates. |
| data/Transit_stops_(current)_by_route_e040ee029227468ebf9d217402a82fa9.csv | file | Current stop-to-route coverage and trip counts. | Local project data owner not specified. | Snapshot file; refresh by rerunning its pipeline step. | May lag upstream source updates. |
| data/average-ridership/12bb84ed-397e-435c-8d1b-8ce543108698.csv | file | Average ridership by route and month. | Local project data owner not specified. | Snapshot file; refresh by rerunning its pipeline step. | May lag upstream source updates. |
| data/bus-stop-usage/wprdc_stop_data.csv | file | Referenced via DATA_DIR path composition in analysis script. | Local project data owner not specified. | Snapshot file; refresh by rerunning its pipeline step. | May lag upstream source updates. |
| data/noaa-weather/daily_raw.csv | file | Cached NOAA daily observations for Pittsburgh station. | Local project data owner not specified. | Snapshot file; refresh by rerunning its pipeline step. | May lag upstream source updates. |
| data/ntd-monthly-ridership/December 2025 Complete Monthly Ridership (with adjustments and estimates)_260202.xlsx | file | NTD monthly ridership workbook containing agency metadata and UPT series. | Local project data owner not specified. | Snapshot file; refresh by rerunning its pipeline step. | May lag upstream source updates. |
| data/penndot-traffic/aadt_raw.json | file | Cached PennDOT ArcGIS feature response for Allegheny County roadway segments. | Local project data owner not specified. | Snapshot file; refresh by rerunning its pipeline step. | May lag upstream source updates. |
| data/routes_by_month.csv | file | Monthly route OTP source table in wide format. | Local project data owner not specified. | Snapshot file; refresh by rerunning its pipeline step. | May lag upstream source updates. |
| data/wprdc-schedule/paac_pick_lookup.csv | file | Pick period lookup metadata (cached copy when available). | Local project data owner not specified. | Snapshot file; refresh by rerunning its pipeline step. | May lag upstream source updates. |
| data/wprdc-schedule/schedule_monthly_agg.csv | file | Monthly route/day-type schedule aggregates (cached copy when available). | Local project data owner not specified. | Snapshot file; refresh by rerunning its pipeline step. | May lag upstream source updates. |
| ntd_agency | table | Agency dimension table keyed by NTD ID, mode, and TOS. | Produced by NTD Ridership ETL. | Updated when the producing pipeline step is rerun. | Coverage depends on upstream source availability and ETL assumptions. |
| ntd_ridership | table | Monthly UPT facts by NTD ID, mode, and TOS. | Produced by NTD Ridership ETL. | Updated when the producing pipeline step is rerun. | Coverage depends on upstream source availability and ETL assumptions. |
| otp_monthly | table | Monthly OTP values by route. | Produced by Data Ingestion. | Updated when the producing pipeline step is rerun. | Coverage depends on upstream source availability and ETL assumptions. |
| ridership_monthly | table | Monthly ridership by route and day type. | Produced by Data Ingestion. | Updated when the producing pipeline step is rerun. | Coverage depends on upstream source availability and ETL assumptions. |
| route_stops | table | Route-stop bridge with service frequency metrics. | Produced by Data Ingestion. | Updated when the producing pipeline step is rerun. | Coverage depends on upstream source availability and ETL assumptions. |
| route_traffic | table | Route-level traffic exposure metrics including weighted AADT and match quality. | Produced by Traffic Overlay ETL. | Updated when the producing pipeline step is rerun. | Coverage depends on upstream source availability and ETL assumptions. |
| routes | table | Route dimension table. | Produced by Data Ingestion. | Updated when the producing pipeline step is rerun. | Coverage depends on upstream source availability and ETL assumptions. |
| scheduled_trips_monthly | table | Monthly route/day-type scheduled trip counts and distance metrics. | Produced by Scheduled Trips ETL. | Updated when the producing pipeline step is rerun. | Coverage depends on upstream source availability and ETL assumptions. |
| stop_reference | table | Historical stop metadata and geography. | Produced by Data Ingestion. | Updated when the producing pipeline step is rerun. | Coverage depends on upstream source availability and ETL assumptions. |
| stops | table | Physical stop dimension table. | Produced by Data Ingestion. | Updated when the producing pipeline step is rerun. | Coverage depends on upstream source availability and ETL assumptions. |
| weather_monthly | table | Monthly precipitation, temperature, snowfall, and wind summary features. | Produced by Weather ETL. | Updated when the producing pipeline step is rerun. | Coverage depends on upstream source availability and ETL assumptions. |
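
The file entries above are snapshots that may lag upstream sources, so it can help to check how old each cached file is before trusting an analysis. The following is a minimal sketch of such a check, not part of the project's pipeline: the helper names `snapshot_age_days` and `stale_snapshots` and the 30-day threshold are hypothetical, and file modification time is assumed to be a reasonable proxy for when the snapshot was last refreshed.

```python
from datetime import datetime, timezone
from pathlib import Path


def snapshot_age_days(path, now=None):
    """Return a file's age in days from its modification time, or None if it is missing.

    Assumption: mtime approximates the last refresh of the snapshot.
    """
    p = Path(path)
    if not p.exists():
        return None
    now = now or datetime.now(timezone.utc)
    modified = datetime.fromtimestamp(p.stat().st_mtime, tz=timezone.utc)
    return (now - modified).total_seconds() / 86400.0


def stale_snapshots(paths, max_age_days=30.0, now=None):
    """Return (path, age_in_days) pairs for files that are missing or older than the threshold.

    The 30-day default is an illustrative value, not a project policy.
    """
    flagged = []
    for path in paths:
        age = snapshot_age_days(path, now=now)
        if age is None or age > max_age_days:
            flagged.append((str(path), age))
    return flagged


if __name__ == "__main__":
    # Paths taken from the inventory table; adjust to the local checkout.
    candidates = [
        "data/noaa-weather/daily_raw.csv",
        "data/penndot-traffic/aadt_raw.json",
        "data/wprdc-schedule/schedule_monthly_agg.csv",
    ]
    for path, age in stale_snapshots(candidates):
        status = "missing" if age is None else f"{age:.1f} days old"
        print(f"{path}: {status}")
```

A flagged file is only a prompt to rerun its pipeline step; a recent modification time does not guarantee that the upstream source itself was current when the snapshot was taken.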