Reference

Glossary

Common statistical and transit terms used throughout the analyses.

Built 2026-04-03 20:09 UTC · Commit 7c56b9a

Transit Operations

Term Definition Used in
On-Time Performance (OTP) Share of trips that meet the agency's on-time threshold, usually expressed from 0 to 1. Data Ingestion, Scheduled Trips ETL, Weather ETL, 01 - System-Wide OTP Trend, 02 - Mode Comparison, 03 - Route Ranking, 04 - Neighborhood Equity, 05 - Anomaly Investigation, 06: Seasonal Patterns, 07: Stop Count vs OTP, 08: Hot-Spot Map, 09: Incline Investigation, 10: Trip Frequency vs OTP, 11: Directional Asymmetry, 12: Route Geographic Span vs OTP, 13: Cross-Route Correlation Clustering, 14: COVID Recovery Trajectories, 15: Municipal/County Equity, 16: Transfer Hub Performance, 17: Weekend vs Weekday Service Profile, 18: Multivariate OTP Model, 19 - Ridership-Weighted OTP, 20 - OTP → Ridership Causality, 21 - COVID Ridership vs OTP Recovery, 22 - Passenger-Weighted Delay Burden, 23 - Garage-Level Performance, 24 - Weekday vs Weekend Ridership Trends, 25 - Ridership Concentration & Equity, 26 - Ridership in Multivariate OTP Model, 27 - Traffic Congestion and OTP, 28 - Weather Impact, 29 - Service Change Impact on OTP, 30 - Service Level vs OTP Longitudinal, 31 - Stop Consolidation Candidates, 32 - Shelter Equity, 34 - Ridership Concentration (Pareto), 35 - Boarding/Alighting Flow Analysis, 40 - Peer City Dashboard
Trip-Weighted OTP OTP average where each route is weighted by its scheduled trip volume.
Ridership-Weighted OTP OTP average where each route is weighted by passenger volume rather than trip count. 19 - Ridership-Weighted OTP, 26 - Ridership in Multivariate OTP Model
Day Type Service category such as weekday, Saturday, or Sunday that affects schedule and demand patterns. Scheduled Trips ETL, 01 - System-Wide OTP Trend, 24 - Weekday vs Weekend Ridership Trends, 29 - Service Change Impact on OTP, 31 - Stop Consolidation Candidates
Route Span Maximum geographic distance covered by a route across its served stops. 12: Route Geographic Span vs OTP
Busway Route Route operating on dedicated right-of-way for part of its alignment. 02 - Mode Comparison
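
The difference between trip-weighted and ridership-weighted OTP can be sketched as follows. This is a minimal illustration, not the analyses' actual code: the route names, OTP values, trip counts, and rider counts are invented, and `weighted_otp` is a hypothetical helper.

```python
# Hypothetical per-route figures, invented for illustration only.
routes = [
    # (route, otp, scheduled_trips, riders)
    ("A", 0.80, 100, 5000),
    ("B", 0.60, 300, 1000),
]


def weighted_otp(rows, weight_index):
    """Weighted average of OTP (column 1) using the column at weight_index."""
    total_weight = sum(row[weight_index] for row in rows)
    return sum(row[1] * row[weight_index] for row in rows) / total_weight


trip_weighted = weighted_otp(routes, 2)       # weights = scheduled trips
ridership_weighted = weighted_otp(routes, 3)  # weights = riders

print(f"trip-weighted OTP:      {trip_weighted:.3f}")
print(f"ridership-weighted OTP: {ridership_weighted:.3f}")
```

In this toy case the two averages differ because route B runs most of the trips while route A carries most of the riders, so the choice of weight determines which route dominates the system figure.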

Descriptive Statistics

Term Definition Used in
Mean Arithmetic average of values, computed as sum divided by count. 01 - System-Wide OTP Trend, 02 - Mode Comparison, 03 - Route Ranking, 04 - Neighborhood Equity, 05 - Anomaly Investigation, 06: Seasonal Patterns, 08: Hot-Spot Map, 13: Cross-Route Correlation Clustering, 14: COVID Recovery Trajectories, 16: Transfer Hub Performance, 17: Weekend vs Weekday Service Profile, 19 - Ridership-Weighted OTP, 20 - OTP → Ridership Causality, 21 - COVID Ridership vs OTP Recovery, 23 - Garage-Level Performance, 24 - Weekday vs Weekend Ridership Trends, 25 - Ridership Concentration & Equity, 28 - Weather Impact, 29 - Service Change Impact on OTP, 30 - Service Level vs OTP Longitudinal, 32 - Shelter Equity, 33 - Pandemic Ridership Geography, 36 - National Ridership Growth (2019 vs 2024)
Median Middle value of an ordered distribution, robust to extreme outliers. 02 - Mode Comparison, 03 - Route Ranking, 13: Cross-Route Correlation Clustering, 14: COVID Recovery Trajectories, 15: Municipal/County Equity, 16: Transfer Hub Performance, 19 - Ridership-Weighted OTP, 20 - OTP → Ridership Causality, 21 - COVID Ridership vs OTP Recovery, 24 - Weekday vs Weekend Ridership Trends, 27 - Traffic Congestion and OTP, 31 - Stop Consolidation Candidates, 32 - Shelter Equity, 33 - Pandemic Ridership Geography, 34 - Ridership Concentration (Pareto), 35 - Boarding/Alighting Flow Analysis, 36 - National Ridership Growth (2019 vs 2024), 37 - Peer City Ridership Comparison, 38 - Downtown Recovery Gap, 39 — National Service Cuts (2019 vs 2024), 40 - Peer City Dashboard
Standard Deviation (SD) Dispersion measure showing how far values tend to deviate from the mean. 03 - Route Ranking, 05 - Anomaly Investigation
Variance Average squared deviation from the mean; equal to the square of the standard deviation. 03 - Route Ranking, 12: Route Geographic Span vs OTP, 18: Multivariate OTP Model, 23 - Garage-Level Performance, 26 - Ridership in Multivariate OTP Model, 27 - Traffic Congestion and OTP, 28 - Weather Impact, 29 - Service Change Impact on OTP, 30 - Service Level vs OTP Longitudinal, 34 - Ridership Concentration (Pareto), 38 - Downtown Recovery Gap
Percentile Value below which a given percentage of observations falls. 25 - Ridership Concentration & Equity
Interquartile Range (IQR) Difference between the 75th and 25th percentiles, capturing the middle half of data. 20 - OTP → Ridership Causality, 36 - National Ridership Growth (2019 vs 2024)
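
The descriptive statistics above can be computed with Python's standard `statistics` module. The sketch below uses invented OTP values and assumes sample (n - 1) formulas for variance and standard deviation and the "inclusive" quantile method; the analyses themselves may use different conventions.

```python
import statistics

# Hypothetical route-level OTP values, invented for illustration.
values = [0.55, 0.62, 0.66, 0.70, 0.71, 0.74, 0.95]

mean = statistics.fmean(values)
median = statistics.median(values)
sd = statistics.stdev(values)            # sample SD (n - 1 denominator)
variance = statistics.variance(values)   # sample variance, equals sd ** 2

# Quartile cut points; "inclusive" treats the data as the full population range.
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1                            # spread of the middle half of the data

print(f"mean={mean:.3f} median={median:.3f} sd={sd:.3f} iqr={iqr:.3f}")
```

The single high value (0.95) pulls the mean above the median, which is the sense in which the median is robust to outliers.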

Correlation and Regression

Term Definition Used in
Pearson Correlation Linear association metric ranging from -1 to 1. 10: Trip Frequency vs OTP, 13: Cross-Route Correlation Clustering
Spearman Correlation Rank-based measure of monotonic association, less sensitive than Pearson correlation to nonlinearity and outliers. 07: Stop Count vs OTP, 29 - Service Change Impact on OTP, 38 - Downtown Recovery Gap
Partial Correlation Correlation between two variables after controlling for one or more covariates. 12: Route Geographic Span vs OTP, 38 - Downtown Recovery Gap
Ordinary Least Squares (OLS) Regression method that estimates coefficients by minimizing squared residuals. 03 - Route Ranking, 18: Multivariate OTP Model, 23 - Garage-Level Performance, 26 - Ridership in Multivariate OTP Model, 27 - Traffic Congestion and OTP
R-squared (R²) Fraction of outcome variance explained by a regression model. 18: Multivariate OTP Model, 27 - Traffic Congestion and OTP, 28 - Weather Impact
Adjusted R-squared R² variant that penalizes unnecessary predictors. 18: Multivariate OTP Model, 27 - Traffic Congestion and OTP
Standardized Coefficient (Beta) Regression coefficient scaled in standard deviation units for cross-predictor comparison. 18: Multivariate OTP Model, 26 - Ridership in Multivariate OTP Model
Variance Inflation Factor (VIF) Diagnostic for multicollinearity among predictors. 18: Multivariate OTP Model, 26 - Ridership in Multivariate OTP Model, 27 - Traffic Congestion and OTP
Nested Model F-Test Test comparing a restricted model to an expanded model to assess added explanatory value.
Degrees of Freedom (df) Number of independent pieces of information remaining after estimating model parameters.
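
To make the Pearson/Spearman distinction and the adjusted R-squared penalty concrete, here is a small standard-library sketch. The function names are hypothetical, the data are synthetic, and the adjusted R-squared formula shown is the usual textbook form for a model with an intercept.

```python
import math


def pearson(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def adjusted_r_squared(r2, n, p):
    """Adjusted R-squared for n observations and p predictors (plus intercept)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear relationship

print(f"pearson={pearson(x, y):.3f} spearman={spearman(x, y):.3f}")
print(f"adjusted R2 for R2=0.5, n=11, p=2: {adjusted_r_squared(0.5, 11, 2):.3f}")
```

Because `y` rises monotonically with `x`, the rank-based measure reports a perfect association while the linear measure does not.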

Hypothesis Testing

Term Definition Used in
P-Value Probability of observing results at least this extreme under the null hypothesis. 18: Multivariate OTP Model, 20 - OTP → Ridership Causality, 23 - Garage-Level Performance, 26 - Ridership in Multivariate OTP Model, 29 - Service Change Impact on OTP
Confidence Interval (CI) Interval estimate that captures plausible parameter values at a chosen confidence level. 02 - Mode Comparison, 03 - Route Ranking
Paired t-Test Mean-difference test for matched observations measured on the same units. 02 - Mode Comparison, 19 - Ridership-Weighted OTP
Welch t-Test Two-sample t-test variant that does not assume equal variances.
Mann-Whitney U Test Non-parametric two-group comparison based on rank ordering. 02 - Mode Comparison, 32 - Shelter Equity
Kruskal-Wallis Test Non-parametric multi-group comparison using ranked observations. 14: COVID Recovery Trajectories, 23 - Garage-Level Performance, 29 - Service Change Impact on OTP
Wilcoxon Signed-Rank Test Non-parametric paired comparison based on signed ranks of differences.
Bonferroni Correction Multiple-testing adjustment that divides the significance threshold by the number of tests (equivalently, multiplies each p-value by that number). 20 - OTP → Ridership Causality, 30 - Service Level vs OTP Longitudinal, 38 - Downtown Recovery Gap
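
As a brief illustration of the Bonferroni correction, the sketch below divides a significance level by the number of tests and flags which p-values pass. The p-values are invented and `bonferroni` is a hypothetical helper, not a function from the analyses.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the per-test threshold and a pass/fail flag for each p-value."""
    threshold = alpha / len(p_values)
    return threshold, [p <= threshold for p in p_values]


# Hypothetical p-values from four separate tests.
p_values = [0.001, 0.012, 0.030, 0.200]
threshold, significant = bonferroni(p_values)

print(f"per-test threshold: {threshold}")
print(f"significant after correction: {significant}")
```

Note that 0.030 would pass an uncorrected 0.05 threshold but fails the corrected one, which is the intended conservatism when many tests are run.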

Time Series and Forecasting Concepts

Term Definition Used in
Rolling Mean Moving-window average used to smooth short-term volatility. 06: Seasonal Patterns, 28 - Weather Impact
Rolling Z-Score Deviation of a value from its rolling mean, divided by the rolling standard deviation; used to detect anomalies. 05 - Anomaly Investigation
Seasonal Decomposition Separation of a time series into trend, seasonal, and residual components. 06: Seasonal Patterns, 28 - Weather Impact
Detrending Removal of long-run trend to isolate short-run or relative variation. 05 - Anomaly Investigation, 06: Seasonal Patterns, 13: Cross-Route Correlation Clustering, 20 - OTP → Ridership Causality, 28 - Weather Impact, 30 - Service Level vs OTP Longitudinal
Lagged Cross-Correlation Correlation of two series at offset time lags.
Granger Causality Test of whether past values of one series improve prediction of another series. 20 - OTP → Ridership Causality
Baseline Indexing Rescaling a series so that its value in a reference period equals 100, making different series comparable.
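
The rolling mean, rolling z-score, and baseline indexing entries can be sketched with the standard library. The window conventions here are assumptions: the rolling mean uses a trailing window that includes the current point, and the z-score compares each point with the preceding window only. The helper names and the monthly OTP series are hypothetical.

```python
import statistics


def rolling_mean(series, window):
    """Trailing moving average; None until a full window is available."""
    out = []
    for i in range(len(series)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(statistics.fmean(series[i + 1 - window : i + 1]))
    return out


def rolling_z(series, window):
    """Z-score of each point against the preceding `window` points."""
    out = []
    for i in range(len(series)):
        if i < window:
            out.append(None)
            continue
        past = series[i - window : i]
        sd = statistics.stdev(past)
        # A flat window has no spread, so the z-score is undefined there.
        out.append((series[i] - statistics.fmean(past)) / sd if sd > 0 else None)
    return out


def baseline_index(series, base_position):
    """Rescale so the value at base_position becomes 100."""
    base = series[base_position]
    return [100 * v / base for v in series]


# Hypothetical monthly OTP values with a sharp drop in the last month.
otp = [0.70, 0.72, 0.71, 0.69, 0.70, 0.55]

print(rolling_mean(otp, 3))
print(rolling_z(otp, 3))
print(baseline_index([50, 100, 75], 0))
```

The final month sits far below its recent history, so its z-score is strongly negative and would be flagged under any common anomaly cutoff.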

Clustering and Concentration

Term Definition Used in
Hierarchical Clustering Iterative grouping method that forms a nested tree of clusters. 13: Cross-Route Correlation Clustering
Dendrogram Tree visualization showing hierarchical cluster merges and distances. 13: Cross-Route Correlation Clustering
Silhouette Score Cluster-quality metric measuring cohesion within clusters and separation between clusters. 13: Cross-Route Correlation Clustering
Gini Coefficient Inequality metric ranging from 0 (perfectly even distribution) to 1 (maximal concentration), used for concentration analysis. 34 - Ridership Concentration (Pareto)
Lorenz Curve Cumulative-share plot used to visualize distributional inequality. 25 - Ridership Concentration & Equity
Pareto Concentration Pattern where a small share of units accounts for a large share of outcomes.
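
The concentration measures above are closely related: the Gini coefficient can be read off the Lorenz curve, and a Pareto pattern shows up as a large top share. The following is a minimal sketch under the assumption of non-negative values with a positive total; the helper names and ridership figures are hypothetical.

```python
def lorenz_points(values):
    """Cumulative (unit share, value share) pairs, starting at (0, 0)."""
    ordered = sorted(values)
    total = sum(ordered)
    points = [(0.0, 0.0)]
    running = 0.0
    for i, v in enumerate(ordered, start=1):
        running += v
        points.append((i / len(ordered), running / total))
    return points


def gini(values):
    """Gini from the trapezoid area under the Lorenz curve: G = 1 - 2 * area."""
    pts = lorenz_points(values)
    area = sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(pts, pts[1:]))
    return 1 - 2 * area


def top_share(values, fraction):
    """Share of the total held by the top `fraction` of units (at least one)."""
    ordered = sorted(values, reverse=True)
    k = max(1, round(fraction * len(ordered)))
    return sum(ordered[:k]) / sum(ordered)


# Hypothetical ridership by route: one route dominates.
ridership = [1, 1, 1, 1, 16]

print(f"gini={gini(ridership):.3f}")
print(f"top 20% share={top_share(ridership, 0.2):.2f}")
```

With a finite number of units this discrete Gini tops out at (n - 1) / n rather than exactly 1, so small samples never reach the upper end of the scale.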

Data Quality and Causal Caveats

Term Definition Used in
Simpson's Paradox Situation in which a trend seen in aggregated data weakens, disappears, or reverses once the data are stratified. 04 - Neighborhood Equity, 07: Stop Count vs OTP, 12: Route Geographic Span vs OTP
Regression to the Mean Tendency of extreme observations to move closer to average on repeated measurement. 03 - Route Ranking, 14: COVID Recovery Trajectories, 21 - COVID Ridership vs OTP Recovery
Ecological Fallacy Error of inferring individual-level behavior from group-level aggregates. 15: Municipal/County Equity, 16: Transfer Hub Performance
Selection Bias Distortion caused by non-random inclusion of observations or intervention targets.
Statistical Power Probability of detecting a true effect when it exists. 02 - Mode Comparison, 06: Seasonal Patterns, 20 - OTP → Ridership Causality, 30 - Service Level vs OTP Longitudinal
Survivorship Bias Bias introduced by observing only units that remain after attrition or filtering.
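
Simpson's Paradox is easiest to see with numbers. The sketch below uses invented on-time counts for two hypothetical route groups, "A" and "B", split into short and long routes; none of these figures come from the analyses.

```python
# (on_time_trips, total_trips) per stratum and group; all counts are hypothetical.
data = {
    "short": {"A": (81, 87), "B": (234, 270)},
    "long": {"A": (192, 263), "B": (55, 80)},
}


def stratum_rate(stratum, group):
    """OTP of one group within one stratum."""
    on_time, total = data[stratum][group]
    return on_time / total


def pooled_rate(group):
    """OTP of one group after pooling all strata together."""
    on_time = sum(data[s][group][0] for s in data)
    total = sum(data[s][group][1] for s in data)
    return on_time / total


for stratum in data:
    print(stratum, f"A={stratum_rate(stratum, 'A'):.3f}", f"B={stratum_rate(stratum, 'B'):.3f}")
print("pooled", f"A={pooled_rate('A'):.3f}", f"B={pooled_rate('B'):.3f}")
```

Group A has the higher OTP within both strata, yet group B looks better once the strata are pooled, because most of A's trips fall in the harder long-route stratum.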