Reference
Glossary
Common statistical and transit terms used throughout the analyses.
Built 2026-04-03 20:09 UTC · Commit 7c56b9a
Transit Operations
Descriptive Statistics
Correlation and Regression
| Term | Definition | Used in |
|---|---|---|
| Pearson Correlation | Linear association metric ranging from -1 to 1. | 10: Trip Frequency vs OTP, 13: Cross-Route Correlation Clustering |
| Spearman Correlation | Rank-based monotonic association metric less sensitive to nonlinearity and outliers. | 07: Stop Count vs OTP, 29 - Service Change Impact on OTP, 38 - Downtown Recovery Gap |
| Partial Correlation | Correlation between two variables after controlling for one or more covariates. | 12: Route Geographic Span vs OTP, 38 - Downtown Recovery Gap |
| Ordinary Least Squares (OLS) | Regression method that estimates coefficients by minimizing squared residuals. | 03 - Route Ranking, 18: Multivariate OTP Model, 23 - Garage-Level Performance, 26 - Ridership in Multivariate OTP Model, 27 - Traffic Congestion and OTP |
| R-squared (R²) | Fraction of outcome variance explained by a regression model. | 18: Multivariate OTP Model, 27 - Traffic Congestion and OTP, 28 - Weather Impact |
| Adjusted R-squared | R² variant that penalizes unnecessary predictors. | 18: Multivariate OTP Model, 27 - Traffic Congestion and OTP |
| Standardized Coefficient (Beta) | Regression coefficient scaled in standard deviation units for cross-predictor comparison. | 18: Multivariate OTP Model, 26 - Ridership in Multivariate OTP Model |
| Variance Inflation Factor (VIF) | Multicollinearity diagnostic that measures how much a coefficient's variance is inflated by correlation among predictors. | 18: Multivariate OTP Model, 26 - Ridership in Multivariate OTP Model, 27 - Traffic Congestion and OTP |
| Nested Model F-Test | Test comparing a restricted model to an expanded model to assess added explanatory value. | — |
| Degrees of Freedom (df) | Number of independent pieces of information remaining after estimating model parameters. | — |
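
To make the Pearson and Spearman entries concrete, here is a minimal standard-library sketch. The function names (`pearson`, `average_ranks`, `spearman`) and the toy data are illustrative assumptions, not code or numbers from the analyses. It computes Spearman as the Pearson correlation of average ranks, one common way to handle ties.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation: co-variation scaled by both spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def average_ranks(values):
    """Rank values 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = tied_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data (invented for illustration): y rises monotonically but not linearly.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(round(r_pearson, 3), round(r_spearman, 3))
```

Because the relationship is perfectly monotonic but curved, the rank-based value reaches 1 (up to floating-point rounding) while the Pearson value stays below 1.
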
Hypothesis Testing
| Term | Definition | Used in |
|---|---|---|
| P-Value | Probability of observing results at least as extreme as those seen, assuming the null hypothesis is true. | 18: Multivariate OTP Model, 20 - OTP → Ridership Causality, 23 - Garage-Level Performance, 26 - Ridership in Multivariate OTP Model, 29 - Service Change Impact on OTP |
| Confidence Interval (CI) | Interval estimate that captures plausible parameter values at a chosen confidence level. | 02 - Mode Comparison, 03 - Route Ranking |
| Paired t-Test | Mean-difference test for matched observations measured on the same units. | 02 - Mode Comparison, 19 - Ridership-Weighted OTP |
| Welch t-Test | Two-sample t-test variant that does not assume equal variances. | — |
| Mann-Whitney U Test | Non-parametric two-group comparison based on rank ordering. | 02 - Mode Comparison, 32 - Shelter Equity |
| Kruskal-Wallis Test | Non-parametric multi-group comparison using ranked observations. | 14: COVID Recovery Trajectories, 23 - Garage-Level Performance, 29 - Service Change Impact on OTP |
| Wilcoxon Signed-Rank Test | Non-parametric paired comparison based on signed ranks of differences. | — |
| Bonferroni Correction | Multiple-testing adjustment that divides the significance threshold by the number of tests (equivalently, multiplies each p-value by that number). | 20 - OTP → Ridership Causality, 30 - Service Level vs OTP Longitudinal, 38 - Downtown Recovery Gap |
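
As a rough illustration of the Bonferroni entry, the sketch below applies the adjustment to a handful of p-values. The function name `bonferroni`, the default `alpha=0.05`, and the example p-values are assumptions made up for this example, not results from any analysis.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the per-test threshold, adjusted p-values, and significance flags."""
    m = len(p_values)
    threshold = alpha / m                            # stricter cutoff per test
    adjusted = [min(1.0, p * m) for p in p_values]   # equivalent view, capped at 1
    significant = [p <= threshold for p in p_values]
    return threshold, adjusted, significant


# Hypothetical p-values from three tests run on the same question.
p_values = [0.001, 0.02, 0.04]
threshold, adjusted, significant = bonferroni(p_values)
print(threshold, adjusted, significant)
```

With three tests, only the smallest p-value clears the stricter cutoff, even though all three fall below the unadjusted 0.05 level.
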
Time Series and Forecasting Concepts
| Term | Definition | Used in |
|---|---|---|
| Rolling Mean | Moving-window average used to smooth short-term volatility. | 06: Seasonal Patterns, 28 - Weather Impact |
| Rolling Z-Score | Standardized deviation from a rolling mean used to detect anomalies. | 05 - Anomaly Investigation |
| Seasonal Decomposition | Separation of a time series into trend, seasonal, and residual components. | 06: Seasonal Patterns, 28 - Weather Impact |
| Detrending | Removal of long-run trend to isolate short-run or relative variation. | 05 - Anomaly Investigation, 06: Seasonal Patterns, 13: Cross-Route Correlation Clustering, 20 - OTP → Ridership Causality, 28 - Weather Impact, 30 - Service Level vs OTP Longitudinal |
| Lagged Cross-Correlation | Correlation of two series at offset time lags. | — |
| Granger Causality | Test of whether past values of one series improve prediction of another series. | 20 - OTP → Ridership Causality |
| Baseline Indexing | Rescaling a series so that its value in a chosen reference period equals 100, making series with different scales comparable. | — |
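
The rolling and indexing terms above can be sketched with the standard library alone. The helper names, the window sizes, and the sample series are hypothetical. This sketch also assumes the rolling z-score compares each point with the window of points before it (excluding the point itself). That is one possible convention and not necessarily the one used in the analyses.

```python
from statistics import mean, stdev


def rolling_mean(series, window):
    """Trailing moving average; None until a full window is available."""
    return [
        mean(series[i - window + 1 : i + 1]) if i >= window - 1 else None
        for i in range(len(series))
    ]


def rolling_z(series, window):
    """Z-score of each point against the preceding `window` points (assumed convention)."""
    scores = []
    for i, value in enumerate(series):
        if i < window:
            scores.append(None)
            continue
        history = series[i - window : i]
        spread = stdev(history)
        scores.append((value - mean(history)) / spread if spread > 0 else None)
    return scores


def baseline_index(series, base_pos=0):
    """Rescale so the value at base_pos equals 100."""
    base = series[base_pos]
    return [100 * value / base for value in series]


# Invented monthly on-time percentages with a sharp drop in the last month.
otp = [70.0, 72.0, 71.0, 73.0, 72.0, 55.0]
smoothed = rolling_mean(otp, 3)
z_scores = rolling_z(otp, 5)
indexed = baseline_index([50.0, 55.0, 45.0])
print(smoothed, z_scores[-1], indexed)
```

The final month sits far below its trailing window, so its z-score is strongly negative, which is the kind of point an anomaly screen would flag.
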
Clustering and Concentration
| Term | Definition | Used in |
|---|---|---|
| Hierarchical Clustering | Iterative grouping method that forms a nested tree of clusters. | 13: Cross-Route Correlation Clustering |
| Dendrogram | Tree visualization showing hierarchical cluster merges and distances. | 13: Cross-Route Correlation Clustering |
| Silhouette Score | Cluster-quality metric measuring cohesion within clusters and separation between clusters. | 13: Cross-Route Correlation Clustering |
| Gini Coefficient | Inequality metric on a 0 to 1 scale (0 = perfectly even, values near 1 = highly concentrated) used for concentration analysis. | 34 - Ridership Concentration (Pareto) |
| Lorenz Curve | Cumulative-share plot used to visualize distributional inequality. | 25 - Ridership Concentration & Equity |
| Pareto Concentration | Pattern where a small share of units accounts for a large share of outcomes. | — |
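
The three concentration terms are closely linked: the Gini coefficient can be read off the Lorenz curve, and a Pareto-style statement is a single point on it. The sketch below shows one way to compute them, assuming non-negative values with a positive total. The function names and the ridership figures are invented for illustration.

```python
def lorenz_curve(values):
    """Points (cumulative share of units, cumulative share of total), smallest first."""
    ordered = sorted(values)
    total = sum(ordered)
    points = [(0.0, 0.0)]
    running = 0.0
    for i, value in enumerate(ordered, start=1):
        running += value
        points.append((i / len(ordered), running / total))
    return points


def gini(values):
    """Gini = 1 - 2 * (area under the Lorenz curve), via the trapezoid rule."""
    points = lorenz_curve(values)
    area = sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
    return 1 - 2 * area


def top_share(values, fraction):
    """Share of the total held by the largest `fraction` of units."""
    ordered = sorted(values, reverse=True)
    k = max(1, round(fraction * len(ordered)))
    return sum(ordered[:k]) / sum(ordered)


# Hypothetical weekday ridership for ten routes.
ridership = [9000, 7000, 900, 800, 700, 500, 400, 300, 250, 150]
print(round(gini(ridership), 3), round(top_share(ridership, 0.2), 3))
```

For a finite sample in which one unit holds everything, this estimator tops out at (n - 1) / n, not exactly 1, so small samples cannot reach the upper end of the scale.
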
Data Quality and Causal Caveats
| Term | Definition | Used in |
|---|---|---|
| Simpson's Paradox | Trend in aggregated data that reverses or disappears once the data are stratified into subgroups. | 04 - Neighborhood Equity, 07: Stop Count vs OTP, 12: Route Geographic Span vs OTP |
| Regression to the Mean | Tendency of extreme observations to move closer to average on repeated measurement. | 03 - Route Ranking, 14: COVID Recovery Trajectories, 21 - COVID Ridership vs OTP Recovery |
| Ecological Fallacy | Error of inferring individual-level behavior from group-level aggregates. | 15: Municipal/County Equity, 16: Transfer Hub Performance |
| Selection Bias | Distortion caused by non-random inclusion of observations or intervention targets. | — |
| Statistical Power | Probability of detecting a true effect when it exists. | 02 - Mode Comparison, 06: Seasonal Patterns, 20 - OTP → Ridership Causality, 30 - Service Level vs OTP Longitudinal |
| Survivorship Bias | Bias introduced by observing only units that remain after attrition or filtering. | — |
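
Simpson's paradox is easiest to see with numbers. The toy example below uses two made-up groups (labelled `group_a` and `group_b`; none of these values come from the analyses) in which the within-group trend is positive but the pooled trend is negative. The helper `ols_slope` is a hypothetical name for a plain one-predictor least-squares slope.

```python
def ols_slope(x, y):
    """Least-squares slope of y on x for a single predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov / var_x


# Invented data: within each group, y rises by 1 for each unit of x.
group_a = {"x": [1, 2, 3], "y": [10, 11, 12]}
group_b = {"x": [6, 7, 8], "y": [3, 4, 5]}

slope_a = ols_slope(group_a["x"], group_a["y"])
slope_b = ols_slope(group_b["x"], group_b["y"])

# Pooling ignores group membership, so the gap between groups dominates.
pooled_x = group_a["x"] + group_b["x"]
pooled_y = group_a["y"] + group_b["y"]
slope_pooled = ols_slope(pooled_x, pooled_y)

print(slope_a, slope_b, round(slope_pooled, 3))
```

Both group slopes are positive while the pooled slope is negative, because the group with larger x values also has a lower baseline y. Stratifying by group recovers the within-group direction.
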