All Through the Town

This week vs. last week

No full weeks yet.

Automatic collection started mid-week on Apr 22, 2026, so the current ISO week will never have seven full days of data. The first complete Monday–Sunday week is Apr 27 – May 3, 2026, which is rolled up on the morning of Mon, May 4, 2026. Partial weeks appear in the table below with an asterisk.

Week by week

Every row shows the underlying sample size and date range — nothing is hidden.

ISO week	Dates	Days	Snapshots	Speed (mph)	Wait (min)	Reliab. (%)	Bunch / snap	20+ gaps	30+ gaps	Buses (avg)	Buses (peak)	Routes
No weekly rows yet.

Data collection — what's captured and what to be skeptical of

Every metric the live page reports is captured at the daily, weekly, and monthly level — nothing is computed only at view time. Specifically:

System-wide: avg speed, avg rider wait, % reliability, bunching events per snapshot, routes-per-snapshot with 20+ and 30+ min gaps, active bus count (avg / peak / min), total snapshots, total bunching events, total routes seen.
Per hour of day (0–23): avg speed, avg wait, avg buses, bunch pairs/snap, big-gap counts. Lets us compare rush hour vs. off-peak.
Per borough (M, B, Bx, Q, S, X-express): avg speed, avg wait, avg buses, bunch pairs/snap, route count.
Per route, every day: avg speed, avg wait, median & 90th-percentile gap, max observed gap, bunching events (per-hour buckets), reliability, big-gap counts, avg/min/max active buses.
Weekday vs weekend sub-aggregates inside every weekly and monthly row.

Where the data lives (all viewable on GitHub):

data/snapshots/YYYY-MM-DD.jsonl — raw snapshots (one per 30-second poll within each hourly run), one JSON line per snapshot, ~1,500 vehicles each. Never deleted.
data/daily/YYYY-MM-DD.json — full daily roll-up including the per-route, per-hour, and per-borough slices above.
data/summary/weekly.json & monthly.json — system-level period rollups (the source for the table above).
data/summary/weekly-routes.json & monthly-routes.json — per-route history keyed by route shortname.
data/summary/latest.json — convenience file with current/last-week/last-month plus 12 weeks & 12 months of context.
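
For readers who want to inspect the raw files, here is a minimal sketch of reading one day's JSONL into memory. It is not the collector's own code, and the field names in the sample (`ts`, `vehicles`, `route`, `lat`, `lon`) are hypothetical stand-ins for whatever schema the collector actually writes.

```javascript
// Parse the text of a data/snapshots/YYYY-MM-DD.jsonl file into an array of
// snapshot objects, one per non-empty line. Lines that fail to parse are
// skipped and counted instead of aborting the whole file.
function parseSnapshots(jsonlText) {
  const snapshots = [];
  let skipped = 0;
  for (const line of jsonlText.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    try {
      snapshots.push(JSON.parse(trimmed));
    } catch {
      skipped += 1;
    }
  }
  return { snapshots, skipped };
}

// Hypothetical two-snapshot file; `ts` and `vehicles` are assumed field names.
const sample = [
  '{"ts":"2026-04-27T10:00:00Z","vehicles":[{"route":"M15","lat":40.73,"lon":-73.98}]}',
  '{"ts":"2026-04-27T10:00:30Z","vehicles":[]}',
  "",
].join("\n");

const parsed = parseSnapshots(sample);
console.log(parsed.snapshots.length, parsed.skipped);
```

Counting skipped lines rather than throwing keeps a single truncated write from hiding the rest of the day's data.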

How collection actually runs: a GitHub Actions workflow fires once at the top of every hour from 6 AM to 11 PM ET. Inside each run, the collector polls the MTA API 18 times at 30-second intervals, so each hour gets a dense burst of evenly spaced samples spanning about 8.5 minutes (~306 snapshots per day). The hourly cron is reliable in a way short-interval crons aren't, and the in-run loop gets us the 30-second cadence the speed math actually needs. Cost: $0 (public repos get unlimited Actions minutes).
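
The in-run loop described above could look roughly like the sketch below. It is an illustration under stated assumptions, not the actual collector: `fetchSnapshot` and `append` are hypothetical callbacks standing in for the MTA API request and the JSONL write, and the 17 runs per day is inferred from 306 ÷ 18.

```javascript
// Offsets (ms from run start) at which polls fire: 0, 30 s, 60 s, ...
function pollOffsetsMs(pollsPerRun = 18, intervalMs = 30000) {
  return Array.from({ length: pollsPerRun }, (_, i) => i * intervalMs);
}

// Expected snapshots per day if every hourly run completes.
function expectedSnapshotsPerDay(runsPerDay, pollsPerRun = 18) {
  return runsPerDay * pollsPerRun;
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// One run: call `fetchSnapshot` (hypothetical) at each offset and pass the
// result to `append` (hypothetical). A failed poll is logged and skipped,
// so it leaves a gap in the data rather than a made-up sample.
async function runCollection(fetchSnapshot, append, opts = {}) {
  const { pollsPerRun = 18, intervalMs = 30000 } = opts;
  const start = Date.now();
  for (const offset of pollOffsetsMs(pollsPerRun, intervalMs)) {
    // Schedule against the run start so slow requests do not accumulate drift.
    const wait = start + offset - Date.now();
    if (wait > 0) await sleep(wait);
    try {
      append(await fetchSnapshot());
    } catch (err) {
      console.error("poll failed:", err);
    }
  }
}

const offsets = pollOffsetsMs();
console.log(expectedSnapshotsPerDay(17)); // 306
console.log(offsets[offsets.length - 1] / 60000); // 8.5 (minutes from first to last poll)
```

Timing each poll against the run's start time, rather than sleeping a fixed 30 seconds after each request, keeps the spacing even when individual API calls are slow.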

Known limitations — read these numbers with appropriate skepticism:

Speed uses straight-line (haversine) distance between consecutive samples. At 30-second intervals this is close to actual road distance for most routes (a bus rarely makes a sharp turn within 30s), but it still slightly underestimates distance, and therefore speed, on winding routes. The MTA's official segment-speed methodology uses route-geometry distance between scheduled timepoints. We approximate that by snapping to the nearest along-route point when the route shape is loaded, and falling back to haversine when it isn't.
Big-gap and wait-time numbers are observational. They use the same queuing-theory formula and 20+/30+ thresholds the live page uses, but inter-bus gaps are computed by sorting buses along the route's dominant axis and dividing distance by observed speed — not by reading actual stop arrival times. Use these for trend tracking within this dataset; don't compare absolute numbers to MTA-published statistics.
No GTFS schedule data. Reliability means ">1 bus on the route this snapshot," not "on time vs. schedule." Wait time is the queuing formula applied to observed gaps, not the "additional bus stop time" the MTA reports.
Bunching uses 250 m proximity, not headway percentage. This is a different definition from the NYC Comptroller's "actual headway < 25% of scheduled."
Workflow gaps still happen. If GitHub Actions experiences an outage or a runner hangs, an hourly window may produce zero samples. Such hours are simply absent from the day's data; nothing is interpolated. The "Snapshots" column in the table above shows the actual sample count so you can see when collection was thin.
Collection started 2026-04-22, mid-week. The first complete ISO week (Mon–Sun) is Apr 27 – May 3, rolled up the morning of May 4, 2026. Days before the dense-sampling workflow change (commits before late Apr 25, 2026) used a `*/5` cron that GitHub throttled heavily — treat speed and gap numbers from those early days as unreliable.
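
To make the haversine fallback concrete, the sketch below computes speed between two consecutive samples of one bus and applies the ≥ 0.5 and < 60 mph filter used in the daily roll-up. The sample shape (`lat`, `lon`, `tMs`) is an assumption for illustration, and route-shape snapping is not shown.

```javascript
const EARTH_RADIUS_M = 6371000; // mean Earth radius in meters
const METERS_PER_MILE = 1609.344;

const toRad = (deg) => (deg * Math.PI) / 180;

// Great-circle (haversine) distance in meters between two lat/lon points.
function haversineMeters(a, b) {
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(h));
}

// Speed in mph between two samples of the same bus. Returns null when time
// did not advance or the result falls outside the accepted range
// (>= 0.5 and < 60 mph), so stopped buses and GPS jumps are dropped.
function speedMph(prev, curr) {
  const seconds = (curr.tMs - prev.tMs) / 1000;
  if (seconds <= 0) return null;
  const miles = haversineMeters(prev, curr) / METERS_PER_MILE;
  const mph = miles / (seconds / 3600);
  return mph >= 0.5 && mph < 60 ? mph : null;
}

// Two hypothetical samples 30 seconds apart, about 111 m north of each other.
const first = { lat: 40.75, lon: -73.99, tMs: 0 };
const second = { lat: 40.751, lon: -73.99, tMs: 30000 };
console.log(speedMph(first, second)); // a little over 8 mph
```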

If something looks wrong, open an issue. The audit trail is the raw JSONL plus the daily JSON, both public and dated, and the math is in collector/process.js and collector/rollup.js.

How these numbers are computed

Once an hour between 6 AM and 11 PM ET, a GitHub Action polls the MTA BusTime SIRI API 18 times at 30-second intervals, pulling every active bus on each poll and appending one JSON line per poll to data/snapshots/YYYY-MM-DD.jsonl. Each line captures position, heading, route, direction, phase, and timestamp for ~1,500 vehicles.

Daily roll-up (process.js, runs 1 AM ET) captures every metric the live page shows, plus slices for later analysis:

Avg speed — per-bus speeds between consecutive snapshots (haversine distance ÷ elapsed time, ≥ 0.5 and < 60 mph). Per-route mean, then unweighted mean of route means — so the M15 doesn't dominate the system number.
Avg rider wait — for every snapshot, sort same-route same-direction buses along their dominant axis and convert the inter-bus distance into minutes using that route's observed speed (cap 60 min). Wait = E[gap²] ÷ (2 · E[gap]) over the day's gap distribution. Same formula as the live page.
Bunching — same-route same-direction bus pairs within 250 m. Stored as both a per-snapshot rate and total events.
Big gaps (20+ / 30+ min) — routes-per-snapshot whose maximum inter-bus gap meets the threshold.
Active buses — average and peak vehicle count across the day's snapshots.
Reliability — share of snapshots where a route has more than one bus running.
Slices captured for later use: per-hour (24 buckets) speed/wait/bunching/buses/big-gaps; per-borough (M, B, Bx, Q, S, X) the same metrics; per-route history; weekday vs weekend tagged.
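
Three of the definitions above are simple enough to sketch directly. This is a hedged illustration, not the code in process.js: the function names and input shapes are hypothetical, and the inputs (per-bus speeds grouped by route, gaps in minutes, bus counts per snapshot) are assumed to have been computed already.

```javascript
const mean = (xs) => xs.reduce((sum, x) => sum + x, 0) / xs.length;

// System speed: unweighted mean of per-route means, so each route counts
// once no matter how many buses it runs.
function systemSpeedMph(speedsByRoute) {
  const routeMeans = Object.values(speedsByRoute)
    .filter((speeds) => speeds.length > 0)
    .map((speeds) => mean(speeds));
  return routeMeans.length > 0 ? mean(routeMeans) : null;
}

// Expected rider wait in minutes: E[gap^2] / (2 * E[gap]), with each gap
// capped at 60 minutes first. Dividing both expectations by the same count
// cancels out, leaving sum(gap^2) / (2 * sum(gap)).
function riderWaitMin(gapsMin) {
  const capped = gapsMin.map((g) => Math.min(g, 60));
  const total = capped.reduce((s, g) => s + g, 0);
  if (total === 0) return null;
  const totalSq = capped.reduce((s, g) => s + g * g, 0);
  return totalSq / (2 * total);
}

// Reliability: percentage of snapshots in which the route had more than one bus.
function reliabilityPct(busCounts) {
  if (busCounts.length === 0) return null;
  const withService = busCounts.filter((n) => n > 1).length;
  return (100 * withService) / busCounts.length;
}

console.log(systemSpeedMph({ M15: [10, 10, 10, 10], B1: [6] })); // 8 (pooled mean would be 9.2)
console.log(riderWaitMin([10, 10, 10])); // 5
console.log(riderWaitMin([5, 5, 20])); // 7.5
console.log(reliabilityPct([0, 1, 2, 3])); // 50
```

The two wait examples share the same 10-minute average gap, yet uneven spacing raises the expected wait from 5 to 7.5 minutes, which is why the formula uses the squared gap.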

Every input count is preserved in data/daily/YYYY-MM-DD.json — nothing is thrown away.

Weekly & monthly roll-ups (rollup.js): group days by ISO week (Mon–Sun) and calendar month. Take the simple mean of daily means for each metric; sum totals (snapshots, bunching events). Roll up the per-hour, per-borough, and per-route slices the same way and write them to data/summary/weekly.json, monthly.json, weekly-routes.json, and monthly-routes.json. Weekday-only and weekend-only sub-aggregates are also stored, so we can compare like-with-like across weeks.
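
As a rough sketch of the weekly grouping, the code below assigns each day to its ISO week (Mon–Sun), takes the simple mean of daily means, and sums totals. It is not the code in rollup.js, and the daily field names (`date`, `avgSpeedMph`, `snapshots`) are hypothetical.

```javascript
// ISO 8601 week key such as "2026-W18" for a "YYYY-MM-DD" date string.
// The ISO week of a date is the week containing that date's Thursday.
function isoWeekKey(dateStr) {
  const [y, m, d] = dateStr.split("-").map(Number);
  const date = new Date(Date.UTC(y, m - 1, d));
  const dow = date.getUTCDay() || 7; // Mon = 1 ... Sun = 7
  date.setUTCDate(date.getUTCDate() + 4 - dow); // move to this week's Thursday
  const isoYear = date.getUTCFullYear();
  const yearStart = Date.UTC(isoYear, 0, 1);
  const week = Math.ceil(((date.getTime() - yearStart) / 86400000 + 1) / 7);
  return `${isoYear}-W${String(week).padStart(2, "0")}`;
}

const mean = (xs) => xs.reduce((sum, x) => sum + x, 0) / xs.length;

// Group daily rows by ISO week: mean of daily means for rate metrics,
// plain sums for totals. Weeks with fewer than 7 days are flagged partial.
function rollupWeeks(days) {
  const byWeek = new Map();
  for (const day of days) {
    const key = isoWeekKey(day.date);
    if (!byWeek.has(key)) byWeek.set(key, []);
    byWeek.get(key).push(day);
  }
  return [...byWeek.entries()].map(([week, rows]) => ({
    week,
    days: rows.length,
    partial: rows.length < 7,
    avgSpeedMph: mean(rows.map((r) => r.avgSpeedMph)),
    snapshots: rows.reduce((sum, r) => sum + r.snapshots, 0),
  }));
}

// Hypothetical daily rows: two days in the first (partial) week, one in the next.
const weeks = rollupWeeks([
  { date: "2026-04-22", avgSpeedMph: 8, snapshots: 100 },
  { date: "2026-04-23", avgSpeedMph: 10, snapshots: 200 },
  { date: "2026-04-27", avgSpeedMph: 9, snapshots: 306 },
]);
console.log(weeks);
```

A mean of daily means weights each day equally, so a day with thin collection counts as much as a full one; the summed snapshot count is what lets a reader spot that.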

Weeks with fewer than 7 days appear in the table with a partial-week marker (for example, 3/7 *) and are excluded from the "this week" headline cards. Change vs. last week = this week's mean minus last week's mean, rounded to one decimal. A change in the good direction (up for speed/reliability/buses, down for wait/bunching/gaps) renders green.
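
The week-over-week rule can be sketched as follows. The metric keys, the `improved` flag, and the "7/7" label for full weeks are assumptions made for illustration; only the "3/7 *" style of partial-week marker comes from the page.

```javascript
// Which direction counts as good for each headline metric (keys are hypothetical).
const HIGHER_IS_BETTER = {
  speed: true,
  reliability: true,
  buses: true,
  wait: false,
  bunching: false,
  gaps: false,
};

// Change vs. last week, rounded to one decimal. `improved` is true only when
// the value moved in the good direction; no change counts as not improved.
function weeklyChange(metric, thisWeek, lastWeek) {
  const delta = Math.round((thisWeek - lastWeek) * 10) / 10;
  const improved = delta !== 0 && delta > 0 === HIGHER_IS_BETTER[metric];
  return { delta, improved };
}

// Days column label: partial weeks get an asterisk, e.g. "3/7 *".
// The full-week form "7/7" is an assumption.
function daysLabel(dayCount) {
  return dayCount < 7 ? `${dayCount}/7 *` : "7/7";
}

console.log(weeklyChange("speed", 8.3, 7.9)); // { delta: 0.4, improved: true }
console.log(weeklyChange("wait", 6.2, 7.0)); // { delta: -0.8, improved: true }
console.log(weeklyChange("bunching", 1.4, 1.1)); // { delta: 0.3, improved: false }
console.log(daysLabel(3)); // "3/7 *"
```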

See the full methodology for how each metric compares to MTA official numbers and the limits of this pipeline.

Speed

Bunching

Gap estimation

Average rider wait time

General caveats