eComHD · StickyMetrics Forecast Review

May 2026: Forecast vs Actuals

The May forecast was locked on April 30, before the month began. May has now closed and the actuals are in. This is how the model did, where it missed, and what the system has already done about it.

Generated 2026-06-04 · Forecast: StickyMetrics Chronos-Bolt-mini month-end baseline (run 8, locked 2026-04-30) · Actuals: Amazon SP-API all-orders report, pulled fresh 2026-06-04
The May forecast badly under-shot. The model called 7,863 units / $57,203; actual was 14,531 units / $104,616. The forecast captured only 54% of units and 55% of revenue: actuals ran 6,667 units and $47,413 above the median (p50) forecast. Actual did land inside the model's p10 to p90 band (1,552 to 24,527 units), but that band is so wide it was never decision-useful.
Units forecast (p50): 7,863 (vs actual 14,531, 54% capture)
Revenue forecast (p50): $57,203 (vs actual $104,616, 55% capture)
Units miss: +6,667 (actual ran 185% of forecast)
Forecast band (units): 1.6k–24.5k (actual at roughly p65–p70; band too wide)
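The headline percentages can be reproduced from the four totals quoted above. This is a minimal sketch; the function and variable names are illustrative and not taken from the StickyMetrics code.

```python
# Headline totals from the May review (p50 forecast vs actuals).
forecast_units, actual_units = 7_863, 14_531
forecast_rev, actual_rev = 57_203, 104_616


def capture(forecast: float, actual: float) -> float:
    """Share of the actual that the forecast accounted for."""
    return forecast / actual


unit_capture = capture(forecast_units, actual_units)  # about 0.54
rev_capture = capture(forecast_rev, actual_rev)       # about 0.55
under_pct = 1 - unit_capture                          # about 0.46, the "46% under"
ratio = actual_units / forecast_units                 # about 1.85, "185% of forecast"

print(f"units capture {unit_capture:.0%}, revenue capture {rev_capture:.0%}")
print(f"under by {under_pct:.0%}, actual/forecast {ratio:.2f}x")
```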

Weekly pace: every week ran over forecast

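| Week | Forecast (units) | Actual (units) | Ratio |
| --- | --- | --- | --- |
| 05/01–05/07 | 2,010 | 3,208 | 1.60x |
| 05/08–05/14 | 1,913 | 3,455 | 1.81x |
| 05/15–05/21 | 1,510 | 3,765 | 2.49x |
| 05/22–05/28 | 1,830 | 3,124 | 1.71x |
| 05/29–05/31 | 601 | 979 | 1.63x |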

The miss was not a one-week spike. The under-bias is steady across the whole month, with the mid-month week (05/15 to 05/21) the worst at 2.49x.
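The weekly ratios are simply actual divided by forecast. A short sketch using the weekly figures from this report (variable names are illustrative):

```python
# (week label, forecast units, actual units) from the weekly pace figures.
weeks = [
    ("05/01-05/07", 2_010, 3_208),
    ("05/08-05/14", 1_913, 3_455),
    ("05/15-05/21", 1_510, 3_765),
    ("05/22-05/28", 1_830, 3_124),
    ("05/29-05/31", 601, 979),
]

ratios = {label: actual / forecast for label, forecast, actual in weeks}
for label, r in ratios.items():
    print(f"{label}: {r:.2f}x")

worst_week = max(ratios, key=ratios.get)           # week with the largest ratio
every_week_over = all(r > 1 for r in ratios.values())
total_actual = sum(actual for _, _, actual in weeks)
print(worst_week, every_week_over, total_actual)
```

Checking that every ratio exceeds 1 is what separates a steady bias from a single-week spike.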

Where the 6,667-unit miss came from

Under-forecast on tracked items: ~5,081 units
168 ASINs that the model forecast and that sold. It captured only 61% of their actual units (7,852 forecast vs 12,933 actual). This is the bulk of the miss, and it is concentrated in the top sellers.

Cold-start blind spots: 1,598 units
52 ASINs that sold but had a forecast of exactly zero. These are new SKUs with no relaunch history, including the RAVE-205xx line. That is $13,746 of revenue the model never saw, about 24% of the miss.

Phantom over-forecast: 12 units
57 ASINs had a forecast but did not sell, totaling just 12 units. Negligible. The model is not wasting forecast on dead inventory; the bias runs in one direction only, under.
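The three buckets above can be sketched as a simple classification over per-ASIN forecast and actual units. This is an illustration, not the system's scoring code: the function name and the phantom row are hypothetical, and the two real rows are taken from the top-ASIN list in this report.

```python
from typing import Dict, Tuple


def decompose_miss(rows: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Split the unit miss into buckets. rows maps ASIN -> (forecast, actual)."""
    out = {"tracked_net": 0.0, "cold_start": 0.0, "phantom_over": 0.0}
    for forecast, actual in rows.values():
        if forecast > 0 and actual > 0:
            out["tracked_net"] += actual - forecast   # forecast and sold
        elif forecast == 0 and actual > 0:
            out["cold_start"] += actual               # sold with a zero forecast
        elif forecast > 0 and actual == 0:
            out["phantom_over"] += forecast           # forecast but did not sell
    out["net_miss"] = out["tracked_net"] + out["cold_start"] - out["phantom_over"]
    return out


sample = {
    "B0C37Y1K1D": (1_282, 3_765),   # tracked, under-forecast
    "B0GKFL2TH9": (0, 1_151),       # cold-start blind spot
    "HYPOTHETICAL": (12, 0),        # made-up phantom row for illustration
}
result = decompose_miss(sample)
print(result)

# The report's own bucket totals reconcile to the headline miss.
report_net = 5_081 + 1_598 - 12
print(report_net)
```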

Top 15 ASINs by actual units

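| ASIN | SKU | Actual units | Forecast units | Actual revenue | Read |
| --- | --- | --- | --- | --- | --- |
| B0C37Y1K1D | MOM-10845-175 | 3,765 | 1,282 | $22,495 | under 66% |
| B0BF45FMDX | MOM-10263-075 | 2,284 | 1,611 | $15,308 | under 29% |
| B0GKFL2TH9 | RAVE-20523-150 | 1,151 | 0 | $9,159 | blind spot |
| B0756T5VV1 | MOM-10721-100 | 534 | 506 | $3,110 | on target |
| B07DWRRYG9 | MOM-10564-350 | 260 | 412 | $2,537 | over |
| B079T9PWXR | MOM-10073-210 | 254 | 150 | $1,951 | under |
| B07J35Z66B | RAVE-20058-080 | 246 | 329 | $1,713 | over |
| B0FQBBHS31 | MOM-10248-150 | 215 | 178 | $1,287 | on target |
| B0BF4MLNFB | MOM-10276-075 | 212 | 65 | $1,906 | under 69% |
| B077XPQKDV | MOM-10790-100 | 197 | 76 | $1,548 | under 61% |
| B0756V9SCL | MOM-10719-100 | 179 | 135 | $1,045 | under |
| B0C37Y7BL9 | MOM-10855-175 | 178 | 63 | $1,149 | under 65% |
| B01MV02GXC | MOM-020717-10408-225 | 147 | 81 | $1,321 | under |
| B0DC1CL2BS | MOM-10976-75 | 147 | 69 | $1,133 | under |
| B0GKFLQNQS | RAVE-20522-150 | 140 | 0 | $1,377 | blind spot |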

One SKU, B0C37Y1K1D, drove the single biggest dollar miss: 3,765 sold against 1,282 forecast. Two RAVE items forecast to zero account for $10,536 of unforecast revenue between them.

Stock and ad spend were not in the model

The forecast was pure demand. It did not know about inventory or our own ad decisions, and that contaminated the comparison. The tote line (all RAVE SKUs) is the clearest example: 11 of 15 totes were out of stock the entire month (available 0 on all 21 May snapshots), and the one fast mover, B0GKFL2TH9, sold 1,151 units while we cut its ad spend roughly 82% (from $853 the week of Apr 27 to $154 the week of May 25) to protect low stock. So its recorded demand was deliberately suppressed.

Across the whole catalog, 81% of SKU-snapshots in May were out of stock (18,935 of 23,316). Those zero-sales days were teaching the model zero demand, and the accuracy score was penalizing the model for sales it was never allowed to make.

The reassuring part: stockouts did not cause the 46% May miss. On the active forecast set, out-of-stock days were only 128 of 3,195 scored days and were mostly under-bias anyway. The big miss was on in-stock items selling more than predicted. If anything, true May demand was higher than 14,531, because the out-of-stock totes had demand we could not fill.
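One way to check how much stockouts move a score is to compute the error with and without out-of-stock days. The sketch below assumes the "weighted error" is a WAPE (sum of absolute errors over sum of actuals), which this report does not confirm. The data is made up, and the simple rule of dropping only `out_of_stock` days is an assumption.

```python
from typing import List, NamedTuple


class ScoredDay(NamedTuple):
    forecast: float
    actual: float
    stock: str  # "out_of_stock", "low", "ok", or "unknown"


def wape(days: List[ScoredDay], stock_fair: bool = False) -> float:
    """Weighted absolute percentage error; optionally skip out-of-stock days."""
    kept = [d for d in days if not (stock_fair and d.stock == "out_of_stock")]
    total_actual = sum(d.actual for d in kept)
    if total_actual == 0:
        raise ValueError("no actual units in the scored days")
    return sum(abs(d.actual - d.forecast) for d in kept) / total_actual


# Hypothetical scored days, not taken from the May data.
days = [
    ScoredDay(10, 14, "ok"),
    ScoredDay(8, 12, "low"),
    ScoredDay(9, 0, "out_of_stock"),  # could not sell, so the forecast looks wrong
]
raw_error = wape(days)
fair_error = wape(days, stock_fair=True)
print(f"raw {raw_error:.3f}, stock-fair {fair_error:.3f}")
```

If the two numbers are close, as the 128 of 3,195 figure suggests for May, stockouts are not what drives the miss.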

Shipped today: the system is now stock-aware. (1) Scoring tags every day with its stock status (out of stock, low, ok, unknown) and reports a stock-fair accuracy that excludes days on which the SKU could not have sold its forecast. (2) Training now uncensors out-of-stock days, imputing the trailing in-stock run-rate instead of a false zero, so the model learns real demand. On today's data this lifts the forward 245-day forecast by 3.4% (419 out-of-stock days across 39 SKUs). Both changes go live on the next nightly run. The uncensoring intentionally does NOT adjust for deliberate ad cuts or touch days where stock was available, so we never fabricate demand.
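The uncensoring idea can be sketched as follows: out-of-stock days take the mean of the most recent in-stock days, and in-stock days are never altered. The 7-day window, the function name, and the sample series are assumptions for illustration; the shipped implementation may differ.

```python
from typing import List


def uncensor(sales: List[float], in_stock: List[bool], window: int = 7) -> List[float]:
    """Impute out-of-stock days with the trailing in-stock run-rate."""
    if len(sales) != len(in_stock):
        raise ValueError("sales and in_stock must have the same length")
    recent: List[float] = []   # last `window` in-stock observations
    out: List[float] = []
    for units, available in zip(sales, in_stock):
        if available:
            recent = (recent + [units])[-window:]
            out.append(units)                      # in-stock days stay untouched
        elif recent:
            out.append(sum(recent) / len(recent))  # trailing in-stock mean
        else:
            out.append(units)                      # no in-stock history yet
    return out


# Hypothetical daily series: two stocked days, two stockout days, one stocked day.
imputed = uncensor([4, 6, 0, 0, 5], [True, True, False, False, True])
print(imputed)
```

Leaving in-stock days alone is what keeps the imputation from fabricating demand on days when sales were genuinely low.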

My read · Woz

The baseline is the Chronos-Bolt-mini model, which validated on an April holdout at 32.6% weighted error with a known 13 to 15% portfolio under-bias. May came in at 46% under, roughly three times the validated bias. The reason is straightforward: April was not a seasonal peak and May is. These are party and event products (koozies, yard signs, suspenders, bow ties), and May stacks graduation, wedding season, Mother's Day, and Memorial Day on top of each other. The per-ASIN May seasonal multipliers were set too low.

There are two distinct failure modes, and they need different fixes:

1. Systematic under-bias on tracked items. The model knew these SKUs and still under-called them, worst mid-month. This is a seasonal-multiplier problem, and it is self-healing (see below).

2. Cold-start blind spots. Brand-new SKUs with no relaunch history forecast to exactly zero. The RAVE line is the example. This will not self-heal, because there is no history to learn from. It needs a structural fix.
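A structural fix for the second failure mode could look like the sketch below: a new SKU with a zero forecast inherits the median forecast of its siblings. The function name, the `new_skus` and `siblings` inputs, and the pairing of these RAVE SKUs as siblings are all assumptions for illustration.

```python
from statistics import median
from typing import Dict, List, Set


def apply_cold_start_floor(
    forecasts: Dict[str, float],
    new_skus: Set[str],
    siblings: Dict[str, List[str]],
) -> Dict[str, float]:
    """Give new zero-forecast SKUs the median forecast of their siblings."""
    floored = dict(forecasts)
    for sku in new_skus:
        if forecasts.get(sku, 0.0) > 0:
            continue  # already has a forecast; leave it alone
        peers = [forecasts[s] for s in siblings.get(sku, []) if forecasts.get(s, 0.0) > 0]
        if peers:
            floored[sku] = median(peers)
    return floored


# Forecast values come from the report; the sibling mapping is hypothetical.
forecasts = {"RAVE-20523-150": 0.0, "RAVE-20522-150": 0.0, "RAVE-20058-080": 329.0}
new_skus = {"RAVE-20523-150", "RAVE-20522-150"}
siblings = {sku: ["RAVE-20058-080"] for sku in new_skus}

floored = apply_cold_start_floor(forecasts, new_skus, siblings)
print(floored)
```

Restricting the floor to an explicit set of new SKUs avoids lifting items that forecast to zero because they are genuinely dead.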

The quantile band is directionally fine (actual landed in-band), but it is too wide to act on: p90 was about three times p50. A band that says "somewhere between 1,600 and 24,500 units" does not help anyone cut a PO.
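The width of the band can be expressed relative to the median, using the p10, p50, and p90 values quoted in this report. A minimal sketch with illustrative variable names:

```python
# Unit quantiles and actual from the May review.
p10, p50, p90 = 1_552, 7_863, 24_527
actual = 14_531

upper_ratio = p90 / p50             # how far the top of the band sits above p50
lower_ratio = p10 / p50             # how far the bottom sits below p50
relative_width = (p90 - p10) / p50  # full band width in multiples of p50
in_band = p10 <= actual <= p90

print(f"p90/p50 = {upper_ratio:.2f}, p10/p50 = {lower_ratio:.2f}")
print(f"band width = {relative_width:.2f} x p50, actual in band: {in_band}")
```

A band nearly three times as wide as the median itself can contain almost any plausible outcome, which is why being in-band says little here.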

Good news, already verified: the system self-calibrates on the 1st of each month. The June 1 run scored the May residuals and raised the May multiplier for 179 of 199 ASINs, average 0.894 up to 1.099 (a +23% lift). So the directional under-bias on tracked items is genuinely correcting itself. The blind-spot problem is the part that still needs a code change.
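The +23% figure follows directly from the two average multipliers. The report does not state the update rule the calibration uses, so the `damped_update` function below is purely a hypothetical illustration of one common approach: move the multiplier part of the way toward the observed actual/forecast ratio.

```python
# Reported averages before and after the June 1 calibration.
old_avg, new_avg = 0.894, 1.099
lift = new_avg / old_avg - 1     # about 0.23
share_raised = 179 / 199         # share of ASINs whose multiplier went up


def damped_update(multiplier: float, forecast: float, actual: float,
                  damping: float = 0.5) -> float:
    """Hypothetical rule: scale the multiplier by a damped share of the miss."""
    if forecast <= 0:
        raise ValueError("forecast must be positive to compute a ratio")
    ratio = actual / forecast
    return multiplier * (1 + damping * (ratio - 1))


# Made-up example: an ASIN forecast at 100 units that sold 150.
updated = damped_update(0.894, forecast=100, actual=150)
print(f"lift {lift:.1%}, raised {share_raised:.0%}, example multiplier {updated:.4f}")
```

Damping is a common choice because a full-ratio update would chase one noisy month.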

What I recommend

1. Add a cold-start floor. Brand-new SKUs with no history should inherit a category or sibling-ASIN level forecast instead of zero. This is the one failure the monthly calibration cannot fix on its own. It cost $13,746 in unforecast revenue in May alone.

2. Tighten the quantile band. A p10 to p90 spread of 1.6k to 24.5k is not actionable. Calibrate the band width per ASIN so it tracks real observed variance, not model uncertainty.

3. Validate on a seasonal peak month, not just April. The 13 to 15% validated bias gave false confidence. Hold out a known peak month so the reported accuracy reflects the months that matter for purchasing.

4. Re-score next month to confirm the +23% lift landed. The June calibration raised the May priors. The proof is whether the June and next-May forecasts come in tighter. I will track it.
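For the band recommendation, one possible approach is to derive each ASIN's band from its own history of actual/forecast ratios rather than from model uncertainty. The sketch below scales the p50 by empirical deciles of those ratios. The function name and the ratio history are hypothetical, and this is one option among several, not a description of the planned change.

```python
from statistics import quantiles
from typing import List, Tuple


def empirical_band(p50: float, past_ratios: List[float]) -> Tuple[float, float]:
    """Return (p10, p90) by scaling p50 with deciles of past actual/forecast ratios."""
    if len(past_ratios) < 2:
        raise ValueError("need at least two past ratios")
    deciles = quantiles(past_ratios, n=10)  # nine cut points, p10 through p90
    return p50 * deciles[0], p50 * deciles[-1]


# Hypothetical ratio history for a single ASIN (not from the report).
history = [0.8, 0.9, 0.95, 1.0, 1.0, 1.05, 1.1, 1.15, 1.2, 1.3]
low, high = empirical_band(100.0, history)
print(f"p10 = {low:.1f}, p90 = {high:.1f}")
```

An ASIN with stable history gets a narrow band, while a volatile one keeps a wide band, so the width reflects observed variance.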