A retail CFO sits in a Tuesday lease committee. The team brings a one-pager. The candidate site, the rent, the build-out cost, the projected first-year revenue. The revenue number has three decimals. Nobody in the room can defend how those three decimals were arrived at.
This is the moment most new-store programs quietly break. The lease gets signed because the math looks like math. The store opens fourteen months later. Revenue lands at 62% of plan. Cannibalization of the existing store nine kilometres east takes another 8% off the portfolio. The committee writes off the gap as “soft category”, or “weather”, or “post-launch ramp longer than expected”. The model that produced the original number is never recalibrated, because nobody owns it and nobody trusts it.
The fix is not a better spreadsheet. The fix is a gravity model. Properly built, it predicts revenue for a candidate site within a 10 to 15% band, accounts for cannibalization of your existing footprint, and gets recalibrated every time a store opens and twelve months of actuals come in. It is the only forecasting method we have seen survive a CFO and a board over multiple expansion cycles.
A gravity model is not exotic. The math is older than most retail finance teams. The data is mostly available. But almost nobody builds one, because each piece on its own looks like a small project, and the team that owns site selection is rarely the team that owns analytics. So expansion runs on comp-store averages and three-variable regressions, and the portfolio quietly takes the loss.
This is the playbook we use when a CFO asks us to put a defendable forecast under their next ten openings. Six weeks, one calibrated model, a board memo on the day of the lease decision.
What you actually walk into a lease committee with
Before we get to the model, define the decision. The thing the lease committee needs to do is allocate roughly $1M to $5M of capex per site. They will ask, in this order:
- What is the expected revenue at this site over the first three years.
- What is the confidence interval. Specifically, what are the 10th and 90th percentiles.
- How much of that revenue is cannibalized from existing stores.
- Net of cannibalization, what is the contribution to the portfolio.
- What has to be true for our forecast to land within plan.
A comp-store average answers question one badly and leaves the other four unanswered. A regression on (footfall, square footage, format) answers questions one and two, but treats every site as an island and ignores the fact that your store nine kilometres east and your candidate share half their catchment population. A gravity model answers all five.
The deliverable from the modeling team is not “the forecast.” It is a memo with five numbers and a 1-page map. Point estimate, low, high, cannibalization, net portfolio impact. Anything more is theatre. Anything less is reckless.
The reason this matters is that capital allocators behave very differently when they get all five numbers than when they get only one. With one number, every decision is a yes-or-no. With five, the committee can run trade-offs. Should we open this site at 60% confidence, or wait for the data, or hold the capex for a higher-confidence candidate three months from now. That is the conversation that allows a board to defend the expansion plan, and it cannot happen without a model that produces all five.
The four inputs nobody collects properly
A gravity model rests on four data layers. Most retailers have versions of all four. Almost no retailer has all four at the resolution the model needs. Here is the difference between what is in the network planning folder and what the model actually consumes.
Population, at catchment resolution. Network teams typically hold district-level numbers. The model needs population, household count, and average income at a resolution fine enough to be summed inside each catchment zone.

Road network and traffic. Network teams have a road map. The model needs an average-daily-traffic estimate per road segment, plus an estimate of pedestrian flow at the segment closest to the site. The latter is often the bottleneck. Foot traffic counts at a candidate intersection are usually purchased from a specialist data provider, occasionally built from mobile-device pings, and almost never collected in-house consistently. Without pedestrian flow, the model overweights drivers and underweights walk-in customers, which is how a 60-store consumer electronics omnichannel retailer in a major European market under-forecast a 16,000 sqm city store by more than 30%.
Competitor footprint, with catchment overlap. Network teams count competitors per district. The model needs each competitor as a single point with its own approximate catchment, so that overlap with your candidate catchment can be computed. Counting competitors is not the same as modeling their pull. Two electronics stores in a district mean very different things when one of them sits on the same arterial as your site and the other is across a river with no easy crossing.
Real market capacity, not market share. Network teams reference market share figures from a syndicated source. The model needs the absolute market capacity in the catchment, computed from household counts, average income, average share of wallet on the category, and a corrective for the catchment’s distance from the regional income centre. Market share figures already assume the market is the size somebody else said it was. The model recomputes the market on the candidate’s own catchment, because that is what determines the ceiling.
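As a rough illustration of that capacity construction, the sketch below multiplies the four ingredients per catchment zone and sums across zones. The multiplicative form, the field names, and every figure are assumptions made for illustration, not a calibrated method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatchmentZone:
    households: int
    avg_household_income: float   # annual, local currency (illustrative)
    category_wallet_share: float  # fraction of income spent on the category
    income_centre_factor: float   # hypothetical corrective, 1.0 = no adjustment


def zone_capacity(zone: CatchmentZone) -> float:
    """Annual category spend available in one zone (assumed multiplicative form)."""
    return (
        zone.households
        * zone.avg_household_income
        * zone.category_wallet_share
        * zone.income_centre_factor
    )


def catchment_capacity(zones: list[CatchmentZone]) -> float:
    """Absolute market capacity of the whole catchment."""
    return sum(zone_capacity(z) for z in zones)


# Made-up zones: an inner zone and an outer zone further from the income centre.
zones = [
    CatchmentZone(12_000, 30_000.0, 0.02, 1.0),
    CatchmentZone(20_000, 25_000.0, 0.02, 0.9),
]
total_capacity = catchment_capacity(zones)
print(f"Catchment capacity: {total_capacity:,.0f}")
```

The point of the exercise is that the ceiling is computed from the candidate's own catchment, so no syndicated market-share figure appears anywhere in the calculation.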
The gravity model itself
The math is older than the discipline. Reilly’s law of retail gravitation, 1931. Huff’s probabilistic version, 1964. The version that goes into a modern site-selection model looks roughly like this:
For each candidate site i, projected sales V are built from three things, combined within each catchment cell j and then summed over every cell: the population in cell j, the gravitational pull of the candidate site on cell j, and the gravitational pull of every other competitor or own-store on cell j. The pull terms decay with distance and adjust for the road segment carrying traffic to the site. Calibration coefficients turn these terms into a sales figure, and they are fit by minimising squared error against the actuals of your existing stores.
In a slightly less abstract form:
> V(i) = Σ over j of [ Population(j) × K(i, j) ], where K(i, j) = Pull(i, j) ÷ ( Pull(i, j) + Σ Pull(c, j) for every competitor and own-store c )
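A minimal Python sketch of that sum, under stated assumptions: the pull function here is the textbook Huff form (attractiveness divided by distance raised to a decay exponent), standing in for the richer two-ratio pull term the full model uses. The class names, the `beta` parameter, and the coordinates are all hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Site:
    name: str
    x_km: float
    y_km: float
    attractiveness: float  # hypothetical stand-in for size, format, access


@dataclass(frozen=True)
class Cell:
    x_km: float
    y_km: float
    population: float


def pull(site: Site, cell: Cell, beta: float = 2.0) -> float:
    """Assumed pull: attractiveness / distance**beta, with a 100 m distance floor."""
    distance = max(hypot(site.x_km - cell.x_km, site.y_km - cell.y_km), 0.1)
    return site.attractiveness / distance ** beta


def share(site: Site, rivals: list[Site], cell: Cell, beta: float = 2.0) -> float:
    """K(i, j): the site's pull over its own pull plus every rival's pull."""
    own = pull(site, cell, beta)
    return own / (own + sum(pull(r, cell, beta) for r in rivals))


def weighted_population(site: Site, rivals: list[Site], cells: list[Cell],
                        beta: float = 2.0) -> float:
    """Sum over cells of Population(j) * K(i, j), before any sales calibration."""
    return sum(c.population * share(site, rivals, c, beta) for c in cells)


candidate = Site("candidate", 1.0, 0.0, 1.0)
rival = Site("rival", 3.0, 0.0, 1.0)
cells = [Cell(0.0, 0.0, 1000.0), Cell(4.0, 0.0, 2000.0)]

captured = weighted_population(candidate, [rival], cells)
print(round(captured, 1))  # approximately 1100.0
```

The candidate takes about 90% of the near cell and about 10% of the far one. A calibration coefficient would still be needed to turn this population-weighted figure into a sales number.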
The pull term itself is where most teams stop reading. It is the product of two ratios. The first ratio captures direct distance and decays sharply. The second ratio captures access, weighted by traffic volume and the perpendicular distance from the store to its nearest road segment, also adjusted by pedestrian flow. Each of these subterms has a coefficient, and the whole apparatus is calibrated by maximising the correlation between modelled and actual sales across the existing portfolio. Conjugate gradient or any modern optimiser converges in seconds.
What matters is what changes when you flip the inputs. Add a competitor next door, and K(i, j) for every nearby cell drops sharply, because the pull term in the denominator grows. Remove an own-store, and K(i, j) for cells that were partly absorbed by it goes up, which is the cannibalization recapture coming back. Open a new own-store, and every cell that the new store can reach pulls some of its share away from existing stores, which is cannibalization being computed automatically, in the same equation that produces the revenue forecast.
This is the single most important property of the model. Cannibalization is not a separate post-hoc adjustment. It falls out of the same gravity calculation that produces the revenue number for the new site. A regression model has to be retro-fitted with a cannibalization assumption that is almost always wrong. The gravity model produces the cannibalization number as a byproduct of the forecast.
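To make the byproduct concrete, here is a small sketch that runs the same share calculation twice, once without the candidate and once with it, and reads cannibalization off the difference. It assumes a simplified pull (attractiveness over distance squared) and a toy network of one own-store, one competitor, and two population cells; all names and coordinates are invented.

```python
from math import hypot


def demand_by_store(stores, cells, beta=2.0):
    """Population-weighted share per store.

    stores: {name: (x_km, y_km, attractiveness)}
    cells:  [(x_km, y_km, population)]
    Assumed pull: attractiveness / distance**beta, with a 100 m distance floor.
    """
    demand = {name: 0.0 for name in stores}
    for cx, cy, population in cells:
        pulls = {
            name: a / max(hypot(x - cx, y - cy), 0.1) ** beta
            for name, (x, y, a) in stores.items()
        }
        total = sum(pulls.values())
        for name, p in pulls.items():
            demand[name] += population * p / total
    return demand


def score_candidate(network, own_names, cand_name, cand, cells, beta=2.0):
    """Return (candidate demand, loss per own-store, net portfolio gain)."""
    before = demand_by_store(network, cells, beta)
    after = demand_by_store({**network, cand_name: cand}, cells, beta)
    losses = {n: before[n] - after[n] for n in own_names}
    net_gain = after[cand_name] - sum(losses.values())
    return after[cand_name], losses, net_gain


network = {"own_A": (0.0, 0.0, 1.0), "competitor_C": (4.0, 0.0, 1.0)}
cells = [(1.0, 0.0, 1000.0), (3.0, 0.0, 1000.0)]

new_demand, losses, net_gain = score_candidate(
    network, ["own_A"], "candidate_N", (2.0, 0.0, 1.0), cells
)
print(round(new_demand, 1), round(losses["own_A"], 1), round(net_gain, 1))
```

In this toy network the candidate captures roughly 947 people-equivalents of demand, about half of which comes out of the existing own-store. With a fixed population, the net portfolio gain equals exactly what was taken from the competitor.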
A practical example. The 60-store consumer electronics retailer mentioned earlier had to decide whether to open a 2,000 sqm cybermarket on a major avenue in a regional capital, given that two existing stores sat within a 20-minute drive. The model returned a point estimate of roughly 109M in monthly revenue for the new store, and an estimated 6.5% portfolio uplift after recapture. It also estimated, in the same equation, that the two nearest existing stores would each lose 18 to 22% of their revenue to the new opening. That number, in particular, was what made the committee comfortable. They had a portfolio-net answer, not a single-store wish.
What a 6-week build actually looks like
The reason teams stall on this is that the data work looks like a year of pipeline-building before any model is calibrated. It is not. A focused team of two analysts, one geomarketing specialist, and a CFO-level sponsor can deliver a calibrated model and a board memo in six weeks. The cadence is what makes it possible.

Week 2: catchment geometry. Build catchment zones for every existing store and every candidate. Three zones each, 10, 20, and 30-minute drive isochrones, computed off a routing engine that respects time-of-day, not a Euclidean buffer. Overlay them on the population layer. The deliverable is a map per store and per candidate, with population, household count, and average income summed per zone.
Week 3: calibration. Pull two years of actuals from the data warehouse. Build the gravity model in code, not Excel, because you will recalibrate. Fit coefficients against the existing portfolio. Target R² north of 0.7. Anything lower means an input layer is broken, almost always the pedestrian flow or the income proxy, and the answer is to fix the input, not loosen the threshold. The deliverable is a calibration report with residuals per existing store.
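A stripped-down version of the calibration step might look like the following. It is a sketch, not the production fit: it assumes a single decay exponent `beta` searched over a small grid, a single sales-per-head scale `k` solved in closed form by least squares, and synthetic "actuals" generated from known parameters so the result can be checked. A real calibration would fit more coefficients with a proper optimiser against warehouse actuals.

```python
from math import hypot


def modelled_demand(stores, cells, beta):
    """Population-weighted share per store (assumed pull: attractiveness / distance**beta)."""
    demand = {name: 0.0 for name in stores}
    for cx, cy, population in cells:
        pulls = {
            name: a / max(hypot(x - cx, y - cy), 0.1) ** beta
            for name, (x, y, a) in stores.items()
        }
        total = sum(pulls.values())
        for name, p in pulls.items():
            demand[name] += population * p / total
    return demand


def calibrate(stores, cells, actual_sales, betas):
    """Grid-search beta; for each beta, solve the scale k by least squares."""
    names = list(actual_sales)
    best = None
    for beta in betas:
        demand = modelled_demand(stores, cells, beta)
        k = (sum(demand[n] * actual_sales[n] for n in names)
             / sum(demand[n] ** 2 for n in names))
        sse = sum((actual_sales[n] - k * demand[n]) ** 2 for n in names)
        if best is None or sse < best["sse"]:
            best = {"beta": beta, "k": k, "sse": sse, "demand": demand}
    mean_sales = sum(actual_sales.values()) / len(names)
    sst = sum((actual_sales[n] - mean_sales) ** 2 for n in names)
    best["r2"] = 1.0 - best["sse"] / sst
    # Residual per existing store: the core of the calibration report.
    best["residuals"] = {
        n: actual_sales[n] - best["k"] * best["demand"][n] for n in names
    }
    return best


stores = {"A": (0.0, 0.0, 1.0), "B": (5.0, 0.0, 2.0), "C": (2.0, 4.0, 1.5)}
cells = [(1.0, 1.0, 3000.0), (4.0, 1.0, 5000.0),
         (2.0, 3.0, 2000.0), (3.0, -1.0, 4000.0)]

# Synthetic actuals, generated with beta=2.0 and k=50 purely to test the fit.
true_demand = modelled_demand(stores, cells, beta=2.0)
actual_sales = {name: 50.0 * d for name, d in true_demand.items()}

fit = calibrate(stores, cells, actual_sales, betas=[1.0, 1.5, 2.0, 2.5, 3.0])
print(fit["beta"], round(fit["k"], 2), round(fit["r2"], 3))
```

On real data the fit will not be perfect, and the per-store residuals are where broken input layers show up first.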
Week 4: candidate scoring. Run the calibrated model against each candidate site. Produce point, low, and high estimates for first-year, second-year, and third-year revenue. The low and high are not invented. They come from sensitivity analysis on the three biggest uncertainty drivers, which the calibration step will have identified. The deliverable is a ranked candidate list with three-year revenue ranges.
Week 5: cannibalization and portfolio. For each candidate, run the model twice. Once with the candidate added to the network, once without. The delta on every existing store is the cannibalization. Net the candidate’s revenue against the cannibalization to get the portfolio contribution. This is the number the CFO actually cares about. The deliverable is a portfolio impact table per candidate.
Week 6: board memo and recalibration plan. Two pages. Page one, the five numbers and the map. Page two, the assumptions, the confidence intervals, the cannibalization estimates, and the recalibration cadence going forward. The recalibration plan is critical. Every time a store opens and twelve months of actuals come in, the model should refit, the residual on the new store should be reviewed, and the coefficients should be updated. A gravity model that is never recalibrated decays into a regression model within a year.
The team behind the build matters. The analytics owner cannot be the network planning team’s spreadsheet champion, because that person has the wrong incentives. The owner should be a fractional or full-time geomarketing analyst reporting into Strategy or Finance. The network planning team consumes the model. They do not own it. This separation is what stops the model from quietly bending to support deals that are already politically decided.
The three numbers a board actually wants
The output of the model is many things. The output of the project is three numbers per candidate, plus a fourth for the portfolio.
Point estimate. First-year revenue at the candidate site, in absolute currency. This is what gets compared to lease economics and capex.
90% confidence interval. Low and high bounds on the first-year revenue. The interval should come from a sensitivity analysis, not a guess. Sensitivity is run on the three inputs that drive most of the model’s variance: usually competitor footprint, pedestrian flow, and household income in the inner catchment. If your confidence interval is wider than 30% of the point estimate, something in the calibration is broken. If it is narrower than 10%, your sensitivity analysis is too tight and you are pretending to know more than you do.
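One simple way to produce such bounds is one-at-a-time sensitivity: shock each driver up and down, re-run the forecast, and take the extremes. The sketch below assumes a toy forecast function, index-style inputs, and a 5% shock on each driver, all hypothetical. It also reads "width" as (high minus low) divided by the point estimate, which is one possible interpretation of the 10% and 30% thresholds.

```python
def sensitivity_range(forecast, base_inputs, shocks):
    """Shock each input by +/- its percentage, one at a time; return (low, point, high)."""
    point = forecast(base_inputs)
    outcomes = [point]
    for name, pct in shocks.items():
        for sign in (-1.0, 1.0):
            shocked = dict(base_inputs)
            shocked[name] = base_inputs[name] * (1.0 + sign * pct)
            outcomes.append(forecast(shocked))
    return min(outcomes), point, max(outcomes)


def relative_width(low, point, high):
    """Interval width as a fraction of the point estimate (assumed definition)."""
    return (high - low) / point


def toy_forecast(inputs):
    """Hypothetical stand-in for the calibrated model: revenue in millions."""
    return (100.0 * inputs["household_income"] * inputs["pedestrian_flow"]
            / inputs["competitor_pull"])


base = {"household_income": 1.0, "pedestrian_flow": 1.0, "competitor_pull": 1.0}
shocks = {"household_income": 0.05, "pedestrian_flow": 0.05, "competitor_pull": 0.05}

low, point, high = sensitivity_range(toy_forecast, base, shocks)
width = relative_width(low, point, high)
print(f"low={low:.1f} point={point:.1f} high={high:.1f} width={width:.1%}")

if width > 0.30:
    print("Interval too wide: check the calibration.")
elif width < 0.10:
    print("Interval too narrow: the sensitivity shocks are probably too tight.")
```

In practice the shock sizes should come from how uncertain each input layer actually is, not from a flat percentage.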
Cannibalization estimate. Revenue lost from existing stores due to the new opening. This is the number that almost no retailer reports and almost every board ends up asking about. With a gravity model, it falls out automatically. Without one, it is invented. We have reviewed expansion plans for retailers in three different verticals where the cannibalization assumption was either zero or a single round number applied to every opening, and in every case the portfolio impact was overstated by 8 to 15%.
Net portfolio contribution. The candidate’s revenue minus the cannibalization. This is the number the board memo leads with. A candidate with 100M in revenue and 40M in cannibalization contributes 60M to the portfolio. A candidate with 70M in revenue and 5M in cannibalization contributes 65M. The second candidate is better, even though its standalone forecast is smaller. This is the conversation that the gravity model enables, and the conversation that a comp-store average makes impossible.
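The comparison above is simple arithmetic, but it is worth writing down because it is the ranking rule the memo applies. A minimal sketch, using the two illustrative candidates from the paragraph (figures in millions, candidate labels invented):

```python
def net_contribution(revenue: float, cannibalization: float) -> float:
    """Candidate revenue minus revenue lost at existing stores."""
    return revenue - cannibalization


# (standalone revenue, cannibalization), in millions, from the example above.
candidates = {
    "candidate_1": (100.0, 40.0),
    "candidate_2": (70.0, 5.0),
}

ranked = sorted(
    candidates,
    key=lambda name: net_contribution(*candidates[name]),
    reverse=True,
)

for name in ranked:
    revenue, lost = candidates[name]
    print(name, net_contribution(revenue, lost))
# candidate_2 65.0
# candidate_1 60.0
```

Ranking on standalone revenue would reverse the order, which is exactly the mistake a comp-store average invites.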

Four failure modes that quietly kill the model
These are the patterns we see when we audit existing gravity models. Any one of them is enough to break the forecast quietly.
Cannibalization is treated as a separate adjustment, not a model output. The team builds a clean gravity model for the candidate. Then they apply a flat 5% cannibalization haircut, because someone has read somewhere that 5% is a reasonable assumption. The actual cannibalization in a dense network can be 20 to 30% on the closest existing store. The portfolio impact is overstated, sometimes by more than the contribution of the new store itself.
Income and population data are five years old. A 200-store grocery hypermarket chain we reviewed had a calibration baseline tied to wages from a single historical year, never updated. The model still ran, the numbers still looked plausible, but the relative income ranking across districts had shifted enough that the model was systematically over-recommending the wealthier zones from the baseline year and under-recommending the new growth corridors. Refresh income and population data every two years, minimum. Three years is the upper limit before you should rebuild the calibration.
Competitor count instead of competitor catchment overlap. A district has eight electronics retailers. The model uses that count, weighted by some heuristic. But seven of those eight are in a part of the district that does not share a catchment with your candidate. The real competitive pressure on the candidate is one, not eight. This error is what makes site selection refuse otherwise excellent locations and what makes it accept others that quietly lose share to a competitor the model never saw.
The model is never recalibrated after openings. This is the most common one. A model is built, used to support a lease decision, then archived. Eighteen months later, the store opens, six months of actuals come in, and nobody refits the model. The next lease decision uses the unchanged model, even though four stores have opened since and the calibration would have moved. By the second expansion cycle, the model is decoration.
What to do this week
If you are a CFO, COO, or strategy lead with an expansion plan coming up, three things you can do without engaging anyone external.
Pull the last twelve months of new-store revenue forecasts and the last twelve months of actuals. Compute the average absolute deviation of actuals from forecast, as a percentage of forecast. If it is more than 15%, your forecasting method is not defendable, and the next lease committee should know that before they vote.
Pull the same period and compute the change in revenue for stores located within 10 km of any new opening in that twelve months. The number you get is your real cannibalization rate. Compare it to whatever assumption sits in your current forecast model. If they are far apart, the portfolio impact in your three-year plan is wrong.
Ask the network planning team to write the four-input checklist on a single page: population resolution, road and pedestrian flow source, competitor catchment method, market capacity construction. If any of these answers is “we use district-level numbers,” the data layer needs to be fixed before any forecasting method is going to land within plan.
That diagnostic, done over two afternoons, will tell you whether you have a forecasting problem, a data problem, or both. The answer determines whether the next 6-week build is feasible internally, or whether the apparatus has to be built end-to-end before the next opening.
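The first two checks fit in a few lines of code. The sketch below assumes deviation is measured relative to the forecast and that cannibalization is read as the aggregate revenue change at stores within 10 km of an opening. All figures are made up.

```python
def mean_absolute_pct_deviation(pairs):
    """pairs: (forecast, actual) per new store; deviation measured against forecast."""
    return sum(abs(actual - forecast) / forecast for forecast, actual in pairs) / len(pairs)


def realized_cannibalization_rate(nearby):
    """nearby: (revenue_before, revenue_after) for stores within 10 km of an opening."""
    before = sum(b for b, _ in nearby)
    after = sum(a for _, a in nearby)
    return (before - after) / before


# Hypothetical new-store forecasts vs. actuals, in millions.
new_stores = [(100.0, 62.0), (80.0, 76.0), (50.0, 55.0)]
# Hypothetical existing stores near those openings: revenue before and after.
nearby_stores = [(200.0, 180.0), (150.0, 141.0)]

deviation = mean_absolute_pct_deviation(new_stores)
cannibalization = realized_cannibalization_rate(nearby_stores)

print(f"Average absolute deviation: {deviation:.1%}")
print(f"Realized cannibalization:   {cannibalization:.1%}")
if deviation > 0.15:
    print("Forecasting method is outside the 15% threshold.")
```

With these invented numbers the deviation comes out near 18% and the realized cannibalization near 8%, so the method would fail the first check.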