Data Engineering

Modeling Customer LTV in SQL: From Naive to Defensible

· 16 min read

LTV is the metric most likely to get a senior data person fired, in either direction. Report a number that’s too high and the company over-spends on CAC for two quarters before anyone notices the cohort isn’t paying back. Report a number that’s too low and finance approves a smaller marketing budget than the business can absorb, and a competitor takes the share. Both happen routinely, and both come from the same root cause: LTV is treated as one number when it’s really a model with assumptions.

In This Article

  1. Two ways LTV breaks
  2. Lifetime is the area under the retention curve
  3. Linearizing the fit
  4. The full query
  5. Reference numbers, not your numbers
  6. When the power fit doesn’t fit
  7. More defensible variants
  8. Mistakes we still see
  9. Modern tooling notes
  10. Checklist
  11. Further reading

This is the working LTV approach we use on consulting engagements when a board is asking “what’s our LTV” and the answer in the deck is wrong. It walks from the naive formula every analyst writes first, through the regression-fit version that survives auditing, to the predictive variants that finance teams actually accept. All of it in SQL, with PostgreSQL as the primary dialect because its regression aggregates make the math clean.

Two ways LTV breaks

The naive formula is ARPU × average customer tenure. It produces a single number that looks credible on a slide. It also has two specific failure modes that show up in every engagement:

Failure mode 1: ARPU is a moving target. ARPU computed over the whole user base is biased toward whatever your highest-revenue cohort happens to be doing this month. ARPU computed per cohort and averaged is fine in principle but underweights recent cohorts that haven’t matured yet. ARPU computed from a sample of long-tenured users is the most defensible, and the least intuitive.

Failure mode 2: Average tenure undercounts the tail. Most analytics products average tenure over churned users only, which means anyone still active counts as zero. A “1.4-month average tenure” computed this way is wrong by a factor of two for any product where the tail matters. The fix is to integrate under the retention curve, not to average a survival time.

The rest of this article is the second fix, done in SQL.

Lifetime is the area under the retention curve

If retention on day N is the fraction of a cohort still active on day N, the lifetime of a typical user is the integral of that curve from day 0 to whatever horizon you care about (30, 90, 180, 365 days). LTV is then lifetime × revenue per active user per day (ARPDAU). The single number splits into two factors that can be measured independently, and that’s the whole reason this version is more defensible.

Lifetime as the shaded area under the retention curve; LTV as that area multiplied by ARPDAU
Lifetime as the shaded area under the retention curve; LTV as that area multiplied by ARPDAU
The retention curve is rarely a clean shape, but for most products it’s well approximated by a power function: retention(day) = a × day^b, with b typically between -0.3 and -0.6 for B2C and -0.1 and -0.3 for B2B SaaS. The reason this works is that retention drop-off is roughly proportional to the inverse of time-already-survived (today’s drop-off is smaller than yesterday’s because the users still around are the ones who already passed yesterday’s filter).

Fitting the power function to observed retention gives two parameters. Once you have them, projecting out to any horizon is arithmetic.

Linearizing the fit

A power function y = a × x^b linearizes under logarithms: ln(y) = ln(a) + b × ln(x). That’s the simplest possible regression — one slope, one intercept — and PostgreSQL has it built in.

regr_intercept(ln(retention), ln(day)) returns ln(a). regr_slope(ln(retention), ln(day)) returns b. Exponentiating the intercept recovers a. The whole fit is three lines of SQL.

The choice to fit on log-log isn’t optional, and it has a consequence: day 0 is excluded (ln(0) is undefined), and so is any day with retention exactly 0 (same reason). Most engagements where the fit looks wrong have one of these two issues — the analyst either kept day 0 and the regression chokes on -inf, or they let zero-retention days slip through the join.

Observed retention (dots) vs. fitted power curve (line); the fit is run on log-log axes to keep the regression linear
Observed retention (dots) vs. fitted power curve (line); the fit is run on log-log axes to keep the regression linear

The full query

Pick a cohort window (here, March 2025 registrations on a SaaS product), measure retention for 90 days, fit the power curve on the log-log version, project to 180 days, multiply by ARPDAU.

WITH cohort AS (
  -- one number: how many users registered in the window
  SELECT count(*) AS cohort_size
  FROM users
  WHERE date(registered_at) BETWEEN date '2025-03-01' AND date '2025-03-31'
),
daily_active AS (
  -- distinct active users on each day after registration
  SELECT
    date_part('day', s.session_at - u.registered_at)::int AS day,
    count(DISTINCT u.id) AS active_users
  FROM users u
  JOIN sessions s ON s.user_id = u.id
  WHERE date(u.registered_at) BETWEEN date '2025-03-01' AND date '2025-03-31'
  GROUP BY 1
  HAVING date_part('day', s.session_at - u.registered_at)::int BETWEEN 1 AND 90
),
retention AS (
  -- retention = active / cohort_size, day by day
  SELECT
    da.day,
    da.active_users::real / c.cohort_size AS retention
  FROM daily_active da
  CROSS JOIN cohort c
),
coef AS (
  -- linearized fit on log-log: ln(retention) = ln(a) + b * ln(day)
  SELECT
    exp(regr_intercept(ln(retention), ln(day))) AS a,
    regr_slope(ln(retention), ln(day))           AS b
  FROM retention
  WHERE retention > 0
),
projection AS (
  -- project the fitted curve out to day 180, accumulate lifetime
  SELECT
    gs AS day,
    a * power(gs, b) AS predicted_retention,
    sum(a * power(gs, b)) OVER (ORDER BY gs) AS lifetime_days
  FROM generate_series(1, 180) gs
  CROSS JOIN coef
)
SELECT
  day,
  predicted_retention,
  lifetime_days,
  lifetime_days * 4.20 AS ltv_180  -- ARPDAU in USD, measured separately
FROM projection
ORDER BY day;

ARPDAU is computed from a separate query on the same cohort: total revenue from the cohort over the observation window, divided by total user-days active. Keep it separate, and compute it on the same date window as the retention query. The most common LTV bug is computing ARPDAU from the current month’s revenue while the retention curve is fit on a 6-month-old cohort. The single most common bug is computing ARPDAU on a different cohort or different time window than retention, and then multiplying them, producing a number that means neither thing.

Reference numbers, not your numbers

On the gaming product the original analysis was built from, ARPDAU was about $4.20 per active user per day. On B2B SaaS we engage with, ARPDAU is usually wrong as a unit — the product isn’t used every day. Switch to ARPM (average revenue per active month) and recompute lifetime in months. Switch to AOV × purchase frequency for e-commerce.

The math is the same. The unit of “active” needs to match the product. A weekly-usage SaaS product fit with daily retention curves gets a power-law b parameter much closer to zero than reality because the curve is structurally flat on days users wouldn’t engage anyway.

90 observed days plus 90 projected days of retention; cumulative lifetime line growing alongside
90 observed days plus 90 projected days of retention; cumulative lifetime line growing alongside

When the power fit doesn’t fit

The power-law approximation is good for the median consumer product. It breaks predictably in a few cases:

  • Bimodal retention. Some products have two distinct user populations — power users who are still around at day 180 at 40%+ retention, and casual users who churn in week one. A single power fit averages them and underprojects the power users. Split the cohort by an early-engagement signal (e.g. did they complete the second-session milestone), fit separately.
  • Strong recurrence patterns. Subscription products with monthly billing cycles have retention dips on each renewal date. The power fit smooths through them and overprojects between billing cycles. For these, fit on monthly retention, not daily.
  • Viral / network products. Products where retention is driven by other users coming back tend to have super-linear retention early (cohort effects) and then crash. A power fit catches the average and misses both ends.

When the fit clearly disagrees with the observed curve, the right move isn’t to force a better fit. It’s to ship the observed curve to day-90 directly and to flag everything beyond it as “projected, modeling assumption: power law.” A range with confidence intervals is more useful to finance than a single number that turns out to be wrong.

More defensible variants

The power-fit version is the right default. For products where LTV joins the rest of the SaaS KPIs your board tracks, two upgrades are worth the engineering investment:

Cohort-stratified LTV. Run the same fit per acquisition cohort, not on a pooled curve. Acquisition source, geography, device, and acquisition month all change the curve shape. Reporting “weighted average LTV” with a hidden cohort mix means the number drifts with channel mix, even if no individual cohort changed. Stratifying surfaces the drift.

Probabilistic models (BG/NBD + Gamma-Gamma). For non-subscription products with repeat purchases (e-commerce, gaming with IAP), the Buy-Til-You-Die family of models predicts both purchase frequency and monetary value per customer. Python has lifelines and lifetimes implementations; you can productionize via a UDF or by precomputing in a notebook and joining the per-user results back into the warehouse. Don’t try to do BG/NBD in pure SQL — the gamma-distribution math doesn’t compile cleanly.

Survival regression. When churn is censored (we don’t know if the user “really” churned or just isn’t active this week), survival-analysis methods (Cox PH, Kaplan-Meier) handle the censoring honestly. For SaaS with monthly subscriptions and a clear “churned = canceled” event, Kaplan-Meier curves are nearly identical to retention curves but treat the still-active tail correctly.

The version we recommend most often: cohort-stratified power fit by acquisition channel, plus a quarterly back-test where you compute the day-180 prediction made 6 months ago against the now-actual day-180 retention. If the back-test consistently misses in one direction, the model needs updating, not the dashboard.

Mistakes we still see

Frequently enough to be worth naming:

  • Mixing pre-paid and post-paid revenue in ARPDAU. Annual pre-paid contracts collapse 12 months of revenue into one day. Either amortize, or split into separate revenue streams with separate LTV models.
  • Using gross revenue when finance wants contribution margin. LTV-to-CAC ratios at the C-level should use the same revenue definition as the CAC denominator — and finance almost always means contribution margin, not gross.
  • Reporting LTV as a single number when CAC is broken out by channel. Channel-level LTV is the right unit for marketing. A blended LTV-to-CAC ratio that drops because channel mix shifted toward worse retention but no channel actually got worse is the analyst version of Simpson’s Paradox.
  • Projecting out to “lifetime” infinity. Don’t. Agree on a horizon with finance before fitting (90, 180, or 365 days), label the output “LTV-180” not “LTV”, and add a footnote showing what LTV-90 is. The ratio LTV-180 / LTV-90 tells finance how much of the value depends on the extrapolation. A ratio above 1.5 is a signal the model is speculating.

Modern tooling notes

Cube, dbt-metrics, and the LookML metric layer all support LTV-style metrics directly, but they all default to the naive formula. They’re useful for shipping the result, not for computing it. The good pattern is: compute retention and ARPDAU in dbt models, expose them as semantic-layer metrics, build LTV as a derived metric using the formula above. That way the assumptions are in version control and any change shows up in code review, not in a deck.

Hex, Mode, and Deepnote work well for the back-test and the model-fit step. Keep the production dashboard on whatever your BI tool is, but the modeling work belongs in a notebook where you can show the fitted curve next to the observed one, the confidence interval, and the back-test history.

Checklist

Before LTV lands in front of investors or a finance team:

  • Retention curve is fit on log-log; day 0 excluded; zero-retention days excluded
  • ARPDAU computed on the same cohort and window as retention
  • Cohort window has a lower bound (cohorts have time to mature for the horizon)
  • Horizon is named explicitly (LTV-90, LTV-180, LTV-365 — not “lifetime”)
  • Stratified by at least one cohort dimension (channel, geography, or plan)
  • Power fit is sanity-checked against observed retention at days 30, 60, 90
  • Back-test from at least one prior horizon is in the appendix
  • Revenue definition matches the CAC definition (contribution margin vs gross)

Further reading

The retention curve is the input to this whole analysis; if your retention numbers are wrong, the LTV is wrong by a multiple. The cohort dimension you stratify on is the second-biggest lever. Both are covered in detail:

Keep reading

Enjoyed this article?

Get weekly data strategy insights delivered to your inbox.

Get in Touch

Let's Discuss Your Project

Book a 30-minute discovery call. We'll assess your data maturity and recommend the right approach — no strings attached.

Book a Discovery Call →
Need help with your data strategy? Book a Discovery Call →