Pricing optimisation is a game of elasticity, not prediction

Pricing is not a prediction problem. It is an elasticity problem, and these are not the same thing. The shape of the failure: a team builds a beautiful demand-prediction model conditioned on customer segment, time of year, inventory, and competitor pricing. The historical fit is excellent. Revenue goes down once the business starts setting prices off it.

A demand prediction model tells you what volume you will sell at a given price, based on historical patterns of what you sold at that price. This is not what a pricing system needs to know. A pricing system needs to know what volume you would sell at a price you haven't tried yet, which is a counterfactual question, not a predictive one. Confusing these two questions is the single most common failure mode in enterprise pricing projects.

Three versions of the mistake

The mistake tends to show up in three specific forms.

The first is treating pricing as a supervised learning problem where the target is volume. A model trained this way learns the correlation between price and volume in the historical data. The trouble is that the historical price was chosen by someone (a product manager, a pricing committee, a competitive-match rule), and that choice is correlated with the demand conditions of the moment. You charged a lower price when inventory was high; the model sees the correlation between low price and high inventory and 'learns' that low prices cause high inventory. The correct direction of causation is the reverse. A prediction model based on this data is a model of your existing pricing policy, not a model of the market.
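A small simulation can make this confounding concrete. The sketch below is illustrative only: the market, the true elasticity of -1.5, and the `policy_strength` knob (how strongly a hypothetical pricing committee raises price when demand is strong) are all assumptions made up for the example. It fits the same one-variable regression of log volume on log price twice, once on randomly set prices and once on prices chosen by the policy.

```python
import random
import statistics

def ols_slope(x, y):
    """Slope of a one-variable least-squares fit of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate(n, policy_strength, seed=0):
    """Return (log_price, log_volume) lists from a hypothetical market.

    The true elasticity is fixed at -1.5. `policy_strength` controls how
    strongly price follows the unobserved demand conditions of the moment.
    """
    rng = random.Random(seed)
    log_price, log_volume = [], []
    for _ in range(n):
        demand_shock = rng.gauss(0, 1)  # unobserved demand conditions
        p = policy_strength * demand_shock + rng.gauss(0, 0.3)
        v = -1.5 * p + demand_shock + rng.gauss(0, 0.1)
        log_price.append(p)
        log_volume.append(v)
    return log_price, log_volume

# Prices set independently of demand: the regression recovers the elasticity.
randomised = ols_slope(*simulate(20_000, policy_strength=0.0))
# Prices set by a policy that reacts to demand: the same regression is biased.
confounded = ols_slope(*simulate(20_000, policy_strength=0.5))

print(f"randomised prices: {randomised:.2f}")
print(f"policy-set prices: {confounded:.2f}")
```

Under these assumptions the policy-set slope lands far from -1.5, even though the demand curve is identical in both runs. The only thing that changed is how the price was chosen.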

The second is ignoring self-selection in the historical data. If you only ever charged the premium price in the premium segment, the data will tell you that premium customers tolerate premium prices, but that's not a useful claim, because you never tested what they'd do at a discount price. The counterfactual you need is exactly the one you never ran. A pricing model trained without thinking about this tends to recommend more of what you already do, because the data only describes what you already do.
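One cheap check before any modelling is to tabulate which prices each segment has actually been charged. The helper names and the toy rows below are hypothetical; the point is only that a candidate price outside a segment's observed range is a question the data cannot answer.

```python
from collections import defaultdict

def price_support(rows):
    """Map each segment to (min price, max price, number of distinct prices)."""
    seen = defaultdict(set)
    for segment, price in rows:
        seen[segment].add(price)
    return {s: (min(p), max(p), len(p)) for s, p in seen.items()}

def untested(rows, segment, candidate_price):
    """True if candidate_price lies outside every price this segment was charged."""
    low, high, _ = price_support(rows)[segment]
    return not (low <= candidate_price <= high)

# Hypothetical history: premium customers only ever saw one price.
history = [
    ("premium", 120), ("premium", 120),
    ("standard", 80), ("standard", 90), ("standard", 100),
]

print(price_support(history))
print(untested(history, "premium", 100))   # discounting premium was never tried
print(untested(history, "standard", 85))   # inside the range already explored
```

A segment with a single distinct price has no price variation at all, so any elasticity a model reports for it is borrowed from other segments or invented.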

The third is failing to distinguish between-customer and within-customer elasticity. Across customers, lower prices tend to correlate with higher-volume buyers, because volume buyers are price-sensitive. Within a customer, lower prices tend to cause higher volumes, because the same customer buys more when it's cheaper. These two effects can look identical in aggregated data, and they have different implications for pricing. A model that uses only cross-sectional variation will recommend price cuts to customers who already buy a lot, which often destroys margin without gaining volume. Within-customer variation, where you have it, is the signal you actually want.
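The gap between the two elasticities can be sketched with simulated panel data. Everything here is assumed for illustration: 500 hypothetical customers, a true within-customer elasticity of -0.5, and a rule that larger customers are usually charged less. The pooled regression mixes both effects; subtracting each customer's own averages first (the "within" or fixed-effects transformation) isolates the second.

```python
import random
import statistics

def ols_slope(x, y):
    """Slope of a one-variable least-squares fit of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def demean_by_group(groups, values):
    """Subtract each group's mean from its members (the within transformation)."""
    totals, counts = {}, {}
    for g, v in zip(groups, values):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for g, v in zip(groups, values)]

TRUE_WITHIN = -0.5  # assumed within-customer elasticity
rng = random.Random(1)
customer, log_price, log_volume = [], [], []
for cid in range(500):
    baseline = rng.gauss(0, 1)                          # customer size
    usual_price = -0.8 * baseline + rng.gauss(0, 0.1)   # big buyers pay less
    for _ in range(12):                                 # twelve periods each
        p = usual_price + rng.gauss(0, 0.2)
        v = baseline + TRUE_WITHIN * p + rng.gauss(0, 0.1)
        customer.append(cid)
        log_price.append(p)
        log_volume.append(v)

pooled = ols_slope(log_price, log_volume)
within = ols_slope(
    demean_by_group(customer, log_price),
    demean_by_group(customer, log_volume),
)

print(f"pooled slope: {pooled:.2f}")
print(f"within slope: {within:.2f}")
```

In this setup the pooled slope is much steeper than -0.5, which is exactly the reading that would justify deep discounts for customers who already buy a lot.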

The right approach starts with experimental design, not better models. Before you trust any elasticity estimate, you want to have run some controlled price variation and measured the response. This is uncomfortable; it feels like leaving money on the table to deliberately charge the wrong price for two weeks. In practice it's the only way to know, and the information gained is worth many multiples of the short-term revenue cost. Start with a small segment, a bounded range of prices, and a clean experimental design. Once you have two or three such experiments, you can start to estimate elasticity in a way that isn't just fitting noise.
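Reading out such an experiment does not need much machinery. The sketch below assumes customers were randomly assigned to one of two price arms and computes a midpoint (arc) elasticity with a percentile bootstrap interval. The prices and volumes are made-up numbers, and both function names are hypothetical.

```python
import random
import statistics

def arc_elasticity(price_a, volumes_a, price_b, volumes_b):
    """Midpoint (arc) elasticity between two randomised price arms."""
    qa, qb = statistics.fmean(volumes_a), statistics.fmean(volumes_b)
    pct_q = (qb - qa) / ((qa + qb) / 2)
    pct_p = (price_b - price_a) / ((price_a + price_b) / 2)
    return pct_q / pct_p

def bootstrap_interval(price_a, volumes_a, price_b, volumes_b, draws=2000, seed=0):
    """Percentile bootstrap 95% interval, resampling customers within each arm."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(draws):
        a = rng.choices(volumes_a, k=len(volumes_a))
        b = rng.choices(volumes_b, k=len(volumes_b))
        estimates.append(arc_elasticity(price_a, a, price_b, b))
    estimates.sort()
    return estimates[int(0.025 * draws)], estimates[int(0.975 * draws) - 1]

# Hypothetical two-week test: per-customer volumes at two price points.
control = [10, 12, 9, 11, 10, 13, 8, 11]   # charged 100
variant = [9, 10, 8, 9, 10, 11, 7, 8]      # charged 110

point = arc_elasticity(100, control, 110, variant)
low, high = bootstrap_interval(100, control, 110, variant)
print(f"elasticity {point:.2f}, 95% interval [{low:.2f}, {high:.2f}]")
```

With only eight customers per arm the interval is wide, which is itself useful: it says how much larger the next experiment needs to be before the estimate is worth acting on.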

The modelling that follows should be elasticity-aware, not prediction-aware. Counterfactual estimators, propensity-score weighting, doubly-robust estimation, and uplift models are built for this. They're less flashy than a gradient-boosted forecaster and they underperform it on the purely predictive task. That's the point. Predictive performance is not what you need. You need an answer to 'how would volume change if I moved price from X to Y', and the flashy predictor cannot answer that question.
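As one illustration of the family, here is a minimal doubly-robust (AIPW) sketch for a binary price treatment with a single discrete confounder. It is deliberately simplified: both nuisance models are plain per-stratum averages, the inventory flag and the true effect of +3 units are assumptions of the simulation, and a real project would use fitted models and check overlap first.

```python
import random
import statistics
from collections import defaultdict

def aipw_effect(rows):
    """Doubly-robust (AIPW) estimate of E[Y(1)] - E[Y(0)].

    rows: list of (stratum, treated, outcome) with treated in {0, 1}.
    Propensity is the treated share within the stratum; the outcome model
    is the arm mean within the stratum. Every stratum must contain both arms.
    """
    count = defaultdict(lambda: [0, 0])       # stratum -> [n control, n treated]
    total = defaultdict(lambda: [0.0, 0.0])   # stratum -> [sum control, sum treated]
    for s, t, y in rows:
        count[s][t] += 1
        total[s][t] += y
    scores = []
    for s, t, y in rows:
        e = count[s][1] / (count[s][0] + count[s][1])   # propensity
        m0 = total[s][0] / count[s][0]                  # predicted outcome, control
        m1 = total[s][1] / count[s][1]                  # predicted outcome, treated
        scores.append(m1 - m0 + t * (y - m1) / e - (1 - t) * (y - m0) / (1 - e))
    return sum(scores) / len(scores)

# Hypothetical history: discounts were mostly given when inventory was high,
# and high-inventory periods sold 8 fewer units regardless of price.
rng = random.Random(0)
rows = []
for _ in range(10_000):
    high_inventory = int(rng.random() < 0.5)
    discounted = int(rng.random() < (0.8 if high_inventory else 0.2))
    volume = 20 - 8 * high_inventory + 3 * discounted + rng.gauss(0, 2)
    rows.append((high_inventory, discounted, volume))

naive = (statistics.fmean(y for _, t, y in rows if t == 1)
         - statistics.fmean(y for _, t, y in rows if t == 0))
adjusted = aipw_effect(rows)
print(f"naive difference: {naive:.2f}")
print(f"AIPW estimate:    {adjusted:.2f}")
```

In this simulation the naive comparison gets the sign wrong, suggesting the discount lost volume, while the adjusted estimate sits close to the assumed +3.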

The last piece is knowing what you don't know. A confident recommendation to raise prices by fifteen percent in a segment where you've never tested above ten is a confident extrapolation, and should be treated with the suspicion any extrapolation deserves. The most profitable pricing systems are the ones that flag their own uncertainty; a pricing recommendation with honest confidence intervals is more valuable than a narrow recommendation that's wrong. The model should know, and communicate, that it doesn't really know what happens beyond the range of the training data.
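Even a plain least-squares fit can report how its uncertainty grows away from the data. The sketch below uses the textbook standard error of the fitted mean on made-up numbers: price increases of 0 to 10 percent were tested, and the fit is then queried at 5 and at 15 percent. The function name and the data are hypothetical.

```python
import math
import statistics

def fit_with_uncertainty(x, y):
    """Fit a least-squares line; return a function giving (prediction, standard error)."""
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    resid_var = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y)) / (n - 2)

    def predict(x0):
        # Standard error of the fitted mean: grows with distance from the data centre.
        se = math.sqrt(resid_var * (1 / n + (x0 - mx) ** 2 / sxx))
        return intercept + slope * x0, se

    return predict

# Hypothetical tests: price increase in percent vs. weekly volume.
increase_pct = [0, 2, 4, 6, 8, 10]
volume = [100, 97, 95, 91, 90, 86]

predict = fit_with_uncertainty(increase_pct, volume)
inside, se_inside = predict(5)     # within the tested range
outside, se_outside = predict(15)  # beyond anything tested
print(f"at  5%: {inside:.1f} +/- {se_inside:.2f}")
print(f"at 15%: {outside:.1f} +/- {se_outside:.2f}")
```

The standard error at 15 percent is roughly three times the one at 5 percent, and that is the optimistic version: it assumes the straight line keeps holding outside the tested range, which is the one thing the data cannot confirm.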

Pricing is a causal inference problem dressed as a prediction problem. Teams that get this right tend to have data scientists who trained in econometrics before they trained in deep learning. Teams that get it wrong tend to have forecasters doing a job that calls for experimentalists.

// The artefact
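# pricing/elasticity.py: IPW counterfactual, not a prediction
import pandas as pd
from sklearn.linear_model import LogisticRegression

# FEATURES (the confounder column names) is assumed to be defined elsewhere in this module.
def elasticity(df: pd.DataFrame, treatment: str, outcome: str) -> float:
    t = df[treatment]  # binary 0/1 indicator for the alternative price arm
    propensity = LogisticRegression().fit(df[FEATURES], t).predict_proba(df[FEATURES])[:, 1]
    weights = t / propensity + (1 - t) / (1 - propensity)
    # Normalise each arm by its own weight total, not by the pooled total.
    treated_mean = (df[outcome] * t * weights).sum() / (t * weights).sum()
    control_mean = (df[outcome] * (1 - t) * weights).sum() / ((1 - t) * weights).sum()
    return (treated_mean - control_mean) / control_mean  # relative change in outcome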

Inverse-propensity weighting: estimate what customers would have done at the price arm they were not given.