Regression Coefficient: R-squared, Decline Curve Fits, and Type-Curve Calibration in WCSB Reservoirs

The regression coefficient, most commonly reported as the coefficient of determination R² (or as the correlation coefficient r in linear contexts), is a dimensionless statistic that quantifies how well a fitted curve or surface explains the variance in an observed data set. A value of 0 means the fit explains none of the variance, so the model does no better than the mean of the observations. A value of 1 means the model accounts for 100 percent of the observed variance (a perfect fit). Negative R² values arise when the model performs worse than simply taking the mean of the observations, which can happen in nonlinear fits or in linear fits forced through the origin, and they usually indicate that the chosen model is structurally wrong.

For petroleum engineers and geoscientists, regression coefficients appear in nearly every interpretation workflow: in decline-curve analysis they grade the fit of an Arps hyperbolic or modified-hyperbolic model to a producing well's monthly volumes; in pressure transient testing they validate semilog and log-log derivative matches; in routine core analysis they tie porosity to permeability through the Kozeny-Carman framework; in 4D seismic reservoir monitoring they quantify the correlation between modeled and observed time-strain attributes; and in petrophysical workflows they calibrate Archie cementation and saturation exponents to lab-measured Sw on plug samples from formations including the Montney, Duvernay, and Cardium.

A high R² is not by itself proof of a correct model. Small data sets, autocorrelated time series (typical of monthly production), and overfitting through too many free parameters can all push R² toward 1 without any underlying physical meaning.

AER Directive 058 reserves submissions and CSA NI 51-101 reporting both expect the regression statistics underpinning a type curve or decline forecast to be documented and reproducible, and a competent person's report (CPR) for a public WCSB asset will typically reject a decline fit with R² below approximately 0.85 on a normalized monthly oil rate, depending on the noise characteristics of the well. Conversely, an R² of 0.999 on only 4 to 6 data points should be treated with extreme suspicion because the model has too many free parameters relative to the number of observations. Operators such as Canadian Natural Resources Limited and Tourmaline Oil publish type curves for Montney and Duvernay developments where R² values across hundreds of wells are reported alongside P10, P50, and P90 distributions to defend the central tendency forecast.
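
The three reference values of R² described in this introduction (1, 0, and negative) can be checked numerically. The following is a minimal Python sketch; the helper name r_squared and the five-point data set are illustrative assumptions, not field data.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - RSS/TSS."""
    mean_obs = sum(observed) / len(observed)
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))  # residual sum of squares
    tss = sum((o - mean_obs) ** 2 for o in observed)              # total sum of squares
    return 1.0 - rss / tss


# Hypothetical observations, chosen only to make the arithmetic easy to follow.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]

perfect = list(observed)          # model reproduces every point
mean_only = [3.0] * 5             # model predicts the mean everywhere
reversed_trend = observed[::-1]   # model has the trend backwards

r2_perfect = r_squared(observed, perfect)      # 1.0
r2_mean = r_squared(observed, mean_only)       # 0.0
r2_bad = r_squared(observed, reversed_trend)   # -3.0 (RSS = 40, TSS = 10)
```

The last case shows why a negative value is a warning about model form rather than about noise: the residuals are larger than the spread of the data around its own mean.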

Key Takeaways

  • R-squared definition: R² is 1 minus the ratio of residual sum of squares (RSS) to total sum of squares (TSS), so it measures the fraction of variance the model explains. R² near 1 indicates a tight fit and R² near 0 indicates the model is no better than the mean. A negative R², which can occur in nonlinear fits, means the model is worse than the mean and should be rejected outright.
  • Not a goodness test on its own: A high R² does not prove the model is correct, only that it tracks the observed data. Overfitting (too many parameters), autocorrelation in time series production data, and small sample sizes can all inflate R² without physical meaning. Adjusted R², AIC, BIC, and visual residual plots are needed to confirm the fit is honest.
  • WCSB type-curve standard: AER Directive 058 reserves filings and NI 51-101 competent-persons reports require documented regression statistics for declines and type curves. Industry practice in the Montney and Duvernay rejects decline fits with R² below roughly 0.85 on normalized monthly oil rate unless the noise structure of the data justifies a lower threshold.
  • Petrophysical calibration: Archie's saturation equation parameters m (cementation exponent) and n (saturation exponent) are calibrated from core porosity, formation resistivity, and lab Sw measurements via linear regression on log-log plots. R² above 0.90 across at least 15 plug samples is the typical bar for using core-derived m and n in field-wide Sw calculations on Cardium, Viking, or Bakken wells.
  • Reserves-report rejection threshold: Reserves auditors (Sproule, GLJ, McDaniel, Deloitte Reserves) routinely return a CPR for revision if a published type curve includes wells with R² below 0.70 on the hyperbolic match, or if forecast uncertainty bands exclude the observed variance. The regression statistic, in other words, is not just an internal QA metric; it is a contractual deliverable in WCSB reserves disclosure.
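
The statistics named in the takeaways above can be written out as follows. This is a sketch under common least-squares conventions, not a form taken from any regulatory text: n is the number of observations, p is the number of fitted parameters (for example, p = 3 for an Arps hyperbolic with qi, Di, and b), and the AIC and BIC expressions assume independent Gaussian errors and drop additive constants that are the same for every model fitted to the same data.

```latex
R^2 = 1 - \frac{\mathrm{RSS}}{\mathrm{TSS}}, \qquad
\mathrm{RSS} = \sum_{i=1}^{n} \left(q_i - \hat{q}_i\right)^2, \qquad
\mathrm{TSS} = \sum_{i=1}^{n} \left(q_i - \bar{q}\right)^2

R^2_{\mathrm{adj}} = 1 - \frac{\mathrm{RSS}/(n - p)}{\mathrm{TSS}/(n - 1)}

\mathrm{AIC} = n \ln\!\left(\frac{\mathrm{RSS}}{n}\right) + 2p, \qquad
\mathrm{BIC} = n \ln\!\left(\frac{\mathrm{RSS}}{n}\right) + p \ln n
```

Here q_i is the observed value, \hat{q}_i the fitted value, and \bar{q} the mean of the observations. Unlike R², the adjusted R², AIC, and BIC all penalize additional parameters, which is why they help expose an overfit that R² alone would reward. The Gaussian-error assumption is weakened by the autocorrelation noted above, so these values are best read comparatively rather than as absolute tests.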

Computing R-squared on a Decline Curve

For a typical Montney horizontal well producing for 36 months, the engineer fits an Arps hyperbolic with initial rate qi, initial decline rate Di, and b-exponent (usually 0.8 to 1.2 for tight unconventional gas). RSS is the sum of squared differences between observed monthly rates and the fitted rate at each month; TSS is the sum of squared deviations of the observed rates from their mean. R² equals 1 minus RSS over TSS. A clean Montney well typically yields R² values of 0.92 to 0.97 over 36 months; a well with mid-life shut-ins or refrac interventions may drop to 0.70 to 0.80, prompting segmentation of the forecast into pre- and post-event periods rather than forcing a single hyperbolic across the event.
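
The calculation described above can be sketched in Python. This is a minimal illustration rather than a production workflow: the data are synthetic (not from any real well), Di is expressed per month, and the fit uses a plain grid search over b and Di with a closed-form least-squares qi, standing in for the nonlinear regression routine an engineer would normally use. All function and variable names are hypothetical.

```python
def arps_hyperbolic(t, qi, di, b):
    """Arps hyperbolic rate q(t) = qi / (1 + b*Di*t)^(1/b); t in months, Di per month."""
    return qi / (1.0 + b * di * t) ** (1.0 / b)


def r_squared(observed, predicted):
    """R-squared = 1 - RSS/TSS."""
    mean_obs = sum(observed) / len(observed)
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    tss = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - rss / tss


def fit_arps(times, rates, b_grid, di_grid):
    """Grid-search b and Di; for each pair, qi has a closed-form least-squares value."""
    best = None
    for b in b_grid:
        for di in di_grid:
            shape = [arps_hyperbolic(t, 1.0, di, b) for t in times]  # curve with qi = 1
            qi = sum(q * g for q, g in zip(rates, shape)) / sum(g * g for g in shape)
            rss = sum((q - qi * g) ** 2 for q, g in zip(rates, shape))
            if best is None or rss < best[0]:
                best = (rss, qi, di, b)
    return best[1], best[2], best[3]


# Synthetic 36-month history (NOT field data): qi = 1000, Di = 0.12/month, b = 1.0,
# with a deterministic +/-3% alternating perturbation standing in for noise.
months = list(range(36))
rates = [arps_hyperbolic(t, 1000.0, 0.12, 1.0) * (1.0 + 0.03 * (-1) ** t) for t in months]

b_grid = [round(0.5 + 0.05 * k, 2) for k in range(27)]   # 0.50 to 1.80
di_grid = [round(0.01 * k, 2) for k in range(1, 51)]     # 0.01 to 0.50 per month

qi_fit, di_fit, b_fit = fit_arps(months, rates, b_grid, di_grid)
fitted = [arps_hyperbolic(t, qi_fit, di_fit, b_fit) for t in months]
r2 = r_squared(rates, fitted)
print(f"qi={qi_fit:.0f}, Di={di_fit:.2f}/month, b={b_fit:.2f}, R2={r2:.4f}")
```

Because the synthetic noise here is small and regular, the resulting R² will be higher than the 0.92 to 0.97 range quoted for real wells. A well with shut-ins would be handled by running the same fit separately on the pre- and post-event months.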

Where High R-squared Misleads

The most common WCSB pitfall is fitting an Arps hyperbolic to only 6 to 9 months of early-time data, where the curvature of the decline is poorly constrained and R² of 0.99 is trivial to achieve with any b between 0.5 and 1.8. The forecast tail (months 60 to 240) is then determined almost entirely by the choice of b, not by the data. Sproule and GLJ both publish guidance that fewer than 12 months of production warrants a probabilistic forecast using offset analog type curves rather than a single hyperbolic regression, regardless of the in-sample R² value.
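
The effect can be reproduced with a small experiment. The sketch below is illustrative only: it generates nine months of noise-free synthetic rates from a known hyperbolic (qi = 1000, Di = 0.10 per month, b = 1.0, all assumed values), then refits the data with b held at 0.5, 1.0, and 1.8 in turn. For each fixed b, Di is found by grid search and qi by closed-form least squares. All names are hypothetical.

```python
def arps_hyperbolic(t, qi, di, b):
    """Arps hyperbolic rate; t in months, Di per month."""
    return qi / (1.0 + b * di * t) ** (1.0 / b)


def r_squared(observed, predicted):
    """R-squared = 1 - RSS/TSS."""
    mean_obs = sum(observed) / len(observed)
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    tss = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - rss / tss


def fit_fixed_b(times, rates, b, di_grid):
    """With b fixed, grid-search Di and solve qi by least squares."""
    best = None
    for di in di_grid:
        shape = [arps_hyperbolic(t, 1.0, di, b) for t in times]
        qi = sum(q * g for q, g in zip(rates, shape)) / sum(g * g for g in shape)
        rss = sum((q - qi * g) ** 2 for q, g in zip(rates, shape))
        if best is None or rss < best[0]:
            best = (rss, qi, di)
    return best[1], best[2]


# Nine months of synthetic, noise-free history (NOT field data).
months = list(range(1, 10))
observed = [arps_hyperbolic(t, 1000.0, 0.10, 1.0) for t in months]

di_grid = [0.001 * k for k in range(1, 1001)]   # 0.001 to 1.0 per month

results = {}
for b in (0.5, 1.0, 1.8):
    qi, di = fit_fixed_b(months, observed, b, di_grid)
    fitted = [arps_hyperbolic(t, qi, di, b) for t in months]
    results[b] = {
        "r2": r_squared(observed, fitted),                # in-sample fit quality
        "q_month_240": arps_hyperbolic(240, qi, di, b),   # forecast tail rate
    }

for b, out in results.items():
    print(f"b={b}: R2={out['r2']:.4f}, rate at month 240={out['q_month_240']:.1f}")
```

All three fits match the nine in-sample months closely, while the rates projected to month 240 differ by a large multiple. The in-sample R² therefore says little about which tail is correct.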

Fast Facts

The coefficient of determination was introduced by American geneticist and statistician Sewall Wright in 1921, the same year as his foundational paper on path analysis. R² did not enter petroleum reserves practice until the 1956 publication of J. J. Arps's "Estimation of Primary Oil Reserves" in JPT, where Arps used early forms of least-squares regression to fit his hyperbolic and exponential decline equations to vertical conventional Texas wells. The same regression framework, roughly seven decades later, still grades every Montney and Duvernay type curve filed with the AER.

Regression coefficients are the grading metric for any decline curve analysis forecast, where the Arps hyperbolic, exponential, and modified-hyperbolic models are fit to historical rate data. They underpin the calibration of Archie's equation parameters from core data, anchor the type curve deliverables in reserves disclosures, and grade the success of pressure transient analysis matches on semilog and log-log derivative plots. Each of these workflows reports an R² as the headline goodness statistic, but in every case the visual residual pattern and the physical reasonableness of the parameters must be checked alongside the number.
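
The Archie calibration mentioned above can be illustrated for the cementation exponent m. The sketch assumes the standard formation-factor relation F = a / φ^m, which becomes a straight line on a log-log plot: log F = log a - m log φ. The 15 plug values are synthetic (generated with a = 1 and m = 2 plus a small deterministic perturbation), and the tortuosity factor a and all names are assumptions introduced for illustration.

```python
import math


def linear_fit(x, y):
    """Ordinary least squares y = intercept + slope*x, returning slope, intercept, R-squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - rss / tss


# Synthetic plug data (NOT lab measurements): 15 porosities from 0.06 to 0.20,
# formation factor from F = 1 / phi^2 with a +/-2% alternating perturbation.
porosity = [round(0.06 + 0.01 * i, 2) for i in range(15)]
formation_factor = [
    phi ** -2.0 * (1.0 + 0.02 * (-1) ** i) for i, phi in enumerate(porosity)
]

log_phi = [math.log10(phi) for phi in porosity]
log_f = [math.log10(f) for f in formation_factor]

slope, intercept, r2 = linear_fit(log_phi, log_f)
m_fit = -slope            # cementation exponent is the negative of the log-log slope
a_fit = 10 ** intercept   # tortuosity factor from the intercept

print(f"m={m_fit:.3f}, a={a_fit:.3f}, R2={r2:.4f}")
```

The saturation exponent n can be handled the same way by regressing the log of the resistivity index against the log of Sw. Note that the R² here is computed in log space, so it is not directly comparable with an R² computed on untransformed rates.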

Montney Type-Curve Calibration Scenario

A reserves engineer at a mid-cap Montney operator builds a 2026 type curve across 142 horizontal wells in the Karr-Kakwa fairway for inclusion in the year-end NI 51-101 filing. She fits an Arps hyperbolic with qi = 1,200 boe/d, Di = 78%, and b = 1.05 to the normalized monthly oil rates, achieving R² = 0.91 against the 142-well P50 envelope. Sproule, the third-party auditor, reviews the submission and confirms the R² is above the 0.85 internal threshold but requests that the engineer flag the 19 wells with individual R² below 0.70 (most of which had multi-month operational outages) and present them separately. The total capital exposure on the type curve is roughly CAD $1.1 billion over the next five years of drilling.

The CPR is approved with the segmented forecast, and the documented R² statistics support the bookable proved-plus-probable (2P) reserves estimate at year-end, anchoring the operator's CAD $480 million 2027 capital budget approval at the board level.