Geostatistics: Spatial Statistics for Reservoir Modeling

What Is Geostatistics?

Geostatistics (also called spatial statistics or stochastic reservoir modeling) is a branch of statistics adapted for data that exhibits spatial correlation — meaning that nearby measurements tend to be more similar than distant ones. Developed originally by Georges Matheron and Danie Krige for the South African gold mining industry in the 1960s, geostatistics uses tools such as the variogram, kriging interpolation, and stochastic simulation to characterize and model spatial variability in geological properties including porosity, permeability, facies architecture, and fluid saturation between well control points. It forms the mathematical foundation for most modern reservoir modeling software and provides the quantitative framework for uncertainty quantification in subsurface characterization.

Key Takeaways

  • The variogram quantifies how spatial correlation between property values changes with distance and direction, providing the statistical structure that drives all kriging and simulation methods.
  • Kriging produces a best linear unbiased estimate (BLUE) of a property at unsampled locations, honoring all data points exactly but producing smooth maps that underrepresent natural variability.
  • Stochastic simulation methods (SGS, SIS) reproduce the full variability of the property by adding random noise constrained by the variogram, generating multiple equally probable realizations of the reservoir.
  • Multiple realizations from stochastic simulation enable probabilistic uncertainty quantification of reserves, recovery factors, and development economics.
  • Co-kriging and collocated co-simulation integrate secondary data (commonly 3D seismic attributes) with sparse well data to constrain reservoir models between wells.

How Geostatistics Works

The geostatistical workflow begins with computing the experimental variogram from available data (well logs, core measurements, or interpreted seismic). The variogram measures the average squared difference between data values separated by a given lag distance and direction: gamma(h) = [1/(2N)] * sum[(z(x) - z(x+h))^2]. As lag distance h increases, gamma(h) typically rises from near zero (closely spaced points are similar) to a plateau called the sill, which equals the variance of the data. The distance at which the variogram reaches the sill is the range — the maximum distance over which spatial correlation exists. A non-zero value at zero lag distance is the nugget, representing measurement error or variability at scales smaller than the sampling interval. The experimental variogram is then fitted with a model (spherical, exponential, Gaussian, or power) that can be evaluated at any lag distance and direction.

With the variogram model defined, kriging uses it to estimate the unknown value at any unsampled location as a weighted average of neighboring data points. The weights are chosen to minimize estimation variance given the spatial correlation structure described by the variogram. Ordinary kriging (the most common form) constrains the weights to sum to one, making it robust when the mean of the property is unknown. Simple kriging uses a known stationary mean. Universal kriging adds a drift component for properties with spatial trends. All kriging estimators are exact interpolators: they reproduce the measured values at data locations exactly. The limitation of kriging is that it produces smooth maps that honor the data but suppress the geological heterogeneity that is critical for flow simulation.

Fast Facts: Geostatistics
  • Originator: Danie Krige and Georges Matheron, 1960s gold mining applications
  • Primary tool: variogram — measures spatial correlation structure
  • Kriging output: smooth best-estimate map honoring data exactly
  • Simulation output: multiple rough realizations reproducing full variability
  • Most common simulation method: sequential Gaussian simulation (SGS)
  • Secondary data integration: collocated co-simulation with seismic attributes
  • Uncertainty quantification: P10/P50/P90 from ensemble of realizations
  • Common reservoir modeling software: Petrel, RMS, IRAP RMS, Isatis
Field Tip:

Always check whether the experimental variogram has enough data pairs at each lag. A variogram computed from fewer than 30 pairs per lag is unreliable. If well spacing is wide relative to correlation length, the variogram cannot be estimated from wells alone — consider using variograms derived from analog outcrops or seismic attribute data at well locations to constrain the model.

Kriging Versus Stochastic Simulation

The choice between kriging and simulation depends on the end use. Kriging minimizes the local estimation error and is appropriate when you need the single best estimate of a property — for example, a porosity map for volumetric calculations where the spatial average matters more than local variability. However, kriging smooths the property: extreme high and low values are dampened, and the variance of the kriged map is lower than the variance of the original data. For flow simulation, this smoothing is problematic because it underestimates the effect of permeability heterogeneity on fluid displacement efficiency.

Stochastic simulation methods address the smoothing problem by adding back the variance that kriging removes. Sequential Gaussian simulation (SGS) visits each grid node in a random sequence, estimates the local distribution using kriging (mean and variance), and draws a random value from that distribution constrained to honor previously simulated nearby values. The result reproduces the variogram and histogram of the original data. Sequential indicator simulation (SIS) works similarly but operates on indicator-transformed data, making it appropriate for categorical variables such as facies or rock type. Running 50 to 200 realizations produces an ensemble from which P10, P50, and P90 outcomes can be computed for any production forecast metric.

Integrating Seismic Data Through Co-Simulation

Wells provide precise measurements of reservoir properties but are sparsely spaced, typically hundreds of meters to several kilometers apart. Seismic data provides dense spatial coverage but only indirectly relates to reservoir properties such as porosity. Collocated co-kriging and collocated co-simulation use the statistical relationship between the primary variable (e.g., porosity from wells) and a secondary variable (e.g., seismic acoustic impedance) established at well locations to constrain the reservoir model between wells. The seismic attribute is sampled at every grid node, and its correlation coefficient with the primary variable acts as a soft constraint that steers the simulation toward high-porosity values where impedance is low (or high, depending on the fluid and rock physics relationship). This integration is one of the most valuable applications of geostatistics in exploration and development, dramatically improving the geological realism of reservoir models in data-sparse areas.

Upscaling and Uncertainty in Reservoir Simulation

Geological models built through geostatistical simulation typically contain millions to tens of millions of grid cells at fine resolution (1–5 m vertically, 25–50 m laterally). Dynamic reservoir simulators cannot practically run at this resolution; models must be upscaled to coarser simulation grids of tens of thousands to a few hundred thousand cells. Upscaling averages porosity arithmetically and permeability using power-law averaging or tensor methods that account for flow direction. Each stochastic realization produces a different upscaled simulation model, enabling uncertainty propagation from the geological model through to production forecasts. This workflow — geological model to upscaled simulation grid to production history matching to forecast — is the standard approach for development planning and reserves booking in major operating companies.

Geostatistics is also referred to as:

  • Spatial statistics — the broader academic discipline from which petroleum geostatistics draws its methods
  • Stochastic reservoir modeling — emphasizes the simulation (probabilistic) aspect rather than kriging estimation
  • Probabilistic reservoir characterization — used in reserves reporting contexts to describe the uncertainty quantification output
  • Kriging — often used loosely to mean all geostatistical methods, though technically kriging refers only to the estimation (not simulation) branch

Related terms: reservoir modeling, variogram, porosity, permeability, seismic attribute

Frequently Asked Questions About Geostatistics

What is the difference between kriging and inverse distance weighting?

Both kriging and inverse distance weighting (IDW) are interpolation methods that estimate values at unsampled locations as weighted averages of nearby data. IDW assigns weights based simply on the inverse of distance (or distance squared), with no regard for the spatial correlation structure of the data. Kriging assigns weights based on the variogram, which encodes the actual spatial continuity of the property being interpolated. Kriging also provides an estimation variance (kriging standard deviation) at every location, giving a measure of uncertainty that IDW cannot provide. Kriging will outperform IDW whenever the variogram has been reliably estimated and the data honors the assumptions of stationarity.

How many realizations are needed for reliable uncertainty quantification?

The required number of realizations depends on the metric being estimated. For P10/P50/P90 of a smoothly varying property like STOIIP (stock tank oil initially in place), 50–100 realizations are usually sufficient for stable statistics. For metrics sensitive to extreme realizations — such as breakthrough time in a waterflood, which depends on the existence of connected high-permeability pathways — 200 or more realizations may be needed to sample the tails of the distribution adequately. Running computational experiments (plotting the convergence of P10 and P90 as realization count increases) is the rigorous way to determine when enough realizations have been generated.

Can geostatistics be applied to fractured reservoirs?

Yes, but fractured reservoirs require specialized geostatistical methods because fractures are geometrical objects (planes), not continuous scalar fields. Discrete fracture network (DFN) modeling uses stochastic simulation to generate populations of fracture sets conditioned to borehole image log observations, seismic attribute maps, and structural geology interpretations. The variogram still plays a role in characterizing the spatial variation of fracture density or aperture along each set, but the simulation algorithms differ from those used for continuous properties like porosity.

Why Geostatistics Matters in Oil and Gas

Geostatistics is the quantitative backbone of modern reservoir characterization, providing the only rigorous framework for interpolating sparse well data into full 3D reservoir models and for quantifying the uncertainty in those models. In an industry where development decisions worth hundreds of millions of dollars hinge on subsurface predictions, the ability to generate probabilistic reserves estimates, test sensitivity of production forecasts to geological uncertainty, and rank development scenarios is indispensable. Every major reservoir modeling workflow today relies on geostatistical methods at its core.