Varimax Rotation

Varimax rotation is an orthogonal factor rotation method in multivariate statistical analysis. It rotates the coordinate axes of a principal component analysis (PCA) or factor analysis solution to achieve a "simple structure" in which each observed variable (such as a well log measurement, geochemical parameter, or reservoir attribute) loads predominantly on one or a small number of the rotated factors, rather than moderately on many factors as is typical of the unrotated principal component solution. The varimax criterion, developed by Henry Kaiser in 1958, maximizes the variance of the squared factor loadings within each factor, summed over all factors (hence "varimax": the variance of the squared loadings is maximized). This drives the rotation toward a solution in which each factor is defined by a few variables with high loadings and many variables with near-zero loadings. Such a configuration is easier to interpret geologically or geochemically than the unrotated principal components, each of which is constructed to account for the maximum possible remaining variance and is therefore influenced by many variables simultaneously. In petrophysical applications, varimax rotation is applied to log data matrices (rows representing depth samples, columns representing log measurements such as gamma ray, density, neutron, resistivity, and photoelectric factor) to identify the fundamental lithofacies, fluid, or diagenetic factors that control the log response variation. The rotated factor scores provide a simplified characterization of each depth interval that can be used for petrophysical facies mapping, reservoir zonation, and well-to-well correlation.
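
Written out, one common form of the criterion described above is sketched below. The notation is introduced here for illustration only: \ell_{ij} is the rotated loading of variable i on factor j, p is the number of variables, and k is the number of retained factors.

```latex
V \;=\; \sum_{j=1}^{k} \left[ \frac{1}{p} \sum_{i=1}^{p} \ell_{ij}^{4}
      \;-\; \left( \frac{1}{p} \sum_{i=1}^{p} \ell_{ij}^{2} \right)^{2} \right]
```

Each bracketed term is the variance of the squared loadings in column j, so V is largest when every column contains a few large loadings and many near-zero ones. Kaiser's normalized variant evaluates the same expression after dividing each row of loadings by the square root of that variable's communality, h_i^2 = \sum_j \ell_{ij}^2.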

Key Takeaways

  • The mathematical case for varimax over other orthogonal rotations (quartimax, equimax, and other members of the orthomax family) is that it simplifies each factor, that is, each column of the loading matrix, so that a factor is defined by a small set of variables with high loadings. Quartimax, by contrast, simplifies each variable (each row) and tends to produce a single general factor. For a matrix of factor loadings L (rows representing variables, columns representing factors), varimax maximizes the sum over all factors of the variance of the squared loadings within that factor, subject to the constraint that the rotation is orthogonal, so that the rotated factors remain uncorrelated. Kaiser's algorithm computes the solution iteratively by a series of pairwise plane rotations: each rotation optimizes the criterion for two factors at a time, cycling through all factor pairs until convergence. Varimax is available in standard statistical software (SAS PROC FACTOR, R factanal, MATLAB rotatefactors) and is the default rotation in R factanal and MATLAB rotatefactors; SAS PROC FACTOR applies no rotation unless one is requested. The orthogonality constraint is a limitation when the true underlying geological variables are correlated. For example, porosity and permeability are correlated in most reservoir rock types, so a varimax solution that forces them onto different orthogonal factors may produce rotated factors that do not correspond cleanly to individual physical properties. Oblique rotations (promax, direct oblimin) allow the rotated factors to be correlated and may provide a better match to correlated geological phenomena, but at the cost of greater interpretive complexity.
  • Petroleum geochemistry applications of varimax rotation include source rock facies discrimination, maturity assessment, and oil-oil and oil-source correlation, all of which involve large datasets of biomarker ratios, isotopic measurements, and bulk composition parameters (API gravity, GOR, sulfur content) from many samples. A typical oil-oil correlation study might measure 30 to 50 geochemical parameters on 100 to 500 oil samples from a basin. PCA with varimax rotation reduces this high-dimensional dataset to 3 to 8 rotated factors that can be displayed as cross-plots or ternary diagrams to distinguish sample clusters corresponding to different source rock types (marine carbonate vs. lacustrine shale), maturity levels (early oil vs. condensate), or migration histories (short-distance vs. long-distance migration from the same source). The rotated factor scores for each sample (its coordinates in the rotated factor space) are used as input to hierarchical or k-means clustering to group samples into geochemical families, and the factor score cross-plots provide visual confirmation that the proposed families are distinct in the reduced-dimension space defined by the rotation.
  • Log facies analysis using varimax rotation applies PCA and rotation to a matrix of normalized well log values to extract rotated factors that represent the uncorrelated sources of log variance in the formation. The logs are typically sampled at 0.1 to 0.5 meter depth intervals, and each curve is standardized to zero mean and unit variance before PCA to prevent high-variance measurements from dominating the first principal component. In a clastic reservoir with interbedded sandstone and shale and local carbonate cement, the rotation typically identifies a shale/clay factor (dominated by gamma ray and neutron response), a porosity factor (dominated by density-neutron separation and resistivity), and a mineralogy or fluid factor (dominated by the photoelectric factor PE and the density-neutron crossover). The factor scores at each depth level are used to classify each sample into a petrophysical facies (the lithological or flow unit that the analysis identifies). Facies boundaries correspond to significant changes in factor score and can be correlated between wells using the factor score depth profiles. This classification provides a systematic, data-driven alternative to subjective cutoff-based facies classification (where the engineer applies fixed gamma ray, density, and saturation cutoffs to define sand, shale, and net pay), and may reveal facies boundaries that are not obvious from individual log curves examined separately.
  • Seismic attribute analysis using varimax rotation addresses multicollinearity in seismic attribute datasets. Modern 3D seismic processing generates dozens of attributes (amplitude, frequency, coherence, curvature, inverted impedance, AVO gradient, instantaneous phase) that are often highly correlated because they are computed from the same underlying seismic trace data. Applying PCA with varimax rotation to the attribute matrix (rows are seismic bins, columns are attributes) extracts a smaller number of rotated factors that capture the uncorrelated sources of attribute variance, reducing the dimensionality from 20 to 40 attributes to 4 to 8 rotated factors. These factors can be cross-plotted or used as inputs to a multi-attribute neural network or self-organizing map (SOM) classifier. The resulting seismic facies classification identifies geological objects (channels, lobes, mass-transport deposits) or fluid effects (bright spots, flat spots) without explicit thresholds on any individual attribute, using the combination of attribute signatures in the rotated factor space to assign each seismic bin to a lithofacies or fluid type category.
  • Limitations of varimax rotation in reservoir characterization arise from the assumptions of PCA and of orthogonal rotation. First, relationships between variables are assumed to be linear: the correlation matrix captures pairwise linear relationships, but many geological relationships are strongly nonlinear (for example, the exponential permeability-porosity transform and the Archie saturation equation). Second, standardization weights all variables equally, which may not reflect the relative geological importance of different log measurements. Third, the orthogonality constraint forces the rotated factors to be uncorrelated even when the underlying geological variables are correlated. Alternatives to PCA/varimax for log facies analysis include independent component analysis (ICA), which seeks sources that are statistically independent rather than merely uncorrelated and is more appropriate when the geological end-members are independent mixture components such as pure sandstone and pure shale, and non-negative matrix factorization (NMF), which constrains the factor scores and loadings to be non-negative and is appropriate when the observed values are non-negative mixtures of mineral end-members, as in mineralogy derived from X-ray fluorescence logging.
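
The rotation described in the first takeaway can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the routine used by any of the packages named above: it uses an SVD-based fixed-point update in place of pairwise plane rotations, it optimizes the raw criterion without Kaiser normalization, and the function names varimax and varimax_criterion, the 5 x 2 loading matrix, and the 35 degree mixing angle are all invented for the example.

```python
import numpy as np


def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings in each column."""
    return float(np.sum(np.var(loadings ** 2, axis=0)))


def varimax(loadings, max_iter=500, tol=1e-10):
    """Rotate a (variables x factors) loading matrix toward the raw varimax optimum.

    Assumption: SVD-based fixed-point iteration, not pairwise plane rotations.
    Returns the rotated loadings and the orthogonal rotation matrix.
    """
    p, k = loadings.shape
    rotation = np.eye(k)
    d_old = 0.0
    for _ in range(max_iter):
        b = loadings @ rotation
        # Gradient-like matrix of the raw varimax criterion at the current rotation.
        grad = loadings.T @ (b ** 3 - b * (np.sum(b ** 2, axis=0) / p))
        u, s, vt = np.linalg.svd(grad)
        rotation = u @ vt          # nearest orthogonal matrix to the gradient
        d_new = s.sum()
        if d_new < d_old * (1 + tol):
            break                  # singular-value sum has stopped increasing
        d_old = d_new
    return loadings @ rotation, rotation


# Hypothetical simple-structure loadings (5 variables, 2 factors) ...
simple = np.array([
    [0.9, 0.0],
    [0.8, 0.1],
    [0.1, 0.8],
    [0.0, 0.9],
    [0.1, 0.7],
])
# ... deliberately mixed by a 35 degree rotation to mimic an unrotated solution.
theta = np.deg2rad(35.0)
mix = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
unrotated = simple @ mix

rotated, rotation = varimax(unrotated)

print("criterion before:", round(varimax_criterion(unrotated), 4))
print("criterion after: ", round(varimax_criterion(rotated), 4))
print(np.round(rotated, 2))
```

Because the rotation matrix is orthogonal, each variable's communality (the row sum of squared loadings) is unchanged; only the way the loading is distributed across factors changes. The rotated columns may come back permuted or sign-flipped relative to the simple matrix, which does not affect interpretation.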

Fast Facts

Varimax rotation was introduced by Henry F. Kaiser in his 1958 paper "The varimax criterion for analytic rotation in factor analysis" (Psychometrika, 23, 187-200). The paper derived the mathematical criterion and an iterative algorithm for computing the rotation, and demonstrated that the resulting simple structure was more easily interpretable than the unrotated solution for psychological test datasets. Kaiser's method quickly became the default rotation in factor analysis across the quantitative social and natural sciences. It was implemented in the first generation of mainframe statistical packages (such as BMDP and SAS) in the 1960s and 1970s, and it remains the most widely used factor rotation method more than 65 years later. The application of varimax rotation to petroleum geochemical datasets was systematized in the 1980s by Palacas, Daws, and colleagues at the USGS. Its application to petrophysical log analysis was developed by Gill (1993) and by Doveton (1994) in his textbook on geological log analysis, which established varimax rotation as a standard multivariate tool in the quantitative petrophysicist's toolkit alongside cross-plotting and multi-mineral log analysis.

What Is Varimax Rotation?

Varimax rotation is an orthogonal factor rotation method applied after principal component analysis (PCA) or factor analysis. It rotates the factor axes to maximize the simplicity of the factor loading pattern, so that each observed variable loads predominantly on one or two factors rather than moderately on all of them. In petroleum geochemistry, it extracts interpretable factors from biomarker and bulk composition datasets that correspond to geological phenomena (source facies, maturity, fluid type). In log facies analysis, varimax-rotated PCA identifies the fundamental petrophysical factors (clay, porosity, mineralogy) that drive log response variation, enabling systematic, data-driven facies classification and correlation between wells.
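
The log facies workflow summarized above (standardize, run PCA, rotate, compute factor scores) can be sketched as follows. Everything specific in the sketch is an assumption made for illustration: the four synthetic "logs" are generated from two hidden drivers with made-up linear responses that are not calibrated to any real tool, two factors are retained by choice, and the compact varimax helper uses an SVD-based update on the raw criterion.

```python
import numpy as np


def varimax(loadings, max_iter=500, tol=1e-10):
    """Compact SVD-based raw varimax; returns rotated loadings and rotation."""
    p, k = loadings.shape
    rotation, d_old = np.eye(k), 0.0
    for _ in range(max_iter):
        b = loadings @ rotation
        grad = loadings.T @ (b ** 3 - b * (np.sum(b ** 2, axis=0) / p))
        u, s, vt = np.linalg.svd(grad)
        rotation, d_new = u @ vt, s.sum()
        if d_new < d_old * (1 + tol):
            break
        d_old = d_new
    return loadings @ rotation, rotation


# Synthetic depth samples: two hidden, independent drivers (hypothetical).
rng = np.random.default_rng(0)
n = 400
clay, poro = rng.standard_normal(n), rng.standard_normal(n)
noise = rng.standard_normal((n, 4))
names = ["GR", "NPHI", "RHOB", "LOG_RT"]
logs = np.column_stack([
    75.0 + 30.0 * clay + 3.0 * noise[:, 0],                  # gamma ray
    0.20 + 0.06 * clay + 0.04 * poro + 0.01 * noise[:, 1],   # neutron
    2.45 - 0.12 * poro + 0.02 * noise[:, 2],                 # density
    1.00 - 0.40 * poro + 0.10 * noise[:, 3],                 # log10 resistivity
])

# 1. Standardize each curve to zero mean and unit variance.
z = (logs - logs.mean(axis=0)) / logs.std(axis=0, ddof=1)

# 2. PCA by eigen-decomposition of the correlation matrix, largest variance first.
corr = np.corrcoef(z, rowvar=False)
eigval, eigvec = np.linalg.eigh(corr)
order = np.argsort(eigval)[::-1]
eigval, eigvec = eigval[order], eigvec[:, order]

# 3. Keep k components and scale eigenvectors into loadings.
k = 2
loadings = eigvec[:, :k] * np.sqrt(eigval[:k])

# 4. Rotate the retained loadings.
rotated, rotation = varimax(loadings)

# 5. Rotated factor scores: unit-variance component scores times the rotation.
scores = (z @ eigvec[:, :k] / np.sqrt(eigval[:k])) @ rotation

for name, row in zip(names, rotated):
    print(f"{name:7s}", np.round(row, 2))
```

The score matrix has one row per depth sample and one column per rotated factor, and it is the quantity that would be passed to a clustering step to assign facies. The order and sign of the rotated factors are arbitrary, so each factor has to be named by inspecting which curves load on it.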

Varimax rotation is also called Kaiser varimax or simply varimax, and it is the most widely used form of orthogonal factor rotation. Related terms include:

  • Principal component analysis (PCA): a multivariate statistical method that transforms a set of correlated variables into a smaller set of uncorrelated principal components that account for maximum variance in the data, ordered from the largest variance component to the smallest. PCA provides the unrotated solution that serves as the input to varimax rotation, which then reorients the retained axes to improve interpretability while preserving the total variance they explain.
  • Factor analysis: a statistical method closely related to PCA that models the observed correlations between variables as arising from a smaller number of unobserved latent factors plus unique variance for each variable. Varimax rotation is most commonly associated with factor analysis rather than PCA, but is applied to both. The distinction lies in whether unique variance is modeled separately (factor analysis) or absorbed into the components (PCA).
  • Log facies: a petrophysical classification of depth intervals in a well based on their well log response pattern (gamma ray, density, neutron, resistivity, PE), used to identify lithological or flow-unit boundaries. Varimax-rotated PCA provides a data-driven method for defining log facies from the multidimensional log response space rather than applying subjective cutoffs to individual log curves.
  • Biomarker: a compound in crude oil or source rock extract that is derived from biological precursor molecules and retains structural features that allow identification of the source organism type, depositional environment, and thermal maturity. Biomarker ratios (such as pristane/phytane and sterane/hopane) are the input variables for PCA/varimax analysis in oil-oil and oil-source rock correlation studies.
  • Cluster analysis: a multivariate statistical method that groups data samples into clusters based on their similarity in a multi-dimensional attribute space. Varimax-rotated factor scores are commonly used as input to cluster analysis in log facies mapping and geochemical family classification, reducing the original high-dimensional attribute space to a smaller set of interpretable factors before the clustering algorithm is applied.