API Reference

Correlation Computation

Statistics.cor — Function

cor(cortype, x [,y])

Compute the correlation matrix of a given type.

The possible correlation types are:

Pearson
Spearman
Kendall

Examples

julia> x = [-1.62169     0.0158613   0.500375  -0.794381
             2.50689     3.31666    -1.3049     2.16058
             0.495674    0.348621   -0.614451  -0.193579
             2.32149     2.18847    -1.83165    2.08399
            -0.0573697   0.39908     0.270117   0.658458
             0.365239   -0.321493   -1.60223   -0.199998
            -0.55521    -0.898513    0.690267   0.857519
            -0.356979   -1.03724     0.714859  -0.719657
            -3.38438    -1.93058     1.77413   -1.23657
             1.57527     0.836351   -1.13275   -0.277048];

julia> cor(Pearson, x)
4×4 Matrix{Float64}:
  1.0        0.86985   -0.891312   0.767433
  0.86985    1.0       -0.767115   0.817407
 -0.891312  -0.767115   1.0       -0.596762
  0.767433   0.817407  -0.596762   1.0

julia> cor(Spearman, x)
4×4 Matrix{Float64}:
  1.0        0.866667  -0.854545   0.709091
  0.866667   1.0       -0.781818   0.684848
 -0.854545  -0.781818   1.0       -0.612121
  0.709091   0.684848  -0.612121   1.0

julia> cor(Kendall, x)
4×4 Matrix{Float64}:
  1.0        0.733333  -0.688889   0.555556
  0.733333   1.0       -0.688889   0.555556
 -0.688889  -0.688889   1.0       -0.422222
  0.555556   0.555556  -0.422222   1.0

source

Bigsimr.cor_fast — Function

cor_fast([cortype,] X)

Calculate the correlation matrix in parallel using available threads.

source

Bigsimr.cor_bounds — Function

cor_bounds(d1, d2, cortype, samples)

Compute the stochastic lower and upper correlation bounds between two marginal distributions.

This method relies on sampling from each distribution and then estimating the specified correlation between the sorted samples. Because the samples are random, there will be some variation in the answer for each call to cor_bounds. Increasing the number of samples will increase the accuracy of the estimate, but will also take longer to sort. Therefore ≈100,000 samples (the default) are recommended so that it runs fast while still returning a good estimate.

The possible correlation types are:

Pearson
Spearman
Kendall

Examples

julia> using Distributions

julia> A = Normal(78, 10); B = LogNormal(3, 1);

julia> cor_bounds(A, B)
(lower = -0.7646512417819491, upper = 0.7649206069306482)

julia> cor_bounds(A, B, 1e6)
(lower = -0.765850375468031, upper = 0.7655170605697716)

julia> cor_bounds(A, B, Pearson)
(lower = -0.7631871539735006, upper = 0.7624398609255689)

julia> cor_bounds(A, B, Spearman, 250_000)
(lower = -1.0, upper = 1.0)

source

cor_bounds(margins, cortype, samples)

Compute the stochastic pairwise lower and upper correlation bounds between a set of marginal distributions.

The possible correlation types are:

Pearson
Spearman
Kendall

Examples

julia> using Distributions

julia> margins = [Normal(78, 10), Binomial(20, 0.2), LogNormal(3, 1)];

julia> lower, upper = cor_bounds(margins, Pearson);

julia> lower
3×3 Matrix{Float64}:
  1.0       -0.983111  -0.768184
 -0.983111   1.0       -0.702151
 -0.768184  -0.702151   1.0

julia> upper
3×3 Matrix{Float64}:
 1.0       0.982471  0.766727
 0.982471  1.0       0.798379
 0.766727  0.798379  1.0

source

Random Correlation Matrix

Bigsimr.cor_randPSD — Function

cor_randPSD([T,] d, k=d-1)

Return a random positive semidefinite correlation matrix where d is the dimension ($d ≥ 1$) and k is the number of factor loadings ($1 ≤ k < d$).

Converting Correlation Types

Bigsimr.cor_convert — Function

cor_convert(X, from, to)

Convert from one type of correlation matrix to another.

The role of conversion in this package is typically used from either Spearman or Kendall to Pearson where the Pearson correlation is used in the generation of random multivariate normal samples. After converting, the correlation matrix may not be positive semidefinite, so it is recommended to check using LinearAlgebra.isposdef, and subsequently calling cor_nearPD.

Correlation Utils

Convert a correlation matrix using other utilities.

Bigsimr.cor_constrain — Function

cor_constrain(X::AbstractMatrix{<:Real})

Constrain a matrix so that its diagonal elements are 1, off-diagonal elements are bounded between -1 and 1, and a symmetric view of the upper triangle is made.

Nearest Correlation Matrix

Provided by NearestCorrelationMatrix.jl

NearestCorrelationMatrix.Newton — Type

NewtonNew(; tau, tol_cg, tol_ls, iter_cg, iter_ls)

Parameters

tau: a tuning parameter controlling the smallest eigenvalue of the resulting matrix
tol_cg: the tolerance used in the conjugate gradient method
tol_ls: the tolerance used in the line search method
iter_cg: the max number of iterations in the conjugate gradient method
iter_ls: the max number of refinements during the Newton step

source

NearestCorrelationMatrix.AlternatingProjections — Type

AlternatingProjections(; tau=0)

The alternating projections algorithm developed by Nick Higham.

source

NearestCorrelationMatrix.DirectProjection — Type

DirectProjection(; tau=eps())

Single step projection of the input matrix into the set of correlation matrices. Useful when a "close" correlation matrix is needed without concern for it being the most optimal.

Parameters

tau: a tuning parameter controlling the smallest eigenvalue of the resulting matrix

source

NearestCorrelationMatrix.nearest_cor — Function

nearest_cor(A, alg; kwargs...)

Return the nearest positive definite correlation matrix to $A$.

This is a "batteries included" method, and is designed to just work with as little thought as possible. Unlike solve!, this method will return the nearest correlation matrix directly instead of a NCMSolution object. Additionally the solution is checked to be positive definite, and corrected if it is not.

Examples

julia> import LinearAlgebra: isposdef

julia> r = [
    1.00 0.82 0.56 0.44
    0.82 1.00 0.28 0.85
    0.56 0.28 1.00 0.22
    0.44 0.85 0.22 1.00
];

julia> isposdef(r)
false

julia> p = nearest_cor(r)
4×4 Matrix{Float64}:
 1.0       0.817095  0.559306  0.440514
 0.817095  1.0       0.280196  0.847352
 0.559306  0.280196  1.0       0.219582
 0.440514  0.847352  0.219582  1.0

julia> isposdef(p)
true

source

NearestCorrelationMatrix.nearest_cor! — Function

nearest_cor!(A, alg=nothing; kwargs...)

Return the nearest positive definite correlation matrix to $A$. This method will overwrite $A$.

Examples

julia> import LinearAlgebra: isposdef

julia> r = [
    1.00 0.82 0.56 0.44
    0.82 1.00 0.28 0.85
    0.56 0.28 1.00 0.22
    0.44 0.85 0.22 1.00
];

julia> isposdef(r)
false

julia> nearest_cor!(r)
4×4 Matrix{Float64}:
 1.0       0.817095  0.559306  0.440514
 0.817095  1.0       0.280196  0.847352
 0.559306  0.280196  1.0       0.219582
 0.440514  0.847352  0.219582  1.0

julia> isposdef(r)
true

source

Simplified Wrappers

Bigsimr.cor_nearPD — Function

cor_nearPD(X)

Return the nearest positive definite correlation matrix to X.

Pearson Correlation Matching

PearsonCorrelationMatch.pearson_match — Function

pearson_match(p::Real, d1, d2; n::Real=12, m::Real=128, kwargs...)

Compute the Pearson correlation coefficient to be used in a bivariate Gaussian copula.

Fields

p: The target correltation between the marginal distributions.
d1: The first marginal distribution.
d2: The second marginal distribution.
n: The degree of the polynomial used to estimate the matching correlation.
m: The number of points used in the hermite polynomial interpolation.
kwargs: Additional keyword arguments. Currently unused.

Examples

julia> using Distributions

julia> d1 = Beta(2, 3); d2 = Binomial(20, 0.2);

julia> pearson_match(0.6, d1, d2)
0.6127531346934495

source

pearson_match(R::AbstractMatrix{<:Real}, margins; n::Real=12, m::Real=128, kwargs...)

Pairwise compute the Pearson correlation coefficient to be used in a bivariate Gaussian copula. Ensures that the resulting matrix is a valid correlation matrix.

Fields

R: The target correltation matrix of the marginal distributions.
margins: A list of marginal distributions.
n: The degree of the polynomial used to estimate the matching correlation.
m: The number of points used in the hermite polynomial interpolation.
kwargs: Additional keyword arguments. Currently unused.

Examples

julia> using Distributions

julia> margins = [Beta(2, 3), Uniform(0, 1), Binomial(20, 0.2)];

julia> rho = [
    1.0 0.3 0.6
    0.3 1.0 0.4
    0.6 0.4 1.0
];

julia> pearson_match(rho, margins)
3×3 Matrix{Float64}:
 1.0       0.309111  0.612753
 0.309111  1.0       0.418761
 0.612753  0.418761  1.0

source

PearsonCorrelationMatch.pearson_bounds — Function

pearson_bounds(d1, d2; n::Real=20, m::Real=128, kwargs...)

Determine the range of admissible Pearson correlations between two distributions.

Fields

d1: The first marginal distribution.
d2: The second marginal distribution.
n: The degree of the polynomial used to estimate the bounds.
m: The number of points used in the hermite polynomial interpolation.
kwargs: Additional keyword arguments. Currently unused.

Details

The accuracy of the bounds depends on the degree of the polynomial and the number of hermite points. Be careful not to set the polynomial degree too high as Runge's theorem states that a polynomial of too high degree would cause oscillation at the edges of the interval and reduce accuracy.

In general raising the number of hermite points will result in better accuracy, but comes with a small performance hit. Furthermore the number of hermite points should be higher than the degree of the polynomial.

Examples

julia> using Distributions

julia> d1 = Exponential(3.14); d2 = NegativeBinomial(20, 0.2);

julia> pearson_bounds(d1, d2)
(lower = -0.8553947509241561, upper = 0.9413665073003636)

source

pearson_bounds(margins; n::Real=12, m::Real=128, kwargs...)

Determine the range of admissible Pearson correlations pairwise between a list of distributions.

Fields

margins: A list of marginal distributions.
n: The degree of the polynomial used to estimate the bounds.
m: The number of points used in the hermite polynomial interpolation.
kwargs: Additional keyword arguments. Currently unused.

Examples

julia> using Distributions

julia> margins = [Exponential(3.14), NegativeBinomial(20, 0.2), LogNormal(2.718)];

julia> lower, upper = pearson_bounds(margins);

julia> lower
3×3 Matrix{Float64}:
  1.0       -0.855395  -0.488737
 -0.855395   1.0       -0.704403
 -0.488737  -0.704403   1.0

julia> upper
3×3 Matrix{Float64}:
 1.0       0.941367  0.939671
 0.941367  1.0       0.815171
 0.939671  0.815171  1.0

source