API Reference
Random Vector Generation
Bigsimr.rvec
— Functionrvec(n, rho, margins)
Generate samples for a list of marginal distributions and a correaltion structure.
Examples
julia> using Distributions
julia> margins = [Normal(3, 1), LogNormal(3, 1), Exponential(3)]
julia> R = [
1.00 -0.23 0.12
-0.23 1.00 -0.46
0.12 -0.46 1.00
];
julia> rvec(10, R, margins)
10×3 Matrix{Float64}:
3.89423 38.6339 1.30088
5.87344 11.5582 5.25233
3.62383 20.4001 3.25627
3.65075 3.8316 4.48547
1.62223 9.95032 1.48367
3.42208 35.0998 0.644814
1.82689 58.417 0.580125
4.73678 4.75506 11.2741
1.92511 9.44913 0.651013
3.19883 39.3707 0.581781
Bigsimr.rmvn
— Functionrmvn(n [, μ], Σ)
Utilizes available threads for fast generation of multivariate normal samples.
Examples
julia> μ = [-3, 1, 10];
julia> S = cor_randPD(3)
3×3 Matrix{Float64}:
1.0 -0.663633 -0.0909108
-0.663633 1.0 0.151582
-0.0909108 0.151582 1.0
julia> rmvn(10, μ, S)
10×3 Matrix{Float64}:
-1.32616 -1.02602 11.0202
-3.59396 2.84145 8.84367
-0.441537 -1.53279 8.82931
-4.69202 2.84618 10.5977
-2.63359 2.65779 9.8374
-3.75917 2.07208 8.90139
-3.00716 0.00897664 10.1173
-3.00928 0.851214 9.74029
-3.43021 0.402382 9.51274
-1.77849 0.157933 9.15944
Correlation Types
Bigsimr.BigsimrBase.CorType
— TypeCorType
A type used for specifiying the type of correlation. Supported correlations are:
Bigsimr.BigsimrBase.Pearson
— ConstantPearson
Pearson's $r$ product-moment correlation
Bigsimr.BigsimrBase.Spearman
— ConstantSpearman
Spearman's $ρ$ rank correlation
Bigsimr.BigsimrBase.Kendall
— ConstantKendall
Kendall's $τ$ rank correlation
Correlation Computation
Statistics.cor
— Functioncor(cortype, x [,y])
Compute the correlation matrix of a given type.
The possible correlation types are:
Examples
julia> x = [-1.62169 0.0158613 0.500375 -0.794381
2.50689 3.31666 -1.3049 2.16058
0.495674 0.348621 -0.614451 -0.193579
2.32149 2.18847 -1.83165 2.08399
-0.0573697 0.39908 0.270117 0.658458
0.365239 -0.321493 -1.60223 -0.199998
-0.55521 -0.898513 0.690267 0.857519
-0.356979 -1.03724 0.714859 -0.719657
-3.38438 -1.93058 1.77413 -1.23657
1.57527 0.836351 -1.13275 -0.277048];
julia> cor(Pearson, x)
4×4 Matrix{Float64}:
1.0 0.86985 -0.891312 0.767433
0.86985 1.0 -0.767115 0.817407
-0.891312 -0.767115 1.0 -0.596762
0.767433 0.817407 -0.596762 1.0
julia> cor(Spearman, x)
4×4 Matrix{Float64}:
1.0 0.866667 -0.854545 0.709091
0.866667 1.0 -0.781818 0.684848
-0.854545 -0.781818 1.0 -0.612121
0.709091 0.684848 -0.612121 1.0
julia> cor(Kendall, x)
4×4 Matrix{Float64}:
1.0 0.733333 -0.688889 0.555556
0.733333 1.0 -0.688889 0.555556
-0.688889 -0.688889 1.0 -0.422222
0.555556 0.555556 -0.422222 1.0
Bigsimr.cor_fast
— Functioncor_fast([cortype,] X)
Calculate the correlation matrix in parallel using available threads.
Bigsimr.cor_bounds
— Functioncor_bounds(d1, d2, cortype, samples)
Compute the stochastic lower and upper correlation bounds between two marginal distributions.
This method relies on sampling from each distribution and then estimating the specified correlation between the sorted samples. Because the samples are random, there will be some variation in the answer for each call to cor_bounds
. Increasing the number of samples will increase the accuracy of the estimate, but will also take longer to sort. Therefore ≈100,000 samples (the default) are recommended so that it runs fast while still returning a good estimate.
The possible correlation types are:
Examples
julia> using Distributions
julia> A = Normal(78, 10); B = LogNormal(3, 1);
julia> cor_bounds(A, B)
(lower = -0.7646512417819491, upper = 0.7649206069306482)
julia> cor_bounds(A, B, 1e6)
(lower = -0.765850375468031, upper = 0.7655170605697716)
julia> cor_bounds(A, B, Pearson)
(lower = -0.7631871539735006, upper = 0.7624398609255689)
julia> cor_bounds(A, B, Spearman, 250_000)
(lower = -1.0, upper = 1.0)
cor_bounds(margins, cortype, samples)
Compute the stochastic pairwise lower and upper correlation bounds between a set of marginal distributions.
The possible correlation types are:
Examples
julia> using Distributions
julia> margins = [Normal(78, 10), Binomial(20, 0.2), LogNormal(3, 1)];
julia> lower, upper = cor_bounds(margins, Pearson);
julia> lower
3×3 Matrix{Float64}:
1.0 -0.983111 -0.768184
-0.983111 1.0 -0.702151
-0.768184 -0.702151 1.0
julia> upper
3×3 Matrix{Float64}:
1.0 0.982471 0.766727
0.982471 1.0 0.798379
0.766727 0.798379 1.0
Random Correlation Matrix
Bigsimr.cor_randPSD
— Functioncor_randPSD([T,] d, k=d-1)
Return a random positive semidefinite correlation matrix where d
is the dimension ($d ≥ 1$) and k
is the number of factor loadings ($1 ≤ k < d$).
See also: cor_randPD
Examples
julia> cor_randPSD(Float64, 4, 2)
4×4 Matrix{Float64}:
1.0 0.276386 0.572837 0.192875
0.276386 1.0 0.493806 -0.352386
0.572837 0.493806 1.0 -0.450259
0.192875 -0.352386 -0.450259 1.0
julia> cor_randPSD(4, 1)
4×4 Matrix{Float64}:
1.0 -0.800513 0.541379 -0.650587
-0.800513 1.0 -0.656411 0.788824
0.541379 -0.656411 1.0 -0.533473
-0.650587 0.788824 -0.533473 1.0
julia> cor_randPSD(4)
4×4 Matrix{Float64}:
1.0 0.81691 -0.27188 0.289011
0.81691 1.0 -0.44968 0.190938
-0.27188 -0.44968 1.0 -0.102597
0.289011 0.190938 -0.102597 1.0
Bigsimr.cor_randPD
— Functioncor_randPD([T,] d, k=d-1)
The same as cor_randPSD
, but calls cor_fastPD
to ensure that the returned matrix is positive definite.
Examples
julia> cor_randPD(Float64, 4, 2)
4×4 Matrix{Float64}:
1.0 0.458549 -0.33164 0.492572
0.458549 1.0 -0.280873 0.62544
-0.33164 -0.280873 1.0 -0.315011
0.492572 0.62544 -0.315011 1.0
julia> cor_randPD(4, 1)
4×4 Matrix{Float64}:
1.0 -0.0406469 -0.127517 -0.133308
-0.0406469 1.0 0.265604 0.277665
-0.127517 0.265604 1.0 0.871089
-0.133308 0.277665 0.871089 1.0
julia> cor_randPD(4)
4×4 Matrix{Float64}:
1.0 0.356488 0.701521 -0.251671
0.356488 1.0 0.382787 -0.117748
0.701521 0.382787 1.0 -0.424952
-0.251671 -0.117748 -0.424952 1.0
Converting Correlation Types
Bigsimr.cor_convert
— Functioncor_convert(X, from, to)
Convert from one type of correlation matrix to another.
The role of conversion in this package is typically used from either Spearman or Kendall to Pearson where the Pearson correlation is used in the generation of random multivariate normal samples. After converting, the correlation matrix may not be positive semidefinite, so it is recommended to check using LinearAlgebra.isposdef
, and subsequently calling cor_nearPD
.
See also: cor_nearPD
, cor_fastPD
The possible correlation types are:
Examples
julia> r = [ 1.0 -0.634114 0.551645 0.548993
-0.634114 1.0 -0.332105 -0.772114
0.551645 -0.332105 1.0 0.143949
0.548993 -0.772114 0.143949 1.0];
julia> cor_convert(r, Pearson, Spearman)
4×4 Matrix{Float64}:
1.0 -0.616168 0.533701 0.531067
-0.616168 1.0 -0.318613 -0.756979
0.533701 -0.318613 1.0 0.13758
0.531067 -0.756979 0.13758 1.0
julia> cor_convert(r, Spearman, Kendall)
4×4 Matrix{Float64}:
1.0 -0.452063 0.385867 0.383807
-0.452063 1.0 -0.224941 -0.576435
0.385867 -0.224941 1.0 0.0962413
0.383807 -0.576435 0.0962413 1.0
julia> r == cor_convert(r, Pearson, Pearson)
true
cor_convert(X::AbstractMatrix, from, to)
When the input is a matrix, it is assumed to be a correlation matrix, and the resulting matrix is also constrained to be a correlation matrix (e.g. unit diagonal and off-digonal elements constrained between -1 and 1).
Correlation Utils
Convert a correlation matrix using other utilities.
Bigsimr.cor_constrain
— Functioncor_constrain(X::AbstractMatrix{<:Real})
Constrain a matrix so that its diagonal elements are 1, off-diagonal elements are bounded between -1 and 1, and a symmetric view of the upper triangle is made.
See also: cor_constrain!
Examples
julia> a = [ 0.802271 0.149801 -1.1072 1.13451
0.869788 -0.824395 0.38965 0.965936
-1.45353 -1.29282 0.417233 -0.362526
0.638291 -0.682503 1.12092 -1.27018];
julia> cor_constrain(a)
4×4 Matrix{Float64}:
1.0 0.149801 -1.0 1.0
0.149801 1.0 0.38965 0.965936
-1.0 0.38965 1.0 -0.362526
1.0 0.965936 -0.362526 1.0
Bigsimr.cor_constrain!
— Functioncor_constrain!(X::AbstractMatrix{<:Real})
Same as cor_constrain
, except that the matrix is updated in place to save memory.
Bigsimr.cov2cor
— Functioncov2cor(X::AbstractMatrix{<:Real})
Transform a covariance matrix into a correlation matrix.
Details
If $X \in \mathbb{R}^{n \times n}$ is a covariance matrix, then
\[\tilde{X} = D^{-1/2} X D^{-1/2}, \quad D = \mathrm{diag(X)}\]
is the associated correlation matrix.
Bigsimr.cov2cor!
— Functioncov2cor!(X::AbstractMatrix{<:Real})
Same as cov2cor
, except that the matrix is updated in place to save memory.
Bigsimr.is_correlation
— Functionis_correlation(X)
Check if the given matrix passes all the checks required to be a valid correlation matrix.
Criteria
A matrix is a valid correlation matrix if:
- Square
- Symmetric
- Diagonal elements are equal to
1
- Off diagonal elements are between
-1
and1
- Is positive definite
Examples
julia> x = rand(3, 3)
3×3 Matrix{Float64}:
0.834446 0.183285 0.837872
0.637295 0.270709 0.458703
0.626566 0.736907 0.61903
julia> is_correlation(x)
false
julia> x = cor_randPD(3)
3×3 Matrix{Float64}:
1.0 0.190911 0.449104
0.190911 1.0 0.636305
0.449104 0.636305 1.0
julia> is_correlation(x)
true
julia> r_negdef = [
1.00 0.82 0.56 0.44
0.82 1.00 0.28 0.85
0.56 0.28 1.00 0.22
0.44 0.85 0.22 1.00
];
julia> is_correlation(r_negdef)
false
Nearest Correlation Matrix
Provided by NearestCorrelationMatrix.jl
NearestCorrelationMatrix.Newton
— TypeNewtonNew(; tau, tol_cg, tol_ls, iter_cg, iter_ls)
Parameters
tau
: a tuning parameter controlling the smallest eigenvalue of the resulting matrixtol_cg
: the tolerance used in the conjugate gradient methodtol_ls
: the tolerance used in the line search methoditer_cg
: the max number of iterations in the conjugate gradient methoditer_ls
: the max number of refinements during the Newton step
NearestCorrelationMatrix.AlternatingProjections
— TypeAlternatingProjections(; tau=0)
The alternating projections algorithm developed by Nick Higham.
NearestCorrelationMatrix.DirectProjection
— TypeDirectProjection(; tau=eps())
Single step projection of the input matrix into the set of correlation matrices. Useful when a "close" correlation matrix is needed without concern for it being the most optimal.
Parameters
tau
: a tuning parameter controlling the smallest eigenvalue of the resulting matrix
NearestCorrelationMatrix.nearest_cor
— Functionnearest_cor(A, alg; kwargs...)
Return the nearest positive definite correlation matrix to $A$.
This is a "batteries included" method, and is designed to just work with as little thought as possible. Unlike solve!
, this method will return the nearest correlation matrix directly instead of a NCMSolution
object. Additionally the solution is checked to be positive definite, and corrected if it is not.
Examples
julia> import LinearAlgebra: isposdef
julia> r = [
1.00 0.82 0.56 0.44
0.82 1.00 0.28 0.85
0.56 0.28 1.00 0.22
0.44 0.85 0.22 1.00
];
julia> isposdef(r)
false
julia> p = nearest_cor(r)
4×4 Matrix{Float64}:
1.0 0.817095 0.559306 0.440514
0.817095 1.0 0.280196 0.847352
0.559306 0.280196 1.0 0.219582
0.440514 0.847352 0.219582 1.0
julia> isposdef(p)
true
NearestCorrelationMatrix.nearest_cor!
— Functionnearest_cor!(A, alg=nothing; kwargs...)
Return the nearest positive definite correlation matrix to $A$. This method will overwrite $A$.
This is a "batteries included" method, and is designed to just work with as little thought as possible. Unlike solve!
, this method will return the nearest correlation matrix directly instead of a NCMSolution
object. Additionally the solution is checked to be positive definite, and corrected if it is not.
Examples
julia> import LinearAlgebra: isposdef
julia> r = [
1.00 0.82 0.56 0.44
0.82 1.00 0.28 0.85
0.56 0.28 1.00 0.22
0.44 0.85 0.22 1.00
];
julia> isposdef(r)
false
julia> nearest_cor!(r)
4×4 Matrix{Float64}:
1.0 0.817095 0.559306 0.440514
0.817095 1.0 0.280196 0.847352
0.559306 0.280196 1.0 0.219582
0.440514 0.847352 0.219582 1.0
julia> isposdef(r)
true
Simplified Wrappers
Bigsimr.cor_nearPD
— Functioncor_nearPD(X)
Return the nearest positive definite correlation matrix to X
.
See also: cor_nearPSD
, cor_fastPD
Bigsimr.cor_nearPD!
— Functioncor_nearPD!(X)
Same as cor_nearPD
, but saves space by overwriting the input X
instead of creating a copy.
See also: cor_nearPSD!
, cor_fastPD!
Bigsimr.cor_nearPSD
— Functioncor_nearPSD(X)
Return the nearest positive [semi-] definite correlation matrix to X
.
See also: cor_nearPD
, cor_fastPD
Bigsimr.cor_nearPSD!
— Functioncor_nearPSD!(X)
Same as cor_nearPSD
, but saves space by overwriting the input X
instead of creating a copy.
See also: cor_nearPD!
, cor_fastPD!
Bigsimr.cor_fastPD
— Functioncor_fastPD(X[, tau])
Return a positive definite correlation matrix that is close to X
. tau
is a tuning parameter that controls the minimum eigenvalue of the resulting matrix. τ
can be set to zero if only a positive semidefinite matrix is needed.
See also: cor_nearPD
, cor_nearPSD
Bigsimr.cor_fastPD!
— Functioncor_fastPD!(X[, tau])
Same as cor_fastPD
, but saves space by overwriting the input X
instead of creating a copy.
See also: cor_nearPD!
, cor_nearPSD!
Pearson Correlation Matching
PearsonCorrelationMatch.pearson_match
— Functionpearson_match(p::Real, d1, d2; n::Real=12, m::Real=128, kwargs...)
Compute the Pearson correlation coefficient to be used in a bivariate Gaussian copula.
Fields
p
: The target correltation between the marginal distributions.d1
: The first marginal distribution.d2
: The second marginal distribution.n
: The degree of the polynomial used to estimate the matching correlation.m
: The number of points used in the hermite polynomial interpolation.kwargs
: Additional keyword arguments. Currently unused.
Examples
julia> using Distributions
julia> d1 = Beta(2, 3); d2 = Binomial(20, 0.2);
julia> pearson_match(0.6, d1, d2)
0.6127531346934495
pearson_match(R::AbstractMatrix{<:Real}, margins; n::Real=12, m::Real=128, kwargs...)
Pairwise compute the Pearson correlation coefficient to be used in a bivariate Gaussian copula. Ensures that the resulting matrix is a valid correlation matrix.
Fields
R
: The target correltation matrix of the marginal distributions.margins
: A list of marginal distributions.n
: The degree of the polynomial used to estimate the matching correlation.m
: The number of points used in the hermite polynomial interpolation.kwargs
: Additional keyword arguments. Currently unused.
Examples
julia> using Distributions
julia> margins = [Beta(2, 3), Uniform(0, 1), Binomial(20, 0.2)];
julia> rho = [
1.0 0.3 0.6
0.3 1.0 0.4
0.6 0.4 1.0
];
julia> pearson_match(rho, margins)
3×3 Matrix{Float64}:
1.0 0.309111 0.612753
0.309111 1.0 0.418761
0.612753 0.418761 1.0
PearsonCorrelationMatch.pearson_bounds
— Functionpearson_bounds(d1, d2; n::Real=20, m::Real=128, kwargs...)
Determine the range of admissible Pearson correlations between two distributions.
Fields
d1
: The first marginal distribution.d2
: The second marginal distribution.n
: The degree of the polynomial used to estimate the bounds.m
: The number of points used in the hermite polynomial interpolation.kwargs
: Additional keyword arguments. Currently unused.
Details
The accuracy of the bounds depends on the degree of the polynomial and the number of hermite points. Be careful not to set the polynomial degree too high as Runge's theorem states that a polynomial of too high degree would cause oscillation at the edges of the interval and reduce accuracy.
In general raising the number of hermite points will result in better accuracy, but comes with a small performance hit. Furthermore the number of hermite points should be higher than the degree of the polynomial.
Examples
julia> using Distributions
julia> d1 = Exponential(3.14); d2 = NegativeBinomial(20, 0.2);
julia> pearson_bounds(d1, d2)
(lower = -0.8553947509241561, upper = 0.9413665073003636)
pearson_bounds(margins; n::Real=12, m::Real=128, kwargs...)
Determine the range of admissible Pearson correlations pairwise between a list of distributions.
Fields
margins
: A list of marginal distributions.n
: The degree of the polynomial used to estimate the bounds.m
: The number of points used in the hermite polynomial interpolation.kwargs
: Additional keyword arguments. Currently unused.
Examples
julia> using Distributions
julia> margins = [Exponential(3.14), NegativeBinomial(20, 0.2), LogNormal(2.718)];
julia> lower, upper = pearson_bounds(margins);
julia> lower
3×3 Matrix{Float64}:
1.0 -0.855395 -0.488737
-0.855395 1.0 -0.704403
-0.488737 -0.704403 1.0
julia> upper
3×3 Matrix{Float64}:
1.0 0.941367 0.939671
0.941367 1.0 0.815171
0.939671 0.815171 1.0