Main Functions
Random Multivariate Vector
Types
MvSim.MvDistribution
— TypeMultivariate mixed distribution
Functions
MvSim.rvec
— Methodrvec(n::Int, ρ::Matrix{Float64}, margins::Vector{<:UnivariateDistribution})
Generate samples for a list of marginal distributions and a correaltion structure.
Base.rand
— Functionrand(D::MvDistribution, n::Int)
More general wrapper for rvec
.
MvSim.margins
— Methodmargins(D::MvDistribution)
Return the margins of the multivariate distribution.
Statistics.cor
— Methodcor(D::MvDistribution)
Return the correlation matrix of the multivariate distribution.
MvSim.cortype
— Methodcortype(D::MvDistribution)
Return the correlation matrix type of the multivariate distribution.
Base.eltype
— Methodeltype(D::MvDistribution)
Return the eltype of the correlation matrix of the multivariate distribution.
Correlations
Types
MvSim.Pearson
— TypePearson <: Correlation
Pearson's $r$ product-moment correlation
MvSim.Spearman
— TypeSpearman <: Correlation
Spearman's $ρ$ rank correlation
MvSim.Kendall
— TypeKendall <: Correlation
Kendall's $τ$ rank correlation
Estimating
Statistics.cor
— Functioncor(x[, y], ::Type{<:Correlation})
Compute the correlation matrix of a given type.
The possible correlation types are:
Examples
julia> x = [-1.62169 0.0158613 0.500375 -0.794381
2.50689 3.31666 -1.3049 2.16058
0.495674 0.348621 -0.614451 -0.193579
2.32149 2.18847 -1.83165 2.08399
-0.0573697 0.39908 0.270117 0.658458
0.365239 -0.321493 -1.60223 -0.199998
-0.55521 -0.898513 0.690267 0.857519
-0.356979 -1.03724 0.714859 -0.719657
-3.38438 -1.93058 1.77413 -1.23657
1.57527 0.836351 -1.13275 -0.277048];
julia> cor(x, Pearson)
4×4 Array{Float64,2}:
1.0 0.86985 -0.891312 0.767433
0.86985 1.0 -0.767115 0.817407
-0.891312 -0.767115 1.0 -0.596762
0.767433 0.817407 -0.596762 1.0
julia> cor(x, Spearman)
4×4 Array{Float64,2}:
1.0 0.866667 -0.854545 0.709091
0.866667 1.0 -0.781818 0.684848
-0.854545 -0.781818 1.0 -0.612121
0.709091 0.684848 -0.612121 1.0
julia> cor(x, Kendall)
4×4 Array{Float64,2}:
1.0 0.733333 -0.688889 0.555556
0.733333 1.0 -0.688889 0.555556
-0.688889 -0.688889 1.0 -0.422222
0.555556 0.555556 -0.422222 1.0
MvSim.cor_bounds
— Functioncor_bounds(dA::UnivariateDistribution, dB::UnivariateDistribution, C::Type{<:Correlation}=Pearson; n_samples::Int=100_000)
Compute the stochastic lower and upper correlation bounds between two marginal distributions.
This method relies on sampling from each distribution and then estimating the specified correlation between the sorted samples. Because the samples are random, there will be some variation in the answer for each call to cor_bounds
. Increasing the number of samples will increase the accuracy of the estimate, but will also take longer to sort. Therefore ≈100,000 samples (the default) are recommended so that it runs fast while still returning a good estimate.
The possible correlation types are:
Examples
julia> using Distributions
julia> A = Normal(78, 10); B = LogNormal(3, 1);
julia> cor_bounds(A, B)
(lower = -0.7646512417819491, upper = 0.7649206069306482)
julia> cor_bounds(A, B, n_samples=Int(1e9))
(lower = -0.7629776825238167, upper = 0.7629762333824238)
julia> cor_bounds(A, B, n_samples=Int(1e4))
(lower = -0.7507010142250724, upper = 0.7551879647701095)
julia> cor_bounds(A, B, Spearman)
(lower = -1.0, upper = 1.0)
cor_bounds(D::MvDistribution; n_samples::Int=100_000)
Compute the pairwise stochastic lower and upper correlation bounds between all marginal distributions.
Examples
julia> using Distributions
julia> margins = [Normal(78, 10), LogNormal(3, 1)];
julia> r = [1.0 0.6; 0.6 1.0]
2×2 Array{Float64,2}:
1.0 0.6
0.6 1.0
julia> D = MvDistribution(r, margins, Pearson);
julia> bounds = cor_bounds(D);
julia> bounds.lower
2×2 Array{Float64,2}:
1.0 -0.76892
-0.76892 1.0
julia> bounds.upper
2×2 Array{Float64,2}:
1.0 0.768713
0.768713 1.0
Generating
MvSim.cor_randPSD
— Functioncor_randPSD([T::Type{<:AbstractFloat}], d::Int[, k::Int=d])
Return a random positive semidefinite correlation matrix.
See also: cor_randPD
Examples
julia> cor_randPSD(4)
4×4 Array{Float64,2}:
1.0 0.81691 -0.27188 0.289011
0.81691 1.0 -0.44968 0.190938
-0.27188 -0.44968 1.0 -0.102597
0.289011 0.190938 -0.102597 1.0
julia> cor_randPSD(4, 1)
4×4 Array{Float64,2}:
1.0 -0.800513 0.541379 -0.650587
-0.800513 1.0 -0.656411 0.788824
0.541379 -0.656411 1.0 -0.533473
-0.650587 0.788824 -0.533473 1.0
MvSim.cor_randPD
— Functioncor_randPD([T::Type{<:AbstractFloat}], d::Int[, k::Int=d])
The same as cor_randPSD
, but calls cor_fastPD
to ensure that the returned matrix is positive definite.
Examples
julia> cor_randPSD(4)
4×4 Array{Float64,2}:
1.0 0.356488 0.701521 -0.251671
0.356488 1.0 0.382787 -0.117748
0.701521 0.382787 1.0 -0.424952
-0.251671 -0.117748 -0.424952 1.0
julia> cor_randPSD(4, 1)
4×4 Array{Float64,2}:
1.0 -0.0406469 -0.127517 -0.133308
-0.0406469 1.0 0.265604 0.277665
-0.127517 0.265604 1.0 0.871089
-0.133308 0.277665 0.871089 1.0
Converting
Convert a correlation matrix by finding a positive [semi]definite representation.
MvSim.cor_nearPD
— Functioncor_nearPD(R::Matrix{Float64}[, τ::Float64=1e-6[, tol::Float64=1e-6]])
Return the nearest positive definite correlation matrix to R
.
See also: cor_fastPD
, cor_fastPD!
Examples
julia> import LinearAlgebra: isposdef
julia> r = [1.00 0.82 0.56 0.44; 0.82 1.00 0.28 0.85; 0.56 0.28 1.00 0.22; 0.44 0.85 0.22 1.00]
4×4 Array{Float64,2}:
1.0 0.82 0.56 0.44
0.82 1.0 0.28 0.85
0.56 0.28 1.0 0.22
0.44 0.85 0.22 1.0
julia> isposdef(r)
false
julia> r̃ = cor_nearPD(r)
4×4 Array{Float64,2}:
1.0 0.817494 0.559416 0.441494
0.817494 1.0 0.280852 0.847812
0.559416 0.280852 1.0 0.21949
0.441494 0.847812 0.21949 1.0
julia> isposdef(r̃)
true
MvSim.cor_fastPD
— Functioncor_fastPD(R::Matrix{<:AbstractFloat}[, τ=1e-6])
Return a positive definite correlation matrix that is close to R
.
See also: cor_fastPD!
, cor_nearPD
Examples
julia> import LinearAlgebra: isposdef
julia> r = [1.00 0.82 0.56 0.44; 0.82 1.00 0.28 0.85; 0.56 0.28 1.00 0.22; 0.44 0.85 0.22 1.00]
4×4 Array{Float64,2}:
1.0 0.82 0.56 0.44
0.82 1.0 0.28 0.85
0.56 0.28 1.0 0.22
0.44 0.85 0.22 1.0
julia> isposdef(r)
false
julia> r̃ = cor_fastPD(r)
4×4 Array{Float64,2}:
1.0 0.817095 0.559306 0.440514
0.817095 1.0 0.280196 0.847352
0.559306 0.280196 1.0 0.219582
0.440514 0.847352 0.219582 1.0
julia> isposdef(r̃)
true
MvSim.cor_fastPD!
— Functioncor_fastPD!(R::Matrix{<:AbstractFloat}[, τ=1e-6])
Same as cor_fastPD
, but saves space by overwriting the input R
, instead of creating a copy.
See also: cor_fastPD
, cor_nearPD
Examples
julia> import LinearAlgebra: isposdef
julia> r = [1.00 0.82 0.56 0.44; 0.82 1.00 0.28 0.85; 0.56 0.28 1.00 0.22; 0.44 0.85 0.22 1.00]
4×4 Array{Float64,2}:
1.0 0.82 0.56 0.44
0.82 1.0 0.28 0.85
0.56 0.28 1.0 0.22
0.44 0.85 0.22 1.0
julia> isposdef(r)
false
julia> cor_fastPD!(r)
4×4 Array{Float64,2}:
1.0 0.817095 0.559306 0.440514
0.817095 1.0 0.280196 0.847352
0.559306 0.280196 1.0 0.219582
0.440514 0.847352 0.219582 1.0
julia> isposdef(r)
true
Convert a correlation matrix using other utilities.
MvSim.cor_convert
— Functioncor_convert(R::Matrix{<:AbstractFloat}, from::Type{<:Correlation}, to::Type{<:Correlation})
Convert from one type of correlation matrix to another.
The role of conversion in this package is typically used from either Spearman or Kendall to Pearson where the Pearson correlation is used in the generation of random multivariate normal samples. After converting, the correlation matrix may not be positive semidefinite, so it is recommended to check using LinearAlgebra.isposdef
, and subsequently calling cor_nearPD
.
See also: cor_nearPD
, cor_fastPD
The possible correlation types are:
Examples
julia> r = [ 1.0 -0.634114 0.551645 0.548993
-0.634114 1.0 -0.332105 -0.772114
0.551645 -0.332105 1.0 0.143949
0.548993 -0.772114 0.143949 1.0];
julia> cor_convert(r, Pearson, Spearman)
4×4 Array{Float64,2}:
1.0 -0.616168 0.533701 0.531067
-0.616168 1.0 -0.318613 -0.756979
0.533701 -0.318613 1.0 0.13758
0.531067 -0.756979 0.13758 1.0
julia> cor_convert(r, Spearman, Kendall)
4×4 Array{Float64,2}:
1.0 -0.452063 0.385867 0.383807
-0.452063 1.0 -0.224941 -0.576435
0.385867 -0.224941 1.0 0.0962413
0.383807 -0.576435 0.0962413 1.0
julia> r == cor_convert(r, Pearson, Pearson)
true
MvSim.cov2cor
— Functioncov2cor(C::Matrix{<:AbstractFloat})
Transform a covariance matrix into a correlation matrix.
Details
If $X \in \mathbb{R}^{n \times n}$ is a covariance matrix, then
\[\tilde{X} = D^{-1/2} X D^{-1/2}, \quad D = \mathrm{diag(X)}\]
is the associated correlation matrix.
MvSim.cov2cor!
— Functioncov2cor!(C::Matrix{<:AbstractFloat})
Same as cov2cor
, except that the matrix C
is updated in place to save memory.
MvSim.cor_constrain
— Functioncor_constrain(C::Matrix{<:AbstractFloat}[, uplo=:U])
Constrain a matrix so that its diagonal elements are 1, off-diagonal elements are bounded between -1 and 1, and a symmetric view of the upper (if uplo = :U
) or lower (if uplo = :L
) triangle is made.
See also: cor_constrain!
Examples
julia> a = [ 0.802271 0.149801 -1.1072 1.13451
0.869788 -0.824395 0.38965 0.965936
-1.45353 -1.29282 0.417233 -0.362526
0.638291 -0.682503 1.12092 -1.27018];
julia> cor_constrain(a)
4×4 Array{Float64,2}:
1.0 0.149801 -1.0 1.0
0.149801 1.0 0.38965 0.965936
-1.0 0.38965 1.0 -0.362526
1.0 0.965936 -0.362526 1.0
julia> cor_constrain(a, :L)
4×4 Array{Float64,2}:
1.0 0.869788 -1.0 0.638291
0.869788 1.0 -1.0 -0.682503
-1.0 -1.0 1.0 1.0
0.638291 -0.682503 1.0 1.0
MvSim.cor_constrain!
— Functioncor_constrain!(C::Matrix{<:AbstractFloat}[, uplo=:U])
Same as cor_constrain
, except that the matrix C
is updated in place to save memory.
Examples
julia> a = [ 0.802271 0.149801 -1.1072 1.13451
0.869788 -0.824395 0.38965 0.965936
-1.45353 -1.29282 0.417233 -0.362526
0.638291 -0.682503 1.12092 -1.27018];
julia> cor_constrain!(a)
julia> a
4×4 Array{Float64,2}:
1.0 0.149801 -1.0 1.0
0.149801 1.0 0.38965 0.965936
-1.0 0.38965 1.0 -0.362526
1.0 0.965936 -0.362526 1.0
Pearson Matching
MvSim.pearson_match
— Functionpearson_match(ρ::Real, dA::UnivariateDistribution, dB::UnivariateDistribution; n::Int=7)
Compute the pearson correlation coefficient that is necessary to achieve the target correlation given a pair of marginal distributions.
See also: pearson_bounds
Examples
julia> using Distributions
julia> A = Normal(78, 10); B = LogNormal(3, 1);
julia> pearson_match(0.76, A, B)
0.9962326957691602
The target correlation may not be feasible (see pearson_bounds
), in which case the match to the nearest lower or upper bound is returned.
julia> pearson_match(0.9, A, B)
┌ Warning: The target correlation is not feasible. Returning the match to the nearest bound instead.
[...]
0.9986891675055749
pearson_match(D::MvDistribution; n::Int=7)
Return a MvDistribution type with the matched Pearson correlation coefficients and the Pearson correlation type.
If the input correlation matrix is anything other than Pearson
, then convert it to Pearson, perform the matching algorithm, and then finally ensure that the resulting correlation matrix is positive definite. The target correlation may not be feasible (see pearson_bounds
), in which case the match to the nearest lower or upper bound is returned.
See also: pearson_bounds
Examples
julia> using Distributions
julia> margins = [Normal(78, 10), LogNormal(3, 1)];
julia> r = [1.0 0.7; 0.7 1.0]
2×2 Array{Float64,2}:
1.0 0.7
0.7 1.0
julia> D = MvDistribution(r, margins, Spearman);
julia> R = pearson_match(D);
julia> cor(R)
2×2 Array{Float64,2}:
1.0 0.917583
0.917583 1.0
julia> cortype(R)
Pearson
MvSim.pearson_bounds
— Functionpearson_bounds(dA::UnivariateDistribution, dB::UnivariateDistribution, μA, μB, σA, σB; n::Int=7)
Compute the theoretical lower and upper Pearson correlation values for a pair of univariate distributions.
See also: pearson_match
Examples
julia> using Distributions
julia> A = Normal(78, 10); B = LogNormal(3, 1);
julia> pearson_bounds(A, B)
(lower = -0.7628739783665699, upper = 0.7628739783663034)
pearson_bounds(D::MvDistribution)
Compute the pairwise theoretical lower and upper Pearson correlation values for a set of univariate distributions. The correlation matrix and correlation type are ignored when using this function on the MvDistribution
type.
See also: pearson_match
Examples
julia> using Distributions
julia> margins = [Normal(78, 10), LogNormal(3, 1)];
julia> r = [1.0 0.7; 0.7 1.0]
2×2 Array{Float64,2}:
1.0 0.7
0.7 1.0
julia> D = MvDistribution(r, margins, Pearson);
julia> bounds = pearson_bounds(D);
julia> bounds.lower
2×2 Array{Float64,2}:
1.0 -0.762874
-0.762874 1.0
julia> bounds.upper
2×2 Array{Float64,2}:
1.0 0.762874
0.762874 1.0