limix.qtl.scan
- limix.qtl.scan(G, Y, lik='normal', K=None, M=None, idx=None, A=None, A0=None, A1=None, verbose=True)
Multi-trait association and interaction testing via linear mixed models.
Let n, c, and p be the number of samples, covariates, and traits, respectively. The outcome variable Y is an n×p matrix distributed according to

vec(Y) ~ N((A ⊗ M) vec(𝚨), K₀ = C₀ ⊗ K + C₁ ⊗ I) under H₀.

A and M are design matrices of dimensions p×p and n×c provided by the user, where M is the usual matrix of covariates commonly used in single-trait models. 𝚨 is a c×p matrix of fixed-effect sizes per trait. C₀ and C₁ are both symmetric matrices of dimensions p×p, for which C₁ is guaranteed by our implementation to be of full rank. The parameters of the H₀ model are the matrices 𝚨, C₀, and C₁.
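The Kronecker structure of this mean and covariance can be sketched in plain NumPy. This is an illustrative sketch, not limix's internal implementation; all variable names are ours. It relies on the identity vec(MXAᵀ) = (A ⊗ M) vec(X), with vec() stacking columns (Fortran order):

```python
import numpy as np

rng = np.random.default_rng(0)
n, c, p = 5, 2, 3  # samples, covariates, traits

A = rng.standard_normal((p, p))      # trait-by-trait design matrix
M = rng.standard_normal((n, c))      # covariates matrix
alpha = rng.standard_normal((c, p))  # fixed-effect sizes per trait (the matrix called 𝚨)

# vec() stacks columns, so vec(M @ alpha @ A.T) == kron(A, M) @ vec(alpha).
def vec(x):
    return x.reshape(-1, order="F")

mean = np.kron(A, M) @ vec(alpha)

# Covariance of vec(Y) under H0: K0 = C0 (x) K + C1 (x) I.
K = rng.standard_normal((n, n)); K = K @ K.T    # sample covariance (kinship)
C0 = rng.standard_normal((p, p)); C0 = C0 @ C0.T  # trait covariance, genetic part
C1 = rng.standard_normal((p, p)); C1 = C1 @ C1.T  # trait covariance, noise part
K0 = np.kron(C0, K) + np.kron(C1, np.eye(n))

assert mean.shape == (n * p,) and K0.shape == (n * p, n * p)
```

The assertion only checks shapes; the vec/Kronecker identity itself can be verified by comparing `mean` against `vec(M @ alpha @ A.T)`.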
The additional models H₁ and H₂ are defined as

vec(Y) ~ N((A ⊗ M) vec(𝚨) + (A₀ ⊗ Gᵢ) vec(𝚩₀), s₁ K₀)

and

vec(Y) ~ N((A ⊗ M) vec(𝚨) + (A₀ ⊗ Gᵢ) vec(𝚩₀) + (A₁ ⊗ Gᵢ) vec(𝚩₁), s₂ K₀)
It performs likelihood-ratio tests for the following cases, where the first hypothesis is the null one while the second hypothesis is the alternative one:
H₀ vs H₁: testing for vec(𝚩₀) ≠ 𝟎 while vec(𝚩₁) = 𝟎
H₀ vs H₂: testing for [vec(𝚩₀) vec(𝚩₁)] ≠ 𝟎
H₁ vs H₂: testing for vec(𝚩₁) ≠ 𝟎
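Each of these comparisons reduces to the usual chi-square likelihood-ratio recipe, where the degrees of freedom equal the number of effect sizes fixed to zero under the null. A hedged sketch (the function name is ours, not limix API):

```python
from scipy.stats import chi2

def lrt_pvalue(lml_null, lml_alt, dof):
    """P-value for the statistic -2*(lml_null - lml_alt) ~ chi2(dof)."""
    stat = 2.0 * (lml_alt - lml_null)
    return chi2(df=dof).sf(stat)

# Example: plugging in the H0 and H2 marginal log-likelihoods from the
# Poisson example below reproduces its reported p-value of about 0.544.
pv = lrt_pvalue(-48.720890, -48.536860, dof=1)
```

limix reports these quantities per candidate in the result's statistics table (e.g. columns `lml0`, `lml2`, `dof20`, `pv20`).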
It supports generalized linear mixed models (GLMM) when a single trait is used. In this case, the following likelihoods are implemented:
Bernoulli
Probit
Binomial
Poisson
Formally, let p(𝜇) be one of the supported probability distributions where 𝜇 is its mean. The H₀ model is defined as follows:

yᵢ ∼ p(μᵢ=g(zᵢ)) for 𝐳 ∼ 𝓝(..., ...).

g(·) is the corresponding canonical link function for the Bernoulli, Binomial, and Poisson likelihoods. The Probit likelihood, on the other hand, is a Bernoulli likelihood with probit link function.
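The inverse link functions mentioned above can be written out directly. This is a sketch of the standard definitions limix applies internally, not limix code:

```python
import numpy as np
from scipy.stats import norm

# Inverse link g mapping the latent value z to the distribution's mean:
# canonical links for Bernoulli/Binomial (logit) and Poisson (log),
# plus the probit link for the Probit likelihood.
links = {
    "poisson":   np.exp,                           # log link: lambda = e^z
    "bernoulli": lambda z: 1 / (1 + np.exp(-z)),   # logit link
    "binomial":  lambda z: 1 / (1 + np.exp(-z)),   # logit link (per-trial success prob.)
    "probit":    norm.cdf,                         # Bernoulli with probit link
}

assert np.isclose(links["bernoulli"](0.0), 0.5)
```

Note that these match the `g(x)=eˣ` and `g(x)=1/(1+e⁻ˣ)` lines printed in the Poisson and Binomial example summaries below.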
- Parameters
  G (n×m array_like) – Genetic candidates.
  Y (n×p array_like) – Rows are samples and columns are phenotypes.
  lik (tuple, "normal", "bernoulli", "probit", "binomial", "poisson") – Sample likelihood describing the residual distribution. Either a tuple or a string specifying the likelihood is required. The Normal, Bernoulli, Probit, and Poisson likelihoods can be selected by providing a string. The Binomial likelihood, on the other hand, requires a tuple because of the number of trials: ("binomial", array_like). Defaults to "normal".
  K (n×n array_like) – Sample covariance, often the so-called kinship matrix.
  M (n×c array_like) – Covariates matrix.
  idx (list) – List of candidate indices that defines the set of candidates to be used in the tests.
  A (p×p array_like) – Symmetric trait-by-trait design matrix.
  A0 (p×p₀ array_like, optional) – Matrix A₀, possibly a non-symmetric one. If None, it defines an empty matrix, p₀=0. Defaults to None.
  A1 (p×p₁ array_like, optional) – Matrix A₁, possibly a non-symmetric one. If None, it defines an identity matrix, p₁=p. Defaults to None.
  verbose (bool, optional) – True to display progress and summary; False otherwise. Defaults to True.
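The stated defaults for A0 and A1 can be pictured in plain NumPy. This is a sketch of the documented semantics, not limix source code:

```python
import numpy as np

p = 3  # number of traits

# A0=None behaves like an empty p x 0 matrix (p0 = 0),
# so the B0 term drops out of H1 and H2.
A0 = np.empty((p, 0))

# A1=None behaves like the p x p identity (p1 = p),
# giving each trait its own candidate effect size.
A1 = np.eye(p)

# The H2 design concatenates both, yielding a p x (p0 + p1) matrix.
A01 = np.concatenate((A0, A1), axis=1)
assert A01.shape == (p, p)
```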
- Returns
  result – P-values, log of marginal likelihoods, effect sizes, and associated statistics.
- Return type
  limix.qtl._result.STScanResult, limix.qtl._result.MTScanResult
Examples
>>> from limix.qtl import scan
>>> from numpy import reshape, kron, eye
>>> from numpy import concatenate
>>> from numpy.random import RandomState
>>> import scipy.stats as st
>>> from limix.qc import normalise_covariance
>>>
>>> def vec(x):
...     return reshape(x, (-1,) + x.shape[2:], order="F")
>>>
>>> def unvec(x, shape):
...     return reshape(x, shape, order="F")
>>>
>>> random = RandomState(0)
>>> n = 30
>>> ntraits = 2
>>> ncovariates = 3
>>>
>>> A = random.randn(ntraits, ntraits)
>>> A = A @ A.T
>>> M = random.randn(n, ncovariates)
>>>
>>> C0 = random.randn(ntraits, ntraits)
>>> C0 = C0 @ C0.T
>>>
>>> C1 = random.randn(ntraits, ntraits)
>>> C1 = C1 @ C1.T
>>>
>>> G = random.randn(n, 4)
>>>
>>> A0 = random.randn(ntraits, 1)
>>> A1 = random.randn(ntraits, 2)
>>> A01 = concatenate((A0, A1), axis=1)
>>>
>>> K = random.randn(n, n + 1)
>>> K = normalise_covariance(K @ K.T)
>>>
>>> beta = vec(random.randn(ntraits, ncovariates))
>>> alpha = vec(random.randn(A01.shape[1], G.shape[1]))
>>>
>>> mvn = st.multivariate_normal
>>> m = kron(A, M) @ beta + kron(A01, G) @ alpha
>>> Y = unvec(mvn(m, kron(C0, K) + kron(C1, eye(n))).rvs(), (n, -1))
>>>
>>> idx = [[0, 1], 2, [3]]
>>> r = scan(G, Y, idx=idx, K=K, M=M, A=A, A0=A0, A1=A1, verbose=False)
>>> from numpy import dot, exp, sqrt, ones
>>> from numpy.random import RandomState
>>> from pandas import DataFrame
>>> import pandas as pd
>>> from limix.qtl import scan
>>>
>>> random = RandomState(1)
>>> pd.options.display.float_format = "{:9.6f}".format
>>>
>>> n = 30
>>> p = 3
>>> samples_index = range(n)
>>>
>>> M = DataFrame(dict(offset=ones(n), age=random.randint(10, 60, n)))
>>> M.index = samples_index
>>>
>>> X = random.randn(n, 100)
>>> K = dot(X, X.T)
>>>
>>> candidates = random.randn(n, p)
>>> candidates = DataFrame(candidates, index=samples_index,
...                        columns=['rs0', 'rs1', 'rs2'])
>>>
>>> y = random.poisson(exp(random.randn(n)))
>>>
>>> result = scan(candidates, y, 'poisson', K, M=M, verbose=False)
>>>
>>> result.stats
           lml0       lml2  dof20    scale2      pv20
test
0    -48.720890 -48.536860      1  0.943532  0.544063
1    -48.720890 -47.908341      1  0.904814  0.202382
2    -48.720890 -48.534754      1  0.943400  0.541768
>>> print(result)
Hypothesis 0
------------

𝐳 ~ 𝓝(𝙼𝜶, 0.000⋅𝙺 + 0.788⋅𝙸) for yᵢ ~ Poisson(λᵢ=g(zᵢ)) and g(x)=eˣ

M     = ['offset' 'age']
𝜶     = [ 0.39528889 -0.00556797]
se(𝜶) = [0.50173695 0.01505240]
lml   = -48.720890273519444

Hypothesis 2
------------

𝐳 ~ 𝓝(𝙼𝜶 + G𝛃, s(0.000⋅𝙺 + 0.788⋅𝙸)) for yᵢ ~ Poisson(λᵢ=g(zᵢ)) and g(x)=eˣ

          lml       cov. effsizes   cand. effsizes
--------------------------------------------------
mean  -4.833e+01       2.393e-01       -1.966e-01
std    3.623e-01       2.713e-01        1.028e-01
min   -4.854e+01      -8.490e-03       -3.151e-01
25%   -4.854e+01      -7.684e-03       -2.295e-01
50%   -4.853e+01       2.243e-01       -1.439e-01
75%   -4.822e+01       4.725e-01       -1.374e-01
max   -4.791e+01       5.255e-01       -1.309e-01

Likelihood-ratio test p-values
------------------------------

       𝓗₀ vs 𝓗₂
----------------
mean   4.294e-01
std    1.966e-01
min    2.024e-01
25%    3.721e-01
50%    5.418e-01
75%    5.429e-01
max    5.441e-01
>>> from numpy import zeros
>>>
>>> nsamples = 50
>>>
>>> X = random.randn(nsamples, 2)
>>> G = random.randn(nsamples, 100)
>>> K = dot(G, G.T)
>>> ntrials = random.randint(1, 100, nsamples)
>>> z = dot(G, random.randn(100)) / sqrt(100)
>>>
>>> successes = zeros(len(ntrials), int)
>>> for i, nt in enumerate(ntrials):
...     for _ in range(nt):
...         successes[i] += int(z[i] + 0.5 * random.randn() > 0)
>>>
>>> result = scan(X, successes, ("binomial", ntrials), K, verbose=False)
>>> print(result)
Hypothesis 0
------------

𝐳 ~ 𝓝(𝙼𝜶, 0.152⋅𝙺 + 1.738⋅𝙸) for yᵢ ~ Binom(μᵢ=g(zᵢ), nᵢ) and g(x)=1/(1+e⁻ˣ)

M     = ['offset']
𝜶     = [0.40956942]
se(𝜶) = [0.55141166]
lml   = -142.80784719977515

Hypothesis 2
------------

𝐳 ~ 𝓝(𝙼𝜶 + G𝛃, s(0.152⋅𝙺 + 1.738⋅𝙸)) for yᵢ ~ Binom(μᵢ=g(zᵢ), nᵢ) and g(x)=1/(1+e⁻ˣ)

          lml       cov. effsizes   cand. effsizes
--------------------------------------------------
mean  -1.425e+02       3.701e-01        2.271e-01
std    4.110e-01       2.296e-02        5.680e-01
min   -1.427e+02       3.539e-01       -1.745e-01
25%   -1.426e+02       3.620e-01        2.631e-02
50%   -1.425e+02       3.701e-01        2.271e-01
75%   -1.423e+02       3.782e-01        4.279e-01
max   -1.422e+02       3.864e-01        6.287e-01

Likelihood-ratio test p-values
------------------------------

       𝓗₀ vs 𝓗₂
----------------
mean   4.959e-01
std    3.362e-01
min    2.582e-01
25%    3.771e-01
50%    4.959e-01
75%    6.148e-01
max    7.336e-01
Notes
It will raise a ValueError exception if non-finite values are passed. Please refer to the limix.qc.mean_impute() function for missing-value imputation.
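Column-mean imputation, which is what limix.qc.mean_impute provides, can be sketched in plain NumPy. This is an illustrative stand-in; prefer the limix function for real use:

```python
import numpy as np

def mean_impute_sketch(X):
    """Replace NaNs in each column by that column's mean.

    Illustrative only; limix.qc.mean_impute is the supported API.
    """
    X = np.asarray(X, dtype=float).copy()
    col_means = np.nanmean(X, axis=0)          # per-column mean ignoring NaNs
    rows, cols = np.where(np.isnan(X))         # positions of missing entries
    X[rows, cols] = col_means[cols]            # fill each NaN with its column mean
    return X

G = np.array([[1.0, np.nan],
              [3.0, 4.0]])
G_imputed = mean_impute_sketch(G)  # the NaN becomes its column's mean, 4.0
```

After imputation, `np.isfinite(G_imputed).all()` holds, satisfying scan's requirement of finite inputs.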