limix.stats.allele_expectation

limix.stats.allele_expectation(p, nalleles, ploidy)[source]

Allele expectation.

Compute the expectation of each allele from the given probabilities. It accepts three shapes of matrices: - unidimensional array of probabilities; - bidimensional samples-by-alleles probabilities array; - and three dimensional variants-by-samples-by-alleles array.

Parameters
  • p (array_like) – Allele probabilities.

  • nalleles (int) – Number of alleles.

  • ploidy (int) – Number of complete sets of chromosomes.

Returns

expectation – Last dimension will contain the expectation of each allele.

Return type

ndarray

Examples

>>> from texttable import Texttable
>>> from bgen_reader import read_bgen, allele_expectation, example_files
>>>
>>> sampleid = "sample_005"
>>> rsid = "RSID_6"
>>>
>>> with example_files("example.32bits.bgen") as filepath:
...     bgen = read_bgen(filepath, verbose=False)
...
...     locus = bgen["variants"].query("rsid == '{}'".format(rsid)).index
...     locus = locus.compute().values[0]
...     sample = bgen["samples"].to_frame().query("id == '{}'".format(sampleid))
...     sample = sample.index
...     sample = sample[0]
...
...     nalleles = bgen["variants"].loc[locus]["nalleles"]
...     ploidy = 2
...
...     p = bgen["genotype"][locus].compute()["probs"][sample]
...     # For unphased genotypes only.
...     e = allele_expectation(bgen, locus)[sample]
...
...     alleles = bgen["variants"].loc[locus]["allele_ids"].compute()
...     alleles = alleles.values[0].split(",")
...
...     tab = Texttable()
...
...     print(tab.add_rows(
...         [
...             ["", "AA", "AG", "GG", "E[.]"],
...             ["p"] + list(p) + [1.0],
...             ["#" + alleles[0], 2, 1, 0, e[0]],
...             ["#" + alleles[1], 0, 1, 2, e[1]],
...         ]
...     ).draw())
+----+-------+-------+-------+-------+
|    |  AA   |  AG   |  GG   | E[.]  |
+====+=======+=======+=======+=======+
| p  | 0.012 | 0.987 | 0.001 | 1     |
+----+-------+-------+-------+-------+
| #A | 2     | 1     | 0     | 1.011 |
+----+-------+-------+-------+-------+
| #G | 0     | 1     | 2     | 0.989 |
+----+-------+-------+-------+-------+
>>> print("variant: {}".format(rsid))
variant: RSID_6
>>> print("sample : {}".format(sampleid))
sample : sample_005

Note

This function supports unphased genotypes only.