limix

limix.io.plink.read

limix.io.plink.read(prefix, verbose=True)

Read a set of PLINK files into two pandas data frames and a genotype matrix.

Parameters
  • prefix (str) – Path prefix to the set of PLINK files.

  • verbose (bool) – Whether to show progress information. Defaults to True.
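
The prefix is the shared path of the fileset without any extension. As a minimal sketch, assuming the standard PLINK binary fileset of .bed, .bim, and .fam files, the hypothetical helpers below (plink_fileset and missing_files are illustrative names, not part of limix) list the files a prefix refers to and report which of them are absent before read is called:

```python
from pathlib import Path

# Extensions of the standard PLINK binary fileset. Assumption: read() expects
# this trio; the helpers below are hypothetical and not part of limix.
PLINK_EXTENSIONS = (".bed", ".bim", ".fam")


def plink_fileset(prefix):
    """Return the paths that a PLINK prefix is expected to refer to."""
    # Append each extension as plain text so that a prefix that already
    # contains dots (e.g. "chr1.filtered") is left intact.
    return [Path(str(prefix) + ext) for ext in PLINK_EXTENSIONS]


def missing_files(prefix):
    """Return the expected files that do not exist on disk."""
    return [path for path in plink_fileset(prefix) if not path.is_file()]
```

An empty list from missing_files(prefix) suggests that the prefix points at a complete fileset.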

Returns

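  • alleles (pandas dataframe) – One row per variant, with the columns chrom, snp, cm, pos, a0, a1, and i.

  • samples (pandas dataframe) – One row per sample, with the columns fid, iid, father, mother, gender, trait, and i.

  • genotype (ndarray) – Genotype matrix with variants as rows and samples as columns. It is evaluated lazily: call compute(), as in the examples, to obtain the values.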

Examples

>>> from os.path import join
>>> from limix.io import plink
>>> from pandas_plink import get_data_folder
>>>
>>> (bim, fam, bed) = plink.read(join(get_data_folder(), "data"), verbose=False)
>>> print(bim.head())
           chrom         snp       cm    pos a0 a1  i
candidate
rs10399749     1  rs10399749  0.00000  45162  G  C  0
rs2949420      1   rs2949420  0.00000  45257  C  T  1
rs2949421      1   rs2949421  0.00000  45413  0  0  2
rs2691310      1   rs2691310  0.00000  46844  A  T  3
rs4030303      1   rs4030303  0.00000  72434  0  G  4
>>> print(fam.head())
               fid       iid    father    mother gender    trait  i
sample
Sample_1  Sample_1  Sample_1         0         0      1 -9.00000  0
Sample_2  Sample_2  Sample_2         0         0      2 -9.00000  1
Sample_3  Sample_3  Sample_3  Sample_1  Sample_2      2 -9.00000  2
>>> print(bed.compute())  
[[ 2.  2.  1.]
 [ 2.  1.  2.]
 [nan nan nan]
 [nan nan  1.]
 [ 2.  2.  2.]
 [ 2.  2.  2.]
 [ 2.  1.  0.]
 [ 2.  2.  2.]
 [ 1.  2.  2.]
 [ 2.  1.  2.]]
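
The nan entries in the matrix above can be counted per variant before any analysis. A minimal sketch, assuming that nan marks a missing genotype call and using a NumPy array G that copies the first five rows printed above in place of bed.compute():

```python
import numpy as np

# Stand-in for the first five rows of bed.compute() shown above
# (rows are variants, columns are samples).
G = np.array(
    [
        [2.0, 2.0, 1.0],
        [2.0, 1.0, 2.0],
        [np.nan, np.nan, np.nan],
        [np.nan, np.nan, 1.0],
        [2.0, 2.0, 2.0],
    ]
)

# Assumption: nan marks a missing genotype call.
missing = np.isnan(G)

# Fraction of samples with a missing call, one value per variant (row).
missing_rate = missing.mean(axis=1)

# Keep only the variants that have a call for every sample.
complete = G[~missing.any(axis=1)]
```

Here missing_rate has one entry per variant and complete drops the two rows that contain nan.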

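Note the i column in the bim and fam data frames. In bim it gives the row of each variant in the bed matrix, and in fam it gives the column of each sample. For example, to select the genotypes of the variants on chromosome 1: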

>>> from os.path import join
>>> from limix.io import plink
>>> from pandas_plink import get_data_folder
>>>
>>> (bim, fam, bed) = plink.read(join(get_data_folder(), "data"), verbose=False)
>>> chrom1 = bim.query("chrom=='1'")
>>> X = bed[chrom1.i.values, :].compute()
>>> print(X)  
[[ 2.  2.  1.]
 [ 2.  1.  2.]
 [nan nan nan]
 [nan nan  1.]
 [ 2.  2.  2.]
 [ 2.  2.  2.]
 [ 2.  1.  0.]
 [ 2.  2.  2.]
 [ 1.  2.  2.]
 [ 2.  1.  2.]]
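
The same pattern applies to samples through the i column of fam. A minimal sketch, using small stand-in objects built from the output printed above rather than real PLINK files; indexing the columns of the matrix by fam.i is an assumption inferred from the matrix shape (three columns, three samples):

```python
import numpy as np
import pandas as pd

# Stand-in for the fam data frame printed above (only the columns used here).
fam = pd.DataFrame(
    {
        "iid": ["Sample_1", "Sample_2", "Sample_3"],
        "i": [0, 1, 2],
    }
)

# Stand-in for the first three rows of bed.compute() printed above.
G = np.array(
    [
        [2.0, 2.0, 1.0],
        [2.0, 1.0, 2.0],
        [np.nan, np.nan, np.nan],
    ]
)

# Pick two samples by their individual identifier...
selected = fam[fam.iid.isin(["Sample_1", "Sample_3"])]

# ...and use their i values to take the matching columns
# (assumption: columns of the matrix correspond to samples).
X = G[:, selected.i.values]
```

With a real fileset, G would be replaced by bed and the selection followed by compute(), as in the example above.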

© Copyright 2018, Danilo Horta. Revision bed5b8e0.
