limix.io.plink.read

limix.io.plink.read(prefix, verbose=True)

Read a set of PLINK files (BED, BIM and FAM) into two pandas data frames and a genotype matrix.

Parameters

prefix (str) – Path prefix shared by the set of PLINK files, i.e. the path without the .bed, .bim or .fam extension.
verbose (bool) – True to show progress information; False otherwise.

Returns

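alleles (pandas dataframe) – Variant information read from the BIM file.
samples (pandas dataframe) – Sample information read from the FAM file.
genotype (dask array) – Genotype matrix read from the BED file, with variants along rows and samples along columns; call compute() to obtain a NumPy array.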

Examples

>>> from os.path import join
>>> from limix.io import plink
>>> from pandas_plink import get_data_folder
>>>
>>> (bim, fam, bed) = plink.read(join(get_data_folder(), "data"), verbose=False)
>>> print(bim.head())
           chrom         snp       cm    pos a0 a1  i
candidate
rs10399749     1  rs10399749  0.00000  45162  G  C  0
rs2949420      1   rs2949420  0.00000  45257  C  T  1
rs2949421      1   rs2949421  0.00000  45413  0  0  2
rs2691310      1   rs2691310  0.00000  46844  A  T  3
rs4030303      1   rs4030303  0.00000  72434  0  G  4
>>> print(fam.head())
               fid       iid    father    mother gender    trait  i
sample
Sample_1  Sample_1  Sample_1         0         0      1 -9.00000  0
Sample_2  Sample_2  Sample_2         0         0      2 -9.00000  1
Sample_3  Sample_3  Sample_3  Sample_1  Sample_2      2 -9.00000  2
>>> print(bed.compute())  
[[ 2.  2.  1.]
 [ 2.  1.  2.]
 [nan nan nan]
 [nan nan  1.]
 [ 2.  2.  2.]
 [ 2.  2.  2.]
 [ 2.  1.  0.]
 [ 2.  2.  2.]
 [ 1.  2.  2.]
 [ 2.  1.  2.]]

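Notice the i column in the bim and fam data frames. In bim it gives the row of the bed matrix that holds each variant; in fam it gives the column that holds each sample. This lets you select a subset of the genotype matrix from a query on either data frame: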

>>> from os.path import join
>>> from limix.io import plink
>>> from pandas_plink import get_data_folder
>>>
>>> (bim, fam, bed) = plink.read(join(get_data_folder(), "data"), verbose=False)
>>> chrom1 = bim.query("chrom=='1'")
>>> X = bed[chrom1.i.values, :].compute()
>>> print(X)  
[[ 2.  2.  1.]
 [ 2.  1.  2.]
 [nan nan nan]
 [nan nan  1.]
 [ 2.  2.  2.]
 [ 2.  2.  2.]
 [ 2.  1.  0.]
 [ 2.  2.  2.]
 [ 1.  2.  2.]
 [ 2.  1.  2.]]
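
The same i-based indexing works along the sample axis through fam. The sketch below illustrates it on made-up data rather than on the PLINK files above: the toy bim and fam frames and their values are assumptions, and a NumPy array stands in for the lazily evaluated bed matrix so the snippet runs without limix or pandas_plink installed.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the objects returned by plink.read (all values are made up).
bim = pd.DataFrame(
    {
        "chrom": ["1", "1", "2", "2"],
        "snp": ["rsA", "rsB", "rsC", "rsD"],
        "i": [0, 1, 2, 3],
    }
)
fam = pd.DataFrame({"iid": ["S1", "S2", "S3"], "i": [0, 1, 2]})

# Rows are variants and columns are samples, as in the examples above.
bed = np.array(
    [
        [2.0, 2.0, 1.0],
        [2.0, 1.0, 2.0],
        [np.nan, np.nan, 1.0],
        [2.0, 1.0, 0.0],
    ]
)

# Row positions of the variants on chromosome 2.
variant_idx = bim.query("chrom == '2'")["i"].to_numpy()
# Column positions of two chosen samples.
sample_idx = fam[fam["iid"].isin(["S2", "S3"])]["i"].to_numpy()

# Index one axis at a time: rows (variants) first, then columns (samples).
X = bed[variant_idx, :][:, sample_idx]
print(X)
```

Indexing one axis at a time avoids pairing the two index arrays element-wise. With the real bed matrix, the selection would presumably be followed by a call to compute(), as in the examples above.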