limix.io.plink.read
- limix.io.plink.read(prefix, verbose=True)
Read PLINK files into Pandas data frames.
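- Parameters
prefix (str): path prefix shared by the PLINK files to read
verbose (bool): whether to print progress information; defaults to True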
- Returns
alleles (pandas dataframe)
samples (pandas dataframe)
genotype (ndarray)
Examples
>>> from os.path import join
>>> from limix.io import plink
>>> from pandas_plink import get_data_folder
>>>
>>> (bim, fam, bed) = plink.read(join(get_data_folder(), "data"), verbose=False)
>>> print(bim.head())
           chrom         snp      cm    pos a0 a1  i
candidate
rs10399749     1  rs10399749 0.00000  45162  G  C  0
rs2949420      1   rs2949420 0.00000  45257  C  T  1
rs2949421      1   rs2949421 0.00000  45413  0  0  2
rs2691310      1   rs2691310 0.00000  46844  A  T  3
rs4030303      1   rs4030303 0.00000  72434  0  G  4
>>> print(fam.head())
               fid       iid    father    mother gender    trait  i
sample
Sample_1  Sample_1  Sample_1         0         0      1 -9.00000  0
Sample_2  Sample_2  Sample_2         0         0      2 -9.00000  1
Sample_3  Sample_3  Sample_3  Sample_1  Sample_2      2 -9.00000  2
>>> print(bed.compute())
[[ 2.  2.  1.]
 [ 2.  1.  2.]
 [nan nan nan]
 [nan nan  1.]
 [ 2.  2.  2.]
 [ 2.  2.  2.]
 [ 2.  1.  0.]
 [ 2.  2.  2.]
 [ 1.  2.  2.]
 [ 2.  1.  2.]]
Notice the i column in the bim and fam data frames. It maps each entry to its position in the bed matrix: bim.i gives the row (variant) index and fam.i gives the column (sample) index:
>>> from os.path import join
>>> from limix.io import plink
>>> from pandas_plink import get_data_folder
>>>
>>> (bim, fam, bed) = plink.read(join(get_data_folder(), "data"), verbose=False)
>>> chrom1 = bim.query("chrom=='1'")
>>> X = bed[chrom1.i.values, :].compute()
>>> print(X)
[[ 2.  2.  1.]
 [ 2.  1.  2.]
 [nan nan nan]
 [nan nan  1.]
 [ 2.  2.  2.]
 [ 2.  2.  2.]
 [ 2.  1.  0.]
 [ 2.  2.  2.]
 [ 1.  2.  2.]
 [ 2.  1.  2.]]
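The same i-based lookup can be applied to samples as well as variants. The sketch below does not call limix or read any PLINK files. It builds small synthetic stand-ins for bim, fam and bed (all values are hypothetical, and gender is assumed to be stored as a string) and uses a plain NumPy array in place of the lazily evaluated genotype matrix, so no compute() call is needed.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the three objects returned by plink.read.
# The values are made up for illustration; only the layout mirrors the example above.
bim = pd.DataFrame(
    {
        "chrom": ["1", "1", "2", "2"],
        "snp": ["rs1", "rs2", "rs3", "rs4"],
        "i": [0, 1, 2, 3],  # row position of each variant in bed
    }
)
fam = pd.DataFrame(
    {
        "iid": ["Sample_1", "Sample_2", "Sample_3"],
        "gender": ["1", "2", "2"],  # assumed to be string-coded
        "i": [0, 1, 2],  # column position of each sample in bed
    }
)
# Genotype matrix: one row per variant, one column per sample.
bed = np.array(
    [
        [2.0, 2.0, 1.0],
        [2.0, 1.0, 2.0],
        [np.nan, np.nan, np.nan],
        [2.0, 1.0, 0.0],
    ]
)

# Look up the matrix positions of the variants and samples of interest.
variant_rows = bim.loc[bim["chrom"] == "2", "i"].to_numpy()
sample_cols = fam.loc[fam["gender"] == "2", "i"].to_numpy()

# np.ix_ builds an open mesh so that rows and columns are selected jointly.
X = bed[np.ix_(variant_rows, sample_cols)]
print(X)
```

With a lazily evaluated genotype matrix, joint fancy indexing on both axes may not be supported, so it can be safer to index one axis at a time, as in bed[variant_rows, :][:, sample_cols].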