snipar.read package

Submodules

Module contents

snipar.read.get_gts_matrix(ped=None, bedfile=None, bgenfile=None, par_gts_f=None, snp_ids=None, ids=None, parsum=False, sib=False, compute_controls=False, verbose=False, print_sample_info=False)

Reads observed and imputed genotypes and constructs a family-based genotype matrix for the individuals with observed/imputed parental genotypes and, if sib=True, at least one genotyped sibling.

Args:
par_gts_f : str

Path to the HDF5 file with imputed parental genotypes.

gts_f : str

Path to the bed file with observed genotypes.

snp_ids : numpy.ndarray

If provided, only obtains the subset of the specified SNPs that are present in both the imputed and the observed genotypes.

ids : numpy.ndarray

If provided, only obtains the individuals in ids that have observed genotypes and imputed/observed parental genotypes (and observed sibling genotypes if sib=True).

sib : bool

Retrieve genotypes for individuals with at least one genotyped sibling along with the average of their siblings’ genotypes and observed/imputed parental genotypes. Default False.

compute_controls : bool

Also return genotype arrays for control families (families with observed parental genotypes set to missing). Default False.

parsum : bool

Return the sum of maternal and paternal observed/imputed genotypes rather than separate maternal/paternal genotypes. Default False.
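As a rough illustration of what the snp_ids and parsum arguments describe, the NumPy-only sketch below works on toy data and does not call snipar. The helper names `shared_snps` and `parental_sum`, the SNP IDs, and the genotype values are hypothetical and not part of the snipar API; how snipar performs these steps internally may differ.

```python
import numpy as np


def shared_snps(requested, observed, imputed):
    """Hypothetical helper: keep requested SNP IDs found in both data sets."""
    # SNPs present in both the observed and the imputed genotype files
    in_both = np.intersect1d(observed, imputed)
    # Filter the request, preserving the order in which SNPs were requested
    return requested[np.isin(requested, in_both)]


def parental_sum(paternal, maternal):
    """Hypothetical helper: element-wise sum of paternal and maternal genotypes."""
    return paternal + maternal


# Toy SNP IDs (assumed values, for illustration only)
requested = np.array(["rs3", "rs1", "rs9"])
observed = np.array(["rs1", "rs2", "rs3"])
imputed = np.array(["rs3", "rs1", "rs4"])
kept = shared_snps(requested, observed, imputed)

# Toy parental genotypes: 2 individuals x 2 SNPs; imputed values may be fractional
paternal = np.array([[0.0, 1.0], [2.0, 1.5]])
maternal = np.array([[1.0, 1.0], [0.0, 0.5]])
par = parental_sum(paternal, maternal)
```

Here `kept` contains only the requested SNPs found in both toy sets, and `par` has the same shape as each parental array, which is why parsum=True replaces two parental columns with one.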

Returns:
G : snipar.gtarray

Genotype array for the subset of genotyped individuals with complete imputed/observed parental genotypes. The array has shape [N x k x L], where N is the number of individuals, k depends on sib and parsum, and L is the number of SNPs. If sib=False and parsum=False, then k=3, and this axis indexes the individual’s genotypes, the father’s imputed/observed genotypes, and the mother’s imputed/observed genotypes, in that order. If sib=True and parsum=False, then k=4, and this axis indexes the individual, sibling, paternal, and maternal genotypes, in that order. If parsum=True and sib=False, then k=2, and this axis indexes the individual’s genotypes and the sum of the paternal and maternal genotypes. The remaining combination follows the same pattern. If compute_controls=True, a list is returned instead: the first element is the array described above, and the following elements are the equivalent genotype arrays for control families in which the mother, the father, and both parents, respectively, have been set to missing.
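The axis layout described above can be sketched with plain NumPy arrays. This is only an illustration on randomly generated toy data: it builds ordinary ndarrays rather than a snipar.gtarray, and the names `G_trio`, `G_sib`, `G_parsum`, and `expected_k` are hypothetical. The value of k for sib=True with parsum=True is inferred from the pattern of the documented cases, not stated explicitly in this documentation.

```python
import numpy as np

rng = np.random.default_rng(0)
N, L = 5, 4  # toy sizes: 5 individuals, 4 SNPs

# Toy genotypes; observed genotypes are 0/1/2, imputed ones may be fractional
proband = rng.integers(0, 3, size=(N, L)).astype(float)
sibling = rng.uniform(0, 2, size=(N, L))
paternal = rng.uniform(0, 2, size=(N, L))
maternal = rng.uniform(0, 2, size=(N, L))

# sib=False, parsum=False: individual, father, mother
G_trio = np.stack([proband, paternal, maternal], axis=1)
# sib=True, parsum=False: individual, sibling, father, mother
G_sib = np.stack([proband, sibling, paternal, maternal], axis=1)
# sib=False, parsum=True: individual, sum of father and mother
G_parsum = np.stack([proband, paternal + maternal], axis=1)


def expected_k(sib, parsum):
    """Hypothetical helper: size of the middle axis for a given option pair."""
    # One column for the individual, one more if sib=True,
    # then one parental column if parsum=True, otherwise two
    return 1 + int(sib) + (1 if parsum else 2)


# Slice the paternal genotypes for every individual at every SNP
paternal_slice = G_trio[:, 1, :]
```

Indexing the middle axis, as in `G_trio[:, 1, :]`, recovers one family member's genotypes as an [N x L] matrix.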