snipar.scripts.pgs module
Infers direct effects, non-transmitted coefficients (NTCs), and population effects of a PGS on a phenotype.
At minimum, the script requires observed genotypes on individuals and their parents, and/or parental genotypes imputed by snipar’s impute.py script, along with a SNP weights file.
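As a rough illustration of the minimal inputs described above, the sketch below assembles a command line from observed genotypes, imputed parental genotypes, and a weights file. It assumes the script is invoked as `pgs.py` and that the output prefix is its positional argument. All file paths and the prefix `example_pgs` are hypothetical placeholders.

```python
import shlex

# Assumptions (not confirmed by this page): the script is invoked as `pgs.py`
# and the output prefix is passed as the positional argument.
# Every path below is a hypothetical placeholder.
command = [
    "pgs.py",
    "example_pgs",                 # output prefix (hypothetical)
    "--bed", "genotypes/chr_@",    # observed genotypes, one file per chromosome
    "--imp", "imputed/chr_@",      # imputed parental genotypes (no .hdf5 suffix)
    "--weights", "weights.txt",    # SNP weights file
]

# Print a shell-safe version of the command for inspection.
print(shlex.join(command))
```

Building the command as a list keeps each argument separate, so it can be passed directly to `subprocess.run(command)` without shell quoting issues.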
- Args:
- ‘-h’, ‘--help’
show this help message and exit
- : str
Prefix for computed PGS file and/or regression results files
- ‘--bgen’ : str
Address of the phased genotypes in .bgen format. If there is a @ in the address, @ is replaced by each chromosome number in chr_range (chr_range is an optional parameter for this script).
- ‘--bed’ : str
Address of the unphased genotypes in .bed format. If there is a @ in the address, @ is replaced by each chromosome number in chr_range (chr_range is an optional parameter for this script).
- ‘--imp’ : str
Address of hdf5 files with imputed parental genotypes (without .hdf5 suffix). If there is a @ in the address, @ is replaced by each chromosome number in chr_range (chr_range is an optional parameter for this script).
- ‘--chr_range’
Chromosome numbers to process. Should be a series of ranges in x-y format or integers.
- ‘--pedigree’ : str
Address of pedigree file. Must be provided if not providing imputed parental genotypes.
- ‘--weights’ : str
Location of the PGS allele weights
- ‘--SNP’ : str, default=SNP
Name of column in weights file with SNP IDs
- ‘--beta_col’ : str, default=b
Name of column with betas/weights for each SNP
- ‘--A1’ : str, default=A1
Name of column with the allele that the betas/weights are given with respect to
- ‘--A2’ : str, default=A2
Name of column with alternative allele
- ‘--sep’ : str
Column separator in weights file. If not provided, an attempt will be made to determine it automatically.
- ‘--phenofile’ : str
Location of the phenotype file
- ‘--pgs’ : str
Location of the pre-computed PGS file
- ‘--covar’ : str
Path to file with covariates: plain text file with columns FID, IID, covar1, covar2, ...
- ‘--fit_sib’
Fit indirect effects from siblings
- ‘--parsum’
Use the sum of maternal and paternal PGS in the regression (useful when imputed from sibling data alone)
- ‘--grandpar’
Calculate imputed/observed grandparental PGS for individuals with both parents genotyped
- ‘--gparsum’
Use the sum of maternal grandparents and the sum of paternal grandparents in the regression (useful when no grandparents are genotyped)
- ‘--gen_models’, default=1-2
Which multi-generational models should be fit. The default fits 1- and 2-generation models. To specify a range, use, for example, 1-3, where 3 fits a model with parental and grandparental scores.
- ‘--h2f’ : str
Heritability estimate in the form h2f,h2f_SE (e.g. 0.5,0.01) from MZ-DZ comparison, RDR, or sibling realized relatedness. If provided when also fitting the 2-generation model, results are adjusted for assortative mating, assuming equilibrium.
- ‘--rk’ : str
Estimate of the correlation between parents’ PGIs in the form rk,rk_SE (e.g. 0.1,0.01). If provided with h2f, it is used to adjust estimates for assortative mating.
- ‘--bpg’
Restrict sample to those with both parents genotyped
- ‘--phen_index’ : int, default=1
If the phenotype file contains multiple phenotypes, which phenotype should be analysed (default 1, first)
- ‘--ibdrel_path’ : str
Path to KING IBD segment inference output (without .seg suffix).
- ‘--sparse_thresh’ : float, default=0.05
Threshold of GRM/IBD sparsity
- ‘--scale_phen’
Scale the phenotype to have variance 1
- ‘--scale_pgs’
Scale the PGS to have variance 1 among the phenotyped individuals
- ‘--compute_controls’
Compute PGS for control families (default False)
- ‘--missing_char’ : str, default=NA
Missing value string in phenotype file (default NA)
- ‘--no_am_adj’
Do not adjust imputed parental PGSs for assortative mating
- ‘--force_am_adj’
Force assortative mating adjustment even when estimated correlation is noisy/not significant
- ‘--threads’ : int, default=1
Number of threads to use
- ‘--batch_size’ : int, default=10000
Batch size for reading in SNPs (default 10000)
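The genotype and imputed-file addresses above use @ as a per-chromosome placeholder that is filled in from chr_range. The following is a minimal sketch of that substitution, not snipar’s own implementation. The helper names `parse_chr_range` and `expand_address` and the example path are hypothetical.

```python
def parse_chr_range(tokens):
    """Expand tokens such as ["1-3", "5"] into [1, 2, 3, 5].

    Hypothetical helper: each token is either an integer or an x-y range.
    """
    chroms = []
    for token in tokens:
        if "-" in token:
            start, end = token.split("-")
            chroms.extend(range(int(start), int(end) + 1))
        else:
            chroms.append(int(token))
    return chroms


def expand_address(address, chroms):
    """Replace '@' with each chromosome number.

    An address without '@' is returned unchanged as a single-item list.
    """
    if "@" not in address:
        return [address]
    return [address.replace("@", str(c)) for c in chroms]


# Hypothetical address; one path is produced per chromosome in the range.
paths = expand_address("genotypes/chr_@", parse_chr_range(["1-3", "5"]))
print(paths)
```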
- Results:
- PGS file
Output when inputting observed and imputed genotype files and a weights file. A file with PGS values for each individual and their parents, with suffix .pgs.txt. Also includes sibling PGS if --fit_sib is specified, and grandparental PGS if --grandpar is specified.
- PGS effect estimates
Output when inputting a phenotype file. A file with suffix effects.txt containing estimates of the PGS effects and their standard errors, and a file with suffix vcov.txt containing the sampling variance-covariance matrix of the effect estimates.
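The sampling variance-covariance matrix is what allows a standard error to be computed for a combination of effect estimates, not just for each estimate alone. The sketch below illustrates this with made-up numbers. It does not read effects.txt or vcov.txt, since their exact layout is not described here, and the effect names, values, and the helper `linear_combination` are hypothetical.

```python
import math

# Hypothetical values for illustration only; not output from the script.
names = ["direct", "paternal", "maternal"]
estimates = [0.20, 0.05, 0.07]
vcov = [
    [0.0004, -0.0001, -0.0001],
    [-0.0001, 0.0009, 0.0002],
    [-0.0001, 0.0002, 0.0009],
]

# Standard errors are the square roots of the diagonal of the matrix.
ses = [math.sqrt(vcov[i][i]) for i in range(len(names))]


def linear_combination(weights, estimates, vcov):
    """Return (estimate, SE) of sum_i w_i * beta_i, using Var = w' V w."""
    n = len(weights)
    est = sum(w * b for w, b in zip(weights, estimates))
    var = sum(weights[i] * vcov[i][j] * weights[j]
              for i in range(n) for j in range(n))
    return est, math.sqrt(var)


# Average of the two parental coefficients, with its standard error.
avg_ntc, avg_ntc_se = linear_combination([0.0, 0.5, 0.5], estimates, vcov)
print(dict(zip(names, ses)), avg_ntc, avg_ntc_se)
```

Ignoring the off-diagonal terms would give a different standard error whenever the estimates are correlated, which is why the full matrix is useful in addition to the per-effect standard errors.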