snipar.scripts.gwas module
Infers direct effects, non-transmitted coefficients (NTCs), and population effects of genome-wide SNPs on a phenotype.
At minimum, the script requires a phenotype file together with observed genotypes on phenotyped individuals and their parents, and/or parental genotypes imputed by snipar’s impute.py script.
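As a rough illustration of how these minimal inputs map onto the options documented below, the following sketch assembles an argument list in Python. All file names are hypothetical, and the `gwas.py` entry point name is an assumption that may differ depending on how snipar is installed; only flags listed in this section are used.

```python
import shlex


def build_gwas_args(phenofile, bed, imp, chr_range, out, min_maf=0.01):
    """Assemble an argument list using only options documented in this section."""
    return [
        phenofile,                  # positional: location of the phenotype file
        "--bed", bed,               # observed genotypes; '@' stands for the chromosome
        "--imp", imp,               # imputed parental genotypes, without .hdf5 suffix
        "--chr_range", chr_range,   # e.g. "1-22"
        "--out", out,               # output path; '@' stands for the chromosome
        "--min_maf", str(min_maf),
    ]


# Hypothetical file locations, for illustration only.
args = build_gwas_args(
    "phenotype.txt", "genotypes/chr_@", "imputed/chr_@", "1-22", "results/chr_@"
)

# Assumed entry point name; adjust to match the local installation.
command = ["gwas.py"] + args
print(shlex.join(command))
```

The printed command line can be pasted into a shell, or the list can be passed to `subprocess.run` once the entry point has been confirmed.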
- Args:
- ‘-h’, ‘--help’
show this help message and exit
- : str
Location of the phenotype file
- ‘--bgen’ : str
Address of the phased genotypes in .bgen format. If the address contains @, the @ is replaced by each chromosome number in chr_range (chr_range is an optional parameter of this script).
- ‘--bed’ : str
Address of the unphased genotypes in .bed format. If the address contains @, the @ is replaced by each chromosome number in chr_range (chr_range is an optional parameter of this script).
- ‘--imp’ : str
Address of the HDF5 files with imputed parental genotypes (without the .hdf5 suffix). If the address contains @, the @ is replaced by each chromosome number in chr_range (chr_range is an optional parameter of this script).
- ‘--chr_range’
Chromosome numbers to analyse. Should be a series of integers or ranges in x-y format.
- ‘--out’ : str, default=./
The summary statistics are written to this path, one file per chromosome. If the path contains ‘@’, the ‘@’ is replaced with the chromosome number. Otherwise, the summary statistics are written to the given path with file names chr_1.sumstats.gz, chr_2.sumstats.gz, etc. for the text sumstats, and chr_1.sumstats.hdf5, etc. for the HDF5 sumstats.
- ‘--pedigree’ : str
Address of pedigree file. Must be provided if not providing imputed parental genotypes.
- ‘--parsum’
Regress onto proband and sum of (imputed/observed) maternal and paternal genotypes. Default uses separate paternal and maternal genotypes when available.
- ‘--fit_sib’
Fit indirect effect from sibling
- ‘--covar’ : str
Path to file with covariates: plain text file with columns FID, IID, covar1, covar2, ...
- ‘--phen_index’ : int, default=1
If the phenotype file contains multiple phenotypes, the index of the phenotype to analyse (default 1, the first)
- ‘--min_maf’ : float, default=0.01
Ignore SNPs with minor allele frequency below min_maf (default 0.01)
- ‘--threads’ : int
Number of threads to use for IBD inference. Uses all available by default.
- ‘--max_missing’ : float, default=5
Ignore SNPs whose percentage of missing calls exceeds max_missing (default 5)
- ‘--batch_size’ : int, default=100000
Number of SNPs to load at a time (lower this value to reduce memory requirements)
- ‘--no_hdf5_out’
Suppress HDF5 output of summary statistics
- ‘--no_txt_out’
Suppress text output of summary statistics
- ‘--missing_char’ : str, default=NA
Missing value string in phenotype file (default NA)
- ‘--tau_init’ : float, default=1
Initial value for ratio between shared family environmental variance and residual variance
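Several of the path arguments above rely on the same convention: an @ in the address stands for a chromosome number taken from chr_range. The sketch below illustrates that convention only; it is not snipar's own parser. The helper names are hypothetical, and treating x-y ranges as inclusive of both ends is an assumption.

```python
def parse_chr_range(tokens):
    """Expand tokens such as ["1-3", "7"] into [1, 2, 3, 7].

    Assumes x-y ranges include both endpoints.
    """
    chromosomes = []
    for token in tokens:
        if "-" in token:
            start, end = (int(part) for part in token.split("-"))
            chromosomes.extend(range(start, end + 1))
        else:
            chromosomes.append(int(token))
    return chromosomes


def expand_path(template, chromosomes):
    """Return one path per chromosome, replacing '@' with the chromosome number.

    A template without '@' is returned unchanged as a single path.
    """
    if "@" not in template:
        return [template]
    return [template.replace("@", str(chrom)) for chrom in chromosomes]


chromosomes = parse_chr_range(["1-3", "7"])
paths = expand_path("genotypes/chr_@", chromosomes)  # hypothetical address
print(chromosomes)
print(paths)
```

Under these assumptions, an address such as `genotypes/chr_@` with chr_range `1-3 7` refers to four per-chromosome files.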
- Results:
- sumstats.gz
For each chromosome, a gzipped text file containing the SNP-level summary statistics.
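A gzipped text file of this kind can be inspected with the Python standard library. The sketch below assumes the file is whitespace-delimited with a single header row, which this section does not state; the column names in the stand-in file are invented for illustration and are not the script's actual output columns.

```python
import gzip
import os
import tempfile


def read_sumstats(path):
    """Read a gzipped, whitespace-delimited table with a header row.

    Returns a list of dicts keyed by column name; values are kept as strings.
    """
    with gzip.open(path, "rt") as handle:
        header = handle.readline().split()
        return [dict(zip(header, line.split())) for line in handle if line.strip()]


# Write a tiny stand-in file so the example is self-contained.
# The columns "SNP" and "value" are placeholders, not real sumstats columns.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "chr_1.sumstats.gz")
with gzip.open(demo_path, "wt") as handle:
    handle.write("SNP value\nrs1 0.5\nrs2 NA\n")

rows = read_sumstats(demo_path)
print(len(rows), rows[0])
```

Values are left as strings so that missing entries can be handled explicitly before converting to numbers.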