snipar.scripts.gwas module

Infers direct effects, non-transmitted coefficients (NTCs), and population effects of genome-wide SNPs on a phenotype.

At minimum, the script requires a phenotype file together with observed genotypes on phenotyped individuals and their parents, and/or parental genotypes imputed by snipar’s impute.py script.

Args:
‘-h’, ‘--help’

show this help message and exit

: str

Location of the phenotype file

‘--bgen’ : str

Address of the phased genotypes in .bgen format. If there is a @ in the address, @ is replaced by each chromosome number in chr_range (chr_range is an optional parameter for this script).

‘--bed’ : str

Address of the unphased genotypes in .bed format. If there is a @ in the address, @ is replaced by each chromosome number in chr_range (chr_range is an optional parameter for this script).

‘--imp’ : str

Address of the HDF5 files with imputed parental genotypes (without the .hdf5 suffix). If there is a @ in the address, @ is replaced by each chromosome number in chr_range (chr_range is an optional parameter for this script).

‘--chr_range’

Numbers of the chromosomes to be analysed. Should be a series of ranges in x-y format or single integers.
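The following is a minimal sketch of how a chr_range specification and an address containing @ could be expanded. It is not snipar's own parsing code: the function names are hypothetical, and it assumes that an x-y range is inclusive at both ends and that the specification arrives as a list of string tokens.

```python
def parse_chr_range(tokens):
    """Expand tokens such as ["1-3", "7"] into [1, 2, 3, 7].

    Assumption: x-y ranges include both endpoints.
    """
    chromosomes = []
    for token in tokens:
        if "-" in token:
            start, end = token.split("-")
            chromosomes.extend(range(int(start), int(end) + 1))
        else:
            chromosomes.append(int(token))
    return chromosomes


def expand_address(address, chromosomes):
    """Replace '@' in an address with each chromosome number in turn."""
    if "@" not in address:
        # No placeholder: the address refers to a single file.
        return [address]
    return [address.replace("@", str(c)) for c in chromosomes]


chromosomes = parse_chr_range(["1-3", "7"])
paths = expand_address("genotypes/chr_@", chromosomes)  # hypothetical path
```

Here `paths` holds one address per chromosome, for example `genotypes/chr_1` for chromosome 1.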

‘--out’ : str, default=./

The summary statistics are written to this path, one file per chromosome. If the path contains ‘@’, the ‘@’ is replaced with the chromosome number. Otherwise, the summary statistics are written to the given path with file names chr_1.sumstats.gz, chr_2.sumstats.gz, etc. for the text sumstats, and chr_1.sumstats.hdf5, etc. for the HDF5 sumstats.
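The naming rule above can be sketched as follows. This is an illustration only: the helper name is hypothetical, and it assumes that the .sumstats.gz and .sumstats.hdf5 suffixes are appended in both cases, which the description does not state explicitly for paths containing ‘@’.

```python
import os


def sumstats_paths(out, chrom):
    """Return (text_path, hdf5_path) for one chromosome.

    Assumption: the suffixes are appended whether or not `out` contains '@'.
    """
    if "@" in out:
        base = out.replace("@", str(chrom))
    else:
        # No placeholder: treat `out` as a directory and use chr_<n> names.
        base = os.path.join(out, f"chr_{chrom}")
    return base + ".sumstats.gz", base + ".sumstats.hdf5"


default_text, default_hdf5 = sumstats_paths("./", 1)
custom_text, custom_hdf5 = sumstats_paths("results/trait_chr@", 2)  # hypothetical path
```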

‘--pedigree’ : str

Address of pedigree file. Must be provided if not providing imputed parental genotypes.

‘--parsum’

Regress onto proband and sum of (imputed/observed) maternal and paternal genotypes. Default uses separate paternal and maternal genotypes when available.
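To make the difference between the two regression designs concrete, here is a small sketch of the regressors for one individual. It only shows which genotype columns enter the regression. It is not snipar's estimation code, and the function name is hypothetical.

```python
def design_row(proband, paternal, maternal, parsum=False):
    """Build the genotype regressors for one individual.

    Genotypes are allele counts; imputed parental genotypes may be fractional.
    """
    if parsum:
        # One parental column: the sum of paternal and maternal genotypes.
        return [proband, paternal + maternal]
    # Default: separate paternal and maternal columns.
    return [proband, paternal, maternal]


separate = design_row(1, 0, 2)
summed = design_row(1, 0, 2, parsum=True)
```

With the summed design, one coefficient is estimated for the combined parental genotype instead of one for each parent.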

‘--fit_sib’

Fit indirect effect from sibling

‘--covar’ : str

Path to file with covariates: plain text file with columns FID, IID, covar1, covar2, ...
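As an illustration of the covariate file layout, the sketch below parses a small in-memory example. The covariate names (age, sex), the IDs, the header row, and the whitespace delimiter are all assumptions made for the example, and `read_covariates` is a hypothetical helper rather than part of snipar.

```python
import io

# Hypothetical covariate file: FID, IID, then one column per covariate.
example_covar = io.StringIO(
    "FID IID age sex\n"
    "fam1 ind1 34 0\n"
    "fam1 ind2 36 1\n"
)


def read_covariates(handle):
    """Return (covariate_names, {IID: [values]}) from a whitespace-delimited file."""
    header = handle.readline().split()
    names = header[2:]  # everything after FID and IID
    values = {}
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        values[fields[1]] = [float(x) for x in fields[2:]]
    return names, values


covar_names, covar_values = read_covariates(example_covar)
```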

‘--phen_index’ : int, default=1

If the phenotype file contains multiple phenotypes, the index of the phenotype to analyse (default 1, the first phenotype)

‘--min_maf’ : float, default=0.01

Ignore SNPs with minor allele frequency below min_maf (default 0.01)

‘--threads’ : int

Number of threads to use for IBD inference. Uses all available by default.

‘--max_missing’ : float, default=5

Ignore SNPs whose percentage of missing calls exceeds max_missing (default 5)
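The two SNP filters, min_maf and max_missing, can be sketched together as below. This is a simplified illustration rather than snipar's implementation: the function name is hypothetical, genotypes are assumed to be 0/1/2 allele counts with None marking a missing call, and the allele frequency is assumed to be computed from non-missing calls only.

```python
def passes_filters(genotypes, min_maf=0.01, max_missing=5.0):
    """Return True if a SNP survives the missingness and MAF filters."""
    n = len(genotypes)
    observed = [g for g in genotypes if g is not None]
    if n == 0 or not observed:
        return False
    percent_missing = 100.0 * (n - len(observed)) / n
    if percent_missing > max_missing:
        return False
    # Allele frequency from non-missing calls (two alleles per individual).
    freq = sum(observed) / (2.0 * len(observed))
    maf = min(freq, 1.0 - freq)
    return maf >= min_maf


common_snp = passes_filters([0, 1, 2, 1] * 25)
rare_snp = passes_filters([0] * 99 + [1])
sparse_snp = passes_filters([1] * 90 + [None] * 10)
```

In this example the rare SNP has a minor allele frequency of 0.005 and the sparse SNP has 10 percent missing calls, so both are dropped under the default thresholds.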

‘--batch_size’ : int, default=100000

Number of SNPs to load at a time (lower this value to reduce memory requirements)
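Batched loading can be pictured as splitting the SNP index range into consecutive chunks, so that only one chunk of genotypes is held in memory at a time. The generator below is a hypothetical sketch of that idea, not snipar's own loop.

```python
def batch_bounds(n_snps, batch_size=100000):
    """Yield (start, end) index pairs covering n_snps in chunks of batch_size."""
    for start in range(0, n_snps, batch_size):
        yield start, min(start + batch_size, n_snps)


# A chromosome with 250,000 SNPs is processed in three batches by default.
bounds = list(batch_bounds(250000))
```

A smaller batch_size gives more, smaller batches, which lowers peak memory use at the cost of more read operations.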

‘--no_hdf5_out’

Suppress HDF5 output of summary statistics

‘--no_txt_out’

Suppress text output of summary statistics

‘--missing_char’ : str, default=NA

Missing value string in phenotype file (default NA)
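The sketch below shows how phen_index and missing_char might interact when a phenotype file is read. The file layout is an assumption: a header row followed by whitespace-delimited FID, IID, and phenotype columns, mirroring the covariate file format. The phenotype names and the helper `read_phenotype` are hypothetical.

```python
import io
import math

# Hypothetical phenotype file with two phenotypes and some missing values.
example_phen = io.StringIO(
    "FID IID height bmi\n"
    "fam1 ind1 170.5 NA\n"
    "fam1 ind2 NA 22.1\n"
)


def read_phenotype(handle, phen_index=1, missing_char="NA"):
    """Return {IID: value} for the chosen phenotype, with NaN for missing values."""
    handle.readline()  # assumption: the first line is a header
    column = 1 + phen_index  # FID is column 0, IID is column 1
    values = {}
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        raw = fields[column]
        values[fields[1]] = math.nan if raw == missing_char else float(raw)
    return values


bmi = read_phenotype(example_phen, phen_index=2)
```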

‘--tau_init’ : float, default=1

Initial value for ratio between shared family environmental variance and residual variance

Results:
sumstats.gz

For each chromosome, a gzipped text file containing the SNP level summary statistics.
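A gzipped text file of this kind can be read with the Python standard library, as in the sketch below. The column names and values in the demonstration file are placeholders and not snipar's actual output columns. The whitespace delimiter is likewise an assumption, and `read_sumstats` is a hypothetical helper.

```python
import gzip
import os
import tempfile


def read_sumstats(path):
    """Read a gzipped, whitespace-delimited text file into a list of dicts."""
    with gzip.open(path, "rt") as handle:
        header = handle.readline().split()
        return [dict(zip(header, line.split())) for line in handle if line.strip()]


# Write a tiny placeholder file so the example is self-contained.
demo_path = os.path.join(tempfile.mkdtemp(), "chr_1.sumstats.gz")
with gzip.open(demo_path, "wt") as handle:
    handle.write("SNP example_stat\nrs1 0.12\nrs2 -0.03\n")

rows = read_sumstats(demo_path)
```

Values are returned as strings here; converting them to numbers depends on the actual columns in the file.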

snipar.scripts.gwas.main(args)

Calling this function with args is equivalent to running this script from the command line with the same arguments.

Args:

args: list

List of the desired options and arguments. The possible values are those that can be passed to this script on the command line.
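As a rough sketch, an argument list for main could be assembled as below. The helper `build_gwas_args` and all file paths are hypothetical. The sketch assumes that the phenotype file is passed as the first positional argument and that main accepts a plain list, as described above. The call itself is left commented out because it requires snipar and real input files.

```python
def build_gwas_args(phenofile, bed=None, imp=None, pedigree=None,
                    chr_range=None, out=None):
    """Assemble a command-line style argument list for the gwas script."""
    args = [phenofile]
    options = (
        ("--bed", bed),
        ("--imp", imp),
        ("--pedigree", pedigree),
        ("--chr_range", chr_range),
        ("--out", out),
    )
    for flag, value in options:
        if value is not None:
            args.extend([flag, str(value)])
    return args


gwas_args = build_gwas_args(
    "phenotype.txt",          # hypothetical phenotype file
    bed="genotypes/chr_@",    # hypothetical genotype address
    imp="imputed/chr_@",      # hypothetical imputed parental genotypes
    chr_range="1-22",
    out="results/chr_@",
)

# from snipar.scripts import gwas
# gwas.main(gwas_args)  # assumption: main accepts a list, as documented above
```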