snipar.scripts.ibd module
Infers identity-by-descent (IBD) segments shared between full-siblings.
At minimum, the script requires observed sibling genotypes in either .bed or .bgen format, along with information on the relations present in the dataset. The relations can be provided as a pedigree file, or as the results of KING kinship inference together with age and sex information (from which a pedigree can be constructed).
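The input requirement above can be sketched as a small check. This is an illustration only, not the script's own validation code: the helper name `check_minimal_inputs` is hypothetical, and only the flag names are taken from the argument list on this page.

```python
import argparse


def check_minimal_inputs(argv):
    """Return (genotype_source, relation_source) or raise ValueError.

    Hypothetical helper: it mirrors the requirement described in the text,
    not snipar's actual argument handling.
    """
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    for flag in ("--bed", "--bgen", "--pedigree", "--king", "--agesex"):
        parser.add_argument(flag, type=str, default=None)
    # Other options (e.g. --out) are ignored by this sketch.
    args, _ = parser.parse_known_args(argv)

    # Genotypes: either .bed or .bgen must be given.
    if args.bed is not None:
        genotypes = "bed"
    elif args.bgen is not None:
        genotypes = "bgen"
    else:
        raise ValueError("provide genotypes with --bed or --bgen")

    # Relations: a pedigree, or KING output plus age and sex information.
    if args.pedigree is not None:
        relations = "pedigree"
    elif args.king is not None and args.agesex is not None:
        relations = "king+agesex"
    else:
        raise ValueError("provide --pedigree, or both --king and --agesex")
    return genotypes, relations
```

For example, `check_minimal_inputs(["--bed", "chr_@", "--pedigree", "ped.txt"])` passes, while supplying `--king` without `--agesex` raises a `ValueError`.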
- Args:
- ‘-h’, ‘--help’
Show this help message and exit.
- ‘--bgen’ str
Address of the phased genotypes in .bgen format. If the address contains @, the @ is replaced by each chromosome number in chr_range (an optional parameter of this script).
- ‘--bed’ str
Address of the unphased genotypes in .bed format. If the address contains @, the @ is replaced by each chromosome number in chr_range (an optional parameter of this script).
- ‘--chr_range’
Chromosome numbers to process. Should be a series of integers or ranges in x-y format.
- ‘--king’ str
Address of the file containing the KING kinship inference results
- ‘--agesex’ str
Address of file with age and sex information
- ‘--pedigree’ str
Address of pedigree file
- ‘--map’ str
None
- ‘--out’ str, default=ibd
The IBD segments will be output to this path, one file for each chromosome. If the path contains ‘#’, the ‘#’ will be replaced with the chromosome number. Otherwise, the segments will be output to the given path with file names chr_1.ibd.segments.gz, chr_2.ibd.segments.gz, etc.
- ‘--p_error’ float
Probability of genotyping error. By default, this is estimated from genotyped parent-offspring pairs.
- ‘--min_length’ float, default=0.01
Smooth segments with length less than min_length (cM)
- ‘--threads’ int
Number of threads to use for IBD inference. Uses all available by default.
- ‘--min_maf’ float, default=0.01
Minimum minor allele frequency
- ‘--max_missing’ float, default=5
Ignore SNPs whose percentage of missing calls is greater than max_missing
- ‘--max_error’ float, default=0.01
Maximum per-SNP genotyping error probability
- ‘--ibdmatrix’
Output a matrix of SNP IBD states (in addition to segments file)
- ‘--ld_out’
Output LD scores of SNPs (used internally for weighting).
- ‘--chrom’ int
The chromosome of the input .bgen file. Helpful if inputting a single .bgen file without chromosome information.
- ‘--batches’ int, default=1
Number of batches into which the data are split (by sibling pair) for IBD inference. Useful for large datasets.
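The placeholder behaviour described for the genotype addresses and the output path can be sketched as follows. This is a minimal illustration, not the script's implementation: `parse_chr_range` and `expand_path` are hypothetical helpers, and treating an x-y range as inclusive of both ends is an assumption.

```python
def parse_chr_range(tokens):
    """Expand tokens such as ["1-3", "22"] into [1, 2, 3, 22].

    Assumption: a range "x-y" includes both x and y.
    """
    chroms = []
    for token in tokens:
        if "-" in token:
            start, end = (int(part) for part in token.split("-"))
            chroms.extend(range(start, end + 1))
        else:
            chroms.append(int(token))
    return chroms


def expand_path(template, chrom, placeholder):
    """Replace the placeholder character in a path with the chromosome number."""
    return template.replace(placeholder, str(chrom))


chroms = parse_chr_range(["1-3", "22"])
# '@' is the placeholder for input genotype addresses.
inputs = [expand_path("genotypes/chr_@", c, "@") for c in chroms]
# '#' is the placeholder for the output path.
outputs = [expand_path("ibd/chr_#", c, "#") for c in chroms]
```

The sketch covers only paths that contain a placeholder. The default file naming used when the output path has no ‘#’ is left to the script.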
- Results:
- IBD segments
For each chromosome, a gzipped text file containing the IBD segments for the siblings is output.
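A gzipped text file like this can be read with the Python standard library. The column layout is not described on this page, so the sketch below makes no claim about it: it assumes only that the file is tab-delimited with a header line, and the helper name `read_segments` is hypothetical.

```python
import csv
import gzip


def read_segments(path):
    """Read a gzipped, tab-delimited text file with a header line.

    Assumption: tab delimiter and a header row. Returns one dict per
    data row, keyed by whatever column names the header provides.
    """
    with gzip.open(path, "rt", newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

Calling `read_segments("chr_1.ibd.segments.gz")` would then return one dictionary per segment. If the file turns out to use a different delimiter, adjust the `delimiter` argument.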