Gencove’s proprietary pipeline for genotype imputation from low-pass whole genome sequencing combines open-source and custom software to deliver accurate variant calls across the genome.
Standard imputation methods impute from a smaller set of genotype calls to a larger variant callset. In contrast, we impute directly from the sequencing reads against large haplotype reference panels, which makes use of all available data and accurately models the uncertainty inherent in sequencing reads.
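To illustrate what modeling read uncertainty can mean at a single biallelic site, the following is a minimal sketch of genotype likelihoods computed from read counts. It is an illustration under stated assumptions, not a description of the pipeline itself: the function name `genotype_likelihoods`, the per-read `error_rate` of 0.01, and the simple binomial read model are all hypothetical choices made for this example.

```python
def genotype_likelihoods(n_ref, n_alt, error_rate=0.01):
    """Normalized likelihoods of genotypes 0/0, 0/1 and 1/1 given read counts.

    Assumes independent reads and a symmetric per-read error rate
    (an illustrative model, not the pipeline's actual error model).
    """
    likelihoods = []
    for alt_copies in (0, 1, 2):
        # Probability that one read shows the alternate allele under this genotype
        alt_fraction = alt_copies / 2
        p_alt = alt_fraction * (1 - error_rate) + (1 - alt_fraction) * error_rate
        # The binomial coefficient is the same for all genotypes, so it is omitted
        likelihoods.append((p_alt ** n_alt) * ((1 - p_alt) ** n_ref))
    total = sum(likelihoods)
    return [value / total for value in likelihoods]


# A single alternate read favors 1/1 but leaves 0/1 quite plausible
single_read = genotype_likelihoods(n_ref=0, n_alt=1)

# With no reads at a site, all three genotypes are equally likely
no_reads = genotype_likelihoods(n_ref=0, n_alt=0)
```

At low coverage most sites look like one of these two cases, which is why a hard genotype call from the reads alone would discard information that a likelihood retains.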
Briefly, we use an adapted version of the software introduced by Rubinacci et al. (2021), which implements a fast and accurate version of the Li and Stephens (2003) imputation model. Given low-pass sequence data for a target individual and a large haplotype reference panel representing the genetic diversity of the population or species to which the target individual belongs, genotypes are imputed into the target individual genome-wide at all sites present in the reference panel. Each imputed genotype is accompanied by its posterior probability, which allows for more precise downstream analyses.
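The idea behind a Li and Stephens style model can be sketched with a toy haploid example: the target is treated as a mosaic copied from reference haplotypes, and a forward-backward pass over a hidden Markov model gives a posterior probability for the allele at every panel site, including sites with no reads. This is a simplified illustration only. It is not the software referenced above and not Gencove's implementation. The function name `impute_haploid`, the constant `switch_prob` and `mismatch` values, and the haploid, biallelic setup are assumptions made for this sketch, whereas real tools work with diploid data, genetic maps, and far larger panels.

```python
def impute_haploid(panel, allele_liks, switch_prob=0.05, mismatch=0.001):
    """Posterior probability of the alternate allele at each panel site.

    panel: list of K reference haplotypes, each a list of M alleles (0 or 1).
    allele_liks: list of M (lik_ref, lik_alt) pairs derived from reads;
                 use (1.0, 1.0) at sites with no reads.
    switch_prob and mismatch are illustrative constants (assumptions).
    """
    K, M = len(panel), len(panel[0])

    def emission(k, m):
        # Likelihood of the reads at site m if the target copies haplotype k
        copied = panel[k][m]
        return ((1 - mismatch) * allele_liks[m][copied]
                + mismatch * allele_liks[m][1 - copied])

    def normalize(values):
        total = sum(values)
        return [v / total for v in values]

    # Forward pass: each step either stays on the same haplotype or jumps
    # to a uniformly chosen one with probability switch_prob
    fwd = [normalize([emission(k, 0) / K for k in range(K)])]
    for m in range(1, M):
        prev = fwd[-1]
        fwd.append(normalize([
            ((1 - switch_prob) * prev[k] + switch_prob / K) * emission(k, m)
            for k in range(K)
        ]))

    # Backward pass, rescaled at each site for numerical stability
    bwd = [[1.0] * K for _ in range(M)]
    for m in range(M - 2, -1, -1):
        nxt = [bwd[m + 1][k] * emission(k, m + 1) for k in range(K)]
        jump = switch_prob * sum(nxt) / K
        bwd[m] = normalize([(1 - switch_prob) * nxt[k] + jump for k in range(K)])

    posteriors = []
    for m in range(M):
        state = normalize([fwd[m][k] * bwd[m][k] for k in range(K)])
        p_alt = 0.0
        for k in range(K):
            # Probability the target carries the alternate allele given it copies k
            prior_alt = (1 - mismatch) if panel[k][m] == 1 else mismatch
            p_alt += state[k] * prior_alt * allele_liks[m][1] / emission(k, m)
        posteriors.append(p_alt)
    return posteriors


# Two reference haplotypes; reads support the alternate allele at sites 0 and 2,
# and site 1 has no reads at all
panel = [[0, 0, 0], [1, 1, 1]]
liks = [(0.01, 0.99), (1.0, 1.0), (0.01, 0.99)]
posterior_alt = impute_haploid(panel, liks)
```

In this toy input the uncovered middle site receives a high posterior probability for the alternate allele, because the flanking reads make the second reference haplotype the likely copying source. The same mechanism, applied genome-wide with a large panel, is what lets sparse reads inform genotypes at every panel site.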