This article describes the pipeline configurations available for human samples. The configurations include:
Human low-pass GRCh38
Version 3
- Human low-pass GRCh38 v3.3:
- Same configuration as Human low-pass GRCh38 v3.2 but with sex detection improvements.
- Human low-pass GRCh38 v3.2:
- Same configuration as Human low-pass GRCh38 v3.1 but with a new sex detection QC metric.
- Human low-pass GRCh38 v3.1:
- Same configuration as Human low-pass GRCh38 v3.0 but with performance improvements, and an updated reference genome, which can be found here.
- Human low-pass GRCh38 v3.0:
- Same configuration as Human low-pass GRCh38 v2.3 but with upgrades to the underlying imputation algorithm.
Version 2
- Human low-pass GRCh38 v2.3:
- Same configuration as Human low-pass GRCh38 v2.2 but with a reference genome excluding ALT contigs.
- Human low-pass GRCh38 v2.2:
- Same configuration as Human low-pass GRCh38 v2.1 but with additional imputation QC metrics calculated.
- Human low-pass GRCh38 v2.1:
- Same configuration as Human low-pass GRCh38 v2.0 but annotated with variant IDs deriving from dbSNP build 151.
- Human low-pass GRCh38 v2.0:
- Same reference genome and deliverables as Human low-pass GRCh38 v1.1 but with a new imputation reference panel: the phased release of genotype calls from the New York Genome Center's resequencing of individuals from the 1000 Genomes Project. The panel comprises 3202 individuals, including the original 2504 from Phase 3 and an additional 698 relatives. See the preprint here for more details.
Version 1
- Human low-pass GRCh38 v1.0:
- Same configuration as the Human low-pass GRCh38 (beta) except for the imputation reference panel.
- Imputation reference panel: Lifted-over panel from the 1000 Genomes Phase 3 GRCh37 release.
- Human low-pass GRCh38 v1.1:
- Same configuration as Human low-pass GRCh38 v1.0 but with bug fixes and performance improvements.
- Human low-pass GRCh38 (beta):
- Reference genome: GRCh38 with alternative sequences, plus decoys and HLA, available here.
- Imputation reference panel: Variant calls from 1000 Genomes Phase 3 samples resequenced at high depth by the New York Genome Center (processing pipeline described here), after removing singletons (variants with a minor allele count of 1 in the sample), for a total of ~62M variants.
- Ancestry reference panel: We provide an ancestry analysis based on 26 reference populations described here.
- Deliverables: original FASTQ, aligned BAM (and index), imputed VCF (and index), ancestry analysis.
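The singleton removal mentioned for the imputation reference panel above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory panel (a dictionary from site ID to diploid genotypes at biallelic sites); the names `minor_allele_count` and `drop_singletons` are ours for illustration and do not describe the production pipeline.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: each genotype is a pair of allele indices
# (0 = reference allele, 1 = alternate allele) at a biallelic site.
Genotype = Tuple[int, int]


def minor_allele_count(genotypes: List[Genotype]) -> int:
    """Return the number of copies of the less common allele at a site."""
    alt_copies = sum(a + b for a, b in genotypes)
    total_copies = 2 * len(genotypes)
    return min(alt_copies, total_copies - alt_copies)


def drop_singletons(panel: Dict[str, List[Genotype]]) -> Dict[str, List[Genotype]]:
    """Keep every site except those whose minor allele count is exactly 1."""
    return {
        site: genotypes
        for site, genotypes in panel.items()
        if minor_allele_count(genotypes) != 1
    }


# Toy panel with three samples and two sites (made-up data).
panel = {
    "chr1:1000": [(0, 1), (0, 0), (0, 0)],  # one alternate copy: a singleton
    "chr1:2000": [(0, 1), (1, 1), (0, 0)],  # three alternate copies
}
filtered = drop_singletons(panel)
```

In this toy example only `chr1:2000` remains in `filtered`. A real panel would normally be filtered from VCF allele counts rather than from Python dictionaries.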
Human low-pass GRCh37
Version 3
- Human low-pass GRCh37 v3.1
- Same configuration as Human low-pass GRCh37 v3.0, but with performance improvements.
- Human low-pass GRCh37 v3.0
- Same configuration as Human low-pass GRCh37 v2.6, but with upgrades to the underlying imputation algorithm.
Version 2
- Human low-pass GRCh37 v2.6
- Same configuration as Human low-pass GRCh37 v2.5, but with optimized CNV calling parameters.
- Human low-pass GRCh37 v2.5
- Same configuration as Human low-pass GRCh37 v2.4, but with additional imputation QC metrics calculated.
- Human low-pass GRCh37 v2.4
- Same configuration as Human low-pass GRCh37 v2.3, but the CNV calling part of the pipeline now uses a panel of normals comprising 59 male normal samples. Previously, the CNV calling step did not normalize against any normal human samples.
- Human low-pass GRCh37 v2.3
- Same configuration as Human low-pass GRCh37 v2.2, but imputation now takes into account varying recombination rates across the genome. In particular, a recombination map derived from the HapMap II project is used to interpolate recombination rates across all sites in the haplotype reference panel. This results in increased imputation accuracy compared to Human low-pass GRCh37 v2.2.
- Human low-pass GRCh37 v2.2
- Same configuration as Human low-pass v2.1, but with bug fixes and performance improvements.
- Human low-pass v2.1
- Same configuration as Human low-pass v2.0, but with duplicate sites removed (see the 1000 Genomes website for details).
- Human low-pass v2.0
- Reference genome: hs37-1kg
- Imputation reference panel: 1000 Genomes Phase 3. Relative to v1.0, this includes all sites, including normalized multiallelic sites and the X chromosome.
- Ancestry reference panel: We provide an ancestry analysis based on 26 reference populations described here
- Polygenic risk scores:
- Coronary artery disease: Inouye et al. 2018 (delivery key: cad)
- Breast cancer: Mavaddat et al. 2018 (delivery key: brca)
- Prostate cancer: Schumacher et al. 2018 (delivery key: prca)
- Deliverables: original FASTQ, aligned BAM (and index), imputed VCF (and index), ancestry analysis, polygenic scores
- Human low-pass v1.0
- Reference genome: hs37-1kg
- Imputation reference panel: 1000 Genomes Phase 3, excluding sites with a minor allele count below three, sites with more than two alleles, and sites on the sex chromosomes.
- Ancestry reference panel: We provide an ancestry analysis based on 26 reference populations described here
- Deliverables: original FASTQ, aligned BAM (and index), imputed VCF (and index), ancestry analysis, polygenic scores
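
The recombination-rate interpolation described for Human low-pass GRCh37 v2.3 above can be sketched as follows. This is only an illustration under stated assumptions: the genetic map is assumed to be given as cumulative genetic positions in centimorgans at known base-pair coordinates, interpolation is assumed to be linear, and sites outside the map are clamped to the nearest map endpoint. The source does not specify the interpolation method, and the function name `interpolate_genetic_positions` is hypothetical.

```python
from bisect import bisect_right
from typing import List, Sequence


def interpolate_genetic_positions(
    map_bp: Sequence[int],
    map_cm: Sequence[float],
    site_bp: Sequence[int],
) -> List[float]:
    """Linearly interpolate genetic positions (cM) at reference panel sites.

    map_bp must be strictly increasing and the same length as map_cm.
    Sites before the first or after the last map point are clamped
    to the corresponding endpoint value (an assumption of this sketch).
    """
    positions = []
    for pos in site_bp:
        if pos <= map_bp[0]:
            positions.append(map_cm[0])
        elif pos >= map_bp[-1]:
            positions.append(map_cm[-1])
        else:
            # Index of the first map point strictly to the right of pos,
            # so that map_bp[i - 1] <= pos < map_bp[i].
            i = bisect_right(map_bp, pos)
            x0, x1 = map_bp[i - 1], map_bp[i]
            y0, y1 = map_cm[i - 1], map_cm[i]
            positions.append(y0 + (y1 - y0) * (pos - x0) / (x1 - x0))
    return positions


# Toy genetic map and panel sites (made-up values).
map_bp = [100, 200, 400]
map_cm = [0.0, 1.0, 2.0]
site_cm = interpolate_genetic_positions(map_bp, map_cm, [50, 150, 300, 500])
```

With the toy map, the four sites receive genetic positions 0.0, 0.5, 1.5, and 2.0 cM. The genetic distance between adjacent sites is the difference between their interpolated positions, which is the quantity an imputation model would typically consume.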