GVCF - Genomic Variant Call Format
GVCF stands for Genomic VCF. A GVCF is a kind of VCF, so the basic format specification is the same as for a regular VCF (see the spec documentation here), but a Genomic VCF contains extra information....
VCF - Variant Call Format
This document describes "regular" VCF files produced for GERMLINE short variant (SNP and indel) calls (e.g. by HaplotypeCaller in "normal" mode and by GenotypeGVCFs). For information on the special...
How should I cite GATK in my own publications?
To date we have published three papers on GATK, plus a preprint in bioRxiv (citation details below). You're welcome to choose which paper is most representative of what aspect of GATK you called on in...
Variant annotations
Variant annotations can be produced by HaplotypeCaller, Mutect2, VariantAnnotator and GenotypeGVCFs. The available annotations are listed under Annotations in the Tool Documentation. Note that some...
HaplotypeCaller in a nutshell
This document outlines the basic operation of the HaplotypeCaller run in its default mode on a single sample, and does not cover the additional processing and calculations done when it is run in "GVCF...
HaplotypeCaller Reference Confidence Model (GVCF mode)
This document describes the reference confidence model applied by HaplotypeCaller to generate a per-sample GVCF, invoked by -ERC GVCF or -ERC BP_RESOLUTION. As explained here, HaplotypeCaller works by...
Calculation of PL and GQ by HaplotypeCaller and GenotypeGVCFs
PL is a sample-level annotation calculated by HaplotypeCaller and GenotypeGVCFs, recorded in the sample-level columns of variant records in VCF files. This annotation represents the normalized...
Local re-assembly and haplotype determination (HaplotypeCaller & Mutect2)
This document details the procedure used by HaplotypeCaller to re-assemble read data and determine candidate haplotypes as a prelude to variant calling. For more context information on how this fits...
ActiveRegion determination (HaplotypeCaller & Mutect2)
This document details the procedure used by HaplotypeCaller to define ActiveRegions on which to operate as a prelude to variant calling. For more context information on how this fits into the overall...
Evaluating the evidence for haplotypes and variant alleles (HaplotypeCaller &...
This document details the procedure used by HaplotypeCaller to evaluate the evidence for variant alleles based on candidate haplotypes determined in the previous step for a given ActiveRegion. For more...
Assigning per-sample genotypes (HaplotypeCaller)
This document describes the procedure used by HaplotypeCaller to assign genotypes to individual samples based on the allele likelihoods calculated in the previous step. For more context information on...
Allele Depth (AD) is lower than expected
The problem: You're trying to evaluate the support for a particular call, but the numbers in the DP (total depth) and AD (allele depth) fields aren't making any sense. For example, the sum of all the...
Missing annotations in the output callset VCF
The problem: You specified -A <some annotation> in a command line invoking one of the annotation-capable tools (HaplotypeCaller, MuTect2, GenotypeGVCFs and VariantAnnotator), but that annotation...
Expected variant at a specific site was not called
This can happen when you expect a call to be made based on the output of other variant calling tools, or based on examination of the data in a genome browser like IGV. There are several possibilities,...
Best strategy to "fix" the Haplotype Caller - GenotypeGVCF "missing DP field"...
Hi, I've run into the (already reported http://gatkforums.broadinstitute.org/dsde/discussion/5598/missing-depth-dp-after-haplotypecaller ) bug of the missing DP format field in my callings. I've run...
Can HaplotypeCaller be used on drug treated samples?
Hello, I am working on RNASeq data consisting of liver samples from donors. It is a case-control study in which 12 samples are divided into Normal (control) and Rifampin Treated (case). I want to...
What are the differences between Mutect2 and HaplotypeCaller?
They share graph assembly and haplotype determination, but the similarities end there. Operationally, Mutect2 works similarly to HaplotypeCaller in that they share the active region-based processing,...
VariantFiltration | HaplotypeCaller - ignoring variants close (5bp) from...
Hi, I am currently working with data from the HaloPlex Target Enrichment System. HaloPlex uses restriction enzymes to digest the DNA, which produces non-random reads that often have false mutations in...
Differences between GATK 4.beta.5 vs 4.0.0.0 HaplotypeCaller results
Hi! I'd like to perform short germline variant calling on human DNA-seq samples (separate analysis of WES cohort and PCR-free WGS cohort, both paired end). The plan is to follow GATK best practices of...
A logical problem with SplitCommonSuffices and MergeCommonSuffices
@Sheila @valentin @depristo For example : A+x -> y (A+x,y is a point) B+x -> y after SplitCommonSuffices A -> x -> y (A,B,x,y is a point) B -> x -> y after MergeCommonSuffices A ->...