GATK 3.3 was released on October 23, 2014. Itemized changes are listed below. For more details, see the user-friendly version highlights.
Note: at the time of writing, the release process is underway. The new version is expected to be available for download within 6 to 12 hours. The version highlights mentioned above will be posted in the same timeframe.
Haplotype Caller
- Improved the accuracy of dangling head merging in the HC assembler (now enabled by default).
- Physical phasing information is output by default in new sample-level PID and PGT tags.
- Added the --sample_name argument. This is a shortcut for people who have multi-sample BAMs but would like to use -ERC GVCF mode with a particular one of those samples.
- Support added for generalized ploidy. The global ploidy is specified with the -ploidy argument.
- Fixed IndexOutOfBounds error associated with tail merging.
Variant Recalibrator
- New --ignore_all_filters option. If specified, the variant recalibrator will ignore all input filters and treat sites as unfiltered.
GenotypeGVCFs
- Support added for generalized ploidy. The global ploidy is specified with the -ploidy argument.
- Bug fix for the case when we assumed ADs were in the same order if the number of alleles matched.
- Changed the default GVCF GQ bands from 5, 20, 60 to 1 through 60 in steps of 1, 60 through 90 in steps of 10, and 99, in order to give finer resolution.
- Bug fix in the exact model when calling multi-allelic variants. QUAL field is now more accurate.
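The new default GQ bands can be written out explicitly. The sketch below is one reading of the description in the list above (1 through 60 in steps of 1, then 70, 80, 90, then 99). The function names are hypothetical and not part of GATK, and the lookup assumes each band value acts as a lower bound, which may not match GATK's internal convention.

```python
from bisect import bisect_right


def default_gq_bands():
    """Band values as described in the notes: 1..60 by 1, then 70, 80, 90, then 99."""
    fine = list(range(1, 61))          # 1, 2, ..., 60
    coarse = list(range(70, 91, 10))   # 70, 80, 90 (60 is already in the fine range)
    return fine + coarse + [99]


def band_lower_bound(gq, bands):
    """Return the largest band value <= gq, or 0 if gq is below the first band.

    Assumption: band values are treated as inclusive lower bounds.
    """
    idx = bisect_right(bands, gq) - 1
    return bands[idx] if idx >= 0 else 0


if __name__ == "__main__":
    bands = default_gq_bands()
    print(len(bands), bands[:3], bands[-4:])
    print(band_lower_bound(65, bands))
```

Under this reading, genotype qualities below 60 each get their own band, while values from 60 upward are grouped more coarsely.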
RNAseq analysis
- Bug fixes for working with unmapped reads.
CalculateGenotypePosteriors
- New annotation for low- and high-confidence possible de novos (only annotates biallelics).
- FamilyLikelihoodsUtils now adds joint likelihood and joint posterior annotations.
- Restricted population priors based on discovered allele count to be valid for 10 or more samples.
DepthOfCoverage
- Fixed rare bug triggered by hash collision between sample names.
SelectVariants
- Updated the --keepOriginalAC functionality in SelectVariants to work for sites that lose alleles in the selection.
PrintReads
- Read groups that are excluded by sample_name, platform, or read_group arguments no longer appear in the header.
- The performance penalty associated with filtering by read group has been essentially eliminated.
Annotations
- StrandOddsRatio is now a standard annotation that is output by default.
- We used to output zero for FS if no data was available at a site; now we omit FS in that case.
- Extensive rewrite of the annotation documentation.
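Because FS is now omitted instead of being reported as zero, downstream scripts that read the INFO column may need to distinguish "no FS" from "FS equals 0". The following is a minimal sketch of one way to do that; the helper names are hypothetical and the INFO strings are made-up examples, not GATK output.

```python
def parse_info(info):
    """Parse a VCF INFO column into a dict; flag-style keys map to True."""
    fields = {}
    if info in ("", "."):
        return fields
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def fisher_strand(info):
    """Return FS as a float, or None when the annotation is absent."""
    value = parse_info(info).get("FS")
    return None if value is None else float(value)


if __name__ == "__main__":
    # Illustrative INFO strings only.
    for info in ("DP=10;FS=2.5", "DP=10;MQ=60", "."):
        fs = fisher_strand(info)
        print(info, "->", "no FS annotation" if fs is None else fs)
```

Returning None for a missing annotation keeps a true FS of 0.0 distinct from a site where no strand data was available.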
Queue
- Fixed Queue bug with bad localhost addresses.
- Fixed issue related to spaces in job names that were fine in GridEngine 6 but break in (Son of) GE8.
- Improved scatter contigs algorithm to be fairer when splitting many contigs into few parts (contributed by @smowton)
Documentation
- We now generate PHP files instead of HTML.
- We now output a JSON version of the tool documentation that can be used to generate wrappers for GATK commands.
Miscellaneous
- Output arguments --no_cmdline_in_header, --sites_only, and --bcf for VCF files, and --bam_compression, --simplifyBAM, --disable_bam_indexing, and --generate_md5 for BAM files moved to the engine level.
- htsjdk updated to version 1.120.1620