I am running the latest GATK (4.0.11.0) on aligned whole-exome sequencing reads from TCGA to generate gVCF files. After the gVCF file has been generated, GATK crashes with a NullPointerException. I get this exception only when I generate a gVCF (--emit-ref-confidence GVCF), not when I generate a regular VCF from the exact same input. I also get it with different reference genomes and different input BAM files. The generated gVCF looks okay, but it is still strange that the software crashes. Do you have any suggestions?
Here is how I run GATK, followed by the relevant console output:
$ gatk HaplotypeCaller -R ../../hg38.canonical_chromosomes/hg38.canonical_chromosomes.fa -I C828.TCGA-EB-A3XB-10B-01D-A23B-08.1_gdc_realn.sorted.bam --emit-ref-confidence GVCF -O C828.TCGA-EB-A3XB-10B-01D-A23B-08.1_gdc_realn.sorted.bam.genomic.hg38_canonical_chromosomes.vcf.gz
Using GATK jar /home/pfiziev/software/gatk-4.0.11.0/gatk-package-4.0.11.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /home/pfiziev/software/gatk-4.0.11.0/gatk-package-4.0.11.0-local.jar HaplotypeCaller -R ../../hg38.canonical_chromosomes/hg38.canonical_chromosomes.fa -I C828.TCGA-EB-A3XB-10B-01D-A23B-08.1_gdc_realn.sorted.bam --emit-ref-confidence GVCF -O C828.TCGA-EB-A3XB-10B-01D-A23B-08.1_gdc_realn.sorted.bam.genomic.hg38_canonical_chromosomes.vcf.gz
11:17:56.245 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/pfiziev/software/gatk-4.0.11.0/gatk-package-4.0.11.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
11:17:57.952 INFO HaplotypeCaller - ------------------------------------------------------------
11:17:57.953 INFO HaplotypeCaller - The Genome Analysis Toolkit (GATK) v4.0.11.0
11:17:57.953 INFO HaplotypeCaller - For support and documentation go to
11:17:57.953 INFO HaplotypeCaller - Executing as pfiziev@node005 on Linux v3.10.0-693.11.6.el7.x86_64 amd64
11:17:57.953 INFO HaplotypeCaller - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_161-b14
11:17:57.953 INFO HaplotypeCaller - Start Date/Time: November 8, 2018 11:17:56 AM PST
11:17:57.953 INFO HaplotypeCaller - ------------------------------------------------------------
11:17:57.954 INFO HaplotypeCaller - ------------------------------------------------------------
11:17:57.954 INFO HaplotypeCaller - HTSJDK Version: 2.16.1
11:17:57.954 INFO HaplotypeCaller - Picard Version: 2.18.13
11:17:57.955 INFO HaplotypeCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 2
11:17:57.955 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
11:17:57.955 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
11:17:57.955 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
11:17:57.955 INFO HaplotypeCaller - Deflater: IntelDeflater
11:17:57.955 INFO HaplotypeCaller - Inflater: IntelInflater
11:17:57.955 INFO HaplotypeCaller - GCS max retries/reopens: 20
11:17:57.955 INFO HaplotypeCaller - Requester pays: disabled
11:17:57.955 INFO HaplotypeCaller - Initializing engine
11:17:58.487 INFO HaplotypeCaller - Done initializing engine
11:17:58.489 INFO HaplotypeCallerEngine - Tool is in reference confidence mode and the annotation, the following changes will be made to any specified annotations: 'StrandBiasBySample' will be enabled. 'ChromosomeCounts', 'FisherStrand', 'StrandOddsRatio' and 'QualByDepth' annotations have been disabled
11:17:58.499 INFO HaplotypeCallerEngine - Standard Emitting and Calling confidence set to 0.0 for reference-model confidence output
11:17:58.499 INFO HaplotypeCallerEngine - All sites annotated with PLs forced to true for reference-model confidence output
11:17:58.512 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/home/pfiziev/software/gatk-4.0.11.0/gatk-package-4.0.11.0-local.jar!/com/intel/gkl/native/libgkl_utils.so
11:17:58.514 INFO NativeLibraryLoader - Loading libgkl_pairhmm_omp.so from jar:file:/home/pfiziev/software/gatk-4.0.11.0/gatk-package-4.0.11.0-local.jar!/com/intel/gkl/native/libgkl_pairhmm_omp.so
11:17:58.571 WARN IntelPairHmm - Flush-to-zero (FTZ) is enabled when running PairHMM
11:17:58.572 INFO IntelPairHmm - Available threads: 56
11:17:58.572 INFO IntelPairHmm - Requested threads: 4
11:17:58.572 INFO PairHMM - Using the OpenMP multi-threaded AVX-accelerated native PairHMM implementation
11:17:58.682 INFO ProgressMeter - Starting traversal
11:17:58.683 INFO ProgressMeter - Current Locus Elapsed Minutes Regions Processed Regions/Minute
11:18:03.297 WARN DepthPerSampleHC - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
11:18:03.297 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
…
14:10:58.160 INFO ProgressMeter - chrY:27588245 173.0 10830630 62608.0
14:11:08.290 INFO ProgressMeter - chrY:37023245 173.2 10862080 62728.5
14:11:18.292 INFO ProgressMeter - chrY:46002245 173.3 10892010 62840.9
14:11:28.292 INFO ProgressMeter - chrY:54567245 173.5 10920560 62945.1
14:11:32.729 WARN DepthPerSampleHC - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
14:11:32.729 WARN StrandBiasBySample - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
14:11:33.219 INFO VectorLoglessPairHMM - Time spent in setup for JNI call : 2.404777391
14:11:33.219 INFO PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 255.64041847700003
14:11:33.219 INFO SmithWatermanAligner - Total compute time in java Smith-Waterman : 417.38 sec
14:11:33.219 INFO HaplotypeCaller - Shutting down engine
[November 8, 2018 2:11:33 PM PST] org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller done. Elapsed time: 173.62 minutes.
Runtime.totalMemory()=12706119680
java.lang.NullPointerException
at org.broadinstitute.hellbender.engine.AssemblyRegion.getReference(AssemblyRegion.java:443)
at org.broadinstitute.hellbender.engine.AssemblyRegion.getAssemblyRegionReference(AssemblyRegion.java:464)
at org.broadinstitute.hellbender.engine.AssemblyRegion.getAssemblyRegionReference(AssemblyRegion.java:450)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.AssemblyBasedCallerUtils.createReferenceHaplotype(AssemblyBasedCallerUtils.java:149)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCallerEngine.referenceModelForNoVariation(HaplotypeCallerEngine.java:682)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCallerEngine.callRegion(HaplotypeCallerEngine.java:521)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller.apply(HaplotypeCaller.java:240)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.processReadShard(AssemblyRegionWalker.java:291)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.traverse(AssemblyRegionWalker.java:267)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:966)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
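
On the point that the generated gVCF "looks okay": one way to check that the output was written in full, despite the exception, is to read the compressed file to the end and inspect its last record. The following is a minimal Python sketch, not part of GATK. The function name last_record_span is hypothetical, and the sketch assumes that reference blocks carry an END= key in the INFO column, as they do in GATK gVCFs.

```python
import gzip
import re


def last_record_span(vcf_path):
    """Return (contig, start, end) of the final record in a gzip/bgzip VCF.

    Reading the whole stream doubles as a basic integrity check: a file cut
    off in the middle of a compressed block raises an error instead of
    returning a result.
    """
    last = None
    with gzip.open(vcf_path, "rt") as handle:
        for line in handle:
            # Skip header lines and blank lines; remember the latest record.
            if line.startswith("#") or not line.strip():
                continue
            last = line
    if last is None:
        return None  # header only, no records

    fields = last.rstrip("\n").split("\t")
    contig, pos, info = fields[0], int(fields[1]), fields[7]
    # gVCF reference blocks store their last covered position as END=<int>;
    # ordinary variant records have no END key, so fall back to POS.
    match = re.search(r"(?:^|;)END=(\d+)", info)
    end = int(match.group(1)) if match else pos
    return contig, pos, end


# Example call (file name taken from the command in the post):
# print(last_record_span(
#     "C828.TCGA-EB-A3XB-10B-01D-A23B-08.1_gdc_realn.sorted.bam"
#     ".genomic.hg38_canonical_chromosomes.vcf.gz"))
```

If the last record ends at or near the end of the final contig listed in the reference's .fai index, the file was most likely written completely. A truncation that happens to fall exactly on a compressed-block boundary would not raise an error here, so the position check matters more than the absence of an exception.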