Dear GATK team
I am calling variants using HaplotypeCaller on both WGS data from a normal tissue sample and RNA-seq data from tumor tissue. The HC settings differ slightly for the RNA-seq data, but the problem only arises when running HC on the WGS data. We are following the Best Practices.
I am using Oracle JDK 1.8.0_144 (Java HotSpot(TM) 64-Bit Server VM), but I also tried OpenJDK 64-Bit Server VM v1.8.0_161. The GATK version is 4.0.1.1.
I am running with the WDL/Cromwell setup and scatter-gather, so, as you can see in the command below, I am not setting --native-pair-hmm-threads (I saw in some previous posts that the old -nct could produce errors).
It could be related to memory, so I tried changing the Java settings, for example from -Xmx4g to -Xms8000m, which I saw used here: https://github.com/gatk-workflows/gatk4-germline-snps-indels/blob/master/haplotypecaller-gvcf-gatk4.hg38.wgs.inputs.json. It doesn't make any difference; the error is still produced. I also tried removing the GC limit options (-XX:GCTimeLimit and -XX:GCHeapFreeLimit). Should I try something else? The -Duser.country setting is there because of confusion between ',' and '.' in floats: our server is set to Danish (for no reason), where commas are used as the decimal separator.
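In case the locale part matters: I am not sure the full locale string I pass is the right form. As far as I understand, Java reads user.language and user.country as two separate codes (a language code and a country code), so the sketch below shows the form I assume would be expected, plus a shell-level check that a period is used as the decimal separator under the C locale. The variable names are only for illustration.

```shell
# Assumption: the JVM wants a bare language code and a bare country code,
# rather than a full locale string such as en_US.UTF-8.
locale_opts="-Duser.language=en -Duser.country=US"

# Shell-level check: under the C locale, printf formats decimals with a period.
formatted=$(LC_ALL=C printf '%.5f' 0.00172)

echo "$locale_opts"
echo "$formatted"
```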
This is my command (some of the paths have been abbreviated for clarity):
$gatk4.0.1.1 --java-options "-XX:GCTimeLimit=50 -XX:GCHeapFreeLimit=10 -Xmx4g -Duser.country=en_US.UTF-8 -Duser.language=en_US.UTF-8" HaplotypeCaller \
-R $longpath/gatk-legacy-bundles/b37/human_g1k_v37_decoy.fasta \
-O Normal-056-WGS.vcf.gz \
-I $longpath/call-GatherBamFiles_normal/execution/Normal-056-WGS.bam \
--max-alternate-alleles 3 \
--contamination-fraction-to-filter 0.00172 \
--read-filter OverclippedReadFilter \
--standard-min-confidence-threshold-for-calling 30 \
-L $longpath/gatk-legacy-bundles/b37/scattered_wgs_intervals/scatter-50/temp_0024_of_50/scattered.interval_list
Stacktrace (sorry for the long paths but hopefully only the last part is important):
Using GATK jar /services/tools/gatk/4.0.1.1/gatk-package-4.0.1.1-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 -XX:GCTimeLimit=50 -XX:GCHeapFreeLimit=10 -Xmx4g -Duser.country=en_US.UTF-8 -Duser.language=en_US.UTF-8 -jar /services/tools/gatk/4.0.1.1/gatk-package-4.0.1.1-local.jar HaplotypeCaller -R /home/projects/dp_00005/apps/bonkolab_cromwell/tmp_wdir/Sample_021-056/cromwell-executions/WGS_normal_RNAseq_tumor_SNV_wf/4da5f4da-0dfc-4d55-a3bb-865eb51d6838/call-HaplotypeCaller_normal/shard-23/inputs/home/databases/gatk-legacy-bundles/b37/human_g1k_v37_decoy.fasta -O Normal-056-WGS.vcf.gz -I /home/projects/dp_00005/apps/bonkolab_cromwell/tmp_wdir/Sample_021-056/cromwell-executions/WGS_normal_RNAseq_tumor_SNV_wf/4da5f4da-0dfc-4d55-a3bb-865eb51d6838/call-HaplotypeCaller_normal/shard-23/inputs/home/projects/dp_00005/apps/bonkolab_cromwell/tmp_wdir/Sample_021-056/cromwell-executions/WGS_normal_RNAseq_tumor_SNV_wf/4da5f4da-0dfc-4d55-a3bb-865eb51d6838/call-GatherBamFiles_normal/execution/Normal-056-WGS.bam --max-alternate-alleles 3 --contamination-fraction-to-filter 0.00172 --read-filter OverclippedReadFilter --standard-min-confidence-threshold-for-calling 30 -L /home/projects/dp_00005/apps/bonkolab_cromwell/tmp_wdir/Sample_021-056/cromwell-executions/WGS_normal_RNAseq_tumor_SNV_wf/4da5f4da-0dfc-4d55-a3bb-865eb51d6838/call-HaplotypeCaller_normal/shard-23/inputs/home/databases/gatk-legacy-bundles/b37/scattered_wgs_intervals/scatter-50/temp_0024_of_50/scattered.interval_list
Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/home/projects/dp_00005/apps/bonkolab_cromwell/tmp_wdir/Sample_021-056/cromwell-executions/WGS_normal_RNAseq_tumor_SNV_wf/4da5f4da-0dfc-4d55-a3bb-865eb51d6838/call-HaplotypeCaller_normal/shard-23/execution/tmp.47DEgM
11:29:03.730 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/services/tools/gatk/4.0.1.1/gatk-package-4.0.1.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
11:29:03.961 INFO HaplotypeCaller - ------------------------------------------------------------
11:29:03.962 INFO HaplotypeCaller - The Genome Analysis Toolkit (GATK) v4.0.1.1
11:29:03.963 INFO HaplotypeCaller - For support and documentation go to https://software.broadinstitute.org/gatk/
11:29:03.963 INFO HaplotypeCaller - Executing as s143372@risoe-r03-cn026 on Linux v3.10.0-514.10.2.el7.x86_64 amd64
11:29:03.963 INFO HaplotypeCaller - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_144-b01
11:29:03.963 INFO HaplotypeCaller - Start Date/Time: March 19, 2018 11:29:03 AM CET
11:29:03.963 INFO HaplotypeCaller - ------------------------------------------------------------
11:29:03.963 INFO HaplotypeCaller - ------------------------------------------------------------
11:29:03.964 INFO HaplotypeCaller - HTSJDK Version: 2.14.1
11:29:03.964 INFO HaplotypeCaller - Picard Version: 2.17.2
11:29:03.964 INFO HaplotypeCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 1
11:29:03.964 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
11:29:03.964 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
11:29:03.964 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
11:29:03.964 INFO HaplotypeCaller - Deflater: IntelDeflater
11:29:03.964 INFO HaplotypeCaller - Inflater: IntelInflater
11:29:03.964 INFO HaplotypeCaller - GCS max retries/reopens: 20
11:29:03.964 INFO HaplotypeCaller - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
11:29:03.965 INFO HaplotypeCaller - Initializing engine
11:29:04.807 INFO IntervalArgumentCollection - Processing 40724607 bp from intervals
11:29:04.833 INFO HaplotypeCaller - Done initializing engine
11:29:04.863 INFO HaplotypeCallerEngine - Disabling physical phasing, which is supported only for reference-model confidence output
11:29:05.604 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/services/tools/gatk/4.0.1.1/gatk-package-4.0.1.1-local.jar!/com/intel/gkl/native/libgkl_utils.so
11:29:05.618 INFO NativeLibraryLoader - Loading libgkl_pairhmm_omp.so from jar:file:/services/tools/gatk/4.0.1.1/gatk-package-4.0.1.1-local.jar!/com/intel/gkl/native/libgkl_pairhmm_omp.so
11:29:05.682 WARN IntelPairHmm - Flush-to-zero (FTZ) is enabled when running PairHMM
11:29:05.683 INFO IntelPairHmm - Available threads: 1
11:29:05.683 INFO IntelPairHmm - Requested threads: 4
11:29:05.683 WARN IntelPairHmm - Using 1 available threads, but 4 were requested
11:29:05.683 INFO PairHMM - Using the OpenMP multi-threaded AVX-accelerated native PairHMM implementation
11:29:05.759 INFO ProgressMeter - Starting traversal
11:29:05.765 INFO ProgressMeter - Current Locus Elapsed Minutes Regions Processed Regions/Minute
11:29:06.858 INFO VectorLoglessPairHMM - Time spent in setup for JNI call : 0.001355153
11:29:06.859 INFO PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 0.023359896
11:29:06.859 INFO SmithWatermanAligner - Total compute time in java Smith-Waterman : 0.06 sec
11:29:06.860 INFO HaplotypeCaller - Shutting down engine
[March 19, 2018 11:29:06 AM CET] org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller done. Elapsed time: 0.05 minutes.
Runtime.totalMemory()=2041511936
java.lang.NullPointerException
at java.util.Collections$UnmodifiableMap.<init>(Collections.java:1446)
at java.util.Collections.unmodifiableMap(Collections.java:1433)
at org.broadinstitute.hellbender.tools.walkers.genotyper.StandardCallerArgumentCollection.getSampleContamination(StandardCallerArgumentCollection.java:89)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCallerGenotypingEngine.assignGenotypeLikelihoods(HaplotypeCallerGenotypingEngine.java:141)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCallerEngine.callRegion(HaplotypeCallerEngine.java:566)
at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller.apply(HaplotypeCaller.java:218)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.processReadShard(AssemblyRegionWalker.java:295)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.traverse(AssemblyRegionWalker.java:271)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:893)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:136)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:179)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:198)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:153)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:195)
at org.broadinstitute.hellbender.Main.main(Main.java:277)
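One thing I notice in the trace is the getSampleContamination frame, which makes me wonder whether --contamination-fraction-to-filter is involved. I have not confirmed this. A diagnostic I could try is the same command without that flag; the sketch below only assembles and prints it. The output file name is a hypothetical choice, and $longpath is the same abbreviation as in my command above.

```shell
# Diagnostic sketch, not yet run: the HaplotypeCaller arguments from my command
# above, with only --contamination-fraction-to-filter removed.
# longpath is the same abbreviated prefix as in my post; set it in the environment.
longpath="${longpath:-}"

# Java options kept exactly as in the failing run, so only one thing changes.
java_opts="-XX:GCTimeLimit=50 -XX:GCHeapFreeLimit=10 -Xmx4g -Duser.country=en_US.UTF-8 -Duser.language=en_US.UTF-8"

# Hypothetical output name, chosen so the original output is not overwritten.
out_vcf="Normal-056-WGS.nocontam.vcf.gz"

hc_args="HaplotypeCaller \
-R $longpath/gatk-legacy-bundles/b37/human_g1k_v37_decoy.fasta \
-O $out_vcf \
-I $longpath/call-GatherBamFiles_normal/execution/Normal-056-WGS.bam \
--max-alternate-alleles 3 \
--read-filter OverclippedReadFilter \
--standard-min-confidence-threshold-for-calling 30 \
-L $longpath/gatk-legacy-bundles/b37/scattered_wgs_intervals/scatter-50/temp_0024_of_50/scattered.interval_list"

# Print the command for inspection instead of executing it.
echo "gatk --java-options \"$java_opts\" $hc_args"
```

If the run completes without the flag, that would at least narrow the problem down to the contamination handling rather than memory.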
Thank you so much for your help!
- Nanna