We are following the "Calling variants on cohorts of samples using the HaplotypeCaller in GVCF mode" best practices with GATK 3.8.1 and Java 1.8. We merged the raw.g.vcf files from HaplotypeCaller into one cohort.g.vcf and then carried out joint genotyping with the GenotypeGVCFs tool. We are working with a haploid model organism, so we then tried to run the VariantFiltration tool on the output (a VCF file containing the information from all of the sequences we are working with). However, this failed with the error:
"Line 2176: there aren't enough columns for line 102"
Others have encountered the same problem, and I see that you responded that the GATK and Java versions were incompatible, but that was several versions ago. Does this also apply to us? Could you please tell me where to go next?
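One way to narrow down an error like this is to check whether any data line in the VCF has fewer tab-separated columns than the `#CHROM` header line. The sketch below is only a rough diagnostic under that assumption; the function name `find_short_lines` and the example records are hypothetical, and it does not say why a line is short (a truncated file and a writer problem would look the same).

```python
def find_short_lines(lines):
    """Return (line_number, n_columns, expected) for data lines with too few columns."""
    expected = None
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Skip meta-information lines and blank lines.
        if line.startswith("##") or not line:
            continue
        fields = line.split("\t")
        # The #CHROM header defines how many columns every record should have.
        if line.startswith("#CHROM"):
            expected = len(fields)
            continue
        if expected is not None and len(fields) < expected:
            problems.append((number, len(fields), expected))
    return problems


# Hypothetical two-sample VCF in which the last record is missing one sample column.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1\ts2\n",
    "chr1\t10\t.\tA\tG\t50\t.\t.\tGT\t0\t1\n",
    "chr1\t20\t.\tC\tT\t50\t.\t.\tGT\t0\n",
]
print(find_short_lines(example))
```

For a real file, the same function can be fed an open file handle, for example `find_short_lines(open("cohort.vcf"))`, where the file name is a placeholder.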
GenotypeGVCFs and VariantFiltration tools
Using NIO with GATK4 HaplotypeCaller
Is GATK4 HaplotypeCaller NIO compatible? If not, is there another version that is?
Thanks!
"UNKNOWN" zygosity in CSV file
Hi there,
I am using GATK 3. I recently checked the CSV and BAM files for a couple who are both carriers of one pathogenic variant, but in the CSV files the zygosity of this variant is labeled "Unknown" rather than heterozygous for both of them.
I have two questions:
1- What are the main criteria for determining the "zygosity" of a variant in GATK?
2- How can I eliminate false negative (or false positive) variants in the final VCF (from GATK)?
Thank you
Mojtaba
Is UnifiedGenotyper actually better than HaplotypeCaller for this pooled sample project?
Hi, I am interested in calling variants from pooled samples. Specifically, I wish to determine SNP allele frequencies from samples that were made by pooling many individuals (1000+) together. I know that HaplotypeCaller is now recommended over UnifiedGenotyper in all cases. However, is this project an exception? I have:
- 1000s of individuals in each pooled sample
- only two possible alleles at every site
- I only need to call SNPs
- I can generate a set of known SNPs to call (does GENOTYPE_GIVEN_ALLELES work in HaplotypeCaller?)
- I have high read coverage
- I want to detect rare alleles as best as possible
If you still advise using HaplotypeCaller in this case, do you have any special suggestions? I'd like to maximize the -ploidy number to detect the rare alleles, but otherwise streamline the job. Thanks for any advice you can provide!
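For reference, a read-count-based allele frequency estimate for a pooled sample can be sketched as below. This is only an illustration, assuming the pooled call carries a standard comma-separated AD (allelic depth) value with the reference allele first; the function name is hypothetical, and this is not how either UnifiedGenotyper or HaplotypeCaller computes its own estimates.

```python
def alt_allele_fraction(ad_field):
    """Fraction of reads supporting each ALT allele, from a comma-separated AD value."""
    depths = [int(x) for x in ad_field.split(",")]
    total = sum(depths)
    if total == 0:
        # No informative reads at this site.
        return None
    # depths[0] is the reference allele; the rest are ALT alleles in order.
    return [d / total for d in depths[1:]]


# Hypothetical biallelic site: 990 reference reads and 10 alternate reads.
print(alt_allele_fraction("990,10"))
```

With high coverage, a rare allele at a frequency near 1% shows up as a small but non-zero fraction, which then has to be weighed against the sequencing error rate.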
HaplotypeCaller warnings DepthPerSampleHC
Hi, I'm trying to do multi-sample variant calling on several BAM files with the following command:
/mnt/fastdata/md1jale/software/gatk-4.0.1.0/gatk HaplotypeCaller -R /mnt/fastdata/md1jale/reference/hs37d5.fa -I /mnt/fastdata/md1jale/WGS_MShef7_iPS/24811_1#1.bam -I /mnt/fastdata/md1jale/WGS_MShef7_iPS/24150_1#1.bam -I /mnt/fastdata/md1jale/WGS_MShef7_iPS/24144_2#1.bam -I /mnt/fastdata/md1jale/WGS_MShef7_iPS/24712_6#1.bam -I /mnt/fastdata/md1jale/WGS_MShef7_iPS/24811_2#1.bam -O /mnt/fastdata/md1jale/WGS_MShef7_iPS/output/raw_variants.vcf
Using GATK jar /mnt/fastdata/md1jale/software/gatk-4.0.1.0/gatk-package-4.0.1.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 -jar /mnt/fastdata/md1jale/software/gatk-4.0.1.0/gatk-package-4.0.1.0-local.jar HaplotypeCaller -R /mnt/fastdata/md1jale/reference/hs37d5.fa -I /mnt/fastdata/md1jale/WGS_MShef7_iPS/24811_1#1.bam -I /mnt/fastdata/md1jale/WGS_MShef7_iPS/24150_1#1.bam -I /mnt/fastdata/md1jale/WGS_MShef7_iPS/24144_2#1.bam -I /mnt/fastdata/md1jale/WGS_MShef7_iPS/24712_6#1.bam -I /mnt/fastdata/md1jale/WGS_MShef7_iPS/24811_2#1.bam -O /mnt/fastdata/md1jale/WGS_MShef7_iPS/output/mshef7_wt_vs_ips_raw_variants.vcf
10:26:29.719 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/mnt/fastdata/md1jale/software/gatk-4.0.1.0/gatk-package-4.0.1.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
10:26:29.935 INFO HaplotypeCaller - ------------------------------------------------------------
10:26:29.935 INFO HaplotypeCaller - The Genome Analysis Toolkit (GATK) v4.0.1.0
10:26:29.935 INFO HaplotypeCaller - For support and documentation go to https://software.broadinstitute.org/gatk/
10:26:29.935 INFO HaplotypeCaller - Executing as md1jale@sharc-node122.shef.ac.uk on Linux v3.10.0-693.11.6.el7.x86_64 amd64
10:26:29.936 INFO HaplotypeCaller - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_102-b14
10:26:29.936 INFO HaplotypeCaller - Start Date/Time: 14 February 2018 10:26:29 GMT
10:26:29.936 INFO HaplotypeCaller - ------------------------------------------------------------
10:26:29.936 INFO HaplotypeCaller - ------------------------------------------------------------
10:26:29.936 INFO HaplotypeCaller - HTSJDK Version: 2.14.1
10:26:29.936 INFO HaplotypeCaller - Picard Version: 2.17.2
10:26:29.937 INFO HaplotypeCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 1
10:26:29.937 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
10:26:29.937 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
10:26:29.937 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
10:26:29.937 INFO HaplotypeCaller - Deflater: IntelDeflater
10:26:29.937 INFO HaplotypeCaller - Inflater: IntelInflater
10:26:29.937 INFO HaplotypeCaller - GCS max retries/reopens: 20
10:26:29.937 INFO HaplotypeCaller - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
10:26:29.937 INFO HaplotypeCaller - Initializing engine
10:26:30.520 INFO HaplotypeCaller - Done initializing engine
10:26:30.528 INFO HaplotypeCallerEngine - Disabling physical phasing, which is supported only for reference-model confidence output
10:26:31.119 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/mnt/fastdata/md1jale/software/gatk-4.0.1.0/gatk-package-4.0.1.0-local.jar!/com/intel/gkl/native/libgkl_utils.so
10:26:31.154 INFO NativeLibraryLoader - Loading libgkl_pairhmm_omp.so from jar:file:/mnt/fastdata/md1jale/software/gatk-4.0.1.0/gatk-package-4.0.1.0-local.jar!/com/intel/gkl/native/libgkl_pairhmm_omp.so
10:26:31.259 WARN IntelPairHmm - Flush-to-zero (FTZ) is enabled when running PairHMM
10:26:31.259 INFO IntelPairHmm - Available threads: 16
10:26:31.259 INFO IntelPairHmm - Requested threads: 4
10:26:31.259 INFO PairHMM - Using the OpenMP multi-threaded AVX-accelerated native PairHMM implementation
10:26:31.298 INFO ProgressMeter - Starting traversal
10:26:31.298 INFO ProgressMeter - Current Locus Elapsed Minutes Regions Processed Regions/Minute
10:26:33.832 WARN DepthPerSampleHC - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
10:26:33.865 WARN DepthPerSampleHC - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
10:26:33.880 WARN DepthPerSampleHC - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
10:26:33.911 WARN DepthPerSampleHC - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
10:26:34.733 WARN DepthPerSampleHC - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null
10:26:41.497 INFO ProgressMeter - 1:15485 0.2 80 470.6
After some initial memory issues, the command now runs when given a large amount of memory, although I do get lots of WARN DepthPerSampleHC messages. Is this normal?
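To see how many warnings of each kind a run produced, the log can be tallied by level and source. The following is a minimal sketch that assumes log lines in the `HH:MM:SS.mmm LEVEL Source - message` layout shown above; the function name is hypothetical.

```python
import re
from collections import Counter

# Matches lines such as "10:26:33.832 WARN DepthPerSampleHC - Annotation will not ..."
LOG_LINE = re.compile(r"^\d{2}:\d{2}:\d{2}\.\d{3}\s+(INFO|WARN|ERROR)\s+(\S+)\s+-\s")


def count_by_level_and_source(lines):
    """Count log lines per (level, source) pair, ignoring lines in other layouts."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


sample_log = [
    "10:26:31.298 INFO ProgressMeter - Starting traversal",
    "10:26:33.832 WARN DepthPerSampleHC - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null",
    "10:26:33.865 WARN DepthPerSampleHC - Annotation will not be calculated, genotype is not called or alleleLikelihoodMap is null",
    "Running:",
]
for key, n in count_by_level_and_source(sample_log).items():
    print(key, n)
```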
HaplotypeCaller gives an error and generates a VCF file with no variant calls
Dear GATK Team,
I'm using GATK and Picard to call short variants from paired-read FASTQ files of the Plasmodium genome. I ran the HaplotypeCaller tool after marking duplicates with the Picard MarkDuplicates tool.
HaplotypeCaller produced an error during variant calling.
Kindly help me resolve this issue.
Below is the log of the GATK HaplotypeCaller step:
Using GATK jar /home/ubuntu/gatk-4.0.5.1/gatk-package-4.0.5.1-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx4g -jar /home/ubuntu/gatk-4.0.5.1/gatk-package-4.0.5.1-local.jar HaplotypeCaller -R Pf_ref/pf_3D7_38_Genome.fasta -I ./bam_output/Day0_IJD_252_dedup.bam -O ./variant_output/Day0_IJD_252_raw_variants.g.vcf
22:44:31.686 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/ubuntu/gatk-4.0.5.1/gatk-package-4.0.5.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
22:44:32.283 INFO HaplotypeCaller - ------------------------------------------------------------
22:44:32.283 INFO HaplotypeCaller - The Genome Analysis Toolkit (GATK) v4.0.5.1
22:44:32.284 INFO HaplotypeCaller - For support and documentation go to https://software.broadinstitute.org/gatk/
22:44:32.285 INFO HaplotypeCaller - Executing as ubuntu@mrcclimbserver.vms.swansea.climb.ac.uk on Linux v4.4.0-127-generic amd64
22:44:32.286 INFO HaplotypeCaller - Java runtime: Java HotSpot(TM) 64-Bit Server VM v9.0.1+11
22:44:32.286 INFO HaplotypeCaller - Start Date/Time: June 27, 2018 at 10:44:31 PM UTC
22:44:32.286 INFO HaplotypeCaller - ------------------------------------------------------------
22:44:32.287 INFO HaplotypeCaller - ------------------------------------------------------------
22:44:32.289 INFO HaplotypeCaller - HTSJDK Version: 2.15.1
22:44:32.289 INFO HaplotypeCaller - Picard Version: 2.18.2
22:44:32.290 INFO HaplotypeCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 2
22:44:32.290 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
22:44:32.290 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
22:44:32.290 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
22:44:32.291 INFO HaplotypeCaller - Deflater: IntelDeflater
22:44:32.291 INFO HaplotypeCaller - Inflater: IntelInflater
22:44:32.291 INFO HaplotypeCaller - GCS max retries/reopens: 20
22:44:32.291 INFO HaplotypeCaller - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
22:44:32.291 INFO HaplotypeCaller - Initializing engine
22:44:32.813 INFO HaplotypeCaller - Done initializing engine
22:44:32.828 INFO HaplotypeCallerEngine - Disabling physical phasing, which is supported only for reference-model confidence output
22:44:32.849 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/home/ubuntu/gatk-4.0.5.1/gatk-package-4.0.5.1-local.jar!/com/intel/gkl/native/libgkl_utils.so
22:44:32.856 INFO NativeLibraryLoader - Loading libgkl_pairhmm_omp.so from jar:file:/home/ubuntu/gatk-4.0.5.1/gatk-package-4.0.5.1-local.jar!/com/intel/gkl/native/libgkl_pairhmm_omp.so
22:44:32.968 WARN IntelPairHmm - Flush-to-zero (FTZ) is enabled when running PairHMM
22:44:32.969 INFO IntelPairHmm - Available threads: 32
22:44:32.969 INFO IntelPairHmm - Requested threads: 4
22:44:32.969 INFO PairHMM - Using the OpenMP multi-threaded AVX-accelerated native PairHMM implementation
22:44:33.047 INFO ProgressMeter - Starting traversal
22:44:33.048 INFO ProgressMeter - Current Locus Elapsed Minutes Regions Processed Regions/Minute
22:44:33.065 INFO VectorLoglessPairHMM - Time spent in setup for JNI call : 0.0
22:44:33.066 INFO PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 0.0
22:44:33.070 INFO SmithWatermanAligner - Total compute time in java Smith-Waterman : 0.00 sec
22:44:33.070 INFO HaplotypeCaller - Shutting down engine
[June 27, 2018 at 10:44:33 PM UTC] org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller done. Elapsed time: 0.03 minutes.
Runtime.totalMemory()=2147483648
Exception in thread "main" java.lang.IncompatibleClassChangeError: Inconsistent constant pool data in classfile for class org/broadinstitute/hellbender/transformers/ReadTransformer. Method lambda$identity$d67512bf$1(Lorg/broadinstitute/hellbender/utils/read/GATKRead;)Lorg/broadinstitute/hellbender/utils/read/GATKRead; at index 65 is CONSTANT_MethodRef and should be CONSTANT_InterfaceMethodRef
at org.broadinstitute.hellbender.transformers.ReadTransformer.identity(ReadTransformer.java:30)
at org.broadinstitute.hellbender.engine.GATKTool.makePreReadFilterTransformer(GATKTool.java:288)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.traverse(AssemblyRegionWalker.java:266)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:994)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:135)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:180)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:199)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
regards
Archie
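The log above reports a Java 9 runtime (v9.0.1+11), whereas the GATK 4.0.x releases were, as far as their documentation states, built for Java 8. A quick way to check which major Java version a run used is to parse the "Java runtime" line of the log. The sketch below assumes that line has the layout shown in the logs here; the function name is hypothetical.

```python
import re


def java_major_version(log_line):
    """Extract the major Java version from a GATK 'Java runtime' log line."""
    match = re.search(r"Java runtime: .* v(\d+)(?:\.(\d+))?", log_line)
    if not match:
        return None
    first = int(match.group(1))
    second = int(match.group(2)) if match.group(2) else 0
    # Old-style version strings look like 1.8.0_102, where the major version is 8.
    return second if first == 1 else first


line_java9 = "22:44:32.286 INFO HaplotypeCaller - Java runtime: Java HotSpot(TM) 64-Bit Server VM v9.0.1+11"
line_java8 = "10:26:29.936 INFO HaplotypeCaller - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_102-b14"
print(java_major_version(line_java9), java_major_version(line_java8))
```

If the reported major version is not 8, rerunning with a Java 8 runtime is a reasonable first thing to try before looking elsewhere.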
All annotations in BP_RESOLUTION mode
Hello,
I was wondering if there is a way to output all annotations for all sites when running HaplotypeCaller with BP_RESOLUTION. Currently it outputs all annotations for only called variants. Thanks in advance.
Calling invariant sites with the new pipeline of HaplotypeCaller
Hello,
I am using the new HaplotypeCaller pipeline to obtain a VCF file containing both variant and invariant sites.
For each individual, I called variant and invariant sites:
java -Xmx300g -jar GenomeAnalysisTK.jar \
-T HaplotypeCaller \
-R ref.fasta \
-I ${INPUT}.bam \
--genotyping_mode DISCOVERY \
-stand_emit_conf 0 \
-stand_call_conf 0 \
-o ${INPUT}_VC.vcf \
--emitRefConfidence BP_RESOLUTION \
--variant_index_type LINEAR \
--variant_index_parameter 128000 \
-nct 16
In the vcf that I obtain, I indeed have every position.
The problem is that the INFO and QUAL fields are empty (.) if the site is non-variant.
KE332545.1 44 . T <NON_REF> . . . GT:AD:DP:GQ:PL 0/0:13,0:13:39:0,39,503
KE332545.1 45 . T <NON_REF> . . . GT:AD:DP:GQ:PL 0/0:13,0:13:39:0,39,518
KE332545.1 46 . C T,<NON_REF> 0 . BaseQRankSum=-2.270;ClippingRankSum=-0.691;DP=17;MLEAC=0,0;MLEAF=0.00,0.00;MQ=38.98;MQ0=0;MQRankSum=0.099;ReadPosRankSum=0.493 GT:AD:DP:GQ:PL:SB 0/0:11,2,0:13:3:0,3,379,33,385,414:0,0,0,0
KE332545.1 47 . C <NON_REF> . . . GT:AD:DP:GQ:PL 0/0:13,0:13:39:0,39,515
KE332545.1 48 . A <NON_REF> . . . GT:AD:DP:GQ:PL 0/0:13,0:13:39:0,39,540
KE332545.1 49 . C <NON_REF> . . . GT:AD:DP:GQ:PL 0/0:13,0:13:39:0,39,563
But I also wanted this information so that I could use my filtering pipeline on those invariant sites as well!
Is there any solution?
Thanks!
Muriel
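Although INFO and QUAL are empty at the non-variant sites, the records above still carry per-sample DP and GQ in the FORMAT column, so one possible workaround is to filter invariant sites on those values instead. The sketch below assumes single-sample records laid out like the ones shown; the function name and the thresholds `min_dp` and `min_gq` are hypothetical placeholders, not recommended values.

```python
def site_passes(record, min_dp=10, min_gq=30):
    """Return (is_invariant, passes) for one single-sample VCF record."""
    # split() handles tabs and the space-separated form pasted above.
    fields = record.split()
    alt = fields[4]
    # Pair each FORMAT key with the matching value in the sample column.
    fmt = dict(zip(fields[8].split(":"), fields[9].split(":")))
    is_invariant = alt == "<NON_REF>"
    passes = int(fmt["DP"]) >= min_dp and int(fmt["GQ"]) >= min_gq
    return is_invariant, passes


invariant = "KE332545.1 44 . T <NON_REF> . . . GT:AD:DP:GQ:PL 0/0:13,0:13:39:0,39,503"
candidate = (
    "KE332545.1 46 . C T,<NON_REF> 0 . "
    "BaseQRankSum=-2.270;ClippingRankSum=-0.691;DP=17;MLEAC=0,0;MLEAF=0.00,0.00;"
    "MQ=38.98;MQ0=0;MQRankSum=0.099;ReadPosRankSum=0.493 "
    "GT:AD:DP:GQ:PL:SB 0/0:11,2,0:13:3:0,3,379,33,385,414:0,0,0,0"
)
print(site_passes(invariant), site_passes(candidate))
```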
i_variant_quality_by_depth/i_genotype_quality interpretation
When interpreting the output of HaplotypeCaller, what do the i_variant_quality_by_depth and i_genotype_quality columns represent, and which of these would be a good value on which to base an assessment of confidence in the variant call and quality? What scale are they on? Or is there a different column that would be better?
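As background on the scale: in the VCF format, genotype quality (GQ) is Phred-scaled, so a value Q corresponds to an error probability of 10^(-Q/10). The column names in the question are not standard GATK VCF field names, so the assumption here is that i_genotype_quality holds GQ and i_variant_quality_by_depth holds QD (the variant QUAL score normalized by read depth). A minimal sketch of the Phred conversion, with a hypothetical function name:

```python
def phred_to_error_probability(q):
    """Convert a Phred-scaled quality value to the probability that the call is wrong."""
    return 10 ** (-q / 10)


for gq in (10, 20, 30, 99):
    print(gq, phred_to_error_probability(gq))
```

On this scale, every increase of 10 in the quality value corresponds to a tenfold drop in the estimated error probability.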
HaplotypeCaller outputs only the header and one position record, without any error
I'm trying to run GATK4 HaplotypeCaller using the following command:
./gatk HaplotypeCaller -R ./reference.fasta --emit-ref-confidence GVCF --dbsnp ./samtools_gatk_common.vcf -I ./sample.bqsr.bam -O ./sample.gvcf --TMP_DIR ./tmp
The log output shows no error, but the resulting *.gvcf file contains only the header and a single one-base record. The dbsnp file was the intersection of the samtools and GATK call sets.
Here is the log file:
Using GATK jar /path/to/gatk-4.0.4.0/gatk-package-4.0.4.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /path/to/gatk-4.0.4.0/gatk-package-4.0.4.0-local.jar HaplotypeCaller -R /path/to/index/chrom23.fasta --emit-ref-confidence GVCF --dbsnp /path/to/dbsnp/sample.dbsnp.vcf -I /path/to/BQSR/sample.bqsr.bam -O /path/to/result/sample.g.vcf --TMP_DIR /path/to/tmp
18:38:47.051 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/path/to/gatk-4.0.4.0/gatk-package-4.0.4.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
18:38:47.439 INFO HaplotypeCaller - ------------------------------------------------------------
18:38:47.440 INFO HaplotypeCaller - The Genome Analysis Toolkit (GATK) v4.0.4.0
18:38:47.440 INFO HaplotypeCaller - For support and documentation go to https://software.broadinstitute.org/gatk/
18:38:47.442 INFO HaplotypeCaller - Executing as hankai@cngb-compute-e05-6.cngb.sz.hpc on Linux v2.6.32-696.el6.x86_64 amd64
18:38:47.442 INFO HaplotypeCaller - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_172-b11
18:38:47.442 INFO HaplotypeCaller - Start Date/Time: July 4, 2018 6:38:46 PM CST
18:38:47.442 INFO HaplotypeCaller - ------------------------------------------------------------
18:38:47.442 INFO HaplotypeCaller - ------------------------------------------------------------
18:38:47.443 INFO HaplotypeCaller - HTSJDK Version: 2.14.3
18:38:47.443 INFO HaplotypeCaller - Picard Version: 2.18.2
18:38:47.444 INFO HaplotypeCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 2
18:38:47.444 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
18:38:47.444 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
18:38:47.444 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
18:38:47.444 INFO HaplotypeCaller - Deflater: IntelDeflater
18:38:47.444 INFO HaplotypeCaller - Inflater: IntelInflater
18:38:47.444 INFO HaplotypeCaller - GCS max retries/reopens: 20
18:38:47.444 INFO HaplotypeCaller - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
18:38:47.444 INFO HaplotypeCaller - Initializing engine
18:38:50.210 INFO FeatureManager - Using codec VCFCodec to read file file:///path/to/dbsnp/sample.dbsnp.vcf
18:38:50.292 INFO HaplotypeCaller - Done initializing engine
18:38:50.303 INFO HaplotypeCallerEngine - Standard Emitting and Calling confidence set to 0.0 for reference-model confidence output
18:38:50.303 INFO HaplotypeCallerEngine - All sites annotated with PLs forced to true for reference-model confidence output
18:38:51.794 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/path/to/gatk-4.0.4.0/gatk-package-4.0.4.0-local.jar!/com/intel/gkl/native/libgkl_utils.so
18:38:51.817 INFO NativeLibraryLoader - Loading libgkl_pairhmm_omp.so from jar:file:/path/to/gatk-4.0.4.0/gatk-package-4.0.4.0-local.jar!/com/intel/gkl/native/libgkl_pairhmm_omp.so
18:38:51.915 WARN IntelPairHmm - Flush-to-zero (FTZ) is enabled when running PairHMM
18:38:51.916 INFO IntelPairHmm - Available threads: 112
18:38:51.916 INFO IntelPairHmm - Requested threads: 4
18:38:51.916 INFO PairHMM - Using the OpenMP multi-threaded AVX-accelerated native PairHMM implementation
18:38:51.996 INFO ProgressMeter - Starting traversal
18:38:51.997 INFO ProgressMeter - Current Locus Elapsed Minutes Regions Processed Regions/Minute
18:39:02.152 INFO ProgressMeter - pseudochrom_23:39888 0.2 240 1418.4
18:39:12.324 INFO ProgressMeter - pseudochrom_23:112351 0.3 650 1918.6
18:39:22.383 INFO ProgressMeter - pseudochrom_23:166271 0.5 980 1935.1
18:39:32.471 INFO ProgressMeter - pseudochrom_23:208604 0.7 1240 1838.3
18:39:42.498 INFO ProgressMeter - pseudochrom_23:270983 0.8 1610 1912.8
18:39:52.827 INFO ProgressMeter - pseudochrom_23:315473 1.0 1890 1864.2
18:40:03.130 INFO ProgressMeter - pseudochrom_23:368748 1.2 2220 1872.5
18:40:13.602 INFO ProgressMeter - pseudochrom_23:430805 1.4 2590 1905.3
18:40:23.620 INFO ProgressMeter - pseudochrom_23:512763 1.5 3060 2003.9
18:40:33.781 INFO ProgressMeter - pseudochrom_23:592148 1.7 3540 2086.8
18:40:46.199 INFO ProgressMeter - pseudochrom_23:661025 1.9 3950 2075.3
18:40:56.336 INFO ProgressMeter - pseudochrom_23:731629 2.1 4380 2113.6
18:41:09.819 INFO ProgressMeter - pseudochrom_23:835707 2.3 5000 2176.7
18:41:19.874 INFO ProgressMeter - pseudochrom_23:941548 2.5 5630 2284.3
18:41:30.479 INFO ProgressMeter - pseudochrom_23:1044902 2.6 6230 2358.6
18:41:40.552 INFO ProgressMeter - pseudochrom_23:1157010 2.8 6910 2459.7
18:41:50.606 INFO ProgressMeter - pseudochrom_23:1222918 3.0 7310 2455.6
18:42:00.695 INFO ProgressMeter - pseudochrom_23:1305523 3.1 7790 2477.0
18:42:10.765 INFO ProgressMeter - pseudochrom_23:1457789 3.3 8680 2620.1
18:42:20.899 INFO ProgressMeter - pseudochrom_23:1636208 3.5 9750 2800.4
18:42:30.922 INFO ProgressMeter - pseudochrom_23:1780023 3.6 10640 2916.1
18:42:40.981 INFO ProgressMeter - pseudochrom_23:1955789 3.8 11720 3071.0
18:42:51.075 INFO ProgressMeter - pseudochrom_23:2108472 4.0 12660 3177.2
18:43:01.113 INFO ProgressMeter - pseudochrom_23:2286350 4.2 13710 3302.1
18:43:11.157 INFO ProgressMeter - pseudochrom_23:2484540 4.3 14930 3456.6
18:43:21.167 INFO ProgressMeter - pseudochrom_23:2607582 4.5 15660 3490.7
18:43:31.253 INFO ProgressMeter - pseudochrom_23:2779264 4.7 16750 3598.9
18:43:41.256 INFO ProgressMeter - pseudochrom_23:2958401 4.8 17840 3700.5
18:43:51.431 INFO ProgressMeter - pseudochrom_23:3091735 5.0 18670 3741.1
18:44:01.489 INFO ProgressMeter - pseudochrom_23:3256919 5.2 19650 3809.5
18:44:11.888 INFO ProgressMeter - pseudochrom_23:3395538 5.3 20500 3845.1
18:44:22.047 INFO ProgressMeter - pseudochrom_23:3496925 5.5 21130 3841.2
18:44:32.048 INFO ProgressMeter - pseudochrom_23:3647997 5.7 22050 3890.6
18:44:42.058 INFO ProgressMeter - pseudochrom_23:3770277 5.8 22830 3913.0
18:44:52.224 INFO ProgressMeter - pseudochrom_23:3855394 6.0 23350 3889.2
18:45:02.305 INFO ProgressMeter - pseudochrom_23:3961378 6.2 24000 3888.7
18:45:12.396 INFO ProgressMeter - pseudochrom_23:4077288 6.3 24700 3895.9
18:45:22.481 INFO ProgressMeter - pseudochrom_23:4209807 6.5 25510 3919.8
18:45:32.603 INFO ProgressMeter - pseudochrom_23:4301812 6.7 26100 3909.1
18:45:42.779 INFO ProgressMeter - pseudochrom_23:4400034 6.8 26720 3902.8
18:45:53.263 INFO ProgressMeter - pseudochrom_23:4475456 7.0 27180 3871.2
18:46:04.692 INFO ProgressMeter - pseudochrom_23:4607856 7.2 28000 3882.6
18:46:14.837 INFO ProgressMeter - pseudochrom_23:4739532 7.4 28790 3900.7
18:46:26.963 INFO ProgressMeter - pseudochrom_23:4805956 7.6 29230 3854.8
18:46:37.150 INFO ProgressMeter - pseudochrom_23:4932551 7.8 30010 3871.0
18:46:47.557 INFO ProgressMeter - pseudochrom_23:5051360 7.9 30750 3879.6
18:46:57.575 INFO ProgressMeter - pseudochrom_23:5156893 8.1 31410 3881.1
18:47:07.589 INFO ProgressMeter - pseudochrom_23:5256960 8.3 32020 3876.6
18:47:17.844 INFO ProgressMeter - pseudochrom_23:5339306 8.4 32520 3857.3
18:47:28.069 INFO ProgressMeter - pseudochrom_23:5447309 8.6 33170 3856.4
18:47:38.135 INFO ProgressMeter - pseudochrom_23:5562641 8.8 33870 3862.5
18:47:48.259 INFO ProgressMeter - pseudochrom_23:5648642 8.9 34390 3847.8
18:47:58.434 INFO ProgressMeter - pseudochrom_23:5750249 9.1 35010 3844.2
18:48:09.065 INFO ProgressMeter - pseudochrom_23:5853949 9.3 35650 3839.7
18:48:19.112 INFO ProgressMeter - pseudochrom_23:5955110 9.5 36280 3838.4
18:48:29.206 INFO ProgressMeter - pseudochrom_23:6051364 9.6 36860 3831.5
18:48:39.584 INFO ProgressMeter - pseudochrom_23:6140606 9.8 37400 3819.0
18:48:49.694 INFO ProgressMeter - pseudochrom_23:6228203 10.0 37930 3807.6
18:48:59.742 INFO ProgressMeter - pseudochrom_23:6327447 10.1 38550 3805.9
18:49:10.118 INFO ProgressMeter - pseudochrom_23:6412023 10.3 39070 3792.5
18:49:20.131 INFO ProgressMeter - pseudochrom_23:6528580 10.5 39780 3799.8
18:49:30.488 INFO ProgressMeter - pseudochrom_23:6664489 10.6 40640 3819.0
18:49:41.323 INFO ProgressMeter - pseudochrom_23:6776006 10.8 41330 3819.0
18:49:51.947 INFO ProgressMeter - pseudochrom_23:6871397 11.0 41910 3810.3
18:50:02.348 INFO ProgressMeter - pseudochrom_23:6965003 11.2 42470 3801.3
18:50:12.656 INFO ProgressMeter - pseudochrom_23:7064647 11.3 43070 3796.6
18:50:22.681 INFO ProgressMeter - pseudochrom_23:7129699 11.5 43450 3774.5
18:50:32.723 INFO ProgressMeter - pseudochrom_23:7217180 11.7 43990 3766.7
18:50:42.805 INFO ProgressMeter - pseudochrom_23:7334195 11.8 44720 3774.9
18:50:52.874 INFO ProgressMeter - pseudochrom_23:7470037 12.0 45560 3792.0
18:51:03.070 INFO ProgressMeter - pseudochrom_23:7580430 12.2 46240 3795.0
18:51:13.109 INFO ProgressMeter - pseudochrom_23:7703064 12.4 46990 3804.3
18:51:23.274 INFO ProgressMeter - pseudochrom_23:7839176 12.5 47810 3818.3
18:51:33.338 INFO ProgressMeter - pseudochrom_23:7960865 12.7 48540 3825.4
18:51:43.392 INFO ProgressMeter - pseudochrom_23:8028264 12.9 48960 3808.2
18:51:53.463 INFO ProgressMeter - pseudochrom_23:8151834 13.0 49710 3816.7
18:52:03.665 INFO ProgressMeter - pseudochrom_23:8270942 13.2 50430 3822.1
18:52:13.727 INFO ProgressMeter - pseudochrom_23:8359715 13.4 50970 3814.5
18:52:23.905 INFO ProgressMeter - pseudochrom_23:8477290 13.5 51650 3816.9
18:52:33.954 INFO ProgressMeter - pseudochrom_23:8594099 13.7 52380 3823.6
18:52:44.110 INFO ProgressMeter - pseudochrom_23:8710379 13.9 53100 3828.8
18:52:54.114 INFO ProgressMeter - pseudochrom_23:8848199 14.0 53970 3845.3
18:53:04.680 INFO ProgressMeter - pseudochrom_23:8983340 14.2 54800 3856.1
18:53:15.384 INFO ProgressMeter - pseudochrom_23:9068836 14.4 55310 3843.7
18:53:25.473 INFO ProgressMeter - pseudochrom_23:9222012 14.6 56240 3863.2
18:53:35.477 INFO ProgressMeter - pseudochrom_23:9305881 14.7 56750 3854.1
18:53:45.512 INFO ProgressMeter - pseudochrom_23:9431585 14.9 57500 3861.2
18:53:55.687 INFO ProgressMeter - pseudochrom_23:9550933 15.1 58210 3864.8
18:54:05.702 INFO ProgressMeter - pseudochrom_23:9694239 15.2 59090 3880.3
18:54:15.903 INFO ProgressMeter - pseudochrom_23:9779200 15.4 59620 3871.8
18:54:25.917 INFO ProgressMeter - pseudochrom_23:9884556 15.6 60260 3871.4
18:54:36.002 INFO ProgressMeter - pseudochrom_23:9991326 15.7 60900 3870.7
18:54:46.010 INFO ProgressMeter - pseudochrom_23:10127422 15.9 61710 3881.1
18:54:56.072 INFO ProgressMeter - pseudochrom_23:10247506 16.1 62430 3885.4
18:55:06.287 INFO ProgressMeter - pseudochrom_23:10372627 16.2 63210 3892.7
18:55:16.338 INFO ProgressMeter - pseudochrom_23:10508632 16.4 64040 3903.5
18:55:26.423 INFO ProgressMeter - pseudochrom_23:10605673 16.6 64630 3899.5
18:55:36.484 INFO ProgressMeter - pseudochrom_23:10680890 16.7 65090 3888.0
18:55:46.555 INFO ProgressMeter - pseudochrom_23:10755549 16.9 65530 3875.4
18:55:56.618 INFO ProgressMeter - pseudochrom_23:10860581 17.1 66160 3874.2
18:56:06.724 INFO ProgressMeter - pseudochrom_23:10958345 17.2 66750 3870.6
18:56:16.801 INFO ProgressMeter - pseudochrom_23:11078670 17.4 67480 3875.2
18:56:26.824 INFO ProgressMeter - pseudochrom_23:11172750 17.6 68070 3871.9
18:56:36.886 INFO ProgressMeter - pseudochrom_23:11297520 17.7 68800 3876.5
18:56:46.910 INFO ProgressMeter - pseudochrom_23:11394420 17.9 69390 3873.2
18:56:56.924 INFO ProgressMeter - pseudochrom_23:11466077 18.1 69840 3862.4
18:57:06.975 INFO ProgressMeter - pseudochrom_23:11575994 18.2 70500 3863.1
18:57:17.094 INFO ProgressMeter - pseudochrom_23:11713112 18.4 71340 3873.3
18:57:27.171 INFO ProgressMeter - pseudochrom_23:11835109 18.6 72080 3878.1
18:57:37.329 INFO ProgressMeter - pseudochrom_23:11907584 18.8 72540 3867.7
18:57:47.364 INFO ProgressMeter - pseudochrom_23:12031631 18.9 73340 3875.8
18:57:57.451 INFO ProgressMeter - pseudochrom_23:12122040 19.1 73890 3870.4
18:58:07.495 INFO ProgressMeter - pseudochrom_23:12238860 19.3 74590 3873.1
18:58:17.565 INFO ProgressMeter - pseudochrom_23:12364885 19.4 75350 3878.8
18:58:27.731 INFO ProgressMeter - pseudochrom_23:12451270 19.6 75890 3872.8
18:58:38.320 INFO ProgressMeter - pseudochrom_23:12537057 19.8 76410 3864.5
18:58:48.414 INFO ProgressMeter - pseudochrom_23:12580452 19.9 76650 3844.0
18:58:59.346 INFO ProgressMeter - pseudochrom_23:12630247 20.1 76930 3823.1
18:59:10.085 INFO ProgressMeter - pseudochrom_23:12746384 20.3 77510 3818.0
18:59:20.474 INFO ProgressMeter - pseudochrom_23:12814970 20.5 77930 3806.2
18:59:30.683 INFO ProgressMeter - pseudochrom_23:12833522 20.6 78040 3780.1
18:59:41.531 INFO ProgressMeter - pseudochrom_23:12867911 20.8 78220 3756.0
18:59:51.979 INFO ProgressMeter - pseudochrom_23:12898083 21.0 78380 3732.4
19:00:02.811 INFO ProgressMeter - pseudochrom_23:12912010 21.2 78460 3704.4
19:00:12.854 INFO ProgressMeter - pseudochrom_23:12954239 21.3 78720 3687.5
19:00:23.618 INFO ProgressMeter - pseudochrom_23:13045215 21.5 79170 3677.7
19:00:33.765 INFO ProgressMeter - pseudochrom_23:13113654 21.7 79520 3665.2
19:00:46.176 INFO ProgressMeter - pseudochrom_23:13230637 21.9 80100 3657.0
19:00:57.561 INFO ProgressMeter - pseudochrom_23:13254119 22.1 80230 3631.5
19:01:11.951 INFO ProgressMeter - pseudochrom_23:13277140 22.3 80370 3598.8
19:01:23.954 INFO ProgressMeter - pseudochrom_23:13291793 22.5 80450 3570.4
19:01:34.143 INFO ProgressMeter - pseudochrom_23:13313750 22.7 80580 3549.4
19:01:44.470 INFO ProgressMeter - pseudochrom_23:13410560 22.9 81090 3545.0
19:01:54.793 INFO ProgressMeter - pseudochrom_23:13469784 23.0 81440 3533.7
19:02:05.477 INFO ProgressMeter - pseudochrom_23:13499022 23.2 81590 3513.1
19:02:15.584 INFO ProgressMeter - pseudochrom_23:13574066 23.4 81950 3503.2
19:02:27.238 INFO ProgressMeter - pseudochrom_23:13603519 23.6 82110 3481.1
19:02:37.410 INFO ProgressMeter - pseudochrom_23:13625698 23.8 82240 3461.7
19:02:48.228 INFO ProgressMeter - pseudochrom_23:13691826 23.9 82570 3449.4
19:02:59.032 INFO ProgressMeter - pseudochrom_23:13757035 24.1 82950 3439.4
19:03:09.114 INFO ProgressMeter - pseudochrom_23:13779661 24.3 83100 3421.8
19:03:19.416 INFO ProgressMeter - pseudochrom_23:13820635 24.5 83330 3407.2
19:03:29.183 INFO HaplotypeCaller - 55869059 read(s) filtered by: ((((((((MappingQualityReadFilter AND MappingQualityAvailableReadFilter) AND MappedReadFilter) AND NotSecondaryAlignmentReadFilter) AND NotDuplicateReadFilter) AND PassesVendorQualityCheckReadFilter) AND NonZeroReferenceLengthAlignmentReadFilter) AND GoodCigarReadFilter) AND WellformedReadFilter)
55869059 read(s) filtered by: (((((((MappingQualityReadFilter AND MappingQualityAvailableReadFilter) AND MappedReadFilter) AND NotSecondaryAlignmentReadFilter) AND NotDuplicateReadFilter) AND PassesVendorQualityCheckReadFilter) AND NonZeroReferenceLengthAlignmentReadFilter) AND GoodCigarReadFilter)
55869059 read(s) filtered by: ((((((MappingQualityReadFilter AND MappingQualityAvailableReadFilter) AND MappedReadFilter) AND NotSecondaryAlignmentReadFilter) AND NotDuplicateReadFilter) AND PassesVendorQualityCheckReadFilter) AND NonZeroReferenceLengthAlignmentReadFilter)
55869059 read(s) filtered by: (((((MappingQualityReadFilter AND MappingQualityAvailableReadFilter) AND MappedReadFilter) AND NotSecondaryAlignmentReadFilter) AND NotDuplicateReadFilter) AND PassesVendorQualityCheckReadFilter)
55869059 read(s) filtered by: ((((MappingQualityReadFilter AND MappingQualityAvailableReadFilter) AND MappedReadFilter) AND NotSecondaryAlignmentReadFilter) AND NotDuplicateReadFilter)
47376329 read(s) filtered by: (((MappingQualityReadFilter AND MappingQualityAvailableReadFilter) AND MappedReadFilter) AND NotSecondaryAlignmentReadFilter)
46853127 read(s) filtered by: ((MappingQualityReadFilter AND MappingQualityAvailableReadFilter) AND MappedReadFilter)
46853127 read(s) filtered by: (MappingQualityReadFilter AND MappingQualityAvailableReadFilter)
46853127 read(s) filtered by: MappingQualityReadFilter
523202 read(s) filtered by: NotSecondaryAlignmentReadFilter
8492730 read(s) filtered by: NotDuplicateReadFilter
19:03:29.184 INFO ProgressMeter - pseudochrom_23:13859898 24.6 83586 3395.1
19:03:29.184 INFO ProgressMeter - Traversal complete. Processed 83586 total regions in 24.6 minutes.
19:03:30.381 INFO VectorLoglessPairHMM - Time spent in setup for JNI call : 0.0
19:03:30.381 INFO PairHMM - Total compute time in PairHMM computeLogLikelihoods() : 0.0
19:03:30.381 INFO SmithWatermanAligner - Total compute time in java Smith-Waterman : 0.00 sec
19:03:30.381 INFO HaplotypeCaller - Shutting down engine
[July 4, 2018 7:03:30 PM CST] org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller done. Elapsed time: 24.73 minutes.
Runtime.totalMemory()=372873625
and the result *.gvcf:
##fileformat=VCFv4.2
##ALT=<ID=NON_REF,Description="Represents any possible alternative allele at this location">
##FILTER=<ID=LowQual,Description="Low quality">
##FORMAT=<ID=AD,Number=R,Type=Integer,Description="Allelic depths for the ref and alt alleles in the order listed">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth (reads with MQ=255 or with bad mates are filtered)">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=MIN_DP,Number=1,Type=Integer,Description="Minimum DP observed within the GVCF block">
##FORMAT=<ID=PGT,Number=1,Type=String,Description="Physical phasing haplotype information, describing how the alternate alleles are phased in relation to one another">
##FORMAT=<ID=PID,Number=1,Type=String,Description="Physical phasing ID information, where each unique ID within a given sample (but not across samples) connects records within a phasing group">
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Normalized, Phred-scaled likelihoods for genotypes as defined in the VCF specification">
##FORMAT=<ID=SB,Number=4,Type=Integer,Description="Per-sample component statistics which comprise the Fisher's Exact Test to detect strand bias.">
##GATKCommandLine=<ID=HaplotypeCaller,CommandLine="HaplotypeCaller --dbsnp /path/to/dbsnp/sample.dbsnp.vcf --emit-ref-confidence GVCF --output /path/to/result/sample.g.vcf --input /path/to/BQSR/sample.bqsr.bam --reference /path/to/index/chrom23.fasta --TMP_DIR /path/to/tmp --annotation-group StandardAnnotation --annotation-group StandardHCAnnotation --disable-tool-default-annotations false --gvcf-gq-bands 1 --gvcf-gq-bands 2 --gvcf-gq-bands 3 --gvcf-gq-bands 4 --gvcf-gq-bands 5 --gvcf-gq-bands 6 --gvcf-gq-bands 7 --gvcf-gq-bands 8 --gvcf-gq-bands 9 --gvcf-gq-bands 10 --gvcf-gq-bands 11 --gvcf-gq-bands 12 --gvcf-gq-bands 13 --gvcf-gq-bands 14 --gvcf-gq-bands 15 --gvcf-gq-bands 16 --gvcf-gq-bands 17 --gvcf-gq-bands 18 --gvcf-gq-bands 19 --gvcf-gq-bands 20 --gvcf-gq-bands 21 --gvcf-gq-bands 22 --gvcf-gq-bands 23 --gvcf-gq-bands 24 --gvcf-gq-bands 25 --gvcf-gq-bands 26 --gvcf-gq-bands 27 --gvcf-gq-bands 28 --gvcf-gq-bands 29 --gvcf-gq-bands 30 --gvcf-gq-bands 31 --gvcf-gq-bands 32 --gvcf-gq-bands 33 --gvcf-gq-bands 34 --gvcf-gq-bands 35 --gvcf-gq-bands 36 --gvcf-gq-bands 37 --gvcf-gq-bands 38 --gvcf-gq-bands 39 --gvcf-gq-bands 40 --gvcf-gq-bands 41 --gvcf-gq-bands 42 --gvcf-gq-bands 43 --gvcf-gq-bands 44 --gvcf-gq-bands 45 --gvcf-gq-bands 46 --gvcf-gq-bands 47 --gvcf-gq-bands 48 --gvcf-gq-bands 49 --gvcf-gq-bands 50 --gvcf-gq-bands 51 --gvcf-gq-bands 52 --gvcf-gq-bands 53 --gvcf-gq-bands 54 --gvcf-gq-bands 55 --gvcf-gq-bands 56 --gvcf-gq-bands 57 --gvcf-gq-bands 58 --gvcf-gq-bands 59 --gvcf-gq-bands 60 --gvcf-gq-bands 70 --gvcf-gq-bands 80 --gvcf-gq-bands 90 --gvcf-gq-bands 99 --indel-size-to-eliminate-in-ref-model 10 --use-alleles-trigger false --disable-optimizations false --just-determine-active-regions false --dont-genotype false --dont-trim-active-regions false --max-disc-ar-extension 25 --max-gga-ar-extension 300 --padding-around-indels 150 --padding-around-snps 20 --kmer-size 10 --kmer-size 25 --dont-increase-kmer-sizes-for-cycles false 
--allow-non-unique-kmers-in-ref false --num-pruning-samples 1 --recover-dangling-heads false --do-not-recover-dangling-branches false --min-dangling-branch-length 4 --consensus false --max-num-haplotypes-in-population 128 --error-correct-kmers false --min-pruning 2 --debug-graph-transformations false --kmer-length-for-read-error-correction 25 --min-observations-for-kmer-to-be-solid 20 --likelihood-calculation-engine PairHMM --base-quality-score-threshold 18 --pair-hmm-gap-continuation-penalty 10 --pair-hmm-implementation FASTEST_AVAILABLE --pcr-indel-model CONSERVATIVE --phred-scaled-global-read-mismapping-rate 45 --native-pair-hmm-threads 4 --native-pair-hmm-use-double-precision false --debug false --use-filtered-reads-for-annotations false --bam-writer-type CALLED_HAPLOTYPES --dont-use-soft-clipped-bases false --capture-assembly-failure-bam false --error-correct-reads false --do-not-run-physical-phasing false --min-base-quality-score 10 --smith-waterman JAVA --use-new-qual-calculator false --annotate-with-num-discovered-alleles false --heterozygosity 0.001 --indel-heterozygosity 1.25E-4 --heterozygosity-stdev 0.01 --standard-min-confidence-threshold-for-calling 10.0 --max-alternate-alleles 6 --max-genotype-count 1024 --sample-ploidy 2 --genotyping-mode DISCOVERY --genotype-filtered-alleles false --contamination-fraction-to-filter 0.0 --output-mode EMIT_VARIANTS_ONLY --all-site-pls false --min-assembly-region-size 50 --max-assembly-region-size 300 --assembly-region-padding 100 --max-reads-per-alignment-start 50 --active-probability-threshold 0.002 --max-prob-propagation-distance 50 --interval-set-rule UNION --interval-padding 0 --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-bam-index true --create-output-bam-md5 false --create-output-variant-index true --create-output-variant-md5 false --lenient false 
--add-output-sam-program-record true --add-output-vcf-command-line true --cloud-prefetch-buffer 40 --cloud-index-prefetch-buffer -1 --disable-bam-index-caching false --help false --version false --showHidden false --verbosity INFO --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --disable-tool-default-read-filters false --minimum-mapping-quality 20",Version=4.0.4.0,Date="July 4, 2018 6:38:51 PM CST">
##GVCFBlock0-1=minGQ=0(inclusive),maxGQ=1(exclusive)
##GVCFBlock1-2=minGQ=1(inclusive),maxGQ=2(exclusive)
##GVCFBlock10-11=minGQ=10(inclusive),maxGQ=11(exclusive)
##GVCFBlock11-12=minGQ=11(inclusive),maxGQ=12(exclusive)
##GVCFBlock12-13=minGQ=12(inclusive),maxGQ=13(exclusive)
##GVCFBlock13-14=minGQ=13(inclusive),maxGQ=14(exclusive)
##GVCFBlock14-15=minGQ=14(inclusive),maxGQ=15(exclusive)
##GVCFBlock15-16=minGQ=15(inclusive),maxGQ=16(exclusive)
##GVCFBlock16-17=minGQ=16(inclusive),maxGQ=17(exclusive)
##GVCFBlock17-18=minGQ=17(inclusive),maxGQ=18(exclusive)
##GVCFBlock18-19=minGQ=18(inclusive),maxGQ=19(exclusive)
##GVCFBlock19-20=minGQ=19(inclusive),maxGQ=20(exclusive)
##GVCFBlock2-3=minGQ=2(inclusive),maxGQ=3(exclusive)
##GVCFBlock20-21=minGQ=20(inclusive),maxGQ=21(exclusive)
##GVCFBlock21-22=minGQ=21(inclusive),maxGQ=22(exclusive)
##GVCFBlock22-23=minGQ=22(inclusive),maxGQ=23(exclusive)
##GVCFBlock23-24=minGQ=23(inclusive),maxGQ=24(exclusive)
##GVCFBlock24-25=minGQ=24(inclusive),maxGQ=25(exclusive)
##GVCFBlock25-26=minGQ=25(inclusive),maxGQ=26(exclusive)
##GVCFBlock26-27=minGQ=26(inclusive),maxGQ=27(exclusive)
##GVCFBlock27-28=minGQ=27(inclusive),maxGQ=28(exclusive)
##GVCFBlock28-29=minGQ=28(inclusive),maxGQ=29(exclusive)
##GVCFBlock29-30=minGQ=29(inclusive),maxGQ=30(exclusive)
##GVCFBlock3-4=minGQ=3(inclusive),maxGQ=4(exclusive)
##GVCFBlock30-31=minGQ=30(inclusive),maxGQ=31(exclusive)
##GVCFBlock31-32=minGQ=31(inclusive),maxGQ=32(exclusive)
##GVCFBlock32-33=minGQ=32(inclusive),maxGQ=33(exclusive)
##GVCFBlock33-34=minGQ=33(inclusive),maxGQ=34(exclusive)
##GVCFBlock34-35=minGQ=34(inclusive),maxGQ=35(exclusive)
##GVCFBlock35-36=minGQ=35(inclusive),maxGQ=36(exclusive)
##GVCFBlock36-37=minGQ=36(inclusive),maxGQ=37(exclusive)
##GVCFBlock37-38=minGQ=37(inclusive),maxGQ=38(exclusive)
##GVCFBlock38-39=minGQ=38(inclusive),maxGQ=39(exclusive)
##GVCFBlock39-40=minGQ=39(inclusive),maxGQ=40(exclusive)
##GVCFBlock4-5=minGQ=4(inclusive),maxGQ=5(exclusive)
##GVCFBlock40-41=minGQ=40(inclusive),maxGQ=41(exclusive)
##GVCFBlock41-42=minGQ=41(inclusive),maxGQ=42(exclusive)
##GVCFBlock42-43=minGQ=42(inclusive),maxGQ=43(exclusive)
##GVCFBlock43-44=minGQ=43(inclusive),maxGQ=44(exclusive)
##GVCFBlock44-45=minGQ=44(inclusive),maxGQ=45(exclusive)
##GVCFBlock45-46=minGQ=45(inclusive),maxGQ=46(exclusive)
##GVCFBlock46-47=minGQ=46(inclusive),maxGQ=47(exclusive)
##GVCFBlock47-48=minGQ=47(inclusive),maxGQ=48(exclusive)
##GVCFBlock48-49=minGQ=48(inclusive),maxGQ=49(exclusive)
##GVCFBlock49-50=minGQ=49(inclusive),maxGQ=50(exclusive)
##GVCFBlock5-6=minGQ=5(inclusive),maxGQ=6(exclusive)
##GVCFBlock50-51=minGQ=50(inclusive),maxGQ=51(exclusive)
##GVCFBlock51-52=minGQ=51(inclusive),maxGQ=52(exclusive)
##GVCFBlock52-53=minGQ=52(inclusive),maxGQ=53(exclusive)
##GVCFBlock53-54=minGQ=53(inclusive),maxGQ=54(exclusive)
##GVCFBlock54-55=minGQ=54(inclusive),maxGQ=55(exclusive)
##GVCFBlock55-56=minGQ=55(inclusive),maxGQ=56(exclusive)
##GVCFBlock56-57=minGQ=56(inclusive),maxGQ=57(exclusive)
##GVCFBlock57-58=minGQ=57(inclusive),maxGQ=58(exclusive)
##GVCFBlock58-59=minGQ=58(inclusive),maxGQ=59(exclusive)
##GVCFBlock59-60=minGQ=59(inclusive),maxGQ=60(exclusive)
##GVCFBlock6-7=minGQ=6(inclusive),maxGQ=7(exclusive)
##GVCFBlock60-70=minGQ=60(inclusive),maxGQ=70(exclusive)
##GVCFBlock7-8=minGQ=7(inclusive),maxGQ=8(exclusive)
##GVCFBlock70-80=minGQ=70(inclusive),maxGQ=80(exclusive)
##GVCFBlock8-9=minGQ=8(inclusive),maxGQ=9(exclusive)
##GVCFBlock80-90=minGQ=80(inclusive),maxGQ=90(exclusive)
##GVCFBlock9-10=minGQ=9(inclusive),maxGQ=10(exclusive)
##GVCFBlock90-99=minGQ=90(inclusive),maxGQ=99(exclusive)
##GVCFBlock99-100=minGQ=99(inclusive),maxGQ=100(exclusive)
##INFO=<ID=BaseQRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt Vs. Ref base qualities">
##INFO=<ID=ClippingRankSum,Number=1,Type=Float,Description="Z-score From Wilcoxon rank sum test of Alt vs. Ref number of hard clipped bases">
##INFO=<ID=DB,Number=0,Type=Flag,Description="dbSNP Membership">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth; some reads may have been filtered">
##INFO=<ID=DS,Number=0,Type=Flag,Description="Were any of the samples downsampled?">
##INFO=<ID=END,Number=1,Type=Integer,Description="Stop position of the interval">
##INFO=<ID=ExcessHet,Number=1,Type=Float,Description="Phred-scaled p-value for exact test of excess heterozygosity">
##INFO=<ID=InbreedingCoeff,Number=1,Type=Float,Description="Inbreeding coefficient as estimated from the genotype likelihoods per-sample when compared against the Hardy-Weinberg expectation">
##INFO=<ID=MLEAC,Number=A,Type=Integer,Description="Maximum likelihood expectation (MLE) for the allele counts (not necessarily the same as the AC), for each ALT allele, in the same order as listed">
##INFO=<ID=MLEAF,Number=A,Type=Float,Description="Maximum likelihood expectation (MLE) for the allele frequency (not necessarily the same as the AF), for each ALT allele, in the same order as listed">
##INFO=<ID=MQ,Number=1,Type=Float,Description="RMS Mapping Quality">
##INFO=<ID=MQRankSum,Number=1,Type=Float,Description="Z-score From Wilcoxon rank sum test of Alt vs. Ref read mapping qualities">
##INFO=<ID=RAW_MQ,Number=1,Type=Float,Description="Raw data for RMS Mapping Quality">
##INFO=<ID=ReadPosRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt vs. Ref read position bias">
##contig=<ID=pseudochrom_23,length=13860564>
##source=HaplotypeCaller
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT CL100020307_L01_17
pseudochrom_23 1 . A <NON_REF> . . END=13860564 GT:DP:GQ:MIN_DP:PL 0/0:0:0:0:0,0,0
I don't know whether it is reasonable to assume that there must be some variation, given that the dbSNP VCF file contains 11733 variants. Even if there is no variation, I would expect HaplotypeCaller to output records for all positions, like the one at position 1. But there is nothing else.
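As a rough check on the log above, the per-filter read counts can be summed and compared with the cumulative figure of 55869059 reported for the combined filter. This is only a sketch: the three input lines are copied from the log, while the function name `per_filter_counts` is made up for illustration. Whether these counts account for every read in the BAM cannot be told from the log alone.

```python
import re

# Per-filter lines copied from the log above (the cumulative "AND" lines are left out).
LOG_LINES = """\
46853127 read(s) filtered by: MappingQualityReadFilter
523202 read(s) filtered by: NotSecondaryAlignmentReadFilter
8492730 read(s) filtered by: NotDuplicateReadFilter
"""


def per_filter_counts(text):
    """Return {filter_name: count} for lines that name a single read filter."""
    counts = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\d+) read\(s\) filtered by: (\w+)\s*$", line)
        if m:
            counts[m.group(2)] = int(m.group(1))
    return counts


counts = per_filter_counts(LOG_LINES)
total = sum(counts.values())
for name, n in counts.items():
    print(f"{name}: {n} ({n / total:.1%} of filtered reads)")
print("sum of per-filter counts:", total)
```

The sum matches the cumulative count in the log, and most of the filtered reads were removed by MappingQualityReadFilter (the command line shows --minimum-mapping-quality 20), so the mapping qualities in the input BAM may be worth inspecting.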
a question about running HaplotypeCaller with intervals
Hi,
I have a question when running HaplotypeCaller functions with intervals on exome-seq data.
Here is the command I used:
java -jar gatk-package-4.0.6.0-local.jar HaplotypeCaller -R /espresso/share/genomes/hg38/genome.fa -I recal_reads.bam -O variants.g.vcf -ERC GVCF -L capture.bed
However, when I ran the command, I got the following message:
17:13:14.439 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gatk-4.0.6.0/gatk-package-4.0.6.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
17:13:14.591 INFO HaplotypeCaller - ------------------------------------------------------------
17:13:14.591 INFO HaplotypeCaller - The Genome Analysis Toolkit (GATK) v4.0.6.0
17:13:14.591 INFO HaplotypeCaller - For support and documentation go to https://software.broadinstitute.org/gatk/
17:13:14.591 INFO HaplotypeCaller - Executing as ... on Linux v2.6.32-431.29.2.el6.x86_64 amd64
17:13:14.592 INFO HaplotypeCaller - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_121-b13
17:13:14.592 INFO HaplotypeCaller - Start Date/Time: July 16, 2018 5:13:14 PM EDT
17:13:14.592 INFO HaplotypeCaller - ------------------------------------------------------------
17:13:14.592 INFO HaplotypeCaller - ------------------------------------------------------------
17:13:14.592 INFO HaplotypeCaller - HTSJDK Version: 2.16.0
17:13:14.592 INFO HaplotypeCaller - Picard Version: 2.18.7
17:13:14.592 INFO HaplotypeCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 2
17:13:14.592 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
17:13:14.592 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
17:13:14.592 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
17:13:14.593 INFO HaplotypeCaller - Deflater: IntelDeflater
17:13:14.593 INFO HaplotypeCaller - Inflater: IntelInflater
17:13:14.593 INFO HaplotypeCaller - GCS max retries/reopens: 20
17:13:14.593 INFO HaplotypeCaller - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
17:13:14.593 INFO HaplotypeCaller - Initializing engine
17:13:15.037 INFO FeatureManager - Using codec BEDCodec to read file file:///capture.bed
17:13:16.883 INFO IntervalArgumentCollection - Processing 64190747 bp from intervals
17:13:17.009 INFO HaplotypeCaller - Shutting down engine
[July 16, 2018 5:13:17 PM EDT] org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller done. Elapsed time: 0.04 minutes.
Runtime.totalMemory()=2041053184
java.lang.NullPointerException
at java.util.ComparableTimSort.countRunAndMakeAscending(ComparableTimSort.java:325)
at java.util.ComparableTimSort.sort(ComparableTimSort.java:202)
at java.util.Arrays.sort(Arrays.java:1312)
at java.util.Arrays.sort(Arrays.java:1506)
at java.util.ArrayList.sort(ArrayList.java:1454)
at java.util.Collections.sort(Collections.java:141)
at org.broadinstitute.hellbender.utils.IntervalUtils.sortAndMergeIntervals(IntervalUtils.java:459)
at org.broadinstitute.hellbender.utils.IntervalUtils.getIntervalsWithFlanks(IntervalUtils.java:956)
at org.broadinstitute.hellbender.utils.IntervalUtils.getIntervalsWithFlanks(IntervalUtils.java:971)
at org.broadinstitute.hellbender.engine.MultiIntervalLocalReadShard.<init>(MultiIntervalLocalReadShard.java:59)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.makeReadShards(AssemblyRegionWalker.java:195)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.onStartup(AssemblyRegionWalker.java:175)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:133)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:180)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:199)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
I did not see any error but it seems HaplotypeCaller did not run and there is no output.
So I will really appreciate it if I can get help from you guys.
Thank you!
Best,
Siyu
can VariantsToTable output the raw genotype call (i.e., 0/1) rather than the actual basecall (A/T)?
I'm interested in getting simple "heterozygous" or "homozygous" designations for all of the samples/SNPs in my multisample VCF file. In the past, I have been using the -GF GT option in VariantsToTable, and then annotating my basecalls in Excel as either heterozygous or homozygous. This takes forever since Excel isn't really built for big data like this. Is there a simple way to output all of the SNPs as 0/1, 0/0, 0/1, or 1/1 instead of C/A, A/A, G/T, C/C?
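If the table has already been exported with base calls, the heterozygous/homozygous labels can be derived with a short script instead of Excel. The following is a minimal sketch, assuming a tab-separated table with one header row and genotype columns such as `C/A` or `A|A`. The function names and the example table are made up for illustration and are not part of GATK.

```python
import re


def zygosity(gt):
    """Classify a genotype string such as 'C/A', 'A|A', '0/1' or './.'."""
    alleles = re.split(r"[/|]", gt.strip())
    if any(a in (".", "") for a in alleles):
        return "missing"
    return "homozygous" if len(set(alleles)) == 1 else "heterozygous"


def classify_table(lines, first_genotype_column):
    """Yield tab-separated rows with genotype columns replaced by zygosity labels."""
    for i, line in enumerate(lines):
        fields = line.rstrip("\n").split("\t")
        if i > 0:  # keep the header row unchanged
            fields[first_genotype_column:] = [
                zygosity(g) for g in fields[first_genotype_column:]
            ]
        yield "\t".join(fields)


# Made-up example table (not real data): two sites, two samples.
example = [
    "CHROM\tPOS\tsampleA.GT\tsampleB.GT",
    "chr1\t100\tC/A\tA/A",
    "chr1\t200\tG/T\t./.",
]
for row in classify_table(example, first_genotype_column=2):
    print(row)
```

Because the classification only compares the alleles within each call, it works the same way on base calls (`C/A`) and on numeric genotypes (`0/1`).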
Short read data in highly repetitive genomic region for heterozygous individuals
Hello GATK team,
This might be a very general question, but I would appreciate your input. I am working with natural populations of plants (the individuals are expected to be highly heterozygous) and an enriched genomic region that contains some promoters of interest together with transposons, duplications, and many expected indels and SVs, including a potential paralog for one of our BACs. Unfortunately the long-read sequencing is not yet ready, so I am using the 2x75 bp data with our BAC sequences as references to test how close we can get with HaplotypeCaller to SNP and short indel calls for an association analysis. Our coverage distribution seems to be heavily biased towards areas with duplications and potential TEs, and most of the assemblers based on local assembly are thrown off by our data. I have used very strict mapping parameters to avoid problems with misaligned reads, given that we can't rule out hyper-variable regions.
I understand that aiming for genotype calls is dangerous given our kind of data and the lack of a reference genome, so I am aiming to include the genotype likelihoods in the association analysis. With HaplotypeCaller I get a VCF file for my population with associated PL values. My question is basically whether, given our type of data, you think the local assembly inherent to HaplotypeCaller will produce false-positive variants in the final output. Do you have any suggestions or alternative tools to get genotype likelihoods (without local assembly?) and feed them into an association analysis tool?
I really appreciate your insight.
Best,
Distribution of RGQ scores
I work with non-human genomes and commonly need the confidence of the reference sites, so I was happy to see the inclusion of the RGQ score in the format field of GenotypeGVCFs. However, I am a little confused as to what this score means (how it is calculated). Out of curiosity I plotted the distribution of RGQ and GQ scores over ~1Mbp. A few things jumped out that I was hoping you could explain:
(1) There are two peaks of GQ and RGQ scores: one at 99, which is obviously just the highest confidence score, and another at exactly GQ/RGQ=45. You can see this in the GQ/RGQ distribution below, from which I've excluded the sites where RGQ/GQ = 0 or 99 (RGQ = blue, GQ = red). Is there some reason why so many calls have a score of exactly 45?
(2) There are very few GQ = 0 calls and ~96% are GQ=99 - but in the RGQ ~42% == 0 and 54%=99. Is there any explanation why so many RGQ scores == 0? I fear that filtering on RGQ will bias the data against reference calls and include a disproportionate number of variant calls.
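For anyone who wants to reproduce this kind of distribution, the scores can be tallied straight from the VCF text. This is a minimal sketch, assuming an uncompressed, tab-separated VCF with a FORMAT column and one or more sample columns. The function name `tally_quality` and the two example records are made up for illustration.

```python
from collections import Counter


def tally_quality(vcf_lines, keys=("GQ", "RGQ")):
    """Count values of the given FORMAT keys across all samples and records."""
    tallies = {k: Counter() for k in keys}
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # no FORMAT/sample columns
        fmt = fields[8].split(":")
        for sample in fields[9:]:
            values = sample.split(":")
            for key in keys:
                if key in fmt:
                    idx = fmt.index(key)
                    if idx < len(values) and values[idx] not in (".", ""):
                        tallies[key][int(values[idx])] += 1
    return tallies


# Made-up records (not real data): one variant site with GQ, one reference site with RGQ.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1\ts2",
    "chr1\t10\t.\tA\tG\t50\t.\t.\tGT:GQ\t0/1:99\t0/0:45",
    "chr1\t11\t.\tC\t.\t.\t.\t.\tGT:RGQ\t0/0:0\t0/0:99",
]
tallies = tally_quality(example)
for key, counter in tallies.items():
    print(key, dict(sorted(counter.items())))
```

The resulting counters can be passed to any plotting library, or the fraction of sites at 0, 45, and 99 can be computed directly from them.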
Issue of Haplotype call on a large chromosome (>536 Mb)
Hi
I tried to run HaplotypeCaller in GVCF mode. My reference genome is over 5 Gb in size. Below are my command and the error:
Using GATK jar /source/gatk-4.0.6.0/gatk-package-4.0.6.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -XX:+UseSerialGC -Xmx100g -jar /source/gatk-4.0.6.0/gatk-package-4.0.6.0-local.jar HaplotypeCaller -R /data/Pseudomolecule_v3.fasta -L /IntervalFiles/0003-scattered.intervals -I WGS_FTNO.cram -O result/0003-scattered.vcf.gz -mbq 20 --native-pair-hmm-threads 4 -ERC GVCF --verbosity ERROR
[August 1, 2018 11:32:11 AM CEST] org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=2076049408
htsjdk.samtools.SAMException: Exception creating BAM index for slice slice: seqID 1, start 536834320, span 457789, records 259850.
at htsjdk.samtools.CRAMBAIIndexer.processSingleReferenceSlice(CRAMBAIIndexer.java:194)
at htsjdk.samtools.cram.CRAIIndex.openCraiFileAsBaiStream(CRAIIndex.java:180)
at htsjdk.samtools.SamIndexes.asBaiSeekableStreamOrNull(SamIndexes.java:78)
at htsjdk.samtools.CRAMFileReader.initWithStreams(CRAMFileReader.java:228)
at htsjdk.samtools.CRAMFileReader.<init>(CRAMFileReader.java:219)
at htsjdk.samtools.SamReaderFactory$SamReaderFactoryImpl.open(SamReaderFactory.java:422)
at htsjdk.samtools.SamReaderFactory.open(SamReaderFactory.java:105)
at org.broadinstitute.hellbender.engine.ReadsDataSource.<init>(ReadsDataSource.java:227)
at org.broadinstitute.hellbender.engine.ReadsDataSource.<init>(ReadsDataSource.java:162)
at org.broadinstitute.hellbender.engine.GATKTool.initializeReads(GATKTool.java:387)
at org.broadinstitute.hellbender.engine.GATKTool.onStartup(GATKTool.java:636)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.onStartup(AssemblyRegionWalker.java:156)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:133)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:180)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:199)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 32770
at htsjdk.samtools.CRAMBAIIndexer$BAMIndexBuilder.processSingleReferenceSlice(CRAMBAIIndexer.java:354)
at htsjdk.samtools.CRAMBAIIndexer$BAMIndexBuilder.access$100(CRAMBAIIndexer.java:227)
at htsjdk.samtools.CRAMBAIIndexer.processSingleReferenceSlice(CRAMBAIIndexer.java:192)
... 17 more
Does GATK4 handle a large single chromosome? Is there any solution?
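One possible reading of the stack trace, which is an assumption and not confirmed anywhere in this thread: the trace passes through `openCraiFileAsBaiStream`, so the CRAM index is apparently being converted to a BAI-style index, and the BAI binning scheme defined in the SAM specification only covers coordinates up to 2^29 (536,870,912). The sketch below checks the slice reported in the exception against that limit. The function name is made up for illustration.

```python
# Largest coordinate covered by the BAI binning scheme (2^29 = 536,870,912).
BAI_MAX_COORDINATE = 2 ** 29


def exceeds_bai_limit(start, span):
    """True if a slice starting at `start` (1-based) and covering `span` bases passes the BAI limit."""
    end = start + span - 1
    return end > BAI_MAX_COORDINATE


# Values taken from the exception message above.
start, span = 536834320, 457789
print("slice end:", start + span - 1)
print("exceeds BAI limit:", exceeds_bai_limit(start, span))
```

The slice starts just below the limit and ends beyond it, which would be consistent with the failure appearing only on chromosomes longer than about 536 Mb.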
Mutect2 missed variant called by HaplotypeCaller
Hi,
I am running GATK 3.5.0 with Java version 1.8.0. I have two cell line samples that I paired with a Promega baseline reference (it's essentially a mixed germline sample) to run Mutect2 (which I am aware is not part of the Best Practices). I also ran the tumour sample alone using HaplotypeCaller and noticed a very clear ALK variant that was missed by Mutect2 but called by HaplotypeCaller in both samples. Due to the nature of the cell line we also expected to see an ALK variant, which is why it was detected.
What I find odd is that the local reassembly of Mutect2 seems to have discarded the variant, as the bamout does not contain the variant (C > T) at locus chr2:29443695, whereas the HaplotypeCaller call does for both samples. I have read through the documentation and the specifics of the local reassembly and would be very interested in knowing at what stage this occurs and your suggestions on what can be done.
I will be trying GATK v4.0 as well as some of the things mentioned here: https://software.broadinstitute.org/gatk/documentation/article?id=1235. In the meantime I would be very grateful if someone could look into this. I will be posting updates on my new tests as well. See details below on various metrics and IGV screenshots.
The chemistry is a DNA capture Kapa hyperplus kit, 75 paired end reads.
Sample 945
- Entire ALK covered up to 80X
- Mean/min coverage 1013/378
- BWA bam shows 50% allele frequency
HaplotypeCaller line Sample 945
- chr2 29443695 . G T 8496.77 . AC=1;AF=0.500;AN=2;BaseQRankSum=5.863;ClippingRankSum=-0.368;DP=601;ExcessHet=3.0103;FS=0.536;MLEAC=1;MLEAF=0.500;MQ=62.46;MQRankSum=1.113;QD=14.21;ReadPosRankSum=0.502;SOR=0.76 GT:AD:DP:GQ:PL 0/1:300,298:598:99:8525,0,8240
Sample 946
- Entire ALK covered up to 80x
- Mean/min coverage 523/204
- BWA bam shows 49% allele frequency
HaplotypeCaller line Sample 946
- chr2 29443695 . G T 5056.77 . AC=1;AF=0.500;AN=2;BaseQRankSum=3.569;ClippingRankSum=-0.212;DP=397;ExcessHet=3.0103;FS=2.133;MLEAC=1;MLEAF=0.500;MQ=63.61;MQRankSum=-1.274;QD=13.00;ReadPosRankSum=0.063;SOR=0.595 GT:AD:DP:GQ:PL 0/1:199,190:389:99:5085,0,5319
Promega control sample
- Same control sample used as pair for both 945 and 946 using Mutect
- Coverage around ALK region ~200+
Please see IGV images of the various cases below. The --bamout command (run together with disabling optimizations and forcing output) was run with 500 bp of padding downstream and upstream of the target that contains the variant (i.e. the actual padding upstream and downstream of the variant itself at locus 29443695 will be slightly more than 500 bp). I also ran Mutect with the adjusted 500 bp padding but included all the targets in chr2, without adding any padding to targets other than the one that contains the variant.
Sample945_bwaBAM - Bam output from BWA
Sample946_bwaBAM - Bam output from BWA
Sample945_GATKForcedBamOut
Sample946_GATKForcedBamOut
Sample945_MutectForcedBamOutChr2
Sample946_MutectForcedBamOutChr2
Sample945_MutectForcedBamOutALKOnly
Sample946_MutectForcedBamOutALKOnly
Thank you very much. I look forward to hearing your thoughts on this.
Sabri
HaplotypeCaller calls incorrect genotypes at several sites
Hi,
I found that HaplotypeCaller made heterozygous calls where there is no support for them in the BAM. We used IGV to compare the input BAM and the HaplotypeCaller output BAM. The region shown in the figure confused us. The top of the figure shows the input BAM, and below it is the HaplotypeCaller output BAM. The HaplotypeCaller output GVCF also suggests this site is heterozygous.
It seems to be the same issue as https://gatkforums.broadinstitute.org/gatk/discussion/2319/haplotypecaller-incorrectly-making-heterozygous-calls-again. In that thread, your suggested solution was to update GATK. However, we used GATK 3.8 and GATK 4.0.6 and got the same results.
The command line we used is:
~/software/gatk-4.0.6.0/gatk --java-options "-Xmx30G" HaplotypeCaller -L chr01:9550000-9850000 -ERC GVCF -R -I -O <output_g.vcf> -bamout
Quality of mutation by constructing haplotypes
Hi there,
I have two questions:
1. Can I construct haplotypes using GATK HaplotypeCaller?
2. How can I check the quality of a mutation using haplotypes?
Is there any way to take vcf data and output 2 fastas - one of each of the sample's alleles?
I looked at using the ReadBackedPhasing tool or HaplotypeCaller, but I already have a calling pipeline set up that works well with my data, and I'm really just looking for a way to leverage the VCFs I generate to make consensus FASTAs of each allele. The sample data is diploid. Currently I export to a FASTA with ambiguity codes and use DnaSP to generate the allele FASTAs, but I know there's got to be a good way to leverage that VCF information.
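To make the idea concrete, here is a minimal sketch of building two allele sequences from SNP genotypes, assuming the reference sequence and the variant records are already loaded in memory. It handles SNPs only, the function name and example data are made up for illustration, and it is not a GATK feature. Note that with unphased genotypes (`0/1`) the assignment of alleles to sequence 1 or 2 is arbitrary from site to site, so the outputs are true haplotypes only if the calls are phased (`0|1`).

```python
def apply_snps(reference, snps):
    """Return (allele_1, allele_2) sequences for a diploid sample.

    reference: reference sequence as a string
    snps: iterable of (pos, ref, alts, gt) with 1-based pos, alts as a list,
          and gt such as '0|1' or '0/1'
    """
    hap = [list(reference), list(reference)]
    for pos, ref, alts, gt in snps:
        if len(ref) != 1 or any(len(a) != 1 for a in alts):
            continue  # SNPs only in this sketch; indels would shift coordinates
        if reference[pos - 1] != ref:
            raise ValueError(f"REF mismatch at position {pos}")
        calls = gt.replace("|", "/").split("/")
        if len(calls) != 2 or "." in calls:
            continue  # missing or non-diploid call: keep the reference base
        alleles = [ref] + list(alts)
        for h, call in enumerate(calls):
            hap[h][pos - 1] = alleles[int(call)]
    return "".join(hap[0]), "".join(hap[1])


# Made-up example (not real data): one heterozygous and one homozygous-alt SNP.
reference = "ACGTACGT"
snps = [(2, "C", ["T"], "0|1"), (5, "A", ["G"], "1|1")]
allele_1, allele_2 = apply_snps(reference, snps)
print(">allele_1\n" + allele_1)
print(">allele_2\n" + allele_2)
```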
adapter removal and variant calling in samples with different library prep/pre-processing
Hi,
This question is a mix of good-practice and conceptual doubts. I have a cohort of a non-model organism of approximately 100 animals. 40 have been sequenced at 10x depth on an Illumina 2500 machine and the remaining 60 have been sequenced at 30x depth on an Illumina 4000. The samples sequenced at 30x had their adapters removed during the bcl2fastq conversion stage. Unfortunately, the samples sequenced at 10x did not have their adapters removed. In a FastQC analysis, adapters were found in those samples, but except for one or two samples, the lines did not reach the red zone.
I used BWA-MEM for alignment. Theoretically, the adapters present in the 10x samples get soft-clipped by default as they won't match the reference genome. Hence, I did not remove the adapters from those samples. My aim is to understand the genetic variation amongst those samples, so I followed the germline variant discovery pipeline (SNPs + indels). The questions are:
1) HaplotypeCaller does local reassembly, throws away MAPQ information, and also uses soft-clipped bases for realignment unless '--dontUseSoftClippedBases' is used. During realignment the adapter sequences technically won't align again, so will HaplotypeCaller call SNPs or indels from those regions?
2) Since joint genotyping is done at a later stage, when genotyping is done at a region where an adapter is present in a 10x sample, the adapter won't be found at that region in the 30x samples. Will a lower genotype quality score then be given to that particular locus with an SNP or possibly an indel? I will be filtering positions (setting them to missing) when GQ is less than 40, which may reduce wrongly assigned variants/genotypes.
3) Should I have removed the adapters before performing variant calling? I wanted to keep the pipeline same for all samples and because of my above understanding, I followed my procedure of not removing adapters from 10x depth samples.