Analyzing the same sample with and without Queue, I noticed that a variant was filtered out in one of the runs, with VQSRTrancheSNP99.00to99.90 in the FILTER column.
While debugging the problem, I noticed that the size of the region passed to HaplotypeCaller with -L can greatly influence both BaseQRankSum and ReadPosRankSum in the g.vcf file.
Commands (identical except for the -L interval):
1)
java -Xmx8g -Djava.io.tmpdir=tmp -jar /com/extra/GATK/3.5/jar-bin/GenomeAnalysisTK.jar -T HaplotypeCaller -I BDD.sorted.markdup.realigned.recal.bam -R ucsc.hg19_chrY_PAR1_PAR2_masked.fasta -L chr5:171333106-177333146 --genotyping_mode DISCOVERY --dbsnp dbsnp_138.hg19.vcf -ERC GVCF -variant_index_type LINEAR -variant_index_parameter 128000 -o BDD.sorted.markdup.realigned.recal.HaplotypeCaller_gVCF_chr5.vcf.gz
2)
java -Xmx8g -Djava.io.tmpdir=tmp -jar /com/extra/GATK/3.5/jar-bin/GenomeAnalysisTK.jar -T HaplotypeCaller -I BDD.sorted.markdup.realigned.recal.bam -R ucsc.hg19_chrY_PAR1_PAR2_masked.fasta -L chr5:175333106-177333146 --genotyping_mode DISCOVERY --dbsnp dbsnp_138.hg19.vcf -ERC GVCF -variant_index_type LINEAR -variant_index_parameter 128000 -o BDD.sorted.markdup.realigned.recal.HaplotypeCaller_gVCF_chr5.vcf.gz
The results for the SNP in question in the g.vcf file:
1)
chr5 176333126 rs2292256 C T,<NON_REF> 5817.77 . BaseQRankSum=0.389;ClippingRankSum=2.280;DB;DP=314;ExcessHet=3.0103;MLEAC=1,0;MLEAF=0.500,0.00;MQRankSum=-1.360;RAW_MQ=1130400.00;ReadPosRankSum=-1.733 GT:AD:DP:GQ:PGT:PID:PL:SB 0/1:154,160,0:314:99:0|1:176333126_C_T:5846,0,7455,6310,7937,14246:53,101,56,104
2)
chr5 176333126 rs2292256 C T,<NON_REF> 5817.77 . BaseQRankSum=-0.254;ClippingRankSum=0.132;DB;DP=314;ExcessHet=3.0103;MLEAC=1,0;MLEAF=0.500,0.00;MQRankSum=-1.278;RAW_MQ=1130400.00;ReadPosRankSum=-1.679 GT:AD:DP:GQ:PGT:PID:PL:SB 0/1:154,160,0:314:99:0|1:176333126_C_T:5846,0,7455,6310,7937,14246:53,101,56,104
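As a quick check, the INFO fields of the two records can be compared key by key. Below is a minimal Python sketch; the helper names parse_info and diff_info are my own for illustration and are not part of GATK, and the two strings are copied from the records above.

```python
# INFO fields copied verbatim from the two g.vcf records above.
info_run1 = (
    "BaseQRankSum=0.389;ClippingRankSum=2.280;DB;DP=314;ExcessHet=3.0103;"
    "MLEAC=1,0;MLEAF=0.500,0.00;MQRankSum=-1.360;RAW_MQ=1130400.00;"
    "ReadPosRankSum=-1.733"
)
info_run2 = (
    "BaseQRankSum=-0.254;ClippingRankSum=0.132;DB;DP=314;ExcessHet=3.0103;"
    "MLEAC=1,0;MLEAF=0.500,0.00;MQRankSum=-1.278;RAW_MQ=1130400.00;"
    "ReadPosRankSum=-1.679"
)


def parse_info(info):
    """Split a VCF INFO string into a dict; flags such as DB map to True."""
    fields = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def diff_info(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key whose value differs."""
    fa, fb = parse_info(a), parse_info(b)
    return {
        key: (fa.get(key), fb.get(key))
        for key in sorted(set(fa) | set(fb))
        if fa.get(key) != fb.get(key)
    }


differences = diff_info(info_run1, info_run2)
for key, (v1, v2) in differences.items():
    print(f"{key}: run1={v1} run2={v2}")
```

For these two records the comparison shows that all four rank-sum annotations (BaseQRankSum, ClippingRankSum, MQRankSum and ReadPosRankSum) differ, while DP, ExcessHet, MLEAC, MLEAF and RAW_MQ are identical.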
This is probably why the SNP was filtered in one run (without Queue) and not in the other (with Queue). That leaves me with two questions: which result is more correct?
And why are these values different in the first place?