Channel: haplotypecaller — GATK-Forum

Problem with allele specific annotation AS_QualByDepth (AS_QD) during variant calling


Hi GATK team,

First a big thank you for all your hard work in developing the tool and supporting the users!

I am trying out the allele-specific (AS) annotations in version 3.6. While I have gotten a few other AS annotations to show up properly in my VCF, I am having trouble getting AS_QualByDepth in particular.

For example, I tried to call variants on a few samples at a specific locus with a "T" homopolymer run. I first ran HaplotypeCaller in GVCF mode for each sample:

java -jar GenomeAnalysisTK.jar \
  -T HaplotypeCaller \
  --emitRefConfidence GVCF -variant_index_type LINEAR -variant_index_parameter 128000 \
  -R ref_fasta \
  -I sample_$i \
  -L chr1:10348759-10348801 \
  -A AS_StrandOddsRatio -A AS_FisherStrand -A AS_QualByDepth \
  -A AS_BaseQualityRankSumTest -A AS_ReadPosRankSumTest -A AS_MappingQualityRankSumTest \
  -o sample_$i.gvcf

I then did GenotypeGVCFs on all the samples together:

java -jar GenomeAnalysisTK.jar \
  -T GenotypeGVCFs \
  -R ref_fasta \
  -V gvcf_list \
  -L chr1:10348759-10348801 \
  -A AS_StrandOddsRatio -A AS_FisherStrand -A AS_QualByDepth \
  -A AS_BaseQualityRankSumTest -A AS_ReadPosRankSumTest -A AS_MappingQualityRankSumTest \
  -o out.vcf

In the final joint-called VCF header, the following AS annotations all showed up.

##INFO=<ID=AS_BaseQRankSum,Number=A,Type=Float,Description="allele specific Z-score from Wilcoxon rank sum test of each Alt Vs. Ref base qualities">
##INFO=<ID=AS_FS,Number=A,Type=Float,Description="allele specific phred-scaled p-value using Fisher's exact test to detect strand bias of each alt allele">
##INFO=<ID=AS_MQRankSum,Number=A,Type=Float,Description="Allele-specific Mapping Quality Rank Sum">
##INFO=<ID=AS_QD,Number=1,Type=Float,Description="Allele-specific Variant Confidence/Quality by Depth">
##INFO=<ID=AS_RAW_BaseQRankSum,Number=1,Type=String,Description="raw data for allele specific rank sum test of base qualities">
##INFO=<ID=AS_RAW_MQRankSum,Number=1,Type=String,Description="Allele-specific raw data for Mapping Quality Rank Sum">
##INFO=<ID=AS_RAW_ReadPosRankSum,Number=1,Type=String,Description="allele specific raw data for rank sum test of read position bias">
##INFO=<ID=AS_ReadPosRankSum,Number=A,Type=Float,Description="allele specific Z-score from Wilcoxon rank sum test of each Alt vs. Ref read position bias">
##INFO=<ID=AS_SB_TABLE,Number=1,Type=String,Description="Allele-specific forward/reverse read counts for strand bias tests">
##INFO=<ID=AS_SOR,Number=A,Type=Float,Description="Allele specific strand Odds Ratio of 2x|Alts| contingency table to detect allele specific strand bias">

However, in the INFO column, I only got the other AS annotations but not AS_QD.

chr1    10348779        .       AT      A,ATT   981.29  .       AC=4,2;AF=0.333,0.167;AN=12;AS_BaseQRankSum=-1.087,-2.521;AS_FS=3.986,7.378;AS_MQRankSum=-1.130,-2.349;AS_ReadPosRankSum=-1.192,-1.396;AS_SOR=0.415,0.254;BaseQRankSum=-6.350e-01;ClippingRankSum=0.00;DP=627;ExcessHet=14.6052;FS=6.378;MLEAC=4,2;MLEAF=0.333,0.167;MQ=59.95;MQRankSum=0.00;QD=1.94;ReadPosRankSum=-1.050e-01;SOR=0.352        GT:AD:DP:GQ:PL  0/1:44,9,7:63:81:81,0,1033,93,844,1165  0/1:71,11,8:99:47:47,0,1659,110,1414,1803       0/1:54,15,7:81:99:205,0,1239,280,1087,1635      0/1:69,25,12:106:99:311,0,1603,336,1306,2058    0/2:55,11,22:94:99:291,233,1636,0,943,1294      0/2:61,11,14:91:14:92,14,1473,0,1071,1468

I also checked the individual sample gVCFs. Similarly, AS_QD appears in the header but not in the INFO column. I am wondering whether this might be a bug or whether I am doing something wrong.
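For anyone who wants to reproduce this check on their own files, here is a minimal sketch in plain Python (not part of GATK) that lists INFO IDs declared in the header but never present in any record. The function name `missing_info_keys` and the shortened three-line sample are assumptions made for illustration; a real VCF would be read line by line from disk.

```python
import re


def missing_info_keys(vcf_lines):
    """Return INFO IDs declared in the header that never appear in any record."""
    declared = set()
    seen = set()
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("##INFO="):
            # Header line: pull out the ID=... value.
            match = re.match(r"##INFO=<ID=([^,>]+)", line)
            if match:
                declared.add(match.group(1))
        elif line and not line.startswith("#"):
            # Record line: the INFO column is the 8th tab-separated field.
            fields = line.split("\t")
            if len(fields) < 8:
                continue
            for entry in fields[7].split(";"):
                seen.add(entry.split("=", 1)[0])
    return sorted(declared - seen)


# Hypothetical, shortened sample based on the header and record shown above.
sample = [
    '##INFO=<ID=AS_QD,Number=1,Type=Float,Description="Allele-specific Variant Confidence/Quality by Depth">',
    '##INFO=<ID=AS_FS,Number=A,Type=Float,Description="allele specific phred-scaled p-value">',
    "chr1\t10348779\t.\tAT\tA,ATT\t981.29\t.\tAS_FS=3.986,7.378;QD=1.94",
]
print(missing_info_keys(sample))  # prints ['AS_QD']
```

Run against the full joint-called VCF, the AS_RAW_* entries and AS_SB_TABLE would also be reported, since they are declared in the header above but absent from the record shown.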

Another curious thing I noticed is that in the VCF header, the other AS annotations all have "Number=A" but AS_QD has "Number=1". I don't know whether this might be causing a problem.
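For context, in the VCF format "Number=A" means one value per alternate allele, while "Number=1" means a single value per record. The following is a small sketch, assuming plain Python and a hypothetical helper named `check_number_a`, that tests whether each Number=A key present in an INFO string carries as many values as there are ALT alleles. It says nothing about why AS_QD is absent; it only illustrates the cardinality the header promises.

```python
def check_number_a(alt_field, info_field, number_a_keys):
    """Map each Number=A key found in INFO to whether its value count equals the ALT count."""
    n_alt = len(alt_field.split(","))
    result = {}
    for entry in info_field.split(";"):
        key, sep, value = entry.partition("=")
        # Skip flag-style entries (no "=") and keys we were not asked about.
        if sep and key in number_a_keys:
            result[key] = len(value.split(",")) == n_alt
    return result


# ALT and a shortened INFO string taken from the record shown above.
alt = "A,ATT"
info = "AC=4,2;AS_FS=3.986,7.378;AS_SOR=0.415,0.254;QD=1.94"
print(check_number_a(alt, info, {"AS_FS", "AS_SOR", "AS_QD"}))
# prints {'AS_FS': True, 'AS_SOR': True}
```

AS_QD does not appear in the result because it is not in the INFO string at all, which matches the behaviour described in this post.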

