Hello,
I am using the new HaplotypeCaller pipeline to obtain a VCF file containing both variant and invariant sites.
For each individual, I called variant and invariant sites with the following command:
java -Xmx300g -jar GenomeAnalysisTK.jar \
-T HaplotypeCaller \
-R ref.fasta \
-I ${INPUT}.bam \
--genotyping_mode DISCOVERY \
-stand_emit_conf 0 \
-stand_call_conf 0 \
-o ${INPUT}_VC.vcf \
--emitRefConfidence BP_RESOLUTION \
--variant_index_type LINEAR \
--variant_index_parameter 128000 \
-nct 16
The VCF that I obtain does contain every position.
The problem is that the QUAL and INFO fields are empty (.) whenever the site is non-variant:
KE332545.1 44 . T <NON_REF> . . . GT:AD:DP:GQ:PL 0/0:13,0:13:39:0,39,503
KE332545.1 45 . T <NON_REF> . . . GT:AD:DP:GQ:PL 0/0:13,0:13:39:0,39,518
KE332545.1 46 . C T,<NON_REF> 0 . BaseQRankSum=-2.270;ClippingRankSum=-0.691;DP=17;MLEAC=0,0;MLEAF=0.00,0.00;MQ=38.98;MQ0=0;MQRankSum=0.099;ReadPosRankSum=0.493 GT:AD:DP:GQ:PL:SB 0/0:11,2,0:13:3:0,3,379,33,385,414:0,0,0,0
KE332545.1 47 . C <NON_REF> . . . GT:AD:DP:GQ:PL 0/0:13,0:13:39:0,39,515
KE332545.1 48 . A <NON_REF> . . . GT:AD:DP:GQ:PL 0/0:13,0:13:39:0,39,540
KE332545.1 49 . C <NON_REF> . . . GT:AD:DP:GQ:PL 0/0:13,0:13:39:0,39,563
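The pattern in these records can be made explicit with a minimal Python sketch. It is only an illustration: the helper names `parse_record` and `lacks_site_annotations` are hypothetical, and it assumes single-sample records with whitespace-separated columns as pasted above (real VCF files are tab-separated).

```python
# Minimal sketch: flag VCF records whose QUAL and INFO are both missing (".").
# Assumption: one sample per record, columns separated by whitespace.

RECORDS = [
    "KE332545.1 45 . T <NON_REF> . . . GT:AD:DP:GQ:PL 0/0:13,0:13:39:0,39,518",
    "KE332545.1 46 . C T,<NON_REF> 0 . BaseQRankSum=-2.270;ClippingRankSum=-0.691;"
    "DP=17;MLEAC=0,0;MLEAF=0.00,0.00;MQ=38.98;MQ0=0;MQRankSum=0.099;"
    "ReadPosRankSum=0.493 GT:AD:DP:GQ:PL:SB 0/0:11,2,0:13:3:0,3,379,33,385,414:0,0,0,0",
]


def parse_record(line):
    """Split one single-sample VCF data line into a dict of its fields."""
    chrom, pos, _id, ref, alt, qual, _filter, info, fmt, sample = line.split()[:10]
    # Pair the FORMAT keys (GT:AD:DP:...) with the sample's values.
    sample_fields = dict(zip(fmt.split(":"), sample.split(":")))
    return {
        "chrom": chrom,
        "pos": int(pos),
        "ref": ref,
        "alt": alt.split(","),
        "qual": qual,
        "info": info,
        "sample": sample_fields,
    }


def lacks_site_annotations(record):
    """True when both site-level columns (QUAL and INFO) are the missing value '.'."""
    return record["qual"] == "." and record["info"] == "."


if __name__ == "__main__":
    for line in RECORDS:
        rec = parse_record(line)
        print(rec["pos"], lacks_site_annotations(rec),
              rec["sample"]["DP"], rec["sample"]["GQ"])
    # prints:
    # 45 True 13 39
    # 46 False 13 3
```

Note that the per-sample DP and GQ values are present in the FORMAT column of every record, even where the site-level QUAL and INFO columns are missing.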
However, I need the QUAL and INFO values for the invariant sites too, so that I can run my filtering pipeline on them as well.
Is there a way to get them?
Thanks!
Muriel