Hello,
We are calling variants on data that has been sequenced on NextSeq platform. We have been using the same pipeline , with the same commands since a year and in all our runs we have a control sample to check if the variant calling and sequencing has been done right. For this particular run, one of a known SNP 7:143013285
that was called in the same sample in the previous 5 runs (over the last year) was missed by haplotype caller. On looking at the bam file, the variant seems to be present (highlighted BAM file). Above two bam files are from the same sample that were called in previous runs and HC was able to pick it up. The command I use are as follows
trim_galore -q 0 --paired --fastqc $R1_fastq $R2_fastq --output_dir $FASTQ
bwa mem -M -t 8 $ind.fa $FASTQ/${s_id}.R1_val_1.fq.gz $FASTQ/${s_id}.R2_val_2.fq.gz | sambamba_v0.6.6 view -t 8 -S -h -f bam -o $s_id.bam /dev/stdin
sambamba_v0.6.6 sort -t 8 -o $s_id.sorted.bam $s_id.bam
sambamba_v0.6.6 index -t 8 $s_id.sorted.bam
java -jar $picard AddOrReplaceReadGroups I=$s_id.sorted.bam O=$s_id.sorted.RG.bam SORT_ORDER=coordinate RGID=$s_id RGLB=$flowcell RGPL=illumina RGPU=U RGSM=$RUNNAME
sambamba_v0.6.6 index $s_id.sorted.RG.bam
sambamba_v0.6.6 markdup -t 8 $s_id.sorted.RG.bam $s_id.markdup.bam
sambamba_v0.6.6 index $s_id.markdup.bam
java -Xmx8g -Djava.io.tmpdir=/ionng/tmp -jar $gatk -T BaseRecalibrator \
-I $s_id.markdup.bam \
-R $ind.fa \
-knownSites dbsnp_138.b37.vcf \
-knownSites Mills_and_1000G_gold_standard.indels.b37.vcf \
-knownSites 1000G_phase1.indels.b37.vcf \
-o $s_id.recal_data.table \
-L $bed
#Apply the Recalibration
java -Xmx8g -Djava.io.tmpdir=$TMPDIR -jar $gatk -T PrintReads \
-I $s_id.markdup.bam \
-R $ind.fa \
-BQSR $s_id.recal_data.table \
-o $s_id.${RUNNAME}.variant_ready.bam
java -Xmx32g -jar $gatk -T HaplotypeCaller \
-R $ind.fa --dbsnp $dbsnp_138.b37.vcf \
-I $s_id.${RUNNAME}.variant_ready.bam \
-stand_call_conf 30.0 \
-L $bed \
-o $s_id.${RUNNAME}.g.vcf
Things that I have tried which did not work:-
1. tried running HC with the options -allowNonUniqueKmers
2. also tried options to change parameters -stand_call_conf 2.0 -mmq 5
3. tried -ERC BP_RESOLUTION
that results in
7 143013285 . C <NON_REF> . . . GT:AD:DP:GQ:PL 0/0:9,2:11:0:0,0,152
NOTE: The variant was picked up when I ran FREEBAYES and VARSCAN using default parameters.
FREEBAYES
7 143013285 . C T 206.73 . AB=0.394737;ABP=6.66752;AC=1;AF=0.5;AN=2;AO=15;CIGAR=1X;DP=38;DPB=38;DPRA=0;EPP=20.5268;EPPR=37.093;GTI=0;LEN=1;MEANALT=1;MQM=60;MQMR=60;NS=1;NUMALT=1;ODDS=47.6012;PAIRED=1;PAIREDR=1;PAO=0;PQA=0;PQR=0;PRO=0;QA=495;QR=811;RO=23;RPL=2;RPP=20.5268;RPPR=37.093;RPR=13;RUN=1;SAF=15;SAP=35.5824;SAR=0;SRF=23;SRP=52.9542;SRR=0;TYPE=snp;technology.illumina=1 GT:DP:AD:RO:QR:AO:QA:GL 0/1:38:23,15:23:811:15:495:-33.4265,0,-61.8717
VARSCAN
7 143013285 . C T . PASS ADP=38;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:52:38:38:23:15:39.47%:5.4463E-6:35:33:23:0:15:0
I can email the bamout file if required (though I am not allowed to upload it publicly.)
Any suggestions will be helpful. Thanks.
Thankyou