Calling variants in a small target region

Hi,
This may not directly relate to how GATK works, but I am still asking to get your expert input (I do use GATK for this pipeline).

I am trying to test a variant calling pipeline that I built for a 5 Mb target region. For that I need a test dataset that provides both the fastq files and the VCF files, so that I can run my pipeline on the fastqs and then compare my VCF to the published VCFs. So I used the two whole-exome datasets from Genome in a Bottle (GIAB), with the following steps:

1. To obtain the fastqs that originated from my target region, I took the BAM files published by GIAB and extracted the alignments falling in my target region using samtools.
2. Converted the extracted BAMs for those regions to paired-end fastqs.
3. Aligned those fastqs to the entire genome using BWA.
4. Called the variants in my region of interest using GATK (with the -L option, restricting the analysis to my target region for the RealignerTargetCreator and HaplotypeCaller steps only). I don't do BQSR or VQSR.
5. Compared my variants to the GIAB variants in the target region.
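To make the comparison in the last step concrete, here is a minimal sketch of how such an agreement rate could be computed. It is not the exact method I used: the function names `load_variants` and `concordance` are made up for illustration, the regions are assumed to be 1-based inclusive intervals, and variants are matched by exact (chrom, pos, ref, alt), so differently represented indels would count as mismatches.

```python
def load_variants(vcf_lines, regions):
    """Collect (chrom, pos, ref, alt) keys for records inside the regions.

    regions: list of (chrom, start, end), assumed 1-based and inclusive.
    """
    keys = set()
    for line in vcf_lines:
        # Skip blank lines and VCF header lines.
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alts = line.rstrip("\n").split("\t")[:5]
        pos = int(pos)
        if not any(c == chrom and s <= pos <= e for c, s, e in regions):
            continue
        # Split multi-allelic records so each ALT allele is compared separately.
        for alt in alts.split(","):
            keys.add((chrom, pos, ref, alt))
    return keys


def concordance(truth, calls):
    """Return (recall, precision) of calls against the truth set."""
    shared = truth & calls
    recall = len(shared) / len(truth) if truth else 0.0
    precision = len(shared) / len(calls) if calls else 0.0
    return recall, precision


# Toy records, not real GIAB data.
regions = [("chr1", 100, 200)]
truth = load_variants(
    ["chr1\t150\t.\tA\tG", "chr1\t160\t.\tC\tT,G", "chr1\t500\t.\tG\tA"],
    regions,
)
calls = load_variants(
    ["#CHROM\tPOS\tID\tREF\tALT", "chr1\t150\t.\tA\tG", "chr1\t170\t.\tT\tC"],
    regions,
)
recall, precision = concordance(truth, calls)
print(f"recall={recall:.2f} precision={precision:.2f}")
```

The percentages I quote correspond to the recall value here, i.e. the fraction of GIAB variants in the region that also appear in my calls.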

But I only recover 50% of the variants: my calls contain only half of the GIAB variants in that region. What could be the reason for this low agreement rate?

Interestingly, when I call variants on the whole exome for which the fastqs were originally created, I recover 96% of the variants that GIAB publishes for the whole-exome region. Moreover, when I extract from those calls the variants that fall in my target region and compare them to the corresponding GIAB variants, I get up to 97%. In other words, when I use the entire whole-exome data, I can call 97% of the variants in my target region, but when I start with the alignments falling in my target region, convert them to fastq and then call variants in the target region, I only get 50%. I use exactly the same GATK settings in both cases (except for the -L region, which is different).

Can you please help me figure out what could be going wrong?

Thanks,
