Hi,
I am currently working with data from the HaloPlex Target Enrichment System. HaloPlex uses restriction enzymes to digest the DNA, which produces non-random reads that often carry false mutations at the 3′ and 5′ ends caused by adapter remnants. The problem with these adapter-remnant mutations has previously been handled with custom scripts, as described in Geéen et al. (https://doi.org/10.1016/j.jmoldx.2014.09.006): _First, the cleaned index-sorted paired-end reads were scanned for flanking HaloPlex adapter sequences, ie, 5′-AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC-3′ and 5′-AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT-3′. However, the adapter 5′-recognition motif was restricted to 6 to 13 bp depending on the position of the adapter in the read. A perfect match was required in each case, which is simpler and faster compared with the procedure recommended by the HaloPlex development team. The minimal sequence for identification of an adapter at the 3′ end of a read was set to AGATCG. The adapter sequences were removed in the following way: i) five bases were removed from the 3′ end of all reads lacking identified adapter sequence (resulting in approximately 146-bp reads), ii) reads with adapter sequence within 50 bp of the 5′ end were discarded, and iii) reads with flanking adapter sequence in the 3′ end were trimmed by removal of the corresponding number of nucleotides._
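To make the quoted procedure concrete, here is a rough Python sketch of how I read the three trimming rules. It is not the authors' script. The names (`find_adapter`, `trim_read`, the constants) are made up for illustration, and two points are my own assumptions: that the 6 to 13 bp motif is the 13-bp start shared by both adapter sequences, shortened only where the read ends, and that "within 50 bp of the 5′ end" means the adapter starts at a 0-based offset below 50.

```python
from typing import Optional, Tuple

# Assumption: the 13-bp start shared by both quoted adapter sequences.
ADAPTER_PREFIX = "AGATCGGAAGAGC"
MIN_MATCH = 6          # shortest motif accepted at the 3' end ("AGATCG")
FIVE_PRIME_ZONE = 50   # assumption: adapter start offset < 50 means discard
DEFAULT_TRIM = 5       # bases removed when no adapter is found


def find_adapter(seq: str) -> Optional[int]:
    """Return the offset of the first perfect adapter match, or None.

    Up to 13 bp are compared; near the 3' end the motif is shortened
    to whatever fits, but never below MIN_MATCH bases.
    """
    for start in range(len(seq) - MIN_MATCH + 1):
        k = min(len(ADAPTER_PREFIX), len(seq) - start)
        if seq[start:start + k] == ADAPTER_PREFIX[:k]:
            return start
    return None


def trim_read(seq: str, qual: str) -> Optional[Tuple[str, str]]:
    """Apply rules i-iii; return (seq, qual), or None if the read is discarded."""
    pos = find_adapter(seq)
    if pos is None:
        # Rule i: no adapter identified, remove five bases from the 3' end.
        keep = max(len(seq) - DEFAULT_TRIM, 0)
    elif pos < FIVE_PRIME_ZONE:
        # Rule ii: adapter close to the 5' end, discard the read.
        return None
    else:
        # Rule iii: cut the read where the adapter starts.
        keep = pos
    return seq[:keep], qual[:keep]
```

The sketch handles single reads as uppercase strings only; pairing, FASTQ parsing, and the index sorting mentioned in the paper are left out.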
My question: Is there an option in VariantFiltration or HaplotypeCaller that can mask or ignore variants detected within X bp (e.g. 5 bp) of the 3′ or 5′ end of the reads?
Thank you!