Hi team, (this is really two questions)
- Do you have any recommendations for hard-filtering HaplotypeCaller-generated VCFs?
This was my previous filter for the UnifiedGenotyper output:
`GenomeAnalysisTK -R ${ref} \
-T VariantFiltration \
-V ${my_vcf} \
-filter "QUAL<1000.0" -filterName "LowQual" \
-filter "MQ0>=4&&((MQ0/(1.0*DP))>0.1)" -filterName "BadVal" \
-filter "MQ<60" -filterName "LowMQ" \
-filter "QD<5.0" -filterName "LowQD" \
-filter "FS>60" -filterName "FishStra" \
-filter "DP<2000" -filterName "lowTotDP" \
-o qual_marked.vcf`
Obviously filters on fields such as MQ0 won't work, as that annotation isn't present in the HC-generated VCF, and there are many other fields I could filter on. (There are 222 samples and 1.9 million variants in the VCF.)
- One filter that I'm really keen to apply, but never got the hang of, is to drop all individual genotype calls where the coverage is less than 10X. (This is because I'm really interested in getting each genotype correct, rather than in detecting mutations as such.)
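To make the second question concrete, here is a minimal Python sketch of the behaviour I'm after. It is not a GATK feature, just an illustration: the function name `mask_low_depth` and the default threshold are my own, and it assumes each sample carries a per-sample `DP` value in the FORMAT column. A missing or non-numeric `DP` is treated as below the threshold.

```python
def mask_low_depth(line, min_dp=10):
    """Return a VCF line with GT set to no-call wherever sample DP < min_dp."""
    if line.startswith("#"):
        return line  # header lines pass through unchanged
    ending = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return line  # sites-only record, no genotype columns
    keys = fields[8].split(":")  # FORMAT column, e.g. GT:AD:DP
    if "GT" not in keys or "DP" not in keys:
        return line  # nothing to check against
    gt_i, dp_i = keys.index("GT"), keys.index("DP")
    for col in range(9, len(fields)):
        values = fields[col].split(":")
        # Trailing FORMAT values may be omitted; treat a missing DP as depth 0
        dp = values[dp_i] if dp_i < len(values) else "."
        depth = int(dp) if dp.isdigit() else 0
        if depth < min_dp and gt_i < len(values):
            values[gt_i] = "./."  # assumption: diploid no-call
            fields[col] = ":".join(values)
    return "\t".join(fields) + ending


# Hypothetical two-sample record: sample 1 has DP=5, sample 2 has DP=20
record = "1\t100\t.\tA\tG\t500\tPASS\t.\tGT:AD:DP\t0/1:3,2:5\t1/1:0,20:20\n"
masked = mask_low_depth(record)
```

In this made-up record only the first sample's genotype would become `./.`, and the rest of the line stays as it was. Ideally I'd like the same effect from within the GATK itself rather than from a separate script.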
Sincerely,
William Gilks