I have read a couple of documents on GVCF but still can't understand how it works. Here is one example from a GVCF file that I generated with HaplotypeCaller from a single BAM file using the `-ERC GVCF` option:
```
chr22 10718959 . T . . END=10718959 GT:DP:GQ:MIN_DP:PL 0/0:1:3:1:0,3,42
chr22 10718960 . T . . END=10718997 GT:DP:GQ:MIN_DP:PL 0/0:1:0:1:0,0,0
chr22 10718998 . C . . END=10719058 GT:DP:GQ:MIN_DP:PL 0/0:2:3:2:0,3,45
```
When I look at the original BAM file around position 10718959, I see that there is indeed 1 read (as indicated in the `DP` field), but its sequence matches the reference, with no variation! Why is it listed as a potential variant site at all?
Another example of the same kind:
```
chr22 12602453 . G . . END=12602461 GT:DP:GQ:MIN_DP:PL 0/0:33:99:33:0,99,1038
chr22 12602462 . A . . END=12602462 GT:DP:GQ:MIN_DP:PL 0/0:36:96:36:0,96,1440
chr22 12602463 . G . . END=12602464 GT:DP:GQ:MIN_DP:PL 0/0:37:99:37:0,99,1485
```
The genotype quality (`GQ`) is very high, and in the BAM file I do see 33-37 reads at these positions, but again, all of them match the reference.
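To show how I am currently reading these records, here is a minimal Python sketch that splits one line into its fields. It rests on my own assumptions: `POS` through `END` is treated as one contiguous interval, and the last three whitespace-separated columns are taken to be INFO, FORMAT, and the sample, because that is how the records appear as pasted here. `parse_block` is my own hypothetical helper, not part of GATK or any library.

```python
def parse_block(line):
    """Parse one pasted GVCF record into a dict (illustrative sketch only)."""
    tokens = line.split()
    chrom, pos = tokens[0], int(tokens[1])
    # Assumption: the final three columns are INFO, FORMAT and the sample.
    info, fmt, sample = tokens[-3:]

    # INFO is a ';'-separated list of KEY=VALUE pairs (flags without '=' are skipped).
    info_map = dict(kv.split("=", 1) for kv in info.split(";") if "=" in kv)
    # FORMAT names the ':'-separated values in the sample column, in order.
    sample_map = dict(zip(fmt.split(":"), sample.split(":")))

    end = int(info_map.get("END", pos))
    return {
        "chrom": chrom,
        "start": pos,
        "end": end,
        "length": end - pos + 1,  # number of positions covered by POS..END
        "GT": sample_map.get("GT"),
        "DP": int(sample_map["DP"]),
        "GQ": int(sample_map["GQ"]),
        "MIN_DP": int(sample_map["MIN_DP"]),
        "PL": [int(x) for x in sample_map["PL"].split(",")],
    }


records = [
    "chr22 10718959 . T . . END=10718959 GT:DP:GQ:MIN_DP:PL 0/0:1:3:1:0,3,42",
    "chr22 10718960 . T . . END=10718997 GT:DP:GQ:MIN_DP:PL 0/0:1:0:1:0,0,0",
    "chr22 10718998 . C . . END=10719058 GT:DP:GQ:MIN_DP:PL 0/0:2:3:2:0,3,45",
]

for rec in records:
    block = parse_block(rec)
    print(block["chrom"], block["start"], block["end"],
          block["length"], block["GT"], block["DP"], block["GQ"])
```

If this reading is right, each line describes a stretch of positions with genotype `0/0` rather than a single site, which only adds to my confusion about why these lines are reported at all.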
I would be very grateful if you could point me to any reference or resource detailed enough to explain this kind of thing. So far I have read the
[GVCF - Genomic Variant Call Format](software.broadinstitute.org/gatk/documentation/article?id=11004) document, the [FAQ on GVCF](software.broadinstitute.org/gatk/documentation/article.php?id=4017), and the VCFv4.2 specification.