
Why did HaplotypeCaller report HET genotype for loci without reads supporting REF?

Hi GATK team,

When I inspected the calls closely in IGV, I found that many loci where every read supports the variant allele were reported as HET genotypes by HC.

After going back to the VCF, I still don't understand why HC called these as HET, since the AD for the REF allele is 0. For example:

0/1:0,20:99:1069,0,1743

The full VCF line for this locus is below:

SCF1 47255 . T A 21688.96 . AC=33;AF=0.500;AN=66;ActiveRegionSize=225;DP=264;EVENTLENGTH=0;FS=0.000;HaplotypeScore=73.8674;InbreedingCoeff=-0.8774;MLEAC=33;MLEAF=0.500;MQ=85.46;MQ0=0;NVH=5;NumHapAssembly=19;NumHapEval=12;QD=84.07;QDE=16.81;TYPE=SNP;extType=SNP GT:AD:GQ:PL 1/1:0,4:18:503,18,0 0/1:0,16:99:1732,0,2964 0/1:0,4:99:279,0,173 0/1:0,5:99:353,0,884 0/1:0,3:99:279,0,645 0/1:0,4:99:270,0,584 0/1:0,19:99:1428,0,4290 0/1:0,7:99:388,0,794 0/1:0,3:99:302,0,391 0/1:0,15:99:1087,0,1165 0/1:0,6:99:529,0,1437 0/1:0,4:99:482,0,493 0/1:0,12:99:1245,0,2556 0/1:0,5:99:433,0,605 0/1:0,4:99:624,0,1135 0/1:0,3:99:362,0,1219 0/1:0,4:99:454,0,866 0/1:0,5:99:200,0,298 0/1:0,14:99:1846,0,1785 0/1:0,3:99:130,0,205 0/1:0,16:99:1453,0,999 0/1:0,15:99:757,0,866 0/1:0,12:99:939,0,518 0/1:0,10:99:697,0,758 0/1:0,8:99:790,0,1319 0/1:0,9:99:748,0,1417 0/1:0,7:99:634,0,2198 0/0:0,6:63:0,63,1438 0/1:0,8:99:1029,0,1634 0/1:0,5:99:676,0,874 0/1:0,2:65:65,0,107 0/1:0,20:99:1069,0,1743 0/1:0,2:14:14,0,710

Best,
SK
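
The check described in this post (a HET call whose AD shows no REF reads) can be scripted rather than eyeballed in IGV. The following is only a minimal sketch: the function names are hypothetical, it assumes the FORMAT column is GT:AD:GQ:PL as in the record above, and it surfaces the suspicious samples without explaining why HC made the call.

```python
def parse_sample(format_keys, sample_field):
    """Map FORMAT keys (e.g. 'GT:AD:GQ:PL') onto one sample's colon-separated values."""
    return dict(zip(format_keys.split(":"), sample_field.split(":")))


def het_without_ref_support(format_keys, sample_field):
    """Return True if the sample is called 0/1 but AD reports zero REF reads."""
    call = parse_sample(format_keys, sample_field)
    if call.get("GT") not in ("0/1", "0|1", "1|0"):
        return False
    # AD is 'ref_depth,alt_depth[,...]'; skip samples where it is missing.
    ref_depth = call.get("AD", "").split(",")[0]
    if not ref_depth.isdigit():
        return False
    return int(ref_depth) == 0


# Three sample columns copied from the record quoted in the post.
samples = [
    "1/1:0,4:18:503,18,0",
    "0/1:0,20:99:1069,0,1743",
    "0/0:0,6:63:0,63,1438",
]
flagged = [s for s in samples if het_without_ref_support("GT:AD:GQ:PL", s)]
print(flagged)
```

Running the same filter over every sample column of a VCF gives a quick count of how many HET calls lack REF support in AD.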


Some abnormal outputs of HaplotypeCaller

Hi GATK team,

I found some abnormal output lines in a VCF file generated by HaplotypeCaller in the GATK 2.2.5 pipeline with default parameters:

A 0/2 genotype reported at a bi-allelic locus:

SCF35 934292 . T G 52146.48 . AC=60;AF=0.909;AN=66;ActiveRegionSize=272;BaseQRankSum=5.365;ClippingRankSum=0.827;DP=403;EVENTLENGTH=0;FS=1.133;HaplotypeScore=31.6690;InbreedingCoeff=-0.1000;MLEAC=60;MLEAF=0.909;MQ=91.89;MQ0=0;MQRankSum=-1.672;NVH=5;NumHapAssembly=4;NumHapEval=4;QD=129.40;QDE=25.88;ReadPosRank8,9,148,148 0/0:7,0,0:70:0,70,371,70,371,371 0/1:9,10,0:99:117,0,289,356,319,675 0/0:6,0,0:85:0,85,310,85,310,310 0/1:4,7,0:99:250,0,129,262,150,412 0/0:20,0,0:72:0,72,1261,72,1261,1261 0/1:11,7,0:99:217,0,386,250,407,657 0/1:7,3,0:99:108,0,386,114,395,503 0/1:9,7,0:99:237,0,316,264,337,601 0/0:8,0,0:24:0,24,320,24,320,320 0/0:18,0,0:54:0,54,695,54,695,695 1/1:0,11,0:33:448,33,0,448,33,448 0/1:3,2,0:62:62,0,98,71,104,175 0/0:12,0,0:42:0,42,764,42,764,764 0/0:18,0,0:99:0,161,804,163,805,806 0/0:22,0,0:64:0,64,802,65,803,805 0/0:5,0,0:6:0,6,246,6,246,246 0/0:12,0,0:36:0,36,470,36,470,470 0/2:7,0,6:99:106,127,376,0,249,231 0/0:23,0,0:99:0,299,1169,299,1169,1169 0/0:13,0,0:39:0,39,512,39,512,512 0/0:18,0,0:54:0,54,693,54,693,693 0/1:3,4,0:97:137,0,97,146,109,255 0/0:13,0,0:39:0,39,506,39,506,506 0/0:7,0,0:21:0,21,276,21,276,276 0/0:19,0,0:99:0,127,958,127,958,958 0/0:7,0,0:21:0,21,240,21,240,240 0/0:27,0,0:83:0,83,1037,83,1037,1037 0/0:8,0,0:9:0,9,439,9,439,439>

No genotype for some samples:

SCF35 901454 . A G 7638.11 . AC=22;AF=0.333;AN=66;ActiveRegionSize=51;BaseQRankSum=12.776;ClippingRankSum=-3.046;DP=355;EVENTLENGTH=0;FS=34.579;HaplotypeScore=45.6947;InbreedingCoeff=-0.5098;MLEAC=22;MLEAF=0.333;MQ=88.52;MQ0=0;MQRankSum=-8.391;NVH=4;NumHapAssembly=9;NumHapEval=9;QD=31.05;QDE=7.76;ReadPosRankSum=-0.323;TYPE=SNP;extType=SNP GT:AD:GQ:PL 0/0:6,1:5:0,5,698 0/1:3,11:99:216,0,531 0/1:2,5:99:211,0,136 0/1:3,8:99:183,0,370 0/0:7,0:27:0,27,767 0/1:0,8:99:446,0,222 0/1:11,12:99:197,0,1491 0/1:9,3:99:172,0,1153 :15,0:75:0,75,1290 0/0:4,0:24:0,24,323 0/1:7,11:99:292,0,218 0/1:5,10:99:289,0,133 0/1:9,15:99:537,0,276 1/1:0,14:42:545,42,0 1/1:0,1:2:35,2,0 0/0:11,0:49:0,49,618 0/0:14,0:42:0,42,542 1/1:0,17:44:485,44,0 0/0:26,0:99:0,613,1675 0/1:3,3:91:91,0,94 0/1:10,8:99:189,0,555 0/1:3,14:99:395,0,141 0/1:7,3:99:120,0,387 ./. 0/0:15,0:99:0,147,675 0/1:1,3:29:104,0,29>

Best,
SK
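
Both anomalies reported here can be detected mechanically. The sketch below is a hedged illustration, not a GATK tool: `check_genotypes` is a hypothetical helper, and it assumes the FORMAT column is GT:AD:GQ:PL (that column is not visible in the first record as pasted). It flags a GT allele index larger than the number of ALT alleles, an empty GT field, and a no-call.

```python
def check_genotypes(alt_field, format_keys, sample_fields):
    """Return (sample_index, field, problem) tuples for suspicious sample columns."""
    n_alt = len(alt_field.split(","))          # number of ALT alleles at the site
    gt_index = format_keys.split(":").index("GT")
    problems = []
    for i, field in enumerate(sample_fields):
        values = field.split(":")
        gt = values[gt_index] if gt_index < len(values) else ""
        alleles = gt.replace("|", "/").split("/")
        if gt == "":
            problems.append((i, field, "empty GT"))
        elif all(a == "." for a in alleles):
            problems.append((i, field, "no-call"))
        elif any(a.isdigit() and int(a) > n_alt for a in alleles):
            problems.append((i, field, "allele index exceeds ALT count"))
    return problems


# Sample columns copied from the two records in the post; ALT is 'G' in both.
first = check_genotypes("G", "GT:AD:GQ:PL", [
    "0/1:9,10,0:99:117,0,289,356,319,675",
    "0/2:7,0,6:99:106,127,376,0,249,231",
])
second = check_genotypes("G", "GT:AD:GQ:PL", [
    ":15,0:75:0,75,1290",
    "./.",
    "0/0:4,0:24:0,24,323",
])
print(first)
print(second)
```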

Is it expected that HaplotypeCaller identifies 40% more variants in 2.2.5 than in 2.1.8?

Hi GATK team,

I recently downloaded GATK 2.2.5 and re-analyzed a small part of my data that had previously been analyzed with the 2.1.8 pipeline, using the same parameters. The new pipeline identified 7247 variants (Q30), whereas 2.1.8 had detected 5119. Fortunately, 5118 of the earlier variants were recovered by 2.2.5, and 4845 of them were given the same genotypes in all samples. But I still want to know whether the previous version really was missing some variants.

I checked your release notes, but they only mention some performance issues for HC between 2.1.* and 2.2.*.

Best,
SK
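
The comparison described in this post amounts to set arithmetic over sites plus a genotype equality check on the shared ones. Here is a minimal sketch under stated assumptions: each call set is represented as a plain dict keyed by (chrom, pos, ref, alt) with a tuple of per-sample genotypes as the value, and the toy records are invented for illustration, not taken from the poster's data.

```python
def compare_callsets(old_calls, new_calls):
    """Summarize overlap between two call sets keyed by (chrom, pos, ref, alt)."""
    old_sites, new_sites = set(old_calls), set(new_calls)
    shared = old_sites & new_sites
    same_gt = {site for site in shared if old_calls[site] == new_calls[site]}
    return {
        "old_only": len(old_sites - new_sites),
        "new_only": len(new_sites - old_sites),
        "shared": len(shared),
        "shared_same_genotypes": len(same_gt),
    }


def percent_increase(old_count, new_count):
    """Relative growth of the new count over the old one, in percent."""
    return 100.0 * (new_count - old_count) / old_count


# Hypothetical toy call sets with two samples each.
old_calls = {
    ("SCF1", 100, "A", "G"): ("0/1", "0/0"),
    ("SCF1", 250, "C", "T"): ("1/1", "0/1"),
}
new_calls = {
    ("SCF1", 100, "A", "G"): ("0/1", "0/0"),
    ("SCF1", 250, "C", "T"): ("0/1", "0/1"),
    ("SCF1", 400, "G", "A"): ("0/1", "0/1"),
}
summary = compare_callsets(old_calls, new_calls)
print(summary)
print(round(percent_increase(5119, 7247), 1))
```

Applied to the counts in the post, 7247 versus 5119 is an increase of roughly 41.6%, and 5118 shared sites would imply 2129 sites seen only by 2.2.5 and 1 site seen only by 2.1.8.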

HaplotypeCaller Indel detection

We find HaplotypeCaller to be an excellent SNP caller, but recently we were confused by its indel results. We did targeted sequencing of 6 samples (3 cases vs. 3 controls). We followed the Best Practices recommendations, except that we skipped VariantRecalibrator (there were only around 600 SNPs, which seemed too few for recalibration). HaplotypeCaller correctly detected a SNP, but the neighboring deletion looks a little strange: in samtools tview there is no clear sign of the deletion. We wonder whether it came from HaplotypeCaller's de novo assembly, and whether it is credible. Thanks.

The command line:

java -Xmx4g -jar ~/GenomeAnalysisTK-2.1-9-gb90951c/GenomeAnalysisTK.jar -T HaplotypeCaller -R ucsc.hg19.fasta -I sample1.clean.dedup.recal.bam -I sample2.clean.dedup.recal.bam -I sample3.clean.dedup.recal.bam -I sample4.clean.dedup.recal.bam -I sample5.clean.dedup.recal.bam -I sample6.clean.dedup.recal.bam --dbsnp dbsnp_135.hg19.vcf -L target.interval_list  -stand_call_conf 50.0 -stand_emit_conf 10.0 -o samples_new.raw.snps.indels.vcf

HaplotypeCaller result:

SNP: 19448410   .    T  G   2126.64 .   AC=6;AF=0.500;AN=12;ActiveRegionSize=135;ClippingRankSum=18.283;DP=976;EVENTLENGTH=0;FS=1076.837;MLEAC=6;MLEAF=0.500;MQ=58.70;MQRankSum=-2.107;NVH=3;NumHapAssembly=17;NumHapEval=13;QD=2.18;QDE=0.73;ReadPosRankSum=-17.858;TYPE=SNP;  GT:GQ:PL    0/1:99:195,0,2945   0/1:99:936,0,6037   0/1:99:354,0,3059   0/1:99:301,0,4595   0/1:99:187,0,2191   0/1:99:203,0,2617
Indel: 19448411 .    GTGGCTCC   G   274.85  .   AC=3;AF=0.250;AN=12;ActiveRegionSize=135;ClippingRankSum=9.296;DP=1019;EVENTLENGTH=-7;FS=328.629;MLEAC=3;MLEAF=0.250;MQ=58.74;MQRankSum=-0.451;NVH=3;NumHapAssembly=17;NumHapEval=13;QD=0.46;QDE=0.15;ReadPosRankSum=-11.624;TYPE=INDEL;    GT:GQ:PL    0/0:99:0,106,14388  0/1:99:200,0,28094  0/0:45:0,45,15261   0/1:50:50,0,22048   0/1:74:74,0,10244   0/0:42:0,42,12913
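
Since VariantRecalibrator was skipped, one way to triage records like the two above is to read QD and FS out of the INFO column and compare them with hard-filter cutoffs. The sketch below is illustrative only: the cutoffs (QD below 2.0; FS above 60 for SNPs and above 200 for indels) are assumptions based on commonly cited GATK hard-filtering starting points, not settings taken from this post, and the helper names are hypothetical.

```python
def parse_info(info):
    """Parse a VCF INFO string into a dict; flag-style keys map to True."""
    fields = {}
    for token in info.split(";"):
        if not token:               # tolerate a trailing ';' as in the pasted records
            continue
        key, _, value = token.partition("=")
        fields[key] = value if value else True
    return fields


def hard_filter_flags(info, max_fs, min_qd=2.0):
    """Return the names of the assumed cutoffs that this record fails."""
    fields = parse_info(info)
    flags = []
    if float(fields["QD"]) < min_qd:
        flags.append("LowQD")
    if float(fields["FS"]) > max_fs:
        flags.append("HighFS")
    return flags


# QD, FS and TYPE copied from the SNP and indel records in the post.
snp_flags = hard_filter_flags("FS=1076.837;QD=2.18;TYPE=SNP;", max_fs=60.0)
indel_flags = hard_filter_flags("FS=328.629;QD=0.46;TYPE=INDEL;", max_fs=200.0)
print(snp_flags, indel_flags)
```

Under these assumed cutoffs the SNP fails on strand bias alone, while the deletion fails on both QD and strand bias.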

Genotyping a substitution using the HaplotypeCaller

Hi All,

I have the following substitution that I am trying to genotype in a deep coverage (>1000x) dataset:
4 2558307 GCTGATGTGGGG GAGCTACTCAA

I've aligned it using very relaxed BWA parameters and am now getting it correctly with HaplotypeCaller. However, it is currently genotyped as multiple indel/SNP events:
4 2558307 G GAGCTA
4 2558310 G C
4 2558311 ATGTGGG A
4 2558318 G A

Filling the gaps between the events above with the reference sequence gives exactly the substitution I am looking for. However, I'd like to genotype this as a single substitution event. I've tried the following options, but I never got any results with them:
--fullHaplotype --genotypeFullActiveRegion --activeRegionIn 4:2558307-2558318 --activeRegionOut substitution.out

I am not sure if what I'm trying to do is feasible but would appreciate any advice.

Thanks a lot!
Laurent
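
The "filling the gaps with the reference" step described in this post can be checked in a few lines. This is a minimal sketch, not a HaplotypeCaller feature: `apply_events` is a hypothetical helper that applies non-overlapping VCF-style events to a reference window and returns the resulting haplotype, so the four calls can be compared against the expected single substitution.

```python
def apply_events(ref_seq, ref_start, events):
    """Apply non-overlapping (pos, ref_allele, alt_allele) events to a reference window.

    ref_start is the 1-based coordinate of ref_seq[0].
    """
    pieces = []
    cursor = ref_start                      # next reference position not yet emitted
    for pos, ref_allele, alt_allele in sorted(events):
        if pos < cursor:
            raise ValueError(f"event at {pos} overlaps a previous event")
        offset = pos - ref_start
        if ref_seq[offset:offset + len(ref_allele)] != ref_allele:
            raise ValueError(f"REF allele at {pos} does not match the reference window")
        pieces.append(ref_seq[cursor - ref_start:offset])  # reference bases between events
        pieces.append(alt_allele)
        cursor = pos + len(ref_allele)
    pieces.append(ref_seq[cursor - ref_start:])            # reference tail after last event
    return "".join(pieces)


# Reference window and the four events as listed in the post.
reference = "GCTGATGTGGGG"
events = [
    (2558307, "G", "GAGCTA"),
    (2558310, "G", "C"),
    (2558311, "ATGTGGG", "A"),
    (2558318, "G", "A"),
]
merged = apply_events(reference, 2558307, events)
print(merged)
```

The merged haplotype is GAGCTACTCAA, which matches the substitution given at the top of the post, so the four events are together equivalent to the single event being sought.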

HaplotypeCaller

I understand that HaplotypeCaller does some local assembly and realignment. Can someone expand on the parameters used during the local assembly? What k-mer size is used for the assembly graph? I would like to explore the use of digital normalization prior to SNP calling to remove PCR artifacts, and this information would be helpful.
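
For readers unfamiliar with why the k-mer size matters here, the sketch below builds a generic de Bruijn graph from reads. It is a textbook illustration only, not HaplotypeCaller's assembler: the function name, the toy reads and the choice of k = 4 are all assumptions, and the post does not state what value GATK actually uses.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, edge weights count supporting k-mers."""
    graph = defaultdict(lambda: defaultdict(int))
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]][kmer[1:]] += 1   # edge from prefix to suffix of the k-mer
    return graph


# Two hypothetical overlapping reads.
graph = build_de_bruijn(["ACGTAC", "CGTACG"], k=4)
for prefix in sorted(graph):
    print(prefix, dict(graph[prefix]))
```

Edge weights are k-mer multiplicities, which is the quantity that digital normalization caps; changing k changes which reads share edges and therefore what the graph can resolve.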

How do I filter the reads that result in this type of error in some of my intervals?

org.broadinstitute.sting.utils.exceptions.ReviewedStingException: Only one of refStart or refStop must be < 0, not both (-1, -8)

Here is some information:

INFO  20:22:19,446 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.2-16-g9f648cb, Compiled 2012/12/04 03:48:10 
INFO  20:22:19,452 HelpFormatter - Program Args: -T HaplotypeCaller

Thanks

15bp deletion previously called by GATK v1 not called by GATK 2.0 or 2.1

We have a 15bp deletion in a sample that was run on an Illumina sequencer twice:

1) on GAII, 76bp paired-end reads, 10-sample pool;

2) HiSeq2000, 100bp paired-end reads, 24-sample pool.

Both runs were aligned with bwa and analysed with the recommended GATK protocol, using the latest version available at the time. The deletion was called both times, by GATK 1.0.5083 (76bp batch) and by GATK 1.4 (100bp batch). However, the deletion is not detected by GATK 2.0 or 2.1-8, with either UnifiedGenotyper or HaplotypeCaller. The newer GATK versions do generate fewer calls and substantially cut out false positives, but on the other hand the sensitivity seems to have dropped too much!
Here is the variant called by the GATK 1.4, 100bp batch:

7   44190538    .   CCCTCCACCCGGCCCA    C   8793.94 PASS    AC=1;AF=0.021;AN=48;BaseQRankSum=14.575;DP=7878;FS=10.160;HRun=0;HaplotypeScore=3737.9770;InbreedingCoeff=-0.0213;MQ=57.74;MQ0=0;MQRankSum=-10.741;QD=29.71;ReadPosRankSum=-0.056   GT:AD:DP:GQ:PL  0/1:216,80:296:99:8794,0,39333

And here it is called by GATK 1.0.5083, 76bp batch:

7   44190538    .   CCCTCCACCCGGCCCA    C   7765.65 PASS    AC=1;AF=0.050;AN=20;DP=1907;Dels=0.02;HRun=1;HaplotypeScore=40.0818;MQ=59.30;MQ0=0;QD=49.78;SB=-1238.06;sumGLbyD=49.78  GT:DP:GQ:PL 0/1:118:99:7766,0,14618

Where is this loss in sensitivity likely to be occurring? Can we adjust any of the default GATK settings to be able to detect these slightly larger indels?
Reads with the 15bp deletion inherently have slightly lower mapping qualities, but I would have thought this is taken into account for indel calling.
(I should add that we have a sample with a 27bp homozygous deletion and this always gets called.)

I'd be happy to send the relevant part of the bam file if that would help.

Thanks, Hana
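
Two quantities in this report can be read straight off the quoted GATK 1.4 record: the indel length implied by REF/ALT and the fraction of reads supporting the deletion in AD. The helpers below are a small hypothetical sketch for doing that consistently across records; they assume AD is given as comma-separated REF and ALT depths.

```python
def indel_length(ref, alt):
    """Signed length change: negative for deletions, positive for insertions."""
    return len(alt) - len(ref)


def alt_allele_fraction(ad):
    """Fraction of AD reads supporting any ALT allele ('ref,alt[,alt...]')."""
    depths = [int(x) for x in ad.split(",")]
    total = sum(depths)
    return sum(depths[1:]) / total if total else 0.0


# REF, ALT and AD copied from the GATK 1.4 record in the post.
length = indel_length("CCCTCCACCCGGCCCA", "C")
fraction = alt_allele_fraction("216,80")
print(length, round(fraction, 2))
```

For this record the deletion is 15 bp and about 27% of the AD reads (80 of 296) support it, which is a useful baseline when comparing how newer versions treat the same site.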


"Cannot extend a symbolic allele" when running HaplotypeCaller on Picard Validated bam files

I encounter this error when running HaplotypeCaller on Picard-validated BAM files using the b37 reference in the GATK resource bundle with GATK v2.3-4-g57ea19f. My command-line arguments are:

-R resources/BroadInstitute/bundle_1.5/b37/human_g1k_v37.fasta \
-T HaplotypeCaller \
-L 1:233766919-233767204 \
-debug \
-allowPotentiallyMisencodedQuals \
-I sample1.recal.bam -I sample2.recal.bam [...] -I sample36.recal.bam \
-dcov 1200 \
-o samples.3:60800001-60830896.raw.snps.indels.vcf

There was no problem running UnifiedGenotyper on the same data. The stack trace (with debug info) is pasted below.

Any help would be greatly appreciated.

Thanks,

Paige

INFO 14:33:33,576 HelpFormatter - --------------------------------------------------------------------------------
INFO 14:33:33,579 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.3-4-g57ea19f, Compiled 2012/12/20 15:09:50
INFO 14:33:33,579 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 14:33:33,579 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 14:33:33,584 HelpFormatter - Program Args: -allowPotentiallyMisencodedQuals -R /scratch1/tmp/tpaige/resources/BroadInstitute/bundle_1.5/b37/human_g1k_v37.fasta -T HaplotypeCaller -L 1:233766919-233767204 -debug -I /scratch0/tmp/tpaige/projects/SRP/6-recal/112_25.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/128_26.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/133_6.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R01_013A.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R06_303A.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R97_375.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/L01_362.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/L10_440A.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R00_113A.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R00_294A.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R00_421A.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R01_210A.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R01_489A.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R02_217A.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R02_363A.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R02_449A.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R05_304A.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R05_449C.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R06_527A.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R08_045A.markdup.realigned.recalibrated.bam -I 
/scratch0/tmp/tpaige/projects/SRP/6-recal/R08_134A.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R08_273A.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R08_553A.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R09_101A.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R09_524A.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R09_588A.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R10_483A.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R10_640A.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R10_711D.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R96_010A.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R97_030B.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R97_051B.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R97_145A.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R98_219A.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R98_463E.markdup.realigned.recalibrated.bam -I /scratch0/tmp/tpaige/projects/SRP/6-recal/R98_468D.markdup.realigned.recalibrated.bam -dcov 1200 -o /scratch0/tmp/tpaige/projects/SRP/7-rawVariants/haplotypeCaller/byChr_v3/troubleshoot/SRP.markdup.realigned.recalibrated.haplotypeCaller.1:233766919-233767204.raw.snps.indels.vcf
INFO 14:33:33,585 HelpFormatter - Date/Time: 2013/01/02 14:33:33
INFO 14:33:33,585 HelpFormatter - --------------------------------------------------------------------------------
INFO 14:33:33,585 HelpFormatter - --------------------------------------------------------------------------------
INFO 14:33:33,607 GenomeAnalysisEngine - Strictness is SILENT
WARN 14:33:33,616 FSLockWithShared - WARNING: Unable to lock file /scratch1/tmp/tpaige/resources/BroadInstitute/bundle_1.5/b37/human_g1k_v37.dict: Function not implemented.
INFO 14:33:33,617 ReferenceDataSource - Unable to create a lock on dictionary file: Function not implemented
INFO 14:33:33,617 ReferenceDataSource - Treating existing dictionary file as complete.
WARN 14:33:33,617 FSLockWithShared - WARNING: Unable to lock file /scratch1/tmp/tpaige/resources/BroadInstitute/bundle_1.5/b37/human_g1k_v37.fasta.fai: Function not implemented.
INFO 14:33:33,618 ReferenceDataSource - Unable to create a lock on index file: Function not implemented
INFO 14:33:33,618 ReferenceDataSource - Treating existing index file as complete.
INFO 14:33:33,815 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1200, Using the new downsampling implementation
INFO 14:33:33,823 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 14:33:34,449 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.63
INFO 14:33:34,535 GenomeAnalysisEngine - Processing 286 bp from intervals
INFO 14:33:34,544 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 14:33:34,544 ProgressMeter - Location processed.active regions runtime per.1M.active regions completed total.runtime remaining

Assembling 1:233766919-233767204 with 14701 reads: (with overlap region = 1:233766854-233767269)
Found 27 candidate haplotypes to evaluate every read against.
TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAAGGTTGTCACTCACTGAACCCCTAATGTGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAAGGTTGTCACTCACTGAACCCCTAATGCGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAAGGTTGTCACTCACTGAACCCCTAATGTGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACAAGATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAAGGTTGTCACTCACTGAACCCCTAATGTGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACGGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAAGGTTGTCACTCACTGAACCCCTAATGTGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGATCGGAAGAGCACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 248M9I168M

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGAGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAAGGTTGTCACTCACTGAACCCCTAATGTGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCCTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAAGGTTGTCACTCACTGAACCCCTAATGTGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAAGGTTGTCACTCACTGAACCCCTAATGTGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGGTGGCAAAGACACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 245M7I171M

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAAGGTTGTCACTCACTGAACCCCTAATGTGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTAGATCACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 244M1D3M1I168M

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAAGGTTGTCACTCACTGAACCCCTAATGTGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGACCTGAGACACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAAGGTTGTCACTCACTGAACCCCTAATGTGTGTTTAACTGTTGGGGAAGAATGGGAATCACTACCCTGAGACACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCTCTTCCGATCTGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAAGGTTGTCACTCACTGAACCCCTAATGTGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 117M11I299M

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGATCTATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAAGGTTGTCACTCACTGAACCCCTAATGTGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATCTTTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAAGGTTGTCACTCACTGAACCCCTAATGTGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 119M3I297M

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAAGGTTGTCACTCACTGAACCCCTAATGTGTGTTTAACTGTTGGGGAAGAATGGGAATCACAATGCCCTGAGACACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 238M2I178M

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAAGGTTGTCACTCACTGAACCCCTAATGTGTGTTTAACTGTTGGGGAAGAATGGGAATCACAGCCCTGAGACACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATCTGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAAGGTTGTCACTCACTGAACCCCTAATGTGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 110M5D301M

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGAGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAAGGTTGTCACTCACTGAACCCCTAATGTGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGACCTGAGACACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGATTCTGACAACATCAGGATATGAAAGGTTGTCACTCACTGAACCCCTAATGCGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGATTCTGACAACATCAGGATATGAAAGGTTGTCACTCACTGAACCCCTAATGCGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACGGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGAGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGATTCTGACAACATCAGGATATGAAAGGTTGTCACTCACTGAACCCCTAATGCGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGAGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGATTCTGACAACATCAGGATATGAAAGGTTGTCACTCACTGAACCCCTAATGCGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACGGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGATTCTGACAACATCAGGATATGAAATGTTGTCACTCACTGAACCCCTAATGCGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGATTCTGACAACATCAGGATATGAAATGTTGTCACTCACTGAACCCCTAATGCGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACGGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAGGATATGAAATGTTGTCACTCACTGAACCCCTAATGCGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAGGATATGAAATGTTGTCACTCACTGAACCCCTAATGCGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACGGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAATGTTGTCACTCACTGAACCCCTAATGCGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M

INFO 14:34:04,547 ProgressMeter - 1:233767204 0.00e+00 30.0 s 49.6 w 99.7% 30.1 s 0.1 s
INFO 14:34:34,549 ProgressMeter - 1:233767204 0.00e+00 60.0 s 99.2 w 99.7% 60.2 s 0.2 s
INFO 14:35:04,553 ProgressMeter - 1:233767204 0.00e+00 90.0 s 148.8 w 99.7% 90.3 s 0.3 s
INFO 14:35:34,558 ProgressMeter - 1:233767204 0.00e+00 2.0 m 198.4 w 99.7% 2.0 m 0.4 s
INFO 14:36:04,563 ProgressMeter - 1:233767204 0.00e+00 2.5 m 248.0 w 99.7% 2.5 m 0.5 s
INFO 14:36:34,568 ProgressMeter - 1:233767204 0.00e+00 3.0 m 297.7 w 99.7% 3.0 m 0.6 s
Chose haplotypes 18 and 7 with diploid likelihood = 0.0
Chose haplotypes 18 and 8 with diploid likelihood = -533.747309737968
Chose haplotypes 18 and 0 with diploid likelihood = -777.1108623066011
Chose haplotypes 18 and 1 with diploid likelihood = -1310.8581720445072
Chose haplotypes 7 and 3 with diploid likelihood = -1451.1726611647882
Chose haplotypes 19 and 7 with diploid likelihood = -1501.7146670729999
Chose haplotypes 7 and 4 with diploid likelihood = -1695.1134320870115
Chose haplotypes 17 and 7 with diploid likelihood = -1713.464772552341
Chose haplotypes 8 and 3 with diploid likelihood = -1984.9199709026507
Chose haplotypes 8 and 4 with diploid likelihood = -2228.860741824894
Chose haplotypes 19 and 0 with diploid likelihood = -2278.8255293794828
Chose haplotypes 17 and 0 with diploid likelihood = -2490.5756348588693
Chose haplotypes 20 and 7 with diploid likelihood = -2585.3380487913055
Chose haplotypes 20 and 0 with diploid likelihood = -3362.448911097821
Chose haplotypes 21 and 7 with diploid likelihood = -3483.4850108307965
Chose haplotypes 15 and 7 with diploid likelihood = -3568.629541037797
Chose haplotypes 16 and 7 with diploid likelihood = -3777.6186484047903
Chose 12 alternate haplotypes to genotype in all samples.
=== Best Haplotypes ===
TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAAGGTTGTCACTCACTGAACCCCTAATGTGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M
Left and right breaks = (0 , 0)

Events = {}

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGATTCTGACAACATCAGGATATGAAAGGTTGTCACTCACTGAACCCCTAATGCGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M
Left and right breaks = (0 , 0)

Events = {233767008=[VC HC1 @ 1:233767008 Q. of type=SNP alleles=[T*, A] attr={} GT=[], 233767023=[VC HC1 @ 1:233767023 Q. of type=SNP alleles=[A*, G] attr={} GT=[], 233767059=[VC HC1 @ 1:233767059 Q. of type=SNP alleles=[T*, C] attr={} GT=[]}

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGATTCTGACAACATCAGGATATGAAAGGTTGTCACTCACTGAACCCCTAATGCGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACGGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M
Left and right breaks = (0 , 274)

Events = {233767008=[VC HC2 @ 1:233767008 Q. of type=SNP alleles=[T*, A] attr={} GT=[], 233767023=[VC HC2 @ 1:233767023 Q. of type=SNP alleles=[A*, G] attr={} GT=[], 233767059=[VC HC2 @ 1:233767059 Q. of type=SNP alleles=[T*, C] attr={} GT=[], 233767105=[VC HC2 @ 1:233767105 Q. of type=SNP alleles=[A*, G] attr={} GT=[]}

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGAGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGATTCTGACAACATCAGGATATGAAAGGTTGTCACTCACTGAACCCCTAATGCGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M
Left and right breaks = (104 , 0)

Events = {233767008=[VC HC3 @ 1:233767008 Q. of type=SNP alleles=[T*, A] attr={} GT=[], 233767023=[VC HC3 @ 1:233767023 Q. of type=SNP alleles=[A*, G] attr={} GT=[], 233767059=[VC HC3 @ 1:233767059 Q. of type=SNP alleles=[T*, C] attr={} GT=[], 233766958=[VC HC3 @ 1:233766958 Q. of type=SNP alleles=[C*, A] attr={} GT=[]}

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGAGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGATTCTGACAACATCAGGATATGAAAGGTTGTCACTCACTGAACCCCTAATGCGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACGGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M
Left and right breaks = (104 , 274)

Events = {233767008=[VC HC4 @ 1:233767008 Q. of type=SNP alleles=[T*, A] attr={} GT=[], 233767023=[VC HC4 @ 1:233767023 Q. of type=SNP alleles=[A*, G] attr={} GT=[], 233767059=[VC HC4 @ 1:233767059 Q. of type=SNP alleles=[T*, C] attr={} GT=[], 233767105=[VC HC4 @ 1:233767105 Q. of type=SNP alleles=[A*, G] attr={} GT=[], 233766958=[VC HC4 @ 1:233766958 Q. of type=SNP alleles=[C*, A] attr={} GT=[]}

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGAGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAAGGTTGTCACTCACTGAACCCCTAATGTGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M
Left and right breaks = (104 , 0)

Events = {233766958=[VC HC5 @ 1:233766958 Q. of type=SNP alleles=[C*, A] attr={} GT=[]}

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAAGGTTGTCACTCACTGAACCCCTAATGTGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACGGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M
Left and right breaks = (0 , 0)

Events = {233767105=[VC HC6 @ 1:233767105 Q. of type=SNP alleles=[A*, G] attr={} GT=[]}

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCCTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAAGGTTGTCACTCACTGAACCCCTAATGTGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M
Left and right breaks = (105 , 0)

Events = {233766959=[VC HC7 @ 1:233766959 Q. of type=SNP alleles=[G*, C] attr={} GT=[]}

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAAGGTTGTCACTCACTGAACCCCTAATGTGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGACACAAGATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M
Left and right breaks = (0 , 256)

Events = {233767107=[VC HC8 @ 1:233767107 Q. of type=SNP alleles=[A*, G] attr={} GT=[], 233767106=[VC HC8 @ 1:233767106 Q. of type=SNP alleles=[G*, A] attr={} GT=[]}

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAAGGTTGTCACTCACTGAACCCCTAATGTGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGAGATCGGAAGAGCACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 248M9I168M
Left and right breaks = (0 , 258)

Events = {233767101=[VC HC9 @ 1:233767101 Q. of type=SYMBOLIC alleles=[A*, ] attr={} GT=[]}

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAAGGTTGTCACTCACTGAACCCCTAATGTGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTGGTGGCAAAGACACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 245M7I171M
Left and right breaks = (0 , 253)

Events = {233767098=[VC HC10 @ 1:233767098 Q. of type=SYMBOLIC alleles=[G*, ] attr={} GT=[]}

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAAGGTTGTCACTCACTGAACCCCTAATGTGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGACCTGAGACACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 416M
Left and right breaks = (0 , 241)

Events = {233767094=[VC HC11 @ 1:233767094 Q. of type=SNP alleles=[C*, A] attr={} GT=[]}

TTCTGACAAGGAAAAGACCGTGCCTCATCTCTGGCTCTCAGAATTGTCCATCATCCACAGTCTATTTTATTTATACCTACGTGGTAACTGCTAACTCAAATAGGCGTGATGGTGACCGATTATCATTGTGTTTCCCTTTGCCACCAGGTGCGGGTTTCTGACAACATCAAGATATGAAAGGTTGTCACTCACTGAACCCCTAATGTGTGTTTAACTGTTGGGGAAGAATGGGAATCACTGCCCTAGATCACAGAATTTAATTAGGCCTGCTTCTTTGTGTTTCTCTTTAATGATAGGATTTTAAGCTCAACGGTTATCAGCTCAAAAGCAGAAAATGTTCATTCCCGTTTGTGCCCTGGTATCTTGAAGAATGTGCAATAGGTGAGCTTGCCATTAATCTAACTGAATTTCTCCCT

Cigar = 244M1D3M1I168M
Left and right breaks = (0 , 249)

Events = {233767097=[VC HC12 @ 1:233767097-233767098 Q. of type=INDEL alleles=[TG*, T] attr={} GT=[], 233767101=[VC HC12 @ 1:233767101 Q. of type=INDEL alleles=[A*, AT] attr={} GT=[]}

Found consecutive biallelic events with R^2 = 0.0303
-- [VC HC3 @ 1:233766958 Q. of type=SNP alleles=[C*, A] attr={} GT=[]
-- [VC HC7 @ 1:233766959 Q. of type=SNP alleles=[G*, C] attr={} GT=[]
Found consecutive biallelic events with R^2 = 0.0083
-- [VC HC11 @ 1:233767094 Q. of type=SNP alleles=[C*, A] attr={} GT=[]
-- [VC HC12 @ 1:233767097-233767098 Q. of type=INDEL alleles=[TG*, T] attr={} GT=[]
Found consecutive biallelic events with R^2 = 1.0000
-- [VC HC12 @ 1:233767097-233767098 Q. of type=INDEL alleles=[TG*, T] attr={} GT=[]
-- [VC HC12 @ 1:233767101 Q. of type=INDEL alleles=[A*, AT] attr={} GT=[]
====> [VC merged @ 1:233767098-233767101 Q. of type=MNP alleles=[GAGA*, AGAT] attr={} GT=[]
Found consecutive biallelic events with R^2 = 0.0303
-- [VC HC3 @ 1:233766958 Q. of type=SNP alleles=[C*, A] attr={} GT=[]
-- [VC HC7 @ 1:233766959 Q. of type=SNP alleles=[G*, C] attr={} GT=[]
Found consecutive biallelic events with R^2 = 0.0182
-- [VC HC11 @ 1:233767094 Q. of type=SNP alleles=[C*, A] attr={} GT=[]
-- [VC merged @ 1:233767098-233767101 Q. of type=MNP alleles=[GAGA*, AGAT] attr={} GT=[]
Found consecutive biallelic events with R^2 = 0.0667
-- [VC merged @ 1:233767098-233767101 Q. of type=MNP alleles=[GAGA*, AGAT] attr={} GT=[]
-- [VC HC2 @ 1:233767105 Q. of type=SNP alleles=[A*, G] attr={} GT=[]
Found consecutive biallelic events with R^2 = 0.0303
-- [VC HC2 @ 1:233767105 Q. of type=SNP alleles=[A*, G] attr={} GT=[]
-- [VC HC8 @ 1:233767106 Q. of type=SNP alleles=[G*, A] attr={} GT=[]
Found consecutive biallelic events with R^2 = 1.0000
-- [VC HC8 @ 1:233767106 Q. of type=SNP alleles=[G*, A] attr={} GT=[]
-- [VC HC8 @ 1:233767107 Q. of type=SNP alleles=[A*, G] attr={} GT=[]
====> [VC merged @ 1:233767106-233767107 Q. of type=MNP alleles=[GA*, AG] attr={} GT=[]
Found consecutive biallelic events with R^2 = 0.0303
-- [VC HC3 @ 1:233766958 Q. of type=SNP alleles=[C*, A] attr={} GT=[]
-- [VC HC7 @ 1:233766959 Q. of type=SNP alleles=[G*, C] attr={} GT=[]
Found consecutive biallelic events with R^2 = 0.0182
-- [VC HC11 @ 1:233767094 Q. of type=SNP alleles=[C*, A] attr={} GT=[]
-- [VC merged @ 1:233767098-233767101 Q. of type=MNP alleles=[GAGA*, AGAT] attr={} GT=[]
Found consecutive biallelic events with R^2 = 0.0667
-- [VC merged @ 1:233767098-233767101 Q. of type=MNP alleles=[GAGA*, AGAT] attr={} GT=[]
-- [VC HC2 @ 1:233767105 Q. of type=SNP alleles=[A*, G] attr={} GT=[]
Found consecutive biallelic events with R^2 = 0.0303
-- [VC HC2 @ 1:233767105 Q. of type=SNP alleles=[A*, G] attr={} GT=[]
-- [VC merged @ 1:233767106-233767107 Q. of type=MNP alleles=[GA*, AG] attr={} GT=[]
Genotyping event at 233766958 with alleles = [C*, A]
Genotyping event at 233766959 with alleles = [G*, C]
Genotyping event at 233767008 with alleles = [T*, A]
Genotyping event at 233767023 with alleles = [A*, G]
Genotyping event at 233767059 with alleles = [T*, C]
Genotyping event at 233767094 with alleles = [C*, A]
INFO 14:36:57,341 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.IllegalArgumentException: Cannot extend a symbolic allele
at org.broadinstitute.sting.utils.variantcontext.Allele.extend(Allele.java:182)
at org.broadinstitute.sting.utils.variantcontext.VariantContextUtils.resolveIncompatibleAlleles(VariantContextUtils.java:842)
at org.broadinstitute.sting.utils.variantcontext.VariantContextUtils.simpleMerge(VariantContextUtils.java:545)
at org.broadinstitute.sting.utils.variantcontext.VariantContextUtils.simpleMerge(VariantContextUtils.java:452)
at org.broadinstitute.sting.gatk.walkers.haplotypecaller.GenotypingEngine.assignGenotypeLikelihoodsAndCallIndependentEvents(GenotypingEngine.java:159)
at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:411)
at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:107)
at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.processActiveRegion(TraverseActiveRegions.java:285)
at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.callWalkerMapOnActiveRegions(TraverseActiveRegions.java:230)
at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.processActiveRegions(TraverseActiveRegions.java:205)
at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.endTraversal(TraverseActiveRegions.java:294)
at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:93)
at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:281)
at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:237)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:147)
at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:94)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.3-4-g57ea19f):
ERROR
ERROR Please visit the wiki to see if this is a known problem
ERROR If not, please post the error, with stack trace, to the GATK forum
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Cannot extend a symbolic allele
ERROR ------------------------------------------------------------------------------------------

Wrong candidate haplotype chosen by HaplotypeCaller

I've been experiencing some apparent errors with HaplotypeCaller that I think could be related to how it chooses candidate haplotypes when performing multi-sample calling. Please see the example files I've uploaded to the server (cooketho_20130103.tar.gz). For instance, if you look at position 3511 in sample 2, there are 14 non-reference reads and 0 reference reads. When HaplotypeCaller is run with just this sample, it calls this locus homozygous non-reference, which seems to me to be the correct behavior. But when run with all 14 samples, it doesn't call a SNP at this locus. Repeating the run in debug mode shows that the (immediate) cause is that 11 candidate haplotypes were found, and not a single one of them had the non-reference allele at position 3511. Why?

I came across an earlier post that suggested in some cases increasing the --minPruning value can be of use, but I tried this to no avail.

http://gatkforums.broadinstitute.org/discussion/1764/haplotypecaller-in-cohorts

My organism is a plant, and it is considerably more heterozygous than human, but changing the --heterozygosity value did not appear to help either. Double-check me on this if you like.

Can you please suggest a fix, or perhaps release some documentation on how HaplotypeCaller selects candidate haplotypes?

P.S. Any idea of when the source will be released to the public, or when a more comprehensive manual will be released? Would be very helpful for figuring out what is going on in cases like this.

Thanks!
Tom

Sensibility/sensitivity of VQSR processed VCF

Hi all,
I've read somewhere on this site that before VQSR the FP rate is expected to be around 10% (I guess for UnifiedGenotyper). Are there updated statistics for VQSR? For HaplotypeCaller? For exome/WG data?
Another thing: we apply VQSR in all our analyses, and we are trying to collect some validation statistics. We suspect that most of the FPs have particular "culprits" in VQSR (especially QD and MQ). Do you have any data about this?
Best

d

Haplotype Caller incorrectly calling Blocks of Variants Heterozygous

Hi

I seem to have found a bit of an issue with HaplotypeCaller. Looking at variants called with it, I've come across a number of small blocks in the genome where HaplotypeCaller has called every individual (50 individuals) either RA or RR, which seemed a bit odd considering the population.

Looking at the BAMs and the VCFs from SAMtools and the UnifiedGenotyper, these blocks of SNPs clearly contain all three states, as I'd expect: RR/RA/AA. In the BAM, the reads are of decent quality and have no nearby insertions or deletions to complicate things, and the variants have been called correctly by SAMtools and UG.

Any idea what's causing this? Attached is an IGV image showing one of the regions in question. The top VCF track is from HaplotypeCaller (showing all calls as RA or RR, which is incorrect); the second is from UG (showing a mix of RR/RA/AA, which is correct). The first BAM shows one of the animals that HC incorrectly calls RA for the 5 SNPs shown, while the second is an animal that HC correctly calls RA.

Note these incorrect calls from the HC also passed VQSR.
I believe the version of GATK is one of the 2.1 releases.
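
A comparison like the one described above can be made concrete by tallying genotype classes per site in each caller's VCF. The following is a minimal Python sketch, not part of GATK: the function name `tally_genotypes` is hypothetical, and it assumes a tab-separated multi-sample VCF data line with a GT entry in the FORMAT column.

```python
from collections import Counter

def tally_genotypes(vcf_line):
    """Count RR / RA / AA / missing calls in one multi-sample VCF data line."""
    fields = vcf_line.rstrip("\n").split("\t")
    gt_index = fields[8].split(":").index("GT")  # FORMAT is the 9th column
    counts = Counter()
    for sample in fields[9:]:
        # Treat phased ("|") and unphased ("/") genotypes alike.
        alleles = sample.split(":")[gt_index].replace("|", "/").split("/")
        if "." in alleles:
            counts["missing"] += 1
        elif all(a == "0" for a in alleles):
            counts["RR"] += 1
        elif all(a != "0" for a in alleles):
            counts["AA"] += 1  # no reference allele (also covers e.g. 1/2)
        else:
            counts["RA"] += 1
    return counts
```

Running this over the same positions in the HC and UG outputs would show, site by site, whether the AA class is absent only from the HC calls.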

UnifiedGenotyper, HaplotypeCaller and PhaseByTransmission

Hi to all

I have begun a variant analysis of 4 family-related exome-seq samples in which a pathology seems to be related to a polymorphism. I am wondering which variant calling tool is better to use, and whether applying PhaseByTransmission refinement is the correct approach. In the PhaseByTransmission analysis, does the read group that I assigned to the BAM file play a role in defining the relationships, or do I only need the PED file?

Best
Giuliano

Bug in HaplotypeCaller v2.4: "Reads are too small for use in assembly."

We have received reports of a bug occurring with HaplotypeCaller in v2.4, with the error message "Reads are too small for use in assembly." We are working to fix it.

In the meantime, if you encounter it too, please don't post a new discussion about it, but do post a comment on this announcement so that we can count how many people are affected.

Thank you for your patience and our apologies for the inconvenience!

UPDATE: This is fixed as of version 2.4-7.

Run time error during variant calling

Hi, I'm using the latest version of GATK to analyze paired-end exome sequencing data. I'd like to see the SNPs, indels and also SVs. I have followed the GATK workflow, from duplicate marking to the read reduction step. Everything went fine until I started to use the HaplotypeCaller walker for variant calling.
Command line I used:

java -jar $GATK/GenomeAnalysisTK.jar -T HaplotypeCaller -R human_g1k_v37.fa -I sample_reduced.bam -o sample_variant.vcf

At the beginning, it worked well, then I got the error message of "Reads are too small for use in assembly."
And I also tried the UnifiedGenotyper walker, command line:

java -jar $GATK/GenomeAnalysisTK.jar -T UnifiedGenotyper -R  human_g1k_v37.fa -I sample_reduced.bam -glm BOTH -o sample_variant.vcf

I got the error message "Read bases and read insertion quals aren't the same, size 46 vs. 49".
I have googled the error message but found no related results. Has anyone met the same problem? I'm eager to know how to solve this.
Thanks!


ArrayIndexOutOfBoundsException in HaplotypeCaller

Command was:
java -Xmx4g -jar GenomeAnalysisTK.jar -T HaplotypeCaller -R ref.fasta -I sample1.cleaned.sorted.rmdup.realigned.bam -o sample1.haplo.vcf

UnifiedGenotyper works with the same input

pao205@bio-masago:~/tmp/indel$ gatk -T HaplotypeCaller -R ramorum1.fasta -I 12475.cleaned.sorted.rmdup.realigned.bam -o 12475.haplo.vcf
INFO 16:55:17,314 HelpFormatter - --------------------------------------------------------------------------------
INFO 16:55:17,317 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.4-7-g5e89f01, Compiled 2013/03/06 01:01:28
INFO 16:55:17,318 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 16:55:17,318 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 16:55:17,323 HelpFormatter - Program Args: -T HaplotypeCaller -R ramorum1.fasta -I 12475.cleaned.sorted.rmdup.realigned.bam -o 12475.haplo.vcf
INFO 16:55:17,324 HelpFormatter - Date/Time: 2013/03/13 16:55:17
INFO 16:55:17,324 HelpFormatter - --------------------------------------------------------------------------------
INFO 16:55:17,324 HelpFormatter - --------------------------------------------------------------------------------
INFO 16:55:17,409 GenomeAnalysisEngine - Strictness is SILENT
INFO 16:55:17,935 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 250
INFO 16:55:17,944 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 16:55:18,041 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.09
INFO 16:55:18,785 GenomeAnalysisEngine - Creating shard strategy for 1 BAM files
INFO 16:55:18,834 GenomeAnalysisEngine - Done creating shard strategy
INFO 16:55:18,834 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 16:55:18,834 ProgressMeter - Location processed.active regions runtime per.1M.active regions completed total.runtime remaining
INFO 16:55:48,838 ProgressMeter - scaffold_1:36910 1.64e+04 30.0 s 30.5 m 0.1% 15.0 h 15.0 h
INFO 16:56:48,839 ProgressMeter - scaffold_1:53293 4.92e+04 90.0 s 30.5 m 0.1% 31.3 h 31.2 h
...
INFO 21:54:49,518 ProgressMeter - scaffold_11:611973 9.37e+06 5.0 h 32.0 m 14.1% 35.5 h 30.5 h
INFO 21:55:49,636 ProgressMeter - scaffold_12:400 9.46e+06 5.0 h 31.8 m 14.2% 35.3 h 30.3 h
INFO 21:56:38,764 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.ArrayIndexOutOfBoundsException: -1
at org.broadinstitute.sting.gatk.walkers.haplotypecaller.GenotypingEngine.generateVCsFromAlignment(GenotypingEngine.java:675)
at org.broadinstitute.sting.gatk.walkers.haplotypecaller.GenotypingEngine.assignGenotypeLikelihoods(GenotypingEngine.java:140)
at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:500)
at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:132)
at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.processActiveRegion(TraverseActiveRegions.java:552)
at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.processActiveRegions(TraverseActiveRegions.java:512)
at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:244)
at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:69)
at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:100)
at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:283)
at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.4-7-g5e89f01):
ERROR
ERROR Please visit the wiki to see if this is a known problem
ERROR If not, please post the error, with stack trace, to the GATK forum

HaplotypeCaller 2.4

I am getting the following error. What is the minimum read size for assembly? Are 50-base-pair reads too short?

ERROR stack trace

java.lang.IllegalStateException: Reads are too small for use in assembly.
at org.broadinstitute.sting.gatk.walkers.haplotypecaller.DeBruijnAssembler.createDeBruijnGraphs(DeBruijnAssembler.java:139)
at org.broadinstitute.sting.gatk.walkers.haplotypecaller.DeBruijnAssembler.runLocalAssembly(DeBruijnAssembler.java:123)
at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:483)
at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:132)
at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.processActiveRegion(TraverseActiveRegions.java:552)
at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.processActiveRegions(TraverseActiveRegions.java:512)
at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:244)
at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.ja
at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.j
at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:283
at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:1
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:24
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:15
at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)

ERROR ---------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.4-3-g2a7af43):
ERROR
ERROR Please visit the wiki to see if this is a known problem
ERROR If not, please post the error, with stack trace, to the GATK forum
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Reads are too small for use in assembly.
ERROR ---------------------------------------------------------------------------------------

Bug in HaplotypeCaller 2.4-7 : "Bad likelihoods detected"

Identification of all SNPs in a haplotype based on a single SNP

Dear all!

Is it possible to identify all SNPs in a haplotype using a single SNP?

For example, input file: list of SNPs. Output file: list of all SNPs correlated with each SNP in the input file.

Thanks in advance!

Sincerely,
aiglos
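
One common way to approach the question above is to compute pairwise linkage disequilibrium (r^2) between the query SNP and every other SNP, then keep those above a threshold. The following is a minimal Python sketch under stated assumptions: genotypes are already encoded as alternate-allele dosages (0, 1, 2, or None for missing), the statistic is the squared Pearson correlation of dosages (a genotype-level r^2, not a phased haplotype r^2), and the names `dosage_r2` and `correlated_snps` and the default threshold of 0.8 are hypothetical choices for illustration. It is not a GATK feature.

```python
def dosage_r2(x, y):
    """Squared Pearson correlation between two dosage lists (0, 1, 2 or None)."""
    # Use only samples genotyped at both SNPs.
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return float("nan")
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    if sxx == 0 or syy == 0:
        return float("nan")  # a monomorphic SNP has no defined correlation
    return (sxy * sxy) / (sxx * syy)

def correlated_snps(query, dosages, threshold=0.8):
    """Return IDs of SNPs whose dosage r^2 with `query` meets the threshold."""
    hits = []
    for snp_id, vec in dosages.items():
        if snp_id == query:
            continue
        r2 = dosage_r2(dosages[query], vec)
        if r2 == r2 and r2 >= threshold:  # r2 == r2 is False for NaN
            hits.append(snp_id)
    return hits
```

For real data sets, dedicated population-genetics tools are likely a better fit than a hand-rolled loop; the sketch only illustrates the input/output shape asked about (one SNP in, list of correlated SNPs out).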

Disagreement between HaplotypeCaller, VariantAnnotator, and ValidateVariants over a dbSNP annotation

I ran HaplotypeCaller, VariantAnnotator, and ValidateVariants on chr3 locations from a human tumor sample.

The HaplotypeCaller command line is:

gatk="/usr/local/gatk/GenomeAnalysisTK-2.2-8-gec077cd/GenomeAnalysisTK.jar"
#Fasta from the gz in the resource bundle
indx="/home/ref/ucsc.hg19.fasta" 
dbsnp="/fdb/GATK_resource_bundle/hg19-1.5/dbsnp_135.hg19.vcf"

java -Xms1g -Xmx2g -jar $gatk -R ${indx} -T HaplotypeCaller \
 -I chrom_bams/286T.chr3.bam \
 -o hapc_vcfs/286T.chr3.raw.vcf 

The VariantAnnotator command line is:

java -Xms1g -Xmx2g -jar $gatk -R ${indx} -T VariantAnnotator \
     --dbsnp $dbsnp  --alwaysAppendDbsnpId \
    -A BaseQualityRankSumTest -A DepthOfCoverage \
    -A FisherStrand -A HaplotypeScore -A InbreedingCoeff \
    -A MappingQualityRankSumTest -A MappingQualityZero -A QualByDepth \
    -A RMSMappingQuality -A ReadPosRankSumTest -A SpanningDeletions \
    -A TandemRepeatAnnotator \
    --variant:vcf hapc_vcfs/286T.chr3.raw.vcf \
    --out varanno_vcfs/286T.chr3.va.vcf

This all works nicely, but I go back and use ValidateVariants just to be sure:

java -Xms1g -Xmx2g -jar $gatk -R ${indx} -T ValidateVariants \
   --dbsnp ${dbsnp} \
   --variant:vcf varanno_vcfs/286T.chr3.va.vcf \
    1> report/ValidateVariants/286T.chr3.va.valid.out \
    2> report/ValidateVariants/286T.chr3.va.valid.err &

An issue arises with a rsID that is flagged as not being present in dbSNP.

...fails strict validation: the rsID rs67850374 for the record at position chr3:123022685 is not in dbSNP

I realize this is an error message that would not generally qualify as an issue to post to these forums; however, it is an error that seems to be generated by HaplotypeCaller, illuminated by VariantAnnotator, and caught by ValidateVariants.

The first 7 fields of the offending line in the 286T.chr3.va.vcf can be found using: cat 286T.chr3.va.vcf | grep rs67850374

chr3    123022685       rs67850374;rs72184829   AAAGAGAAGAGAAGAG        A       1865.98 .

There is a corresponding entry in the dbsnp_135.hg19.vcf file: cat $dbsnp | grep rs67850374

chr3    123022685       rs67850374;rs72184829   AA      A,AAAGAGAAGAG,AAAGAGAAGAGAAGAGAAGAG     .  PASS

My initial guess is that this is caused by a disagreement in the reference and variant fields between the two annotations. From what I can gather the call to the variantcontext function validateRSIDs() has a call to validateAlternateAlleles(). I assume this is what throws the error that is then caught and reported as "...fails strict validation..."

The UCSC genome browser for hg19 does show the specified position to be AA. It seems as though HaplotypeCaller simply used a different reference than dbSNP in this case.

The reference file supplied to HaplotypeCaller was the same as the one supplied to VariantAnnotator and ValidateVariants. I did not supply the dbsnp argument to HaplotypeCaller, as I planned on doing all annotations after the initial variant calling, and the documentation states that the information is not utilized in the calculations. It seems as though this is a difference between the reference assembly for dbSNP and the reference supplied by the resource bundle.

My questions are:

  1. Is this really a problem that arises from slightly different reference assemblies?
  2. Is the hg19-1.5 reference fasta different from any other hg19 reference fasta?
  3. Is there a tool that I have missed that would have prevented this error and allowed the pipeline to continue?
  4. Will this strict validation failure cause problems for the VariantRecalibrator?

As it stands, I am simply going to discard the offending lines manually. There are fewer than twenty in the entire exome sequencing of this particular tumor-normal pair. However, it seems likely that this issue will arise again. I will check the dbSNP VCF for places where the reference differs from the sequence in hg19. At least that should give me an estimate of the number of times this will arise and the locations to exclude from the variant calls.

-- Colin
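
The two records quoted in the post above can be compared directly by re-expressing both sets of ALT alleles against the longer of the two REF alleles. The following is a minimal Python sketch of that comparison. It is an illustration only: the function names are hypothetical, and whether ValidateVariants performs this exact check is an assumption, not something the post confirms.

```python
def alleles_on_common_ref(ref, alts, target_ref):
    """Re-express ALT alleles against a longer REF by appending the extra reference bases."""
    if not target_ref.startswith(ref):
        # The two records disagree about the reference bases themselves.
        raise ValueError("REF alleles are not consistent with one reference sequence")
    suffix = target_ref[len(ref):]
    return {alt + suffix for alt in alts}

def call_matches_dbsnp(call_ref, call_alts, db_ref, db_alts):
    """True if every called ALT is also a dbSNP ALT once both use the same REF."""
    longest = call_ref if len(call_ref) >= len(db_ref) else db_ref
    called = alleles_on_common_ref(call_ref, call_alts, longest)
    known = alleles_on_common_ref(db_ref, db_alts, longest)
    return called <= known

# Alleles copied from the two VCF lines quoted in the post.
result = call_matches_dbsnp(
    "AAAGAGAAGAGAAGAG", ["A"],
    "AA", ["A", "AAAGAGAAGAG", "AAAGAGAAGAGAAGAGAAGAG"],
)
```

With the alleles quoted above, no `ValueError` is raised (the dbSNP REF `AA` is a prefix of the called REF) and `result` is `False`. Under this sketch's assumptions, that would point to the called deletion being a different allele from those listed under rs67850374, rather than to a difference between reference assemblies.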
