I have 300 gVCFs produced by a Queue pipeline that I want to combine. However, when I run CombineGVCFs (GATK v3.1-1), it seems fairly slow:
INFO 15:24:22,100 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining
INFO 15:57:52,778 ProgressMeter - 1:11456201 1.10e+07 33.5 m 3.0 m 0.4% 6.4 d 6.3 d
INFO 15:58:52,780 ProgressMeter - 1:11805001 1.10e+07 34.5 m 3.1 m 0.4% 6.4 d 6.3 d
INFO 15:59:52,781 ProgressMeter - 1:12140201 1.20e+07 35.5 m 3.0 m 0.4% 6.4 d 6.3 d
Is there a way to improve the performance of this merge? Six days seems like a lot, though of course not infeasible. Likewise, what kind of performance can I expect from the GenotypeGVCFs step?