Prepped Aphaenogaster samples for shipping to BMGC for sequencing.
After correcting velvet assembler to specify that data format was fastq (not fasta), de novo assembly went quickly, ~ 1 hr run time!
Found online tool velvet advisor to help with k-mer selection. Based on my inputs of 12.5 million reads, paired-end, each read 60bp long, estimated genome size of 9 Mbp, recommended k=53. For 30 fold k-mer coverage, recommended k=51
Results for different k-mers. Assembly length is for contigs > 999 bp
k-mer | num_contigs | 50% coverage quantile | assembly length contigs >999 (Mbp) |
---|---|---|---|
21 | 13,853 | 88.2 | |
31 | 3,065 | 90 | |
41 | 1,695 | 71 | |
53 | 1,946 | 45 | 8.16e6 |
velvetOptimiser looks like a nice defensible way to go about selecting k-mer…but not implemented in Galaxy.
Another good reason to (finally!) migrate to command-line.
This work is licensed under a Creative Commons Attribution 4.0 International License.