Seems like I’m a bit behind the times with MUMmer. Better options for multiple genome alignment include
But first, mapping-based assembly to B. japonicum genome using bowtie2.
Fairly easy to understand…got script running quickly!
#!/bin/bash -l
#PBS -l nodes=1:ppn=1,mem=24gb,walltime=6:00:00
#PBS -m abe
cd /home/tiffinp/stanton1/B-elkanii-genomes/bowtie2-assemblies
# Set variables
refgenome=Bjaponicum_USDA110_ref_genome.fasta
readdir="/home/tiffinp/stanton1/B-elkanii-genomes/fastq-quality-trimmed"
# Map trimmed paired-end reads to the Bradyrhizobium japonicum reference genome
# Build the bowtie2 index
bowtie2-build "$refgenome" Bjaponicum
# Run bowtie2 for each sample
for sample in IC1 ENC4 EWC3
do
    # Build read paths from the sample name so each iteration maps its own reads
    # (assumes the ENC4 and EWC3 files follow the same naming pattern as IC1)
    read1="${readdir}/${sample}_R1_seqtk_trimmed_clipped_stillpaired.fastq"
    read2="${readdir}/${sample}_R2_seqtk_trimmed_clipped_stillpaired.fastq"
    bowtie2 -x Bjaponicum -1 "$read1" -2 "$read2" --phred33 -S "BT2-${sample}.sam" --un "${sample}-unaligned"
done
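Spot-checked SAM output - looks good, but unaligned reads are not being saved to a separate file. This is probably because --un only writes unpaired reads that fail to align; for paired-end input, bowtie2 has --un-conc for pairs that fail to align concordantly. Recommendation to remove unmapped reads using: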
awk '$3!="*"' in.sam > in.filt.sam
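This keeps every line whose 3rd column (the reference name) is not *, which drops the unmapped reads. To pull the unmapped reads into their own file instead (note this also drops the @ header lines), can do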
awk '$3=="*"' in.sam > in.unaligned.sam
And remove original file.
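One caveat with the 3rd-column test: in paired-end output, an unmapped read whose mate did align is usually given the mate's reference name, so it would not show * in column 3. A rough sketch of a flag-based split instead, relying on the unmapped bit (0x4) in column 2 as described in the SAM specification. The function name split_sam and the output file names are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: split a SAM file into mapped and unmapped records
# using the 0x4 (read unmapped) bit of the FLAG column (column 2).
split_sam() {
    local in_sam=$1 mapped_out=$2 unmapped_out=$3
    awk -v mapped="$mapped_out" -v unmapped="$unmapped_out" '
        BEGIN { FS = "\t" }
        # Header lines start with "@"; copy them to both outputs
        /^@/ { print > mapped; print > unmapped; next }
        # int(flag / 4) % 2 is 1 when the 0x4 bit is set (portable, no gawk needed)
        int($2 / 4) % 2 == 1 { print > unmapped; next }
        # Everything else is a mapped record
        { print > mapped }
    ' "$in_sam"
}
```

For example, split_sam BT2-IC1.sam BT2-IC1.mapped.sam BT2-IC1.unmapped.sam would leave the original file untouched, so it can be removed once the two outputs have been checked.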
Picard. Explain SAM flags
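The FLAG value in column 2 is a sum of bit values, and decoding it is just a matter of testing each bit. A rough sketch of that idea in bash, using the bit meanings from the SAM specification; explain_flag is a made-up name for this note, not a Picard command.

```shell
#!/bin/bash
# Hypothetical helper: print the value and meaning of each bit set in a SAM FLAG.
explain_flag() {
    local flag=$1 i
    # Index i in this array corresponds to bit value 2^i
    local names=(
        "read paired"
        "read mapped in proper pair"
        "read unmapped"
        "mate unmapped"
        "read reverse strand"
        "mate reverse strand"
        "first in pair"
        "second in pair"
        "not primary alignment"
        "read fails quality checks"
        "PCR or optical duplicate"
        "supplementary alignment"
    )
    for i in "${!names[@]}"; do
        # Test bit i with a bitwise AND
        if (( flag & (1 << i) )); then
            printf '%d\t%s\n' $(( 1 << i )) "${names[$i]}"
        fi
    done
}
```

Calling explain_flag 77 lists the bits 1, 4, 8 and 64: a paired read, unmapped, with its mate also unmapped, first in pair.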
Meeting with NG.
Useful git reminders from software carpentry git tutorial
This work is licensed under a Creative Commons Attribution 4.0 International License.