Preparing data from paired-end FASTQ files

I have a very basic question about how to prepare data for Nextstrain. I have paired-end reads from around 100 influenza samples (i.e., files ending in _R1.fastq or _R2.fastq), and I am unsure how to turn them into consensus genomes as described here in the Nextstrain documentation.

Based on my understanding, I think I need to use Bioconductor or Samtools or something similar to align the paired-end reads to a reference genome and generate a consensus genome - is that correct? Do I do this separately for each sample's pair of read files?

Thanks, I appreciate any help getting started in the right direction.

Hi @maryj,

Consensus genomes are usually generated with a bioinformatics pipeline that chains several tools together (for example, read trimming, alignment to a reference, and consensus calling). INSaFLU can be a good platform for you to get started with your influenza samples.
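
If you would rather script it yourself, here is a rough Python sketch of the per-sample idea: each sample's R1/R2 files are processed together and yield one consensus per sample. It only pairs up the files and prints candidate commands; it does not run them. The `_R1.fastq`/`_R2.fastq` naming comes from your post, but the `reference.fasta` file name, the output file names, and the choice of `bwa mem` plus `samtools consensus` are my assumptions, not something Nextstrain prescribes. Check the options against your installed versions before running anything.

```python
import re
from pathlib import Path

# Naming convention taken from the question: <sample>_R1.fastq / <sample>_R2.fastq
# (an optional .gz suffix is also accepted; that part is an assumption).
PAIR_RE = re.compile(r"^(?P<sample>.+)_R(?P<read>[12])\.fastq(?:\.gz)?$")


def pair_fastqs(filenames):
    """Group FASTQ file names into {sample: (R1, R2)}; fail on unpaired samples."""
    found = {}
    for name in filenames:
        m = PAIR_RE.match(Path(name).name)
        if m is None:
            continue  # not a paired-end FASTQ under this naming convention
        found.setdefault(m["sample"], {})[m["read"]] = name
    incomplete = sorted(s for s, reads in found.items() if set(reads) != {"1", "2"})
    if incomplete:
        raise ValueError(f"samples missing a mate file: {incomplete}")
    return {s: (reads["1"], reads["2"]) for s, reads in sorted(found.items())}


def consensus_commands(sample, r1, r2, reference="reference.fasta"):
    """Return shell command strings for one sample (nothing is executed here).

    Assumes the reference (hypothetical file name) was already indexed
    with `bwa index`, and a samtools version that ships `consensus`.
    """
    bam = f"{sample}.sorted.bam"
    return [
        f"bwa mem {reference} {r1} {r2} | samtools sort -o {bam} -",
        f"samtools index {bam}",
        f"samtools consensus -f fasta -o {sample}.consensus.fasta {bam}",
    ]


if __name__ == "__main__":
    pairs = pair_fastqs(str(p) for p in Path(".").glob("*.fastq*"))
    for sample, (r1, r2) in pairs.items():
        print("\n".join(consensus_commands(sample, r1, r2)))
```

Raising an error for a sample with only one mate file is deliberate: with around 100 samples, a missing or misnamed file is easier to catch up front than after a long alignment run.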

Best,
Jover