Perform analysis merging my dataset and South America dataset, without subsampling

Hi.
I have a dataset of 300 local sequences. I would like to run a phylogenetic analysis using the South America dataset (from auspice) as a reference. For this purpose, I believe subsampling is not necessary, because that dataset has already been subsampled, right?
How could I bypass the subsampling step of the snakemake pipeline?

Thank you.

Solved. I added all the strains from my FASTA file to “include.txt”.
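
For anyone wanting to do the same, here is a minimal sketch of how the strain names could be pulled out of a FASTA file and appended to the include list. It is not part of the ncov pipeline; the function names and the file paths in the usage note are hypothetical, and it assumes the strain name is the header text up to the first whitespace.

```python
from typing import Iterable, List


def strain_names(fasta_lines: Iterable[str]) -> List[str]:
    """Collect strain names from FASTA header lines, in order, skipping duplicates."""
    seen = set()
    names = []
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        header = line[1:].strip()
        if not header:
            continue  # ignore empty headers
        # Assumption: the strain name is everything up to the first whitespace.
        name = header.split()[0]
        if name not in seen:
            seen.add(name)
            names.append(name)
    return names


def write_include_file(fasta_path: str, include_path: str) -> int:
    """Append every strain name found in fasta_path to include_path.

    Assumes the include file is plain text with one strain name per line
    and, if it already exists, that it ends with a newline.
    """
    with open(fasta_path) as fasta:
        names = strain_names(fasta)
    with open(include_path, "a") as out:
        for name in names:
            out.write(name + "\n")
    return len(names)
```

Calling something like `write_include_file("my_sequences.fasta", "include.txt")` would then add one line per strain; adjust both paths to wherever your sequences and your pipeline's include file actually live.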

Nice work. The subsampling steps are currently rather integral to the ncov pipeline, so skipping them outright would require a few modifications to the Snakefile. They can, however, be tricked into keeping every sequence, either by adding all the samples to the include list (as you’ve done) or by setting “dummy” group_by and sequences_per_group values (e.g. “country” and “1000000”, respectively).

I just heard about setting skip_diagnostics: true in the build configuration file, which skips filtering strains by quality control metrics.