Perform analysis merging my dataset and South America dataset, without subsampling

fhsantanna · March 3, 2021, 11:06am

Hi.
I have a dataset of 300 local sequences. I would like to run a phylogenetic analysis using the South America dataset (from auspice) as a reference. I believe subsampling is unnecessary here, because the reference dataset has already been subsampled, hasn’t it?
How can I bypass the subsampling step of the snakemake pipeline?

Thank you.

fhsantanna · March 3, 2021, 1:45pm

Solved. I added every strain from the FASTA to the file “include.txt”.
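For anyone who wants to do the same, here is a minimal sketch of how the include list could be generated from a FASTA file. It assumes the strain name is the first whitespace-delimited token of each header line and that those names match the strain column of your metadata. The function names are hypothetical helpers for illustration, not part of the ncov workflow.

```python
from pathlib import Path


def fasta_strain_names(fasta_path):
    """Return the strain name from every header line of a FASTA file.

    Assumption: the name is the first whitespace-delimited token
    after the leading '>' character.
    """
    names = []
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                tokens = line[1:].split()
                if tokens:  # skip empty headers such as a bare '>'
                    names.append(tokens[0])
    return names


def write_include_file(fasta_path, include_path):
    """Write one strain name per line to include_path and return the names."""
    names = fasta_strain_names(fasta_path)
    Path(include_path).write_text("".join(f"{name}\n" for name in names))
    return names
```

Calling `write_include_file("my_sequences.fasta", "include.txt")` (both paths are placeholders) would write one name per line. Point the paths at wherever your build actually reads its sequences and include list.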

james · March 8, 2021, 2:01am

Nice work. Currently the subsampling steps are rather integral to the ncov pipeline, so skipping them would require a few modifications to the Snakefile. They can, however, be tricked into not removing any sequences, either by adding all the samples to the include list (as you’ve done) or by setting “dummy” group_by and sequences_per_group variables (e.g. “country” and “1000000”, respectively).
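To show why the “dummy” values work, here is a toy illustration of per-group capping. It is not the pipeline's actual code: the real filtering step chooses which sequences to keep differently, while this sketch simply keeps the first ones it sees. The function name and record layout are made up for the example.

```python
from collections import defaultdict


def cap_per_group(records, group_by, sequences_per_group):
    """Keep at most sequences_per_group records for each value of group_by.

    Toy model only: records are dicts, and the first records seen
    in each group are the ones kept.
    """
    kept = []
    counts = defaultdict(int)
    for record in records:
        key = record.get(group_by, "unknown")
        if counts[key] < sequences_per_group:
            counts[key] += 1
            kept.append(record)
    return kept


# Hypothetical records for illustration.
records = [
    {"strain": "s1", "country": "Brazil"},
    {"strain": "s2", "country": "Brazil"},
    {"strain": "s3", "country": "Brazil"},
    {"strain": "s4", "country": "Peru"},
]

# A cap far larger than any group removes nothing.
all_kept = cap_per_group(records, "country", 1000000)

# A small cap actually subsamples.
one_each = cap_per_group(records, "country", 1)
```

With a cap of 1000000 no group ever reaches the limit, so every record passes through, which is the effect described above.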

quietjen · February 24, 2022, 8:53pm

Just heard about setting skip_diagnostics: true in the build configuration file to skip filtering strains by quality control metrics.

Change Log — SARS-CoV-2 Workflow documentation
