Running local samples in global background

nassad · July 31, 2020, 7:27pm

Hi All,

I am trying to run a few hundred samples through Nextstrain in a global background from GISAID. I list the sample names in the “include.txt” file, and the log files show them being added back in. The samples show up in the sequence-diagnostics.tsv output and each has an associated align_#.txt, but none of the JSON files seem to include the samples. The log files do not indicate that they were filtered out. The samples all have >99% complete genomes, and the metadata was carefully defined to match the standards.

Can you help me figure out why the samples don’t appear in the json output files?

Thank you!
Nima

rneher · August 1, 2020, 9:43am

Hi Nima,
What exact pipeline/snakefile are you running? In many of our runs, there are two filtering steps: one to align all sensible sequences, and a later one to pick a relevant subsample. Could it be that you only force inclusion of your samples in the first step and not the second?

best,
richard
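
One way to confirm whether forced-include samples were lost between the two filtering steps is to compare the include list against the tips of the final tree. The following is a minimal sketch, not part of the workflow itself. It assumes that include.txt holds one strain name per line (with optional `#` comments) and that the output is an Auspice v2 dataset JSON with a top-level "tree" key. The function names are hypothetical helpers introduced only for this illustration.

```python
import json


def read_strain_list(text):
    """Parse one strain name per line, skipping blank lines and '#' comments.

    Assumption: this mirrors the include.txt layout used in the thread.
    """
    strains = set()
    for line in text.splitlines():
        name = line.split("#", 1)[0].strip()
        if name:
            strains.add(name)
    return strains


def tip_names(auspice):
    """Collect leaf names from an Auspice v2 dataset dict.

    Assumption: auspice["tree"] is a node dict (or a list of node dicts)
    where internal nodes carry a non-empty "children" list.
    """
    tree = auspice["tree"]
    stack = list(tree) if isinstance(tree, list) else [tree]
    tips = set()
    while stack:  # iterative walk, so deep trees do not hit recursion limits
        node = stack.pop()
        children = node.get("children") or []
        if children:
            stack.extend(children)
        else:
            tips.add(node["name"])
    return tips


def missing_from_build(include_text, auspice):
    """Return the sorted strain names that were requested but are not tips."""
    return sorted(read_strain_list(include_text) - tip_names(auspice))


def check_files(include_path, json_path):
    """Convenience wrapper that reads both files from disk (paths are up to you)."""
    with open(include_path) as fh:
        include_text = fh.read()
    with open(json_path) as fh:
        auspice = json.load(fh)
    return missing_from_build(include_text, auspice)
```

Calling `check_files` with the path to include.txt and the path to one of the JSON files in the auspice output directory returns the names that never reached the tree. If every local sample is listed, they were most likely dropped at the subsampling step rather than at the initial filter.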

nassad · August 3, 2020, 4:28pm

Hi Richard,

I do not believe I am forcing the inclusion of the samples in the second step. I am running everything with the defaults from the SARS-CoV-2 tutorial, aside from using only the global build and listing my samples in include.txt.

My snakemake call looks like: snakemake --profile my_profiles/modified_example -p

I am looking at the main_workflow.smk, but I do not readily see how I can force the inclusion of my samples in the subsampling. Can you guide me through this or point me to a page where it is described?

Thank you!
Nima

nassad · August 3, 2020, 4:55pm

I see now that I have to make additional changes according to:

I will spend more time with the advanced customization example and will come back if I have questions!

Thank you,
Nima
