Contextual strain list from augur filter

yaotli · May 6, 2022, 10:17am

Hi all,

I recently found that most of the contextual sequences were dropped in the combine_samples step. I traced this back to the subsample step, which generates the contextual strain list (e.g. sample-global.txt), and found that most of the sampled strains were not in "priorities_country.tsv", "proximity_country.tsv" or "combined_sequences_for_subsampling.fasta". How can these strains end up in the strain list if they are in neither the .fasta file nor the priority-related files?
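In case it helps anyone reproduce the comparison, here is a minimal sketch of the check in Python. It assumes the strain list has one name per line, that the priorities file is a tab-separated file with the strain name in the first column, and that each FASTA header starts with the strain name. The function names and the script name are hypothetical and not part of Augur or the workflow.

```python
import csv
import sys


def read_strain_list(path):
    """Return the set of strain names from a one-name-per-line text file."""
    with open(path, encoding="utf-8") as handle:
        return {line.strip() for line in handle if line.strip()}


def read_tsv_first_column(path):
    """Return the set of values in the first column of a tab-separated file.

    Assumption: the strain name is in the first column and there is no header row.
    """
    with open(path, encoding="utf-8", newline="") as handle:
        return {row[0] for row in csv.reader(handle, delimiter="\t") if row}


def read_fasta_names(path):
    """Return the set of sequence names found in the FASTA header lines."""
    names = set()
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.startswith(">"):
                header = line[1:].strip()
                if header:
                    # Assumption: the name is the first whitespace-delimited token.
                    names.add(header.split()[0])
    return names


def missing_strains(sampled, reference):
    """Return the sorted sampled strains that are absent from the reference set."""
    return sorted(set(sampled) - set(reference))


# Hypothetical usage:
#   python check_strains.py sample-global.txt priorities_country.tsv \
#       combined_sequences_for_subsampling.fasta
if __name__ == "__main__" and len(sys.argv) == 4:
    sampled = read_strain_list(sys.argv[1])
    not_prioritized = missing_strains(sampled, read_tsv_first_column(sys.argv[2]))
    not_in_fasta = missing_strains(sampled, read_fasta_names(sys.argv[3]))
    print(f"{len(sampled)} sampled strains")
    print(f"{len(not_prioritized)} not in the priorities file")
    print(f"{len(not_in_fasta)} not in the FASTA file")
```

Comparing sets rather than counts makes it easy to print the actual missing names if needed.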

I compared this with the subsample step in the previous version of the workflow, which seems to have taken sequence data into account at this step. I don't know whether I need to change the rule manually to fix this, or whether the problem has another cause.

I have pasted the full command for the subsampling step below:

augur filter \
    --metadata results/combined_metadata.tsv.xz \
    --include defaults/include.txt \
    --exclude defaults/exclude.txt \
    --min-date '2021-01-01' \
    --exclude-where 'country=XX' \
    --priority results/XX_region/priorities_country.tsv \
    --group-by year month \
    --subsample-max-sequences 2000 \
    --output-strains results/XX_region/sample-global.txt

Thanks!
