Ncov: how to exclude samples with bad Nextclade QC?

corneliusroemer · March 19, 2022, 12:43pm

This is copied from a Q&A via email, for general visibility in case anyone has similar questions:

Q: Can the Nextclade-generated QC fields (e.g., QC_overall_status) be used as filters or query/exclude values in a subsampling scheme? Or are those values generated after subsampling? If so, is there a way to filter on those fields after subsampling?

In the default workflow, we only run Nextclade on the subsampled data (to produce both alignments and QC annotations), so you can’t use these values in a subsampling scheme. We filter on these fields after subsampling through the diagnostic rule that flags low-quality data for exclusion.

After aligning the subsampled data, a final filter rule has access to the Nextclade QC metrics through the metadata. You could use the “exclude_where” parameter to apply filters based on the QC values. This parameter passes through to augur filter’s --exclude-where argument, which is not as full-featured as the --query argument, but you can do simple rules like this:

filter:
    exclude_where: division='USA' QC_overall_status='bad'
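
To make the behaviour of that rule concrete: as far as I understand it, multiple --exclude-where conditions are combined with OR, so the example above drops every record from division USA and also every record with a bad overall QC status, not only records that match both. Below is a rough Python sketch of that logic. The function name is_excluded is made up for illustration, and the exact string comparison is a simplification, so treat it as an approximation of what augur filter does rather than its actual implementation.

```python
def is_excluded(record: dict, conditions: list[str]) -> bool:
    """Return True if the record matches ANY 'field=value' condition.

    Hypothetical helper: only mimics OR-combined equality conditions,
    using exact string comparison as a simplifying assumption.
    """
    for condition in conditions:
        field, value = condition.split("=", 1)
        if str(record.get(field, "")) == value:
            return True
    return False


conditions = ["division=USA", "QC_overall_status=bad"]

records = [
    {"strain": "A", "division": "USA", "QC_overall_status": "good"},
    {"strain": "B", "division": "Ontario", "QC_overall_status": "bad"},
    {"strain": "C", "division": "Ontario", "QC_overall_status": "good"},
]

# Keep only the records that match none of the conditions.
kept = [r["strain"] for r in records if not is_excluded(r, conditions)]
```

With these three made-up records, only strain C survives, because A matches the division condition and B matches the QC condition.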

You could also drop in a replacement “filter” rule with the ruleorder trick you’ve been using, to get more control over that final filter step.

Or, you could run Nextclade on the full input data and annotate your metadata with the resulting QC columns prior to running the workflow. Then you could reference those QC columns from subsampling rules just like any other metadata.
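
For that last option, the annotation step could look roughly like the sketch below. It assumes the Nextclade TSV output has a seqName column and a qc.overallStatus column, that your metadata is keyed by a strain column, and that you want the new column to be called QC_overall_status. The function name annotate_metadata is invented for this example, so adjust all of these names to match your own files.

```python
import csv
import io


def annotate_metadata(
    metadata_tsv: str,
    nextclade_tsv: str,
    id_column: str = "strain",            # assumed metadata key column
    nextclade_id: str = "seqName",        # assumed Nextclade name column
    qc_source: str = "qc.overallStatus",  # assumed Nextclade QC column
    qc_target: str = "QC_overall_status",
) -> str:
    """Return metadata TSV text with a QC column copied from Nextclade output."""
    # Map sequence name -> QC status from the Nextclade table.
    qc_by_name = {
        row[nextclade_id]: row[qc_source]
        for row in csv.DictReader(io.StringIO(nextclade_tsv), delimiter="\t")
    }

    reader = csv.DictReader(io.StringIO(metadata_tsv), delimiter="\t")
    fieldnames = list(reader.fieldnames or [])
    if qc_target not in fieldnames:
        fieldnames.append(qc_target)

    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # Leave the field empty when Nextclade has no result for this strain.
        row[qc_target] = qc_by_name.get(row[id_column], "")
        writer.writerow(row)
    return out.getvalue()
```

You would read your real metadata and Nextclade TSV files into strings, pass them to this function, and write the result back out as the metadata file the workflow consumes.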

Do any of these solutions sound like they’d work for what you’re trying to do?
