Too high divergence

Mike.Lloyd · January 13, 2021, 7:37pm

Can someone please point me to the setting for, or an explanation of, the sequences placed into flagged-sequences.tsv? Specifically, what does ‘too high divergence’ mean?

I am seeing some samples excluded with messages like "too high divergence 15.23>15", or with similar values that fall close to the threshold of 15, and I’d like to look into raising the threshold slightly above 15.

Thanks

rneher · January 23, 2021, 1:24pm

Yes, these parameters are hard coded in scripts/diagnostic.py (not ideal, I know).

We recently relaxed these numbers, as the old values were too strict for the increasing diversity accumulating globally.

reuns · February 1, 2021, 10:15am

Hi, can you share a copy of some version of flagged-sequences.tsv and related files? For example, I’d like to count the number of Belgian sequences and possibly see the flagging messages. Thank you.

reuns · February 1, 2021, 12:12pm

The ‘excess divergence’ field seems to be the most interesting one for excluding sequences that sit on an isolated long branch in the tree, but it is unclear whether it is used, and how.

rneher · February 28, 2021, 7:05pm

We are filtering on excess divergence. If its absolute value is too large, the sequence gets added to the exclude file and will then be dropped.
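To make the rule concrete, here is a minimal sketch of a filter on the absolute value of excess divergence. It is not the code in scripts/diagnostic.py: the function name, the dictionary input, and the sample names are hypothetical, and the threshold of 15 is only borrowed from the message quoted at the top of the thread, which may refer to a different check.

```python
def flag_by_excess_divergence(excess_divergence, threshold=15.0):
    """Return the names whose |excess divergence| exceeds the threshold.

    excess_divergence: dict mapping sequence name -> excess divergence
    threshold: hypothetical cutoff; the real value lives in the script
    """
    return sorted(
        name
        for name, value in excess_divergence.items()
        if abs(value) > threshold
    )


# Made-up values for illustration only.
samples = {"sample_A": 15.23, "sample_B": 3.1, "sample_C": -16.0}
to_exclude = flag_by_excess_divergence(samples)
print(to_exclude)
```

Because the comparison uses the absolute value, a sequence with far fewer mutations than expected (a large negative value) is flagged in the same way as one with far too many.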
