Inconsistencies in the result of augur traits

juan_dc · May 8, 2023, 4:27pm

I have been running augur traits on different subsamples of a dataset. My base dataset has 20,000 sequences, and each run takes about 4,000 sequences on average from it using the group_by option. However, when I view the results in Auspice, the inferred transmission events vary between runs.

As an example, I have regions A, B, C and D:
First repetition infers the first transmission event of variant X from A to B
Second repetition infers the first transmission event of variant X from B to C
Third repetition infers the first transmission event of variant X from D to C

I know from the literature that the starting point is region B, and in all executions the first sequences of the variant are found in region B.

Taking this result into account, can I assume that it is not possible to infer the initial transmission events of the variants using Nextstrain? Or am I making a mistake?

In my configuration I use group_by (division year month), and in augur traits I use a sampling bias correction of 2.5. Reviewing the sequences of each run, I see that they are selected evenly across all regions.

corneliusroemer · May 8, 2023, 5:03pm

Hi Juan! Inference of ancestral states is inherently a statistical problem. If you have different trees, you can get different results. TreeTime uses maximum likelihood methods, which give only limited information about how likely a particular transmission path is compared to all the other possible paths.

If different runs give different results it just means that all these paths are compatible with the data. Does that make sense?
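One rough way to make this concrete is to tally the inferred first transmission event across the subsampled runs and read the frequencies as an informal indicator of support. The following is only a minimal Python sketch: the function name and the hand-recorded list of (source, destination) pairs are hypothetical and not part of augur or TreeTime.

```python
from collections import Counter


def tally_first_events(events):
    """Return the fraction of runs that inferred each (source, destination) pair."""
    counts = Counter(events)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.most_common()}


# Hypothetical first events recorded from three subsampled runs,
# mirroring the example in the question (regions A, B, C, D).
runs = [("A", "B"), ("B", "C"), ("D", "C")]
support = tally_first_events(runs)

for (src, dst), frac in support.items():
    print(f"{src} -> {dst}: {frac:.2f}")
```

With only three runs that all disagree, no single path stands out, which is consistent with the data not distinguishing between them. Many more replicates would be needed before the frequencies say much.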

See: Inference of transition between discrete characters and ‘mugration’ models — TreeTime 0.10.0 documentation
and maybe also: TreeTime: Maximum-likelihood phylodynamic analysis | Virus Evolution | Oxford Academic for a deep dive
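If you run augur traits with its `--confidence` flag, the output node data should also carry the probabilities of the alternative states for each node, which can help to spot nodes where the top state is only weakly supported. The sketch below assumes that the node-data JSON stores these under a `<trait>_confidence` key per node; that layout, the helper name, and the example values are assumptions for illustration, so check them against your own output file.

```python
def ambiguous_nodes(node_data, trait="division", threshold=0.9):
    """List (node, best_state, probability) for nodes whose top state is below threshold."""
    flagged = []
    for name, attrs in node_data.get("nodes", {}).items():
        # Assumed layout: {"division_confidence": {"B": 0.45, "A": 0.35, ...}}
        conf = attrs.get(f"{trait}_confidence")
        if not conf:
            continue  # no confidence recorded for this node
        best_state, prob = max(conf.items(), key=lambda kv: kv[1])
        if prob < threshold:
            flagged.append((name, best_state, prob))
    return flagged


# Hypothetical parsed node data (e.g. from json.load on the traits output).
example = {
    "nodes": {
        "NODE_0000001": {
            "division": "B",
            "division_confidence": {"B": 0.45, "A": 0.35, "D": 0.20},
        },
        "NODE_0000002": {
            "division": "C",
            "division_confidence": {"C": 0.97, "B": 0.03},
        },
    }
}

flagged = ambiguous_nodes(example)
print(flagged)
```

Nodes that get flagged this way are the ones where different subsamples are most likely to flip between regions.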
