First, thank you for this amazing tool and the accompanying documentation.
In my research, I aim to infer Avian Influenza transmission events for the period 2019-2021 from an outbreak dataset. To validate my proposed model, I would like to obtain some kind of ground truth for these transmission events. I thought that phylogenetic analysis could solve my validation problem, at least at a coarser level. This is how I came across the site NextStrain.org and the article “Nextstrain: real-time tracking of pathogen
I do not have any bioinformatics background. There are some interesting works on Avian Influenza transmission based on phylogenetic analysis (e.g. link, see also Fig 3 in this paper), but no code is provided.
So, for now, my aim is simply to rely on the four phylogenetic analyses for Avian Influenza that you have on NextStrain: H5N1, H5NX, H7N9, H9N2 (e.g. auspice). From these, I should be able to download the corresponding phylogenetic trees in Nexus format. Then, I can extract a transmission graph between locations by combining these four trees for a specific period (what you have called “transmission lines”).
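To make that last step concrete, here is a minimal sketch in Python of what I have in mind. It is only an illustration under assumptions: I assume each tree is available as nested dictionaries (JSON-like, rather than the Nexus file) in which every node stores an inferred location and a decimal-year date under `node_attrs` and its descendants under `children`. I have not verified that this matches what auspice exports, and the function names, the `region` and `num_date` keys, and the toy tree are all hypothetical.

```python
from collections import Counter


def attr(node, key):
    """Read an attribute stored either as {"value": x} or as a bare value."""
    value = node.get("node_attrs", {}).get(key)
    return value.get("value") if isinstance(value, dict) else value


def location_changes(tree, trait="region", start=None, end=None):
    """Count parent -> child location changes, optionally within a date window.

    The date of a change is taken to be the date of the child node
    (an assumption; the true change happened somewhere along the branch).
    """
    counts = Counter()
    stack = [tree]  # iterative traversal, so deep trees do not hit the recursion limit
    while stack:
        node = stack.pop()
        src = attr(node, trait)
        for child in node.get("children", []):
            stack.append(child)
            dst = attr(child, trait)
            if src is None or dst is None or src == dst:
                continue
            date = attr(child, "num_date")
            if start is not None and (date is None or date < start):
                continue
            if end is not None and (date is None or date > end):
                continue
            counts[(src, dst)] += 1
    return counts


# Hypothetical toy tree, only to show the expected structure.
toy_tree = {
    "node_attrs": {"region": {"value": "China"}, "num_date": {"value": 2019.0}},
    "children": [
        {
            "node_attrs": {"region": {"value": "China"}, "num_date": {"value": 2019.3}},
            "children": [
                {"node_attrs": {"region": {"value": "Southeast Asia"}, "num_date": {"value": 2020.1}}},
                {"node_attrs": {"region": {"value": "China"}, "num_date": {"value": 2019.5}}},
            ],
        },
        {
            "node_attrs": {"region": {"value": "Europe"}, "num_date": {"value": 2019.8}},
            "children": [
                {"node_attrs": {"region": {"value": "Europe"}, "num_date": {"value": 2020.5}}},
                {"node_attrs": {"region": {"value": "Africa"}, "num_date": {"value": 2021.2}}},
            ],
        },
    ],
}

all_changes = location_changes(toy_tree)
recent_changes = location_changes(toy_tree, start=2020.0, end=2022.0)
```

The counters from the four subtypes could then be added together (`Counter` objects support `+`) to give a single weighted edge list between locations.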
I would like to ask a couple of questions:
- In auspice, we see this information: “Showing 2103 of 2103 genomes sampled between Dec 1996 and Mar 2022.” How can I obtain the sources of the genome data used in these four analyses? For instance, are you using GitHub repositories of genomic data?
- Unless I am mistaken, when I downloaded the phylogenetic trees in Nexus format, I could not see any probability-related values on the edges between nodes, which would indicate Bayes Factor results. Is there a way to get these values?
- Would you have any recommendations regarding my objective of obtaining a transmission event dataset? If you can point me to other works or tools (in R or Python) on Avian Influenza, I would be grateful.