Chikungunya Virus, 250 years old?

Hi, there:

I downloaded the high-depth sequence data for 4,590 Chikungunya virus sequences from GISAID and ran a Nextstrain analysis. Please see the screenshot below. I am surprised to see that the inferred root of the phylogenetic tree is placed around the year 1777, roughly 250 years ago.

That does not seem correct to me. I have put the FASTA file, the metadata file, the Snakefile and the config file at this GitHub link: 001/analysis at master · jielab/001 · GitHub. Could you please take a look at your convenience and let me know if I somehow misused Nextstrain?

Thank you very much!

Best regards,

Jie

Hi Jie,

Time-scaled phylogenetic analysis can be very sensitive to the quality of the temporal signal in the data. If the molecular clock rate and the location of the root cannot be reliably inferred, time-scaled trees can be quite far from the true timing. I’d advise investigating the temporal signal and, if necessary, specifying the root and rate explicitly.
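One common way to check the temporal signal is a root-to-tip regression: plot each tip's divergence from the root against its sampling date and fit a straight line. The slope is a rough clock rate, the x-intercept is a rough root date, and a low R² suggests the dates cannot constrain the clock. The sketch below is only an illustration under stated assumptions: the function name `root_to_tip_regression` and the toy numbers are hypothetical, and real root-to-tip distances would have to come from a rooted divergence tree.

```python
from statistics import mean


def root_to_tip_regression(dates, distances):
    """Ordinary least-squares fit of root-to-tip distance against sampling date.

    dates:     sampling dates as decimal years (e.g. 2014.5)
    distances: root-to-tip divergence in substitutions per site
    Returns the slope (clock rate), the x-intercept (root date) and R^2.
    """
    if len(dates) != len(distances) or len(dates) < 3:
        raise ValueError("need at least three (date, distance) pairs")

    mx, my = mean(dates), mean(distances)
    sxx = sum((x - mx) ** 2 for x in dates)
    syy = sum((y - my) ** 2 for y in distances)
    sxy = sum((x - mx) * (y - my) for x, y in zip(dates, distances))
    if sxx == 0 or syy == 0:
        raise ValueError("dates and distances must both vary")

    rate = sxy / sxx                 # substitutions per site per year
    intercept = my - rate * mx       # divergence extrapolated to year 0
    r2 = (sxy * sxy) / (sxx * syy)   # fraction of variance explained
    # A non-positive slope means there is no usable temporal signal,
    # so no root date is reported in that case.
    root_date = -intercept / rate if rate > 0 else None
    return {"rate": rate, "root_date": root_date, "r2": r2}


# Toy, made-up values (NOT Chikungunya data), chosen to lie on a perfect line.
toy_dates = [2006.0, 2010.0, 2014.0, 2018.0, 2022.0]
toy_distances = [0.028, 0.030, 0.032, 0.034, 0.036]
fit = root_to_tip_regression(toy_dates, toy_distances)
print(fit)
```

If the fit is poor or the slope is implausible, `augur refine` accepts `--clock-rate` and `--root`, so both can be fixed to literature-based values instead of being inferred from the data.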

richard

Dear Richard:

Thank you very much for your reply.

As you might know, the Chikungunya virus is reported to affect a large part of the global population, and mosquitoes will never go away. It is a pity that much of the software built for COVID-19, including PANGO, may become outdated.

It would be great if you guys could make another example using Chikungunya virus data, in addition to the current zika_tutorial. GISAID has a lot of Chikungunya virus sequences available for download.

In such a tutorial or case study, it would be great if you guys could show the broad users of your distinguished software platform:

:red_paper_lantern:1. How to investigate the temporal signal and if necessary specify root and rate explicitly, as you mentioned?

:red_paper_lantern:2. How to detect blockbuster genotypes / clades such as East/Central/South African (ECSA), West African, Asian clades? Can Nextstrain run analyses that were previously done by PANGO?

BTW, three quick technical questions:

:star:1. I could run with nextstrain view, but not auspice view. Why?

:star:2. How can I make the Play button in the top left panel go slower? Right now, the phylogenetic tree that I generated spans the years 1700 to 2025, so I need the animation to slow down when it reaches the years between 2020 and 2025.

:star:3. I assume that Nextclade is a simpler way to run Nextstrain by simply dragging files into a web interface. However, I got the error “Suggestion algorithm was unable to find a dataset suitable for your sequences” when I uploaded my .fasta file to it. Please see the screenshot below.

Your help is greatly appreciated.

Best regards,

Jie

Hi Jie @jiehuang001

Nextclade is not a simpler way to run Nextstrain. These are different tools (a conventional Nextstrain pipeline being a set of multiple tools) for different purposes, although they sometimes look similar (e.g. both nextstrain.org and Nextclade include Auspice visualization) and are often used together.

To better understand how the Nextstrain ecosystem works, please check out the documentation. You will find documentation for all tools, including Nextclade, in the “components” section. In particular, note that Nextclade does not build trees from scratch (see the “Phylogenetic placement” section in the Nextclade docs).

Regarding Chikungunya: for Nextclade to analyze a particular organism, it needs a so-called dataset, a set of files that contain organism-specific information. Datasets are documented here (for users) and here (for dataset authors). Anyone can prepare a dataset. There is no dataset for Chikungunya in our dataset collection (yet?) and I am not aware of any third-party datasets.

I will let our scientists address the rest of the questions.