How to change the reference sequence?

How do I change the reference, HIV-1 strain HXB2 (NC_001802), to another reference in Nextclade?

@XingguangLi It’s typically not possible to just swap the reference sequence. You’d also need to adjust all other components of the dataset (genome annotation, reference tree, QC config, etc.), which for a diverse virus basically means creating a new dataset.

You can start with only the new reference sequence, but this will make the analysis less complete: without a genome annotation, the results will contain no translations or amino acid mutations, and without a reference tree, there will be no clade assignment or tree placement. You can start small and add more features later.
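As a rough sketch of the "start small" approach, the Python helper below assembles a `nextclade run` command that always passes the reference and adds the annotation and tree only when you have them. The flag names (`--input-ref`, `--input-annotation`, `--input-tree`, `--output-all`) are what I'd expect from the Nextclade v3 CLI, so please confirm them with `nextclade run --help` for your version. The function name and all file names are hypothetical placeholders.

```python
import shlex
from typing import List, Optional


def build_nextclade_command(
    sequences: str,
    reference: str,
    output_dir: str,
    annotation: Optional[str] = None,
    tree: Optional[str] = None,
) -> List[str]:
    """Assemble a `nextclade run` command line as a list of arguments.

    Flag names are assumed from the Nextclade v3 CLI; verify them with
    `nextclade run --help` before relying on this.
    """
    # The reference sequence is the only dataset component passed here by default.
    cmd = ["nextclade", "run", "--input-ref", reference]

    # Optional: genome annotation (enables translation and amino acid mutations).
    if annotation is not None:
        cmd += ["--input-annotation", annotation]

    # Optional: reference tree (enables clade assignment and tree placement).
    if tree is not None:
        cmd += ["--input-tree", tree]

    # Write all outputs to one directory; the query sequences come last.
    cmd += ["--output-all", output_dir, sequences]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names; prints the command instead of running it.
    command = build_nextclade_command(
        sequences="sequences.fasta",
        reference="my_reference.fasta",
        output_dir="output",
    )
    print(shlex.join(command))
```

Once you have an annotation or a tree, pass them via the `annotation` and `tree` arguments. To actually run the command, hand the returned list to `subprocess.run`.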

The structure and usage of datasets in Nextclade is described in the user documentation:
https://docs.nextstrain.org/projects/nextclade/en/stable/user/datasets.html

The existing official datasets and the relevant documentation for dataset authors are here: https://github.com/nextstrain/nextclade_data

In particular, the HIV dataset is in https://github.com/nextstrain/nextclade_data/tree/master/data/community/neherlab/hiv-1/hxb2

Richard is preparing files for this dataset here: https://github.com/neherlab/HIV-nextclade.

Note that the HIV dataset is currently marked as “experimental” and is still being worked on. Please make sure you check its README.md file for details.

Dear @ivan-aksamentov,

I appreciate your reply and help.