Please, when you upload more than one RSV genome.fasta file and Nextclade chooses a reference for each one, how do you know which ones are RSV-A and which ones are RSV-B?
How do I type the sequences? If I choose one reference, they all come out as that subtype, and if I choose the other one, they all come out as the other. Thanks.
Hi @martahp! When you upload sequences to Nextclade’s web interface, Nextclade runs a quick alignment of each sequence you uploaded against the reference sequence for each available dataset and finds the dataset with the best match for each sequence. It “suggests” the dataset with the best overall match across all of your sequences.
You can tell Nextclade to use a different dataset than the one it automatically suggests, though. See the Nextclade getting started guide for more details about how to choose a different dataset.
Thank you very much, Dr. Huddleston. What I don’t understand is why, if I select one reference sequence, it types to one clade, and if I choose another reference, it types to a different clade.
Kind regards
Each Nextclade dataset has its own reference sequence (used for alignment) and its own guide tree (used for clade assignment). RSV A and B have different clades representing the diversity within each of these viral lineages. When you upload a sequence to Nextclade, it will get aligned to the reference sequence of the selected dataset, placed at the closest match in the guide tree, and assigned the clade label associated with that placement in the tree. It is possible that the same sequence could align (even partially) to both RSV A and B references, but since there are different clades for each lineage, that sequence will get a different clade label depending on the dataset.
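To make that concrete, here is a toy sketch of why the label depends on the dataset. It is not Nextclade's actual placement algorithm: it just picks the guide sequence with the fewest mismatches, and the sequences, node names, and clade labels ("A.x", "B.y", and so on) are all made up for illustration.

```python
def hamming(a, b):
    """Count mismatching positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def assign_clade(query, guide):
    """Return the clade label of the closest sequence in a toy guide set.

    `guide` maps a node name to a (sequence, clade_label) pair. A real guide
    tree is more than a flat list; this is a deliberate simplification.
    """
    best = min(guide, key=lambda name: hamming(query, guide[name][0]))
    return guide[best][1]


# Hypothetical guide sets standing in for two datasets.
guide_a = {"a1": ("ACGTACGA", "A.x"), "a2": ("ACGGACCT", "A.y")}
guide_b = {"b1": ("TTGTACCA", "B.x"), "b2": ("ACCTTCGT", "B.y")}

query = "ACGTACGT"
label_with_a = assign_clade(query, guide_a)  # closest is a1 (1 mismatch)
label_with_b = assign_clade(query, guide_b)  # closest is b2 (2 mismatches)
print(label_with_a, label_with_b)
```

The same query always gets *some* label from whichever guide set you hand it, even when the match is poor, which is why the choice of dataset changes the clade you see.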
If you don’t know which of your input sequences are RSV A or B (or something else!), you can look at the alignment quality of each sequence to the RSV A reference and then the RSV B reference. You should have a high-quality alignment of RSV A sequences to the RSV A reference, for example, and lower-quality alignment of RSV A sequences to the RSV B reference.
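If you have many sequences, you could script that comparison. Below is a minimal sketch, assuming you have run the same sequences against both datasets and exported each result as a TSV file, and assuming those files have `seqName` and `alignmentScore` columns (please check the column names in your own export). The function names and the example scores are made up for illustration.

```python
import csv
import io


def load_scores(tsv_text, score_column="alignmentScore"):
    """Map each sequence name to its alignment score from one results TSV."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    scores = {}
    for row in reader:
        value = row.get(score_column) or ""
        # Sequences that failed to align may have an empty score; skip them.
        if value:
            scores[row["seqName"]] = float(value)
    return scores


def guess_subtype(scores_a, scores_b):
    """Label each sequence by the dataset that gave the higher score."""
    calls = {}
    for name in sorted(set(scores_a) | set(scores_b)):
        a, b = scores_a.get(name), scores_b.get(name)
        if b is None or (a is not None and a > b):
            calls[name] = "RSV-A"
        elif a is None or b > a:
            calls[name] = "RSV-B"
        else:
            calls[name] = "ambiguous"
    return calls


# Made-up example results; in practice, read the text from your TSV files.
tsv_a = "seqName\talignmentScore\nsample1\t45000\nsample2\t31000\n"
tsv_b = "seqName\talignmentScore\nsample1\t30500\nsample2\t44800\n"

calls = guess_subtype(load_scores(tsv_a), load_scores(tsv_b))
print(calls)
```

Once each sequence has a tentative subtype, you can split your FASTA accordingly and run each group against its matching dataset to get the clade assignments. Sequences labeled "ambiguous", or with low scores against both references, are worth inspecting by hand.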