RSV-A Reference Sequence Question

Hello,

I noticed that the referenced RSV-A NCBI accession has been updated to the INSDC reference, which differs slightly from the A/England/397/2017 reference used in the RSV-A Nextclade dataset. I was wondering whether there are plans to update the Nextclade dataset to use the INSDC reference, or whether the current reference will continue to be used.

Thank you for your time.

Hi Tara @T4R4,

Sorry for the confusion!

My understanding is that this is only a documentation change: in the README.md file, Theo added a link to PP109421.1 (the same sequence as EPI_ISL_412866) now that it is available in an open database. We can now link to it so that people can use it, since GISAID EPI_ISL_* data is restricted.

Previously, the README.md linked to the openly available LR699737, which is almost the same as EPI_ISL_412866, but not quite, so it required some explanation. None of that is needed now that we have PP109421.1.

So there is no dataset change planned. The dataset is based on EPI_ISL_412866 and will continue to be. The changes to README.md will be released in the next version of the dataset, once we have some actual changes, because a README.md edit on its own is only a visual change.

I double-checked just now that EPI_ISL_412866 and PP109421.1 are indeed the same sequence:

# Download both sequences
curl -fsSLo "PP109421.1.fasta" "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nuccore&id=PP109421.1&rettype=fasta&retmode=text"
curl -fsSLo "EPI_ISL_412866.fasta" "https://raw.githubusercontent.com/nextstrain/nextclade_data/refs/heads/master/data/nextstrain/rsv/a/EPI_ISL_412866/reference.fasta"

# Make file formatting consistent
seqkit seq EPI_ISL_412866.fasta -s -n -i -w 0 > f1.txt
seqkit seq PP109421.1.fasta -s -n -i -w 0 > f2.txt

# Compare
diff f1.txt f2.txt

Output:

1c1
< >EPI_ISL_412866
---
> >PP109421.1

i.e. they differ only in the header line.
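If seqkit is not installed, roughly the same check can be sketched with standard tools. This is only an illustration under a few assumptions: the helper name `fasta_seq` is made up here, each file is assumed to contain a single record, and letter case is treated as insignificant. It expects the two files from the download step above and does nothing if they are missing.

```shell
# Hypothetical helper (not part of any tool): print only the sequence
# of a single-record FASTA file, dropping header lines, joining wrapped
# lines, and upper-casing so that case differences are ignored.
fasta_seq() {
  grep -v '^>' "$1" | tr -d '\r\n' | tr '[:lower:]' '[:upper:]'
}

# Compare sequence content only, ignoring the headers.
# Runs only if both files from the download step are present.
if [ -f EPI_ISL_412866.fasta ] && [ -f PP109421.1.fasta ]; then
  if [ "$(fasta_seq EPI_ISL_412866.fasta)" = "$(fasta_seq PP109421.1.fasta)" ]; then
    echo "sequences are identical"
  else
    echo "sequences differ"
  fi
fi
```

Unlike the diff above, this ignores the header line entirely, so it reports only whether the nucleotide sequences match.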

Let us know if you noticed any other problems we might have missed.

Hi Ivan,

Thank you for your response! My apologies, I read the commit in the opposite direction, because I assumed it reflected the current README on the Nextclade site.

That makes sense!

Thank you for your time,
Tara