I noticed that the referenced RSV-A NCBI accession has been updated to an INSDC reference that differs slightly from the A/England/397/2017 reference used in the RSV-A Nextclade dataset. Are there plans to update the Nextclade dataset to use the INSDC reference, or will the current reference continue to be used?
My understanding is that it’s only a documentation change: in the README.md file, Theo added a link to PP109421.1 (the same sequence as EPI_ISL_412866) now that it’s available in an open database. We can now link to it so that people can actually use it (GISAID EPI_ISL_* data is restricted).
Previously the README.md linked to the openly available LR699737, which is almost the same as EPI_ISL_412866, but not quite, so it required some explanation. None of that is needed now that we have PP109421.1.
So there is no dataset change planned. It is based on EPI_ISL_412866 and will continue to be. The README.md changes will be released with the next version of the dataset, once we have some actual changes to ship, because the README.md edit is purely cosmetic.
I double checked just now that EPI_ISL_412866 and PP109421.1 are indeed the same sequences:
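A check like this can be reproduced locally. The sketch below is one assumed way to do it, not necessarily how the comparison above was done. It expects both records as FASTA text that you have already downloaded yourself; the helper names (`read_single_fasta`, `first_differences`) and the short toy sequences are made up for illustration and are not the real RSV-A data.

```python
from io import StringIO


def read_single_fasta(handle):
    """Return (header, sequence) for the first record in a FASTA stream.

    The sequence is upper-cased and joined across wrapped lines so that
    line width and letter case do not affect the comparison.
    """
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                break  # a second record starts here; ignore the rest
            header = line[1:]
        elif header is not None:
            chunks.append(line.upper())
    if header is None:
        raise ValueError("no FASTA record found")
    return header, "".join(chunks)


def first_differences(a, b, limit=10):
    """List up to `limit` (1-based position, base_a, base_b) mismatches.

    Only the shared length is scanned; compare len(a) and len(b)
    separately to detect a length difference.
    """
    diffs = []
    for pos, (x, y) in enumerate(zip(a, b), start=1):
        if x != y:
            diffs.append((pos, x, y))
            if len(diffs) >= limit:
                break
    return diffs


# Toy stand-ins (hypothetical data). In practice, open the two downloaded
# FASTA files instead, e.g. `with open("a.fasta") as fh: ...`.
toy_a = ">toy_a\nACGTAC\nGTTA\n"
toy_b = ">toy_b\nacgtacgtta\n"
toy_c = ">toy_c\nACGTACGTCA\n"

_, seq_a = read_single_fasta(StringIO(toy_a))
_, seq_b = read_single_fasta(StringIO(toy_b))
_, seq_c = read_single_fasta(StringIO(toy_c))

print("a == b:", seq_a == seq_b)
print("a == c:", seq_a == seq_c, first_differences(seq_a, seq_c))
```

An exact string comparison is enough when the question is whether two records are identical. A near-match, like the one described for LR699737, would show up as a non-empty difference list or a length mismatch, and a proper alignment would then be the better tool.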
Thank you for your response! My apologies, I accidentally read the commit in the opposite direction, as I was thinking it reflected the current README on the Nextclade site.