Dear Nextstrain Team,
I hope this message finds you well. I am writing to seek clarification regarding the annotation of the M2-2 gene in the RSV-A module on Nextstrain.
I noticed a discrepancy between the M2-2 gene coordinates in Nextstrain and those in the NCBI reference sequence (PP109421.1):
- Nextstrain annotation: M2-2 = 8193–8465 (+)
- NCBI (PP109421.1): M2-2 = 8199–8465 (+)
This 6-nucleotide (two-codon) difference at the start of the gene adds two residues to the N-terminus of the Nextstrain translation:
- Nextstrain translation:
TTMPKIMILPDKYPCSINSILITSNYRVTMYNQKNTLYINQNNQNSHIYPPDQPFNEIHWTSQDLIDATQNFLQHLGITDDIYTIYILVS*
- NCBI translation (lacking the initial “TT”):
MPKIMILPDKYPCSINSILITSNYRVTMYNQKNTLYINQNNQNSHIYPPDQPFNEIHWTSQDLIDATQNFLQHLGITDDIYTIYILVS*
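For reference, here is a minimal Python sketch of the arithmetic behind this comparison. It uses only the coordinates and translations quoted in this message. The variable and function names are my own, and it assumes the coordinates are 1-based and inclusive. It does not query Nextstrain or NCBI.

```python
# Coordinates as quoted in this message (assumed 1-based, inclusive).
NEXTSTRAIN_START = 8193
NCBI_START = 8199
END = 8465

# Translations as quoted in this message ("*" marks the stop codon).
NEXTSTRAIN_AA = (
    "TTMPKIMILPDKYPCSINSILITSNYRVTMYNQKNTLYINQNNQNSHIYPPDQPFNEIHWTSQDLIDATQNFLQHLGITDDIYTIYILVS*"
)
NCBI_AA = (
    "MPKIMILPDKYPCSINSILITSNYRVTMYNQKNTLYINQNNQNSHIYPPDQPFNEIHWTSQDLIDATQNFLQHLGITDDIYTIYILVS*"
)


def codon_count(start: int, end: int) -> int:
    """Number of codons in an inclusive coordinate range."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"range {start}-{end} is not a multiple of 3")
    return length // 3


# Nucleotide offset between the two annotated start positions.
extra_nt = NCBI_START - NEXTSTRAIN_START

# Residues present only at the start of the Nextstrain translation.
extra_prefix = NEXTSTRAIN_AA[: len(NEXTSTRAIN_AA) - len(NCBI_AA)]

print("Start offset (nt):", extra_nt)
print("Extra N-terminal residues:", extra_prefix)
print("Codons (Nextstrain):", codon_count(NEXTSTRAIN_START, END))
print("Codons (NCBI):", codon_count(NCBI_START, END))
print("Identical after the prefix:", NEXTSTRAIN_AA.endswith(NCBI_AA))
```

Both ranges are in frame with each other and the two translations are identical after the first two residues, so the only difference is the choice of start position.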
Could you kindly clarify:
- What is the basis for the 8193 start position in Nextstrain (e.g., experimental evidence, historical annotation, or alignment-based inference)?
- How should this discrepancy be resolved for downstream analyses? Is there a recommended canonical annotation for RSV-A M2-2?
Thank you for your time and expertise. I greatly appreciate your work in maintaining this invaluable resource.
Best regards,
Jingqi Yang