Using influenza datasets in clades.nextstrain.org

Hello, I have experience using https://clades.nextstrain.org/ for SARS-CoV-2 genome sequence analysis, but am having difficulty with influenza A.

I am trying to use it primarily to identify mutations in partial HA and NA fragments.

For SARS-CoV-2, I can upload a partial sequence (e.g. a partial spike gene fragment) and get a result. When I try the same with different influenza A reference datasets, however, I get the error message “Unable to align: seed alignment was unable to find any matches that are long enough.”

I tried using a full-length NA segment against the Influenza A H1N1pdm NA reference dataset, and it still gave the same error.

Is it necessary to have the full-length HA or NA sequence for it to work?

Thank you in advance,
Jose

hi @jmediavi

partial sequences for HA and NA should definitely work. Could you share one such example?

richard

Hi Richard,

Sorry for the delay in responding. Attached is an example of a Sanger fragment from the NA gene of an H1N1 strain.

The sequence is also pasted below for convenience.
It is 525 bp long, but when I upload it to Nextclade using the Influenza A H1N1pdm NA reference dataset, I get the following error message:
When calculating seed matches: Unable to align: seed alignment was unable to find any matches that are long enough. Only matches of at least 40 nucleotides long are considered (configurable using ‘min match length’ CLI flag or dataset property). This is likely due to low quality of the provided sequence, or due to using incorrect reference sequence.

Please let me know if I am doing something wrong…

Thanks,
Jose

83_R-NA_1157_R_B11.ab1
TTCTCCCTATCNNNACACCATTGCCGTATTTAAATGAAAATCCCTTTACCCCATTTGCTCCATTAGACGATACTGGGCCACAACTGCCTGTCTTATCATTAGGGCGTGGATTGTCTCCGAAAACCCCACTGCATATGTATCCCATCTGATATTCCAGATTCTGGTTGAAAGACACCCAAGGTCGATTTGAGCCATGCCAATTATCCCTGCACACACATGTGATTTCACTAGAATCAGGGTAACAGGAGCATTCTTCATAGTGATAATTAGGGGCCTTCATTTCGACTGATTTGGTTATCTTTCCCTTCTCTATTCTGAAGATTTTGTATGAGGCCTGTCCATCACTTGGTCCATCGGTCATTATGGTAAAGCAAGAACCATTTACACATGCACATTCAGACTCTTGTGTTCTCAATATCTTATTCCTCCAACTCTTGATAGTGTCTGTTATTATGCCATTGTATTTTAACACAGCCACTGCCCCATTGTCTGGGCCANAAATTCCNATTNNTAGCCAATTGGNGC

(attachments)

influenza test.txt (550 Bytes)

I think your sequence is reverse complemented relative to the reference. Try rerunning with the reverse complement of your sequence.
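In case it is useful, here is a minimal Python sketch of how a read could be reverse complemented before uploading. The function name `reverse_complement` and the short example read are illustrative only and not part of Nextclade; the table also maps the IUPAC ambiguity codes, so `N` calls in a Sanger read are preserved.

```python
# Complement table for A/C/G/T plus the IUPAC ambiguity codes (upper and lower case).
_COMPLEMENT = str.maketrans(
    "ACGTRYKMSWBDHVNacgtrykmswbdhvn",
    "TGCAYRMKSWVHDBNtgcayrmkswvhdbn",
)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a nucleotide sequence."""
    # Complement every base, then reverse the string.
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # First bases of the read posted above, used only as a short example.
    read_start = "TTCTCCCTATCNNNACACC"
    print(reverse_complement(read_start))
```

Applying the function twice returns the original sequence, which is a quick sanity check before pasting the result into the web interface.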

best,
richard

Hi Richard,

Well, that’s embarrassing! I thought Nextclade would interpret sequences in either orientation. Thanks for clarifying!

Regards
Jose

no worries. For some viruses, we allow reverse complements. But for flu this is apparently not switched on.
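For anyone who wants to check the orientation of a read before uploading, one rough approach is to count how many short k-mers the read and its reverse complement each share with the reference. The sketch below is only a toy heuristic under that assumption; it is not how Nextclade handles reverse complements, and the names `guess_orientation` and `kmers`, the value `k=8`, and the toy reference string are all hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case A/C/G/T/N sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def kmers(seq: str, k: int) -> set:
    """Set of all k-mers (substrings of length k) in the sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def guess_orientation(query: str, reference: str, k: int = 8) -> str:
    """Return 'forward' or 'reverse', whichever shares more k-mers with the reference."""
    ref_kmers = kmers(reference, k)
    forward_hits = len(kmers(query, k) & ref_kmers)
    reverse_hits = len(kmers(reverse_complement(query), k) & ref_kmers)
    return "forward" if forward_hits >= reverse_hits else "reverse"


if __name__ == "__main__":
    # Toy reference for illustration only; substitute the real dataset reference.
    reference = "ATGAATCCAAACCAAAAGATAATAACCATTGGTTCGGTCTGTATGAC"
    fragment = reference[5:35]
    print(guess_orientation(fragment, reference))
    print(guess_orientation(reverse_complement(fragment), reference))
```

If the read comes back as "reverse", reverse complementing it before upload should avoid the seed alignment error described above.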

One more question: my segment was identified as NA clade C.5.3.1.

I cannot find much information on this clade. Is there a list somewhere of known influenza A clades that would help put it in context?

Thanks again,
Jose

these are just labels; there is no phenotypic significance attached to them.