What is the meaning of "alt" in the file clades.tsv?

Hi. I'm new to Nextstrain and have a few questions. Looking at the file clades.tsv, I see that it has four columns: clade, gene, site and alt. As I understand it, clade is the name of the clade, such as 19A, 19B, 20A, 20B or 20C. gene is the name of the gene, such as nuc, and site is the position, such as 8782, 14408, 23403, 28881 or 1059. The last column, alt, has values such as C, T, G and A. Is my understanding correct? What does the last column mean, and how does alt relate to clade? Any comments are appreciated. The file clades.tsv is as follows.

clade gene site alt

19A nuc 8782 C
19A nuc 14408 C

19B nuc 8782 T
19B nuc 28144 C

20A nuc 8782 C
20A nuc 14408 T
20A nuc 23403 G

20B nuc 8782 C
20B nuc 14408 T
20B nuc 23403 G
20B nuc 28881 A
20B nuc 28882 A

20C nuc 1059 T
20C nuc 8782 C
20C nuc 14408 T
20C nuc 23403 G
20C nuc 25563 T

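Your reading of the first three columns is right. alt is the nucleotide (or amino acid, when gene is an actual gene name rather than nuc) found at that site in that clade. It is often the new base that results from a mutation, in which case the reference genome has a different base (or residue) there, but a row can also pin a site to its unmutated state: in your table 19A is listed with C at 14408 while 20A is listed with T at the same site. Taken together, the rows for one clade give the set of alleles that define it.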
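To make the relation between alt and clade concrete, here is a minimal Python sketch that reads rows like the ones in the question and reports which clade definitions a sample satisfies. The function names `parse_clades` and `matching_clades`, and the `sample` alleles, are made up for illustration. It assumes a sample matches a clade when it carries every listed allele; Nextstrain's own tooling may assign clades differently (for example, by working on a phylogenetic tree), so treat this as a way to read the file rather than as the real algorithm.

```python
from collections import defaultdict

# A few rows copied from the clades.tsv shown in the question.
CLADES_TSV = """\
clade gene site alt
19A nuc 8782 C
19A nuc 14408 C
19B nuc 8782 T
19B nuc 28144 C
20A nuc 8782 C
20A nuc 14408 T
20A nuc 23403 G
20B nuc 8782 C
20B nuc 14408 T
20B nuc 23403 G
20B nuc 28881 A
20B nuc 28882 A
"""


def parse_clades(text):
    """Return {clade: {(gene, site): alt}} from whitespace-separated rows."""
    clades = defaultdict(dict)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4 or fields[0] == "clade":
            continue  # skip blank lines and the header row
        clade, gene, site, alt = fields
        clades[clade][(gene, int(site))] = alt
    return dict(clades)


def matching_clades(observed, clades):
    """List clades whose defining alleles are all present in `observed`.

    Assumption for this sketch: a clade matches only if every one of
    its (gene, site) -> alt rows agrees with the sample.
    """
    return [
        name
        for name, alleles in clades.items()
        if all(observed.get(key) == alt for key, alt in alleles.items())
    ]


# Hypothetical sample: the bases observed at the sites of interest.
sample = {
    ("nuc", 8782): "C",
    ("nuc", 14408): "T",
    ("nuc", 23403): "G",
    ("nuc", 28144): "T",
    ("nuc", 28881): "G",
    ("nuc", 28882): "G",
}

clades = parse_clades(CLADES_TSV)
print(matching_clades(sample, clades))  # ['20A']
```

The sample fails 19A because it has T rather than C at 14408, and fails 20B because it lacks A at 28881 and 28882. A sample that also carried those two A's would satisfy both 20A and 20B, since 20B's rows include all of 20A's.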
