I am new to Nextstrain. While looking through the site, I found the "show entropy" button on this page: https://nextstrain.org/ncov/global. By trying it out, I have a rough idea of what it corresponds to. I know that in communication systems, entropy roughly quantifies the disorder of a system. People usually speak of the entropy of a probability distribution; there is a formula for it, and it reaches its maximum for a uniform distribution, meaning no information is given and every outcome is equally likely. I was confused here, and I was wondering: what is the definition of entropy used on this page? Is there a definition?

Regards.

The graph shows the entropy of an alignment column and measures the diversity of the viruses at a particular position in their genome. If, for example, a fraction `p` of the viruses had base `A` and a fraction `1-p` had base `G` at position `x`, the entropy would be `-p log(p) - (1-p) log(1-p)`. If there are more than two states, this generalizes to `-sum(p_i log(p_i))`, where `i` runs over all states at the position.
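As a concrete illustration (not Nextstrain's actual implementation, which lives in the augur/auspice codebases), the formula above can be sketched in a few lines of Python. The function name `column_entropy` and the use of the natural logarithm are assumptions for this sketch:

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy of one alignment column (natural log).

    `column` is a string with one character per sequence in the
    alignment, giving the base each sequence has at this position.
    """
    n = len(column)
    counts = Counter(column)  # how many sequences carry each base
    # -sum(p_i * log(p_i)) over the observed states i
    return -sum((c / n) * math.log(c / n) for c in counts.values())

# A column where 30% of sequences have A and 70% have G:
# entropy = -0.3*log(0.3) - 0.7*log(0.7) ~ 0.611
print(column_entropy("AAAGGGGGGG"))

# A column where every sequence agrees has zero entropy:
print(column_entropy("AAAAAAAAAA"))
```

Note that the entropy is zero when all sequences agree and grows as the bases at that position become more evenly mixed, which is why the plot highlights variable positions.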