I'm having trouble interpreting the entropy data. When I first saw it I assumed it ranges from 0 to 1, but in my analyses I find higher values (1.1 / 1.2 / 1.3). I have read the posts on Nextstrain and some papers, but in general they only say that entropy is a measure of diversity.
Thanks for your time
Entropy is normalised Shannon entropy, measuring the “uncertainty” inherent in the possible nucleotides or codons at a given position.
Events represent a count of changes in the nucleotide or codon at that position across the (displayed) (sub-)tree. They rely on the ancestral state reconstruction to infer where these changes occurred within the tree.
(Docs here and the code which computes it is here.)
Then should I assume there is an error in the calculation of my data? Do I have to change the formula?
thanks for your answer
We also have values > 1 (e.g.), so I’ll try to find time to take a closer look.
This Cross Validated post, “Why am I getting information entropy greater than 1?”, explains why we have entropy values > 1.
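In short: Shannon entropy is bounded above by log(k) for k possible states, so when the natural log is used (as in the Auspice calculation linked above) any position with three or more observed states can exceed 1. A quick sketch of the bound, assuming natural-log entropy:

```python
import math

# Maximum Shannon entropy occurs when all k states are equally likely,
# and equals ln(k) in natural-log units ("nats").
for k in (2, 3, 4):
    probs = [1 / k] * k
    h = -sum(p * math.log(p) for p in probs)
    print(k, round(h, 3))  # prints 0.693, 1.099, 1.386 for k = 2, 3, 4
```

So only a strictly two-state site is guaranteed to stay at or below 1; with four codons in play, values up to ln(4) ≈ 1.386 are possible.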
As an example, looking at a recent nCoV build at spike position 371 we have an entropy of 1.056, which is the sum of the per-codon terms for the 4 observed codons: R (1177 tips / 3199 total tips) entropy = 0.368, H (1378 / 3199) entropy = 0.363, L (1 / 3199) entropy = 0.00252, P (643 / 3199) entropy = 0.322.
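Those per-codon terms can be reproduced in a few lines. A minimal sketch, assuming natural-log Shannon entropy over the tip counts quoted above:

```python
import math

# Tip counts per observed codon at spike position 371 (from the build above).
counts = {"R": 1177, "H": 1378, "L": 1, "P": 643}
total = sum(counts.values())  # 3199 tips

# Each codon contributes -p * ln(p); the site entropy is the sum of the terms.
terms = {aa: -(n / total) * math.log(n / total) for aa, n in counts.items()}
entropy = sum(terms.values())

print({aa: round(t, 3) for aa, t in terms.items()})  # R/H ≈ 0.36, L ≈ 0.003, P ≈ 0.322
print(round(entropy, 3))  # ≈ 1.056
```

Note the small rounding in the individual terms: R and H are ≈ 0.368 and ≈ 0.363 respectively, which is why the one-decimal-place values of 0.36 each do not quite sum to 1.056.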
Thanks!