3 simple questions that I could not find answers to on Nextstrain

I am deeply impressed with the amount of work that you guys have done, especially the nice visualization. I don't think anyone else has done it better.

However, I still have difficulty finding answers to 3 basic questions that I don't think are bizarre or weird at all. I would deeply appreciate it if someone could help out.

  1. Right now, about 4,000 genomes of the novel coronavirus are included in auspice. How could I get the data to generate a histogram of mutations based on these 4,000 samples? For example, bp 1 is mutated in 5 samples, bp 2-100 are mutated in 0 samples, and bp 101 is mutated in 7 samples.

  2. Let’s say that I now have a genome FASTA file (nobody told me what it is from). Is there a way for me to upload this file somewhere and have the website tell me whether it is a SARS genome, a SARS-CoV-2 genome, an HIV genome, etc.? If it is a coronavirus genome, it would be good if it could further tell me whether it is alpha type, beta type, etc.

  3. Let’s say that I now have a genome FASTA file for an influenza virus. Based on the genome data alone, is there a way to tell whether this virus is limited to animals (avian flu) or came from a human subject? I know that mutations can cause some viruses to “jump” from animals to humans. But what “signature” mutation, for example, allowed SARS-CoV-2 to jump from bats to XXX and then to humans?
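
To make question 1 concrete, here is a minimal Python sketch of the kind of per-position count I have in mind. It assumes the samples have already been aligned to a reference genome and loaded as plain strings; the function name `mutation_histogram` and the toy sequences are hypothetical, and nothing here uses a Nextstrain or auspice API. What I am missing is how to get the real aligned data.

```python
from collections import Counter


def mutation_histogram(reference, aligned_samples):
    """Count, for each 1-based position, how many samples differ from the reference."""
    counts = Counter()
    for sample in aligned_samples:
        if len(sample) != len(reference):
            raise ValueError("samples must be aligned to the reference")
        for pos, (ref_base, base) in enumerate(zip(reference, sample), start=1):
            # Skip gaps and ambiguous bases so missing data is not counted as a mutation.
            if base.upper() in "ACGT" and base.upper() != ref_base.upper():
                counts[pos] += 1
    return counts


# Toy data for illustration only, not real genomes.
reference = "ACGTAC"
samples = ["ACGTAC", "TCGTAC", "TCGTAA", "ACGNAC"]
histogram = mutation_histogram(reference, samples)
print(sorted(histogram.items()))  # [(1, 2), (6, 1)]
```

Positions that never appear in the result have 0 mutated samples, which matches the "bp 2-100" case in my example.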

