My understanding is that Nextstrain removes duplicates from the input datasets by comparing both the sequence names and the sequences themselves. What happens if two records have the same name but different sequences? Will the run fail, or just emit a warning?
A related question: Nextstrain strips the “hCoV-19/” prefix from sequence names. Does that happen for all input data, and does it happen before de-duplication?
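To make the scenario concrete, here is a minimal Python sketch of the two cases I am asking about. It is not Nextstrain's code and makes no claim about how the workflow actually behaves; the function names (`strip_prefix`, `classify_duplicates`), the `strip_first` switch, and the example records are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

PREFIX = "hCoV-19/"  # the prefix mentioned in the question


def strip_prefix(name, prefix=PREFIX):
    """Return the name without the leading prefix, if it has one."""
    return name[len(prefix):] if name.startswith(prefix) else name


def classify_duplicates(records, strip_first=True):
    """Group (name, sequence) pairs by name and report repeated names.

    Returns two sorted lists of names:
      exact     - name seen more than once, always with the same sequence
      conflicts - name seen with more than one distinct sequence
    """
    sequences_by_name = defaultdict(set)
    count_by_name = defaultdict(int)
    for name, sequence in records:
        # Hypothetical ordering: strip the prefix before comparing names.
        key = strip_prefix(name) if strip_first else name
        sequences_by_name[key].add(sequence)
        count_by_name[key] += 1

    exact = sorted(
        name for name, seqs in sequences_by_name.items()
        if count_by_name[name] > 1 and len(seqs) == 1
    )
    conflicts = sorted(
        name for name, seqs in sequences_by_name.items() if len(seqs) > 1
    )
    return exact, conflicts


# Made-up example records, not real data.
records = [
    ("hCoV-19/USA/A/2020", "ACGT"),
    ("USA/A/2020", "ACGT"),  # same record once the prefix is stripped
    ("USA/B/2020", "ACGT"),
    ("USA/B/2020", "ACGA"),  # same name, different sequence
]

exact, conflicts = classify_duplicates(records, strip_first=True)
print("exact duplicates:", exact)
print("conflicting names:", conflicts)
```

With `strip_first=True` the first two records collapse into one exact duplicate; with `strip_first=False` they are treated as different names. The "conflicting names" case is the one where I am unsure whether the run fails or only warns.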
We use the latest master branch.
Thank you for such a useful tool and for the great support you give to users!