I merged the South America dataset with mine (South Brazil).
The problem is that I have hundreds of sequences for South Brazil, so some “mugration” events do not make sense, i.e., some lineages appear to have arisen in South Brazil. I am not sure this is true; I believe it is an artifact of undersampling in the rest of Brazil.
I also performed an independent phylogenetic reconstruction using only samples of certain Pango lineages. Bootstrap values are very low, as expected given the narrow timeframe and the slow evolution of the virus.
However, this makes me wonder how much I can rely on the ancestral reconstruction of the Nextstrain tree. How do I know whether the inferred viral migration events actually reflect reality?