Hi, I’m working on a story about what insights might be gained from genetic epidemiology with regard to the growing outbreak in Haiti. The first sequences from the country were published last week, all from before Feb 2021. I understand newer sequences, including B.1.1.7 and P.1 samples, will be released soon, and I’d like to have something to interpret them with.
To get the Haiti-specific sequences (there are ~31) from GISAID, I go to downloads, custom selection, search for “Haiti”, and download the sequences.
I do not see the links for “nextmeta” or “nextfasta” shown in the documentation. I have emailed GISAID to ask about this.
So I downloaded “Region-specific Auspice source files”, “Global” and “North America”… and ran the pipeline against the files in “global” just to get a running build… but I am not sure if or how these files are subsampled, which of course makes anything downstream uncertain (and I do note that only a few of the Haiti samples are in this set).
I used “example_multiple_inputs” as a template, with three components: one input containing all samples from Haiti, 2000 proximity samples from the worldwide set, and worldwide background samples grouped by year/month with five samples per group.
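To make the background sampling concrete, here is a rough Python sketch of what “grouped by year/month with five samples per group” means. This is only an illustration, not what the pipeline actually runs: the function name `sample_by_month`, the `strain` and `date` record fields, and the fixed random seed are all assumptions for the example, and records with incomplete dates are simply skipped.

```python
import random
from collections import defaultdict


def sample_by_month(records, per_group=5, seed=0):
    """Keep at most `per_group` records from each (year, month) bucket.

    `records` is a list of dicts with a "date" field in YYYY-MM-DD form
    (a hypothetical layout chosen for this sketch).
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    groups = defaultdict(list)
    for rec in records:
        parts = rec["date"].split("-")
        # Skip dates without a usable year and month, e.g. "2021" or "2021-XX-XX".
        if len(parts) < 2 or not (parts[0].isdigit() and parts[1].isdigit()):
            continue
        groups[(int(parts[0]), int(parts[1]))].append(rec)

    sampled = []
    for key in sorted(groups):  # iterate buckets in chronological order
        bucket = groups[key]
        k = min(per_group, len(bucket))  # small buckets are kept whole
        sampled.extend(rng.sample(bucket, k))
    return sampled


if __name__ == "__main__":
    demo = [{"strain": f"s{i}", "date": "2020-12-%02d" % (i + 1)} for i in range(12)]
    demo += [{"strain": f"t{i}", "date": "2021-01-%02d" % (i + 1)} for i in range(3)]
    picked = sample_by_month(demo)
    print(len(picked), "records kept")
```

One consequence of this scheme is that months with very few sequences contribute everything they have while heavily sequenced months are capped, so the background is spread evenly over time rather than in proportion to sequencing volume.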
This gives me something that looks reasonable… the main takeaway that I can see is that the main cluster of sequences has no near ancestor in time from another country… and the most similar sequences are fairly widely distributed geographically… so I would tentatively interpret this as evidence of in-country transmission over the winter peak and, more generally and unsurprisingly, as a sign that there is not enough sequencing to conclude much else about the transmission chains.
It’s not wildly exciting, but is there somewhere that I can share the methodology/results/interpretation for critique? I’d like to develop a reasonable baseline process that I can use going forward as more sequences