I am working on setting up a local build for my state PHL and would greatly appreciate any insight from the Nextstrain team and/or individuals who have set up local builds for their region/state.
My first question is: what exactly is happening during the subsampling step for the global builds and the regional/state-specific builds available on the Nextstrain website? For example, the Iowa-focused subsampling build maintained by CDC/AMD has 177 Iowa isolates, whereas GISAID has roughly 250 for Iowa over the same time frame, even after filtering for complete and high-coverage sequences. What additional filtering might be going on here?
Secondly, is there a way to easily download a multi-FASTA file from Nextstrain for a given build? I see the option to download the metadata, but I was wondering whether there is at least a way to then grab the sequences from GISAID if their user agreement prevents downloading FASTA files from Nextstrain directly. My reason for asking is that I would like a current, representative sample of national/international sequences to provide context for the local sequences I would include in my local build.
Finally, for those who may have already solved the above issue: how often are you downloading a new set of national/international sequences for your local build?
These are great questions, @whottel! I’ll try to answer the first two and leave the last for other PHL folks.
My first question is what exactly is happening during the subsampling step for the global builds and the regional/state-specific builds available on the Nextstrain website.
Secondly, is there a way to easily download a multi-FASTA file from Nextstrain for a given build.
We can’t provide direct links to sequences used for builds, but you can download the metadata file that you mentioned and use the “gisaid_epi_isl” column to select the records you want from the GISAID search interface.
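In case it helps, here is a rough Python sketch of pulling the accession IDs out of the downloaded metadata so you have a list to select from in GISAID. It assumes the metadata download is a tab-separated file with a header row and that missing values appear as an empty field or "?"; the function name and the file name `metadata.tsv` are just placeholders for illustration, so adjust them to whatever your download actually looks like.

```python
import csv
import io


def extract_gisaid_ids(handle, column="gisaid_epi_isl"):
    """Return unique, non-empty accession IDs from `column`, in file order.

    Assumes a tab-separated file with a header row (an assumption about
    the metadata download, not something guaranteed by Nextstrain).
    """
    reader = csv.DictReader(handle, delimiter="\t")
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise ValueError(f"column {column!r} not found in metadata header")

    seen = set()
    ids = []
    for row in reader:
        value = (row[column] or "").strip()
        # Skip blanks and "?" (assumed placeholder for missing values).
        if value and value != "?" and value not in seen:
            seen.add(value)
            ids.append(value)
    return ids


if __name__ == "__main__":
    # "metadata.tsv" is a placeholder name for the file downloaded from the build.
    with open("metadata.tsv", newline="", encoding="utf-8") as fh:
        for accession in extract_gisaid_ids(fh):
            print(accession)
```

Running the script prints one accession ID per line, which you can redirect to a text file and use when selecting records in the GISAID search interface.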
We are also discussing ways we could provide a collection of global context sequences for users to download. We haven’t quite figured out how that will work yet, though.