I am working on setting up a local build for my state PHL and would greatly appreciate any insight from the Nextstrain team and/or individuals who have set up local builds for their region/state.
My first question is what exactly happens during the subsampling step for the global builds and the regional/state-specific builds available on the Nextstrain website. For example, the Iowa-focused subsampling build maintained by CDC/AMD has 177 Iowa isolates, whereas GISAID has roughly 250 Iowa sequences for the same time frame, even after filtering for complete, high-coverage sequences. What additional filtering might be going on here?
Secondly, is there a way to easily download a multi-FASTA file from Nextstrain for a given build? I see the option to download the metadata, and was wondering whether, if GISAID's user agreement prevents downloading FASTA files from Nextstrain directly, there is at least a way to use that metadata to grab the matching sequences from GISAID. My reason for asking is that I would like a current, representative sample of national/international sequences to provide context for the local sequences I would include in my local build.
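To make the second question concrete, here is a minimal sketch of the workaround I have in mind: take the strain names from a build's downloaded metadata and use them to subset a FASTA file I have already downloaded from GISAID. Several things here are my own assumptions rather than anything confirmed by Nextstrain or GISAID: that the metadata is tab-separated with a `strain` column, that FASTA headers start with the strain name followed by optional `|`-separated fields, and that GISAID names may carry an `hCoV-19/` prefix that the build's names lack. The function names are hypothetical.

```python
import csv
import io


def read_strain_names(metadata_tsv, column="strain"):
    """Return the set of strain names in a tab-separated metadata table.

    Assumes (not confirmed) that the table has a header row with a
    column named `strain`.
    """
    reader = csv.DictReader(io.StringIO(metadata_tsv), delimiter="\t")
    return {row[column].strip() for row in reader if row.get(column)}


def header_to_strain(header, prefix="hCoV-19/"):
    """Reduce a FASTA header to a bare strain name.

    Assumed header layout: '>name|other|fields', with an optional
    'hCoV-19/' prefix on the name.
    """
    name = header.lstrip(">").split("|")[0].strip()
    return name[len(prefix):] if name.startswith(prefix) else name


def filter_fasta(fasta_text, wanted):
    """Keep only the FASTA records whose strain name is in `wanted`."""
    kept = []
    keep = False
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # A new record starts; decide whether to keep it.
            keep = header_to_strain(line) in wanted
        if keep:
            kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")
```

In practice I would read both files from disk, pass their contents to `read_strain_names` and `filter_fasta`, and write the result out as the context set for my local build. Whether the names actually match between the two sources is part of what I am asking.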
Finally, for those who have already solved the above issue: how often do you download a new set of national/international sequences for your local build?