Hi, I’m new to Nextstrain and I was wondering: why is there a “subsampling” step in the Snakemake workflow? Is it because the total number of sequences is huge, so we focus on only a small part of it based on some criteria? If so, might it be better to call it something like “criteria-ing” or “criteria-based-sampling”? Moreover, what exactly does “subsampling” do? Are the sequences that meet the criteria then chosen uniformly at random, or in some other way? Many thanks.