Hi @ECG, Thanks for writing! I’m afraid I’m not 100% clear on what you mean by filtering for German sequences in two different ways. Certainly if you are filtering German sequences on different builds, then these frequencies may change due to subsampling. For example, if you access the Germany build maintained by Neher Lab and filter to Germany, it shows about 27% currently.
If you use the Nextstrain Europe build and filter to Germany, it’s also about 27%.
On the global build this is significantly lower, at about 12%. However, the global builds are now extremely subsampled (~4,000 sequences out of >500,000 available), so I’d urge caution when using them to look at country-level frequencies. They’re better used just for larger-scale, longer-time-period interpretation.
The denominator of the % for frequencies is the total number of sequences that are visible in the tree given the currently applied filters. If no filters are on, that means it’ll use all the sequences in the tree from the appropriate time-slice. Note that this is different from CoVariants.org: CoVariants uses raw sequence data (not a tree) that’s not significantly subsampled, whereas Nextstrain frequencies here reflect the sequences in the tree and thus any subsampling therein.
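If it helps to see the denominator idea concretely, here is a toy sketch in Python. It is not Nextstrain's actual code: the `Sequence` class, the `variant_frequency` function, and all the counts are made up for illustration, and it ignores the time-slice step entirely. It only shows that the same variant gets a different percentage depending on which sequences are visible after filtering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sequence:
    # Hypothetical minimal record; real tree tips carry far more metadata.
    country: str
    variant: str


def variant_frequency(sequences, variant, country=None):
    """Share of `variant` among the sequences left after an optional country filter."""
    # The denominator is only what remains visible once the filter is applied.
    visible = [s for s in sequences if country is None or s.country == country]
    if not visible:
        return None  # nothing visible, so no frequency can be computed
    return sum(s.variant == variant for s in visible) / len(visible)


# Made-up toy "tree": 10 sequences in total, 4 of them from Germany.
toy_tree = (
    [Sequence("Germany", "A")] * 1
    + [Sequence("Germany", "B")] * 3
    + [Sequence("France", "A")] * 1
    + [Sequence("France", "B")] * 5
)

no_filter = variant_frequency(toy_tree, "A")            # 2 of 10 sequences
germany = variant_frequency(toy_tree, "A", "Germany")   # 1 of 4 sequences
```

With no filter the denominator is all 10 sequences, while filtering to Germany shrinks it to 4. A differently subsampled tree would contain a different set of German sequences, which is why the same filter can give different percentages on different builds.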
Frequencies are shown per-week on Nextstrain.org. Note that for CoVariants.org they’re shown per week for the Per Variant plots, but per 2 weeks for the Per Country plots (to smooth jitter).
I hope all this helps!