From a newbie: difficulty finding multiple coincident mutations in spike

earturo · February 27, 2021, 10:26pm

Using the web-based Nextstrain tool (/ncov/global), I am searching for coincident mutations within the SARS-CoV-2 spike protein (S) that authors of peer-reviewed literature say have been deposited in GISAID; for example, changes to S at positions 261 and 453 among the Dutch mink sequenced in spring/summer 2020. To do this, I type ‘genotype S 261’ in the Filter Data field and select 261 D from the pull-down menu. The tree then shows only one genome with this S change, and none from the Netherlands or from mink, where I expected them (even if I filter by mink hosts). Interestingly, if I search for Y453F alone at covariants.org, I can link from there to Auspice, and filtering that build for G261D does yield mink sequences on the tree. But when I hover over any of the highlighted genomes, the tooltip does not explicitly say that the genome carries a mutation to D at S 261. I am new to Nextstrain, and I am clearly missing something. Perhaps only the tips of the tree are shown and not the whole list of genomes? If so, how do I change that on the web-based server? Or must I build Nextstrain on my local computer? Please help. Any advice is welcome.

james · March 8, 2021, 12:15am

Hi @earturo, due to the size of the data available on GISAID, each Nextstrain “build” (i.e. each tree) shows only a subset of the entire dataset, often less than 1% of the total genomes available. Our subsampling strategy depends on the aim of the build: for instance, @emmahodcroft’s build covariants / S.Y453F will preferentially select samples with that mutation, whereas nextstrain / ncov / global subsamples geographically.

You can filter the above covariants build to highlight genomes with a Spike 261D mutation, but again please be aware that this dataset is not representative of all genomes on GISAID.

To explicitly test the claims from your post, I believe you’d have to log in to GISAID, download all the data, and filter accordingly.
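As a rough illustration of that last filtering step, here is a minimal Python sketch. It assumes you have already downloaded a tab-separated metadata file, and that it has a column listing amino-acid substitutions in a form like `(Spike_G261D,Spike_Y453F)`. The column names `strain` and `aa_substitutions`, the mutation label format, and the helper names are all assumptions for illustration; check them against the headers and values in the file you actually download.

```python
import csv
import io


def has_all_mutations(substitutions, required):
    """Return True if every required mutation label is in the substitution list."""
    # Drop optional surrounding parentheses, then split the comma-separated labels.
    labels = substitutions.strip().strip("()").split(",")
    present = {label.strip() for label in labels if label.strip()}
    return set(required) <= present


def filter_coincident(tsv_text, required,
                      column="aa_substitutions", id_column="strain"):
    """List the IDs of rows carrying all required mutations at once.

    `column` and `id_column` are hypothetical header names; adjust them
    to match the metadata file you are working with.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        row[id_column]
        for row in reader
        if has_all_mutations(row.get(column) or "", required)
    ]


# Tiny made-up example; these are not real GISAID records.
sample = (
    "strain\thost\taa_substitutions\n"
    "mink/NLD/1\tNeovison vison\t(Spike_G261D,Spike_Y453F,Spike_D614G)\n"
    "human/XYZ/2\tHuman\t(Spike_D614G)\n"
    "mink/NLD/3\tNeovison vison\t(Spike_Y453F)\n"
)
matches = filter_coincident(sample, {"Spike_G261D", "Spike_Y453F"})
print(matches)
```

For a real file, you would read its contents with `open(path, encoding="utf-8").read()` and pass that text in. Requiring the whole set of labels at once is what makes this a test for coincident mutations rather than for either mutation alone.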
