to identify my SARS-CoV-2 variants and export all the S-protein mutation regions to see what was covered in my reads. We noticed that the E484K and deletion 69-70 regions were not reported in the output. Here is an output for a mostly B.1.1.7 sample in which the deletion 69-70 region was not reported. Does Nextclade report these regions?
I was also wondering what % genome recovery is recommended for accurate SARS-CoV-2 variant calling with Nextclade. I was thinking you could get a low % genome recovery from the consensus alignment while your primers still covered the key regions needed for accurate variant calling. Which key regions of the SARS-CoV-2 genome need to be sequenced for Nextclade to call a variant accurately? Thanks!
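One way to separate "mutation absent" from "region not sequenced" is to check how many bases were actually called in each region of interest. The following is a minimal sketch, not part of Nextclade: it assumes the consensus has already been aligned to the Wuhan-Hu-1 reference (MN908947.3) so that string positions match reference coordinates, that uncovered positions are masked with `N`, and that deletions appear as `-`. The function name `region_coverage` and the `REGIONS` table are hypothetical, and the coordinates (S codon 484 at 23012-23014, the 69-70 deletion at 21765-21770) should be verified against your own reference annotation.

```python
# Assumed 1-based, inclusive reference coordinates (MN908947.3); verify before use.
REGIONS = {
    "S:E484 codon": (23012, 23014),
    "S:del69-70": (21765, 21770),
}


def region_coverage(aligned_seq, start, end):
    """Return the fraction of positions in [start, end] that are not N.

    A gap character ('-') counts as called, because a deletion is a real
    observation, whereas 'N' means the position was not recovered.
    """
    region = aligned_seq[start - 1:end].upper()
    if not region:
        return 0.0
    called = sum(1 for base in region if base != "N")
    return called / len(region)


def report_regions(aligned_seq, regions=REGIONS):
    """Map each region name to its called fraction."""
    return {
        name: region_coverage(aligned_seq, start, end)
        for name, (start, end) in regions.items()
    }
```

If a region comes back with a called fraction of 0, the absence of E484K or the 69-70 deletion in the output most likely reflects missing data rather than a true negative.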
I ran all my sequences in IDSeq, which in turn uses Nextclade. How can I find the accession number?
I’m not quite sure what you mean by accession number, as Nextclade just looks at the sequences you give it. I don’t know how IDSeq uses Nextclade. Maybe you could ask at an IDSeq forum? Or show me a screenshot and explain in more detail what you mean. Happy to help! I’m actually curious how IDSeq incorporates Nextclade, so if you could share a bit that would be super interesting.
Why do you get low % genome recovery? Are you working with wastewater and trying to generate a consensus from a wastewater sequence? Or are you trying to be more cost-effective and only sequence parts of the genome of patient samples?
Hi @corneliusroemer, thanks so much for your response, this is so helpful! I am working on a paper, and we are sequencing wastewater for SARS-CoV-2 on a MinION Mk1B with the ARTIC v3 and v4.1 primers. We also used RT-ddPCR with the GT Molecular SARS-CoV-2 kit for variant calling. We analyzed all our data and noticed that we get 50%-75% genome recovery when the N1 and N2 genes are >30k GC/100mL, and 75%-100% when they are >48k GC/100mL. I guess I just wanted to know what % genome recovery gives you the most accurate variant calling (for the discussion part of the paper). What is a good cutoff % recovery? I did a rapid literature search, and most publications set their % genome recovery cutoff to >90%. I just wanted to know your input, since you are an expert on how Nextclade works.
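For reference, "% genome recovery" as discussed here can be computed directly from a consensus sequence. The sketch below assumes it means the share of the 29,903-nt Wuhan-Hu-1 reference (MN908947.3) covered by unambiguous A/C/G/T calls; other pipelines may define it differently. The function names are hypothetical, and the 90% default is only the cutoff mentioned from the literature search above, not a Nextclade recommendation.

```python
REFERENCE_LENGTH = 29903  # length of Wuhan-Hu-1 (MN908947.3)


def genome_recovery_percent(consensus, reference_length=REFERENCE_LENGTH):
    """Percentage of the reference covered by unambiguous base calls.

    Counts only A, C, G and T; N, gaps and IUPAC ambiguity codes are
    treated as not recovered (an assumption of this sketch).
    """
    called = sum(1 for base in consensus.upper() if base in "ACGT")
    return 100.0 * called / reference_length


def passes_cutoff(consensus, cutoff_percent=90.0):
    """True if recovery meets the cutoff (90% is the literature figure quoted above)."""
    return genome_recovery_percent(consensus) >= cutoff_percent
```

For example, a consensus with 22,500 called bases gives roughly 75% recovery and would fail a 90% cutoff, even if the lineage-defining S-gene positions happen to be covered.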