Deciding on thresholds for calling consensus sequence

Hello everyone,
I’m trying to decide on thresholds for calling consensus sequences for SARS-CoV-2 samples.
For variants with frequencies above 0.7, of course I will include them in the consensus.
But what do you recommend for borderline variants? For example, a position with 60% reference base and 40% alternative?
Should I call a wobble for this position using the IUPAC nucleotide codes? Or should I call N? Or maybe the reference nucleotide?

Thank you very much!

I don’t think there are generally accepted standards or thresholds. My ad-hoc recommendation would be something like this:

Use IUPAC ambiguity codes when there is uncertainty about what the consensus state should be (for example, when the majority base is between 20 and 80%). Otherwise, report the majority base. Don’t replace ambiguous positions with a reference nucleotide. When in doubt, put N.
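
To make the rule concrete, here is a rough Python sketch of how it could be applied to per-position base counts. The function name `call_base`, its parameters, and the `min_depth` cutoff of 10 reads are my own assumptions for illustration, not part of any standard or existing tool; the 20%/80% bounds are the ad-hoc values suggested above.

```python
# Standard IUPAC nucleotide codes, keyed by the set of bases they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def call_base(counts, lower=0.2, upper=0.8, min_depth=10):
    """Call one consensus position from a dict of base -> read count.

    lower/upper are the ad-hoc 20%/80% bounds; min_depth is an assumed
    cutoff below which the position is reported as N.
    """
    depth = sum(counts.values())
    if depth < min_depth:
        return "N"  # too little evidence: in doubt, put N

    freqs = {base: n / depth for base, n in counts.items() if n > 0}
    top_base, top_freq = max(freqs.items(), key=lambda kv: kv[1])
    if top_freq > upper:
        return top_base  # clear majority: report it

    # Otherwise combine every base at or above the lower bound into an
    # ambiguity code. Anything that does not map cleanly becomes N.
    candidates = frozenset(b for b, f in freqs.items() if f >= lower)
    return IUPAC.get(candidates, "N")


print(call_base({"A": 60, "G": 40}))  # the 60/40 case from the question
```

With these settings, the 60/40 example from the question comes out as R (A or G), never as the reference base. Note that the reference sequence is not an input at all, so an ambiguous position cannot silently fall back to it.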



I noticed that for samples that have a missing amplicon, or a region with no coverage (but where the overall %N is still below 5), I get a lot of ambiguities around the missing region, where coverage is still low (~10-30 reads). I am sure these are not real. I also get a lot of short insertions/deletions in these regions.
How do you recommend handling these situations?
