Tree rooting too early

My SARS-CoV-2 tree is rooting too early (like… 2006 early).

What parameters do I need to fix? I imagine it’s in the augur refine step.

augur refine \
            --tree {input.tree} \
            --alignment {input.alignment} \
            --metadata {input.metadata} \
            --output-tree {output.tree} \
            --output-node-data {output.node_data} \
            --timetree \
            --coalescent "opt" \
            --date-confidence \
            --date-inference "marginal" \
            --divergence-units "mutations"

Don’t you want to add --clock-rate 0.0008?

I’ll try that. Where did 0.0008 come from?

That’s what most people are using: 0.0008 substitutions per site per year, or about 2 point mutations per month across the genome, obtained from root-to-tip regression and/or BEAST analysis. The nextstrain/ncov pipeline uses --clock-rate 0.0008 and adds --clock-std-dev 0.0004.
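For anyone wondering how 0.0008 lines up with "2 mutations per month", here is a quick sanity check. It assumes the rate is in substitutions per site per year and uses a genome length of 29,903 nt (the Wuhan-Hu-1 reference); the function name is just made up for illustration.

```python
# Assumption: length of the Wuhan-Hu-1 reference genome, in nucleotides.
GENOME_LENGTH = 29903

# Clock rate quoted in the thread, in substitutions per site per year.
CLOCK_RATE = 0.0008


def mutations_per_month(rate_per_site_per_year, genome_length=GENOME_LENGTH):
    """Convert a per-site, per-year rate into whole-genome mutations per month."""
    mutations_per_year = rate_per_site_per_year * genome_length
    return mutations_per_year / 12


if __name__ == "__main__":
    print(f"{mutations_per_month(CLOCK_RATE):.2f} mutations per month")
```

So the two numbers are the same estimate expressed in different units.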

It seems that the latter is doing a relaxed clock model in treetime but I’m not sure.

Yes, the 0.0008/year comes from root-to-tip regression. It might also help to explicitly root the tree on an early sequence (e.g. with --root). The clock-std-dev is the uncertainty of the rate estimate: the marginal dating distributions are calculated for the central rate and for +/- the std-dev, to get a sense of how the dating depends on the uncertainty of the rate.
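To get a feel for why the rate matters so much for the root date, here is a rough back-of-the-envelope sketch. It is not what TreeTime does internally (that is a proper marginal inference over the whole tree); it just divides a tip's divergence by the rate. The tip (10 mutations, sampled mid-2020) and the "far too slow" rate are made-up numbers for illustration, and the genome length is again assumed to be 29,903 nt.

```python
GENOME_LENGTH = 29903  # assumption: Wuhan-Hu-1 reference length


def implied_root_date(tip_date, mutations, rate, genome_length=GENOME_LENGTH):
    """Naive strict-clock estimate: root date = tip date - divergence / rate."""
    divergence = mutations / genome_length  # substitutions per site
    return tip_date - divergence / rate


def root_date_range(tip_date, mutations, rate, std_dev):
    """Root dates for the central rate and for rate -/+ one std-dev."""
    return {
        "slow": implied_root_date(tip_date, mutations, rate - std_dev),
        "center": implied_root_date(tip_date, mutations, rate),
        "fast": implied_root_date(tip_date, mutations, rate + std_dev),
    }


if __name__ == "__main__":
    # Hypothetical tip: 10 mutations from the root, sampled at 2020.5.
    for label, date in root_date_range(2020.5, 10, 0.0008, 0.0004).items():
        print(f"{label:>6}: {date:.2f}")

    # Hypothetical badly underestimated rate, 30x too slow.
    print(f"rate too slow: {implied_root_date(2020.5, 10, 0.0008 / 30):.2f}")
```

With the quoted rate the root lands around the turn of 2019/2020, while a rate that is too low by an order of magnitude or more pushes it back by many years, which is the kind of thing that can happen when the rate is left to be estimated from a short sampling window.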