Error in Rule Refine

Hello,

I have been trying to run the ncov example data; however, I run into a missing data error at the refine step:

And when I check results/global I see an aligned.fasta file that appears to be correct.
Any thoughts on why the refine step seems to be unable to find this file?

Thanks,
Wes

Can you confirm that the file results/global/aligned.fasta is present?

Yes, results/global/aligned.fasta is present.

Hi @whottel, this looks like an issue that we could try to catch earlier in Augur (before it gets to TreeTime). To help us figure out what’s happening, could you run the following commands and paste the output for each in response here?

For example, what output do you get when you run the following command from your nextstrain Conda environment?

python3 -c "import Bio; print(Bio.__version__)"

Which version of Augur do you have installed?

augur --version

What do you see when you look at the first few lines of the aligned FASTA file?

head results/global/aligned.fasta
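
If it is easier, the three checks above can be bundled into one script. This is only a rough sketch: the helper names (biopython_version, augur_version, first_lines) are made up for illustration, and it assumes augur is on your PATH and that you run it from the top of the ncov directory.

```python
import shutil
import subprocess
from itertools import islice
from pathlib import Path


def biopython_version():
    """Return the installed Biopython version, or a note if it is missing."""
    try:
        import Bio
    except ImportError:
        return "not installed"
    return Bio.__version__


def augur_version():
    """Return the output of `augur --version`, or a note if augur is not found."""
    if shutil.which("augur") is None:
        return "augur not found on PATH"
    result = subprocess.run(["augur", "--version"], capture_output=True, text=True)
    return (result.stdout or result.stderr).strip()


def first_lines(path, count=10):
    """Return the first `count` lines of a file, like `head`."""
    path = Path(path)
    if not path.exists():
        return [f"{path} does not exist"]
    with path.open() as handle:
        return [line.rstrip("\n") for line in islice(handle, count)]


if __name__ == "__main__":
    print("Biopython:", biopython_version())
    print("Augur:", augur_version())
    # Path taken from this thread; adjust it if your build name differs.
    print("\n".join(first_lines("results/global/aligned.fasta")))
```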

python3 -c "import Bio; print(Bio.__version__)"

1.65

augur --version

augur 11.2.0

head results/global/aligned.fasta

>MN908947
attaaaggtttataccttcccaggtaacaaaccaaccaactttcgatctcttgtagatct
gttctctaaacgaactttaaaatctgtgtggctgtcactcggctgcatgcttagtgcact
cacgcagtataattaataactaattactgtcgttgacaggacacgagtaactcgtctatc
ttctgcaggctgcttacggtttcgtccgtgttgcagccgatcatcagcacatctaggttt
cgtccgggtgtgaccgaaaggtaagatggagagccttgtccctggtttcaacgagaaaac
acacgtccaactcagtttgcctgttttacaggttcgcgacgtgctcgtacgtggctttgg
agactccgtggaggaggtcttatcagaggcacgtcaacatcttaaagatggcacttgtgg
cttagtagaagttgaaaaaggcgttttgcctcaacttgaacagccctatgtgttcatcaa
acgttcggatgctcgaactgcacctcatggtcatgttatggttgagctggtagcagaact

Thanks, @whottel! I recreated the error you experienced by manually installing BioPython 1.65. I traced the issue back to a bug in BioPython’s alignment reader when it is run in Python 3.8. When I upgraded BioPython to TreeTime’s minimum version (1.66), I got the same error. When I upgraded to Augur’s minimum version (1.67), I did not get the error.
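
For anyone who wants to check their own environment before running the workflow, here is a minimal sketch of the kind of early check I have in mind. It is not what Augur actually does; the function names are hypothetical, and the 1.67 minimum is simply the version from my test above. A version comparison is only a proxy for the underlying bug, so treat a pass as "probably fine" rather than a guarantee.

```python
import re

# Assumption: 1.67 is the lowest Biopython version that did not show the error
# in the test described in this thread.
MINIMUM_BIOPYTHON = "1.67"


def parse_version(version):
    """Turn a version string such as '1.65' into a tuple of ints, e.g. (1, 65)."""
    return tuple(int(part) for part in re.findall(r"\d+", version)[:2])


def is_supported(installed, minimum=MINIMUM_BIOPYTHON):
    """Return True when the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def check_biopython():
    """Describe whether the Biopython in this environment meets the minimum."""
    try:
        import Bio
    except ImportError:
        return "Biopython is not installed"
    if is_supported(Bio.__version__):
        return f"Biopython {Bio.__version__} meets the minimum ({MINIMUM_BIOPYTHON})"
    return f"Biopython {Bio.__version__} is older than {MINIMUM_BIOPYTHON}; please upgrade"


if __name__ == "__main__":
    print(check_biopython())
```

Versions are compared as integer tuples rather than as strings or floats, so that something like 1.8 and 1.76 would not be ordered by accident of text sorting.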

There are a couple of different solutions to this problem, and all of them involve upgrading BioPython.

Upgrade BioPython

To fix this problem, you can upgrade BioPython to the latest version Augur currently supports (1.76) with either Conda or pip like so:

# Upgrade with conda
conda install biopython==1.76

# Or upgrade with pip
python3 -m pip install biopython==1.76

Create a fresh Nextstrain environment from scratch

I’m a little worried though that your environment has the latest version of Augur and a version of BioPython from late 2014. If you wanted to make sure all of your software is installed with the latest versions, you can start from scratch with a fresh Conda environment like this:

# Remove the current Nextstrain environment.
conda activate base
conda env remove -n nextstrain

# Update Conda.
conda update conda

# Create a fresh Conda environment for Nextstrain.
conda create -n nextstrain -c conda-forge -c bioconda nextstrain

# Confirm your environment worked.
conda activate nextstrain
augur --version

Let Snakemake manage your Conda environment automatically

Alternatively, you can run the ncov workflow with Snakemake’s --use-conda flag (snakemake --use-conda [other arguments you normally provide]) and the workflow will automatically create a standalone Conda environment for you. Whenever that environment changes in our main repo and you pull down the changes, Snakemake will detect the change and update the environment for you.

Great!

I tried using the --use-conda flag and everything seems to have worked.

Thanks for all the help!

-Wes
