Hi!
I’m trying to run a Nextstrain build on the latest data, using a build I tested before (it worked great). Now, however, I’m getting an error in the alignment rule. The error doesn’t occur with the example build.
Here is the output of the logs/align.txt file:
/home/danesh/miniconda2/envs/nextstrain/bin/mafft: line 2719: 14699 Killed "$prefix/addsingle" -Q 100 $legacygapopt -W $tuplesize -O $outnum $addsinglearg $addarg $add2ndhalfarg -C $numthreads $memopt $weightopt $treeinopt $treeoutopt $distoutopt $seqtype $model -f "-"$gop -h $aof $param_fft $localparam $algopt $treealg $scoreoutarg < infile > /dev/null 2>> "$progressfile"
Is it because this dataset is larger than the example dataset, or larger than the one I used before with the same build? I get the same error on the cluster, though.
This has the signs of an out-of-memory condition during alignment with MAFFT. The clue here from the log file is:
mafft: line 2719: 14699 Killed "$prefix/addsingle"
which indicates that the mafft subprocess with process ID 14699 (addsingle) was killed from outside rather than exiting on its own.
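If you want to check other logs for the same signature, here is a minimal Python sketch. It assumes the shell's usual "script: line N: PID Killed command" message format, as seen in the log above; the function name find_killed and the shortened sample string are just for illustration.

```python
import re

# Matches shell messages of the form:
#   <script>: line <N>: <PID> Killed <command> ...
KILLED_RE = re.compile(
    r"^(?P<script>.+?): line (?P<lineno>\d+):\s+(?P<pid>\d+)\s+Killed\s+(?P<command>\S+)"
)

def find_killed(log_text):
    """Return one dict per 'Killed' message found in the log text."""
    hits = []
    for line in log_text.splitlines():
        match = KILLED_RE.search(line)
        if match:
            hits.append({
                "script": match.group("script"),
                "pid": int(match.group("pid")),
                # Drop the surrounding quotes the shell prints around the command.
                "command": match.group("command").strip('"'),
            })
    return hits

# Shortened copy of the line from logs/align.txt (illustrative only).
sample = ('/home/danesh/miniconda2/envs/nextstrain/bin/mafft: line 2719: 14699 '
          'Killed "$prefix/addsingle" -Q 100 $legacygapopt')

for hit in find_killed(sample):
    print(hit["script"], hit["pid"], hit["command"])
```

A match only tells you the process received a kill signal. It does not by itself prove the cause was memory, so it is still worth confirming against the memory actually available to the job.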
When the computer runs out of memory, the operating system starts terminating processes, typically the ones using the most memory, until the memory pressure is relieved. During a Nextstrain build, that process is often MAFFT. It likely ran fine for you in the past because fewer sequences were being aligned.
I’d suggest checking what resources are available on your cluster and whether you can increase the memory available to your job.
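As a quick first check on a Linux node, something like the following sketch reports how much memory is currently available. It assumes /proc/meminfo exists and has a MemAvailable line (true on reasonably recent Linux kernels); the function name mem_available_gib is made up for this example. Note that a cluster scheduler may give your job less memory than the node has, so the job's own limit is what matters in the end.

```python
import os

def mem_available_gib(meminfo_text):
    """Parse /proc/meminfo-style text and return MemAvailable in GiB, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            kib = int(line.split()[1])  # the value is reported in kB (KiB)
            return kib / (1024 ** 2)
    return None

# Only read the real file where it exists (Linux).
if os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as handle:
        available = mem_available_gib(handle.read())
    if available is not None:
        print(f"Available memory: {available:.1f} GiB")
```

Running this right before the build starts gives a rough idea of the headroom the alignment step has to work with.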