Error in rule tree

Hello. I have been working on setting up a functional Nextstrain pipeline for the last few weeks at Case Western Reserve University in Cleveland. I was able to get the example COVID build to work just fine, but when I try to run the same build with the GISAID metadata and sequences, I keep getting the same error. The build makes it all the way to the “augur tree” command and then stalls there until it returns this error message:

I have been troubleshooting this error for quite a while now with the Nextstrain tools, but can’t seem to get past it. I’m thinking it might have something to do with IQ-TREE or maybe just the massive amount of data. I’d love to hear if anyone has any input on what might be causing this problem. Thanks!

Hi @awertz99 I’m sorry you’re having trouble! If you look in the two log files mentioned, do they provide any more details about the error? The message here doesn’t say anything very specific. If you can post any error messages from the log files, perhaps we can help!

Did you hit Ctrl-C by any chance? The ^C looks like the user terminated the process.

Hello @emmahodcroft and @rneher. I believe I did cut it off too early on my last build, but this time I let the build run until failure, and here is the error message I received.

I noticed it says “Maximum recursion depth reached”. Do you know how to actually change the AUGUR_RECURSION_LIMIT? And if so, should I be able to change it and resume the same build where it left off, or would you suggest just running a brand new build? Let me know!

Yes, you can set the variable AUGUR_RECURSION_LIMIT, for example by typing
export AUGUR_RECURSION_LIMIT=10000 on the command line.

And yes - once you set this you should be able to restart the build from where it left off!
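As a minimal sketch of the steps above: the export only applies to the current terminal session, so the build has to be restarted from that same session. The value 10000 is the one suggested above. The build command itself is not shown because it depends on how you launched the build originally.

```shell
# Raise augur's recursion limit for this shell session only.
export AUGUR_RECURSION_LIMIT=10000

# Confirm that the variable is set before restarting the build.
echo "AUGUR_RECURSION_LIMIT is ${AUGUR_RECURSION_LIMIT}"

# Now re-run the same build command as before, from the same directory
# and in this same terminal, so the workflow can reuse the outputs it
# has already produced.
```

To make the setting persist across sessions, the export line could be added to your shell startup file.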

Hello again @emmahodcroft and @rneher. My build failed again, but this time it looks like it is due to timetree. Here is the error; I’m not too sure what the root cause might be. And FYI, I am running this build with the GISAID global data + 11 Cleveland-specific sequences.

It looks like you are running a really big tree, right?

As a result, the optimization of the skyline seems to have run into numerical issues. You could try replacing --coalescent skyline with --coalescent opt.
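For illustration, here is roughly what an augur refine call looks like with that one flag swapped. This is only a sketch: the file names are placeholders and not the actual paths used by the ncov workflow, and the snippet just prints the command so you can compare it with what your build runs.

```shell
# The only change is the value passed to --coalescent.
COALESCENT="opt"   # previously "skyline"

# Placeholder file names; substitute the ones your workflow actually uses.
REFINE_CMD="augur refine \
--tree tree_raw.nwk \
--alignment aligned.fasta \
--metadata metadata.tsv \
--output-tree tree.nwk \
--output-node-data branch_lengths.json \
--timetree \
--coalescent ${COALESCENT}"

# Print the command for inspection rather than running it.
echo "$REFINE_CMD"
```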

@rneher yes, it will end up being a pretty big tree, I believe. Since this means changing the augur refine command, how do I make that change on the command line? Or does this need to be done in an augur/ncov file?

You need to change this here:


@rneher my build just finished through! Got a beautiful looking tree with 11,500 sequences, thanks so much for the help!

Hello @emmahodcroft and @rneher. I recently updated my global GISAID data which in turn significantly increased the amount of data coming from the fasta and tsv files. When I try to run a build with the updated data, I get this error in the rule align.


Given the updated data and some of the other discussion posts, I’m assuming this error occurs because there isn’t enough memory available for mafft. Would you guys know how to check how much memory is available to mafft and/or how to allocate enough memory for the build as a whole? Thanks!

mafft easily needs more than 100GB for the most recent data. Unless you have a node with very large memory, you have to do one of the following:

  • chunk the alignment into smaller files and run mafft on each chunk
  • use nextalign
  • use something else like minimap2
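A minimal sketch of the first option, assuming a plain FASTA input: the helper below (split_fasta is a made-up name, not part of augur or mafft) writes chunks of at most a given number of sequences. The mafft line in the comment is an assumed invocation for aligning each chunk against a reference so that all chunks end up the same length; check it against your mafft version before relying on it.

```shell
# Split a FASTA file into chunks of at most CHUNK_SIZE sequences each.
# Usage: split_fasta INPUT_FASTA CHUNK_SIZE OUTPUT_PREFIX
split_fasta() {
  input="$1"
  chunk_size="$2"
  prefix="$3"
  awk -v n="$chunk_size" -v prefix="$prefix" '
    /^>/ {
      # Start a new output file every n sequence headers.
      if (count % n == 0) {
        if (out != "") close(out)
        out = sprintf("%s%04d.fasta", prefix, count / n)
      }
      count++
    }
    # Write header and sequence lines to the current chunk.
    out != "" { print > out }
  ' "$input"
}

# Each chunk could then be aligned separately, for example (assumed):
#   mafft --auto --keeplength --addfragments chunk_0000.fasta reference.fasta
```

For example, split_fasta sequences.fasta 5000 chunk_ would write chunk_0000.fasta, chunk_0001.fasta, and so on; the chunk size of 5000 is arbitrary and should be tuned to the memory you have.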

@rneher I figured it was a pretty large number. Regarding using nextalign, how would I go about specifying its use over mafft?

@rneher How do I use the config flag to choose nextalign for my build? I saw in the code that there may be a use_nextalign variable; do I need to set it to a certain value?

You would want these two lines defining genes and use_nextalign:

in your config

@rneher Awesome, I got my build to specifically use nextalign, but it looks like it is not recognizing the nextalign command. I downloaded a MacOS version of nextalign from nextclade. Do I need to move the nextalign download to a specific folder so the command can be recognized? I also tried pulling the most recent ncov folder from GitHub so that it includes the most recent code, including nextalign. That didn’t seem to work either. Is there a specific version of nextstrain/ncov I need to pull?


Here is the error I am getting, just in case it is helpful.

Yeah, that error indicates that nextalign isn’t being found in your environment. If you run which nextalign you should get no output, confirming that it cannot be found.

There are a number of approaches. However, looking at your previous error messages, it seems you are running inside a conda environment named nextstrain, so I would recommend placing the nextalign executable inside the /Users/audricwertz/miniconda3/envs/nextstrain/bin folder. After this, which nextalign should indicate that it can be found, and you should be able to run nextalign --help to check that it runs OK.

P.S. If you run nextalign --help and see something like “Killed, code 9”, this indicates that MacOS isn’t running it because it’s from an unknown developer. You should be able to go to System Preferences → Security & Privacy and then check “allow nextalign to run” or similar.
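The steps above can be sketched as a small helper. The function name install_into_env and the download file name in the usage comment are hypothetical; CONDA_PREFIX is the variable conda sets to the active environment's folder, so with the nextstrain environment activated its bin folder should be the one mentioned above.

```shell
# Copy a downloaded executable into a bin folder as "nextalign"
# and mark it executable.
# Usage: install_into_env DOWNLOADED_FILE BIN_DIR
install_into_env() {
  src="$1"       # path to the downloaded binary
  bin_dir="$2"   # e.g. "$CONDA_PREFIX/bin" with the environment activated
  cp "$src" "$bin_dir/nextalign" && chmod +x "$bin_dir/nextalign"
}

# Example (the download file name is hypothetical):
#   conda activate nextstrain
#   install_into_env ~/Downloads/nextalign-MacOS-x86_64 "$CONDA_PREFIX/bin"
#   which nextalign     # should now print a path
#   nextalign --help    # should print the usage text
```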

I wrote a quick summary on how to switch to nextalign:

Nextalign stuff has been working perfectly. Ran into a new error I haven’t seen before during the augur tree command.


@rneher and @james, have you guys seen this error before? Do I need to specify omp_set_max_active_levels? I’m not sure how it would have been deprecated, since I don’t believe I updated anything major since my last build that worked. Any ideas?