fix_tree.py error for mpox trees

In the step

python3 scripts/fix_tree.py \
    --alignment results/aspen/masked.fasta \
    --input-tree results/aspen/tree_raw.nwk \
    --output results/aspen/tree_fixed.nwk

For some trees I got this error:

Below NODE_0000128: ('C', 5339, 'T') in NODE_0000129 reverted in NODE_0000130
Below NODE_0000258: ('C', 29762, 'T') in NODE_0000259 reverted in NODE_0000401
Traceback (most recent call last):
  File "/code/mpox/phylogenetic/scripts/fix_tree.py", line 63, in <module>
    reversion["child"].clades.remove(reversion["grandchild"])
ValueError: list.remove(x): x not in list

These trees were built from somewhat random samples just for testing, so they may be strange in a biological sense. Is that the reason? I don’t understand the setup well enough to debug it.

Data and code are the latest versions.
Thanks!

Hi @dlu, this is an issue that others have run into before, too. A while back, I pushed a fix to the proximal issue of the script trying to remove items that don’t exist in the list, but this doesn’t necessarily fix the ultimate issue with the input data that causes that script to fail.
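To illustrate the proximal issue only, here is a minimal sketch of a guarded removal. It is not the actual code from the fix; the Node class and the detach_grandchild helper are hypothetical stand-ins, and only the reversion["child"] / reversion["grandchild"] shape is taken from the traceback above.

```python
class Node:
    """Hypothetical stand-in for a tree node with a list of child clades."""

    def __init__(self, name):
        self.name = name
        self.clades = []


def detach_grandchild(reversion):
    """Remove the grandchild from the child's clades only if it is still there.

    Returns True if something was removed, False otherwise, so the caller
    can log the skipped case instead of crashing with ValueError.
    """
    child = reversion["child"]
    grandchild = reversion["grandchild"]
    if grandchild in child.clades:
        child.clades.remove(grandchild)
        return True
    return False


# Example: the second call is a no-op instead of raising
# "ValueError: list.remove(x): x not in list".
child = Node("NODE_0000129")
grandchild = Node("NODE_0000130")
child.clades.append(grandchild)
reversion = {"child": child, "grandchild": grandchild}
first = detach_grandchild(reversion)
second = detach_grandchild(reversion)
```

A guard like this only stops the crash. It does not explain why the grandchild was already gone, which is the part that depends on the input data.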

@corneliusroemer, is there anything you’d suggest here? My first thought is that @dlu could try running the mpox workflow from the fix-fix-tree branch in the PR linked above…

Hi John, thanks for a quick reply! Let me bring in this change to our system.

With the change, fix_tree ran through, but in 2 different test runs some nodes showed up twice in tree_fixed.nwk, which led to an error in rule export. In another test run, rule refine had an error.
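For anyone who wants to check their own output for repeated node names, here is a rough sketch. It assumes plain, unquoted Newick labels and pulls them out with a regular expression rather than a real parser, so numeric support values written after a closing parenthesis would also be counted as names. The function name duplicated_labels is made up for this example.

```python
import re
from collections import Counter

# A label is a run of characters that directly follows "(", "," or ")".
# Branch lengths are skipped because they follow ":" instead.
LABEL_PATTERN = re.compile(r"(?<=[(,)])\s*([^\s():,;\[\]]+)")


def duplicated_labels(newick_text):
    """Return a dict of label -> count for labels that occur more than once."""
    counts = Counter(LABEL_PATTERN.findall(newick_text))
    return {label: n for label, n in counts.items() if n > 1}


# Small made-up tree in which tip "A" and internal node "NODE_1" repeat.
example = "((A:1,B:1)NODE_1:1,(C:1,A:1)NODE_1:1)root;"
print(duplicated_labels(example))
```

To run it on a real file, read results/aspen/tree_fixed.nwk into a string and pass that to duplicated_labels; an empty result means no repeated names were found under the assumptions above.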


I don’t need to fix these issues, but I wanted to understand them better. My intuition is that the samples in these trees are just challenging for tree building, for example because they are too different or too distant, and some of the steps couldn’t handle them properly. Does that sound right?