Augur refine error when calling --timetree

Hi, community members,

I used the --timetree option of augur refine and got an error:

augur refine \
    -t H7N9_HA_454/o4_ha_iqtree.newick \
    -a H7N9_HA_454/o3_ha_align.fasta \
    --metadata H7N9_HA_454/o1_h7_ha_metadata.tsv \
    --timetree \
    --output-tree H7N9_HA_454/o5_ha_refined_tree.tree \
    --output-node-data H7N9_HA_454/o5_ha_refined_node_data.json

Error

augur refine is using TreeTime version 0.8.4
Traceback (most recent call last):
  File "/home/yzu/miniconda3/envs/nst/bin/augur", line 10, in <module>
    sys.exit(main())
  File "/home/yzu/miniconda3/envs/nst/lib/python3.8/site-packages/augur/__main__.py", line 10, in main
    return augur.run( argv[1:] )
  File "/home/yzu/miniconda3/envs/nst/lib/python3.8/site-packages/augur/__init__.py", line 75, in run
    return args.__command__.run(args)
  File "/home/yzu/miniconda3/envs/nst/lib/python3.8/site-packages/augur/refine.py", line 195, in run
    metadata, columns = read_metadata(args.metadata)
  File "/home/yzu/miniconda3/envs/nst/lib/python3.8/site-packages/augur/utils.py", line 81, in read_metadata
    return MetadataFile(fname, query, as_data_frame).read()
  File "/home/yzu/miniconda3/envs/nst/lib/python3.8/site-packages/augur/util_support/metadata_file.py", line 22, in read
    self.check_metadata_duplicates()
  File "/home/yzu/miniconda3/envs/nst/lib/python3.8/site-packages/augur/util_support/metadata_file.py", line 60, in check_metadata_duplicates
    self.metadata[self.key_type]
  File "/home/yzu/miniconda3/envs/nst/lib/python3.8/site-packages/pandas/core/frame.py", line 4060, in query
    res = self.eval(expr, **kwargs)
  File "/home/yzu/miniconda3/envs/nst/lib/python3.8/site-packages/pandas/core/frame.py", line 4191, in eval
    return _eval(expr, inplace=inplace, **kwargs)
  File "/home/yzu/miniconda3/envs/nst/lib/python3.8/site-packages/pandas/core/computation/eval.py", line 353, in eval
    ret = eng_inst.evaluate()
  File "/home/yzu/miniconda3/envs/nst/lib/python3.8/site-packages/pandas/core/computation/engines.py", line 80, in evaluate
    res = self._evaluate()
  File "/home/yzu/miniconda3/envs/nst/lib/python3.8/site-packages/pandas/core/computation/engines.py", line 121, in _evaluate
    return ne.evaluate(s, local_dict=scope)
  File "/home/yzu/miniconda3/envs/nst/lib/python3.8/site-packages/numexpr/necompiler.py", line 823, in evaluate
    signature = [(name, getType(arg)) for (name, arg) in
  File "/home/yzu/miniconda3/envs/nst/lib/python3.8/site-packages/numexpr/necompiler.py", line 823, in <listcomp>
    signature = [(name, getType(arg)) for (name, arg) in
  File "/home/yzu/miniconda3/envs/nst/lib/python3.8/site-packages/numexpr/necompiler.py", line 705, in getType
    raise ValueError("unknown type %s" % a.dtype.name)
ValueError: unknown type object

When I remove --timetree, the error goes away.
augur version: 13.0.2
I have already tried conda update --all.
Any ideas?

Thanks

Is it possible to share your input files so that one can try to reproduce the error?

Yes, the github address was sent.

This is a problem with augur parsing your metadata. It would be helpful to have the file to debug augur. Is this data from GISAID?

Hi, Richard

Thanks a lot for your reply.

The data come not only from GISAID, but also from NCBI and from ourselves. I prepared the metadata myself.
I will double-check the metadata and see what is happening.

Best,
Yang

Just to be clear: I think this is a problem in augur that surfaced because of a peculiarity of your metadata. augur should not fail like this. We’d therefore be interested in finding out what exactly triggered this and how we can fix augur. So if you can share your metadata (or just a part of it), we’d be grateful.

Hi, Richard. This is the GitHub address: GitHub - virologist/nextstrain: My Nextstrain /Not publish

Thanks. This runs without problems for me. What is your pandas version? I am using 1.2.3.
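
In case it helps, here is a minimal sketch for listing the relevant versions from inside the environment that runs augur. It only uses the standard library (`importlib.metadata`, Python 3.8+). The helper name `installed_versions` is made up for this example, and the package list is an assumption: `nextstrain-augur` is the name augur is published under on PyPI, and `numexpr` is included because it shows up at the bottom of the traceback.

```python
from importlib import metadata


def installed_versions(packages=("pandas", "numexpr", "nextstrain-augur")):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Run it with the Python interpreter of the same conda env (here `nst`), otherwise it reports on a different installation.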

Hi, Richard

I figured it out. Thanks for your help.
It turned out that my conda env had lost pandas after my last update.
It’s weird, and I have no idea why my env lost pandas.
After I reinstalled pandas (version 1.1.3), it works.
Thank you again.

Best wishes
Yang

I get a very similar error message after running conda update --all; I am now on Python 3.9. My previous Nextstrain installations got through the timetree step without problems. Removing pandas 1.3.4 and installing 1.2.3 did not help.
Thank you for your help.

Looks like it might be a problem with pandas itself. See this chain of issues:

It seems that the pandas devs are in no particular hurry to address this. It might be challenging for us to catch every case in which a given pandas version picks a different dtype, especially considering that this code processes user-provided input.

We might need to forcefully convert all the metadata fields to the types we expect them to be. There will be a lot of them. Someone with more knowledge of this particular part of augur might know how to handle this.
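
To illustrate the idea, here is a rough sketch in plain pandas. It is not augur's actual code: the function names `load_metadata` and `duplicated_strains`, the `strain` key column, and the sample table are all assumptions made up for this example. It reads every column as a string so that pandas does not guess dtypes, and it passes `engine="python"` to `DataFrame.query` so that the expression is not handed to numexpr, which is where the traceback above ends.

```python
import io

import pandas as pd

# Hypothetical sample table; real metadata files will have more columns.
SAMPLE_TSV = (
    "strain\tdate\tcountry\n"
    "A/1\t2013-03-01\tChina\n"
    "A/2\t2013\tChina\n"
    "A/1\t2013-04-XX\tChina\n"
)


def load_metadata(source):
    """Read a TSV metadata table with every column forced to string."""
    # dtype=str stops pandas from inferring numeric or mixed dtypes;
    # keep_default_na=False keeps empty cells as "" instead of NaN.
    return pd.read_csv(source, sep="\t", dtype=str, keep_default_na=False)


def duplicated_strains(metadata, key="strain"):
    """Return the sorted key values that occur more than once."""
    counts = metadata.groupby(key).size().rename("n").reset_index()
    # engine="python" evaluates the query without numexpr.
    dupes = counts.query("n > 1", engine="python")
    return sorted(dupes[key])


if __name__ == "__main__":
    meta = load_metadata(io.StringIO(SAMPLE_TSV))
    print(duplicated_strains(meta))
```

Reading everything as strings means dates and numeric fields have to be converted explicitly later, which is the "there will be a lot" part of the work.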

Hi, Ivan, thanks a lot for your reply.