300 or more distinct discrete states found in ancestral reconstruction

Mike.Lloyd · February 11, 2021, 12:59am

I am encountering the following error:

augur traits is using TreeTime version 0.8.0
Assigned discrete traits to 4631 out of 4631 taxa.

NOTE: previous versions (<0.7.0) of this command made a 'short-branch
length assumption. TreeTime now optimizes the overall rate numerically
and thus allows for long branches along which multiple changes
accumulated. This is expected to affect estimates of the overall rate
while leaving the relative rates mostly unchanged.
ERROR: 300 or more distinct discrete states found. TreeTime is currently not set up to handle that many states.
[Wed Feb 10 19:49:39 2021]
Error in rule traits:
    jobid: 14
    output: results/NorthAmerica/traits.json
    log: logs/traits_NorthAmerica.txt (check log file(s) for error message)
    shell:

        augur traits             --tree results/NorthAmerica/tree.nwk             --metadata results/NorthAmerica/metadata_adjusted.tsv             --output results/NorthAmerica/traits.json             --columns country_exposure division_exposure             --confidence             --sampling-bias-correction 3 2>&1 | tee logs/traits_NorthAmerica.txt

This is a new error for me. I was trying to re-run the North America build from a week or so ago. Can someone comment on what I am doing wrong here?

Here is my build:

builds:
  NorthAmerica:
    subsampling_scheme: NA
    auspice_config: "my_profiles/NorthAmerica/auspice.json"
    region: global

subsampling:
  NA:
    global:
      group_by: "country year month"
      seq_per_group: 1000


## You could also specify a title which will be used for all builds
## where you haven't already specified a title above:
## (Uncomment this to use!):
title: "SARS-CoV-2 Build Focused on North America"

files:
    description: my_profiles/NorthAmerica/description.md

exposure:
  NorthAmerica:
    trait: "division"
    exposure: "division_exposure"

#########
## Finally, you can specify what traits you want to reconstruct, per build.
## If you have exposure information for a trait, specify it as well, so that
## this is incorporated into the final run. (Otherwise, it will simply be excluded)

## First, specify the traits you want to reconstruct, per build:
## (uncomment to use!)
traits:
   NorthAmerica:
        sampling_bias_correction: 3
        columns: ["country_exposure"]

I have pre-filtered the metadata to only those samples I want to include from the public NA build, which is why I used 1000 as my filter number here.

Thanks.

ldelaye · March 12, 2021, 5:57am

Dear @Mike.Lloyd, were you able to solve your question? I have the same issue.
Best!
L

Mike.Lloyd · March 12, 2021, 2:34pm

I changed my build so that it is no longer based on division_exposure. I believe the issue was that I had too many divisions in my build. I am not sure of your build specifics, but I would suggest trying a coarser level of exposure for the trait reconstruction.
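To check whether a column is likely to hit the limit before running the build, one could count its distinct values in the metadata file. The following is only a rough sketch: the helper name `count_states` and the tiny example table are made up for illustration, and treating empty values and `?` as missing is an assumption about how the metadata is coded, so the count augur arrives at may differ slightly.

```python
import csv
import io


def count_states(tsv_file, columns):
    """Return {column: number of distinct non-missing values} for a TSV file object."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    seen = {col: set() for col in columns}
    for row in reader:
        for col in columns:
            value = (row.get(col) or "").strip()
            # Assumption: empty cells and "?" mean "unknown" and are skipped.
            if value and value != "?":
                seen[col].add(value)
    return {col: len(values) for col, values in seen.items()}


# Made-up stand-in for results/NorthAmerica/metadata_adjusted.tsv
example = io.StringIO(
    "strain\tcountry_exposure\tdivision_exposure\n"
    "s1\tUSA\tMaine\n"
    "s2\tUSA\tTexas\n"
    "s3\tCanada\tOntario\n"
    "s4\tCanada\t?\n"
)

counts = count_states(example, ["country_exposure", "division_exposure"])
for col, n in counts.items():
    # The error message rejects "300 or more" distinct states.
    print(col, n, "too many states" if n >= 300 else "ok")
```

For a real run, the `io.StringIO` object would be replaced with `open("results/NorthAmerica/metadata_adjusted.tsv", newline="")`.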

ldelaye · March 12, 2021, 7:29pm

@Mike.Lloyd: yes, it seems that you get the error message if you have too many divisions. I changed the subsampling scheme from global to country and that worked.

rneher · March 13, 2021, 9:50am

augur currently caps the number of discrete states (countries, divisions, locations): as the error message says, 300 or more are rejected. This used to be a fundamental limitation in TreeTime, but it no longer is. The cap in augur is still in place, though.

That said, you likely won’t get useful results from such a sparse, high-dimensional reconstruction. To avoid the error, you can simply switch off reconstruction of ancestral divisions.
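As a rough illustration of what switching off the division reconstruction amounts to, the sketch below drops any column at or above the cap and assembles the same `augur traits` call shown in the log at the top of the thread. The helper names and the state counts are made up for illustration; the paths and flags are copied from that log.

```python
MAX_STATES = 300  # the error message rejects "300 or more" distinct states


def columns_under_cap(counts, max_states=MAX_STATES):
    """Keep only the columns whose number of distinct states is below the cap."""
    return [col for col, n in counts.items() if n < max_states]


def traits_command(columns):
    """Build the augur traits call from the log, restricted to the given columns."""
    return [
        "augur", "traits",
        "--tree", "results/NorthAmerica/tree.nwk",
        "--metadata", "results/NorthAmerica/metadata_adjusted.tsv",
        "--output", "results/NorthAmerica/traits.json",
        "--columns", *columns,
        "--confidence",
        "--sampling-bias-correction", "3",
    ]


# Made-up counts: pretend the division column has too many distinct values.
counts = {"country_exposure": 120, "division_exposure": 450}
cmd = traits_command(columns_under_cap(counts))
print(" ".join(cmd))
```

Within the workflow, the equivalent change would presumably be made in the `columns` list under `traits` in the build config rather than by calling the command by hand.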
