The augur export doesn’t seem to recognize my metadata and associate it with my tree… I think?
log excerpt:
augur export v2 --tree results/new-york/tree.nwk --metadata results/new-york/metadata_adjusted.tsv.xz --node-data results/new-york/branch_lengths.json results/new-york/nt_muts.json results/new-york/aa_muts.json results/new-york/emerging_lineages.json results/new-york/clades.json results/new-york/recency.json results/new-york/traits.json results/new-york/logistic_growth.json results/new-york/mutational_fitness.json results/new-york/distances.json results/new-york/epiweeks.json --auspice-config my_profiles/nyc-local/ny_auspice_config.json --include-root-sequence --colors results/new-york/colors.tsv --lat-longs my_profiles/nyc-local/lat_longs.tsv --title 'Genomic epidemiology of novel coronavirus - New York-focused subsampling' --description results/new-york/ --output results/new-york/ncov_with_accessions.json 2>&1 | tee logs/export_new-york.txt
WARNING: You asked for a color-by for trait 'legacy_clade_membership', but it has no values on the tree. It has been ignored.
WARNING: You asked for a color-by for trait 'pangolin_lineage', but it has no values on the tree. It has been ignored.
WARNING: You asked for a color-by for trait 'zip_code', but it has no values on the tree. It has been ignored.
WARNING: You asked for a color-by for trait 'borough', but it has no values on the tree. It has been ignored.
WARNING: You asked for a color-by for trait 'ethnicity', but it has no values on the tree. It has been ignored.
WARNING: You asked for a color-by for trait 'date_of_death', but it has no values on the tree. It has been ignored.
WARNING: You asked for a color-by for trait 'specimen_type', but it has no values on the tree. It has been ignored.
I wonder whether your metadata file is being read correctly by augur export. Can you try reading in an uncompressed version, without the .xz ending?
Where do you specify that you want all these traits colored? They don’t appear in your config? Or only in the colors.tsv?
You may need to pass --color-by-metadata to augur export, followed by all the metadata columns you want included. See the augur export page in the Augur 14.0.0 documentation.
Reading it in uncompressed does not change the error.
I specify them in the config file, of which I previously posted only an excerpt.
I did get one of the traits to show up with the config as below. Previously the key was specified as pangolin_lineage (the column name in the metadata that is fed into the Nextstrain ncov pipeline), but that column is renamed to pango_lineage by default/parameters.yaml in the resulting metadata_adjusted.tsv. Columns such as zip_code, however, do not get renamed.
Even specifying --color-by-metadata does not fix the issue. The metadata file is quite large, so some field values could have peculiarities, but it is standardized to a degree because it is exported from a SQL database.
augur export matches the metadata with the nodes in the tree via the strain name provided in the strain column. Can you confirm that the strain names in the metadata match the strain names in results/new-york/tree.nwk?
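One way to check this is a small script that compares the tip names in the tree with the values in the strain column. The sketch below is only an illustration: the function names are hypothetical, it uses the Python standard library only, and it assumes the Newick file has unquoted tip names and that the metadata is a tab-separated file (optionally .xz-compressed) with a column literally called strain.

```python
import csv
import lzma
import re


def read_metadata_strains(path, column="strain"):
    """Return the set of values in the strain column of a TSV file.

    Files ending in .xz are opened with lzma; anything else as plain text.
    """
    opener = lzma.open if str(path).endswith(".xz") else open
    with opener(path, "rt", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return {row[column] for row in reader}


def newick_tip_names(newick_text):
    """Extract tip names from a Newick string.

    Simplifying assumption: names are unquoted. Tip labels directly follow
    "(" or ",", while internal node labels follow ")", so they are skipped.
    """
    return set(re.findall(r"[(,]\s*([^(),:;\s]+)", newick_text))


def compare_strains(tree_tips, metadata_strains):
    """Report names that appear on only one side of the tree/metadata join."""
    return {
        "tips_without_metadata": sorted(tree_tips - metadata_strains),
        "metadata_not_in_tree": sorted(metadata_strains - tree_tips),
    }
```

With the paths from the log above, this could be called as compare_strains(newick_tip_names(open("results/new-york/tree.nwk").read()), read_metadata_strains("results/new-york/metadata_adjusted.tsv.xz")). If tips_without_metadata lists most of the tree, the names differ between the two files (for example in whitespace or prefixes), which would explain why no trait has values on the tree.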
Also quick note, I don’t see the legacy_clade_membership column in your metadata excerpt.
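A similar check can be run for column names: list every color-by key in the Auspice config that has no matching column in the metadata header. This is a rough sketch with hypothetical function names. It assumes the config uses the usual top-level "colorings" list of objects with a "key" field, and it lets you pass in keys that come from node-data JSONs (such as clade_membership) so they are not reported as missing.

```python
import csv
import json
import lzma


def read_metadata_columns(path):
    """Return the header row of a TSV file (optionally .xz-compressed)."""
    opener = lzma.open if str(path).endswith(".xz") else open
    with opener(path, "rt", newline="") as handle:
        return next(csv.reader(handle, delimiter="\t"))


def load_auspice_config(path):
    """Load an Auspice config JSON file into a dict."""
    with open(path, "rt") as handle:
        return json.load(handle)


def missing_coloring_columns(auspice_config, metadata_columns, node_data_keys=()):
    """Return color-by keys found neither in the metadata nor in node data.

    node_data_keys is supplied by the caller; this sketch does not inspect
    the node-data JSON files itself.
    """
    available = set(metadata_columns) | set(node_data_keys)
    keys = [coloring["key"] for coloring in auspice_config.get("colorings", [])]
    return [key for key in keys if key not in available]
```

For example, missing_coloring_columns(load_auspice_config("my_profiles/nyc-local/ny_auspice_config.json"), read_metadata_columns("results/new-york/metadata_adjusted.tsv.xz")) would flag a key like pangolin_lineage if the adjusted metadata only contains pango_lineage.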