For some COVID-19 sequences I’m working with, I have two sets of geographical data: a county and a zip code. I’d prefer to be able to visualize either set of data, rather than picking one over the other to be my Location geographic resolution. Until now, I have handled this by maintaining two metadata files, one with the county in the location field and one with the zip code in the location field, and running the analysis twice with two different nextstrain profiles.
However, this strikes me as a sloppy way to do things, and it also requires me to keep two different sets of metadata up to date. What I’d like to do is add a geographical resolution that is finer than Location. Then I could put the county in the Location field and the zip code in this new field.
I tried adding a new field to my metadata file (‘zip_code’) and modifying my auspice_config.json file like this:
…but when I try to run this, I get an error during the refinement step, when it adjusts the tree for exposure:
Exception: Where there’s SAMPLING_TRAIT we should always have EXPOSURE_TRAIT
I’m a little lost at this point. I know location doesn’t have an exposure field, so I was hoping any new geographic resolution fields would work the same way. I tried creating a second field, zip_code_exposure, and copying the contents of the zip_code field to it, but this had no impact. Do I need to define an exposure column in my config or builds file?
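The column copy described above can be sketched roughly as follows. This is only a minimal illustration in Python, assuming a tab-separated metadata file with a `zip_code` column; the function name `add_exposure_column` and the sample row are hypothetical placeholders, not the actual script or data used here.

```python
import csv
import io


def add_exposure_column(tsv_text, source="zip_code", target="zip_code_exposure"):
    """Return TSV text with a `target` column that duplicates `source`.

    Hypothetical helper for illustration; column names are assumptions.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    fieldnames = list(reader.fieldnames)
    if target not in fieldnames:
        fieldnames.append(target)

    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # Copy the sampling value into the exposure column unchanged.
        row[target] = row.get(source, "")
        writer.writerow(row)
    return out.getvalue()


# Made-up sample row, standing in for a real metadata file.
sample = "strain\tlocation\tzip_code\nsample_1\tExample County\t00000\n"
result = add_exposure_column(sample)
print(result)
```

In practice the same function could be applied to the contents of the real metadata file, with the result written back out before rerunning the pipeline.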
Other potentially relevant information:
- I’m using the multi-file configuration to combine a GISAID dataset with my lab’s dataset. I don’t think the lack of a zip_code or zip_code_exposure field in the GISAID metadata file would cause problems, but you never know.
- I did have transmission lines turned on in my auspice config file, but turning them off and rerunning the pipeline seemed to make no difference.
Thank you for your time.