Adding fifth geographical resolution?

mbosmeny · March 4, 2021, 7:42pm

Hello,

For some covid sequences I’m working with, I have two sets of geographical data: a county and a zip code. I’d prefer to be able to visualize either set of data, rather than picking one over the other to be my Location geographic resolution. Until now, I have solved this problem by keeping two metadata files, one with the county in the location field and one with the zipcode in the location field, and running the analysis twice with two different nextstrain profiles.

However, this strikes me as a sloppy way to do things, and also requires me to keep two different sets of metadata up to date. What I’d like to do is add a geographical resolution that is finer than Location. Then I could put the county in the Location field, and the zipcode in this new field.

I tried just adding a new field to my metadata file (‘zip_code’) and modifying my auspice_config.json file like this:

"geo_resolutions": [
  "zip_code",
  "location",
  "division",
  "country",
  "region"
],

…but when I try to run this, I get an error during the refinement step, when it adjusts the tree for exposure:

Exception: Where there’s SAMPLING_TRAIT we should always have EXPOSURE_TRAIT

I’m a little lost at this point. I know location doesn’t have an exposure field, so I was hoping any new geographic resolution fields would work the same way. I tried creating a second field, zip_code_exposure, and copying the contents of the zip_code field to it, but this had no impact. Do I need to define an exposure column in my config or builds file?

Other potentially relevant information:

I’m using the multi-file configuration to combine a GISAID dataset with my lab’s dataset. I don’t think the lack of a zip_code or zip_code_exposure field in the GISAID metadata file would cause problems, but you never know.
I did have transmission lines turned on in my auspice config file, but turning it to off seemed to have no impact after rerunning the pipeline.
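As a quick sanity check on the setup described above, the following is a minimal sketch (not part of the ncov workflow) that lists which `geo_resolutions` entries have no matching column in a given metadata file. The function name `missing_geo_columns` and the sample metadata rows are hypothetical, and the sketch assumes that entries may be either plain strings or objects with a `key` field.

```python
import csv
import io
import json


def missing_geo_columns(auspice_config_text, metadata_tsv_text):
    """Return geo_resolutions entries that have no matching metadata column."""
    config = json.loads(auspice_config_text)
    resolutions = []
    for entry in config.get("geo_resolutions", []):
        # Assumption: an entry is either a plain string or an object with a "key" field.
        resolutions.append(entry["key"] if isinstance(entry, dict) else entry)
    # Only the header row of the tab-separated metadata is needed.
    reader = csv.reader(io.StringIO(metadata_tsv_text), delimiter="\t")
    header = next(reader, [])
    return [name for name in resolutions if name not in header]


# Hypothetical inputs mirroring the two metadata sources in the post.
config_text = '{"geo_resolutions": ["zip_code", "location", "division", "country", "region"]}'
lab_metadata = "strain\tregion\tcountry\tdivision\tlocation\tzip_code\n"
gisaid_metadata = "strain\tregion\tcountry\tdivision\tlocation\n"

print(missing_geo_columns(config_text, lab_metadata))
print(missing_geo_columns(config_text, gisaid_metadata))
```

Running it on each input file separately shows where a column such as `zip_code` is absent. Whether an absent column actually causes a problem in the combined build is a separate question that this check does not answer.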

Thank you for your time.

rneher · March 6, 2021, 6:11pm

The code and logic for the exposure is a little convoluted. What traits are you reconstructing and what are your settings for exposure?

rneher · March 6, 2021, 6:12pm

In other words, what is your analysis using in these fields:

https://github.com/nextstrain/ncov/blob/master/nextstrain_profiles/nextstrain/builds.yaml#L77


    subsampling_scheme: nextstrain_region
    region: South America
    auspice_config: "nextstrain_profiles/nextstrain/south-america_auspice_config.json"

# remove S dropout sequences and sequences without division label in US
filter:
  exclude_where: "division='USA' purpose_of_sequencing='S dropout'"

# if different exposure traits should be used for some builds, specify here
# otherwise the default exposure in defaults/parameters.yaml will used
exposure:
  global:
    trait: "region"
    exposure: "region_exposure"

  africa:
    trait: "country"
    exposure: "country_exposure"

  asia:
    trait: "country"

or here for the defaults:

https://github.com/nextstrain/ncov/blob/master/defaults/parameters.yaml#L103


  proportion_wide: 0.0

  # Diffusion frequency settings
  minimal_frequency: 0.01
  stiffness: 20
  inertia: 0.2

#
# Region-specific settings
#
traits:
  default:
    sampling_bias_correction: 2.5
    columns: ["country_exposure"]

exposure:
  default:
    trait: "country"
    exposure: "country_exposure"

# Default subsampling schemes designed for different geographic scales. With the

mbosmeny · March 8, 2021, 3:34am

I must apologize; I gave you bad info. After a bit of diagnosing, it turns out that when I rebuilt my builds.yaml to attempt this zipcode build, I left out a line:

skip_travel_history_adjustment: True

That line is in the multiple inputs example profile, and once I added it back in, the build pipeline ran perfectly. My geographical resolution setup worked.
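To catch this kind of omission before a long run, here is a minimal sketch of a check for the flag. It is a plain text scan rather than a YAML parse, the helper name `has_enabled_flag` is hypothetical, and it assumes the key sits unindented at the top level of builds.yaml, as the line quoted above suggests.

```python
import re


def has_enabled_flag(yaml_text, key="skip_travel_history_adjustment"):
    """Return True if `key` appears as an uncommented, unindented line set to a true value."""
    # Assumption: the flag is a top-level key, so the line must start with the key itself.
    pattern = re.compile(
        r"^" + re.escape(key) + r"\s*:\s*(true|yes|on)\s*(#.*)?$",
        re.IGNORECASE,
    )
    return any(pattern.match(line.rstrip()) for line in yaml_text.splitlines())


# Hypothetical builds.yaml fragments: one with the flag, one with it commented out.
builds_ok = "inputs:\n  - name: lab\nskip_travel_history_adjustment: True\n"
builds_missing = "inputs:\n  - name: lab\n# skip_travel_history_adjustment: True\n"

print(has_enabled_flag(builds_ok))
print(has_enabled_flag(builds_missing))
```

A regular expression keeps the sketch free of third-party dependencies. A real check would more likely load the file with a YAML parser and inspect the resulting dictionary.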

Thanks for the quick feedback. I wouldn’t have spotted my error without going back to look at the builds.yaml to answer your question!

james · March 8, 2021, 10:56pm

Nice work! I’ve made an issue suggesting that we add a docs page listing all the available config options, which may make this smoother in the future.
