Only global build found in ./auspice

Hi all, I’m trying to run a few samples through Nextstrain against a USA background from GISAID. I’ve written the builds.yaml and config.yaml files and listed two separate builds, one for Maryland and one for USA. However, after running, only one file called ncov_global.json appears in ./auspice, even though neither my builds nor the profile are named this, and nothing uses global subsampling. Am I missing something obvious?

Hi @aeroder – could you post the snakemake command you’re running, and the content of the builds YAML you’ve created? My guess is that it’s currently running the getting-started config, which only defines a global build, rather than your config which defines a build for Maryland and USA.

Hi James,

Here’s the snakemake command:

snakemake --cores 4 --profile ./my_profiles/nihcov

and here is the builds.yaml file:

builds:


# with a build name that will produce the following URL fragment on Nextstrain/auspice:
# /ncov/north-america/usa/maryland
north-america_usa_maryland:

subsampling_scheme: maryland 
    region: North America
    country: USA
    division: Maryland

# Here, Maryland is in USA, is in North America.

# This build focuses on the entire U.S.
# with a build name that will produce the following URL fragment on Nextstrain/auspice:
# /ncov/north-america/usa
north-america_usa:

    subsampling_scheme: usa
    region: North America
    country: USA
    # Here, USA is in North America


# Define custom subsampling schemes

maryland:
  # Focal samples for country
  country:
    group_by: "division year month"
    max_sequences: 5000
    exclude: "--exclude-where 'country!={country}'"
  # Contextual samples from country's region
  region:
    group_by: "country year month"
    seq_per_group: 1000
    exclude: "--exclude-where 'country={country}' 'region!={region}'"
    priorities:
      type: "proximity"
      focus: "country"
  # Contextual samples from the rest of the world,
  # excluding the current region to avoid resampling.
  global:
    group_by: "country year month"
    seq_per_group: 100
    exclude: "--exclude-where 'region={region}'"
    priorities:
      type: "proximity"
      focus: "country"

usa:
  # Focal samples for country
  country:
    group_by: "year month"
    max_sequences: 20000
    exclude: "--exclude-where 'country!={country}'"
  # Contextual samples from country's region
  # Contextual samples from the rest of the world,
  # excluding the current region to avoid resampling.
  global:
    group_by: "country year month"
    seq_per_group: 100
    exclude: "--exclude-where 'region={region}'"
    priorities:
      type: "proximity"
      focus: "country"

# Here, you can specify what type of auspice_config you want to use
# and what description you want. These will apply to all the above builds.
# If you want to specify specific files for each build - you can!
# See the 'example_advanced_customization' builds.yaml
files:
  auspice_config: "my_profiles/example/my_auspice_config.json"
  description: "my_profiles/example/my_description.md"

I think you’re really close. I only had to change some whitespace formatting (which may just be copy-and-paste errors from this discussion board) and modify the YAML structure: the top-level subsampling dictionary was missing, so the subsampling schemes were being interpreted as part of the builds dictionary.
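For reference, here is a minimal sketch of what that restructuring looks like, using the values from the file posted above. It is an illustration of the two fixes described (consistent indentation, and the schemes nested under a top-level `subsampling` key), not a copy of the file on the branch, so treat the exact layout as an assumption.

```yaml
# Sketch only: same values as the posted file, re-indented, with the
# custom schemes moved out of `builds` and under `subsampling`.
builds:
  # Produces the URL fragment /ncov/north-america/usa/maryland
  north-america_usa_maryland:
    subsampling_scheme: maryland
    region: North America
    country: USA
    division: Maryland
  # Produces the URL fragment /ncov/north-america/usa
  north-america_usa:
    subsampling_scheme: usa
    region: North America
    country: USA

# Custom subsampling schemes get their own top-level key.
subsampling:
  maryland:
    # Focal samples for country
    country:
      group_by: "division year month"
      max_sequences: 5000
      exclude: "--exclude-where 'country!={country}'"
    # Contextual samples from country's region
    region:
      group_by: "country year month"
      seq_per_group: 1000
      exclude: "--exclude-where 'country={country}' 'region!={region}'"
      priorities:
        type: "proximity"
        focus: "country"
    # Contextual samples from the rest of the world
    global:
      group_by: "country year month"
      seq_per_group: 100
      exclude: "--exclude-where 'region={region}'"
      priorities:
        type: "proximity"
        focus: "country"
  usa:
    # Focal samples for country
    country:
      group_by: "year month"
      max_sequences: 20000
      exclude: "--exclude-where 'country!={country}'"
    # Contextual samples from the rest of the world
    global:
      group_by: "country year month"
      seq_per_group: 100
      exclude: "--exclude-where 'region={region}'"
      priorities:
        type: "proximity"
        focus: "country"

files:
  auspice_config: "my_profiles/example/my_auspice_config.json"
  description: "my_profiles/example/my_description.md"
```

With this layout, each build's `subsampling_scheme` value (`maryland`, `usa`) refers to a key under `subsampling` rather than to a sibling entry in `builds`.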

I’ve made a branch in our ncov repo which incorporates your YAML file and produces the two JSONs as desired. The important files are:

And it runs successfully with snakemake --profile ./my_profiles/nihcov.

Note that I’ve modified (a) the subsampling thresholds and (b) the input data paths so that I could have a quick debugging build. You’ll want to change both back!

Let us know how you get on! :crossed_fingers:

I also wonder whether, in your original implementation, your my_profiles/nihcov/config.yaml contained the default settings, which point to my_profiles/example/builds.yaml rather than your custom builds.yaml. This would explain both why you only ended up with a global build and why snakemake didn’t throw a bunch of errors about your slightly misformatted builds.yaml file.
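If that is the cause, the fix would be in the profile's config.yaml. A rough sketch follows, assuming your profile was copied from the example profile and lists its config files under a `configfile` key. The `defaults/parameters.yaml` entry and the location of your builds.yaml inside `my_profiles/nihcov/` are assumptions here, so check them against your own checkout.

```yaml
# my_profiles/nihcov/config.yaml (sketch; layout assumed from the example profile)
configfile:
  # Assumed path to the workflow's default parameters
  - defaults/parameters.yaml
  # Should point at your own builds file, not my_profiles/example/builds.yaml
  - my_profiles/nihcov/builds.yaml
```

If the second entry still reads `my_profiles/example/builds.yaml`, snakemake never loads your custom builds.yaml at all, whatever its contents.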