Loss of country information in final auspice output

Hi,
I just prepared a build focused on one country (n~200), with subsampled global samples excluding the focal country (n~4800, with 5 per 'country year month' group). The pipeline generally ran well. But in the output visualized with auspice, a substantial number of sequences apparently lost their 'country' information. In fact, only sequences from the same region (Asia) as my focal country keep their country information; the others have their region (e.g. Africa, Europe…) as their country.

My subsampling scheme is as follows:

subsampling:
  scheme:
    # Focal samples for country
    country:
      group_by: "year month"
      max_sequences: 1000
      exclude: "--exclude-where 'country!={country}'"

    global:
      group_by: "country year month"
      seq_per_group: 5
      exclude: "--exclude-where 'country!={country}'"
      priorities:
        type: "proximity"
        focus: "country"
Any help would be greatly appreciated! Thanks in advance.

We designed this functionality for our regional builds (e.g. nextstrain/ncov/asia), where we wanted to highlight the countries within the focal region (e.g. Asia). If this is happening for your build, it implies that the snakemake rule adjust_metadata_regions is running. This leads me to believe you have region: Asia set in your build definition (the builds block inside builds.yaml); removing it should mean the rule no longer runs. If this is not the case, could you include the builds block for your run so we can debug further?
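
For illustration, a builds block along the following lines would be expected to trigger the behaviour described above. This is only a sketch: the build name and the country value are hypothetical placeholders, and the key names assume the usual layout of an ncov builds.yaml, so your own file may differ.

```yaml
builds:
  my-country-build:            # hypothetical build name
    subsampling_scheme: scheme # the scheme defined under "subsampling"
    region: Asia               # presence of this key is what appears to enable adjust_metadata_regions
    country: ExampleCountry    # hypothetical placeholder for the focal country
```

Deleting the region line (and keeping the rest) is the change suggested above; the country key is still needed because the subsampling scheme refers to {country}.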

Yes, you are right. After removing region: Asia, the problem was resolved.

Thank you so much for your reply.
