Download and rename all builds from nextstrain.org/groups

Hi,

I want to rename old nextstrain builds on nextstrain.org/groups/niph to make it easier to navigate between different agents and builds inside the auspice tree viewer menu.

Over time, however, we have used a mix of naming styles for the build files, and this seems to cause problems when downloading the datasets with nextstrain remote download. (My plan was to download everything, rename the files locally, and then upload again.) For example, this dataset: https://nextstrain.org/groups/niph/2022.04.29-ncov/omicron-BA-two is saved as 2022.json and 2022.04.json when downloaded.

Is there a way to package all the uploaded datasets into, for example, a tarball and download that? Or is it possible to rename files remotely? Or is there some other way I can rename these files?

If I understand correctly, underscores in the filenames are used as a separator, so that the above mentioned build file looks like this in the auspice menu:

What I would like instead is for the drop-down menu under “niph” to show the different agents (e.g., “ncov”, “rsv”, “flu”), then a subgroup (e.g., “omicron”), and then the date. What would be the best naming style to achieve this?

Thanks!

Jon

Oh, that’s a bug for sure! I can reproduce it:

$ nextstrain remote download https://nextstrain.org/groups/niph/2022.04.29-ncov/omicron-BA-two
Downloading https://nextstrain.org/groups/niph/2022.04.29-ncov/omicron-BA-two as 2022.04.json
Downloading https://nextstrain.org/groups/niph/2022.04.29-ncov/omicron-BA-two (root-sequence) as 2022.json
Downloading https://nextstrain.org/groups/niph/2022.04.29-ncov/omicron-BA-two (tip-frequencies) as 2022.json

but it should download as three separate files:

2022.04.29-ncov_omicron-BA-two.json
2022.04.29-ncov_omicron-BA-two_root-sequence.json
2022.04.29-ncov_omicron-BA-two_tip-frequencies.json
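
Roughly, the naming rule behind those three names is: take the dataset path after the group name, turn slashes into underscores, and append a suffix for sidecar files. A minimal shell sketch of that rule follows; local_name is a hypothetical helper written for illustration, not something Nextstrain CLI provides.

```shell
# Hypothetical helper: derive the expected local file name from a dataset
# path (the part of the URL after /groups/niph/) and an optional sidecar type.
local_name() {
    path=$1
    sidecar=${2:-}
    # Slashes in the dataset path become underscores in the file name.
    name=$(printf '%s' "$path" | tr '/' '_')
    # Sidecar files carry their type as a trailing suffix.
    if [ -n "$sidecar" ]; then
        name="${name}_${sidecar}"
    fi
    printf '%s.json\n' "$name"
}

local_name 2022.04.29-ncov/omicron-BA-two                # 2022.04.29-ncov_omicron-BA-two.json
local_name 2022.04.29-ncov/omicron-BA-two root-sequence  # 2022.04.29-ncov_omicron-BA-two_root-sequence.json
```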

I’ll look into fixing that bug. Sorry you’re running into it.

In the meantime, you could manually download the files to names of your choosing by interacting with the nextstrain.org API directly. To download everything, for example:

mkdir niph-group
cd niph-group

  nextstrain remote ls https://nextstrain.org/groups/niph \
| parallel -v -j1 "curl -fsSL --compressed {1} -H'Accept: {2}' -o {=1 s,^https://nextstrain.org/groups/niph/,,; s,/,_,g=}{3}" \
    :::: /dev/stdin \
    ::: application/vnd.nextstrain.dataset.{main,root-sequence,tip-frequencies,measurements}+json \
    :::+ {,_root-sequence,_tip-frequencies,_measurements}.json

You can then adjust local file names as you see fit (see also my comment below) and then re-upload to your group:

nextstrain remote upload --dry-run https://nextstrain.org/groups/niph *.json

Remove the --dry-run flag to actually upload after you confirm the names/URLs are as you desire.

If what you want is to move everything and not keep the old URLs, you probably want to remove those first with nextstrain remote rm (probably with its --recursive option).

The dropdowns will be built by splitting on slashes in the URL (i.e. underscores in the local file names above), so you’d use local file names like:

ncov_omicron_YYYY-MM-DD.json
ncov_delta_YYYY-MM-DD.json
rsv_a_YYYY-MM-DD.json
rsv_b_YYYY-MM-DD.json
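
If many of the downloaded files look like the example from your post (YYYY.MM.DD-agent_subgroup, optionally followed by a sidecar suffix), a bulk rename could be sketched as below. This is only a sketch under that assumption: new_name is a hypothetical helper, and since you mention a mix of naming styles, any file that doesn't match the pattern is left unchanged and would need handling by hand.

```shell
# Hypothetical helper: map an old-style name such as
#   2022.04.29-ncov_omicron-BA-two_root-sequence.json
# to an agent_subgroup_date layout, keeping any sidecar suffix last:
#   ncov_omicron-BA-two_2022-04-29_root-sequence.json
new_name() {
    old=$1
    base=${old%.json}
    suffix=
    # Split off a trailing sidecar suffix, if present.
    for s in _root-sequence _tip-frequencies _measurements; do
        case $base in
            *"$s") suffix=$s; base=${base%"$s"} ;;
        esac
    done
    # Assumed old pattern: YYYY.MM.DD-agent_subgroup. Anything else is
    # returned unchanged.
    case $base in
        [0-9][0-9][0-9][0-9].[0-9][0-9].[0-9][0-9]-*_*) ;;
        *) printf '%s\n' "$old"; return ;;
    esac
    day=$(printf '%s' "${base%%-*}" | tr '.' '-')   # 2022.04.29 -> 2022-04-29
    rest=${base#*-}                                 # agent_subgroup
    printf '%s_%s%s.json\n' "$rest" "$day" "$suffix"
}

# Preview the renames; remove "echo" to actually move the files.
for f in *.json; do
    [ -e "$f" ] || continue
    n=$(new_name "$f")
    [ "$f" = "$n" ] || echo mv -n -- "$f" "$n"
done
```

Running the loop with echo in place first lets you check the new names before anything is moved, in the same spirit as the --dry-run upload above.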

Note that currently for Groups datasets, the dropdown selector mechanism only really supports changing the final selector, e.g. changing the date in your case. Changing one of the earlier dropdown selectors will result in a bad URL, e.g. starting from ncov/omicron/2024-07-17, changing the second selector from omicron → delta results in loading ncov/delta (which will likely not exist) instead of ncov/delta/2024-07-17 as you might expect. We do intend to fix this eventually.
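
To make that limitation concrete, here is a toy model of it in shell. It only mirrors the behavior described above and is not how Auspice is implemented; change_selector is a hypothetical function, and the second date is made up for the example.

```shell
# Toy model of the current dropdown behavior for Groups datasets: changing
# selector number $2 (1-based) in dataset path $1 to value $3 keeps the
# selectors before it and drops everything after it.
change_selector() {
    printf '%s\n' "$1" | awk -F/ -v i="$2" -v v="$3" '{
        out = ""
        for (k = 1; k < i; k++) out = out $k "/"
        print out v
    }'
}

change_selector ncov/omicron/2024-07-17 3 2024-08-01   # ncov/omicron/2024-08-01 (works)
change_selector ncov/omicron/2024-07-17 2 delta        # ncov/delta (date is lost)
```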

FWIW, I’ve opened an issue against Nextstrain CLI.

Thanks @trs! This is super helpful!
The possibility to change between different levels of dropdown selectors would be nice, but I see that this can also quickly get messy if one uploads long filenames with lots of underscores…

By the way, I updated Nextstrain CLI to version 8.5.1 and the download works nicely now. Thanks a lot! :)

Ha, excellent! I kicked off the 8.5.1 release before signing off for the night and came back this morning to let you know, but you beat me to it. :) Glad to hear it works for you.
