No release for SARS-CoV-2 May update of nextclade_data?

I’ve been watching the releases page for the nextclade_data repo:

So I missed that there was an update on 2022-05-10:

Going forward, should I be watching the CHANGELOG file, or perhaps this was an oversight?

Hi Mike @mike_honey!

Yes, it was an omission. It was a rare time when I released the datasets and not Cornelius. It unfortunately is not automated and I forgot to make an entry on GitHub Releases.

I created the release just now.

A couple of things which can help you to keep the data updated:

  • If you are using Nextclade Web and have a stale tab opened by the time a dataset is released, you should receive a notification in the browser (it’s a recent addition). Also, the dataset description contains the date when it was released and a link to the changelog.

  • If you are using Nextclade CLI, we recommend downloading the dataset(s) fresh once in a while, let’s say daily. This way they are always up to date.
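For the CLI case, a daily refresh could look roughly like the sketch below. It assumes your Nextclade version provides the `nextclade dataset get` subcommand with `--name` and `--output-dir`. The function name `refresh_dataset`, the `NEXTCLADE_BIN` override and the directory layout are hypothetical, chosen only for illustration.

```shell
# Sketch: re-download a Nextclade dataset so the local copy never goes stale.
# Assumes `nextclade dataset get --name ... --output-dir ...` exists in your version.
set -eu

# Hypothetical override so a different binary can be substituted.
NEXTCLADE_BIN="${NEXTCLADE_BIN:-nextclade}"

refresh_dataset() {
  # $1: dataset name (e.g. sars-cov-2), $2: target directory
  rd_tmp="$2.tmp"
  rm -rf "$rd_tmp"
  # Download into a temporary directory first, so a failed download
  # does not destroy the copy that is currently in use.
  "$NEXTCLADE_BIN" dataset get --name "$1" --output-dir "$rd_tmp"
  rm -rf "$2"
  mv "$rd_tmp" "$2"
}
```

Calling `refresh_dataset sars-cov-2 data/sars-cov-2` from a daily scheduled job (cron or similar) would be one way to follow the recommendation above.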

We should probably also automate creation of GitHub Releases in the data repo (Automate creation of GitHub releases · Issue #77 · nextstrain/nextclade_data · GitHub).

Hi Ivan,

No worries, thanks for confirming, and for those tips.

Is this month’s release far away? I’m thinking I might wait for that one, if it is on a similar schedule.

Thanks
Mike

@mike_honey I am not aware of any particular schedule. I think Cornelius just releases it when there are new clades or lineages.

My bad, sorry, I forgot to make the gh release.

I’m working on a new release, should happen this week.

By the way, if you want to be on the edge, you can always use this reference tree which is built daily: auspice

It’s less carefully checked for correct topology, and sometimes some mutations are missing in some sequences, but it contains the very latest designations.

You can download the tree JSON via e.g. https://nextstrain.org/charon/getDataset?prefix=staging/nextclade/sars-cov-2/21L, save it as nightly_tree.json, and then pass the JSON to Nextclade CLI:

nextclade run -d sars-cov-2 --input-tree nightly_tree.json --output-tsv output.tsv input_sequences.fasta 

This way you overwrite the latest release dataset tree with the nightly build.
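The two steps above (download, then run) can be combined in a small script. This is only a sketch: the URL and the `nextclade run` arguments are the ones quoted in this post, while the function name `run_with_nightly_tree` and the use of `curl` for the download are assumptions.

```shell
# Sketch: fetch the nightly reference tree and run Nextclade against it.
set -eu

# URL as given in the post above.
TREE_URL="https://nextstrain.org/charon/getDataset?prefix=staging/nextclade/sars-cov-2/21L"

run_with_nightly_tree() {
  # $1: input FASTA, $2: output TSV, $3: where to save the tree JSON
  # -f makes curl fail on HTTP errors instead of saving an error page.
  curl -fsSL "$TREE_URL" -o "$3"
  # Same invocation as above: the released dataset, with its tree replaced.
  nextclade run -d sars-cov-2 --input-tree "$3" --output-tsv "$2" "$1"
}
```

For example: `run_with_nightly_tree input_sequences.fasta output.tsv nightly_tree.json`.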

It looks like the same scenario has occurred this month - CHANGELOG.md updated without a release?

@mike_honey not quite :slight_smile:

I prepared a dataset and published it to master, but hadn’t yet released it to the release branch. So clades.nextstrain.org was still using the old dataset, and only master.clades.nextstrain.org showed the new one. (This is a common development practice: master contains all changes and can be updated daily or more often, while releases are done periodically.)

I’ve just released the dataset (put it on the release branch), and this time I didn’t forget to also create a GitHub release, which is really nothing but a separate announcement. The GitHub release has no implications other than showing up on GitHub. It doesn’t affect which datasets Nextclade uses.

I hope that makes sense.

Sure - thanks for clarifying. For my purposes (Nextclade dataset for CLI) it looks like I should watch CHANGELOG.md.
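One way to watch CHANGELOG.md without checking it by hand is to compare a checksum of a freshly fetched copy against the one seen last time. The sketch below is an assumption-laden illustration, not anything provided by Nextclade: the function name `changelog_changed` and the state file are hypothetical, and how the local copy of CHANGELOG.md is obtained is left open.

```shell
# Sketch: detect whether a local copy of CHANGELOG.md differs from the last one seen.
set -eu

changelog_changed() {
  # $1: local copy of CHANGELOG.md, $2: state file holding the last checksum
  # Returns 0 and records the new checksum if the content changed, 1 otherwise.
  cc_new="$(cksum < "$1")"
  cc_old=""
  if [ -f "$2" ]; then
    cc_old="$(cat "$2")"
  fi
  if [ "$cc_new" != "$cc_old" ]; then
    printf '%s\n' "$cc_new" > "$2"
    return 0
  fi
  return 1
}
```

Used as `if changelog_changed CHANGELOG.md .changelog.cksum; then ...; fi`, the body of the `if` is where a dataset re-download or a notification would go. The very first run always reports a change, because no checksum has been recorded yet.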