First, I want to say how much I love Nextstrain and Nextclade. They have made my life so much easier as a government public health worker, so thank you so much for all that you do.
As avian influenza progresses, I was wondering if it's possible to have a Nextclade view of a concatenated Influenza A(H5N1) genome, like how it is with SARS-CoV-2. That is, the full influenza genome pasted together as it is in UShER and the Nextstrain WGS tree (ordered like so: PB2,PB1,PA,HA,NP,NA,MP,NS). Then in Nextclade you could see the whole genome (glued together) and also each individual segment (like how you can see the SARS-CoV-2 S gene and others separately). I think this would be great for the B3.13 and D1.1 genotypes. The B3.13 reference currently used is A/cattle/Texas/24-008749-002/2024.
We could try to make the CLI version of this but I think this view will be helpful as the virus continues to spread nationwide. Would appreciate thoughts and how to get this started.
Another lower-priority feature that would be nice is the ability to download from GISAID and have the segments automatically concatenated in the order of interest. (I have the code for this already, but it might not be accessible to those who are less bioinformatics-savvy; this is also probably a GISAID problem.)
Thank you so much again and appreciate all that you do!
As a software engineer working on parts of the Nextclade software, I am unaware of any current work being done to add support for concatenated flu genomes in Nextclade. But this could also mean that all the fame and glory could be yours!
If you haven’t yet, take a look at how to create and publish a new Nextclade dataset. A Nextclade dataset is what describes and configures a virus, and it’s nothing more than a directory with some files. You can find dataset author guides as well as all existing official and community datasets here: https://github.com/nextstrain/nextclade_data/blob/master/README.md
If you decide to give it a try, then once you have a working prototype, don’t hesitate to share your work, for example by submitting a pull request to the nextstrain/nextclade_data repo. Sharing encourages discussion and improvements. No guarantee, but this way it has a chance to be listed along with other community datasets. In fact, the existing H5N1 datasets you are using are contributed by our friends from MonclaLab!
You can also host your custom datasets yourself and create special Nextclade links for your friends and colleagues to use your datasets.
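As a rough illustration of such a link, here is a minimal Python sketch that builds one. It assumes Nextclade Web is at https://clades.nextstrain.org and that it accepts a `dataset-url` query parameter pointing at a self-hosted dataset; please check the parameter name against the Nextclade documentation before relying on it. The example dataset address is a placeholder, not a real dataset.

```python
from urllib.parse import urlencode

# Assumption: public Nextclade Web address.
NEXTCLADE_WEB = "https://clades.nextstrain.org"


def nextclade_link(dataset_url, param="dataset-url"):
    """Build a Nextclade Web link that points at a self-hosted dataset.

    `param` is the query parameter name; "dataset-url" is an assumption
    to be verified against the Nextclade docs.
    """
    # urlencode percent-encodes the dataset address so it survives as a query value.
    return f"{NEXTCLADE_WEB}/?{urlencode({param: dataset_url})}"


if __name__ == "__main__":
    # Placeholder address for illustration only.
    print(nextclade_link("https://example.org/my-dataset"))
```

The parameter name is kept as an argument so the same helper can be reused if a different URL parameter turns out to be the right one for your setup.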
For a more practical scientific discussion regarding concatenated flu, let’s see if our scientists have something to say here. To amplify, feel free to also submit an issue to the nextstrain/nextclade_data repo, requesting new dataset(s).
Regarding the concatenation of GISAID data, it’s probably too specific to GISAID, to the particular virus, and to the particular way the virus is configured for this functionality to be in Nextclade. However, it might be an idea for a nice side project, especially if the dataset business takes off!
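To make the idea concrete, a side project like that could start from something as small as the Python sketch below. It is not the original poster's code and not part of Nextclade. It assumes each FASTA header has the hypothetical form `strain|segment` (real GISAID headers depend on the download settings and would need their own parsing), and it uses the segment order mentioned above.

```python
from collections import defaultdict

# Segment order from the original post.
SEGMENT_ORDER = ["PB2", "PB1", "PA", "HA", "NP", "NA", "MP", "NS"]


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def concatenate_segments(records, order=SEGMENT_ORDER):
    """Glue segments per strain in the given order.

    Assumes headers look like "strain|segment" (a hypothetical layout).
    Strains missing any segment are skipped rather than padded.
    """
    by_strain = defaultdict(dict)
    for header, seq in records:
        strain, _, segment = header.rpartition("|")
        by_strain[strain][segment.upper()] = seq
    return {
        strain: "".join(segments[name] for name in order)
        for strain, segments in by_strain.items()
        if all(name in segments for name in order)
    }
```

Skipping incomplete strains is a deliberate simplification; depending on the use case, one might instead pad missing segments with `N` so that coordinates stay aligned with the concatenated reference.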
Feel free to ping my nickname here or on GitHub if you need technical help with Nextclade and/or datasets.
And this build can be used as a tree in Nextclade even though we don’t have an officially maintained “cattle-flu” dataset.
This link will load the tree and the reference sequence into Nextclade and you can analyze your own data. These could be individual segments, or concatenated genomes (in the order you use).
Given the persistence of the cattle outbreak, it might be about time for a dedicated Nextclade dataset for just this genotype.
Just FYI: any Nextstrain tree can now be used as a makeshift Nextclade dataset. At the bottom of the page, there is a button "View in other platforms" that includes a link to Nextclade with the URL parameters set appropriately.
Let us know if you have any questions, or if you feel like making your own dataset! Happy to help.
I may not have fully followed what was previously posted, but is there currently a way to filter by the B3.13 or D1.1 genotypes in, say, this build: auspice
Based on the previous post, the B3.13 genotype has a dedicated build, but could this or a separate build be expanded to include the D1.1 genotype as well?