The "division" column in the dataset

We are trying to do some data analysis on the Nextstrain dataset and would like to confirm our understanding of the “division” column:

  1. Does it mean that the strain/lineage for this case was found in that particular state?
    Or does it only mean
  2. that the strain/genome was studied in that state, while the case itself could come from any other state of India? For example, the case might be from Delhi while the sample was studied at ICMR, Jammu & Kashmir.

Please let us know about the same, so that we can have a better understanding of the data.

Hi @Mushahid! Thanks for reaching out. Unfortunately, we have to take what is submitted as ‘division’ from GISAID at face value, so we don’t know how different submitters might be filling in this field. We believe most scientists uploading data treat it as in your case 1: the division is where the sample was isolated or where the person lives. From discussions with some submitters (though not in India, I’m afraid), and from cases we have been able to correlate with media reports, this seems to be how most people use it.

However, it’s also very possible that some submitters use it as in your case 2: the location where the sample was processed.

Probably the best way to find a reliable answer for this would be to contact the submitters for the sequences you’re interested in - they should be able to let you know!
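
As a starting point for working out whom to contact, here is a minimal Python sketch that lists which labs submitted sequences for each division. It assumes the metadata is a tab-separated file with `division` and `submitting_lab` columns; the sample rows, lab names, and the helper `labs_by_division` are hypothetical and only there for illustration, so please check the column names against your own download.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows. The column names are an assumption about the
# metadata TSV and should be checked against the actual file.
SAMPLE_TSV = (
    "strain\tdivision\tsubmitting_lab\n"
    "India/A1/2020\tDelhi\tLab X\n"
    "India/A2/2020\tDelhi\tLab Y\n"
    "India/B1/2020\tJammu and Kashmir\tLab X\n"
)


def labs_by_division(tsv_file):
    """Map each division to the set of labs that submitted sequences for it."""
    labs = defaultdict(set)
    for row in csv.DictReader(tsv_file, delimiter="\t"):
        labs[row["division"]].add(row["submitting_lab"])
    return dict(labs)


# Replace io.StringIO(SAMPLE_TSV) with open("metadata.tsv", newline="")
# to run this on a real file.
result = labs_by_division(io.StringIO(SAMPLE_TSV))
for division, lab_names in sorted(result.items()):
    print(division, "->", sorted(lab_names))
```

If the same lab shows up under several divisions, that may hint that it records where the sample came from rather than where it was processed, but only the submitters can confirm this.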