ERROR: Problem reading in data/example_sequences.fasta: Duplicate key '2019-nCoV'

emtf · February 1, 2021, 10:24am

I was trying my own profiles and got the following error message:
"""
ERROR: Problem reading in data/example_sequences.fasta:
Duplicate key '2019-nCoV'
"""
I know that there are duplicate keys in the input sequences file. Is there any way to locate them in Nextstrain? Otherwise, I will use Python instead. The full log follows.
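For the Python route, here is a minimal sketch of what I have in mind. It assumes the "key" in the error is the first whitespace-delimited token on each `>` header line, which is my guess at how the sequence names are read, not something I have confirmed. The function name `find_duplicate_fasta_names` is made up for illustration.

```python
from collections import defaultdict


def find_duplicate_fasta_names(lines):
    """Return {name: [header line numbers]} for names seen more than once.

    Assumption: a sequence name is the first whitespace-delimited token
    after '>' on a header line.
    """
    seen = defaultdict(list)
    for line_number, line in enumerate(lines, start=1):
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else ""
            seen[name].append(line_number)
    # Keep only the names that occur on more than one header line.
    return {name: nums for name, nums in seen.items() if len(nums) > 1}


if __name__ == "__main__":
    import sys

    # Usage: python find_duplicates.py data/example_sequences.fasta
    for path in sys.argv[1:]:
        with open(path) as handle:
            for name, nums in find_duplicate_fasta_names(handle).items():
                print(f"{path}: '{name}' appears on header lines {nums}")
```

The line numbers it reports should make it easy to open the FASTA file and rename or remove the repeated records before rerunning the workflow.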

[Mon Feb 1 18:16:10 2021]
Job 61: Constructing colors file

        python3 scripts/assign-colors.py             --ordering defaults/color_ordering.tsv             --color-schemes defaults/color_schemes.tsv             --output results/global/colors.tsv             --metadata data/example_metadata.tsv 2>&1 | tee logs/colors_global.txt
        

[Mon Feb  1 18:16:10 2021]
Job 69: Use metadata on submission date to construct submission recency field


        python3 scripts/construct-recency-from-submission-date.py             --metadata data/example_metadata.tsv             --output results/global/recency.json 2>&1 | tee logs/recency_global.txt
        

[Mon Feb  1 18:16:10 2021]
Job 33: 
        Adjusting metadata for build 'north-america_usa_washington'
        


        python3 scripts/adjust_regional_meta.py             --region 'North America'             --metadata data/example_metadata.tsv             --output results/north-america_usa_washington/metadata_adjusted.tsv 2>&1 | tee logs/adjust_metadata_regions_north-america_usa_washington.txt
        

[Mon Feb  1 18:16:10 2021]
Job 118: 
        Pre-filtering sequences for minimal length (before aligning)
        


        augur filter             --sequences data/example_sequences.fasta             --metadata data/example_metadata.tsv             --min-length 27000             --output results/prefiltered.fasta 2>&1 | tee logs/prefiltered.txt
        
ERROR: Problem reading in data/example_sequences.fasta:
Duplicate key '2019-nCoV'
[Mon Feb  1 18:16:11 2021]
Error in rule prefilter:
    jobid: 118
    output: results/prefiltered.fasta
    log: logs/prefiltered.txt (check log file(s) for error message)
    shell:
        
        augur filter             --sequences data/example_sequences.fasta             --metadata data/example_metadata.tsv             --min-length 27000             --output results/prefiltered.fasta 2>&1 | tee logs/prefiltered.txt
        
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Logfile logs/prefiltered.txt:
ERROR: Problem reading in data/example_sequences.fasta:
Duplicate key '2019-nCoV'
