I was trying my own profiles and got the following error message:
"""
ERROR: Problem reading in data/example_sequences.fasta:
Duplicate key '2019-nCoV'
"""
I know that there are duplicate keys in the input sequences file. Is there any way to locate them with Nextstrain? Otherwise, I will write a Python script to find them instead. The full log is below.
[Mon Feb 1 18:16:10 2021]
Job 61: Constructing colors file

python3 scripts/assign-colors.py --ordering defaults/color_ordering.tsv --color-schemes defaults/color_schemes.tsv --output results/global/colors.tsv --metadata data/example_metadata.tsv 2>&1 | tee logs/colors_global.txt

[Mon Feb 1 18:16:10 2021]
Job 69: Use metadata on submission date to construct submission recency field

python3 scripts/construct-recency-from-submission-date.py --metadata data/example_metadata.tsv --output results/global/recency.json 2>&1 | tee logs/recency_global.txt

[Mon Feb 1 18:16:10 2021]
Job 33: Adjusting metadata for build 'north-america_usa_washington'

python3 scripts/adjust_regional_meta.py --region 'North America' --metadata data/example_metadata.tsv --output results/north-america_usa_washington/metadata_adjusted.tsv 2>&1 | tee logs/adjust_metadata_regions_north-america_usa_washington.txt

[Mon Feb 1 18:16:10 2021]
Job 118: Pre-filtering sequences for minimal length (before aligning)

augur filter --sequences data/example_sequences.fasta --metadata data/example_metadata.tsv --min-length 27000 --output results/prefiltered.fasta 2>&1 | tee logs/prefiltered.txt

ERROR: Problem reading in data/example_sequences.fasta: Duplicate key '2019-nCoV'

[Mon Feb 1 18:16:11 2021]
Error in rule prefilter:
    jobid: 118
    output: results/prefiltered.fasta
    log: logs/prefiltered.txt (check log file(s) for error message)
    shell:
        augur filter --sequences data/example_sequences.fasta --metadata data/example_metadata.tsv --min-length 27000 --output results/prefiltered.fasta 2>&1 | tee logs/prefiltered.txt
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Logfile logs/prefiltered.txt:
ERROR: Problem reading in data/example_sequences.fasta: Duplicate key '2019-nCoV'
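
For reference, here is a rough sketch of the Python fallback I have in mind. It is not part of Nextstrain or augur; the function name find_duplicate_ids is my own. It assumes the "key" in the error message is the first whitespace-delimited token of each FASTA header line (the part right after ">"), which I have not confirmed against the augur source. It reports every ID that appears more than once, together with the line numbers of the headers.

```python
import os
import sys
from collections import defaultdict


def find_duplicate_ids(fasta_lines):
    """Return {record_id: [1-based header line numbers]} for IDs seen more than once.

    Assumption: the record ID is the first whitespace-delimited token
    after '>' on each header line.
    """
    seen = defaultdict(list)
    for line_number, line in enumerate(fasta_lines, start=1):
        if line.startswith(">"):
            fields = line[1:].split()
            record_id = fields[0] if fields else ""
            seen[record_id].append(line_number)
    # Keep only the IDs that occur on more than one header line.
    return {rid: nums for rid, nums in seen.items() if len(nums) > 1}


if __name__ == "__main__":
    # Defaults to the file named in the error message above.
    path = sys.argv[1] if len(sys.argv) > 1 else "data/example_sequences.fasta"
    if os.path.isfile(path):
        with open(path) as handle:
            for rid, nums in find_duplicate_ids(handle).items():
                print(f"{rid}: header lines {nums}")
    else:
        print(f"File not found: {path}", file=sys.stderr)
```

Saved under any name and run with the FASTA path as the only argument, this should print one line per duplicated ID, which makes it easy to jump to the offending records in an editor.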