Hello everyone,
I’m running a Nextstrain job on local data. The metadata.tsv, sequences.fasta, and builds.yaml files were constructed based on the ncov tutorial.
First, I ran the pipeline using this command:
$ nextstrain build . --configfile builds.yaml --cores 4 -p
I got this error:
Error in rule sanitize_metadata:
Finished job 11.
jobid: 18
4 of 39 steps (10%) done
output: results/sanitized_metadata_refrences.tsv.xz
log: logs/sanitize_metadata_refrences.txt (check log file(s) for error message)
shell:
python3 scripts/sanitize_metadata.py --metadata data/references_metadata.tsv --metadata-id-columns strain name 'Virus name' --database-id-columns 'Accession ID' gisaid_epi_isl genbank_accession --parse-location-field Location --rename-fields 'Virus name=strain' Type=type 'Accession ID=gisaid_epi_isl' 'Collection date=date' 'Additional location information=additional_location_information' 'Sequence length=length' Host=host 'Patient age=patient_age' Gender=sex Clade=GISAID_clade 'Pango lineage=pango_lineage' pangolin_lineage=pango_lineage Lineage=pango_lineage 'Pangolin version=pangolin_version' Variant=variant 'AA Substitutions=aa_substitutions' aaSubstitutions=aa_substitutions 'Submission date=date_submitted' 'Is reference?=is_reference' 'Is complete?=is_complete' 'Is high coverage?=is_high_coverage' 'Is low coverage?=is_low_coverage' N-Content=n_content GC-Content=gc_content --strip-prefixes hCoV-19/ SARS-CoV-2/ --output results/sanitized_metadata_refrences.tsv.xz 2>&1 | tee logs/sanitize_metadata_refrences.txt
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
[Sun Oct 31 13:11:09 2021]
Error in rule sanitize_metadata:
jobid: 13
output: results/sanitized_metadata.tsv.xz
log: logs/sanitize_metadata.txt (check log file(s) for error message)
shell:
python3 scripts/sanitize_metadata.py --metadata data/metadata.tsv --metadata-id-columns strain name 'Virus name' --database-id-columns 'Accession ID' gisaid_epi_isl genbank_accession --parse-location-field Location --rename-fields 'Virus name=strain' Type=type 'Accession ID=gisaid_epi_isl' 'Collection date=date' 'Additional location information=additional_location_information' 'Sequence length=length' Host=host 'Patient age=patient_age' Gender=sex Clade=GISAID_clade 'Pango lineage=pango_lineage' pangolin_lineage=pango_lineage Lineage=pango_lineage 'Pangolin version=pangolin_version' Variant=variant 'AA Substitutions=aa_substitutions' aaSubstitutions=aa_substitutions 'Submission date=date_submitted' 'Is reference?=is_reference' 'Is complete?=is_complete' 'Is high coverage?=is_high_coverage' 'Is low coverage?=is_low_coverage' N-Content=n_content GC-Content=gc_content --strip-prefixes hCoV-19/ SARS-CoV-2/ --output results/sanitized_metadata.tsv.xz 2>&1 | tee logs/sanitize_metadata.txt
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
[Sun Oct 31 13:11:09 2021]
Finished job 29.
5 of 39 steps (13%) done
Traceback (most recent call last):
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/io.py", line 653, in touch
lutime(self.file, times)
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/io.py", line 67, in lutime
os.utime(f, times, follow_symlinks=False)
FileNotFoundError: [Errno 2] No such file or directory
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/io.py", line 667, in touch_or_create
self.touch()
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/io.py", line 656, in touch
raise MissingOutputException(
snakemake.exceptions.MissingOutputException: Job Output file logs/sanitize_metadata_refrences.txt of rule sanitize_metadata shall be touched but does not exist. completed successfully, but some output files are missing.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/__init__.py", line 699, in snakemake
success = workflow.execute(
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/workflow.py", line 1069, in execute
success = self.scheduler.schedule()
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/scheduler.py", line 441, in schedule
self._error_jobs()
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/scheduler.py", line 557, in _error_jobs
self._handle_error(job)
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/scheduler.py", line 614, in _handle_error
self.get_executor(job).handle_job_error(job)
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 611, in handle_job_error
super().handle_job_error(job)
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 277, in handle_job_error
job.postprocess(
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/jobs.py", line 1009, in postprocess
self.dag.handle_log(self)
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/dag.py", line 638, in handle_log
f.touch_or_create()
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/io.py", line 679, in touch_or_create
with open(file, "w") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'logs/sanitize_metadata_refrences.txt'
I think there is an issue with the sanitize_metadata rule and with generating its log file.
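One way to narrow this down is to check whether the paths named in the failing shell command actually exist relative to the directory the workflow is run from. The following is only a minimal diagnostic sketch, not part of the ncov workflow: the function name check_workflow_paths is hypothetical, and the paths are the ones that appear in the commands and log messages above. It assumes the script is run from the same working directory as the build.

```python
from pathlib import Path


def check_workflow_paths(workdir, inputs, directories):
    """Return a list of problems found under workdir (hypothetical helper).

    inputs: relative paths that should be existing files.
    directories: relative paths that should be existing directories.
    """
    root = Path(workdir)
    problems = []
    for rel in inputs:
        if not (root / rel).is_file():
            problems.append(f"missing input file: {rel}")
    for rel in directories:
        if not (root / rel).is_dir():
            problems.append(f"missing directory: {rel}")
    return problems


if __name__ == "__main__":
    # Paths copied from the failing sanitize_metadata commands above.
    issues = check_workflow_paths(
        ".",
        inputs=[
            "data/metadata.tsv",
            "data/references_metadata.tsv",
            "scripts/sanitize_metadata.py",
        ],
        directories=["logs", "results"],
    )
    print("\n".join(issues) if issues else "all paths present")
```

If this reports a missing input file or a missing logs directory, that would be consistent with the FileNotFoundError in the traceback, though it does not by itself prove that this is the cause.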
Second, I tried this command:
$ snakemake --profile . -p
I got this error:
IncompleteFilesException:
The files below seem to be incomplete. If you are sure that certain files are not incomplete, mark them as complete with
snakemake --cleanup-metadata <filenames>
To re-generate the files rerun your command with the --rerun-incomplete flag.
Incomplete files:
results/sanitized_metadata.tsv.xz
results/sanitized_metadata_refrences.tsv.xz
Then I ran:
$ snakemake --profile . -p --rerun-incomplete
and got this error:
Error in rule sanitize_metadata:
jobid: 13
output: results/sanitized_metadata.tsv.xz
[Sun Oct 31 13:29:09 2021]
log: logs/sanitize_metadata.txt (check log file(s) for error message)
Error in rule sanitize_metadata:
shell:
python3 scripts/sanitize_metadata.py --metadata data/metadata.tsv --metadata-id-columns strain name 'Virus name' --database-id-columns 'Accession ID' gisaid_epi_isl genbank_accession --parse-location-field Location --rename-fields 'Virus name=strain' Type=type 'Accession ID=gisaid_epi_isl' 'Collection date=date' 'Additional location information=additional_location_information' 'Sequence length=length' Host=host 'Patient age=patient_age' Gender=sex Clade=GISAID_clade 'Pango lineage=pango_lineage' pangolin_lineage=pango_lineage Lineage=pango_lineage 'Pangolin version=pangolin_version' Variant=variant 'AA Substitutions=aa_substitutions' aaSubstitutions=aa_substitutions 'Submission date=date_submitted' 'Is reference?=is_reference' 'Is complete?=is_complete' 'Is high coverage?=is_high_coverage' 'Is low coverage?=is_low_coverage' N-Content=n_content GC-Content=gc_content --strip-prefixes hCoV-19/ SARS-CoV-2/ --output results/sanitized_metadata.tsv.xz 2>&1 | tee logs/sanitize_metadata.txt
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
jobid: 18
output: results/sanitized_metadata_refrences.tsv.xz
log: logs/sanitize_metadata_refrences.txt (check log file(s) for error message)
Logfile logs/sanitize_metadata.txt not found.
shell:
python3 scripts/sanitize_metadata.py --metadata data/references_metadata.tsv --metadata-id-columns strain name 'Virus name' --database-id-columns 'Accession ID' gisaid_epi_isl genbank_accession --parse-location-field Location --rename-fields 'Virus name=strain' Type=type 'Accession ID=gisaid_epi_isl' 'Collection date=date' 'Additional location information=additional_location_information' 'Sequence length=length' Host=host 'Patient age=patient_age' Gender=sex Clade=GISAID_clade 'Pango lineage=pango_lineage' pangolin_lineage=pango_lineage Lineage=pango_lineage 'Pangolin version=pangolin_version' Variant=variant 'AA Substitutions=aa_substitutions' aaSubstitutions=aa_substitutions 'Submission date=date_submitted' 'Is reference?=is_reference' 'Is complete?=is_complete' 'Is high coverage?=is_high_coverage' 'Is low coverage?=is_low_coverage' N-Content=n_content GC-Content=gc_content --strip-prefixes hCoV-19/ SARS-CoV-2/ --output results/sanitized_metadata_refrences.tsv.xz 2>&1 | tee logs/sanitize_metadata_refrences.txt
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Logfile logs/sanitize_metadata_refrences.txt not found.
Traceback (most recent call last):
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/io.py", line 653, in touch
lutime(self.file, times)
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/io.py", line 67, in lutime
os.utime(f, times, follow_symlinks=False)
FileNotFoundError: [Errno 2] No such file or directory
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/io.py", line 667, in touch_or_create
self.touch()
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/io.py", line 656, in touch
raise MissingOutputException(
snakemake.exceptions.MissingOutputException: Job Output file logs/sanitize_metadata.txt of rule sanitize_metadata shall be touched but does not exist. completed successfully, but some output files are missing.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/__init__.py", line 699, in snakemake
success = workflow.execute(
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/workflow.py", line 1069, in execute
success = self.scheduler.schedule()
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/scheduler.py", line 441, in schedule
self._error_jobs()
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/scheduler.py", line 557, in _error_jobs
self._handle_error(job)
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/scheduler.py", line 614, in _handle_error
self.get_executor(job).handle_job_error(job)
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 611, in handle_job_error
super().handle_job_error(job)
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 277, in handle_job_error
job.postprocess(
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/jobs.py", line 1009, in postprocess
self.dag.handle_log(self)
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/dag.py", line 638, in handle_log
f.touch_or_create()
File "/home/bioinformatics/miniconda3/envs/nextstrain/lib/python3.8/site-packages/snakemake/io.py", line 679, in touch_or_create
with open(file, "w") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'logs/sanitize_metadata.txt'
So, if anyone has had the same error or knows how to solve it, kindly share with us.
Thank you,