**Hi @trs, thank you for your response.**
I’m running the SARS-CoV-2 workflow. I’m not sure if there is a way of sharing .txt files here, but here’s the content of my builds.yaml file:
inputs:
  - name: "custom_data"
    metadata: data/CCTL_sequencing/Nextstrain_metadata_test.tsv
    sequences: data/CCTL_sequencing/All_CCTL_cat_02-16-23.fasta
  - name: "references"
    metadata: data/references_metadata.tsv
    sequences: data/references_sequences.fasta
builds:
  All_CCTL_sequences_clear-name_02-16-23:
    subsampling_scheme: all
    colors: my_profiles/colors_all_WTD.tsv
filter:
  "custom_data":
    min_length: 25000
    skip_diagnostics: True
run_pangolin: True
use_nextalign: true
traits:
  All_CCTL_sequences_clear-name_02-16-23:
    sampling_bias_correction: 2.5
    columns: ["division", "location"]
skip_travel_history_adjustment: True
auspice_config: "my_profiles/my_auspice_config.json"
description: "my_profiles/my_description.md"
include: "defaults/include.txt"
colors: "my_profiles/colors_all_WTD.tsv"
frequencies:
  min_date: 2020-01-01
  max_date: 2022-09-01
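In case it helps rule out a simple path problem, here is a minimal Python sketch that checks whether the local files named under `inputs` exist and are non-empty, relative to the directory the workflow is run from. It is not part of the ncov workflow: the function name `missing_inputs` is made up for illustration, and skipping anything containing `://` as a remote input is an assumption of this sketch.

```python
from pathlib import Path


def missing_inputs(inputs, workdir="."):
    """Return (input name, field, path) for each local file that is missing or empty."""
    problems = []
    for entry in inputs:
        for field in ("metadata", "sequences"):
            value = entry.get(field)
            if value is None:
                continue
            # Assumption of this sketch: values that look like URLs are remote
            # inputs and are skipped; only local paths are checked.
            if "://" in value:
                continue
            path = Path(workdir) / value
            if not path.is_file() or path.stat().st_size == 0:
                problems.append((entry.get("name"), field, value))
    return problems


# The `inputs` section of the builds.yaml above, written out as Python data.
inputs = [
    {
        "name": "custom_data",
        "metadata": "data/CCTL_sequencing/Nextstrain_metadata_test.tsv",
        "sequences": "data/CCTL_sequencing/All_CCTL_cat_02-16-23.fasta",
    },
    {
        "name": "references",
        "metadata": "data/references_metadata.tsv",
        "sequences": "data/references_sequences.fasta",
    },
]

if __name__ == "__main__":
    # Run from the top of the ncov checkout so the relative paths resolve.
    for name, field, value in missing_inputs(inputs):
        print(f"{name}: {field} file missing or empty: {value}")
```

An empty result only means the files are present and non-empty; it says nothing about whether their contents are valid.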
And here is the entire output:
Your config specifies 'skip_travel_history_adjustment=True'. This is now always the case, and thus this parameter can be removed.
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 64
Rules claiming more threads will be scaled down.
Job counts:
count jobs
1 add_branch_labels
1 adjust_metadata_regions
2 align
1 all
1 ancestral
1 annotate_metadata_with_index
1 build_align
1 calculate_epiweeks
1 clades
1 combine_input_metadata
1 combine_samples
1 combine_sequences_for_subsampling
1 diagnostic
1 distances
1 emerging_lineages
1 export
1 filter
1 finalize
1 include_hcov19_prefix
1 index
1 join_metadata_and_nextclade_qc
1 logistic_growth
1 make_pangolin_node_data
1 mask
1 mutational_fitness
1 recency
1 refine
1 rename_emerging_lineages
1 run_pangolin
1 subsample
1 tip_frequencies
1 traits
1 translate
1 tree
35
[Fri Feb 17 17:10:03 2023]
Job 37:
Aligning sequences to defaults/reference_seq.fasta
- gaps relative to reference are considered real
python3 scripts/sanitize_sequences.py --sequences data/references_sequences.fasta --strip-prefixes hCoV-19/ SARS-CoV-2/ --output /dev/stdout 2> logs/sanitize_sequences_references.txt | nextalign --jobs=8 --reference defaults/reference_seq.fasta --genemap defaults/annotation.gff --genes ORF1a,ORF1b,S,ORF3a,E,M,ORF6,ORF7a,ORF7b,ORF8,N,ORF9b --sequences /dev/stdin --output-dir results/translations --output-basename seqs_references --output-fasta results/aligned_references.fasta --output-insertions results/insertions_references.tsv > logs/align_references.txt 2>&1;
xz -2 -T 8 results/aligned_references.fasta;
xz -2 -T 8 results/translations/seqs_references*.fasta
[Fri Feb 17 17:10:03 2023]
Job 34:
Combining metadata files results/sanitized_metadata_custom_data.tsv.xz results/sanitized_metadata_references.tsv.xz -> results/combined_metadata.tsv.xz and adding columns to represent origin
python3 scripts/combine_metadata.py --metadata results/sanitized_metadata_custom_data.tsv.xz results/sanitized_metadata_references.tsv.xz --origins custom_data references --output results/combined_metadata.tsv.xz 2>&1 | tee logs/combine_input_metadata.txt
[Fri Feb 17 17:10:03 2023]
Job 36:
Aligning sequences to defaults/reference_seq.fasta
- gaps relative to reference are considered real
python3 scripts/sanitize_sequences.py --sequences data/CCTL_sequencing/All_CCTL_cat_02-16-23.fasta --strip-prefixes hCoV-19/ SARS-CoV-2/ --output /dev/stdout 2> logs/sanitize_sequences_custom_data.txt | nextalign --jobs=8 --reference defaults/reference_seq.fasta --genemap defaults/annotation.gff --genes ORF1a,ORF1b,S,ORF3a,E,M,ORF6,ORF7a,ORF7b,ORF8,N,ORF9b --sequences /dev/stdin --output-dir results/translations --output-basename seqs_custom_data --output-fasta results/aligned_custom_data.fasta --output-insertions results/insertions_custom_data.tsv > logs/align_custom_data.txt 2>&1;
xz -2 -T 8 results/aligned_custom_data.fasta;
xz -2 -T 8 results/translations/seqs_custom_data*.fasta
Activating conda environment: /local/workdir/lcc88/Nextstrain/ncov/.snakemake/conda/4875685a
Activating conda environment: /local/workdir/lcc88/Nextstrain/ncov/.snakemake/conda/4875685a
Activating conda environment: /local/workdir/lcc88/Nextstrain/ncov/.snakemake/conda/4875685a
[Fri Feb 17 17:10:10 2023]
Error in rule align:
jobid: 37
output: results/aligned_references.fasta.xz, results/insertions_references.tsv, results/translations/seqs_references.gene.ORF1a.fasta.xz, results/translations/seqs_references.gene.ORF1b.fasta.xz, results/translations/seqs_references.gene.S.fasta.xz, results/translations/seqs_references.gene.ORF3a.fasta.xz, results/translations/seqs_references.gene.E.fasta.xz, results/translations/seqs_references.gene.M.fasta.xz, results/translations/seqs_references.gene.ORF6.fasta.xz, results/translations/seqs_references.gene.ORF7a.fasta.xz, results/translations/seqs_references.gene.ORF7b.fasta.xz, results/translations/seqs_references.gene.ORF8.fasta.xz, results/translations/seqs_references.gene.N.fasta.xz, results/translations/seqs_references.gene.ORF9b.fasta.xz
log: logs/align_references.txt (check log file(s) for error message)
conda-env: /local/workdir/lcc88/Nextstrain/ncov/.snakemake/conda/4875685a
shell:
python3 scripts/sanitize_sequences.py --sequences data/references_sequences.fasta --strip-prefixes hCoV-19/ SARS-CoV-2/ --output /dev/stdout 2> logs/sanitize_sequences_references.txt | nextalign --jobs=8 --reference defaults/reference_seq.fasta --genemap defaults/annotation.gff --genes ORF1a,ORF1b,S,ORF3a,E,M,ORF6,ORF7a,ORF7b,ORF8,N,ORF9b --sequences /dev/stdin --output-dir results/translations --output-basename seqs_references --output-fasta results/aligned_references.fasta --output-insertions results/insertions_references.tsv > logs/align_references.txt 2>&1;
xz -2 -T 8 results/aligned_references.fasta;
xz -2 -T 8 results/translations/seqs_references*.fasta
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
[Fri Feb 17 17:10:10 2023]
Error in rule align:
jobid: 36
output: results/aligned_custom_data.fasta.xz, results/insertions_custom_data.tsv, results/translations/seqs_custom_data.gene.ORF1a.fasta.xz, results/translations/seqs_custom_data.gene.ORF1b.fasta.xz, results/translations/seqs_custom_data.gene.S.fasta.xz, results/translations/seqs_custom_data.gene.ORF3a.fasta.xz, results/translations/seqs_custom_data.gene.E.fasta.xz, results/translations/seqs_custom_data.gene.M.fasta.xz, results/translations/seqs_custom_data.gene.ORF6.fasta.xz, results/translations/seqs_custom_data.gene.ORF7a.fasta.xz, results/translations/seqs_custom_data.gene.ORF7b.fasta.xz, results/translations/seqs_custom_data.gene.ORF8.fasta.xz, results/translations/seqs_custom_data.gene.N.fasta.xz, results/translations/seqs_custom_data.gene.ORF9b.fasta.xz
log: logs/align_custom_data.txt (check log file(s) for error message)
conda-env: /local/workdir/lcc88/Nextstrain/ncov/.snakemake/conda/4875685a
shell:
python3 scripts/sanitize_sequences.py --sequences data/CCTL_sequencing/All_CCTL_cat_02-16-23.fasta --strip-prefixes hCoV-19/ SARS-CoV-2/ --output /dev/stdout 2> logs/sanitize_sequences_custom_data.txt | nextalign --jobs=8 --reference defaults/reference_seq.fasta --genemap defaults/annotation.gff --genes ORF1a,ORF1b,S,ORF3a,E,M,ORF6,ORF7a,ORF7b,ORF8,N,ORF9b --sequences /dev/stdin --output-dir results/translations --output-basename seqs_custom_data --output-fasta results/aligned_custom_data.fasta --output-insertions results/insertions_custom_data.tsv > logs/align_custom_data.txt 2>&1;
xz -2 -T 8 results/aligned_custom_data.fasta;
xz -2 -T 8 results/translations/seqs_custom_data*.fasta
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Parsed 2 metadata TSVs
custom_data (results/sanitized_metadata_custom_data.tsv.xz): 3057 strains x 10 columns
references (results/sanitized_metadata_references.tsv.xz): 1 strains x 38 columns
Combined metadata: 3058 strains x 40 columns
[Fri Feb 17 17:10:10 2023]
Finished job 34.
1 of 35 steps (3%) done
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /local/workdir/lcc88/Nextstrain/ncov/.snakemake/log/2023-02-17T171003.442033.snakemake.log
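Since the output above only points at the per-rule log files rather than printing them, a small Python sketch like the following could pull the log paths out of the Snakemake output and print the end of each file. It is not part of the ncov workflow: the function names `failed_rule_logs` and `print_log_tails` are made up for illustration, and it assumes the output keeps the "Error in rule ...:" followed by "log: ..." layout shown here.

```python
import re
from pathlib import Path

# Header Snakemake prints for a failed job, e.g. "Error in rule align:".
RULE_RE = re.compile(r"^\s*Error in rule (\S+):\s*$")
# The "log: <path> (check log file(s) for error message)" line that follows it.
LOG_RE = re.compile(r"^\s*log:\s*(\S+)")


def failed_rule_logs(snakemake_output):
    """Return (rule name, log path) pairs for every failed job in the output."""
    pairs = []
    current_rule = None
    for line in snakemake_output.splitlines():
        rule_match = RULE_RE.match(line)
        if rule_match:
            current_rule = rule_match.group(1)
            continue
        log_match = LOG_RE.match(line)
        # Only accept a "log:" line that belongs to an error block.
        if log_match and current_rule is not None:
            pairs.append((current_rule, log_match.group(1)))
            current_rule = None
    return pairs


def print_log_tails(pairs, workdir=".", n_lines=20):
    """Print the last n_lines of each log file, noting any that are missing."""
    for rule, log_path in pairs:
        path = Path(workdir) / log_path
        print(f"== {rule}: {path} ==")
        if not path.is_file():
            print("(log file not found)")
            continue
        lines = path.read_text(errors="replace").splitlines()
        print("\n".join(lines[-n_lines:]))
```

For the run above, this would be pointed at the saved terminal output and should pick up `logs/align_references.txt` and `logs/align_custom_data.txt`, which is where the actual nextalign error message should be.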
Thank you