Error in rule align: argument '--jobs'

leocaserta · February 17, 2023, 8:36pm

Hello,
I was receiving a warning message about duplicate sequences, but after removing the duplicates, I’m still receiving an error message about rule align.
The log files align_references.txt and align_custom_data.txt show the following message:

error: Found argument '--jobs' which wasn't expected, or isn't valid in this context

If you tried to supply `--jobs` as a value rather than a flag, use `-- --jobs`

USAGE:
nextalign [OPTIONS]

For more information try --help

Is there anything that I could fix to avoid this error?

Thank you

trs · February 17, 2023, 9:27pm

Hi @leocaserta. That error message implies something is going wrong with the command generation for those workflow rules. Possibly a configuration and/or shell escaping/quoting issue. It’s hard to debug without more information, however. Can you share the workflow you’re running, your config, and the entire output of running it (incl. the full error)?

leocaserta · February 17, 2023, 10:28pm

Hi @trs, thank you for your response.
I’m running the SARS-CoV-2 workflow. I’m not sure if there is a way of sharing .txt files here, but here’s the content of my builds.yaml file:

inputs:
  - name: "custom_data"
    metadata: data/CCTL_sequencing/Nextstrain_metadata_test.tsv
    sequences: data/CCTL_sequencing/All_CCTL_cat_02-16-23.fasta 
  - name: "references"
    metadata: data/references_metadata.tsv
    sequences: data/references_sequences.fasta

builds:
  All_CCTL_sequences_clear-name_02-16-23:
    subsampling_scheme: all
    colors: my_profiles/colors_all_WTD.tsv

filter:
  "custom_data":
    min_length: 25000
    skip_diagnostics: True 

run_pangolin: True

use_nextalign: true 

traits:
  All_CCTL_sequences_clear-name_02-16-23:
    sampling_bias_correction: 2.5
    columns: ["division", "location"] 

skip_travel_history_adjustment: True


  auspice_config: "my_profiles/my_auspice_config.json"
  description: "my_profiles/my_description.md" 
  include: "defaults/include.txt"
  colors: "my_profiles/colors_all_WTD.tsv"
 
frequencies:
  min_date: 2020-01-01
  max_date: 2022-09-01

And here is the entire output:

Your config specifies 'skip_travel_history_adjustment=True'. This is now always the case, and thus this parameter can be removed.
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 64
Rules claiming more threads will be scaled down.
Job counts:
        count   jobs
        1       add_branch_labels
        1       adjust_metadata_regions
        2       align
        1       all
        1       ancestral
        1       annotate_metadata_with_index
        1       build_align
        1       calculate_epiweeks
        1       clades
        1       combine_input_metadata
        1       combine_samples
        1       combine_sequences_for_subsampling
        1       diagnostic
        1       distances
        1       emerging_lineages
        1       export
        1       filter
        1       finalize
        1       include_hcov19_prefix
        1       index
        1       join_metadata_and_nextclade_qc
        1       logistic_growth
        1       make_pangolin_node_data
        1       mask
        1       mutational_fitness
        1       recency
        1       refine
        1       rename_emerging_lineages
        1       run_pangolin
        1       subsample
        1       tip_frequencies
        1       traits
        1       translate
        1       tree
        35

[Fri Feb 17 17:10:03 2023]
Job 37:
        Aligning sequences to defaults/reference_seq.fasta
            - gaps relative to reference are considered real



        python3 scripts/sanitize_sequences.py             --sequences data/references_sequences.fasta             --strip-prefixes hCoV-19/ SARS-CoV-2/             --output /dev/stdout 2> logs/sanitize_sequences_references.txt             | nextalign             --jobs=8             --reference defaults/reference_seq.fasta             --genemap defaults/annotation.gff             --genes ORF1a,ORF1b,S,ORF3a,E,M,ORF6,ORF7a,ORF7b,ORF8,N,ORF9b             --sequences /dev/stdin             --output-dir results/translations             --output-basename seqs_references             --output-fasta results/aligned_references.fasta             --output-insertions results/insertions_references.tsv > logs/align_references.txt 2>&1;
        xz -2 -T 8 results/aligned_references.fasta;
        xz -2 -T 8 results/translations/seqs_references*.fasta


[Fri Feb 17 17:10:03 2023]
Job 34:
        Combining metadata files results/sanitized_metadata_custom_data.tsv.xz results/sanitized_metadata_references.tsv.xz -> results/combined_metadata.tsv.xz and adding columns to represent origin



        python3 scripts/combine_metadata.py --metadata results/sanitized_metadata_custom_data.tsv.xz results/sanitized_metadata_references.tsv.xz --origins custom_data references --output results/combined_metadata.tsv.xz 2>&1 | tee logs/combine_input_metadata.txt


[Fri Feb 17 17:10:03 2023]
Job 36:
        Aligning sequences to defaults/reference_seq.fasta
            - gaps relative to reference are considered real



        python3 scripts/sanitize_sequences.py             --sequences data/CCTL_sequencing/All_CCTL_cat_02-16-23.fasta             --strip-prefixes hCoV-19/ SARS-CoV-2/             --output /dev/stdout 2> logs/sanitize_sequences_custom_data.txt             | nextalign             --jobs=8             --reference defaults/reference_seq.fasta             --genemap defaults/annotation.gff             --genes ORF1a,ORF1b,S,ORF3a,E,M,ORF6,ORF7a,ORF7b,ORF8,N,ORF9b             --sequences /dev/stdin             --output-dir results/translations             --output-basename seqs_custom_data             --output-fasta results/aligned_custom_data.fasta             --output-insertions results/insertions_custom_data.tsv > logs/align_custom_data.txt 2>&1;
        xz -2 -T 8 results/aligned_custom_data.fasta;
        xz -2 -T 8 results/translations/seqs_custom_data*.fasta

Activating conda environment: /local/workdir/lcc88/Nextstrain/ncov/.snakemake/conda/4875685a
Activating conda environment: /local/workdir/lcc88/Nextstrain/ncov/.snakemake/conda/4875685a
Activating conda environment: /local/workdir/lcc88/Nextstrain/ncov/.snakemake/conda/4875685a
[Fri Feb 17 17:10:10 2023]
Error in rule align:
    jobid: 37
    output: results/aligned_references.fasta.xz, results/insertions_references.tsv, results/translations/seqs_references.gene.ORF1a.fasta.xz, results/translations/seqs_references.gene.ORF1b.fasta.xz, results/translations/seqs_references.gene.S.fasta.xz, results/translations/seqs_references.gene.ORF3a.fasta.xz, results/translations/seqs_references.gene.E.fasta.xz, results/translations/seqs_references.gene.M.fasta.xz, results/translations/seqs_references.gene.ORF6.fasta.xz, results/translations/seqs_references.gene.ORF7a.fasta.xz, results/translations/seqs_references.gene.ORF7b.fasta.xz, results/translations/seqs_references.gene.ORF8.fasta.xz, results/translations/seqs_references.gene.N.fasta.xz, results/translations/seqs_references.gene.ORF9b.fasta.xz
    log: logs/align_references.txt (check log file(s) for error message)
    conda-env: /local/workdir/lcc88/Nextstrain/ncov/.snakemake/conda/4875685a
    shell:

        python3 scripts/sanitize_sequences.py             --sequences data/references_sequences.fasta             --strip-prefixes hCoV-19/ SARS-CoV-2/             --output /dev/stdout 2> logs/sanitize_sequences_references.txt             | nextalign             --jobs=8             --reference defaults/reference_seq.fasta             --genemap defaults/annotation.gff             --genes ORF1a,ORF1b,S,ORF3a,E,M,ORF6,ORF7a,ORF7b,ORF8,N,ORF9b             --sequences /dev/stdin             --output-dir results/translations             --output-basename seqs_references             --output-fasta results/aligned_references.fasta             --output-insertions results/insertions_references.tsv > logs/align_references.txt 2>&1;
        xz -2 -T 8 results/aligned_references.fasta;
        xz -2 -T 8 results/translations/seqs_references*.fasta

        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

[Fri Feb 17 17:10:10 2023]
Error in rule align:
    jobid: 36
    output: results/aligned_custom_data.fasta.xz, results/insertions_custom_data.tsv, results/translations/seqs_custom_data.gene.ORF1a.fasta.xz, results/translations/seqs_custom_data.gene.ORF1b.fasta.xz, results/translations/seqs_custom_data.gene.S.fasta.xz, results/translations/seqs_custom_data.gene.ORF3a.fasta.xz, results/translations/seqs_custom_data.gene.E.fasta.xz, results/translations/seqs_custom_data.gene.M.fasta.xz, results/translations/seqs_custom_data.gene.ORF6.fasta.xz, results/translations/seqs_custom_data.gene.ORF7a.fasta.xz, results/translations/seqs_custom_data.gene.ORF7b.fasta.xz, results/translations/seqs_custom_data.gene.ORF8.fasta.xz, results/translations/seqs_custom_data.gene.N.fasta.xz, results/translations/seqs_custom_data.gene.ORF9b.fasta.xz
    log: logs/align_custom_data.txt (check log file(s) for error message)
    conda-env: /local/workdir/lcc88/Nextstrain/ncov/.snakemake/conda/4875685a
    shell:

        python3 scripts/sanitize_sequences.py             --sequences data/CCTL_sequencing/All_CCTL_cat_02-16-23.fasta             --strip-prefixes hCoV-19/ SARS-CoV-2/             --output /dev/stdout 2> logs/sanitize_sequences_custom_data.txt             | nextalign             --jobs=8             --reference defaults/reference_seq.fasta             --genemap defaults/annotation.gff             --genes ORF1a,ORF1b,S,ORF3a,E,M,ORF6,ORF7a,ORF7b,ORF8,N,ORF9b             --sequences /dev/stdin             --output-dir results/translations             --output-basename seqs_custom_data             --output-fasta results/aligned_custom_data.fasta             --output-insertions results/insertions_custom_data.tsv > logs/align_custom_data.txt 2>&1;
        xz -2 -T 8 results/aligned_custom_data.fasta;
        xz -2 -T 8 results/translations/seqs_custom_data*.fasta

        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Parsed 2 metadata TSVs
        custom_data (results/sanitized_metadata_custom_data.tsv.xz): 3057 strains x 10 columns
        references (results/sanitized_metadata_references.tsv.xz): 1 strains x 38 columns
Combined metadata: 3058 strains x 40 columns
[Fri Feb 17 17:10:10 2023]
Finished job 34.
1 of 35 steps (3%) done
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /local/workdir/lcc88/Nextstrain/ncov/.snakemake/log/2023-02-17T171003.442033.snakemake.log

Thank you

rneher · February 18, 2023, 11:00pm

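Job 37 above generates a command that calls nextalign directly with its options (nextalign --jobs=8 --reference ...), but for quite some time now, nextalign has required the subcommand run before any options. Could it be that you are running a rather old workflow but have recently updated nextalign?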
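
As a rough illustration of that version mismatch, the shell sketch below picks the invocation from the reported version. It assumes that the run subcommand arrived with nextalign 2.x and that nextalign --version prints a string ending in a version number; the helper name nextalign_invocation is hypothetical and not part of the ncov workflow.

```shell
#!/usr/bin/env bash
# Sketch: decide how nextalign should be invoked, based on its major version.
# Assumption (not confirmed in this thread): the `run` subcommand is required
# from major version 2 onwards, and `--version` output ends in e.g. "2.9.1".

# Print the command prefix for a given `--version` output string.
nextalign_invocation() {
    local version_output="$1"
    # Keep only the last whitespace-separated field, e.g. "2.9.1".
    local version="${version_output##* }"
    # Drop an optional leading "v", then keep everything before the first dot.
    version="${version#v}"
    local major="${version%%.*}"
    # A non-numeric major version makes the test fail and falls through to "nextalign".
    if [ "$major" -ge 2 ] 2>/dev/null; then
        echo "nextalign run"
    else
        echo "nextalign"
    fi
}

# Only query the real binary if it is installed.
if command -v nextalign >/dev/null 2>&1; then
    echo "Call it as: $(nextalign_invocation "$(nextalign --version)")"
fi
```

If the printed prefix is nextalign run while the workflow still emits plain nextalign, the workflow and the installed tool are likely out of step, and updating the workflow (or using the environment it ships with) is the safer fix.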

leocaserta · February 19, 2023, 3:00am

Hi @rneher, I think that’s exactly what happened. I was having trouble running the first step of the nextstrain-ncov installation (curl -fsSL --proto '=https' https://nextstrain.org/cli/installer/linux | bash), so I copied an old nextstrain conda environment.

I’m wondering, is it possible to just download a copy of the ncov repository containing the workflow and install the environment by running "mamba env create -f nextstrain.yaml" (using the file inside /workflow/envs)?

corneliusroemer · February 20, 2023, 4:49pm

Hi @leocaserta

Yes, you should be able to install a recent nextstrain conda base environment using for example:

mamba create -n nextstrain -c nextstrain nextstrain-base

Maybe you also have to specify extra channels -c conda-forge -c bioconda.

However, would you be able to let us know what trouble you are facing when running curl -fsSL --proto '=https' https://nextstrain.org/cli/installer/linux | bash? This should work; if it doesn’t, it’s a bug and we’d be happy to investigate and fix it, not just for you but also for others who may encounter the same issue.

leocaserta · February 20, 2023, 10:40pm

I’m not sure why, but this time the installation went perfectly and it was installed in my home directory. I think I previously had some environment activated and it was giving error messages because of that, but I’m not sure.
Thank you all for your help!
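
One way to check the suspicion about an activated environment before re-running the installer is sketched below. It assumes that conda or mamba exports CONDA_DEFAULT_ENV while an environment is active; the helper name active_conda_env is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: report whether a conda/mamba environment is currently active.
# Assumption: CONDA_DEFAULT_ENV is set by `conda activate` / `mamba activate`.

# Print the active environment name and return 0, or return 1 if none is active.
active_conda_env() {
    if [ -n "${CONDA_DEFAULT_ENV:-}" ]; then
        echo "$CONDA_DEFAULT_ENV"
        return 0
    fi
    return 1
}

if env_name="$(active_conda_env)"; then
    echo "Conda environment '$env_name' is active; consider 'conda deactivate' before installing." >&2
else
    echo "No conda environment is active." >&2
fi
```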

trs · February 28, 2023, 6:58pm

@leocaserta Glad to hear it’s sorted out for you now. If you encounter the issue again, we’d be keen to hear about it so we can fix it for good for others too.
