Input stream ("/dev/stdin") is empty or corrupted. Aborting

Hello Everyone, Good Morning/Evening

I was running Nextstrain on local data and got this error message:

[Wed Jan 26 12:59:21 2022]
Error in rule align:
    jobid: 11
    output: results/aligned_RUNTEST.fasta.xz, results/insertions_RUNTEST.tsv, results/translations/seqs_RUNTEST.gene.ORF1a.fasta.xz, results/translations/seqs_RUNTEST.gene.ORF1b.fasta.xz, results/translations/seqs_RUNTEST.gene.S.fasta.xz, results/translations/seqs_RUNTEST.gene.ORF3a.fasta.xz, results/translations/seqs_RUNTEST.gene.E.fasta.xz, results/translations/seqs_RUNTEST.gene.M.fasta.xz, results/translations/seqs_RUNTEST.gene.ORF6.fasta.xz, results/translations/seqs_RUNTEST.gene.ORF7a.fasta.xz, results/translations/seqs_RUNTEST.gene.ORF7b.fasta.xz, results/translations/seqs_RUNTEST.gene.ORF8.fasta.xz, results/translations/seqs_RUNTEST.gene.N.fasta.xz, results/translations/seqs_RUNTEST.gene.ORF9b.fasta.xz
    log: logs/align_RUNTEST.txt (check log file(s) for error message)
    shell:

        python3 scripts/sanitize_sequences.py             --sequences data/RUNTEST.sequences.fasta             --strip-prefixes hCoV-19/ SARS-CoV-2/             --output /dev/stdout 2> logs/sanitize_sequences_RUNTEST.txt             | nextalign             --jobs=1             --reference defaults/reference_seq.fasta             --genemap defaults/annotation.gff             --genes ORF1a,ORF1b,S,ORF3a,E,M,ORF6,ORF7a,ORF7b,ORF8,N,ORF9b             --sequences /dev/stdin             --output-dir results/translations             --output-basename seqs_RUNTEST             --output-fasta results/aligned_RUNTEST.fasta             --output-insertions results/insertions_RUNTEST.tsv > logs/align_RUNTEST.txt 2>&1;
        xz -2 results/aligned_RUNTEST.fasta;
        xz -2 results/translations/seqs_RUNTEST*.fasta

        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Logfile logs/align_RUNTEST.txt:
[ERROR] Nextalign: Error: when running the internal parallel pipeline: When parsing input sequences: Input stream ("/dev/stdin") is empty or corrupted. Aborting.


Removing output files of failed job align since they might be corrupted:
results/insertions_RUNTEST.tsv
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /mnt/e/Bioinformatics/nextstrain/nextstrain/ncov/.snakemake/log/2022-01-26T125920.612835.snakemake.log

These are the contents of the log files:

RUNTEST.txt

[ERROR] Nextalign: Error: when running the internal parallel pipeline: When parsing input sequences: Input stream ("/dev/stdin") is empty or corrupted. Aborting.

sanitize_sequences_RUNTEST.txt

Traceback (most recent call last):
  File "/mnt/e/Bioinformatics/nextstrain/nextstrain/ncov/scripts/sanitize_sequences.py", line 2, in <module>
    from augur.io import open_file, read_sequences, write_sequences
ModuleNotFoundError: No module named 'augur'
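
The two logs appear to be linked: in the `align` rule, `sanitize_sequences.py` writes to `/dev/stdout` and its stderr goes to its own log file, so if the script crashes on import, the next command in the pipe receives nothing. The following is a minimal sketch of that behaviour, not part of the workflow; `surely_missing_module_xyz` is a hypothetical module name standing in for a package that is not installed.

```python
import subprocess
import sys


def run_upstream(code):
    """Run a Python one-liner and return (exit code, bytes written to stdout)."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        # Mirrors `2> logs/...` in the rule: the traceback never enters the pipe.
        stderr=subprocess.DEVNULL,
    )
    return result.returncode, result.stdout


# The import fails before anything is printed, so the "pipe" stays empty.
exit_code, piped = run_upstream("import surely_missing_module_xyz; print('>seq1')")
print("exit code:", exit_code)
print("bytes passed downstream:", len(piped))
```

If this reading is right, the "empty or corrupted" input stream is a symptom, and the missing `augur` module is the error to fix first.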

I tried the solutions mentioned in this issue ("`/dev/stdout` not handled properly for the snakemake workflow", Issue #754, nextstrain/ncov on GitHub), but they did not work.

I appreciate the help,

Thank you

I updated nextstrain-cli from 3.0.3 to 3.0.6 by running `nextstrain update`, then `python3.8 -m pip install --upgrade nextstrain-cli`. After that, I ran the script again.

The first error message is gone, but the second still appears:

Error in rule sanitize_metadata:
    jobid: 26
    output: results/sanitized_metadata_RUNTEST.tsv.xz
    log: logs/sanitize_metadata_RUNTEST.txt (check log file(s) for error message)
    shell:

        python3 scripts/sanitize_metadata.py             --metadata data/RUNTEST.metadata.tsv             --metadata-id-columns strain name 'Virus name'             --database-id-columns 'Accession ID' gisaid_epi_isl genbank_accession             --parse-location-field Location             --rename-fields 'Virus name=strain' Type=type 'Accession ID=gisaid_epi_isl' 'Collection date=date' 'Additional location information=additional_location_information' 'Sequence length=length' Host=host 'Patient age=patient_age' Gender=sex Clade=GISAID_clade 'Pango lineage=pango_lineage' pangolin_lineage=pango_lineage Lineage=pango_lineage 'Pangolin version=pangolin_version' Variant=variant 'AA Substitutions=aa_substitutions' aaSubstitutions=aa_substitutions 'Submission date=date_submitted' 'Is reference?=is_reference' 'Is complete?=is_complete' 'Is high coverage?=is_high_coverage' 'Is low coverage?=is_low_coverage' N-Content=n_content GC-Content=gc_content             --strip-prefixes hCoV-19/ SARS-CoV-2/                          --output results/sanitized_metadata_RUNTEST.tsv.xz 2>&1 | tee logs/sanitize_metadata_RUNTEST.txt

        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Logfile logs/sanitize_metadata_RUNTEST.txt:
Traceback (most recent call last):
  File "/mnt/e/Bioinformatics/nextstrain/nextstrain/ncov/scripts/sanitize_metadata.py", line 2, in <module>
    from augur.io import open_file, read_metadata
ModuleNotFoundError: No module named 'augur'

I have checked the issue here, and this is the output of `which python3`:

/home/linuxbrew/.linuxbrew/bin/python3
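
The traceback says that the `python3` running the workflow scripts cannot import `augur`, and `which python3` points at the linuxbrew interpreter. One way to confirm that these are the same interpreter is a small check like the sketch below, run with that same `python3`. The helper name `module_available` is made up for illustration, and the sketch assumes the workflow uses the `python3` found first on `PATH`.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if the top-level module `name` is importable by this interpreter."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Shows which interpreter is actually running and whether it can see augur.
    print("interpreter:", sys.executable)
    print("augur importable:", module_available("augur"))
```

If this prints the linuxbrew path and reports that `augur` is not importable, then `augur` is probably installed for a different Python (for example the `python3.8` used for the `pip` upgrade) than the one the workflow calls.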

Any ideas on how to solve it?

Thank you

Hi @AroobAlhumaidy. Are you by chance running this inside of WSL on Windows? I ask because the file paths (/mnt/e/Bioinformatics) in the error suggest that to me. We’ve seen issues sometimes with our workflows when using Windows drives from inside WSL.

If you are using WSL, can you try a potential workaround of using a non-Windows path in WSL? (I’m not personally familiar enough with WSL to be more specific here…) Trying that out will at least let us verify the cause of the issue, even if it’s not an ideal solution.
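
For readers who want to check whether a working directory sits on a Windows drive, here is a rough sketch. It assumes WSL's default automount layout, where Windows drives appear as `/mnt/<drive letter>`; a customised mount root would not be detected. The function name `on_windows_drive` is hypothetical.

```python
import re
from pathlib import PurePosixPath

# Assumption: default WSL automount, e.g. /mnt/c or /mnt/e for Windows drives.
_WINDOWS_MOUNT = re.compile(r"^/mnt/[a-z](/|$)")


def on_windows_drive(path):
    """Return True if `path` looks like it lives under a WSL-mounted Windows drive."""
    return bool(_WINDOWS_MOUNT.match(str(PurePosixPath(path))))


print(on_windows_drive("/mnt/e/Bioinformatics/nextstrain/nextstrain/ncov"))
print(on_windows_drive("/home/user/ncov"))
```

The first path is the one from the logs above. The second is a made-up example of a location inside the Linux filesystem, which is the kind of non-Windows path the workaround refers to.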

Can you also copy and paste the exact command you’re running?

Hello @trs,
Yes, I am using WSL on Windows.

I will try the workaround.

Here is the exact command I’m running: `snakemake --profile my_profiles/RUNTEST/ -p`

I appreciate your help,

Ok, thanks! Command looks fine. I’ll be interested to hear how the workaround goes.