Error in rule colours: jobid 5

llau · October 14, 2020, 1:12pm

Hi,

I’m currently getting started with nanopore sequencing for COVID strain testing at SickKids DPLM. I’m used to Illumina WES/WGS sequencing, so this is something totally different for me.

I ran the installation and setup on our HPC centos7 with no issues using my own installation of miniconda3.

Unfortunately, when trying to run the example step:
snakemake --cores 1 --profile ./my_profiles/getting_started

I’m getting the following error:
[Wed Oct 14 06:09:17 2020]
Job 5: Constructing colors file

    python3 scripts/assign-colors.py             --ordering defaults/color_ordering.tsv             --color-schemes defaults/color_schemes.tsv             --output results/global/colors.tsv             --metadata data/example_metadata.tsv 2>&1 | tee logs/colors_global.txt

Traceback (most recent call last):
  File "scripts/assign-colors.py", line 22, in <module>
    for line in f.readlines():
  File "/hpf/largeprojects/pray/llau/programs/miniconda3/miniconda3_2020/envs/nextstrain/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 7071: ordinal not in range(128)
[Wed Oct 14 06:09:18 2020]
Error in rule colors:
jobid: 5
output: results/global/colors.tsv
log: logs/colors_global.txt (check log file(s) for error message)
shell:

    python3 scripts/assign-colors.py             --ordering defaults/color_ordering.tsv             --color-schemes defaults/color_schemes.tsv             --output results/global/colors.tsv             --metadata data/example_metadata.tsv 2>&1 | tee logs/colors_global.txt
    
    (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

The log file shows the same error as above:
(nextstrain) [llau@qlogin11 logs]$ cat colors_global.txt
Traceback (most recent call last):
  File "scripts/assign-colors.py", line 22, in <module>
    for line in f.readlines():
  File "/hpf/largeprojects/pray/llau/programs/miniconda3/miniconda3_2020/envs/nextstrain/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 7071: ordinal not in range(128)

I’ve tried changing the open call to:
with open(args.ordering, 'r+', encoding='utf-8') as f:
Unfortunately, that caused more errors downstream when writing the output:
f.write(trait_name + "\t" + trait_value + "\t" + color + "\n")

Any help would be greatly appreciated! Thanks so much!
Lynette

trs · October 14, 2020, 6:42pm

Welcome, @llau! I believe the issue here is that your HPC system is using an ASCII (not UTF-8) locale (like C) but the assign-colors.py script assumes a UTF-8 locale (like en_CA.UTF-8).

For troubleshooting purposes, can you run the locale command on your HPC system and paste the output here?

The long-term fix is to update the assign-colors.py script to be explicit about its file encoding instead of assuming the locale’s encoding will be UTF-8. As a workaround in the meantime, however, you can try running snakemake after overriding the default locale to use a UTF-8 one. For example:

LC_ALL=en_CA.UTF-8 snakemake --cores 1 --profile ./my_profiles/getting_started

We fixed a similar issue in Augur itself earlier this year, but your problem seems to be within the ncov-specific assign-colors.py.
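As a rough sketch of what that long-term fix could look like (this is not the actual assign-colors.py code; the helper names `read_lines_utf8` and `write_rows_utf8` are made up for illustration), both the read and the write need an explicit encoding. Adding it only to the read may be why the write-out step still failed, since the output file would still be opened with the locale's ASCII default. The byte 0xc3 in the error is typically the first byte of a two-byte UTF-8 character such as "é" or "ô".

```python
import os
import tempfile


def read_lines_utf8(path):
    """Read a text file as UTF-8, ignoring whatever encoding the locale implies."""
    with open(path, "r", encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]


def write_rows_utf8(path, rows):
    """Write tab-separated rows as UTF-8, ignoring whatever encoding the locale implies."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write("\t".join(row) + "\n")


# Small round trip with a non-ASCII value (hypothetical data, not from the ncov repo).
# "\u00f4" is "ô", which UTF-8 encodes as the two bytes 0xc3 0xb4.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "colors_demo.tsv")
write_rows_utf8(demo_path, [["country", "C\u00f4te d'Ivoire", "#4C90C0"]])
print(read_lines_utf8(demo_path))
```

With the encoding given explicitly on every `open()`, the script should behave the same under a C locale and a UTF-8 locale.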

Let me know what locale says and if the workaround above works for you?

llau · October 14, 2020, 6:53pm

Hi trs! Thanks so much for your help!!!

(nextstrain) [llau@qlogin11 ~]$ locale
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
LANG=C.UTF-8
LC_CTYPE="C.UTF-8"
LC_NUMERIC="C.UTF-8"
LC_TIME="C.UTF-8"
LC_COLLATE="C.UTF-8"
LC_MONETARY="C.UTF-8"
LC_MESSAGES="C.UTF-8"
LC_PAPER="C.UTF-8"
LC_NAME="C.UTF-8"
LC_ADDRESS="C.UTF-8"
LC_TELEPHONE="C.UTF-8"
LC_MEASUREMENT="C.UTF-8"
LC_IDENTIFICATION="C.UTF-8"
LC_ALL=
(nextstrain) [llau@qlogin11 ~]$

I’m trying the workaround now!

trs · October 14, 2020, 9:02pm

llau:

locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory

Ah, I think these warning messages from locale are telling. The C.UTF-8 locale that’s set is a UTF-8 locale (so that’s good!), but it appears unsupported on your HPC system (so that’s bad!). It appears this means that something (Python or the system) ends up falling back to the basic C locale, which uses ASCII.
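One way to check what Python itself ends up using is a small diagnostic like the one below, run inside the same conda environment. This is only a sketch; the function name `describe_text_encoding` is made up here. `locale.getpreferredencoding(False)` is the encoding `open()` falls back to when no `encoding=` argument is given, so if it reports an ASCII variant (often shown as something like ANSI_X3.4-1968) rather than UTF-8, that would be consistent with the fallback described above.

```python
import locale
import sys


def describe_text_encoding():
    """Collect the encodings Python will use by default for text I/O."""
    return {
        # Default for open() when encoding= is not passed.
        "preferred_encoding": locale.getpreferredencoding(False),
        # Encoding used for file names.
        "filesystem_encoding": sys.getfilesystemencoding(),
        # Encoding of standard output (may be None if stdout is replaced).
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
    }


for name, value in describe_text_encoding().items():
    print("{}: {}".format(name, value))
```

Running it once normally and once with the `LC_ALL=en_CA.UTF-8` prefix should show whether the override actually changes what Python sees.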

llau · October 15, 2020, 7:40pm

Thanks so much @trs!

Good to know - I have contacted our HPC admins for help, as I’m beta testing their rollout of CentOS 7.

I have included one of the samples sequenced here in the GISAID FASTA and metadata files. Do you know approximately how long it will run for and how much RAM is needed? I used 4 cores, but the job got killed after ~10 hrs because it exceeded the 32 GB I requested. I have tried again with 100 GB…

Thanks again for the help!

Lynette

trs · November 9, 2020, 8:31pm

@llau I’d be curious to hear if you got this working or not.

Apologies for missing your question about the runtime and RAM. I don’t have a good handle on the current numbers for those, but we could probably dig up some metrics from our own recent runs if they’d still be helpful.

llau · November 9, 2020, 8:44pm

Hi @trs!

Yes! I did! Thanks so much! I managed to get it working, and it finished in ~10 hrs using 64 GB and 4 threads! But this is using all the GISAID sequences together with ours.

Thanks so much again!

trs · November 9, 2020, 8:56pm

Awesome! Glad to hear it’s working!
