Missing Nullarbor report directory

Hello,
I am trying to run Nullarbor on 4 bacterial genomes within a GVL VM. I have been following the instructions from the Nullarbor tutorial.
I first ran:
nullarbor.pl --name test1 --ref /home/ubuntu/data/Reference/Fujisawa.fasta --input ./nullarbor_test1.tab --outdir Test1_run1
Then as prompted, ran
nice make -j 2 -C /home/ubuntu/Analysis/Nullarbor/Test/Test1_run1
It seems to be running fine, and is not producing any error messages. It is generating directories for each of the individual isolates, but not the "report" directory where the final .html file should be. The program made it to the end of the Prokka annotation and then ended.
Many thanks for any help you can provide!

Please could you try running the make recipe for just building the report and noting any errors?

cd /home/ubuntu/Analysis/Nullarbor/Test/Test1_run1
make report

Hi Matt,
This is the message that I get at the end of running make report:
Use of uninitialized value in substr at /home/linuxbrew/.linuxbrew/bin/snippy-core line 99.
substr outside of string at /home/linuxbrew/.linuxbrew/bin/snippy-core line 99.
Makefile:34: recipe for target 'core.aln' failed
make: *** [core.aln] Error 255
Thanks for your help!

Okay, good! That's something to work with...

You might be suffering the same problem as here: Snippy 3.2 error.

Please could you show me the output from both:

grep ">" /home/ubuntu/data/Reference/Fujisawa.fasta

head /home/ubuntu/Analysis/Nullarbor/Test/Test1_run1/{SAMPLE_NAME}/{SAMPLE_NAME}/snps.tab

Here {SAMPLE_NAME}/{SAMPLE_NAME} is just the snippy output directory for one of your samples.

Here are the two outputs:

>gi|336065242|ref|NC_015601.1| Erysipelothrix rhusiopathiae str. Fujisawa chromosome, complete genome

CHROM	POS	TYPE	REF	ALT	EVIDENCE	FTYPE	STRAND	NT_POS	AA_POS	EFFECT	LOCUS_TAG	GENE	PRODUCT
gi|336065242|ref|NC_015601.1|	636	snp	C	T	T:64 C:0							
gi|336065242|ref|NC_015601.1|	1023	snp	A	C	C:81 A:0							
gi|336065242|ref|NC_015601.1|	1164	snp	A	G	G:82 A:0							
gi|336065242|ref|NC_015601.1|	1386	snp	C	T	T:73 C:0							
gi|336065242|ref|NC_015601.1|	1409	snp	G	T	T:71 G:0							
gi|336065242|ref|NC_015601.1|	2375	snp	C	A	A:78 C:0							
gi|336065242|ref|NC_015601.1|	2507	snp	C	A	A:74 C:0							
gi|336065242|ref|NC_015601.1|	3040	snp	T	C	C:74 T:0							
gi|336065242|ref|NC_015601.1|	3199	snp	G	A	A:111 G:0

That all looks good. I'm going to have to ask for more stuff - sorry!

Could you show me:

grep ">" /home/ubuntu/Analysis/Nullarbor/Test/Test1_run1/ref.fa


To explain a bit about what I'm on about - the Makefile that nullarbor.pl generates is essentially a way of ordering processes that are dependent upon one another. This means that all of the previous, dependent parts of the pipeline must be complete before the next part is able to run.

Your initial problem was that you were missing the report, but this is one of the last steps of the process and is dependent upon everything else in the pipeline completing successfully.

It looks like the problem here is actually with snippy to some degree - snippy-core is not building the core SNP alignment for some reason. If we can find out why snippy-core isn't working correctly, then you should get your report!
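
A dry run is one way to see which steps make still considers outstanding. The sketch below wraps `make -n`, which prints the commands for a target and any dependencies that still need building, without executing them. `dry_run_target` is a hypothetical helper name, not part of Nullarbor.

```shell
# Print the commands make would run to build TARGET in DIR, without running them.
# -n is make's dry-run flag; -C changes into DIR first.
dry_run_target() {
    make -n -C "$1" "$2"
}
```

For the run in this thread that would be `dry_run_target /home/ubuntu/Analysis/Nullarbor/Test/Test1_run1 report`. If the recipe for core.aln shows up in the output, that step has not completed yet.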

The output of that command is:
>NC_015601.1 NC_015601.1 Erysipelothrix rhusiopathiae str. Fujisawa chromosome, complete genome
Thanks so much for helping work through this!

Sorry this took so long - I wanted to try and replicate your problem on a fresh VM to try and understand better what might be happening.

In summary, you were doing everything right and this is a nullarbor.pl bug, which results in the following behaviour:

  1. The reference sequence is converted using seqret and copied into the run directory. seqret changes the fasta header of some fasta files (ones with gi|XXX|ref|ACCESSION formatting).
  2. For each isolate, snippy is run using the original reference fasta, not the converted one.
  3. Once this is complete, snippy-core is run using the converted reference fasta (potentially with a different fasta header).
  4. The mismatch in sequence names between snps.tab and ref.fa causes the error.
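
The mismatch described in the steps above can be checked directly. The following is a minimal sketch, assuming that a sequence name is the first whitespace-delimited word of each FASTA header and that it should match the CHROM column of snps.tab exactly. The function names are hypothetical and are not part of snippy or Nullarbor.

```shell
# Print the sequence names (first word of each header line) in a FASTA file.
fasta_ids() {
    grep '^>' "$1" | sed 's/^>//' | awk '{print $1}' | sort -u
}

# Print the distinct CHROM values in a snps.tab file, skipping the header row.
snps_chroms() {
    awk -F'\t' 'NR > 1 && $1 != "" {print $1}' "$1" | sort -u
}

# Print every CHROM name from snps.tab ($2) that has no matching
# sequence name in the FASTA file ($1). No output means the names agree.
missing_chroms() {
    mc_ref_ids=$(fasta_ids "$1")
    snps_chroms "$2" | while IFS= read -r mc_chrom; do
        # -x: whole-line match, -F: fixed string (names may contain "|")
        printf '%s\n' "$mc_ref_ids" | grep -qxF -- "$mc_chrom" \
            || printf '%s\n' "$mc_chrom"
    done
}
```

Called as `missing_chroms ref.fa {SAMPLE_NAME}/{SAMPLE_NAME}/snps.tab` from inside the run directory, this should list any names that appear in snps.tab but not in the converted reference.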

The fix is very simple, and just requires changing $ref to $REF on line 366 of /home/linuxbrew/.linuxbrew/bin/nullarbor.pl.
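
Before editing, it may help to look at the line in question, since the line number could differ between installed versions (a possibility worth checking, not something confirmed in this thread). A small sketch follows; `show_lines` is a hypothetical helper, not an existing tool.

```shell
# Print lines START..END of FILE, each prefixed with its line number.
# Usage: show_lines FILE START END
show_lines() {
    awk -v s="$2" -v e="$3" 'NR >= s && NR <= e {printf "%d: %s\n", NR, $0}' "$1"
}
```

For example, `show_lines /home/linuxbrew/.linuxbrew/bin/nullarbor.pl 364 368` would print line 366 with two lines of context on either side.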

However, nullarbor2.pl is also available on GVL and doesn't appear to have the same problem. I suggest that you rerun your analysis using nullarbor2.pl. In my testing, it ran to the end with no intervention and created a report.

EDIT: Please let me know how you get on with this!

In business!!! Using nullarbor2.pl solved the problem. Thanks so much for your help Matt!
