Missing Nullarbor report directory

tayaforde · May 2, 2018, 10:18pm

Hello,
I am trying to run Nullarbor on 4 bacterial genomes within a GVL VM. I have been following the instructions from the Nullarbor tutorial.
I first ran:
nullarbor.pl --name test1 --ref /home/ubuntu/data/Reference/Fujisawa.fasta --input ./nullarbor_test1.tab --outdir Test1_run1
Then as prompted, ran
nice make -j 2 -C /home/ubuntu/Analysis/Nullarbor/Test/Test1_run1
It seems to be running fine, and is not producing any error messages. It is generating directories for each of the individual isolates, but not the “report” directory where the final .html file should be. The program made it to the end of the Prokka annotation and then ended.
Many thanks for any help you can provide!

mattbull · May 3, 2018, 7:09am

Please could you try running the make recipe that builds just the report, and note any errors?

cd /home/ubuntu/Analysis/Nullarbor/Test/Test1_run1
make report

tayaforde · May 3, 2018, 7:19am

Hi Matt,
This is the message that I get at the end of running make report:
Use of uninitialized value in substr at /home/linuxbrew/.linuxbrew/bin/snippy-core line 99.
substr outside of string at /home/linuxbrew/.linuxbrew/bin/snippy-core line 99.
Makefile:34: recipe for target ‘core.aln’ failed
make: *** [core.aln] Error 255
Thanks for your help!

mattbull · May 3, 2018, 7:25am

Okay, good! That’s something to work with…

You might be hitting the same problem as described in this thread: Snippy 3.2 error.

Please could you show me the output from both:

grep ">" /home/ubuntu/data/Reference/Fujisawa.fasta

head /home/ubuntu/Analysis/Nullarbor/Test/Test1_run1/{SAMPLE_NAME}/{SAMPLE_NAME}/snps.tab

Here {SAMPLE_NAME}/{SAMPLE_NAME} stands for the snippy output directory of any one of your samples.

tayaforde · May 3, 2018, 7:30am

Here are the two outputs:

>gi|336065242|ref|NC_015601.1| Erysipelothrix rhusiopathiae str. Fujisawa chromosome, complete genome

CHROM	POS	TYPE	REF	ALT	EVIDENCE	FTYPE	STRAND	NT_POS	AA_POS	EFFECT	LOCUS_TAG	GENE	PRODUCT
gi|336065242|ref|NC_015601.1|	636	snp	C	T	T:64 C:0							
gi|336065242|ref|NC_015601.1|	1023	snp	A	C	C:81 A:0							
gi|336065242|ref|NC_015601.1|	1164	snp	A	G	G:82 A:0							
gi|336065242|ref|NC_015601.1|	1386	snp	C	T	T:73 C:0							
gi|336065242|ref|NC_015601.1|	1409	snp	G	T	T:71 G:0							
gi|336065242|ref|NC_015601.1|	2375	snp	C	A	A:78 C:0							
gi|336065242|ref|NC_015601.1|	2507	snp	C	A	A:74 C:0							
gi|336065242|ref|NC_015601.1|	3040	snp	T	C	C:74 T:0							
gi|336065242|ref|NC_015601.1|	3199	snp	G	A	A:111 G:0

mattbull · May 3, 2018, 7:44am

That all looks good, so I’m going to have to ask for more output - sorry!

Could you show me:

grep ">" /home/ubuntu/Analysis/Nullarbor/Test/Test1_run1/ref.fa

To explain what I’m getting at: the Makefile that nullarbor.pl generates is essentially a way of ordering processes that depend on one another. Each part of the pipeline can only run once all of the parts it depends on have completed.

Your initial problem was that you were missing the report, but this is one of the last steps of the process and is dependent upon everything else in the pipeline completing successfully.

It looks like the problem here is actually with snippy to some degree - snippy-core is not building the core SNP alignment for some reason. If we can find out why snippy-core isn’t working correctly, then you should get your report!
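
To illustrate the dependency behaviour described above, here is a minimal sketch using a toy Makefile. It is not the Makefile that nullarbor.pl generates: the recipes are hypothetical, and only the target names report and core.aln are borrowed from this thread. It assumes GNU make is installed, and shows that a failing prerequisite stops make before the report recipe runs.

```shell
set -eu

# Scratch directory for the toy example; removed on exit.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT

# Hypothetical two-step Makefile: report needs core.aln,
# and the core.aln recipe deliberately fails.
printf 'report: core.aln\n\t@echo "building report"\n\ncore.aln:\n\t@echo "building core.aln"; false\n' > "$demo/Makefile"

# Dry run (-n): print the recipes in dependency order without running them.
plan=$(make -n -C "$demo" report)
printf '%s\n' "$plan"

# Real run: core.aln fails, so make never reaches the report recipe.
if make -C "$demo" report > "$demo/log.txt" 2>&1; then
    status=built
else
    status=failed
fi
echo "report status: $status"
```

In a real run directory, make -n report is a harmless way to see which steps are still outstanding before running anything.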

tayaforde · May 3, 2018, 10:40am

The output of that command is:
>NC_015601.1 NC_015601.1 Erysipelothrix rhusiopathiae str. Fujisawa chromosome, complete genome
Thanks so much for helping work through this!

mattbull · May 3, 2018, 5:38pm

Sorry this took so long - I wanted to replicate your problem on a fresh VM to better understand what might be happening.

In summary, you were doing everything right and this is a nullarbor.pl bug, which results in the following behaviour:

1. The reference sequence is converted using seqret and copied into the run directory. seqret changes the fasta header of some fasta files (ones with gi|XXX|ref|ACCESSION formatting).
2. For each isolate, snippy is run using the original reference fasta, not the converted one.
3. Once this is complete, snippy-core is run using the converted reference fasta (potentially with a different fasta header).
4. The mismatch in sequence names between snps.tab and ref.fa causes the error.
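
A quick way to check for this kind of mismatch is to compare the sequence names in the reference headers with the CHROM column of snps.tab. The sketch below is only an illustration: demo_ref.fa and demo_snps.tab are hypothetical stand-in files built from the header and rows posted earlier in this thread, not files from a real run directory.

```shell
set -eu

# Scratch directory for the stand-in files; removed on exit.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Converted reference header, as reported for ref.fa in this thread
# (the sequence line is a dummy placeholder).
printf '>NC_015601.1 NC_015601.1 Erysipelothrix rhusiopathiae str. Fujisawa chromosome, complete genome\nACGT\n' > "$workdir/demo_ref.fa"

# Two rows in the snps.tab layout, using the original gi|...| sequence name.
printf 'CHROM\tPOS\tTYPE\tREF\tALT\tEVIDENCE\n' > "$workdir/demo_snps.tab"
printf 'gi|336065242|ref|NC_015601.1|\t636\tsnp\tC\tT\tT:64 C:0\n' >> "$workdir/demo_snps.tab"
printf 'gi|336065242|ref|NC_015601.1|\t1023\tsnp\tA\tC\tC:81 A:0\n' >> "$workdir/demo_snps.tab"

# Reference sequence IDs: first word of each header line, without ">".
grep '^>' "$workdir/demo_ref.fa" | sed 's/^>//' | cut -d' ' -f1 | sort -u > "$workdir/ref_ids.txt"

# IDs used in snps.tab: first column, skipping the header row.
tail -n +2 "$workdir/demo_snps.tab" | cut -f1 | sort -u > "$workdir/snp_ids.txt"

# comm -13 keeps only lines unique to the second file,
# i.e. names in snps.tab that the reference does not contain.
missing=$(comm -13 "$workdir/ref_ids.txt" "$workdir/snp_ids.txt")

if [ -n "$missing" ]; then
    echo "Names in snps.tab not found in the reference:"
    printf '%s\n' "$missing"
else
    echo "All snps.tab sequence names are present in the reference."
fi
```

On the stand-in data this reports the gi|336065242|ref|NC_015601.1| name as missing from the reference. To run the same check on a real run, point the two input paths at ref.fa and one sample's snps.tab instead.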

The fix is very simple, and just requires changing $ref to $REF on line 366 of /home/linuxbrew/.linuxbrew/bin/nullarbor.pl.

However, nullarbor2.pl is also available on GVL and doesn’t appear to have the same problem. I suggest that you rerun your analysis using nullarbor2.pl. In my testing, it ran to the end with no intervention and created a report.

EDIT: Please let me know how you get on with this!

tayaforde · May 3, 2018, 7:31pm

In business!!! Using nullarbor2.pl solved the problem. Thanks so much for your help Matt!
