Assembly stuck in the SMRT portal


#1

Hi, I’m new to CLIMB and I’m working on PacBio-sequenced bacterial genomes. Last week I followed the tutorial posted here for assembling the genomes:


Everything worked perfectly, and three genomes were assembled in overnight runs. However, I’ve been struggling with a couple of jobs for the past three days. First, I launched a de novo assembly, and after ~24h it still hadn’t passed the Filtering stage. I decided to stop the job and launch a new one, but it seems to have got stuck at exactly the same point. This behavior is very different from what I observed for the first three assemblies, which quickly passed the Filtering stage and produced the “Adapters” and “Subread Filtering” graphs. The genomes in the stuck jobs are very similar to those successfully assembled in terms of size and number of reads. Do you think there is something wrong with the program, or am I just being impatient?
This is what I see when checking the status of the assemblies:


#2

Glad you managed to get some assemblies working nicely!

If the input data for this assembly is broadly similar to the others, I think it has probably stalled at this point. We’ve had a couple of minor problems with the underlying system in Birmingham this week, so depending on where this VM is hosted, this might explain the stalling.

I would first try to reboot this instance using https://bryn.climb.ac.uk. It will take a little while to reboot (~10 mins before all services are back up), but then you should be able to try the assembly again.

Please let me know what happens after a reboot. There are lots of other things we can try to make this work, but we’ll start with the simple stuff!


#3

Hi Matt,

Thank you very much for the quick reply. I already rebooted the instance and it seems the “IT Crowd” rule is working. The previous assembly changed status from “In progress” to “Failed”, and the new assembly I launched is behaving like the first assemblies, i.e. it quickly passed the Filtering step and produced the first chart and metrics. I’ll let you know tomorrow whether the assembly finishes properly.

Very best,


#4

Nice! Hope everything continues going to plan…


#5

Hi Matt,

Here are the updates for my assemblies in the SMRT portal:

The good news is that after rebooting the instance the assembly I launched finished perfectly in an overnight run.

The bad news is that the next assembly failed and I got the error message: No space left on device

I’ve been reading up about this issue and I think it might be related to the space on the /dev/sda1 filesystem, since I now have only 19G available on it, compared to the 54G I had before running the assembly. Still, I don’t have a clue how to solve it.

Does this mean that the output files of the assemblies are stored on /dev/sda1? Could you help me to solve this new issue please?


#6

Could you please show me the output of the log from the button next to the “No space left on device” message?

I don’t know exactly where smrtanalysis keeps its intermediate files, so that should help narrow it down.

Thanks for the awesome error reporting BTW, it really helps narrow down the cause of the problem quickly!


#7

Thank you very much for the quick reply and your help Matt, I appreciate it.

Of course, but the log output is very large and the character limit here is too low to paste it. Is there a way to upload a file here?


#8

I’ve PM’d you with an email address to send that logfile to.


#9

I sent the file, thanks.


#10

Thanks for the file.

It appears that smrtanalysis stores jobfiles for every run in /mnt/gvl/apps/smrtanalysis/userdata/jobs, which is on /dev/vda1 and filling it to capacity.

To free up some space, you can delete old jobs from the SMRT Portal once you’ve downloaded their assemblies and summary files, or you can follow the instructions given here to migrate the data directory to a new filesystem (volume).


#11

Hi Matt,

Awesome work, thank you very much for getting this solved. Here is a summary of the points/tips that could help users facing similar issues with the SMRT portal:

  • If a job stalls at some point for ages, you can try rebooting the instance and relaunching the job. That worked for me. Remember to mount your attached volume after rebooting (in case you are using one for your jobs in the SMRT portal, which I highly recommend, as space issues may lead to job failures [see below]).

  • As Matt pointed out before, it seems that smrtanalysis doesn’t purge its files after a run and stores the jobfiles on /dev/vda1. This causes job failures and “No space left on device” errors after several runs. You can check how the available space on /dev/vda1 decreases after each run with df -h. The easiest way to avoid this is to remove old runs once you’ve saved your output files. The good news is that you can get your output files from the /mnt/gvl/apps/smrtanalysis/userdata/jobs directory instead of clicking/downloading them from the SMRT portal webpage.
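For anyone who wants to keep an eye on this between runs, here is a minimal sketch of the df check described above, extended with du to show which job folders take up the most room. The function name report_job_space and the JOBS_DIR variable are made up for illustration; the default path is the one Matt identified in this thread, and sort -h assumes GNU coreutils (the default on Ubuntu-based images).

```shell
# Hypothetical helper (not part of smrtanalysis): report free space on the
# filesystem holding the jobs directory and the size of each job folder.
JOBS_DIR="${JOBS_DIR:-/mnt/gvl/apps/smrtanalysis/userdata/jobs}"

report_job_space() {
    dir="${1:-$JOBS_DIR}"
    # Free space on whichever filesystem contains the jobs directory
    df -h "$dir"
    # Size of each entry inside it, smallest first (errors hidden if it is empty)
    du -sh "$dir"/* 2>/dev/null | sort -h
}

# Usage: report_job_space              (checks the default jobs directory)
#        report_job_space /some/path   (checks another directory)
```

Running it before and after an assembly should show how much each run adds, which makes it easier to decide when to clear out old jobs.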

If I’m wrong about some of this information I’m sure Matt can correct me.

Thanks again!


#12

Thanks for the informative write-up!

To solve the previous-jobs-filling-up-root-disk problem, you can follow the directions here, which briefly consist of:

# Elevate permissions to superuser
sudo su

# Create the destination directory on the attached volume (cp will not create missing parent directories)
mkdir -p /mountpoint/of/newfilesystem/pacbio/userdata/jobs

# Copy current runfiles to the new directory on the attached volume
cp -au /mnt/gvl/apps/smrtanalysis/userdata/jobs/016 /mountpoint/of/newfilesystem/pacbio/userdata/jobs/

# Remove the old run file directory
rm -Rf /mnt/gvl/apps/smrtanalysis/userdata/jobs/016

# Symlink the old jobs location to the copied directory on the new volume
ln -s /mountpoint/of/newfilesystem/pacbio/userdata/jobs/016 /mnt/gvl/apps/smrtanalysis/userdata/jobs/016

# Make sure the smrtanalysis user can read/write the new directory
chown -R smrtanalysis:smrtanalysis /mountpoint/of/newfilesystem/pacbio/userdata/
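
After a migration like this, it may be worth confirming that the old location really is a symlink that resolves to the copied directory before launching a new job. The following is only a sketch: check_jobs_link is a hypothetical helper name, and it relies on readlink -f from GNU coreutils.

```shell
# Hypothetical helper: succeed only if $1 is a symlink that resolves to the
# same place as $2, printing a short message either way.
check_jobs_link() {
    link="$1"
    target="$2"
    if [ ! -L "$link" ]; then
        echo "not a symlink: $link"
        return 1
    fi
    resolved=$(readlink -f "$link")
    expected=$(readlink -f "$target")
    if [ "$resolved" != "$expected" ]; then
        echo "$link points to $resolved, expected $expected"
        return 1
    fi
    echo "ok: $link -> $resolved"
}

# Usage, with the placeholder paths from the steps above:
# check_jobs_link /mnt/gvl/apps/smrtanalysis/userdata/jobs/016 \
#                 /mountpoint/of/newfilesystem/pacbio/userdata/jobs/016
```

If the check fails, the message shows where the link actually points, which should be enough to spot a wrong target.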