Interested in installing NASP - how to source Python 3?


#1

Hello,

I’m interested in installing the NASP pipeline on my virtual machine: https://github.com/TGenNorth/NASP
It needs to be run in Python 3. I noticed that Python 3.5 is already available on GVL, but I wondered what the best way to set it up is. Should I be working within a Python virtual environment?

Thanks!


#2

You can install NASP system-wide or in a virtualenv.
I prefer the latter, but you have to remember to activate your virtualenv every time you want to use NASP.

ubuntu@gvl:~$ python3 -m venv /home/ubuntu/nasp
ubuntu@gvl:~$ source /home/ubuntu/nasp/bin/activate
(nasp) ubuntu@gvl:~$ pip3 install nasp
Collecting nasp
  Downloading https://files.pythonhosted.org/packages/c7/52/0eebb341d2a1022f3695faa7e4c83212b27ba14364de4c5006d396f3ac5a/nasp-1.1.2-py2.py3-none-any.whl (3.8MB)
    100% |████████████████████████████████| 3.8MB 246kB/s
Installing collected packages: nasp
Successfully installed nasp-1.1.2
You are using pip version 9.0.3, however version 18.0 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
(nasp) ubuntu@liying-gvl:~$ nasp
Welcome to NASP version 1.1.2.

Where would you like output files to be written [nasp_results]?
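
If you are ever unsure whether the virtualenv is active in the current shell, you can ask the interpreter itself. This is only a minimal sketch and not part of NASP; the function name `in_virtualenv` is made up for illustration.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a venv/virtualenv."""
    # Older virtualenv releases set sys.real_prefix; the built-in venv module
    # instead makes sys.prefix differ from sys.base_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("prefix:", sys.prefix)
    print("virtualenv active:", in_virtualenv())
```

Run it with `python3` after `source /home/ubuntu/nasp/bin/activate`; the prefix should then point at /home/ubuntu/nasp.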

Rad


#3

Hi Rad,
Thanks for the quick response! Setting up the virtual environment worked as suggested. As part of the NASP settings (http://tgennorth.github.io/NASP/usage.html), it asks which system to use for job management:

What system do you use for job management (PBS/TORQUE, SLURM, SGE/OGE, and 'none' are currently supported) [PBS]?

If I choose ‘none’, the NASP program doesn’t seem to run properly - it outputs multiple lines reading total_mem = 62. Does CLIMB support any of the other job management options?

Thanks!


#4

Job management software is mostly used in a traditional HPC environment, so “none” is the correct answer.

Have you tried running NASP with the example data? (https://github.com/TGenNorth/NASP/tree/master/examples/example_1)

For future troubleshooting, the contents of nasp_results/runlog.txt would be useful.
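
If the log is long, the last few lines are usually enough to paste. A rough sketch for pulling them out is below; `tail_runlog` is a hypothetical helper, and the default path assumes you accepted the default output folder (nasp_results), so adjust it if you chose another name.

```python
from collections import deque
from pathlib import Path


def tail_runlog(path="nasp_results/runlog.txt", n=20):
    """Return the last n lines of a run log as a list of strings."""
    with Path(path).open(encoding="utf-8", errors="replace") as handle:
        # A deque with maxlen keeps only the most recent n lines in memory.
        return [line.rstrip("\n") for line in deque(handle, maxlen=n)]


if __name__ == "__main__":
    log = Path("nasp_results/runlog.txt")  # assumed default location
    if log.exists():
        print("\n".join(tail_runlog(log)))
    else:
        print(f"{log} not found; run this from the directory where NASP was started")
```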

Rad


#5

Thanks for the suggestion! I have tried running NASP with the example data, but am running into the same issue. This is the output in the runlog.txt file:

09/19/2018 11:34:24 INFO     $PATH=/home/ubuntu/nasp/bin:/home/ubuntu/bin:/home/ubuntu/.local/bin:/home/linuxbrew/.linuxbrew/bin:/home/linuxbrew/.linuxbrew/sbin:/mnt/galaxy/tools/bin:/usr/lib/postgresql/9.5/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/usr/lib/jvm/java-8-oracle/bin:/usr/lib/jvm/java-8-oracle/db/bin:/usr/lib/jvm/java-8-oracle/jre/bin
09/19/2018 11:34:24 INFO     $PYTHONPATH=
09/19/2018 11:34:24 INFO     LOADEDMODULES=
09/19/2018 11:34:38 INFO     Reference = ('reference', '/home/ubuntu/Analysis/NASP/example_1/reference.fasta')
09/19/2018 11:34:40 INFO     FindDups = True
09/19/2018 11:34:42 INFO     JobSubmitter = NONE
09/19/2018 11:34:42 INFO     RunName = example_test1
09/19/2018 11:34:42 INFO     Samtools = ('Samtools', '/home/linuxbrew/.linuxbrew/bin/samtools', '', {})
09/19/2018 11:34:42 INFO     Index = ('Index', '/home/ubuntu/nasp/bin', '', {'name': 'nasp_index', 'num_cpus': '1', 'mem_requested': '2', 'walltime': '4', 'queue': '', 'args': ''})
09/19/2018 11:35:09 INFO     Looking for external fastas in /home/ubuntu/Analysis/NASP/example_1...
09/19/2018 11:35:09 INFO     ('reference', '/home/ubuntu/Analysis/NASP/example_1/reference.fasta')
09/19/2018 11:35:09 INFO     ('example_1', '/home/ubuntu/Analysis/NASP/example_1/example_1.fasta')
09/19/2018 11:35:28 INFO     AssemblyImporter = ('AssemblyImporter', '/home/linuxbrew/.linuxbrew/bin/delta-filter', '', {'num_cpus': '1', 'mem_requested': '4', 'walltime': '4', 'queue': '', 'args': ''})
09/19/2018 11:35:28 INFO     DupFinder = ('DupFinder', '/home/linuxbrew/.linuxbrew/bin/nucmer', '', {'num_cpus': '1', 'mem_requested': '4', 'walltime': '4', 'queue': '', 'args': ''})
09/19/2018 11:35:39 INFO     Looking for read files in /home/ubuntu/Analysis/NASP/example_1...
09/19/2018 11:35:39 INFO     ('example_1_L001', '/home/ubuntu/Analysis/NASP/example_1/example_1_L001_R1_001.fastq.gz', '/home/ubuntu/Analysis/NASP/example_1/example_1_L001_R2_001.fastq.gz')
09/19/2018 11:35:47 INFO     Getting Aligners...
09/19/2018 11:35:51 INFO     ('BWA-mem', '/home/linuxbrew/.linuxbrew/bin/bwa', '', {'num_cpus': '4', 'mem_requested': '10', 'walltime': '36', 'queue': '', 'args': ''})
09/19/2018 11:35:55 INFO     Getting SNP Callers...
09/19/2018 11:35:59 INFO     ('GATK', '/home/ubuntu/programs/GenomeAnalysisTK.jar', '-stand_call_conf 100 -ploidy 1', {'num_cpus': '4', 'mem_requested': '10', 'walltime': '36', 'queue': '', 'args': ''})
09/19/2018 11:36:07 INFO     ('SAMtools', '/home/linuxbrew/.linuxbrew/bin/bcftools', '', {'num_cpus': '4', 'mem_requested': '10', 'walltime': '36', 'queue': '', 'args': ''})
09/19/2018 11:36:08 INFO     Picard = ('Picard', '/home/ubuntu/programs/picard.jar', '', {})
09/19/2018 11:36:12 INFO     CoverageFilter = 10
09/19/2018 11:36:12 INFO     ProportionFilter = 0.9
09/19/2018 11:36:14 INFO     MatrixGenerator = ('MatrixGenerator', '/home/ubuntu/nasp/lib/python3.6/site-packages/nasp/nasptool_linux_64', '', {'name': 'nasp_matrix', 'num_cpus': '8', 'mem_requested': '8', 'walltime': '48', 'queue': '', 'args': ''})
09/19/2018 11:36:24 INFO     command = format_fasta --inputfasta /home/ubuntu/Analysis/NASP/example_1/reference.fasta --outputfasta /home/ubuntu/Analysis/NASP/example_test1/reference/reference.fasta
/home/linuxbrew/.linuxbrew/bin/bwa index /home/ubuntu/Analysis/NASP/example_test1/reference/reference.fasta
java -Xmx2G -jar /home/ubuntu/programs/picard.jar CreateSequenceDictionary R=/home/ubuntu/Analysis/NASP/example_test1/reference/reference.fasta O=/home/ubuntu/Analysis/NASP/example_test1/reference/reference.dict
/home/linuxbrew/.linuxbrew/bin/samtools faidx /home/ubuntu/Analysis/NASP/example_test1/reference/reference.fasta
09/19/2018 11:36:24 DEBUG    submit_command = while [ `free -m | grep cache: | awk '{ print $4 }'` -lt 1500 ]; do sleep 300; done; format_fasta --inputfasta /home/ubuntu/Analysis/NASP/example_1/reference.fasta --outputfasta /home/ubuntu/Analysis/NASP/example_test1/reference/reference.fasta; /home/linuxbrew/.linuxbrew/bin/bwa index /home/ubuntu/Analysis/NASP/example_test1/reference/reference.fasta; java -Xmx2G -jar /home/ubuntu/programs/picard.jar CreateSequenceDictionary R=/home/ubuntu/Analysis/NASP/example_test1/reference/reference.fasta O=/home/ubuntu/Analysis/NASP/example_test1/reference/reference.dict; /home/linuxbrew/.linuxbrew/bin/samtools faidx /home/ubuntu/Analysis/NASP/example_test1/reference/reference.fasta
09/19/2018 11:36:24 INFO     jobid = 16352
09/19/2018 11:36:24 INFO     command = find_duplicates --nucmerpath /home/linuxbrew/.linuxbrew/bin/nucmer --reference /home/ubuntu/Analysis/NASP/example_test1/reference/reference.fasta
09/19/2018 11:36:24 DEBUG    submit_command = while kill -0 16352; do sleep 300; done; while [ `free -m | grep cache: | awk '{ print $4 }'` -lt 3000 ]; do sleep 300; done; find_duplicates --nucmerpath /home/linuxbrew/.linuxbrew/bin/nucmer --reference /home/ubuntu/Analysis/NASP/example_test1/reference/reference.fasta
09/19/2018 11:36:24 INFO     jobid = 16362
09/19/2018 11:36:24 INFO     command = format_fasta --inputfasta /home/ubuntu/Analysis/NASP/example_1/example_1.fasta --outputfasta /home/ubuntu/Analysis/NASP/example_test1/external/example_1.fasta
convert_external_genome --nucmerpath /home/linuxbrew/.linuxbrew/bin/nucmer --nucmerargs '' --deltafilterpath /home/linuxbrew/.linuxbrew/bin/delta-filter --deltafilterargs '' --reference /home/ubuntu/Analysis/NASP/example_test1/reference/reference.fasta --external /home/ubuntu/Analysis/NASP/example_test1/external/example_1.fasta --name example_1
09/19/2018 11:36:24 DEBUG    submit_command = while kill -0 16352; do sleep 300; done; while [ `free -m | grep cache: | awk '{ print $4 }'` -lt 3000 ]; do sleep 300; done; format_fasta --inputfasta /home/ubuntu/Analysis/NASP/example_1/example_1.fasta --outputfasta /home/ubuntu/Analysis/NASP/example_test1/external/example_1.fasta; convert_external_genome --nucmerpath /home/linuxbrew/.linuxbrew/bin/nucmer --nucmerargs '' --deltafilterpath /home/linuxbrew/.linuxbrew/bin/delta-filter --deltafilterargs '' --reference /home/ubuntu/Analysis/NASP/example_test1/reference/reference.fasta --external /home/ubuntu/Analysis/NASP/example_test1/external/example_1.fasta --name example_1
09/19/2018 11:36:24 INFO     jobid = 16368
09/19/2018 11:36:24 INFO     command = /home/linuxbrew/.linuxbrew/bin/bwa mem -R '@RG\tID:example_1_L001\tSM:example_1_L001'  -t 4 /home/ubuntu/Analysis/NASP/example_test1/reference/reference.fasta /home/ubuntu/Analysis/NASP/example_1/example_1_L001_R1_001.fastq.gz /home/ubuntu/Analysis/NASP/example_1/example_1_L001_R2_001.fastq.gz | /home/linuxbrew/.linuxbrew/bin/samtools view -S -b -h - | /home/linuxbrew/.linuxbrew/bin/samtools sort - example_1_L001-bwamem 
 /home/linuxbrew/.linuxbrew/bin/samtools index example_1_L001-bwamem.bam
09/19/2018 11:36:24 DEBUG    submit_command = while kill -0 16352; do sleep 300; done; while [ `free -m | grep cache: | awk '{ print $4 }'` -lt 7500 ]; do sleep 300; done; /home/linuxbrew/.linuxbrew/bin/bwa mem -R '@RG\tID:example_1_L001\tSM:example_1_L001'  -t 4 /home/ubuntu/Analysis/NASP/example_test1/reference/reference.fasta /home/ubuntu/Analysis/NASP/example_1/example_1_L001_R1_001.fastq.gz /home/ubuntu/Analysis/NASP/example_1/example_1_L001_R2_001.fastq.gz | /home/linuxbrew/.linuxbrew/bin/samtools view -S -b -h - | /home/linuxbrew/.linuxbrew/bin/samtools sort - example_1_L001-bwamem ;  /home/linuxbrew/.linuxbrew/bin/samtools index example_1_L001-bwamem.bam
09/19/2018 11:36:24 INFO     jobid = 16374
09/19/2018 11:36:24 INFO     command = java -Xmx10G -jar /home/ubuntu/programs/GenomeAnalysisTK.jar -T UnifiedGenotyper -dt NONE -glm BOTH -I /home/ubuntu/Analysis/NASP/example_test1/bwamem/example_1_L001-bwamem.bam -R /home/ubuntu/Analysis/NASP/example_test1/reference/reference.fasta -nt 4 -o example_1_L001-bwamem-gatk.vcf -out_mode EMIT_ALL_CONFIDENT_SITES -baq RECALCULATE -stand_call_conf 100 -ploidy 1
09/19/2018 11:36:24 DEBUG    submit_command = while kill -0 16374; do sleep 300; done; while [ `free -m | grep cache: | awk '{ print $4 }'` -lt 7500 ]; do sleep 300; done; java -Xmx10G -jar /home/ubuntu/programs/GenomeAnalysisTK.jar -T UnifiedGenotyper -dt NONE -glm BOTH -I /home/ubuntu/Analysis/NASP/example_test1/bwamem/example_1_L001-bwamem.bam -R /home/ubuntu/Analysis/NASP/example_test1/reference/reference.fasta -nt 4 -o example_1_L001-bwamem-gatk.vcf -out_mode EMIT_ALL_CONFIDENT_SITES -baq RECALCULATE -stand_call_conf 100 -ploidy 1
09/19/2018 11:36:24 INFO     jobid = 16380
09/19/2018 11:36:24 INFO     command = /home/linuxbrew/.linuxbrew/bin/samtools mpileup -uD -d 10000000 -f /home/ubuntu/Analysis/NASP/example_test1/reference/reference.fasta /home/ubuntu/Analysis/NASP/example_test1/bwamem/example_1_L001-bwamem.bam | /home/linuxbrew/.linuxbrew/bin/bcftools view -ceg  - > /home/ubuntu/Analysis/NASP/example_test1/samtools/example_1_L001-bwamem-samtools.vcf
09/19/2018 11:36:24 DEBUG    submit_command = while kill -0 16374; do sleep 300; done; while [ `free -m | grep cache: | awk '{ print $4 }'` -lt 7500 ]; do sleep 300; done; /home/linuxbrew/.linuxbrew/bin/samtools mpileup -uD -d 10000000 -f /home/ubuntu/Analysis/NASP/example_test1/reference/reference.fasta /home/ubuntu/Analysis/NASP/example_test1/bwamem/example_1_L001-bwamem.bam | /home/linuxbrew/.linuxbrew/bin/bcftools view -ceg  - > /home/ubuntu/Analysis/NASP/example_test1/samtools/example_1_L001-bwamem-samtools.vcf
09/19/2018 11:36:24 INFO     jobid = 16386
09/19/2018 11:36:24 INFO     command = /home/ubuntu/nasp/lib/python3.6/site-packages/nasp/nasptool_linux_64 matrix --dto-file /home/ubuntu/Analysis/NASP/example_test1/matrix_dto.xml --num-threads 8
09/19/2018 11:36:24 DEBUG    submit_command = while [ -s /home/ubuntu/Analysis/NASP/example_test1/nasp_matrix_dependent_pids ]; do sleep 600; for pid in `cat /home/ubuntu/Analysis/NASP/example_test1/nasp_matrix_dependent_pids`; do kill -0 "$pid" 2>/dev/null || sed -i "/^$pid$/d" /home/ubuntu/Analysis/NASP/example_test1/nasp_matrix_dependent_pids; done; done; rm /home/ubuntu/Analysis/NASP/example_test1/nasp_matrix_dependent_pids; while [ `free -m | grep cache: | awk '{ print $4 }'` -lt 46500 ]; do sleep 300; done; /home/ubuntu/nasp/lib/python3.6/site-packages/nasp/nasptool_linux_64 matrix --dto-file /home/ubuntu/Analysis/NASP/example_test1/matrix_dto.xml --num-threads 8
09/19/2018 11:36:24 INFO     jobid = 16392
09/19/2018 11:36:24 INFO     command = /home/ubuntu/nasp/lib/python3.6/site-packages/nasp/nasptool_linux_64 export --type vcf bestsnp.tsv > bestsnp.vcf & /home/ubuntu/nasp/lib/python3.6/site-packages/nasp/nasptool_linux_64 export --type vcf missingdata.tsv > missingdata.vcf & /home/ubuntu/nasp/lib/python3.6/site-packages/nasp/nasptool_linux_64 export --type fasta bestsnp.tsv > bestsnp.fasta & /home/ubuntu/nasp/lib/python3.6/site-packages/nasp/nasptool_linux_64 export --type fasta missingdata.tsv > missingdata.fasta & wait
09/19/2018 11:36:24 DEBUG    submit_command = while kill -0 16392; do sleep 300; done; while [ `free -m | grep cache: | awk '{ print $4 }'` -lt 46500 ]; do sleep 300; done; /home/ubuntu/nasp/lib/python3.6/site-packages/nasp/nasptool_linux_64 export --type vcf bestsnp.tsv > bestsnp.vcf & /home/ubuntu/nasp/lib/python3.6/site-packages/nasp/nasptool_linux_64 export --type vcf missingdata.tsv > missingdata.vcf & /home/ubuntu/nasp/lib/python3.6/site-packages/nasp/nasptool_linux_64 export --type fasta bestsnp.tsv > bestsnp.fasta & /home/ubuntu/nasp/lib/python3.6/site-packages/nasp/nasptool_linux_64 export --type fasta missingdata.tsv > missingdata.fasta & wait
09/19/2018 11:36:24 INFO     jobid = 16398

#6

That logfile looks pretty reasonable to me.

Are the output directories correctly populated?
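
One way to check is to count the files in each subdirectory of the run folder. The sketch below is only an illustration: `summarize_output` is a made-up helper, and the folder is passed on the command line (going by the log above, that would be /home/ubuntu/Analysis/NASP/example_test1).

```python
import sys
from pathlib import Path


def summarize_output(run_dir):
    """Map each immediate subdirectory of run_dir to (file_count, total_bytes)."""
    summary = {}
    for sub in sorted(Path(run_dir).iterdir()):
        if sub.is_dir():
            files = [f for f in sub.rglob("*") if f.is_file()]
            summary[sub.name] = (len(files), sum(f.stat().st_size for f in files))
    return summary


if __name__ == "__main__":
    # Usage: python3 check_output.py /path/to/run_folder
    if len(sys.argv) > 1 and Path(sys.argv[1]).is_dir():
        for name, (count, size) in summarize_output(sys.argv[1]).items():
            flag = "" if count else "  <-- empty"
            print(f"{name:12s} {count:5d} files {size:12d} bytes{flag}")
```

An empty subdirectory does not necessarily mean a failure; the step that writes to it may simply not have finished yet.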

From a quick look at the code, it doesn’t look like it’s designed to print much to stdout. Is there something you’re expecting to see on-screen during the run?
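
Since little is printed on-screen, another way to see whether anything is still happening is to check the job IDs from the log. With the job manager set to "none", the submit_command lines wait on them with `kill -0`, which suggests they are ordinary process IDs. The sketch below rests on that assumption; the function names are hypothetical, it is POSIX-only, and a PID can in principle be reused by an unrelated process.

```python
import os
import re
import sys

JOBID_RE = re.compile(r"\bjobid = (\d+)\s*$")


def jobids_from_runlog(text):
    """Extract the integers from 'jobid = N' log lines, in order of appearance."""
    ids = []
    for line in text.splitlines():
        match = JOBID_RE.search(line)
        if match:
            ids.append(int(match.group(1)))
    return ids


def pid_alive(pid):
    """Python equivalent of `kill -0 pid`: True if the process exists."""
    try:
        os.kill(pid, 0)  # signal 0 checks for existence without sending anything
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


if __name__ == "__main__":
    # Usage: python3 check_jobs.py /path/to/runlog.txt
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
            for pid in jobids_from_runlog(handle.read()):
                print(pid, "running" if pid_alive(pid) else "not running")
```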

Matt

