Install

Requirements

Hardware

  • CPU
    • Quad Core 2.5GHz or better
      • More cores = faster run time when running multiple samples
      • Higher clock speed = each sample runs faster
  • RAM
    • This depends on the size of your data

      If you are analyzing a 96 sample run then you should be fine with 1GB per CPU core

      If you are analyzing a 24 sample run then you will probably need about 4GB per CPU core since there will be more data per sample
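
The sizing guidance above can be turned into a rough estimate. The following is a minimal sketch and not part of the pipeline: the function name estimate_ram_gb and the lookup table are illustrative, and only the two run sizes mentioned above (96 and 24 samples) are covered.

```python
# Rough RAM estimate from the guidance above (illustrative, not part of ngs_mapper).
# Only the two documented run sizes are known; anything else raises an error
# rather than guessing.
GB_PER_CORE = {96: 1, 24: 4}  # samples per run -> approximate GB of RAM per CPU core


def estimate_ram_gb(samples_per_run, cpu_cores):
    """Return the approximate total RAM (GB) for the given run size and core count."""
    if samples_per_run not in GB_PER_CORE:
        raise ValueError("no guidance for %d-sample runs" % samples_per_run)
    return GB_PER_CORE[samples_per_run] * cpu_cores


print(estimate_ram_gb(96, 4))  # 4 cores at about 1 GB each
print(estimate_ram_gb(24, 4))  # 4 cores at about 4 GB each
```

Treat the result as a starting point only; actual memory use depends on your data.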

Python Packages

All Python packages can be defined in a pip requirements.txt file. The pipeline comes with all of the necessary Python packages already defined inside requirements.txt.

System Packages

The pipeline requires some system-level packages (software installed via your Linux distribution’s package manager). The installer looks for the system_packages.lst file and installs the correct packages using it. This file is a simple JSON-formatted file that defines the packages for each package manager.
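
As a rough illustration of that layout, the sketch below parses a JSON document shaped like system_packages.lst and builds the install command for one package manager. The apt-get and yum keys are the ones used by the offline installation commands later in this document; the package names and the install_command helper are made up for the example.

```python
import json

# Hypothetical contents; the real system_packages.lst lists different packages.
EXAMPLE = """
{
    "apt-get": ["build-essential", "zlib1g-dev"],
    "yum": ["gcc", "zlib-devel"]
}
"""


def install_command(packages_json, manager):
    """Build an install command line for one package manager key."""
    packages = json.loads(packages_json)[manager]
    return "%s install %s" % (manager, " ".join(packages))


print(install_command(EXAMPLE, "apt-get"))
print(install_command(EXAMPLE, "yum"))
```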

Roche Utilities

If you intend to use roche_sync you will need to ensure that the sfffile command is in your PATH. That is, executing $> sfffile should print the help message for the command.

This command should automatically be installed and put in your path if you install the Data Analysis CD #3 that was given to you with your Roche instrument.
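
If you prefer to check this from a script, something like the following sketch would do. It is not part of the pipeline; the find_command helper is illustrative, and shutil.which requires Python 3.3 or later, so run it with a system Python 3 rather than the Python 2.7 interpreter the pipeline uses.

```python
import shutil


def find_command(name):
    """Return the full path of an executable on PATH, or None if it is missing."""
    return shutil.which(name)


path = find_command("sfffile")
if path is None:
    print("sfffile not found in PATH; roche_sync will not work")
else:
    print("sfffile found at %s" % path)
```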

MidParse.conf

If you intend to use roche_sync you may need to edit the included ngs_mapper/MidParse.conf file before installing. This file is formatted for use by the Roche utilities; more information about how it is used can be found in the Roche documentation.

Installation

  1. Clone/Download the version you want

    This assumes you already have git installed. If not, ask your system administrator to install it.

    1. Set your github username

      githubuser='mygithubusername'
      
    2. Clone the code

      git clone https://${githubuser}@github.com/VDBWRAIR/ngs_mapper.git
      cd ngs_mapper
      
    3. Check which versions are available

      git tag
      
    4. Checkout the version you want

      git checkout -b v1.1.0 v1.1.0
      
  1. Build the initial documentation

    You can build an initial version of the documentation now. It will be missing some features until you complete the install and rebuild the documentation.

    The following command builds only the bare-minimum documentation and does not include the documentation inside the code (you will build that after the install). It will generate some errors that you can most likely ignore.

    python setup.py build_sphinx
    firefox doc/build/html/install.html#install-docs
    
  1. Install System Packages

    This is the only part of the installation process that should require you to become the superuser

    • Red Hat/CentOS (requires the root password)

      su -c 'python setup.py install_system_packages'
      
    • Ubuntu

      sudo python setup.py install_system_packages
      
  2. Configure the defaults

    You need to configure the ngs_mapper/config.yaml file.

    1. Copy the default config to config.yaml

      cp ngs_mapper/config.yaml.default ngs_mapper/config.yaml
      
    2. Then edit the ngs_mapper/config.yaml file, which is in YAML format

      The most important thing is that you edit the NGSDATA value so that it contains the path to your NGSDATA directory.

      The path you use for NGSDATA must already exist. If it does not, create it:

      mkdir -p /path/to/NGSDATA
      
  3. Python

    ngs_mapper requires Python 2.7.3 or later, but earlier than 3.0

    • Ensure python is installed

      python setup.py install_python
      
    • Quick verify that Python is installed

      The following should print Python 2.7.x (where x is between 3 and 9)

      $HOME/bin/python --version
      
  4. Setup virtualenv

    1. Where do you want the pipeline to install? Don’t forget this path; you will need it every time you want to activate the pipeline

      venvpath=$HOME/.ngs_mapper
      
    2. Install the virtualenv to the path you specified

      wget --no-check-certificate https://pypi.python.org/packages/source/v/virtualenv/virtualenv-1.11.6.tar.gz#md5=f61cdd983d2c4e6aeabb70b1060d6f49 -O- | tar xzf -
      $HOME/bin/python virtualenv-1.11.6/virtualenv.py --prompt="(ngs_mapper) " $venvpath
      
    3. Activate the virtualenv. You need to do this any time you want to start using the pipeline

      . ${venvpath}/bin/activate
      
  5. Install the pipeline into virtualenv

    python setup.py install
    

    It should be safe to run this more than once in case some dependencies do not fully install.
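
The version requirement from step 3 above (2.7.3 or later, but earlier than 3.0) can also be checked programmatically. This is a minimal sketch; the helper name is_supported is illustrative and is not something the pipeline provides.

```python
import sys


def is_supported(version_info):
    """True for Python 2.7.3 up to, but not including, 3.0."""
    return (2, 7, 3) <= tuple(version_info[:3]) < (3, 0, 0)


print(is_supported((2, 7, 8)))         # inside the supported range
print(is_supported((2, 7, 2)))         # too old
print(is_supported((3, 4, 0)))         # Python 3 is not supported
print(is_supported(sys.version_info))  # the interpreter running this script
```

Run it with the interpreter you intend to use for the pipeline (for example $HOME/bin/python) so the last line reports on the right Python.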

Build and view complete documentation

cd doc
make clean && make html
firefox build/html/install.html#build-and-view-complete-documentation
cd ..

Verify install

You can run a rough check of the installation by running the functional tests

nosetests ngs_mapper/tests/test_functional.py

Offline Installation

You may want to do an offline installation, where you pre-download all requirements and then install them from the pre-downloaded location. The installation is quite similar to the regular installation process but differs as follows.

Online Workstation

  1. Download the Python packages listed in requirements.txt

    You can do this manually by downloading all packages listed in requirements.txt via http://pypi.python.org

    Or you can use pip as follows

    1. Comment out the pyBWA requirements line in requirements.txt

      sed -i 's/^git/#git/' requirements.txt
      
    2. Run the following commands to download all software needed

      This requires that you have pip installed on the online workstation

      mkdir -p software
      pip install --no-use-wheel -d software -r requirements.txt
      pip install --no-use-wheel -d software virtualenv setuptools
      git clone https://github.com/VDBWRAIR/pyBWA.git software/pyBWA
      git clone https://github.com/lh3/bwa software/bwa
      git clone https://github.com/samtools/samtools software/samtools
      wget https://www.python.org/ftp/python/2.7.8/Python-2.7.8.tgz -O software/Python-2.7.8.tgz
      wget http://www.usadellab.org/cms/uploads/supplementary/Trimmomatic/Trimmomatic-0.32.zip -O software/Trimmomatic-0.32.zip
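
Before copying everything to the offline workstation, it can help to confirm that the downloads above actually landed in the software directory. The following is a minimal sketch, not part of the pipeline; the missing_downloads helper is illustrative, and it only checks the git clones and wget downloads named above, not the individual packages fetched by pip.

```python
import os

# Items the git clone and wget commands above place in the software directory.
EXPECTED = [
    "pyBWA",
    "bwa",
    "samtools",
    "Python-2.7.8.tgz",
    "Trimmomatic-0.32.zip",
]


def missing_downloads(software_dir):
    """Return the expected entries that are not present in software_dir."""
    return [name for name in EXPECTED
            if not os.path.exists(os.path.join(software_dir, name))]


missing = missing_downloads("software")
if missing:
    print("Missing from software/: " + ", ".join(missing))
else:
    print("All expected downloads are present")
```

Run it from the ngs_mapper directory, where the software directory was created.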
      

Offline Workstation

  1. Copy the ngs_mapper directory over to your offline workstation

  2. Enter ngs_mapper directory

    cd ngs_mapper
    
  3. Edit the ngs_mapper/config.yaml file as described above in the normal installation

  4. Setup setuptools

    tar xzf software/setuptools*
    cp -f setuptools*/ez_setup.py ./
    
  5. Install System Packages

    Pick one of these depending on whether you have Ubuntu or Fedora/RedHat/CentOS

    Ubuntu

    pkgs=$(python -c "import json; print ' '.join(json.load(open('system_packages.lst'))['apt-get'])")
    sudo apt-get install $pkgs
    

    Fedora/RedHat/CentOS

    pkgs=$(python -c "import json; print ' '.join(json.load(open('system_packages.lst'))['yum'])")
    su -c "yum install $pkgs"
    
  6. Manually install all software

    # Build Python and install it under $HOME
    tar xzf software/Python*.tgz
    cd Python*
    ./configure --prefix $HOME
    make
    make install
    cd ..
    # Create and activate the virtualenv
    tar xzf software/virtualenv*
    venvpath=~/.ngs_mapper
    $HOME/bin/python virtualenv*/virtualenv.py --prompt="(ngs_mapper) " $venvpath
    . ${venvpath}/bin/activate
    python setuptools*/setup.py install
    # Build bwa and samtools and copy the binaries into the virtualenv
    cd software/bwa
    make
    cp bwa ${venvpath}/bin/
    cd ..
    cd samtools
    git checkout standalone
    make
    cp samtools bcftools/bcftools ${venvpath}/bin/
    cd ../../
    # Comment out the git requirement lines, then install from the local software directory
    sed -i 's/^git/#git/' requirements.txt
    pip install --no-index -f software numpy six argparse
    pip install --no-index -f software -r requirements.txt
    # Install pyBWA from inside its own directory
    cd software/pyBWA
    python setup.py install
    cd ../..
    # Install Trimmomatic, then the pipeline itself
    unzip software/Trimmomatic* && mv Trimmomatic* $venvpath/lib/
    python setup.py install