Metadata-Version: 2.1
Name: helen
Version: 0.0.23
Summary: RNN-based assembly polisher HELEN. It works in tandem with MarginPolish.
Home-page: https://github.com/kishwarshafin/helen
Author: Kishwar Shafin
Author-email: kishwar.shafin@gmail.com
License: UNKNOWN
Description: # H.E.L.E.N.
        H.E.L.E.N. (Homopolymer Encoded Long-read Error-corrector for Nanopore)
        
        
        [![Build Status](https://travis-ci.com/kishwarshafin/helen.svg?branch=master)](https://travis-ci.com/kishwarshafin/helen)
        ___________________________________________________________
        A pre-print describing the methods and giving an overview of a suggested `de novo assembly` pipeline is now available:
        #### [Efficient de novo assembly of eleven human genomes using PromethION sequencing and a novel nanopore toolkit](https://www.biorxiv.org/content/10.1101/715722v1)
        __________________________________________________________
        
        ## Overview
        `HELEN` uses a Recurrent-Neural-Network (RNN) based Multi-Task Learning (MTL) model that can predict a base and a run-length for each genomic position using the weights generated by `MarginPolish`.
        
        © 2020 Kishwar Shafin, Trevor Pesout, Benedict Paten. <br/>
        Computational Genomics Lab (CGL), University of California, Santa Cruz.
        
        ## Why MarginPolish-HELEN ?
        * `MarginPolish-HELEN` outperforms other graph-based and Neural-Network based polishing pipelines.
        * Simple installation steps.
        * `HELEN` can use multiple GPUs at the same time.
        * Highly optimized pipeline that is faster than any other available polishing tool.
        * We have <b>sequenced-assembled-polished 11 samples</b> to ensure robustness, runtime-consistency and cost-efficiency.
        * We tested GPU usage on `Amazon Web Services (AWS)` and `Google Cloud Platform (GCP)` to ensure scalability.
        * Open source [(MIT License)](LICENSE).
        
        ## Walkthrough
        * [Docker based installation walkthrough](./docs/walkthrough_docker.md).
        * [Local installation walkthrough](./docs/walkthrough_local.md).
        
        ## Installation
        `MarginPolish-HELEN` is supported on <b>`Ubuntu 16.10/18.04`</b> or any other Linux-based system.
        
        #### Install prerequisites
        Before you follow any of the methods, make sure you install all the dependencies:
        ```bash
        sudo apt-get -y install git cmake make gcc g++ autoconf bzip2 lzma-dev zlib1g-dev \
        libcurl4-openssl-dev libpthread-stubs0-dev libbz2-dev liblzma-dev libhdf5-dev \
        python3-pip python3-virtualenv virtualenv
        ```
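        
        Once the packages are installed, it can be worth confirming that the command-line tools are actually on your `PATH` before building. The following is a minimal sketch; `check_tools` is a hypothetical helper name and is not part of HELEN or MarginPolish:
        
        ```bash
        # Hypothetical helper (not part of HELEN): report which commands are missing.
        check_tools() {
            missing=0
            for tool in "$@"; do
                if ! command -v "$tool" >/dev/null 2>&1; then
                    echo "missing: $tool" >&2
                    missing=1
                fi
            done
            return "$missing"
        }
        
        # Commands the installation steps rely on (a subset of the prerequisites).
        check_tools git cmake make gcc g++ autoconf bzip2 python3 \
            || echo "Install the missing prerequisites before continuing." >&2
        ```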
        
        #### Method 1: Install MarginPolish-HELEN from GitHub
        You can install from the `GitHub` repository:
        ```bash
        git clone https://github.com/kishwarshafin/helen.git
        cd helen
        make install
        . ./venv/bin/activate
        
        helen --help
        marginpolish --help
        ```
        Each time you want to use it, activate the virtualenv:
        ```bash
        . <path/to/helen/venv/bin/activate>
        ```
        
        #### Method 2: Install using PyPi
        Install the prerequisites and then install `MarginPolish-HELEN` using pip:
        ```bash
        python3 -m pip install helen --user
        
        python3 -m helen.helen --help
        python3 -m helen.marginpolish --help
        ```
        
        Update the installed version:
        ```bash
        python3 -m pip install --upgrade pip
        python3 -m pip install helen --upgrade
        ```
        
        You can also add module locations to path:
        ```bash
        echo 'export PATH="$(python3 -m site --user-base)/bin":$PATH' >> ~/.bashrc
        source ~/.bashrc
        
        marginpolish --help
        helen --help
        ```
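        
        Running the `echo ... >> ~/.bashrc` line more than once appends duplicate entries. As a minimal sketch of an idempotent alternative, the snippet below could be placed in `~/.bashrc` instead; `prepend_path_once` is a hypothetical helper name introduced only for this illustration:
        
        ```bash
        # Hypothetical helper: prepend a directory to PATH only if it is not already there.
        prepend_path_once() {
            case ":$PATH:" in
                *":$1:"*) ;;              # already present, nothing to do
                *) PATH="$1:$PATH" ;;
            esac
        }
        
        # "python3 -m site --user-base" prints the per-user install prefix;
        # "pip install --user" places console scripts in its bin/ directory.
        if command -v python3 >/dev/null 2>&1; then
            prepend_path_once "$(python3 -m site --user-base)/bin"
            export PATH
        fi
        ```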
        
        #### Method 3: Use docker image
        
        ##### CPU based docker:
        ```bash
        # SEE CONFIGURATION
        docker run --rm -it --ipc=host kishwars/helen:latest helen --help
        docker run --rm -it --ipc=host kishwars/helen:latest marginpolish --help
        
        docker run -it --ipc=host --user=`id -u`:`id -g` --cpus="16" \
        -v </directory/with/inputs_outputs>:/data kishwars/helen:latest \
        helen --help
        ```
        
        ##### GPU based docker:
        ```bash
        sudo apt-get install -y nvidia-docker2
        # SEE CONFIGURATION
        nvidia-docker run -it --ipc=host kishwars/helen:latest helen torch_stat
        nvidia-docker run -it --ipc=host kishwars/helen:latest helen --help
        nvidia-docker run -it --ipc=host kishwars/helen:latest marginpolish --help
        
        # RUN HELEN
        nvidia-docker run -it --ipc=host --user=`id -u`:`id -g` --cpus="16" \
        -v </directory/with/inputs_outputs>:/data kishwars/helen:latest \
        helen --help
        ```
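        
        The `-v` option in the commands above bind-mounts a host directory at `/data`, and Docker expects that host path to be absolute. The sketch below only prints the command for a given directory so you can inspect it before running it; `helen_docker_cmd` is a hypothetical helper name, and the flags are the ones already shown above:
        
        ```bash
        # Hypothetical helper: print (do not run) the docker command for a data directory.
        # The directory is resolved to an absolute path because -v needs one for a bind mount.
        helen_docker_cmd() {
            data_dir="$(cd "$1" && pwd)" || return 1
            echo docker run -it --ipc=host --user="$(id -u):$(id -g)" --cpus="16" \
                -v "$data_dir:/data" kishwars/helen:latest helen --help
        }
        
        # Example: use the current directory as the inputs/outputs directory.
        helen_docker_cmd .
        ```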
        ## Usage
        `MarginPolish` requires a draft assembly and a mapping of reads to the draft assembly. We recommend using `Shasta` as the initial assembler and `MiniMap2` for the mapping.
        
        #### Step 1: Generate an initial assembly
        Generate an assembly using one of the ONT assemblers:
        * [Shasta long read assembler](https://github.com/chanzuckerberg/shasta).
        * [Flye assembler](https://github.com/fenderglass/Flye)
        * [Canu assembler](https://github.com/marbl/canu)
        * [WTDBG2 assembler](https://github.com/ruanjue/wtdbg2)
        
        #### Step 2: Create an alignment between reads and shasta assembly
        We recommend using `MiniMap2` to generate the mapping between the reads and the assembly. You don't have to follow these exact commands.
        ```bash
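        # Align reads, keeping only primary mapped alignments with MAPQ >= 60
        minimap2 -ax map-ont -t 32 shasta_assembly.fa reads.fq | samtools view -hb -q 60 -F 0x904 > unsorted.bam
        # Sort by coordinate and write BAM (required for indexing)
        samtools sort -@ 32 -o reads_2_assembly.0x904q60.bam unsorted.bam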
        samtools index -@32 reads_2_assembly.0x904q60.bam
        ```
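        
        In the command above, `-q 60` skips alignments with a mapping quality below 60, and `-F 0x904` excludes reads by SAM flag bits. As a small illustration of where `0x904` comes from (the variable names are ours), the value is the bitwise OR of three flags defined in the SAM specification:
        
        ```bash
        # SAM flag bits excluded by "-F 0x904"
        UNMAPPED=0x4          # read is unmapped
        SECONDARY=0x100       # secondary alignment
        SUPPLEMENTARY=0x800   # supplementary alignment
        
        printf '0x%X\n' $(( $UNMAPPED | $SECONDARY | $SUPPLEMENTARY ))
        # prints 0x904
        ```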
        #### Step 3: Generate images using MarginPolish
        ##### Download Model
        ```bash
        helen download_models \
        --output_dir <path/to/mp_helen_models/>
        ```
        
        ##### Run MarginPolish
        You can generate images using MarginPolish by running:
        ```bash
        marginpolish reads_2_assembly.0x904q60.bam \
        shasta_assembly.fa \
        </path/to/model_name.json> \
        -t <number_of_threads> \
        -o <path/to/marginpolish_images> \
        -f
        ```
        
        The models are obtained with the `helen download_models` command shown under "Download Model".
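        
        Before starting a long MarginPolish run, it can help to confirm that every input exists. The following is a minimal sketch; `check_marginpolish_inputs` is a hypothetical helper, and it assumes the BAM index sits next to the BAM file as `<bam>.bai`, which is the default name written by `samtools index`:
        
        ```bash
        # Hypothetical pre-flight check (not part of MarginPolish or HELEN).
        # Arguments: <reads.bam> <assembly.fa> <model.json>
        check_marginpolish_inputs() {
            status=0
            for f in "$1" "$1.bai" "$2" "$3"; do
                # -s is true only for files that exist and are not empty
                if [ ! -s "$f" ]; then
                    echo "missing or empty: $f" >&2
                    status=1
                fi
            done
            return "$status"
        }
        ```
        
        Chaining it in front of the `marginpolish` command with `&&` starts the run only when the check passes.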
        
        #### Step 4: Run HELEN
        Next, run `HELEN` to polish using an RNN.
        ```bash
        helen polish \
        --image_dir </path/to/marginpolish_images/> \
        --model_path </path/to/model.pkl> \
        --batch_size 256 \
        --num_workers 4 \
        --threads <num_of_threads> \
        --output_dir </path/to/output_dir> \
        --output_prefix <output_filename.fa> \
        --gpu_mode
        ```
        
        If you are using `CPUs` then remove the `--gpu_mode` argument.
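        
        The choice between GPU and CPU mode can also be scripted. The sketch below assumes that the presence of the `nvidia-smi` command is a reasonable proxy for a usable GPU; that is our assumption, not something HELEN itself checks this way, and `helen_device_flag` is a hypothetical helper name:
        
        ```bash
        # Hypothetical helper: print --gpu_mode when the probe command exists, else nothing.
        # The probe defaults to nvidia-smi (an assumption about how to detect a GPU).
        helen_device_flag() {
            probe="${1:-nvidia-smi}"
            if command -v "$probe" >/dev/null 2>&1; then
                echo "--gpu_mode"
            fi
        }
        
        echo "device flag: $(helen_device_flag)"
        ```
        
        Replacing the literal `--gpu_mode` at the end of the `helen polish` command with an unquoted `$(helen_device_flag)` picks the mode automatically.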
        
        ## Help
        Please open a GitHub issue if you face any difficulties.
        
        ## Acknowledgement
        We are thankful to [Sergey Koren](https://github.com/skoren) and [Karen Miga](https://github.com/khmiga) for their help with `CHM13` data and evaluation.
        
        We downloaded our data from the [Telomere-to-telomere consortium](https://github.com/nanopore-wgs-consortium/CHM13) to evaluate our pipeline against `CHM13`.
        
        We acknowledge the work of the developers of these packages: <br/>
        * [Shasta](https://github.com/chanzuckerberg/shasta/commits?author=paoloczi)
        * [pytorch](https://pytorch.org/)
        * [ssw library](https://github.com/mengyao/Complete-Striped-Smith-Waterman-Library)
        * [hdf5 python (h5py)](https://www.h5py.org/)
        * [pybind](https://github.com/pybind/pybind11)
        * [hyperband](https://github.com/zygmuntz/hyperband)
        
        ## Fun Fact
        <img src="https://vignette.wikia.nocookie.net/marveldatabase/images/e/eb/Iron_Man_Armor_Model_45_from_Iron_Man_Vol_5_8_002.jpg/revision/latest?cb=20130420194800" alt="Iron Man Armor Model 45" width="240p"> <img src="https://vignette.wikia.nocookie.net/marveldatabase/images/c/c0/H.E.L.E.N._%28Earth-616%29_from_Iron_Man_Vol_5_19_002.jpg/revision/latest?cb=20140110025158" alt="H.E.L.E.N. (Earth-616)" width="120p"> <br/>
        
        The name "HELEN" is inspired by the A.I. created by Tony Stark in Marvel Comics (Earth-616). HELEN was created to control "Troy", the city Tony was building, making the A.I. "HELEN of Troy".
        
        READ MORE: [HELEN](https://marvel.fandom.com/wiki/H.E.L.E.N._(Earth-616))
        
        
        
        
Platform: UNKNOWN
Requires-Python: >=3.5
Description-Content-Type: text/markdown
