RNAseq Assembly in Trinity

A short overview of a potential workflow for de novo assembly of transcriptomes using Trinity. Please see the **getting started with your data** tutorial for details about raw read processing. For more information about Trinity, go to **the Trinity website**. Some of the Python scripts linked on this page use the screed module; you can download screed **here**.

**1.** Concatenate raw read files:
**For paired end data:** Combine R1 and R2 reads into 2 separate files.
**For single end data:** Combine all reads into a single file.
code format="bash"
# Combine R1 reads into a single file
# (R1.fq must not already exist, since it also matches the *R1.fq pattern)
cat *R1.fq >> R1.fq
# Combine R2 reads into a single file
cat *R2.fq >> R2.fq
code

**2.** Quality trim raw reads. One option is to use [[file:cartwrightlab/q-trim.py|q-trim.py]]. See the **scripts** page for more details. There are a few variables in q-trim you can modify:
 * QSCORE: set the quality cutoff as an ASCII character, following the ASCII table **here** (e.g., QSCORE = '5' should trim reads in your fastq file where the Phred score is < 21).
 * INTERCRAP: determines how many contiguous low-quality bases you are willing to ignore inside any given read (default is 5 bp).
 * MINLENGTH: determines the minimum read length you want to retain (default is 30 bp).
code format="bash"
# Using q-trim
python q-trim.py infile.fq outfile.fq
code

**3. Optional:** If you are assembling paired end reads, you may want to keep only the reads for which both mates remain after trimming. You can use **both.py** to do this.
code format="bash"
# Using both.py
python both.py R1_trimmed.fq R2_trimmed.fq
# Executing the above command will create 2 files in your working directory:
# R1_trimmed.fq.both and R2_trimmed.fq.both
code

**4.** **Download** and install Trinity: After downloading Trinity, //cd// to the installation directory and type //make//. See the **installing programs page** for more details about this topic.

**5.** Execute Trinity:
code format="bash"
# Set stack size to unlimited.
ulimit -s unlimited
# Example Trinity run for paired-end data.
Trinity.pl --seqType fq --left R1_trimmed_reads.fq --right R2_trimmed_reads.fq --CPU 10 --output out/directory
code
All of the Trinity program options can be found **here**. The run above uses //--seqType// for the read format, //--left// and //--right// for the two paired read files, //--CPU// for the number of threads, and //--output// for the output directory.
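
Choosing a QSCORE character for q-trim means converting a Phred score into its ASCII symbol. Below is a minimal sketch of that conversion. The helper names are illustrative and are not part of q-trim.py, and the sketch assumes the Phred+33 encoding, which matches the '5' example above; older Illumina pipelines used an offset of 64.

```python
PHRED_OFFSET = 33  # assumption: Phred+33 (Sanger / Illumina 1.8+) encoding


def phred_to_char(score, offset=PHRED_OFFSET):
    """Return the FASTQ quality character that encodes a Phred score."""
    return chr(score + offset)


def char_to_phred(char, offset=PHRED_OFFSET):
    """Return the Phred score encoded by a single FASTQ quality character."""
    return ord(char) - offset
```

For example, `phred_to_char(20)` returns `'5'`, so a QSCORE of '5' corresponds to a Phred score of 20 under this encoding.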

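The pair filtering in step 3 can be sketched in plain Python. This is not the code of both.py, only an illustration of the idea under stated assumptions: records are standard 4-line FASTQ, mates share a read name apart from a trailing /1 or /2 (or a Casava-style comment after a space), and both files list reads in the same order. The function names are hypothetical; the `.both` suffix follows the output names described above.

```python
def read_fastq(path):
    """Yield (header, sequence, plus, quality) for each 4-line FASTQ record."""
    with open(path) as handle:
        while True:
            header = handle.readline().rstrip("\n")
            if not header:
                break  # end of file
            seq = handle.readline().rstrip("\n")
            plus = handle.readline().rstrip("\n")
            qual = handle.readline().rstrip("\n")
            yield header, seq, plus, qual


def pair_key(header):
    """Reduce a FASTQ header to the read name shared by both mates."""
    name = header[1:].split()[0]  # drop '@' and any comment after a space
    if name.endswith(("/1", "/2")):
        name = name[:-2]          # drop the mate suffix
    return name


def keep_pairs(r1_path, r2_path, suffix=".both"):
    """Write only reads present in both files; return the number of pairs kept."""
    r1_ids = {pair_key(rec[0]) for rec in read_fastq(r1_path)}
    r2_ids = {pair_key(rec[0]) for rec in read_fastq(r2_path)}
    shared = r1_ids & r2_ids
    for path in (r1_path, r2_path):
        with open(path + suffix, "w") as out:
            for rec in read_fastq(path):
                if pair_key(rec[0]) in shared:
                    out.write("\n".join(rec) + "\n")
    return len(shared)
```

Calling `keep_pairs("R1_trimmed.fq", "R2_trimmed.fq")` would write R1_trimmed.fq.both and R2_trimmed.fq.both. The sketch holds all read names in memory, so very large libraries may need a different approach.
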
You can use Trinity to assemble libraries with multiple paired-end fragment sizes: set //--group_pairs_distance// (default 500) according to the library with the largest insert size. Pairs that exceed that distance will be treated as unpaired by the Butterfly process. Trinity's defaults are tuned to a library with a 300 base fragment length and 76 base reads.

If you are running Trinity on the ittc server you can use a script called //Colony.bash//. This script monitors Java's memory use and garbage collection and makes executing Trinity more efficient. Below is a sample pbs script for submitting a Trinity job to qsub using the //Colony.bash// shell script. More information about queuing systems can be found **here**.
code
#PBS -N Job_Name
#PBS -l nodes=1:ppn=16,mem=120G,walltime=48:00:00
#PBS -S /bin/bash
#PBS -q bigm
#PBS -M your_email
#PBS -m abe
#PBS -o /path/to/out.log
#PBS -e /path/to/error.log

# Set stack size to unlimited.
ulimit -s unlimited
# cd to the directory containing the Trinity data and Trinity output directories.
cd /my/trinity/data
# Execute Trinity:
/bio/tools/5.1/trinity/RBMM/Colony.bash -w /working/directory -o /out/directory \
    -s fq -l R1.fq -r R2.fq --CPU 10 \
    --bfly JavaVM64bit --bflyHeapSpace 20G --bflyMinHeapSpace 20G --bflyHeapNursery 12G \
    --bflyJavaGCParallel --bflyJavaGCThreads 16 --repeat 5 \
    --bflyJavaCmdLifespan_min 5 --bflyJavaCmdLifespan_max 1800 \
    --bfly_opts "-V 10 --stderr"
code

**6. What to do if the assembly doesn't finish**: Sometimes your Trinity run may not execute to completion. If all or the majority of the Butterfly runs have finished, you can combine the results of those runs into a fasta file of contigs using the command below.
code format="bash"
# Execute this command from your Trinity output folder to concatenate all completed Butterfly assemblies
find chrysalis/ -name "*allProbPaths.fasta" -exec cat {} \; > Trinity.fasta
code
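
If you would rather do this from Python, the same concatenation can be sketched as follows. The function name and its default arguments are illustrative. It assumes, as in the command above, that finished Butterfly results are files ending in allProbPaths.fasta somewhere under the chrysalis folder.

```python
import os
import shutil


def collect_butterfly_contigs(chrysalis_dir="chrysalis", out_path="Trinity.fasta"):
    """Concatenate every *allProbPaths.fasta under chrysalis_dir into out_path.

    Returns the number of Butterfly result files that were combined.
    """
    n_files = 0
    with open(out_path, "wb") as out:
        for root, dirs, files in os.walk(chrysalis_dir):
            dirs.sort()  # visit subdirectories in a reproducible order
            for name in sorted(files):
                if name.endswith("allProbPaths.fasta"):
                    with open(os.path.join(root, name), "rb") as src:
                        shutil.copyfileobj(src, out)
                    n_files += 1
    return n_files
```

Run from the Trinity output folder, `collect_butterfly_contigs()` writes Trinity.fasta and returns how many Butterfly result files were combined, which gives a quick check on how many runs finished.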