SOAPdenovo-Trans #

SOAPdenovo-Trans is a de novo transcriptome assembly program based on the SOAPdenovo framework, adapted to handle alternative splicing and differing expression levels among transcripts.

Installation #

  • Download the pre-compiled binary, or
  • Download the source code and build it with "sh make.sh"

Usage #

Create a configuration file, then run the program:

$ SOAPdenovo-Trans all -s config_file -o outputGraph

NOTE: SOAPdenovo-Trans has two versions: SOAPdenovo-Trans-31mer and SOAPdenovo-Trans-127mer.
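
The two binaries correspond to the two upper limits shown for -K under Options (max 31/127). A minimal sketch of picking one from the k-mer size follows; the function name pick_soap_trans_binary is hypothetical, and it assumes the 31mer build is the one to use whenever k is 31 or less.

```shell
# Hypothetical helper (not part of SOAPdenovo-Trans): print the binary name
# that matches a k-mer size, using the 13..31 / 13..127 limits of -K.
pick_soap_trans_binary() {
    k=$1
    # Reject empty or non-numeric input before the numeric comparisons.
    case $k in
        ''|*[!0-9]*) echo "k-mer size must be a positive integer: '$k'" >&2; return 1 ;;
    esac
    if [ "$k" -lt 13 ] || [ "$k" -gt 127 ]; then
        echo "k-mer size $k is outside 13..127" >&2
        return 1
    elif [ "$k" -le 31 ]; then
        echo "SOAPdenovo-Trans-31mer"
    else
        echo "SOAPdenovo-Trans-127mer"
    fi
}
```

For example, "pick_soap_trans_binary 23" prints SOAPdenovo-Trans-31mer, while a k-mer size of 63 selects SOAPdenovo-Trans-127mer.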

Configuration file items

  1. avg_ins: the average insert size of this library
  2. reverse_seq: whether the sequences need to be reverse-complemented (0 or 1)
  3. asm_flags: the assembly stage(s) in which the reads are used (1: contig assembly only, 2: scaffolding only, 3: both contig assembly and scaffolding)
  4. rd_len_cutof: trim the reads in this library to this length
  5. map_len: the minimum alignment length between a read and a contig for the read location to be considered reliable

Configuration file example

#maximal read length
max_rd_len=50
[LIB]
#maximal read length in this lib
rd_len_cutof=45
#average insert size
avg_ins=200
#if sequence needs to be reversed 
reverse_seq=0
#in which part(s) the reads are used
asm_flags=3
#minimum aligned length to contigs for a reliable read location (at least 32 for short insert size)
map_len=32
#fastq file for read 1 
q1=/path/**LIBNAMEA**/fastq_read_1.fq
#fastq file for read 2 always follows fastq file for read 1
q2=/path/**LIBNAMEA**/fastq_read_2.fq
#fasta file for read 1 
f1=/path/**LIBNAMEA**/fasta_read_1.fa
#fasta file for read 2 always follows fasta file for read 1
f2=/path/**LIBNAMEA**/fasta_read_2.fa
#fastq file for single reads
q=/path/**LIBNAMEA**/fastq_read_single.fq
#fasta file for single reads
f=/path/**LIBNAMEA**/fasta_read_single.fa
#a single fasta file for paired reads
p=/path/**LIBNAMEA**/pairs_in_one_file.fa
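
A real config normally lists only the input types a library actually has. As a minimal sketch, the helper below writes a one-library config for paired-end FASTQ reads; the function name write_paired_config is hypothetical, and the numeric values are simply copied from the example above and would need adjusting for real data.

```shell
# Hypothetical helper: write a minimal one-library config for paired-end FASTQ reads.
# Usage: write_paired_config READ1.fq READ2.fq OUT_CONFIG
# The numeric values are the example values from above, not recommendations.
write_paired_config() {
    read1=$1
    read2=$2
    out=$3
    cat > "$out" <<EOF
#maximal read length
max_rd_len=50
[LIB]
#maximal read length in this lib
rd_len_cutof=45
#average insert size
avg_ins=200
#if sequence needs to be reversed
reverse_seq=0
#in which part(s) the reads are used
asm_flags=3
#minimum aligned length to contigs for a reliable read location
map_len=32
#fastq file for read 1
q1=$read1
#fastq file for read 2 always follows fastq file for read 1
q2=$read2
EOF
}
```

Calling "write_paired_config /path/to/reads_1.fq /path/to/reads_2.fq config_file" produces a file that can be passed to -s; the read paths here are placeholders.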

Options

-s  <string>        configFile: the config file of reads
-o  <string>        outputGraph: prefix of output graph file name
-g  <string>        inputGraph: prefix of input graph file names
-R  (optional)      output assembly RPKM statistics, [NO]
-f  (optional)      output gap related reads for SRkgf to fill gap, [NO]
-S  (optional)      scaffold structure exists, [NO]
-F  (optional)      fill gaps in scaffolds, [NO]
-K  <int>           kmer (min 13, max 31/127): kmer size, [23]
-p  <int>           n_cpu: number of cpu for use, [8]
-d  <int>           kmerFreqCutoff: kmers with frequency no larger than KmerFreqCutoff will be deleted, [0]
-e  <int>           EdgeCovCutoff: edges with coverage no larger than EdgeCovCutoff will be deleted, [2]
-M  <int>           mergeLevel (min 0, max 3): the strength of merging similar sequences during contiging, [1]
-L  <int>           minContigLen: shortest contig for scaffolding, [100]
-t  <int>           locusMaxOutput: output the number of transcripts no more than locusMaxOutput in one locus, [5]
-G  <int>           gapLenDiff: allowed length difference between estimated and filled gap, [50]
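
To show how these options combine with the "all" command from the Usage section, here is a small sketch that only prints a command line rather than running it. The function name soap_trans_cmd is hypothetical, and the values 25 and 4 are arbitrary illustrations, not tuned settings.

```shell
# Hypothetical helper: print (not run) an "all" command line built from
# the -s, -o, -K and -p options listed above.
# Usage: soap_trans_cmd BINARY CONFIG PREFIX KMER THREADS
soap_trans_cmd() {
    printf '%s all -s %s -o %s -K %s -p %s\n' "$1" "$2" "$3" "$4" "$5"
}

# Example: k-mer size 25 on 4 CPUs with the 31mer binary.
soap_trans_cmd SOAPdenovo-Trans-31mer config_file outputGraph 25 4
# prints: SOAPdenovo-Trans-31mer all -s config_file -o outputGraph -K 25 -p 4
```

Printing the command first makes it easy to check the options before launching a long assembly. Paths containing spaces would need quoting before the printed line is reused.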
