- Download pre-compiled binary
- Or download the source code and build it by running "sh make.sh"
Create a configuration file and run the program
$ SOAPdenovo-Trans all -s config_file -o outputGraph
NOTE: SOAPdenovo-Trans has two versions: SOAPdenovo-Trans-31mer and SOAPdenovo-Trans-127mer.
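The two binaries differ in the largest k-mer they accept (31 or 127, per the limits given for -K). A wrapper script could pick the binary from the desired k-mer size. The following is only a minimal sketch: the function name pick_binary is hypothetical, and it assumes the smaller binary is preferred whenever it suffices and that both binaries are installed under the names given in the note.

```shell
#!/bin/sh
# Hypothetical helper: print the binary name suited to a given k-mer size.
# Assumes the -K limits: min 13, max 31 (31mer build) or 127 (127mer build).
pick_binary() {
    k=$1
    if [ "$k" -lt 13 ] || [ "$k" -gt 127 ]; then
        echo "kmer must be between 13 and 127" >&2
        return 1
    elif [ "$k" -le 31 ]; then
        echo "SOAPdenovo-Trans-31mer"
    else
        echo "SOAPdenovo-Trans-127mer"
    fi
}

# Print (rather than run) the resulting command for a k-mer of 41.
echo "$(pick_binary 41) all -s config_file -K 41 -o outputGraph"
```

Printing the command first makes it easy to check the chosen binary before starting a long assembly.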
Configuration file entries
- avg_ins: the average insert size of this library
- reverse_seq: whether the sequences need to be reverse-complemented (0 or 1)
- asm_flags: in which part(s) of the assembly the reads are used (1: contig assembly only, 2: scaffolding only, 3: both contig assembly and scaffolding)
- rd_len_cutof: trim the reads in this library to this length
- map_len: the minimum alignment length between a read and a contig for the read location to be considered reliable
Example configuration file
#maximal read length
max_rd_len=50
[LIB]
#maximal read length in this lib
rd_len_cutof=45
#average insert size
avg_ins=200
#if sequence needs to be reversed
reverse_seq=0
#in which part(s) the reads are used
asm_flags=3
#minimum aligned length to contigs for a reliable read location (at least 32 for short insert size)
map_len=32
#fastq file for read 1
q1=/path/**LIBNAMEA**/fastq_read_1.fq
#fastq file for read 2 always follows fastq file for read 1
q2=/path/**LIBNAMEA**/fastq_read_2.fq
#fasta file for read 1
f1=/path/**LIBNAMEA**/fasta_read_1.fa
#fasta file for read 2 always follows fasta file for read 1
f2=/path/**LIBNAMEA**/fasta_read_2.fa
#fastq file for single reads
q=/path/**LIBNAMEA**/fastq_read_single.fq
#fasta file for single reads
f=/path/**LIBNAMEA**/fasta_read_single.fa
#a single fasta file for paired reads
p=/path/**LIBNAMEA**/pairs_in_one_file.fa
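A file like the example above can also be generated from a script with a here-document. This is a sketch only: the function name write_config and its three arguments are hypothetical, the keys and values are copied from the example, and only one paired-end FASTQ library is written.

```shell
#!/bin/sh
# Hypothetical helper: write a one-library config for paired-end FASTQ reads.
# Usage: write_config OUT_FILE READ1_FASTQ READ2_FASTQ
write_config() {
    out=$1
    read1=$2
    read2=$3
    # Unquoted EOF so that $read1 and $read2 are expanded inside the file.
    cat > "$out" <<EOF
#maximal read length
max_rd_len=50
[LIB]
#maximal read length in this lib
rd_len_cutof=45
#average insert size
avg_ins=200
#if sequence needs to be reversed
reverse_seq=0
#in which part(s) the reads are used
asm_flags=3
#minimum aligned length to contigs for a reliable read location
map_len=32
#fastq file for read 1
q1=$read1
#fastq file for read 2 always follows fastq file for read 1
q2=$read2
EOF
}

# Example call (paths are placeholders):
#   write_config config_file /path/to/read_1.fq /path/to/read_2.fq
```

The numeric values are the ones from the example and would need adjusting to the actual read length and insert size of the library.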
Command-line options
-s <string>  configFile: the config file of reads
-o <string>  outputGraph: prefix of output graph file name
-g <string>  inputGraph: prefix of input graph file names
-R (optional)  output assembly RPKM statistics, [NO]
-f (optional)  output gap related reads for SRkgf to fill gap, [NO]
-S (optional)  scaffold structure exists, [NO]
-F (optional)  fill gaps in scaffolds, [NO]
-K <int>  kmer (min 13, max 31/127): kmer size
-p <int>  n_cpu: number of CPUs to use
-d <int>  kmerFreqCutoff: kmers with frequency no larger than kmerFreqCutoff will be deleted
-e <int>  EdgeCovCutoff: edges with coverage no larger than EdgeCovCutoff will be deleted
-M <int>  mergeLevel (min 0, max 3): the strength of merging similar sequences during contig construction
-L <int>  minContigLen: shortest contig used for scaffolding
-t <int>  locusMaxOutput: output no more than locusMaxOutput transcripts per locus
-G <int>  gapLenDiff: allowed length difference between estimated and filled gap
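To illustrate how these options combine with the "all" command, the sketch below assembles a command line and prints it for review instead of running it. The function name build_cmd and the argument values are hypothetical. The flags are the ones listed above, and the binary name assumes the 31mer build is installed.

```shell
#!/bin/sh
# Hypothetical wrapper: print a full-pipeline command line for review.
# Arguments: config file, output prefix, k-mer size, number of CPUs.
build_cmd() {
    config=$1
    prefix=$2
    kmer=$3
    cpus=$4
    # -R requests RPKM statistics and -F requests gap filling (both optional).
    echo "SOAPdenovo-Trans-31mer all -s $config -o $prefix -K $kmer -p $cpus -R -F"
}

build_cmd config_file outputGraph 25 8
# prints: SOAPdenovo-Trans-31mer all -s config_file -o outputGraph -K 25 -p 8 -R -F
```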