Appendix V. Software and Tools
0) Processing files in different formats
0.1) sequence
0.2) alignment
0.3) interval
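As an illustration of the sequence format handling listed above, here is a minimal sketch of reading FASTA records in plain Python. The function name parse_fasta and the example records are hypothetical and not part of any tool listed in this appendix; for real data, an established parser is usually a safer choice.

```python
from typing import Dict, Iterable, Optional


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return {record_id: sequence} from FASTA-formatted lines (hypothetical helper)."""
    records: Dict[str, str] = {}
    current_id: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: use the first whitespace-delimited token as the ID
            parts = line[1:].split()
            current_id = parts[0] if parts else ""
            records[current_id] = ""
        elif current_id is not None:
            # Sequence line: append to the record opened by the last header
            records[current_id] += line
    return records


# Made-up example input, for illustration only
example = """>seq1 demo record
ACGT
ACGT
>seq2
GGCC
""".splitlines()

records = parse_fasta(example)
for name, seq in records.items():
    print(name, len(seq))
```

The sketch keeps every record in memory, which is fine for small files but not for whole genomes or read sets.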
1) Homolog analysis
1.1) Sequence based search
1.2) Profile based search
hmmer: profile HMM-based search for protein and nucleotide sequences
infernal: profile SCFG-based search for structured noncoding RNAs
hh-suite: profile HMM to profile HMM alignment
1.3) Multiple sequence alignment
2) Genome Browsers
see more in our Tutorial
3) DNA-seq
(3.1) Mapping and QC
Remove adaptor
TrimGalore: a wrapper around cutadapt that automatically detects common adaptors
QC
(3.2) Variant Calling
Mutation annotation
(3.3) Assembly
De novo assembly software
SPAdes: the sub-utility metaSPAdes is designed for metagenome assembly
megahit: designed for metagenome assembly
(3.4) CNV
Whole genome Seq
(3.5) SV (structural variation)
4) RNA-seq
(4.1) RNA-seq
Expression Quantification
(4.2) Single Cell RNA-seq (scRNA-seq)
awesome-single-cell: a collection of single cell analysis tools
seurat: a widely used R package
scanpy: a widely used python package
monocle: Trajectory analysis
cellphonedb: Cell-cell interaction analysis
scenic: Transcriptional regulatory network
Tutorials
https://bioconductor.org/books/release/OSCA/
https://github.com/theislab/single-cell-tutorial
(4.3) Assembly
Trinity: transcript assembly from RNA-seq data
5) Interactome
(5.1) ChIP-seq
MACS: peak calling
homer: peak calling, motif finding, etc
ChIPseeker: visualization and annotation
(5.2) CLIP-seq
(5.3) Motif analysis
sequence
MEME: motif-based sequence analysis tools, http://meme-suite.org/
HOMER: software for motif discovery and next-gen sequencing analysis, http://homer.ucsd.edu/homer/motif/
structure
RNApromo: computational prediction of RNA structural motifs involved in post-transcriptional regulatory processes, https://genie.weizmann.ac.il/pubs/rnamotifs08/
GraphProt: modeling binding preferences of RNA-binding proteins, http://www.bioinf.uni-freiburg.de/Software/GraphProt/
6) Epigenetic Data
(6.1) ChIP-seq
Bisulfite sequencing:
Segmentation of the methylome, Classification of Fully Methylated Regions (FMRs), Unmethylated Regions (UMRs) and Low-Methylated Regions (LMRs)
Annotation of DMRs
Web-based service
IP data:
Overview of ChIP-seq: https://github.com/crazyhottommy/ChIP-seq-analysis
peak calling: MACS2
Peak annotation and visualization
Gene set enrichment analysis for ChIP-seq peaks
(6.2) DNase-seq
Peak calling: F-Seq
Peak annotation: ChIPpeakAnno
Motif analysis: MEME-ChIP
(6.3) ATAC-seq
Pipeline recommended by Harvard Informatics
7) Microbe data analysis
kraken2: k-mer based fast metagenome reads classification
metaphlan: marker gene based microbe taxonomy abundance estimation
motu: marker gene based microbe taxonomy abundance estimation
maxbin: binning contigs into metagenome-assembled genomes (MAGs)
mash: rapid estimation of distances between genomes
drep: picks representative genomes from sample-wise assemblies
prodigal: prokaryote gene prediction
prokka: pipeline for prokaryote genome annotation
qiime2: 16S amplicon sequencing data analysis
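To make the k-mer idea behind tools such as mash more concrete, the sketch below computes the exact Jaccard index between the k-mer sets of two sequences and converts it to a Mash-style distance, D = -(1/k) ln(2j / (1 + j)). This is a simplified illustration, not how mash itself is implemented: mash estimates the Jaccard index from MinHash sketches and uses canonical k-mers, both of which are omitted here. The function names and toy sequences are hypothetical.

```python
import math
from typing import Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of seq (no canonicalization)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Exact Jaccard index |A ∩ B| / |A ∪ B|; defined as 1.0 for two empty sets."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def mash_distance(j: float, k: int) -> float:
    """Mash-style distance from a Jaccard index j and k-mer size k."""
    if j <= 0.0:
        return 1.0  # no shared k-mers: report the maximum distance
    return max(0.0, -math.log(2.0 * j / (1.0 + j)) / k)


# Toy sequences, for illustration only (real analyses use much larger k, e.g. 21)
a = kmers("ACGTAC", 3)
b = kmers("ACGTTT", 3)
j = jaccard(a, b)
print(f"Jaccard: {j:.3f}, distance: {mash_distance(j, 3):.3f}")
```

Computing exact k-mer sets scales with genome size, which is the cost that MinHash sketching is designed to avoid.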
More: Shared tools and scripts
More: Software for the ages
From: The anatomy of successful computational biology software