Appendix V. Software and Tools

0) Process files in different format

(0.1) Sequence

(0.2) Alignment

(0.3) Interval

1) Homolog analysis

  • blast: a convenient web tool
  • blat: a BLAST-like tool
  • mmseqs: a more modern homology search tool than blast; recommended for large-scale local computation
  • diamond: a homology search tool for proteins
  • hmmer: profile HMM-based search for protein and nucleotide sequences
  • infernal: profile SCFG-based search for structured noncoding RNAs
  • hh-suite: profile HMM to profile HMM alignment

(1.3) Multiple sequence alignment

2) Genome Browsers

See more in our Tutorial

3) DNA-seq

(3.1) Mapping and QC

(3.2) Variant Calling

(3.3) Assembly

De novo assembly software
  • SPAdes
    • the sub-utility metaSPAdes is designed for metagenome assembly
  • megahit: designed for metagenome assembly

(3.4) CNV

(3.5) SV (structural variation)

4) RNA-seq

(4.1) RNA-seq

(4.2) Single Cell RNA-seq (scRNA-seq)

  • awesome-single-cell: a collection of single cell analysis tools
  • seurat: a widely used R package
  • scanpy: a widely used python package
  • monocle: Trajectory analysis
  • cellphonedb: Cell-cell interaction analysis
  • scenic: Transcriptional regulatory network
  • Tutorials
    • https://bioconductor.org/books/release/OSCA/
    • https://github.com/theislab/single-cell-tutorial

| Software name | Developer | Price structure | Platform-specific | Relevant stages of experiment |
|---|---|---|---|---|
|  | 10X Genomics | Free download | 10X Chromium | Raw read alignment, QC and matrix generation for scRNA-seq and ATAC-seq; data normalization; dimensionality reduction and clustering |
|  | 10X Genomics | Free download | 10X Chromium | Visualization and analysis |
|  | Partek | License | No | Complete data analysis and visualization pipeline for scRNA-seq data |
|  | Qlucore | License | No | scRNA-seq data filtering, dimensionality reduction and clustering, visualization |
|  | Takara Bio | Free download | Takara ICell8 | Raw read alignment and matrix generation for scRNA-seq |
|  | Takara Bio | Free download | Takara ICell8 | Clustering and analysis of mappa data |
|  | Fluidigm | Free download | Fluidigm C1 or Biomark | Analysis and visualization of differential gene expression data for scRNA-seq |
| SeqGeq | FlowJo/BD Biosciences | License | No | Data normalization and QC, dimensionality reduction and clustering, analysis and visualization |
|  | Seven Bridges/BD Biosciences | License | BD Rhapsody and Precise | Cloud-based raw read alignment, QC and matrix generation |
|  | Mission Bio | Free download | Mission Bio Tapestri | Analysis of single-cell genomics data |
|  | Illumina | License | Illumina SureCell libraries | Raw read alignment and matrix generation |
|  | Qiagen | License | No | Raw read alignment, QC and matrix generation, dimensionality reduction and clustering |

(4.3) Assembly

  • Trinity: transcript assembly from RNA-seq data

5) Interactome

(5.1) ChIP-seq

  • MACS: peak calling
  • homer: peak calling, motif finding, etc
  • ChIPseeker: visualization and annotation

(5.2) CLIP-seq

(5.3) Motif analysis

sequence
  1. MEME: motif-based sequence analysis tools http://meme-suite.org/
  2. HOMER: software for motif discovery and next-gen sequencing analysis http://homer.ucsd.edu/homer/motif/
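
Sequence motifs of the kind these tools report are commonly represented as position weight matrices (PWMs). The block below is a minimal sketch of that idea, not the algorithm of MEME or HOMER: it builds a log-odds PWM from a few aligned sites and scans a sequence for the best-scoring window. The function names (`build_pwm`, `best_hit`), the uniform background of 0.25, the pseudocount of 1 and the example sites are all assumptions made for illustration.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length aligned sites (hypothetical helper)."""
    width = len(sites[0])
    pwm = []
    for pos in range(width):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        # Log2 ratio of the observed frequency to a uniform background (assumed).
        pwm.append({b: math.log2((counts[b] / total) / background) for b in "ACGT"})
    return pwm


def best_hit(pwm, seq):
    """Return (score, start) of the highest-scoring window in seq."""
    width = len(pwm)
    scored = [
        (sum(pwm[i][seq[start + i]] for i in range(width)), start)
        for start in range(len(seq) - width + 1)
    ]
    return max(scored)


# Toy aligned sites, made up for this example.
sites = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]
pwm = build_pwm(sites)
score, start = best_hit(pwm, "GGCGTATAATGCGC")
print(start, round(score, 2))
```

The real tools go further: they also have to discover the sites in the first place and assess statistical significance, which this sketch does not attempt.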
structure
  1. RNApromo: computational prediction of RNA structural motifs involved in post-transcriptional regulatory processes https://genie.weizmann.ac.il/pubs/rnamotifs08/
  2. GraphProt: modeling binding preferences of RNA-binding proteins http://www.bioinf.uni-freiburg.de/Software/GraphProt/

6) Epigenetic Data

(6.1) ChIP-seq

(6.2) DNase-seq

(6.3) ATAC-seq

7) Microbe data analysis

  • kraken2: fast k-mer based classification of metagenomic reads
  • metaphlan: marker gene based estimation of microbial taxonomic abundance
  • motu: marker gene based estimation of microbial taxonomic abundance
  • maxbin: binning contigs into metagenome-assembled genomes (MAGs)
  • mash: rapid estimation of distances between genomes
  • drep: picks representative genomes from sample-wise assemblies
  • prodigal: prokaryotic gene prediction
  • prokka: pipeline for prokaryotic genome annotation
  • qiime2: 16S amplicon sequencing data analysis
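
To give a feel for how a genome distance can be estimated from k-mers, as mash does, here is a toy sketch. It hashes k-mers, keeps the smallest hash values as a MinHash sketch, estimates the Jaccard index from the two sketches, and converts it with the distance formula published for Mash, D = -(1/k) ln(2j / (1 + j)). Everything else is an assumption for illustration: the function names are hypothetical, SHA-1 stands in for the hash function, only the forward strand is used, and `k=4`, `s=8` are far smaller than any realistic setting.

```python
import hashlib
import math


def kmer_set(seq, k):
    """All distinct k-mers of seq (forward strand only, for simplicity)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def bottom_sketch(kmers, s):
    """Keep the s smallest hash values of the k-mers (a bottom-s MinHash sketch)."""
    hashes = {int(hashlib.sha1(km.encode()).hexdigest(), 16) for km in kmers}
    return sorted(hashes)[:s]


def jaccard_estimate(sketch_a, sketch_b, s):
    """Estimate the Jaccard index from the s smallest hashes of the merged sketches."""
    set_a, set_b = set(sketch_a), set(sketch_b)
    merged = sorted(set_a | set_b)[:s]
    if not merged:
        return 0.0
    shared = sum(1 for h in merged if h in set_a and h in set_b)
    return shared / len(merged)


def mash_like_distance(seq_a, seq_b, k=4, s=8):
    """Toy distance in the style of Mash; k and s are illustrative values only."""
    j = jaccard_estimate(
        bottom_sketch(kmer_set(seq_a, k), s),
        bottom_sketch(kmer_set(seq_b, k), s),
        s,
    )
    if j == 0:
        return 1.0  # no shared hashes: report the maximum distance
    if j == 1:
        return 0.0  # identical sketches
    return -math.log(2 * j / (1 + j)) / k


print(mash_like_distance("ACGTTGCAACGTAGCTAGCT", "ACGTTGCAACGTAGCTAGCA"))
```

Because only a fixed number of hashes per genome is compared, the cost of a comparison does not grow with genome size, which is what makes this style of estimate fast.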

More: Shared tools and scripts

More: Software for the ages

| Software | Purpose | Creators | Key capabilities | Year released | Citations (a) |
|---|---|---|---|---|---|
| BLAST | Sequence alignment | Stephen Altschul, Warren Gish, Gene Myers, Webb Miller, David Lipman | First program to provide statistics for sequence alignment, combination of sensitivity and speed | 1990 | 35,617 |
| R | Statistical analyses | Robert Gentleman, Ross Ihaka | Interactive statistical analysis, extendable by packages | 1996 | N/A |
| ImageJ | Image analysis | Wayne Rasband | Flexibility and extensibility | 1997 | N/A |
| Cytoscape | Network visualization and analysis | Trey Ideker et al. | Extendable by plugins | 2003 | 2,374 |
| Bioconductor | Analysis of genomic data | Robert Gentleman et al. | Built on R, provides tools to enhance reproducibility of research | 2004 | 3,517 |
| Galaxy | Web-based analysis platform | Anton Nekrutenko, James Taylor | Provides easy access to high-performance computing | 2005 | 309 (b) |
| MAQ | Short-read mapping | Heng Li, Richard Durbin | Integrated read mapping and SNP calling, introduced mapping quality scores | 2008 | 1,027 |
| Bowtie | Short-read mapping | Ben Langmead, Cole Trapnell, Mihai Pop, Steven Salzberg | Fast alignment allowing gaps and mismatches based on Burrows-Wheeler Transform | 2009 | 1,871 |
| TopHat | RNA-seq read mapping | Cole Trapnell, Lior Pachter, Steven Salzberg | Discovery of novel splice sites | 2009 | 817 |
| BWA | Short-read mapping | Heng Li, Richard Durbin | Fast alignment allowing gaps and mismatches based on Burrows-Wheeler Transform | 2009 | 1,556 |
| Circos | Data visualization | Martin Krzywinski et al. | Compact representation of similarities and differences arising from comparison between genomes | 2009 | 431 |
| SAMtools | Short-read data format and utilities | Heng Li, Richard Durbin | Storage of large nucleotide sequence alignments | 2009 | 1,551 |
| Cufflinks | RNA-seq analysis | Cole Trapnell, Steven Salzberg, Barbara Wold, Lior Pachter | Transcript assembly and quantification | 2010 | 710 |
| IGV | Short-read data visualization | James Robinson et al. | Scalability, real-time data exploration | 2011 | 335 |

N/A, paper not available in Web of Science.