Bioinformatics Tutorial
4.Motif


Last updated 2 years ago


Table of Contents

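  • The term "motif" generally refers to a local sequence pattern that occurs repeatedly across a set of protein or nucleic acid sequences. In this tutorial, "motif" refers mainly to motifs in nucleic acid sequences.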

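  • In nucleic acid sequences, the binding sites of protein regulatory factors (that is, transcription factors and RNA-binding proteins) often follow a characteristic sequence pattern. In most cases, the goal of nucleic acid motif analysis is to model the sequence preference of a regulatory factor's binding.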

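  • A common way to describe such a pattern is to assume a "fixed length, ungapped motif": the motif consists of a fixed number of consecutive nucleotides. Under this assumption, the motif can be modeled by the frequency of each of the 4 nucleotides at each position, i.e. a position frequency matrix (PFM). In this tutorial we call the PFM a nucleic acid "sequence motif". The PFM, like the position weight matrix (PWM) derived from it, is a drastic simplification of reality, but it is also the model most commonly used in practice.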

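  • RNA folds into complex three-dimensional structures, and RNA structure directly affects how the RNA interacts with protein factors. Sometimes the same RNA sequence can be bound by an RNA-binding protein (RBP) only in a particular structural context, and some RBPs mainly recognize a structural pattern of the RNA rather than a sequence pattern. To address this, several tools have been developed that use models more complex than the PFM and take RNA structural features into account, with the aim of better describing the pattern an RBP recognizes. In this tutorial we refer to these models collectively as RNA "structure motifs".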

  • This chapter introduces methods for analyzing both sequence motifs and structure motifs.
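
To make the PFM idea above concrete, here is a minimal Python sketch using only the standard library. It builds a PFM from a few aligned binding sites, converts it to a log-odds PWM, and scans a sequence for the best-matching window. The example sites, the function names (`build_pfm`, `pfm_to_pwm`, `best_match`), the pseudocount of 0.5, and the uniform background of 0.25 are all illustrative assumptions, not part of this tutorial's own pipeline.

```python
import math

ALPHABET = "ACGT"


def build_pfm(sites, pseudocount=0.0):
    """Build a position frequency matrix from aligned, equal-length sites.

    Returns a list with one dict per motif position, mapping each
    nucleotide to its (optionally pseudocount-smoothed) frequency.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("a fixed-length, ungapped motif needs equal-length sites")
    total = len(sites) + pseudocount * len(ALPHABET)
    pfm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        pfm.append({nt: (column.count(nt) + pseudocount) / total for nt in ALPHABET})
    return pfm


def pfm_to_pwm(pfm, background=0.25):
    """Convert frequencies to log2 odds against a uniform background.

    Assumes every frequency is > 0, so build the PFM with a pseudocount.
    """
    return [{nt: math.log2(freq / background) for nt, freq in column.items()}
            for column in pfm]


def best_match(pwm, sequence):
    """Return (start, score) of the highest-scoring window in `sequence`.

    Assumes `sequence` contains only A, C, G, T and is at least motif-length.
    """
    width = len(pwm)
    scored = [
        (sum(pwm[i][sequence[start + i]] for i in range(width)), start)
        for start in range(len(sequence) - width + 1)
    ]
    score, start = max(scored)
    return start, score


# Hypothetical aligned binding sites, made up for illustration only.
sites = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]

pfm = build_pfm(sites)                          # raw frequencies
pwm = pfm_to_pwm(build_pfm(sites, pseudocount=0.5))
start, score = best_match(pwm, "GGCTATAATGC")
print(pfm[2])        # per-nucleotide frequencies at the third motif position
print(start, score)  # best window start (0-based) and its log-odds score
```

The pseudocount keeps every frequency above zero so that the logarithm is defined; without it, a single unseen nucleotide would give a window a score of negative infinity. Real motif tools use more careful background models and significance estimates than this sketch does.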

Teaching Video

  • see Videos in the

4.1.Sequence Motif
4.2.Structure Motif
Files needed