Appendix I. Keep Learning
⭐: must read ✨: recommended
for Textbooks and Education Papers
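Desk references for selective reading
✨ Bioinformatics 101 textbooks
《生物信息学》 (Bioinformatics), edited by Fan Longjiang (樊龙江)
《生物信息学》 (Bioinformatics), edited by Li Xia (李霞), Lei Jianbo (雷健波), Li Yixue (李亦学), et al.
Read and practice as needed
It is best to learn and practice three basic techniques (meeting either requirement is enough: 1. a program of more than 1,000 lines; 2. a recognized certificate, for example an official one from an online course):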
R (or MATLAB)
Python (or Perl)
Linux (Editor (e.g. VIM) and Shell Script (e.g. bash))
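⭐ 《Learn Python The Hard Way》 (Chinese edition: 《笨办法学 Python》) OR 《Beginning Perl for Bioinformatics》
Linux recommended chapters:
Chapter 5: 5.3.1 man page; Chapter 6: 6.1 Users and groups; 6.2 Linux file permission concepts; 6.3 Linux directory layout
Chapter 7: 7.1 Directories and paths; 7.2 File and directory management; 7.3 Viewing file contents; 7.5 Searching for commands and files; 7.6 How permissions relate to commands; Chapter 8: 8.2 Basic file system operations
Chapter 9: 9.1 Purpose and techniques of file compression; 9.2 Common compression commands on Linux; 9.3 The archiving command: tar
Chapter 10: The vim editor
Chapter 11: Getting to know and learning bash; Chapter 12: Regular expressions and file formatting; Chapter 13: Learning shell scripts
Chapter 25: Linux backup strategies: 25.2.2 Differential backups of a full backup; 25.3 VBird's (鸟哥) backup strategy; 25.4 Disaster recovery considerations; 25.5 Key points review
Linux key topics: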
Editor (e.g. VIM)
Shell Script (e.g. bash)
《Principles of Biostatistics》 by Marcello Pagano, Kimberlee Gauvreau
This is a list of explanatory papers that have appeared as primers in the Computational Biology section of the journal Nature Biotechnology, in reverse chronological order. (Last addition November 2013 / checked March 2016).
— Nature Biotechnology
The anatomy of successful computational biology software
(Stephen Altschul, Barry Demchak, Richard Durbin, Robert Gentleman, Martin Krzywinski, Heng Li, Anton Nekrutenko, James Robinson, Wayne Rasband, James Taylor & Cole Trapnell)
October 2013, Vol 31, No 10; pp 894 - 897
Understanding genome browsing
(Melissa S Cline & W James Kent)
February 2009, Vol 27, No 2; pp 153 - 155
How does multiple testing correction work?
(William S Noble)
December 2009, Vol 27, No 12; pp 1135 - 1137
What is Bayesian statistics?
(Sean R Eddy)
September 2004, Volume 22, No 9; pp 1177 - 1178
How to map billions of short reads onto genomes
(Cole Trapnell & Steven L Salzberg)
May 2009, Vol 27, No 5; pp 455 - 457
Where did the BLOSUM62 alignment score matrix come from?
(Sean R Eddy)
August 2004, Volume 22, No 8; pp 1035 - 1036
What is dynamic programming?
(Sean R Eddy)
July 2004, Volume 22, No 7; pp 909 - 910
How do RNA folding algorithms work?
(Sean R Eddy)
November 2004, Volume 22, No 11; pp 1457 - 1458
What is a hidden Markov model?
(Sean R Eddy)
October 2004, Volume 22, No 10; pp 1315 - 1316
What is the expectation maximization algorithm?
(Chuong B Do & Serafim Batzoglou)
August 2008, Volume 26, No 8; pp 897 - 899
What are decision trees?
(Carl Kingsford & Steven L Salzberg)
September 2008, Volume 26, No 9; pp 1011 - 1013
What is a support vector machine?
(William S Noble)
December 2006, Volume 24, No 12; pp 1565 - 1567
Inference in Bayesian networks
(Chris J Needham, James R Bradford, Andrew J Bulpitt & David R Westhead)
January 2006, Volume 24, No 1; pp 51 - 53
What are artificial neural networks?
(Anders Krogh)
February 2008, Volume 26, No 2; pp 195 - 197
How does gene expression clustering work?
(Patrik D'haeseleer)
December 2005, Volume 23, No 12; pp 1499 - 1501
What is principal component analysis?
(Markus Ringnér)
March 2008, Volume 26, No 3; pp 303 - 304
What are DNA sequence motifs?
(Patrik D'haeseleer)
April 2006, Volume 24, No 4; pp 423 - 425
How does DNA sequence motif discovery work?
(Patrik D'haeseleer)
August 2006, Volume 24, No 8; pp 959 - 961
How to apply de Bruijn graphs to genome assembly
(Phillip E C Compeau, Pavel A Pevzner & Glenn Tesler)
November 2011, Vol 29, No 11; pp 987 - 991
How does eukaryotic gene prediction work?
(Michael R Brent)
August 2007, Volume 25, No 8; pp 883 - 885
Analyzing 'omics data using hierarchical models
(Hongkai Ji & X Shirley Liu)
April 2010, Vol 28, No 4; pp 337 - 340
What is flux balance analysis?
(Jeffrey D Orth, Ines Thiele & Bernhard Ø Palsson)
March 2010, Vol 28, No 3; pp 245 - 248
How to visually interpret biological data using networks
(Daniele Merico, David Gfeller & Gary D Bader)
October 2009, Vol 27, No 10; pp 921 - 924
SNP imputation in association studies
(Eran Halperin & Dietrich A Stephan)
April 2009, Vol 27, No 4; pp 349 - 351
Maximizing power in association studies
(Eran Halperin & Dietrich A Stephan)
March 2009, Vol 27, No 3; pp 255 - 256
How do shotgun proteomics algorithms identify proteins?
(Edward M Marcotte)
July 2007, Volume 25, No 7; pp 755 - 757
Several captions have been used to indicate educationally relevant papers in PLoS Computational Biology. Here we have collected some other papers. — PLoS Computational Biology
Getting Started in Computational Immunology
(Kleinstein SH)
PLoS Comput Biol (2008) 4(8): e1000128
Getting Started in Gene Orthology and Functional Analysis
(Fang G, Bhardwaj N, Robilotto R, Gerstein MB)
PLoS Comput Biol (2010) 6(3): e1000703
Getting Started in Biological Pathway Construction and Analysis
(Viswanathan GA, Seto J, Patil S, Nudelman G, Sealfon SC)
PLoS Comput Biol (2008) 4(2): e16
Getting Started in Structural Phylogenomics
(Sjölander K)
PLoS Comput Biol (2010) 6(1): e1000621
Getting Started in Text Mining
(Cohen KB, Hunter L)
PLoS Comput Biol (2008) 4(1): e20
Getting Started in Text Mining: Part Two
(Rzhetsky A, Seringhaus M, Gerstein MB)
PLoS Comput Biol (2009) 5(7): e1000411
Getting Started in Probabilistic Graphical Models
(Airoldi EM)
PLoS Comput Biol (2007) 3(12): e252
Getting Started in Computational Mass Spectrometry-Based Proteomics
(Vitek O)
PLoS Comput Biol (2009) 5(5): e1000366
Getting Started in Gene Expression Microarray Analysis
(Slonim DK, Yanai I)
PLoS Comput Biol (2009) 5(10): e1000543
Getting Started in Tiling Microarray Analysis
(Liu XS)
PLoS Comput Biol (2007) 3(10): e183
edited based on Xiaofan Liu's list
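Mathematical foundations (review according to your own background)
《高等数学》 (Advanced Mathematics)
《线性代数》 (Linear Algebra)
《数理统计与概率论》 (Mathematical Statistics and Probability Theory)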
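Introductory books (choose one of the first two for close reading; the second is recommended for those with a strong math background)
《机器学习》 (Machine Learning), by Zhou Zhihua (周志华) (★★★ recommended)
《统计学习方法》 (Statistical Learning Methods), by Li Hang (李航) (★★★ recommended)
《多元统计分析》 (Multivariate Statistical Analysis), by He Xiaoqun (何晓群)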
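Python programming books
《Python机器学习基础教程》 (Introduction to Machine Learning with Python), by Andreas C. Müller (Germany) and Sarah Guido (USA), translated by Zhang Liang (张亮, hysic) (★★★ recommended)
《Python高性能编程》 (High Performance Python), by Micha Gorelick, Ian Ozsvald ...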
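Deep learning books (optional reading for students who want a stronger grasp of the mathematics behind the models and plan to study deep learning further)
《深度学习》 (Deep Learning), by Ian Goodfellow (USA), Yoshua Bengio (Canada), Aaron ... (Canada) (★★★ recommended)
《Pattern Recognition and Machine Learning》 (PRML; Chinese edition: 《模式识别与机器学习》), by Christopher M. Bishop
《Machine Learning: A Probabilistic Perspective》 (MLAPP; Chinese edition: 《机器学习:从概率的视角分析》), by Kevin P. Murphy
Note: PRML and MLAPP are both relatively difficult.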
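Deep learning programming and practice books (tool references, not required reading)
《Keras深度学习实战》 (Deep Learning with Keras), by Antonio Gulli (Italy)
《深度学习入门之PyTorch》 (Introduction to Deep Learning with PyTorch), by Liao Xingyu (廖星宇)
《深度学习框架PyTorch快速开发与实战》 (The PyTorch Deep Learning Framework: Rapid Development and Practice), by Xing Menglai (邢梦来), Wang Shuo (王硕), Sun Yangyang (孙洋洋)
《TensorFlow实战》 (TensorFlow in Practice), by Huang Wenjian (黄文坚), Tang Yuan (唐源)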
Introductory machine learning courses
Deep learning courses
⭐ Quick R OR 《R in Action》 (Chinese edition: 《R语言实战》)
⭐ 《》 (recommended chapters)
by Nature
(Peking University @MOOC)
(UC San Diego @coursera)
(DragonStar Course @github)
⭐ 《》 by Vince Buffalo
✨ 《: Probabilistic Models of Proteins and Nucleic Acids》 by Richard Durbin, Sean R. Eddy, Anders Krogh, Graeme Mitchison
✨ -- Zhou Zhihua (周志华)
✨ Machine Learning by Andrew Ng (吴恩达) (CS229)
(review selectively according to your own background)
Machine Learning by Andrew Ng (吴恩达) (CS229) (★★★ recommended)
Deep Learning by Andrew Ng (吴恩达) (CS230) (★★★ recommended)
(★★★ recommended)