How to use Docker in this tutorial:
1) Enter the container: `docker exec -it bioinfo_tsinghua bash`
2) Carry out the relevant Linux operations
3) Exit the container: `exit`
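The corresponding Singularity image is `/data/images/bioinfo_tsinghua.simg`; for details, see the cluster and Singularity usage instructions. Note: in Docker, the indexed genome files are stored in `/home/test/mapping/BowtieIndex` and `/home/test/mapping/bowtie-src/indexes`.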
`.fq`: The FASTQ format stores sequences and Phred qualities in a single file. It is concise and compact. FASTQ was first widely used at the Sanger Institute, so the Sanger specification is usually taken as the standard FASTQ format, or simply the FASTQ format. Although Solexa/Illumina read files look much like FASTQ, their qualities are scaled differently. If the quality string contains a character with an ASCII code higher than 90, the file is probably in the Solexa/Illumina format.
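The ASCII-code rule of thumb above can be checked with a short shell sketch. It assumes a FASTQ file with exactly four lines per record; the file name `example.fq`, its contents, and the function name `fastq_max_qual` are hypothetical and only serve as an illustration.

```shell
# Write a tiny, made-up FASTQ file (4 lines per record) to test with.
cat > example.fq <<'EOF'
@read1
ACGTACGT
+
IIIIHHGF
@read2
TTGACCAA
+
hhhhgffe
EOF

# Print the highest ASCII code found in the quality lines (every 4th line).
fastq_max_qual() {
    awk 'BEGIN { for (i = 33; i < 127; i++) code[sprintf("%c", i)] = i }
         NR % 4 == 0 {
             n = length($0)
             for (j = 1; j <= n; j++) {
                 c = code[substr($0, j, 1)]
                 if (c > max) max = c
             }
         }
         END { print max + 0 }' "$1"
}

max=$(fastq_max_qual example.fq)
if [ "$max" -gt 90 ]; then
    echo "max ASCII code $max: probably Solexa/Illumina-scaled qualities"
else
    echo "max ASCII code $max: consistent with Sanger FASTQ"
fi
```

A low maximum does not prove the file is Sanger-encoded, since a Solexa/Illumina file with only low-quality reads could also stay below the threshold; the check is a heuristic in the same spirit as the text.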
.sam
.bed
The operations in this section are carried out in `/home/test/mapping/`. The options used are:
`-v`: report end-to-end hits with at most v mismatches; ignore qualities
`-m`: suppress all alignments if more than m exist (default: no limit)
`-M`: like `-m`, but reports 1 random hit (MAPQ=0) (requires `--best`)
`--best`: hits guaranteed best stratum; ties broken by quality
`--strata`: hits in sub-optimal strata aren't reported (requires `--best`)
`-f`: raw reads file (FASTA)
`-q`: raw reads file (FASTQ)
`-S`: output file name, in `.sam` format
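As a sketch of how these options combine on a command line, assuming they belong to the bowtie (version 1) aligner that the index paths point to: the read files `reads.fa` and `reads.fq`, the output names, and the values `-v 2`, `-v 1` and `-m 10` are illustrative assumptions. The index paths are relative to `/home/test/mapping/`.

```shell
# Hypothetical inputs; replace with real read files.
fasta_reads="reads.fa"
fastq_reads="reads.fq"

# FASTA reads against the yeast index: allow up to 2 mismatches (-v 2),
# drop reads with more than 10 valid alignments (-m 10), and report only
# alignments from the best stratum (--best --strata).
map_fasta() {
    bowtie -v 2 -m 10 --best --strata BowtieIndex/YeastGenome \
        -f "$fasta_reads" -S reads_fa.sam
}

# FASTQ reads against the E. coli index, allowing up to 1 mismatch.
map_fastq() {
    bowtie -v 1 -m 10 --best --strata bowtie-src/indexes/e_coli \
        -q "$fastq_reads" -S reads_fq.sam
}

# Only run when bowtie and the input files are actually available.
if command -v bowtie >/dev/null 2>&1 && [ -f "$fasta_reads" ] && [ -f "$fastq_reads" ]; then
    map_fasta
    map_fastq
else
    echo "bowtie or the input files were not found; nothing was run"
fi
```

Here `-f` and `-q` only declare the input format, and `-S` only switches the output to SAM; the read file and the output file are positional arguments that follow them.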
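Convert the `.sam` file to `.bed` format to make later visualization easier. When uploading the `.bed` file to a Genome Browser, if the file is too large or the MT chromosome is not recognized, the following approach can be used: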
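One possible way to do both steps is sketched here; it is not necessarily the tutorial's own method. It assumes ungapped single-end alignments (so a read's end coordinate is its start plus its length) and assumes the mitochondrial chromosome is named `chrmt` in the index. The file names, the sample records, the function name `sam_to_bed`, and the 1000-line cutoff are made up for illustration.

```shell
# A tiny, made-up SAM file: one header line and three single-end records
# (tab-separated; flag 0 = forward strand, 16 = reverse strand, 4 = unmapped).
printf '@SQ\tSN:chrI\tLN:230218\n' > example.sam
printf 'r1\t0\tchrI\t101\t255\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII\n' >> example.sam
printf 'r2\t16\tchrmt\t51\t255\t8M\t*\t0\t0\tTTGACCAA\tIIIIIIII\n' >> example.sam
printf 'r3\t4\t*\t0\t0\t*\t*\t0\t0\tGGGGCCCC\tIIIIIIII\n' >> example.sam

# SAM -> BED6: skip header lines and unmapped reads, shift the 1-based SAM
# position to a 0-based BED start, and derive the strand from flag bit 16.
sam_to_bed() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^@/ { next }
         int($2 / 4) % 2 == 1 { next }
         {
             start = $4 - 1
             strand = (int($2 / 16) % 2 == 1) ? "-" : "+"
             print $3, start, start + length($10), $1, $5, strand
         }' "$1"
}

sam_to_bed example.sam > example.bed

# For the Genome Browser: drop mitochondrial records, then keep at most
# the first 1000 lines so the upload stays small.
grep -v '^chrmt' example.bed | head -n 1000 > example_browser.bed
cat example_browser.bed
```

Taking only the first lines gives a quick preview rather than a representative sample; a different subset may be more useful depending on which region is being inspected.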
`--genomeDir`: path to the genome directory where the genome indices were generated.
`--readFilesIn`: name(s) (with path) of the files containing the sequences to be mapped (e.g. RNA-seq FASTQ files). If using Illumina paired-end reads, both the read1 and read2 files have to be supplied.
`--outFileNamePrefix`: prefix (optionally including a path) for all output files; by default they are written in the current directory.
`--outSAMtype BAM SortedByCoordinate`: output a BAM file sorted by coordinate, similar to the `samtools sort` command.
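A sketch of how these options fit together in one alignment command, assuming they are options of the STAR aligner, whose option names they match. The index directory `star_index`, the paired read files, the output prefix, and the thread count are hypothetical placeholders.

```shell
# Hypothetical paths; adjust to the real index and read files.
genome_dir="star_index"
read1="sample_1.fastq"
read2="sample_2.fastq"
out_prefix="star_out/sample_"

run_star() {
    # The directory part of the prefix has to exist before STAR writes to it.
    mkdir -p "$(dirname "$out_prefix")"
    STAR --runThreadN 4 \
        --genomeDir "$genome_dir" \
        --readFilesIn "$read1" "$read2" \
        --outFileNamePrefix "$out_prefix" \
        --outSAMtype BAM SortedByCoordinate
}

# Only run when STAR, the index, and both read files are available.
if command -v STAR >/dev/null 2>&1 && [ -d "$genome_dir" ] && [ -f "$read1" ] && [ -f "$read2" ]; then
    run_star
else
    echo "STAR, the index, or the read files were not found; nothing was run"
fi
```

With this prefix, the sorted alignments would be written to `star_out/sample_Aligned.sortedByCoord.out.bam`.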
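Submit the code for each step and report the number of lines in the output file of the last step. The files needed for the homework are listed in 0) Files Needed.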
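1) Map `THA2.fa` to `BowtieIndex/YeastGenome` to obtain `THA2.sam`.
2) Map `e_coli_500.fq` to `bowtie-src/indexes/e_coli` to obtain `e_coli_500.sam`.
3) Convert the results to `.bed` files.
4) From `THA2.bed`, select …

"Illumina's market data is truly a beautiful thing," said Daniel MacArthur, a genetics researcher who runs a personal blog. "It is so clean that it is astonishing." Even with Illumina's price-to-book ratio as high as 84, Goldman Sachs still recommends buying, saying the company is very likely to keep its leading position in DNA sequencing.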