

Introduction to fastp

Contents
1. Introduction
2. Key parameters
   2.1 All parameters
   2.2 Usage examples
   2.3 Key parameters in detail
       (1) UMI removal
       (2) Quality filtering
       (3) Length filtering
       (4) Low-complexity filtering
       (5) Adapter trimming
       (6) Per-read cutting by quality
       (7) polyG/polyX
       (8) Base correction for PE data
       (9) Global trimming
       Order of read filtering
       (10) Output splitting
       (11) Overrepresented sequence analysis
3. Notes on the QC report files
   3.1 Summary
   3.2 Adapter
   3.3 Insert size estimation
   3.4 Before filtering
4. References

1. Introduction

Before 2018, FASTQ quality control was usually done with tools such as FastQC, adapter removal with cutadapt, and read filtering with tools such as Trimmomatic.
Later, the CTO of HaploX (海普洛斯) developed a new tool, fastp, which combines these three functions in one program and is fast, accurate, and highly configurable.

2. Key parameters

2.1 All parameters

    fastp: an ultra-fast all-in-one FASTQ preprocessor
    version 0.19.5
    usage: fastp [options] ...
    options:
      -i, --in1                          read1 input file name (string [])
      -o, --out1                         read1 output file name (string [])
      -I, --in2                          read2 input file name (string [])
      -O, --out2                         read2 output file name (string [])
      -6, --phred64                      indicate the input is using phred64 scoring (it'll be converted to phred33, so the output will still be phred33)
      -z, --compression                  compression level for gzip output (1 ~ 9). 1 is fastest, 9 is smallest, default is 4. (int [4])
          --stdin                        input from STDIN. If the STDIN is interleaved paired-end FASTQ, please also add --interleaved_in.
          --stdout                       stream passing-filters reads to STDOUT. This option will result in interleaved FASTQ output for paired-end input. Disabled by default.
          --interleaved_in               indicate that in1 is an interleaved FASTQ which contains both read1 and read2. Disabled by default.
          --reads_to_process             specify how many reads/pairs to be processed. Default 0 means process all reads. (int [0])
          --dont_overwrite               don't overwrite existing files. Overwriting is allowed by default.
      -V, --verbose                      output verbose log information (i.e. when every 1M reads are processed).
      -A, --disable_adapter_trimming     adapter trimming is enabled by default. If this option is specified, adapter trimming is disabled
      -a, --adapter_sequence             the adapter for read1. For SE data, if not specified, the adapter will be auto-detected. For PE data, this is used if R1/R2 are found not overlapped. (string [auto])
          --adapter_sequence_r2          the adapter for read2 (PE data only). This is used if R1/R2 are found not overlapped. If not specified, it will be the same as adapter_sequence (string [auto])
          --detect_adapter_for_pe        by default, the auto-detection for adapter is for SE data input only, turn on this option to enable it for PE data.
      -f, --trim_front1                  trimming how many bases in front for read1, default is 0 (int [0])
      -t, --trim_tail1                   trimming how many bases in tail for read1, default is 0 (int [0])
      -b, --max_len1                     if read1 is longer than max_len1, then trim read1 at its tail to make it as long as max_len1. Default 0 means no limitation (int [0])
      -F, --trim_front2                  trimming how many bases in front for read2. If it's not specified, it will follow read1's settings (int [0])
      -T, --trim_tail2                   trimming how many bases in tail for read2. If it's not specified, it will follow read1's settings (int [0])
      -B, --max_len2                     if read2 is longer than max_len2, then trim read2 at its tail to make it as long as max_len2. Default 0 means no limitation. If it's not specified, it will follow read1's settings (int [0])
      -g, --trim_poly_g                  force polyG tail trimming, by default trimming is automatically enabled for Illumina NextSeq/NovaSeq data
          --poly_g_min_len               the minimum length to detect polyG in the read tail. 10 by default. (int [10])
      -G, --disable_trim_poly_g          disable polyG tail trimming, by default trimming is automatically enabled for Illumina NextSeq/NovaSeq data
      -x, --trim_poly_x                  enable polyX trimming in 3' ends.
          --poly_x_min_len               the minimum length to detect polyX in the read tail. 10 by default. (int [10])
      -5, --cut_by_quality5              enable per read cutting by quality in front (5'), default is disabled (WARNING: this will interfere deduplication for both PE/SE data)
      -3, --cut_by_quality3              enable per read cutting by quality in tail (3'), default is disabled (WARNING: this will interfere deduplication for SE data)
      -W, --cut_window_size              the size of the sliding window for sliding window trimming, default is 4 (int [4])
      -M, --cut_mean_quality             the bases in the sliding window with mean quality below cutting_quality will be cut, default is Q20 (int [20])
      -Q, --disable_quality_filtering    quality filtering is enabled by default. If this option is specified, quality filtering is disabled
      -q, --qualified_quality_phred      the quality value that a base is qualified. Default 15 means phred quality >=Q15 is qualified. (int [15])
      -u, --unqualified_percent_limit    how many percents of bases are allowed to be unqualified (0~100). Default 40 means 40% (int [40])
      -n, --n_base_limit                 if one read's number of N base is >n_base_limit, then this read/pair is discarded. Default is 5 (int [5])
      -L, --disable_length_filtering     length filtering is enabled by default. If this option is specified, length filtering is disabled
      -l, --length_required              reads shorter than length_required will be discarded, default is 15. (int [15])
          --length_limit                 reads longer than length_limit will be discarded, default 0 means no limitation. (int [0])
      -y, --low_complexity_filter        enable low complexity filter. The complexity is defined as the percentage of base that is different from its next base (base[i] != base[i+1]).
      -Y, --complexity_threshold         the threshold for low complexity filter (0~100). Default is 30, which means 30% complexity is required. (int [30])
          --filter_by_index1             specify a file contains a list of barcodes of index1 to be filtered out, one barcode per line (string [])
          --filter_by_index2             specify a file contains a list of barcodes of index2 to be filtered out, one barcode per line (string [])
          --filter_by_index_threshold    the allowed difference of index barcode for index filtering, default 0 means completely identical. (int [0])
      -c, --correction                   enable base correction in overlapped regions (only for PE data), default is disabled
          --overlap_len_require          the minimum length of the overlapped region for overlap analysis based adapter trimming and correction. 30 by default. (int [30])
          --overlap_diff_limit           the maximum difference of the overlapped region for overlap analysis based adapter trimming and correction. 5 by default. (int [5])
      -U, --umi                          enable unique molecular identifier (UMI) preprocessing
          --umi_loc                      specify the location of UMI, can be (index1/index2/read1/read2/per_index/per_read), default is none (string [])
          --umi_len                      if the UMI is in read1/read2, its length should be provided (int [0])
          --umi_prefix                   if specified, an underline will be used to connect prefix and UMI (i.e. prefix=UMI, UMI=AATTCG, final=UMI_AATTCG). No prefix by default (string [])
          --umi_skip                     if the UMI is in read1/read2, fastp can skip several bases following UMI, default is 0 (int [0])
      -p, --overrepresentation_analysis  enable overrepresented sequence analysis.
      -P, --overrepresentation_sampling  one in (--overrepresentation_sampling) reads will be computed for overrepresentation analysis (1~10000), smaller is slower, default is 20. (int [20])
      -j, --json                         the json format report file name (string [fastp.json])
      -h, --html                         the html format report file name (string [fastp.html])
      -R, --report_title                 should be quoted with ' or ", default is "fastp report" (string [fastp report])
      -w, --thread                       worker thread number, default is 2 (int [2])
      -s, --split                        split output by limiting total split file number with this option (2~999), a sequential number prefix will be added to output name ( 0001.out.fq, 0002.out.fq...), disabled by default (int [0])
      -S, --split_by_lines               split output by limiting lines of each file with this option(>=1000), a sequential number prefix will be added to output name ( 0001.out.fq, 0002.out.fq...), disabled by default (long [0])
      -d, --split_prefix_digits          the digits for the sequential number padding (1~10), default is 4, so the filename will be padded as 0001.xxx, 0 to disable padding (int [4])
      -?, --help                         print this message

A post on Jianshu (简书) regroups the parameters by function, which is easier to browse. Some defaults in it (for example compression and thread) differ from the listing above, presumably because it was taken from a different fastp version:

    usage: fastp -i in1 -o out1 [-I in2 -O out2] [options...]
    options:
      # I/O options (input and output files)
      -i, --in1                          read1 input file name (string)
      -o, --out1                         read1 output file name (string [])
      -I, --in2                          read2 input file name (string [])
      -O, --out2                         read2 output file name (string [])
      -6, --phred64                      indicates the input is using phred64 scoring (it'll be converted to phred33, so the output will still be phred33)
      -z, --compression                  compression level for gzip output (1 ~ 9). 1 is fastest, 9 is smallest, default is 2. (int [2])
          --reads_to_process             specify how many reads/pairs to be processed. Default 0 means process all reads. (int [0])

      # adapter trimming options
      -A, --disable_adapter_trimming     adapter trimming is enabled by default. If this option is specified, adapter trimming is disabled
      -a, --adapter_sequence             the adapter for read1. For SE data, if not specified, the adapter will be auto-detected. For PE data, this is used if R1/R2 are found not overlapped. (string [auto])
          --adapter_sequence_r2          the adapter for read2 (PE data only). This is used if R1/R2 are found not overlapped. If not specified, it will be the same as adapter_sequence (string [])

      # global trimming options (number of bases to trim from the start and end of each read)
      -f, --trim_front1                  trimming how many bases in front for read1, default is 0 (int [0])
      -t, --trim_tail1                   trimming how many bases in tail for read1, default is 0 (int [0])
      -F, --trim_front2                  trimming how many bases in front for read2. If it's not specified, it will follow read1's settings (int [0])
      -T, --trim_tail2                   trimming how many bases in tail for read2. If it's not specified, it will follow read1's settings (int [0])

      # polyG tail trimming, useful for NextSeq/NovaSeq data
      -g, --trim_poly_g                  force polyG tail trimming, by default trimming is automatically enabled for Illumina NextSeq/NovaSeq data
          --poly_g_min_len               the minimum length to detect polyG in the read tail. 10 by default. (int [10])
      -G, --disable_trim_poly_g          disable polyG tail trimming, by default trimming is automatically enabled for Illumina NextSeq/NovaSeq data

      # polyX tail trimming
      -x, --trim_poly_x                  enable polyX trimming in 3' ends.
          --poly_x_min_len               the minimum length to detect polyX in the read tail. 10 by default. (int [10])

      # per read cutting by quality options (sliding-window cutting)
      -5, --cut_by_quality5              enable per read cutting by quality in front (5'), default is disabled (WARNING: this will interfere deduplication for both PE/SE data)
      -3, --cut_by_quality3              enable per read cutting by quality in tail (3'), default is disabled (WARNING: this will interfere deduplication for SE data)
      -W, --cut_window_size              the size of the sliding window for sliding window trimming, default is 4 (int [4])
      -M, --cut_mean_quality             the bases in the sliding window with mean quality below cutting_quality will be cut, default is Q20 (int [20])

      # quality filtering options (filter reads by base quality)
      -Q, --disable_quality_filtering    quality filtering is enabled by default. If this option is specified, quality filtering is disabled
      -q, --qualified_quality_phred      the quality value that a base is qualified. Default 15 means phred quality >=Q15 is qualified. (int [15])
      -u, --unqualified_percent_limit    how many percents of bases are allowed to be unqualified (0~100). Default 40 means 40% (int [40])
      -n, --n_base_limit                 if one read's number of N base is >n_base_limit, then this read/pair is discarded. Default is 5 (int [5])

      # length filtering options (filter reads by length)
      -L, --disable_length_filtering     length filtering is enabled by default. If this option is specified, length filtering is disabled
      -l, --length_required              reads shorter than length_required will be discarded, default is 15. (int [15])

      # low complexity filtering
      -y, --low_complexity_filter        enable low complexity filter. The complexity is defined as the percentage of base that is different from its next base (base[i] != base[i+1]).
      -Y, --complexity_threshold         the threshold for low complexity filter (0~100). Default is 30, which means 30% complexity is required. (int [30])

      # filter reads with unwanted indexes (to remove possible contamination)
          --filter_by_index1             specify a file contains a list of barcodes of index1 to be filtered out, one barcode per line (string [])
          --filter_by_index2             specify a file contains a list of barcodes of index2 to be filtered out, one barcode per line (string [])
          --filter_by_index_threshold    the allowed difference of index barcode for index filtering, default 0 means completely identical. (int [0])

      # base correction by overlap analysis options
      -c, --correction                   enable base correction in overlapped regions (only for PE data), default is disabled

      # UMI processing
      -U, --umi                          enable unique molecular identifier (UMI) preprocessing
          --umi_loc                      specify the location of UMI, can be (index1/index2/read1/read2/per_index/per_read), default is none (string [])
          --umi_len                      if the UMI is in read1/read2, its length should be provided (int [0])
          --umi_prefix                   if specified, an underline will be used to connect prefix and UMI (i.e. prefix=UMI, UMI=AATTCG, final=UMI_AATTCG). No prefix by default (string [])
          --umi_skip                     if the UMI is in read1/read2, fastp can skip several bases following UMI, default is 0 (int [0])

      # overrepresented sequence analysis
      -p, --overrepresentation_analysis  enable overrepresented sequence analysis.
      -P, --overrepresentation_sampling  One in (--overrepresentation_sampling) reads will be computed for overrepresentation analysis (1~10000), smaller is slower, default is 20. (int [20])

      # reporting options
      -j, --json                         the json format report file name (string [fastp.json])
      -h, --html                         the html format report file name (string [fastp.html])
      -R, --report_title                 should be quoted with ' or ", default is "fastp report" (string [fastp report])

      # threading options (number of threads)
      -w, --thread                       worker thread number, default is 3 (int [3])

      # output splitting options
      -s, --split                        split output by limiting total split file number with this option (2~999), a sequential number prefix will be added to output name ( 0001.out.fq, 0002.out.fq...), disabled by default (int [0])
      -S, --split_by_lines               split output by limiting lines of each file with this option(>=1000), a sequential number prefix will be added to output name ( 0001.out.fq, 0002.out.fq...), disabled by default (long [0])
      -d, --split_prefix_digits          the digits for the sequential number padding (1~10), default is 4, so the filename will be padded as 0001.xxx, 0 to disable padding (int [4])

      # help
      -?, --help                         print this message

2.2 Usage examples

(1) Simplest usage

    fastp -i in.fq -o out.fq                                             # SE data
    fastp -i in.R1.fq -o out.R1.fq -I in.R2.fq -O out.R2.fq              # PE data
    fastp -i in.R1.fq.gz -I in.R2.fq.gz -o out.R1.fq.gz -O out.R2.fq.gz  # compressed input, compressed output

(2) Example combining commonly used parameters

    # '*' stands for the sample prefix; replace it with the real file name.
    # -i / -I                 raw read1 / read2 FASTQ
    # -o / -O                 processed read1 / read2 output
    # --adapter_sequence      read1 adapter sequence
    # --adapter_sequence_r2   read2 adapter sequence
    # --thread                number of threads
    # --length_required       discard short reads; here anything under 55 bp counts as short
    # --compression           compression level: 1 is fastest, 9 is slowest
    # --trim_poly_g           enable polyG trimming, for Illumina NextSeq and NovaSeq data
    # --cut_by_quality3       enable quality cutting at the 3' end, i.e. the read tail
    # --correction            correct bases using the overlap of the read pair
    # --umi                   the data were sequenced with UMIs
    # --umi_loc per_read      UMI location; per_read means the UMI is in every read
    # --umi_len 5             UMI length in bases
    # --umi_skip 3            after removing the UMI, remove 3 more bp
    # -j                      QC result as JSON, suitable for programs
    # -h                      QC result as an HTML page, suitable for people
    fastp -i *_R1_raw.fastq.gz \
          -I *_R2_raw.fastq.gz \
          -o *_R1_trim.fastq.gz \
          -O *_R2_trim.fastq.gz \
          --adapter_sequence=AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC \
          --adapter_sequence_r2=AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT \
          --thread=4 \
          --length_required=55 \
          --compression=4 \
          --trim_poly_g \
          --cut_by_quality3 \
          --correction \
          --umi \
          --umi_loc per_read \
          --umi_len 5 \
          --umi_skip 3 \
          -j *.trim.fastp.json \
          -h *.trim.fastp.html

2.3 Key parameters in detail

This part is based on reference 4 in section 4, the CSDN blog post on commonly used fastp parameters by 青灯照颦微. Readers are encouraged to read the original post.

(1) UMI removal

A unique molecular identifier (UMI) tags reads that come from the same original molecule and is used for deduplication and error correction. UMIs are common in ctDNA sequencing. In Illumina data the UMI sits in one of two places: in the index, or at the head of the read.

--umi   Enable UMI processing.
--umi_loc   Location of the UMI. Allowed values:
    index1: the first index is the UMI; for PE data it is applied to both R1 and R2
    index2: the second index is the UMI; for PE data it is applied to both R1 and R2
    read1: the head of read1 is the UMI; for PE data it is applied to both R1 and R2
    read2: the head of read2 is the UMI; for PE data it is applied to both R1 and R2
    per_index: index1_index2 is the UMI, applied to both R1 and R2
    per_read: the head of read1 defines umi1 and the head of read2 defines umi2; umi1_umi2 is the UMI, applied to both R1 and R2
--umi_len   Length of the UMI. It must be given when --umi_loc is read1, read2, or per_read.
--umi_prefix   Prefix for the UMI. Example: with UMI=AATTCCGG and prefix=ATC, i.e. --umi_prefix=ATC, the string added to the read name line is ATC_AATTCCGG.
--umi_skip   Number of bases to remove (skip) after the UMI has been removed and appended to the read name. Example: --umi_skip=4 removes 4 more bp after the UMI.

fastp extracts the UMI and appends it to the name line of the corresponding read. If the UMI is in the read, it is removed from the read; if it is in the index, it is kept.

(2) Quality filtering

-q, --qualified_quality_phred   Minimum quality value for a base to count as qualified. The default is 15: bases with quality at least 15 are qualified, and bases below 15 are unqualified.
-u, --unqualified_percent_limit   Maximum percentage of unqualified bases allowed in a read. The default is 40: a read is discarded when more than 40% of its bases are unqualified.
-Q, --disable_quality_filtering   Disable the default quality filtering (-q, -u).

(3) Length filtering

-l, --length_required   Minimum read length. The default is 15: reads shorter than 15 are discarded.
--length_limit   Maximum read length. The default 0 means no upper limit.

(4) Low-complexity filtering

-y, --low_complexity_filter   Enable the low-complexity filter. It is disabled by default.
-Y, --complexity_threshold   Complexity threshold. The default is 30: when the filter is enabled, a read whose complexity is below 30% is discarded.

Complexity is defined as the percentage of bases that differ from the next adjacent base. Example: a 51 bp read with 3 bases that differ from the next base,

    seq = AAAATTTTTTTTTTTTTTTTTTTTTGGGGGGGGGGGGGGGGGGGGGGCCCC
    complexity = 3/(51-1) = 6%

(5) Adapter trimming

-A, --disable_adapter_trimming   Disable the default adapter trimming.
-a, --adapter_sequence   Adapter sequence for SE data, or for R1 of PE data. For single-end (SE) data, fastp can detect the adapter automatically from the tails of roughly the first 1M reads; specifying this option disables the auto-detection.
--adapter_sequence_r2   Adapter sequence for R2 of PE data. For paired-end (PE) data, adapters are removed through the overlap of the two reads. This method is robust, so the adapter sequences usually need not be set. As the help text in section 2.1 states, the specified sequences are used only for pairs where no overlap is found, so overlap-based trimming is attempted first whether or not they are set.
--detect_adapter_for_pe   Adapter auto-detection is off by default for PE data (it is on for SE data). This option turns it on for PE data as well.
--adapter_fasta   FASTA file of adapter sequences. Each sequence in the file must be at least 6 bp long, otherwise it is skipped.

Note: fastp first removes the auto-detected adapter, or the adapters given with --adapter_sequence / --adapter_sequence_r2, and then removes the adapters listed in --adapter_fasta. The distribution of trimmed adapter sequences can be viewed in the HTML/JSON report.

(6) Per-read cutting by quality

The options below cut reads according to the mean quality in a sliding window.

-W, --cut_window_size   Size of the sliding window.
-M, --cut_mean_quality   Mean-quality threshold for the window. Windows below the threshold are cut.

The two ends can be cut separately.

At the 5' end, the method is similar to LEADING in Trimmomatic:
    -5, --cut_front   Remove low-quality bases at the 5' end. The window slides from 5' toward 3'. If the mean quality of the bases in the window is below the threshold, those bases are cut and the window keeps sliding; cutting stops at the first window that reaches the threshold.
    --cut_front_window_size   Size of the window that starts at the 5' end, i.e. how many bases each window contains.
    --cut_front_mean_quality   Mean-quality threshold for the window that starts at the 5' end.

At the 3' end, the options mirror those for the 5' end and the method is similar to TRAILING in Trimmomatic:
    -3, --cut_tail   Remove low-quality bases at the 3' end. The window slides from 3' toward 5'. If the mean quality of the bases in the window is below the threshold, those bases are cut and the window keeps sliding; cutting stops at the first window that reaches the threshold.
    --cut_tail_window_size   Size of the window that starts at the 3' end.
    --cut_tail_mean_quality   Mean-quality threshold for the window that starts at the 3' end.

There is one more cutting option:
    -r, --cut_right   Cut the right part of the read. The difference between -3 and -r: -3 starts at the 3' end and only removes trailing low-quality windows, stopping at the first window that passes. -r slides the window from 5' to 3' and, at the first window whose mean quality is below the threshold, cuts that window and everything to its right. So when --cut_right is used there is no need to set --cut_tail.

The 0.19.5 help text in section 2.1 lists the 5' and 3' options as --cut_by_quality5 and --cut_by_quality3; --cut_front, --cut_tail and --cut_right are the names used by later fastp releases.

(7) polyG/polyX

polyG tails are common in Illumina NextSeq/NovaSeq data, because these platforms use two-color chemistry in which the absence of signal is called as G. fastp can detect and remove polyG tails. This is enabled by default for NextSeq/NovaSeq data, which fastp recognizes from the sequencer ID in the FASTQ.

-g, --trim_poly_g   Enable polyG tail trimming.
--poly_g_min_len   Minimum length of the G tail to trim. The default is 10: a polyG tail of at least 10 bases is removed.
-G, --disable_trim_poly_g   Disable polyG tail trimming.
-x, --trim_poly_x   Enable polyX trimming (polyA, polyT, polyC, polyG). If --trim_poly_g and --trim_poly_x are both set, polyG tails are trimmed first and polyX afterwards. This helps when a polyA tail sits in front of a polyG tail, which is common in mRNA-Seq.

(8) Base correction for PE data

fastp analyzes the overlap of each read pair. If a proper overlap is found and a mismatched base pair in the overlap has one base of high quality and one of very low quality, fastp corrects the low-quality base so that it agrees with the high-quality base of its mate. The quality value of the corrected base is set to the same value as its mate's.

-c, --correction   Enable base correction. It is disabled by default. Correction depends on overlap detection, which has these adjustable parameters:
    --overlap_len_require   Required overlap length. The default is 30: an overlap shorter than 30 bp is treated as no overlap.
    --overlap_diff_limit   Maximum number of mismatches in the overlap. The default is 5: with more than 5 mismatches the pair is treated as not overlapping.
    --overlap_diff_percent_limit   Maximum percentage of mismatches in the overlap. The default is 20: with more than 20% mismatched bases the pair is treated as not overlapping.

(9) Global trimming

Global trimming is mainly for the case where the last cycle, or the last n cycles, of an Illumina run has poor quality. For example, -t 1 or --trim_tail1=1 removes the last 1 bp of every read.

-f, --trim_front1   Remove bases from the start of R1. For example, -f 1 or --trim_front1=1 removes 1 bp at the start of R1.
-t, --trim_tail1   Remove bases from the end of R1. For example, -t 2 or --trim_tail1=2 removes 2 bp at the end of R1.
-b, --max_len1   Maximum length of R1. If R1 is longer than this value, it is trimmed from the tail until it is exactly this long. No trimming by default. Note that the maximum length is applied in the last step.
-F, --trim_front2   Same as for R1. If not set, the R1 setting is used.
-T, --trim_tail2   Same as for R1. If not set, the R1 setting is used.
-B, --max_len2   Maximum length of R2, same as -b. Note that the maximum length is applied in the last step.

Order of read filtering

    1. UMI processing (--umi)
    2. Global trimming at the front (-f, -F)
       # For example, if a UMI sits at the 5' end but its sequence is unknown,
       # --trim_front1 10 --trim_front2 10 can be used to force removal of the inserted sequence
    3. Global trimming at the tail (-t, -T)
    4. Quality cutting at the 5' end (--cut_front)
    5. Sliding-window cutting (--cut_right)
    6. Quality cutting at the 3' end (--cut_tail)
    7. polyG trimming (--trim_poly_g, on by default for NovaSeq/NextSeq data)
    8. Adapter trimming by overlap analysis (PE data)
    9. Adapter trimming by adapter sequence (--adapter_sequence, --adapter_sequence_r2; for PE data this step is skipped if the previous step succeeded)
    10. polyX trimming (--trim_poly_x)
    11. Trimming to the maximum length (--max_len)

(10) Output splitting

The output can be split either into a given number of files or into files with a given number of lines. The two options cannot be used together.

-s, --split   Maximum number of files to split into.
-S, --split_by_lines   Maximum number of lines in each split file.
-d, --split_prefix_digits   Number of digits in the numeric prefix of the output files. For example, --split_prefix_digits=4 --split=3 --out1=out.fq produces 0001.out.fq, 0002.out.fq, 0003.out.fq.

(11) Overrepresented sequence analysis

-p, --overrepresentation_analysis   Enable this analysis. By default only sequences of length 10 bp, 20 bp, 40 bp, 100 bp, or (cycles - 2) are counted.
-P, --overrepresentation_sampling   Sampling rate of the reads used for counting. The default is 20, i.e. 1 in 20 reads is used. For example, -P 100 uses 1 in 100 reads, and -P 1 uses all reads, which is very slow; the default of 20 balances speed and accuracy.

Besides the counts of overrepresented sequences, the report also gives their distribution across cycles and plots the detected sequences, which makes it easy to find the most frequent ones.

3. Notes on the QC report files

This part is mainly based on reference 5 in section 4, the CSDN blog post on interpreting the fastp report by twocanis. Readers are encouraged to read the original post.

3.1 Summary

General   fastp version, number of sequencing cycles, mean read length before filtering, mean read length after filtering, insert size peak.
Before filtering   Sequencing quality before QC: total reads, total bases, Q20 rate, Q30 rate, GC content.
After filtering   The same items after QC.
Filtering result   Reads that passed the filters, low-quality reads, reads with too many N bases.

3.2 Adapter

3.3 Insert size estimation

This is estimated from the overlap analysis of read pairs and shows the proportion of reads for each insert length, i.e. the length distribution of the DNA fragments after shearing. When the insert size is below 30 or above 270, or the reads contain too many errors, the pair cannot be detected as overlapping. In my data, for example, 28% of the reads could not be resolved this way.

3.4 Before filtering

Data quality, base content, and k-mer analysis before QC. On the web page you can drag with the mouse to zoom in and out, inspect individual data points, or save the plots as images.

(1) Read quality   Distribution of base quality along the read positions. As a rule of thumb, data with quality above 30 and little fluctuation is good.
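(2) Base content   Distribution of base ratios at each read position, used to check base separation.
What is base separation? A pairs with T and C pairs with G, so if sequencing is reasonably random (and random is good), the ratios of A and T should be close at every position, and so should the ratios of C and G.
Any gap between the two should stay small, ideally within 1% on average. If it is larger, then unless there is a sound reason, such as a particular capture-sequencing design, check whether the sequencing run introduced a bias.

(3) KMER counting   fastp counts the occurrences of every combination of 5 bases and puts them in a table. Each cell is white text on a dark background, and the darker the background, the more often the k-mer occurs, so abnormal entries stand out at a glance. Hovering over a cell shows its count and mean ratio.

4. References

1. GitHub - fastp: An ultra-fast all-in-one FASTQ preprocessor
2. fastp parameter notes
3. fastp: an ultra-fast all-in-one tool for automated QC, filtering, correction, and preprocessing of FASTQ files - Zhihu (知乎)
4. Commonly used parameters of the QC tool fastp, blog of 青灯照颦微 - CSDN
5. Bioinformatics study notes: interpreting the report produced by fastp QC, blog of twocanis - CSDN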
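
Supplementary sketch for section 2.3 (4): the complexity definition quoted there (the percentage of bases that differ from their next base) can be checked with a few lines of Python. This is only an illustration of that definition, not fastp's own code. The function names `complexity_percent` and `passes_complexity_filter` are hypothetical, and treating a read exactly at the threshold as passing is an assumption.

```python
def complexity_percent(seq: str) -> float:
    """Percentage of bases that differ from the next base in the read."""
    if len(seq) >= 2:
        # Count positions i where seq[i] != seq[i+1].
        changes = sum(1 for a, b in zip(seq, seq[1:]) if a != b)
        return 100.0 * changes / (len(seq) - 1)
    # A read with fewer than two bases has no adjacent pair to compare.
    return 0.0


def passes_complexity_filter(seq: str, threshold: int = 30) -> bool:
    """Assumed rule: keep the read when its complexity reaches the threshold."""
    return complexity_percent(seq) >= threshold


# A 51 bp read with only three base changes (A->T, T->G, G->C),
# built programmatically so that the length is unambiguous.
seq = "A" * 4 + "T" * 21 + "G" * 22 + "C" * 4
print(len(seq), complexity_percent(seq))   # 51 6.0
print(passes_complexity_filter(seq))       # False with the default threshold of 30
```

With the default threshold of 30, such a read would be discarded once -y is enabled, which matches the 3/(51-1) = 6% example in section 2.3 (4).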