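Jiang Yueli, Li Tong, Wu Yuqing, Miao Jin, Gong Zhongjun, Duan Yun, Liu Qihang, 2020. Transcriptome analysis of Sitodiplosis mosellana Gehin by RNA-Seq [J]. Journal of Environmental Entomology, 42(1): 128-136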
Transcriptome analysis of Sitodiplosis mosellana (Gehin) by RNA-Seq
  
DOI:
Keywords: Sitodiplosis mosellana (Gehin); high-throughput sequencing; transcriptome; bioinformatics
Funding: National Wheat Industry Technology System, Underground Pests Post (CARS-03)
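Authors and affiliation
Jiang Yueli, Li Tong, Wu Yuqing, Miao Jin, Gong Zhongjun, Duan Yun, Liu Qihang. Institute of Plant Protection, Henan Academy of Agricultural Sciences; Henan Key Laboratory of Crop Pest Control; Key Laboratory of Integrated Pest Management on Crops in Southern Region of North China, Ministry of Agriculture; Zhengzhou 450002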
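Chinese abstract:
      The wheat midge, Sitodiplosis mosellana (Gehin), is a wheat pest of worldwide distribution. To obtain its transcriptome information, this study sequenced the transcriptome of adult S. mosellana using the next-generation high-throughput sequencing platform Illumina HiSeq TM2000. A total of 27.88 G of transcriptome data was obtained, and analysis yielded 59 257 Unigenes with a total length of 49 861 164 bp, a minimum length of 20 bp, a maximum length of 29 282 bp and an average length of 841 bp. Aligning the Unigene sequences against the NR, NT, Swiss-Prot, KEGG, GO and KOG databases (e≤10^-10) returned 95 029 results in total. In the GO functional classification, 19 584 Unigenes were assigned to 50 functional groups within the three main categories of cellular component, molecular function and biological process. In the comparison with the KOG database, 11 279 S. mosellana Unigenes were annotated and could be divided into roughly 26 functional classes. In the KEGG pathway analysis, 9 110 S. mosellana Unigenes were annotated and assigned to five major pathway categories (cellular processes, environmental information processing, genetic information processing, metabolism and organismal systems), covering 32 pathway classes that include cell growth and death, cell motility, signal transduction and energy metabolism. CDS prediction found that 30 088 sequences could be translated into coding sequences, accounting for 50.78% of all genes. The SSR search found 36 323 SSR loci among the 59 257 Unigenes, an incidence of 61.30%. The large body of S. mosellana transcriptome information obtained in this study provides an important resource for mining the functional genes of this species.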
English abstract:
      The wheat midge, Sitodiplosis mosellana, is an important wheat pest worldwide. This study aimed to obtain the transcriptome of S. mosellana, which was sequenced on an Illumina HiSeq TM 2000 platform. A total of 27.88 G of data was obtained and assembled into 59 257 Unigenes with a total length of 49 861 164 bp. The minimum length was 201 bp, the maximum length was 29 282 bp and the average length was 841 bp. Searches against the NR, NT, Swiss-Prot, KEGG, GO and KOG databases (E-value ≤ 10^-10) returned 95 029 annotation results. By gene ontology, the assembled Unigenes were divided into 50 functional groups within the biological process, cellular component and molecular function categories, including metabolic process, binding, catalytic activity and cellular process. In the KOG annotation, 11 279 Unigenes were annotated and grouped into 24 functional categories. In the KEGG pathway analysis, 9 110 sequences were annotated and assigned to five major categories (cellular processes, environmental information processing, genetic information processing, metabolism and organismal systems), which were further divided into 32 pathway classes including cell growth and death, cell motility, signal transduction and energy metabolism. CDS prediction showed that 30 088 sequences could be translated into coding sequences, accounting for 50.78% of all Unigenes. A total of 36 323 SSR loci were found among the 59 257 Unigenes, an incidence of 61.30%. This transcriptome of the orange wheat blossom midge improves our genetic understanding of the species and provides a platform for functional genomics research.
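The derived figures in the abstract (average Unigene length, CDS proportion, SSR incidence) follow from the reported counts. The short Python sketch below recomputes them as a consistency check. The variable and function names are illustrative and not taken from the paper, and the SSR incidence is assumed here to be the number of SSR loci divided by the number of Unigenes, which is one plausible reading of how the 61.30% figure was derived.

```python
# Counts as reported in the abstract.
N_UNIGENES = 59_257
TOTAL_LENGTH_BP = 49_861_164
N_CDS = 30_088
N_SSR_LOCI = 36_323


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


# Average Unigene length: total assembled length over number of Unigenes.
mean_length_bp = round(TOTAL_LENGTH_BP / N_UNIGENES)

# Share of Unigenes with a predicted coding sequence.
cds_percent = percent(N_CDS, N_UNIGENES)

# Assumption: incidence = SSR loci per 100 Unigenes, not the share of
# Unigenes that contain at least one SSR.
ssr_incidence = percent(N_SSR_LOCI, N_UNIGENES)

print(f"mean length: {mean_length_bp} bp")
print(f"CDS: {cds_percent:.2f}%")
print(f"SSR incidence: {ssr_incidence:.2f}%")
```

Under these assumptions the recomputed values agree with the 841 bp, 50.78% and 61.30% stated in the abstract.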