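Protein analysis and proteomics
protein, RNA, DNA

protein
1 Molecular biology
2 Protein families
3 Protein localization
4 Protein function
Gene ontology (GO):
- cellular component
- biological process
- molecular function

Introduction to perspectives 3 and 4: the Gene Ontology (GO) Consortium

Background to the founding of the Gene Ontology
GenBank, EMBL, DDBJ
PubMed: over 15 million citations

What's in a name?
Glucose synthesis
Glucose biosynthesis
Glucose formation
Glucose anabolism
Gluconeogenesis
All refer to the process of making glucose from simpler components

What's in a name?
The same name can be used to describe different concepts
A concept can be described using different names
Comparison is difficult, in particular across species or across databases

Ontology
An ontology is a formal representation, from computer science, of knowledge about the natural world: the study of formalized knowledge that computers can represent, interpret, and use. An ontology is structured domain knowledge that a computer can interpret and use. It allows concepts about the living world to be understood in a shared way, across different perspectives, different term classifications, and different agents (people and machines): a specification of a conceptualization.
The Gene Ontology (GO) Consortium is dedicated to one such project: compiling a dynamic yet controlled vocabulary to describe different aspects of the properties of genes and gene products (mainly proteins).

Ontology structure
Ontologies can be represented as graphs, where the nodes are connected by edges
Nodes = concepts in the ontology
Edges = relationships between the concepts

What can all these proteins do?
“Function” is too limiting a word. Biologists want to know what each protein does, which cellular pathway it belongs to or why the cell needs this function, and where in the cell the process takes place.

The origins of the Gene Ontology
Saccharomyces Genome Database (SGD), for budding yeast
Drosophila genome database (FlyBase)
Mouse Genome Informatics databases (MGD/GXD)
The GO database is not centered on itself; it relies on external databases, whose genes and gene products are annotated with the vocabulary that GO defines. GO is therefore a model of an evolving, collaborative effort, and it aims to unify the way genes and gene products are annotated.
You can visit GO at http://www.geneontology.org.

GO (Gene Ontology) structure
GO isn't just a flat list of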
biological terms; terms are related within a hierarchy

Hierarchical structure
is a: the parent concept includes the child concept; the child concept is an instance of the parent concept
part of: the child concept is a part of the parent concept

True Path Rule
If a child term can be used to describe a gene product, then its parent terms must also apply to that gene product.

DAG
Directed acyclic graph (DAG)
Simple hierarchies (trees): single parent
Directed acyclic graphs: one or more parents

How does GO work?
What information might we want to capture about a gene product?
What does the gene product do?
Where and when does it act?
Why does it perform these activities?

GO: Three ontologies
gene product
Molecular Function: What does it do?
Cellular Component: Where does it act?
Biological Process: What processes is it involved in?

Molecular Function
Molecular function describes activities at the molecular level, such as catalytic activity or binding activity.
Sets of functions make up a biological process.
Examples: insulin binding, insulin receptor activity

Cellular Component
Where a gene product acts: the location in the cell, meaning the organelle or gene product group in which the gene product is found (for example rough endoplasmic reticulum, nucleus, ribosome, or proteasome).

Biological Process
A biological process is a multi-step process made up of an ordered assembly of molecular functions (for example cell growth and maintenance, signal transduction, pyrimidine metabolism, or glucoside transport).
Examples: cell division, gluconeogenesis

lipocalin

Relationships between GO terms displayed as a tree diagram

Perspective 3: Protein localization

Protein localization
Proteins may be localized to intracellular compartments, the cytosol, or the plasma membrane, or they may be secreted. Many proteins shuttle between multiple compartments. A variety of algorithms predict localization, but this is essentially a cell biological question.
Many proteins cannot be assigned to a single fixed location in the cell. For example, the annexins and the small G protein family move between the cytosol and membranes (sometimes in the cytosol, sometimes on a membrane). This movement depends on whether a particular cellular signal, such as calcium ions, is present.

http://psort.nibb.ac.jp

http://www.ch.embnet.org/software/TMPRED.form.html

Localization of 2,900 yeast proteins
Michael Snyder and colleagues incorporated epitope tags into thousands of S. cerevisiae cDNAs and systematically localized proteins (Kumar et al., 2002). See http://ygac.med.yale.edu for a database including 2,900 fluorescence micrographs.

Perspective 4: Protein function
Function refers to the role of a protein in the cell.
We can consider protein function from a variety of perspectives.

1. Biochemical function (molecular function)
RBP binds retinol,
could be a carrier
Examples: enzymes, structural proteins, transport proteins
No protein in the cell is without a function.

2. Functional assignment based on homology
RBP could be a carrier too
Other carrier proteins
Odorant-binding protein is a member of the lipocalins and is also thought to be a carrier protein

3. Function based on structure
RBP forms a calyx
X-ray crystallography shows that RBP forms a cup-like structure, ringed by hydrophobic amino acids, that acts as a ligand-binding site

4. Function based on ligand binding specificity
RBP binds vitamin A

5. Function based on cellular process
DNA, RNA
RBP is abundant, soluble, secreted

6. Function based on biological process
RBP is essential for vision

7. Function based on “proteomics” or high throughput “functional genomics”
High throughput analyses show:
RBP levels elevated in renal failure
RBP levels decreased in liver disease

Functional assignment of enzymes: the EC (Enzyme Commission) system

Functional assignment of proteins: Clusters of Orthologous Groups (COGs)

Proteomics: High throughput protein analysis
Proteomics is the study of the entire collection of proteins encoded by a genome
The “proteome” is all the proteins in a cell and/or all the proteins in an organism
Large-scale protein analysis:
2D protein gels
Yeast two-hybrid
Rosetta Stone approach

Classical biochemical approach
Identify an activity
Develop a bioassay
Perform a biochemical purification
Strategies: size, charge, hydrophobicity
Purify protein to homogeneity
Clone cDNA, express recombinant protein
Grow crystals, solve structure

Two-dimensional protein gels
First dimension: isoelectric focusing
Second dimension: SDS-PAGE

Evaluation of 2D gels (IEF/SDS-PAGE)
Advantages:
Visualize hundreds to thousands of proteins
Improved identification of protein spots
Disadvantages:
Limited number of samples can be processed
Mostly abundant proteins visualized
Technically difficult

Affinity chromatography/mass spec
Bait protein, GST
Add yeast extract
Protein complexes bind
Most proteins do not bind
Elute
Run gel
MALDI-TOF
Identify complexes
Data on complexes deposited in databases: http://www.bind.ca
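
The pull-down workflow above ends with complexes being identified and deposited in a database. As a rough sketch of what such a record might look like and how it could be queried, the Python below stores each bait with the set of proteins that co-purified with it. The protein names and the helper functions (`partners_of`, `baits_for_each_prey`, `shared_partners`) are hypothetical and chosen only for illustration; they are not drawn from any real database or dataset.

```python
from collections import defaultdict

# Hypothetical pull-down results: bait -> proteins identified by mass spec
# after elution. All names are placeholders, not experimental data.
pulldowns = {
    "BaitA": {"Prot1", "Prot2", "Prot3"},
    "BaitB": {"Prot2", "Prot4"},
    "BaitC": {"Prot5"},
}


def partners_of(bait, results):
    """Return the proteins that co-purified with one bait (empty set if unknown)."""
    return set(results.get(bait, set()))


def baits_for_each_prey(results):
    """Invert the table: for each identified protein, collect the baits that pulled it down."""
    index = defaultdict(set)
    for bait, preys in results.items():
        for prey in preys:
            index[prey].add(bait)
    return dict(index)


def shared_partners(bait_x, bait_y, results):
    """Proteins found with both baits, i.e. candidates for a common complex."""
    return partners_of(bait_x, results) & partners_of(bait_y, results)


if __name__ == "__main__":
    print(sorted(shared_partners("BaitA", "BaitB", pulldowns)))
    print(sorted(baits_for_each_prey(pulldowns)["Prot2"]))
```

Keeping the inverted index alongside the bait table is one simple design choice: a protein that appears with several independent baits is easier to spot, which is the kind of evidence used when grouping pull-down hits into complexes.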

The yeast two-hybrid system
Reporter gene
Bait protein: DNA binding
Prey protein: DNA activation
Isolate and sequence the cDNA of the binding partner you have found
We will learn about it later when we study protein interaction networks

red = cellular role; green = cellular roles are identical

The Rosetta Stone approach
Marcotte et al. (1999) and other groups hypothesized that some pairs of interacting proteins are encoded by two genes in many genomes, but occasionally they are fused into a single gene.
By scanning many genomes for examples of “fused genes,” several thousand protein-protein interaction predictions have been made.
Example: Yeast topoisomerase II, E. coli gyrase B, E. coli gyrase A

The Rosetta Stone

Gene fusion (Rosetta Stone method)
Tryptophan synthase subunits A and B, fused in yeast.
It is based on the observation that some interacting proteins/domains have homologs in other genomes that are fused into one protein chain, a so-called Rosetta Stone protein.

How many “gene fusions”?
3 genomes: 88 gene fusions
179 genomes: ? fusions

Perspective 2: Protein family
domains and motifs
Why focus on protein families?

Gene duplication

Homologous protein sequences and families
Proteins with no homologous sequence in any currently known database. Other properties of such a protein (transmembrane domains, phosphorylation sites, predicted secondary structure, and so on) can still offer clues to its structure or function.
Proteins with orthologous or paralogous sequences. For such a protein at least one homologous sequence can be found, and the two sequences share regions of significant similarity or distinctive features. These regions of significant sequence similarity or distinctive structure go by many names: signature, domain, module, modular element, fold, motif, pattern, or repeat.

Definitions
Signature: a protein category such as a domain or motif
Domain: a region of a protein that can adopt a 3D structure
a fold
a family is a group of proteins that share a domain
examples: zinc finger
